When a registration response from cloudflared gets lost on it's way back to the edge, the edge service will retry and send another registration request. Since cloudflared already has bound the local UDP socket for the provided request id, we want to re-send the registration response.
There are three types of retries that the edge will send:
1. A retry from the same QUIC connection index; cloudflared will just respond back with a registration response and reset the idle timer for the session.
2. A retry from a different QUIC connection index; cloudflared will need to migrate the current session connection to this new QUIC connection and reset the idle timer for the session.
3. A retry to a different cloudflared connector; cloudflared will eventually time the session out since no further packets will arrive to the session at the original connector.
Closes TUN-8709
The current supervisor serves the quic connection by performing all of the following in one method:
1. Dial QUIC edge connection
2. Initialize datagram muxer for UDP sessions and ICMP
3. Wrap all together in a single struct to serve the process loops
In an effort to better support modularity, each of these steps were broken out into their own separate methods that the supervisor will compose together to create the TunnelConnection and run its `Serve` method.
This also provides us with the capability to better interchange the functionality supported by the datagram session manager in the future with a new mechanism.
Closes TUN-8661
For macOS, we want to set the DF bit for the UDP packets used by the QUIC
connection; to achieve this, you need to explicitly set the network
to either "udp4" or "udp6". When determining which network type to pick
we need to use the edge IP address chosen to align with what the local
IP family interface we will use. This means we want cloudflared to bind
to local interfaces for a random port, so we provide a zero IP and 0 port
number (ex. 0.0.0.0:0). However, instead of providing the zero IP, we
can leave the value as nil and let the kernel decide which interface and
pick a random port as defined by the target edge IP family.
This was previously broken for IPv6-only edge connectivity on macOS and
all other operating systems should be unaffected because the network type
was left as default "udp" which will rely on the provided local or remote
IP for selection.
Closes TUN-8688
Some more legacy h2mux code to be cleaned up and moved out of the way.
The h2mux.Header used in the serialization for http2 proxied headers is moved to connection module. Additionally, the booleanfuse structure is also moved to supervisor as it is also needed. Both of these structures could be evaluated later for removal/updates, however, the intent of the proposed changes here is to remove the dependencies on the h2mux code and removal.
Approved-by: Chung-Ting Huang <chungting@cloudflare.com>
Approved-by: Luis Neto <lneto@cloudflare.com>
Approved-by: Gonçalo Garcia <ggarcia@cloudflare.com>
MR: https://gitlab.cfdata.org/cloudflare/tun/cloudflared/-/merge_requests/1576
Whenever cloudflared receives a SIGTERM or SIGINT it goes into graceful shutdown mode, which unregisters the connection and closes the control stream. Unregistering makes it so we no longer receive any new requests and makes the edge close the connection, allowing in-flight requests to finish (within a 3 minute period).
This was working fine for http2 connections, but the quic proxy was cancelling the context as soon as the controls stream ended, forcing the process to stop immediately.
This commit changes the behavior so that we wait the full grace period before cancelling the request
Revert "TUN-8621: Fix cloudflared version in change notes."
Revert "PPIP-2310: Update quick tunnel disclaimer"
Revert "TUN-8621: Prevent QUIC connection from closing before grace period after unregistering"
Revert "TUN-8484: Print response when QuickTunnel can't be unmarshalled"
Revert "TUN-8592: Use metadata from the edge to determine if request body is empty for QUIC transport"
Whenever cloudflared receives a SIGTERM or SIGINT it goes into graceful shutdown mode, which unregisters the connection and closes the control stream. Unregistering makes it so we no longer receive any new requests and makes the edge close the connection, allowing in-flight requests to finish (within a 3 minute period).
This was working fine for http2 connections, but the quic proxy was cancelling the context as soon as the controls stream ended, forcing the process to stop immediately.
This commit changes the behavior so that we wait the full grace period before cancelling the request
Adds new suite of metrics to capture the following for capnp rpcs operations:
- Method calls
- Method call failures
- Method call latencies
Each of the operations is labeled by the handler that serves the method and
the method of operation invoked. Additionally, each of these are split
between if the operation was called by a client or served.
Since legacy tunnels have been removed for a while now, we can remove
many of the capnp rpc interfaces that are no longer leveraged by the
legacy tunnel registration and authentication mechanisms.
Combines the tunnelrpc and quic/schema capnp files into the same module.
To help reduce future issues with capnp id generation, capnpids are
provided in the capnp files from the existing capnp struct ids generated
in the go files.
Reduces the overall interface of the Capnp methods to the rest of
the code by providing an interface that will handle the quic protocol
selection.
Introduces a new `rpc-timeout` config that will allow all of the
SessionManager and ConfigurationManager RPC requests to have a timeout.
The timeout for these values is set to 5 seconds as non of these operations
for the managers should take a long time to complete.
Removed the RPC-specific logger as it never provided good debugging value
as the RPC method names were not visible in the logs.
If cloudflared was unable to register the UDP session with the
edge, the socket would be left open to be eventually closed by the
OS, or garbage collected by the runtime. Considering that either of
these closes happened significantly after some delay, it was causing
cloudflared to hold open file descriptors longer than usual if continuously
unable to register sessions.
## Summary
To prevent bad eyeballs and severs to be able to exhaust the quic
control flows we are adding the possibility of having a timeout
for a write operation to be acknowledged. This will prevent hanging
connections from exhausting the quic control flows, creating a DDoS.
In the streambased origin proxy flow (example ssh over access), there is
a chance when we do not flush on http.ResponseWriter writes. This PR
guarantees that the response writer passed to proxy stream has a flusher
embedded after writes. This means we write much more often back to the
ResponseWriter and are not waiting. Note, this is only something we do
when proxyHTTP-ing to a StreamBasedOriginProxy because we do not want to
have situations where we are not sending information that is needed by
the other side (eyeball).
I deliberately kept this as an unregistertimeout because that was the
intent. In the future we could change this to a UDPConnConfig if we want
to pass multiple values here.
The idea of this PR is simply to add a configurable unregister UDP
timeout.
The lucas-clemente/quic-go package moved namespaces and our branch
went stale, this new fork provides support for the new quic-go repo
and applies the max datagram frame size change.
Until the max datagram frame size support gets upstreamed into quic-go,
this can be used to unblock go 1.20 support as the old
lucas-clemente/quic-go will not get go 1.20 support.
Going forward, the only protocols supported will be QUIC and HTTP2,
defaulting to QUIC for "auto". Selecting h2mux protocol will be forcibly
upgraded to http2 internally.
This changes fixes a bug where cloudflared was not propagating errors
when proxying the body of an HTTP request.
In a situation where we already sent HTTP status code, the eyeball would
see the request as sucessfully when in fact it wasn't.
To solve this, we need to guarantee that we produce HTTP RST_STREAM
frames.
This change was applied to both http2 and quic transports.
This PR changes protocol initialization of the other N connections to be
the same as the one we know the initial tunnel connected with. This is
so we homogenize connections and not lead to some connections being
QUIC-able and the others not.
There's also an improvement to the connection registered log so we know
what protocol every individual connection connected with from the
cloudflared side.
Remove send and return methods from Funnel interface. Users of Funnel can provide their own send and return methods without wrapper to comply with the interface.
Move packet router to ingress package to avoid circular dependency