Corrects the pattern of using errgroup's and context cancellation to simplify the logic for canceling extra routines for the QUIC connection. This is because the extra context cancellation is redundant with the fact that the errgroup also cancels it's own provided context when a routine returns (error or not).
For the datagram handler specifically, since it can respond faster to a context cancellation from the QUIC connection, we wrap the error before surfacing it outside of the QUIC connection scope to the supervisor. Additionally, the supervisor will look for this error type to check if it should retry the QUIC connection. These two operations are required because the supervisor does not look for a context canceled error when deciding to retry a connection. If a context canceled from the datagram handler were to be returned up to the supervisor on the initial connection, the cloudflared application would exit. We want to ensure that cloudflared maintains connection attempts even if any of the services on-top of a QUIC connection fail (datagram handler in this case).
Additional logging is also introduced along these paths to help with understanding the error conditions from the specific handlers on-top of a QUIC connection.
Related CUSTESC-53681
Closes TUN-9610
During a refresh of the supported features via the DNS TXT record,
cloudflared would update the internal feature list, but would not
propagate this information to the edge during a new connection.
This meant that a situation could occur in which cloudflared would
think that the client's connection could support datagram V3, and
would setup that muxer locally, but would not propagate that information
to the edge during a register connection in the `ClientInfo` of the
`ConnectionOptions`. This meant that the edge still thought that the
client was setup to support datagram V2 and since the protocols are
not backwards compatible, the local muxer for datagram V3 would reject
the incoming RPC calls.
To address this, the feature list will be fetched only once during
client bootstrapping and will persist as-is until the client is restarted.
This helps reduce the complexity involved with different connections
having possibly different sets of features when connecting to the edge.
The features will now be tied to the client and never diverge across
connections.
Also, retires the use of `support_datagram_v3` in-favor of
`support_datagram_v3_1` to reduce the risk of reusing the feature key.
The `dv3` TXT feature key is also deprecated.
Closes TUN-9291
## Summary
The new endpoint returns the current information to be used when calling the diagnostic procedure.
This also adds:
- add indexed connection info and method to extract active connections from connTracker
- add edge address to Event struct and conn tracker
- remove unnecessary event send
- add tunnel configuration handler
- adjust cmd and metrics to create diagnostic server
Closes TUN-8728
Whenever cloudflared receives a SIGTERM or SIGINT it goes into graceful shutdown mode, which unregisters the connection and closes the control stream. Unregistering makes it so we no longer receive any new requests and makes the edge close the connection, allowing in-flight requests to finish (within a 3 minute period).
This was working fine for http2 connections, but the quic proxy was cancelling the context as soon as the controls stream ended, forcing the process to stop immediately.
This commit changes the behavior so that we wait the full grace period before cancelling the request
Revert "TUN-8621: Fix cloudflared version in change notes."
Revert "PPIP-2310: Update quick tunnel disclaimer"
Revert "TUN-8621: Prevent QUIC connection from closing before grace period after unregistering"
Revert "TUN-8484: Print response when QuickTunnel can't be unmarshalled"
Revert "TUN-8592: Use metadata from the edge to determine if request body is empty for QUIC transport"
Whenever cloudflared receives a SIGTERM or SIGINT it goes into graceful shutdown mode, which unregisters the connection and closes the control stream. Unregistering makes it so we no longer receive any new requests and makes the edge close the connection, allowing in-flight requests to finish (within a 3 minute period).
This was working fine for http2 connections, but the quic proxy was cancelling the context as soon as the controls stream ended, forcing the process to stop immediately.
This commit changes the behavior so that we wait the full grace period before cancelling the request
This PR changes protocol initialization of the other N connections to be
the same as the one we know the initial tunnel connected with. This is
so we homogenize connections and not lead to some connections being
QUIC-able and the others not.
There's also an improvement to the connection registered log so we know
what protocol every individual connection connected with from the
cloudflared side.
cloudflared falls back aggressively to HTTP/2 protocol if a connection
attempt with QUIC failed. This was done to ensure that machines with UDP
egress disabled did not stop clients from connecting to the cloudlfare
edge. This PR improves on that experience by having cloudflared remember
if a QUIC connection was successful which implies UDP egress works. In
this case, cloudflared does not fallback to HTTP/2 and keeps trying to
connect to the edge with QUIC.
cloudflared falls back aggressively to HTTP/2 protocol if a connection
attempt with QUIC failed. This was done to ensure that machines with UDP
egress disabled did not stop clients from connecting to the cloudlfare
edge. This PR improves on that experience by having cloudflared remember
if a QUIC connection was successful which implies UDP egress works. In
this case, cloudflared does not fallback to HTTP/2 and keeps trying to
connect to the edge with QUIC.
This adds various bug fixes when investigating why QUIC transports were
not being unregistered when they should (and only when the graceful shutdown
started).
Most of these bug fixes are making the QUIC transport implementation closer
to its HTTP2 counterpart:
- ServeControlStream is now a blocking function (it's up to the transport to handle that)
- QUIC transport then handles the control plane as part of its Serve, thus waiting for it on shutdown
- QUIC transport now returns "non recoverable" for connections with similar semantics to HTTP2 and H2mux
- QUIC transport no longer has a loop around its Serve logic that retries connections on its own (that logic is upstream)
ServeControlStream accidentally became non-blocking in the last quic
change causing stream to not be returned until a SIGTERM was received.
This change makes ServeControlStream be non-blocking for QUIC streams.