A clock structure was used to help support unit testing timetravel
but it is a globally shared object and is likely unsafe to share
across tests. Reordering of the tests seemed to have intermittent
failures for the TestWaitForBackoffFallback specifically on windows
builds.
Adjusting this to be a shim inside the BackoffHandler struct should
resolve shared object overrides in unit testing.
Additionally, added the reset retries functionality to be inline with
the ResetNow function of the BackoffHandler to align better with
expected functionality of the method.
Removes unused reconnectCredentialManager.
Also update golang.org/x/net and google.golang.org/grpc to fix vulnerabilities,
although cloudflared is using them in a way that is not exposed to those risks
The lucas-clemente/quic-go package moved namespaces and our branch
went stale, this new fork provides support for the new quic-go repo
and applies the max datagram frame size change.
Until the max datagram frame size support gets upstreamed into quic-go,
this can be used to unblock go 1.20 support as the old
lucas-clemente/quic-go will not get go 1.20 support.
This PR adds ApplicationError as one of the "try_again" error types for
startfirstTunnel. This ensures that these kind of errors (which we've
seen occur when a tunnel gets rate-limited) are retried.
Going forward, the only protocols supported will be QUIC and HTTP2,
defaulting to QUIC for "auto". Selecting h2mux protocol will be forcibly
upgraded to http2 internally.
This PR does two things:
It changes how we fallback to a lower protocol: The current state
is to try connecting with a protocol. If it fails, fall back to a
lower protocol. And try connecting with that and so on. With this PR,
if we fail to connect with a protocol, we will try to connect to other
edge addresses first. Only if we fail to connect to those will we
fall back to a lower protocol.
It fixes a behaviour where if we fail to connect to an edge addr,
we keep re-trying the same address over and over again.
This PR now switches between edge addresses on subsequent connecton attempts.
Note that through these switches, it still respects the backoff time.
(We are connecting to a different edge, but this helps to not bombard an edge
address with connect requests if a particular edge addresses stops working).
This PR changes protocol initialization of the other N connections to be
the same as the one we know the initial tunnel connected with. This is
so we homogenize connections and not lead to some connections being
QUIC-able and the others not.
There's also an improvement to the connection registered log so we know
what protocol every individual connection connected with from the
cloudflared side.
Previously allowing the reconnect signal forcibly close the connection
caused a race condition on which error was returned by the errgroup
in the tunnel connection. Allowing the signal to return and provide
a context cancel to the connection provides a safer shutdown of the
tunnel for this test-only scenario.
cloudflared falls back aggressively to HTTP/2 protocol if a connection
attempt with QUIC failed. This was done to ensure that machines with UDP
egress disabled did not stop clients from connecting to the cloudlfare
edge. This PR improves on that experience by having cloudflared remember
if a QUIC connection was successful which implies UDP egress works. In
this case, cloudflared does not fallback to HTTP/2 and keeps trying to
connect to the edge with QUIC.
cloudflared falls back aggressively to HTTP/2 protocol if a connection
attempt with QUIC failed. This was done to ensure that machines with UDP
egress disabled did not stop clients from connecting to the cloudlfare
edge. This PR improves on that experience by having cloudflared remember
if a QUIC connection was successful which implies UDP egress works. In
this case, cloudflared does not fallback to HTTP/2 and keeps trying to
connect to the edge with QUIC.