In a previous commit, we fixed a bug where the client roundtrip code
could close the request body, which in fact would be the quic.Stream,
thus closing the write-side.
The way that was fixed, prevented the client roundtrip code from closing
also read-side (the body).
This fixes that, by allowing close to only close the read side, which
will guarantee that any subsquent will fail with an error or EOF it
occurred before the close.
cloudflared falls back aggressively to HTTP/2 protocol if a connection
attempt with QUIC failed. This was done to ensure that machines with UDP
egress disabled did not stop clients from connecting to the cloudlfare
edge. This PR improves on that experience by having cloudflared remember
if a QUIC connection was successful which implies UDP egress works. In
this case, cloudflared does not fallback to HTTP/2 and keeps trying to
connect to the edge with QUIC.
cloudflared falls back aggressively to HTTP/2 protocol if a connection
attempt with QUIC failed. This was done to ensure that machines with UDP
egress disabled did not stop clients from connecting to the cloudlfare
edge. This PR improves on that experience by having cloudflared remember
if a QUIC connection was successful which implies UDP egress works. In
this case, cloudflared does not fallback to HTTP/2 and keeps trying to
connect to the edge with QUIC.
This commit guarantees that stream is only closed once the are finished
handling the stream. Without it, we were seeing closes being triggered
by the code that proxies to the origin, which was resulting in failures
to actually send downstream the status code of the proxy request to the
eyeball.
This was then subsequently triggering unexpected retries to cloudflared
in situations such as cloudflared being unable to reach the origin.
For Google's managed prometheus, it seems they reserved certain
labels for their internal service regions/locations. This causes
customers to run into issues using our metrics since our
metric: `cloudflared_tunnel_server_locations` has a `location`
label. Renaming this to `edge_location` should unblock the
conflict and usage.
The idle period is set to 5sec.
We now also ping every second since last activity.
This makes the quic.Connection less prone to being closed with
no network activity, since we send multiple pings per idle
period, and thus a single packet loss cannot cause the problem.
Setting the body to nil was rendering cloudflared to crashing with
a SIGSEGV in the odd case where the hostname accessed maps to a
TCP origin (e.g. SSH/RDP/...) but the eyeball sends a plain HTTP
request that does not go through cloudflared access (thus not wrapped
in websocket as it should).
Instead, QUIC transport now sets http.noBody in that condition, which
deals with the situation gracefully.
Until this PR, we were naively closing the quic.Stream whenever
the callstack for handling the request (HTTP or TCP) finished.
However, our proxy handler may still be reading or writing from
the quic.Stream at that point, because we return the callstack if
either side finishes, but not necessarily both.
This is a problem for quic-go library because quic.Stream#Close
cannot be called concurrently with quic.Stream#Write
Furthermore, we also noticed that quic.Stream#Close does nothing
to do receiving stream (since, underneath, quic.Stream has 2 streams,
1 for each direction), thus leaking memory, as explained in:
https://github.com/lucas-clemente/quic-go/issues/3322
This PR addresses both problems by wrapping the quic.Stream that
is passed down to the proxying logic and handle all these concerns.
This adds various bug fixes when investigating why QUIC transports were
not being unregistered when they should (and only when the graceful shutdown
started).
Most of these bug fixes are making the QUIC transport implementation closer
to its HTTP2 counterpart:
- ServeControlStream is now a blocking function (it's up to the transport to handle that)
- QUIC transport then handles the control plane as part of its Serve, thus waiting for it on shutdown
- QUIC transport now returns "non recoverable" for connections with similar semantics to HTTP2 and H2mux
- QUIC transport no longer has a loop around its Serve logic that retries connections on its own (that logic is upstream)
This does a few fixes to make sure that the QUICConnection returns from
Serve when the context is cancelled.
QUIC transport now behaves like other transports: closes as soon as there
is no traffic, or at most by grace-period. Note that we do not wait for
UDP traffic since that's connectionless by design.