The idle period is set to 5sec.
We now also ping every second since last activity.
This makes the quic.Connection less prone to being closed with
no network activity, since we send multiple pings per idle
period, and thus a single packet loss cannot cause the problem.
Setting the body to nil was rendering cloudflared to crashing with
a SIGSEGV in the odd case where the hostname accessed maps to a
TCP origin (e.g. SSH/RDP/...) but the eyeball sends a plain HTTP
request that does not go through cloudflared access (thus not wrapped
in websocket as it should).
Instead, QUIC transport now sets http.noBody in that condition, which
deals with the situation gracefully.
Until this PR, we were naively closing the quic.Stream whenever
the callstack for handling the request (HTTP or TCP) finished.
However, our proxy handler may still be reading or writing from
the quic.Stream at that point, because we return the callstack if
either side finishes, but not necessarily both.
This is a problem for quic-go library because quic.Stream#Close
cannot be called concurrently with quic.Stream#Write
Furthermore, we also noticed that quic.Stream#Close does nothing
to do receiving stream (since, underneath, quic.Stream has 2 streams,
1 for each direction), thus leaking memory, as explained in:
https://github.com/lucas-clemente/quic-go/issues/3322
This PR addresses both problems by wrapping the quic.Stream that
is passed down to the proxying logic and handle all these concerns.
This adds various bug fixes when investigating why QUIC transports were
not being unregistered when they should (and only when the graceful shutdown
started).
Most of these bug fixes are making the QUIC transport implementation closer
to its HTTP2 counterpart:
- ServeControlStream is now a blocking function (it's up to the transport to handle that)
- QUIC transport then handles the control plane as part of its Serve, thus waiting for it on shutdown
- QUIC transport now returns "non recoverable" for connections with similar semantics to HTTP2 and H2mux
- QUIC transport no longer has a loop around its Serve logic that retries connections on its own (that logic is upstream)
This does a few fixes to make sure that the QUICConnection returns from
Serve when the context is cancelled.
QUIC transport now behaves like other transports: closes as soon as there
is no traffic, or at most by grace-period. Note that we do not wait for
UDP traffic since that's connectionless by design.
Creates an abstraction over UDP Conn for origin "connection" which can
be useful for future support of complex protocols that may require
changing ports during protocol negotiation (eg. SIP, TFTP)
In addition, it removes a dependency from ingress on connection package.
- Refactors some h2mux specific logic from connection/header.go to connection/h2mux_header.go
- Do the same for the unit tests
- Add a non-h2mux "is control response header" function (we don't need one for the request flow)
- In that new function, do not consider "content-length" as a control header
- Use that function in the non-h2mux flow for response (and it will be used also in origintunneld)
The default max streams value of 100 is rather small when subject to
high load in terms of connecting QUIC with streams faster than it can
create new ones. This high value allows for more throughput.
Go's client defaults to chunked encoding after a 200ms delay if the following cases are true:
* the request body blocks
* the content length is not set (or set to -1)
* the method doesn't usually have a body (GET, HEAD, DELETE, ...)
* there is no transfer-encoding=chunked already set.
So for non websocket requests, if transfer-encoding isn't chunked and content length is 0, we dont set a request body.
ServeControlStream accidentally became non-blocking in the last quic
change causing stream to not be returned until a SIGTERM was received.
This change makes ServeControlStream be non-blocking for QUIC streams.
This maximum grace period will be honored by Cloudflare edge such that
either side will close the connection after unregistration at most
by this time (3min as of this commit):
- If the connection is unused, it is already closed as soon as possible.
- If the connection is still used, it is closed on the cloudflared configured grace-period.
Even if cloudflared does not close the connection by the grace-period time,
the edge will do so.
time.Tick() does not get garbage collected because the channel
underneath never gets deleted and the underlying Ticker can never be
recovered by the garbage collector. We replace this with NewTicker() to
avoid this.
All header transformation code from h2mux has been consolidated in the connection package since it's used by both h2mux and http2 logic.
Exported headers used by proxying between edge and cloudflared so then can be shared by tunnel service on the edge.
Moved access-related headers to corresponding packages that have the code that sets/uses these headers.
Removed tunnel hostname tracking from h2mux since it wasn't used by anything. We will continue to set the tunnel hostname header from the edge for backward compatibilty, but it's no longer used by cloudflared.
Move bastion-related logic into carrier package, untangled dependencies between carrier, origin, and websocket packages.
added ingress.DefaultStreamHandler and a basic test for tcp stream proxy
moved websocket.Stream to ingress
cloudflared no longer picks tcpstream host from header
- extracted ResponseWriter from proxyConnection
- added bastion tests over websocket
- removed HTTPResp()
- added some docstrings
- Renamed some ingress clients as proxies
- renamed instances of client to proxy in connection and origin
- Stream no longer takes a context and logger.Service
- Don't rely on edge to close connection on graceful shutdown in h2mux, start muxer shutdown from cloudflared.
- Don't retry failed connections after graceful shutdown has started.
- After graceful shutdown channel is closed we stop waiting for retry timer and don't try to restart tunnel loop.
- Use readonly channel for graceful shutdown in functions that only consume the signal
Classic tunnels flow was triggering an event for RegisteringTunnel for
every connection that was about to be established, and then a Connected
event for every connection established.
However, the RegistreringTunnel event had no connection ID, always
causing it to unset/disconnect the 0th connection making the /ready
endpoint report incorrect numbers for classic tunnels.