Commit Graph

125 Commits

Author SHA1 Message Date
Nuno Diegues ed2bac026d TUN-5621: Correctly manage QUIC stream closing
Until this PR, we were naively closing the quic.Stream whenever
the callstack for handling the request (HTTP or TCP) finished.
However, our proxy handler may still be reading or writing from
the quic.Stream at that point, because we return the callstack if
either side finishes, but not necessarily both.

This is a problem for quic-go library because quic.Stream#Close
cannot be called concurrently with quic.Stream#Write

Furthermore, we also noticed that quic.Stream#Close does nothing
to do receiving stream (since, underneath, quic.Stream has 2 streams,
1 for each direction), thus leaking memory, as explained in:
https://github.com/lucas-clemente/quic-go/issues/3322

This PR addresses both problems by wrapping the quic.Stream that
is passed down to the proxying logic and handle all these concerns.
2022-02-01 22:01:57 +00:00
Nuno Diegues 7bac4b15b0 TUN-5719: Re-attempt connection to edge with QUIC despite network error when there is no fallback
We have made 2 changes in the past that caused an unexpected edge case:
 1. when faced with QUIC "no network activity", give up re-attempts and fall-back
 2. when a protocol is chosen explicitly, rather than using auto (the default), do not fallback

The reasoning for 1. was to fallback quickly in situations where the user may not
have chosen QUIC, and simply got it because we auto-chose it (with the TXT DNS record),
but the users' environment does not allow egress via UDP.

The reasoning for 2. was to avoid falling back if the user explicitly chooses a
protocol. E.g., if the user chooses QUIC, she may want to do UDP proxying, so if
we fallback to HTTP2 protocol that will be unexpected since it does not support
UDP (and same applies for HTTP2 falling back to h2mux and TCP proxying).

This PR fixes the edge case that happens when both those changes 1. and 2. are
put together: when faced with a QUIC "no network activity", we should only try
to fallback if there is a possible fallback. Otherwise, we should exhaust the
retries as normal.
2022-01-27 22:12:25 +00:00
cthuang 6fa58aadba TUN-5623: Configure quic max datagram frame size to 1350 bytes for none Windows platforms 2022-01-11 14:55:43 +00:00
Nuno Diegues 1086d5ede5 TUN-5204: Unregister QUIC transports on disconnect
This adds various bug fixes when investigating why QUIC transports were
not being unregistered when they should (and only when the graceful shutdown
started).

Most of these bug fixes are making the QUIC transport implementation closer
to its HTTP2 counterpart:
 - ServeControlStream is now a blocking function (it's up to the transport to handle that)
 - QUIC transport then handles the control plane as part of its Serve, thus waiting for it on shutdown
 - QUIC transport now returns "non recoverable" for connections with similar semantics to HTTP2 and H2mux
 - QUIC transport no longer has a loop around its Serve logic that retries connections on its own (that logic is upstream)
2022-01-06 10:08:38 +00:00
Nuno Diegues 628545d229 TUN-5600: Close QUIC transports as soon as possible while respecting graceful shutdown
This does a few fixes to make sure that the QUICConnection returns from
Serve when the context is cancelled.

QUIC transport now behaves like other transports: closes as soon as there
is no traffic, or at most by grace-period. Note that we do not wait for
UDP traffic since that's connectionless by design.
2022-01-06 08:59:53 +00:00
cthuang fc2333c934 TUN-5300: Define RPC to register UDP sessions 2021-12-06 16:37:09 +00:00
Nuno Diegues 1ee540a166 TUN-5368: Log connection issues with LogLevel that depends on tunnel state
Connections from cloudflared to Cloudflare edge are long lived and may
break over time. That is expected for many reasons (ranging from network
conditions to operations within Cloudflare edge). Hence, logging that as
Error feels too strong and leads to users being concerned that something
is failing when it is actually expected.

With this change, we wrap logging about connection issues to be aware
of the tunnel state:
 - if the tunnel has no connections active, we log as error
 - otherwise we log as warning
2021-11-10 09:00:05 +00:00
Sudarsan Reddy 0146a8d8ed TUN-5285: Fallback to HTTP2 immediately if connection times out with no network activity 2021-11-04 10:42:53 +00:00
cthuang ff7c48568c TUN-5261: Collect QUIC metrics about RTT, packets and bytes transfered and log events at tracing level 2021-10-21 15:26:57 +01:00
Sudarsan Reddy ceb509ee98 TUN-5138: Switch to QUIC on auto protocol based on threshold 2021-10-14 09:18:20 +01:00
Sudarsan Reddy bccf4a63dc UN-5213: Increase MaxStreams value for QUIC transport
The default max streams value of 100 is rather small when subject to
high load in terms of connecting QUIC with streams faster than it can
create new ones. This high value allows for more throughput.
2021-10-08 13:48:20 +01:00
Sudarsan Reddy fd14bf440b TUN-5118: Quic connection now detects duplicate connections similar to http2 2021-09-21 06:30:09 +00:00
cthuang 98c3957d30 TUN-5010: --region should be a string flag 2021-08-30 14:40:07 +00:00
cthuang 27cd83c2d3 Revert "TUN-4926: Implement --region configuration option"
This reverts commit d0a1daac3b.
2021-08-28 16:42:55 +01:00
Areg Harutyunyan d0a1daac3b TUN-4926: Implement --region configuration option 2021-08-27 09:11:10 +00:00
Rishabh Bector a4a9f45b0a TUN-4821: Make quick tunnels the default in cloudflared 2021-08-26 15:53:02 +00:00
Sudarsan Reddy 12ad264eb3 TUN-4866: Add Control Stream for QUIC
This commit adds support to Register and Unregister Connections via RPC
on the QUIC transport protocol
2021-08-17 14:50:32 +00:00
Sudarsan Reddy 5f6e867685 TUN-4602: Added UDP resolves to Edge discovery 2021-08-09 18:41:43 +00:00
Sudarsan Reddy ed1389ef08 TUN-4814: Revert "TUN-4699: Make quick tunnels the default in cloudflared"
This reverts commit 18992efa0c.
2021-07-28 10:02:55 +01:00
Rishabh Bector 18992efa0c TUN-4699: Make quick tunnels the default in cloudflared 2021-07-26 15:57:36 +00:00
Sudarsan Reddy 8f3526289a TUN-4701: Split Proxy into ProxyHTTP and ProxyTCP
http.Request now is only used by ProxyHTTP and not required if the
proxying is TCP. The dest conversion is handled by the transport layer.
2021-07-19 13:43:59 +00:00
Igor Postelnik 8ca0d86c85 TUN-3863: Consolidate header handling logic in the connection package; move headers definitions from h2mux to packages that manage them; cleanup header conversions
All header transformation code from h2mux has been consolidated in the connection package since it's used by both h2mux and http2 logic.
Exported headers used by proxying between edge and cloudflared so then can be shared by tunnel service on the edge.
Moved access-related headers to corresponding packages that have the code that sets/uses these headers.
Removed tunnel hostname tracking from h2mux since it wasn't used by anything. We will continue to set the tunnel hostname header from the edge for backward compatibilty, but it's no longer used by cloudflared.
Move bastion-related logic into carrier package, untangled dependencies between carrier, origin, and websocket packages.
2021-03-29 21:57:56 +00:00
Igor Postelnik 39065377b5 TUN-4063: Cleanup dependencies between packages.
- Move packages the provide generic functionality (such as config) from `cmd` subtree to top level.
- Remove all dependencies on `cmd` subtree from top level packages.
- Consolidate all code dealing with token generation and transfer to a single cohesive package.
2021-03-09 14:02:59 +00:00
Nuno Diegues e9c2afec56 TUN-3948: Log error when retrying connection 2021-02-23 11:40:29 +00:00
Adam Chalmers a278753bbf TUN-3902: Add jitter to backoffhandler
Jitter is important to avoid every cloudflared in the world trying to
reconnect at t=1, 2, 4, etc. That could overwhelm the backend. But
if each cloudflared randomly waits for up to 2, then up to 4, then up
to 8 etc, then the retries get spread out evenly across time.

On average, wait times should be the same (e.g. instead of waiting for
exactly 1 second, cloudflared will wait betweeen 0 and 2 seconds).

This is the "Full Jitter" algorithm from https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/
2021-02-11 14:36:13 +00:00
Igor Postelnik 0b16a473da TUN-3869: Improve reliability of graceful shutdown.
- Don't rely on edge to close connection on graceful shutdown in h2mux, start muxer shutdown from cloudflared.
- Don't retry failed connections after graceful shutdown has started.
- After graceful shutdown channel is closed we stop waiting for retry timer and don't try to restart tunnel loop.
- Use readonly channel for graceful shutdown in functions that only consume the signal
2021-02-08 14:30:32 +00:00
Adam Chalmers 0d22106416 TUN-3848: Use transport logger for h2mux 2021-02-03 17:31:16 -06:00
Igor Postelnik a945518404 TUN-3811: Better error reporting on http2 connection termination. Registration errors from control loop are now propagated out of the connection server code. Unified error handling between h2mux and http2 connections so we log and retry errors the same way, regardless of underlying transport. 2021-01-28 10:38:30 -06:00
Igor Postelnik d503aeaf77 TUN-3118: Changed graceful shutdown to immediately unregister tunnel from the edge, keep the connection open until the edge drops it or grace period expires 2021-01-22 11:14:36 -06:00
Igor Postelnik db0562c7b8 Fixed connection error handling by removing duplicated errors, standardizing on non-pointer error types 2021-01-22 10:58:06 -06:00
Igor Postelnik 04b1e4f859 TUN-3738: Refactor observer to avoid potential of blocking on tunnel notifications 2021-01-18 11:16:23 +00:00
Areg Harutyunyan 55bf904689 TUN-3471: Add structured log context to logs 2021-01-05 20:21:16 +00:00
Areg Harutyunyan 870f5fa907 TUN-3470: Replace in-house logger calls with zerolog 2020-12-23 14:15:17 -06:00
Adam Chalmers 38fb0b28b6 TUN-3593: /ready endpoint for k8s readiness. Move tunnel events out of UI package, into connection package. 2020-12-02 15:22:59 -06:00
cthuang 5974fb4cfd TUN-3500: Integrate replace h2mux by http2 work with multiple origin support 2020-11-11 15:20:57 +00:00
cthuang a490443630 TUN-3458: Upgrade to http2 when available, fallback to h2mux when we reach max retries 2020-11-11 15:11:42 +00:00
cthuang b5cdf3b2c7 TUN-3456: New protocol option auto to automatically select between http2 and h2mux 2020-11-11 15:11:42 +00:00
cthuang 9ac40dcf04 TUN-3462: Refactor cloudflared to separate origin from connection 2020-11-11 15:11:42 +00:00
cthuang a5a5b93b64 TUN-3420: Establish control plane and send RPC over control plane 2020-11-11 15:11:42 +00:00
cthuang 8d7b2575ba TUN-3400: Use Go HTTP2 library as transport to connect with the edge 2020-11-11 15:11:42 +00:00
cthuang d7498b0c03 TUN-3449: Use flag to select transport protocol implementation 2020-11-11 15:11:42 +00:00
cthuang be9a558867 TUN-3503: Matching ingress rule should not take port into account 2020-11-05 15:36:12 +00:00
Adam Chalmers d01770107e TUN-3492: Refactor OriginService, shrink its interface 2020-11-04 21:28:33 +00:00
Adam Chalmers e933ef9e1a TUN-2640: Users can configure per-origin config. Unify single-rule CLI
flow with multi-rule config file code.
2020-10-30 07:42:20 -05:00
Adam Chalmers c96b9e8d8f TUN-3464: Newtype to wrap []ingress.Rule 2020-10-15 12:48:14 -05:00
Adam Chalmers 4a4a1bb6b1 TUN-3441: Multiple-origin routing via ingress rules 2020-10-13 08:55:17 -05:00
Adam Chalmers 0eebc7cef9 TUN-3438: move ingress into own package, read into TunnelConfig 2020-10-12 16:33:22 +00:00
cthuang 2c9b7361b7 TUN-3427: Define a struct that only implements RegistrationServer in tunnelpogs 2020-10-01 09:08:32 +01:00
Areg Harutyunyan 747427f816 TUN-3216: UI improvements 2020-09-17 13:22:08 +04:00
Rachel Williams bb530b87dd TUN-3328: Filter out free tunnel has started log from UI 2020-09-17 11:52:10 +04:00