75 Commits

Author SHA1 Message Date
Devin Carr 1cc15c6ffa TUN-9882: Improve metrics for datagram v3
Adds new metrics for:
- Dropped UDP datagrams for read and write paths
- Dropped ICMP packets for write paths
- Failures that preemptively close UDP flows

Closes TUN-9882
2025-10-08 12:17:23 -07:00
Devin Carr 51c5ef726c TUN-9882: Add write deadline for UDP origin writes
Add a deadline for origin writes as a preventative measure in the case that the kernel blocks any writes for too long.
In the case that the socket exceeds the write deadline, the datagram will be dropped.

Closes TUN-9882
2025-10-07 19:54:42 -07:00
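As a rough illustration of the write deadline described in the commit above (51c5ef726c), the sketch below sets a deadline before each origin write and drops the datagram when the deadline expires. It is not cloudflared's implementation: the helper name `writeToOrigin`, the 10ms timeout, and the use of `net.Pipe` to simulate an origin that never drains its socket are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// writeToOrigin (hypothetical helper) sets a write deadline before the write.
// It reports true when the deadline expired and the datagram was dropped, and
// returns any other write error unchanged.
func writeToOrigin(conn net.Conn, payload []byte, timeout time.Duration) (bool, error) {
	if err := conn.SetWriteDeadline(time.Now().Add(timeout)); err != nil {
		return false, err
	}
	if _, err := conn.Write(payload); err != nil {
		if errors.Is(err, os.ErrDeadlineExceeded) {
			// The write blocked for too long: drop the datagram and move on.
			return true, nil
		}
		return false, err
	}
	return false, nil
}

// blockedWriteIsDropped simulates a blocked origin with net.Pipe, whose
// writes block until the other side reads (which never happens here).
func blockedWriteIsDropped() bool {
	client, origin := net.Pipe()
	defer client.Close()
	defer origin.Close()
	dropped, err := writeToOrigin(client, []byte("payload"), 10*time.Millisecond)
	return dropped && err == nil
}

func main() {
	fmt.Println("dropped after deadline:", blockedWriteIsDropped())
}
```

Returning a separate "dropped" flag keeps a deadline expiry distinct from a real socket failure, so a caller could count drops without tearing down the flow.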
Devin Carr 1fb466941a TUN-9882: Add buffers for UDP and ICMP datagrams in datagram v3
Instead of creating a goroutine to process each incoming datagram from the tunnel, a single consumer (the demuxer) will
process each of the datagrams serially.

Registration datagrams will still be spun out into separate goroutines since they are responsible for managing the
lifetime of the session once started via the `Serve` method.

UDP payload datagrams will be handled in separate per-session channels, each drained by a new write loop, to allow
sessions to write in parallel. Each channel will have a small buffer so that the demuxer is not blocked from dequeuing
other datagrams.

ICMP datagrams will be funneled into a single channel across all possible origins with a single consumer to write to
their respective destinations.

Each of these changes is to prevent datagram reordering from occurring when dequeuing from the tunnel connection. By
establishing a single demuxer that serializes the writes per session, each session will be able to write sequentially,
but in parallel to their respective origins.

Closes TUN-9882
2025-10-07 16:14:01 -07:00
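The demuxer arrangement described in the commit above (1fb466941a) might look roughly like the sketch below: one consumer dequeues datagrams serially and hands each to a per-session buffered channel, and each session has its own write loop. All type and function names are hypothetical, the slice `written` stands in for the origin socket, and dropping a datagram when a session's buffer is full is an assumption of this sketch rather than something the commit message states.

```go
package main

import (
	"fmt"
	"sync"
)

// datagram is a simplified stand-in for a UDP payload datagram.
type datagram struct {
	sessionID int
	payload   string
}

// session owns a small buffered channel drained by a single write loop, so
// writes for one session happen in arrival order.
type session struct {
	in      chan datagram
	written []string // stand-in for writes to the origin socket
}

func (s *session) writeLoop(wg *sync.WaitGroup) {
	defer wg.Done()
	for d := range s.in {
		s.written = append(s.written, d.payload)
	}
}

// demux is the single consumer: it dequeues datagrams serially and hands each
// one to its session without blocking on a slow session.
func demux(incoming []datagram, sessions map[int]*session) (dropped int) {
	for _, d := range incoming {
		s, ok := sessions[d.sessionID]
		if !ok {
			dropped++ // no registered session for this datagram
			continue
		}
		select {
		case s.in <- d:
		default:
			dropped++ // buffer full (assumed policy: drop instead of blocking)
		}
	}
	return dropped
}

// run wires up sessions 1 and 2, demuxes the input, and returns what each
// session wrote along with the number of dropped datagrams.
func run(incoming []datagram) (map[int][]string, int) {
	sessions := map[int]*session{
		1: {in: make(chan datagram, 16)},
		2: {in: make(chan datagram, 16)},
	}
	var wg sync.WaitGroup
	for _, s := range sessions {
		wg.Add(1)
		go s.writeLoop(&wg)
	}
	dropped := demux(incoming, sessions)
	for _, s := range sessions {
		close(s.in)
	}
	wg.Wait()
	out := make(map[int][]string)
	for id, s := range sessions {
		out[id] = s.written
	}
	return out, dropped
}

func main() {
	out, dropped := run([]datagram{{1, "a"}, {2, "b"}, {1, "c"}, {2, "d"}})
	fmt.Println(out[1], out[2], dropped)
}
```

Because only the demuxer sends on each channel and only one loop receives from it, the per-session order matches the order in which datagrams were dequeued, while different sessions still write concurrently.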
Devin Carr 41dffd7f3c CUSTESC-53681: Correct QUIC connection management for datagram handlers
Corrects the pattern of using errgroups and context cancellation to simplify the logic for canceling extra routines for the QUIC connection. The extra context cancellation is redundant because the errgroup also cancels its own provided context when a routine returns (error or not).

For the datagram handler specifically, since it can respond faster to a context cancellation from the QUIC connection, we wrap the error before surfacing it outside of the QUIC connection scope to the supervisor. Additionally, the supervisor will look for this error type to check if it should retry the QUIC connection. These two operations are required because the supervisor does not look for a context canceled error when deciding to retry a connection. If a context canceled error from the datagram handler were to be returned up to the supervisor on the initial connection, the cloudflared application would exit. We want to ensure that cloudflared maintains connection attempts even if any of the services on top of a QUIC connection fail (the datagram handler in this case).

Additional logging is also introduced along these paths to help with understanding the error conditions from the specific handlers on top of a QUIC connection.

Related CUSTESC-53681

Closes TUN-9610
2025-08-19 16:10:00 -07:00
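The error wrapping described in the commit above (41dffd7f3c) can be sketched with the standard `errors` package. The type `datagramHandlerError` and the functions `serveDatagrams` and `shouldRetry` are hypothetical names for this example. The real supervisor's retry logic is not shown in the commit message, so the decision function here only illustrates the idea of matching on a dedicated error type.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// datagramHandlerError (hypothetical) marks an error as coming from the
// datagram handler while keeping the underlying cause reachable via Unwrap.
type datagramHandlerError struct{ err error }

func (e *datagramHandlerError) Error() string { return "datagram handler: " + e.err.Error() }
func (e *datagramHandlerError) Unwrap() error { return e.err }

// serveDatagrams stands in for the handler: it returns once the connection
// context is canceled and wraps the cancellation before surfacing it.
func serveDatagrams(ctx context.Context) error {
	<-ctx.Done()
	return &datagramHandlerError{err: ctx.Err()}
}

// shouldRetry mimics a supervisor that retries on the dedicated error type
// but does not treat a bare context.Canceled as retryable.
func shouldRetry(err error) bool {
	var handlerErr *datagramHandlerError
	return errors.As(err, &handlerErr)
}

// retryDecisions reports the decision for a wrapped and a bare cancellation.
func retryDecisions() (wrapped, bare bool) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return shouldRetry(serveDatagrams(ctx)), shouldRetry(ctx.Err())
}

func main() {
	wrapped, bare := retryDecisions()
	fmt.Println("retry wrapped:", wrapped, "retry bare:", bare)
}
```

Since the wrapper implements `Unwrap`, callers that still care about the root cause can continue to use `errors.Is(err, context.Canceled)`.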
Devin Carr 70ed7ffc5f TUN-9470: Add OriginDialerService to include TCP
Adds an OriginDialerService that takes over the roles of both DialUDP and DialTCP 
towards the origin. This provides the possibility to leverage dialer "middleware"
to inject virtual origins, such as the DNS resolver service.

DNS Resolver service also gains access to the DialTCP operation to service TCP
DNS requests.

Minor refactoring removes dependencies on values previously provided by
the warp-routing configuration. This configuration cannot be disabled by cloudflared,
so many of the references have been adjusted or removed.

Closes TUN-9470
2025-06-30 13:24:16 -07:00
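One way to picture the dialer "middleware" idea from the commit above (70ed7ffc5f) is a service that checks a table of virtual origins before falling back to a real network dial. This is only a sketch: the interface shape, the type names, and the address `192.0.2.53:53` are assumptions, and the virtual origin here merely writes its name over an in-memory `net.Pipe` instead of resolving DNS.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/netip"
)

// originDialer is a hypothetical interface covering both dial operations.
type originDialer interface {
	DialTCP(addr netip.AddrPort) (net.Conn, error)
	DialUDP(addr netip.AddrPort) (net.Conn, error)
}

// netDialer dials real origins through the standard library.
type netDialer struct{}

func (netDialer) DialTCP(addr netip.AddrPort) (net.Conn, error) { return net.Dial("tcp", addr.String()) }
func (netDialer) DialUDP(addr netip.AddrPort) (net.Conn, error) { return net.Dial("udp", addr.String()) }

// virtualDialer stands in for an in-process origin. It answers every dial
// with an in-memory connection that sends its name and then closes.
type virtualDialer struct{ name string }

func (v virtualDialer) dial() (net.Conn, error) {
	client, server := net.Pipe()
	go func() {
		defer server.Close()
		_, _ = server.Write([]byte(v.name))
	}()
	return client, nil
}
func (v virtualDialer) DialTCP(netip.AddrPort) (net.Conn, error) { return v.dial() }
func (v virtualDialer) DialUDP(netip.AddrPort) (net.Conn, error) { return v.dial() }

// dialerService routes a dial to a virtual origin when one is registered for
// the destination, and to the default dialer otherwise.
type dialerService struct {
	defaultDialer originDialer
	virtual       map[netip.AddrPort]originDialer
}

func (s *dialerService) dialerFor(addr netip.AddrPort) originDialer {
	if d, ok := s.virtual[addr]; ok {
		return d
	}
	return s.defaultDialer
}
func (s *dialerService) DialTCP(addr netip.AddrPort) (net.Conn, error) { return s.dialerFor(addr).DialTCP(addr) }
func (s *dialerService) DialUDP(addr netip.AddrPort) (net.Conn, error) { return s.dialerFor(addr).DialUDP(addr) }

// dialVirtual dials the registered virtual address and returns what it sent.
func dialVirtual() (string, error) {
	addr := netip.MustParseAddrPort("192.0.2.53:53") // documentation-range address
	svc := &dialerService{
		defaultDialer: netDialer{},
		virtual:       map[netip.AddrPort]originDialer{addr: virtualDialer{name: "dns-resolver"}},
	}
	conn, err := svc.DialTCP(addr)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	b, err := io.ReadAll(conn)
	return string(b), err
}

func main() {
	fmt.Println(dialVirtual())
}
```

Callers only see the `originDialer` interface, so injecting or removing a virtual origin does not change any call site.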
Devin Carr b4a98b13fe TUN-9469: Centralize UDP origin proxy dialing as ingress service
Introduces a new `UDPOriginProxy` interface and `UDPOriginService`
to standardize how UDP connections are dialed to origins. Allows for
future overrides of the ingress service for specific dial destinations.

Simplifies dependency injection for UDP dialing throughout both datagram
v2 and v3 by using the same ingress service. Previous invocations called
into a DialUDP function in the ingress package that was a light
wrapper over `net.DialUDP`. Now a reference is passed into both datagram
controllers that allows more control over the DialUDP method.

Closes TUN-9469
2025-06-23 18:01:15 +00:00
Luis Neto 96ce66bd30 TUN-9016: update go to 1.24
## Summary

Update several moving parts of cloudflared build system:

* use goboring 1.24.2 in cfsetup
* update linter and fix lint issues
* update packages namely **quic-go and net**
* install script for macos
* update docker files to use go 1.24.1
* remove usage of cloudflare-go
* pin golang linter

Closes TUN-9016
2025-06-06 09:05:49 +00:00
Devin Carr 3bf9217de5 TUN-9319: Add dynamic loading of features to connections via ConnectionOptionsSnapshot
Make sure to enforce snapshots of features and client information for each connection
so that the feature information can change in the background. This allows for new features
to only be applied to a connection if it completely disconnects and attempts a reconnect.

Updates the feature refresh time to 1 hour from previous cloudflared versions which
refreshed every 6 hours.

Closes TUN-9319
2025-05-14 20:11:05 +00:00
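The snapshot behavior described in the commit above (3bf9217de5) can be sketched as a selector that hands each connection a copy of the current features. The type names and the feature strings are hypothetical, and the background refresh is reduced to a direct method call.

```go
package main

import (
	"fmt"
	"sync"
)

// connectionOptionsSnapshot holds the features a connection saw when it
// started. The slice is a private copy, so later refreshes cannot change it.
type connectionOptionsSnapshot struct {
	features []string
}

// featureSelector holds the latest feature list, which may be replaced in
// the background while connections are running.
type featureSelector struct {
	mu       sync.RWMutex
	features []string
}

// refresh replaces the current feature list with a copy of the new one.
func (f *featureSelector) refresh(features []string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.features = append([]string(nil), features...)
}

// Snapshot copies the current feature list for one connection attempt.
func (f *featureSelector) Snapshot() connectionOptionsSnapshot {
	f.mu.RLock()
	defer f.mu.RUnlock()
	return connectionOptionsSnapshot{features: append([]string(nil), f.features...)}
}

// demo takes a snapshot, refreshes the features, then snapshots again as a
// reconnecting connection would.
func demo() (before, after []string) {
	selector := &featureSelector{}
	selector.refresh([]string{"feature_a"})
	running := selector.Snapshot()
	selector.refresh([]string{"feature_a", "feature_b"})
	reconnected := selector.Snapshot()
	return running.features, reconnected.features
}

func main() {
	before, after := demo()
	fmt.Println(before, after)
}
```

The running connection keeps its original view, and only the snapshot taken on reconnect observes the refreshed list.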
Devin Carr 02705c44b2 TUN-9322: Add metric for unsupported RPC commands for datagram v3
Additionally adds support for the connection index as a label for the
datagram v3 specific tunnel metrics.

Closes TUN-9322
2025-05-13 16:11:09 +00:00
João "Pisco" Fernandes 4eb0f8ce5f TUN-8861: Rename Session Limiter to Flow Limiter
## Summary
Session is the concept used for UDP flows. Therefore, to give
the limiter a name that applies to both TCP and UDP, this commit
renames it from session limiter to flow limiter.

Closes TUN-8861
2025-01-20 06:33:40 -08:00
João "Pisco" Fernandes bf4954e96a TUN-8861: Add session limiter to UDP session manager
## Summary
In order to make cloudflared behavior more predictable and
prevent an exhaustion of resources, we have decided to add
session limits that can be configured by the user. This first
commit introduces the session limiter and adds it to the UDP
handling path. For now the limiter is set to run only in
unlimited mode.
2025-01-20 02:52:32 -08:00
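A limiter of the kind described in the commit above (bf4954e96a) can be as small as a guarded counter. The sketch below is illustrative only: the names are hypothetical, and treating a limit of 0 as "unlimited" is an assumption about how the unlimited mode might be represented.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errTooManyActiveSessions = errors.New("too many active sessions")

// sessionLimiter caps the number of concurrently active sessions.
// A limit of 0 means unlimited (an assumption of this sketch).
type sessionLimiter struct {
	mu     sync.Mutex
	active uint64
	limit  uint64
}

// Acquire reserves a slot for a new session or reports that the limit is hit.
func (l *sessionLimiter) Acquire() error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.limit != 0 && l.active >= l.limit {
		return errTooManyActiveSessions
	}
	l.active++
	return nil
}

// Release frees a slot when a session ends.
func (l *sessionLimiter) Release() {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.active > 0 {
		l.active--
	}
}

func main() {
	limiter := &sessionLimiter{limit: 1}
	fmt.Println(limiter.Acquire()) // first session fits
	fmt.Println(limiter.Acquire()) // second session is rejected
	limiter.Release()
	fmt.Println(limiter.Acquire()) // slot is available again
}
```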
Gonçalo Garcia c6901551e7 TUN-8822: Prevent concurrent usage of ICMPDecoder
## Summary
Some description...

Closes TUN-8822
2024-12-19 07:19:36 -08:00
Devin Carr bc9c5d2e6e TUN-8817: Increase close session channel by one since there are two writers
When closing a session, there are two possible signals that will occur,
one from the outside, indicating that the session is idle and needs to
be closed, and the internal error condition that will be unblocked
with a net.ErrClosed when the connection underneath is closed. Both of
these routines write to the session's closeChan.

Once the reader for the closeChan reads one value, it will immediately
return. This means that the channel is a one-shot and one of the two
writers will get stuck unless the size of the channel is increased to
accommodate the second write to the channel.

With the channel size increased to two, the second writer (whichever
loses the race to write) will now be unblocked to end its goroutine
and return.

Closes TUN-8817
2024-12-17 14:55:09 -08:00
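The fix described in the commit above (bc9c5d2e6e) relies on a general property of buffered channels: with one buffer slot per writer, no send can block, regardless of how many values the reader consumes. The sketch below shows that property with two writers and a one-shot reader. The function name and the `errIdle` value are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"sync"
)

var errIdle = errors.New("session idle")

// closeSession starts two writers that both signal closure and a one-shot
// reader that takes exactly one value. It returns the first signal received.
func closeSession() error {
	closeChan := make(chan error, 2) // one slot per writer

	var writers sync.WaitGroup
	writers.Add(2)
	go func() { // outside signal: the session is idle
		defer writers.Done()
		closeChan <- errIdle
	}()
	go func() { // internal signal: the underlying connection was closed
		defer writers.Done()
		closeChan <- net.ErrClosed
	}()

	first := <-closeChan // the reader returns after a single value
	writers.Wait()       // completes because neither send can block
	return first
}

func main() {
	fmt.Println("closed because:", closeSession())
}
```

Which signal wins the race is not deterministic, but both writer goroutines always finish.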
Devin Carr 588ab7ebaa TUN-8640: Add ICMP support for datagram V3
Closes TUN-8640
2024-12-09 07:23:11 -08:00
Devin Carr 37010529bc TUN-8775: Make sure the session Close can only be called once
The previous capture of the sync.OnceValue was re-initialized for each
call to `Close`. This needed to be initialized during the creation of
the session to ensure that the sync.OnceValue reference was held for
the session's lifetime.

Closes TUN-8775
2024-12-05 14:12:53 -08:00
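The difference described in the commit above (37010529bc) is easy to reproduce with `sync.OnceValue` (available since Go 1.21). In the sketch below, `Close` uses a function created once at construction, while `buggyClose` recreates the `OnceValue` on every call and therefore runs the close logic each time. The type and method names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

type session struct {
	closeCount int          // how many times the close logic actually ran
	closeOnce  func() error // created once in newSession
}

func newSession() *session {
	s := &session{}
	// The OnceValue is created with the session, so every Close call shares it.
	s.closeOnce = sync.OnceValue(s.close)
	return s
}

// close is the real teardown; it must run at most once per session.
func (s *session) close() error {
	s.closeCount++
	return nil
}

// Close is safe to call repeatedly: only the first call runs close.
func (s *session) Close() error { return s.closeOnce() }

// buggyClose builds a fresh OnceValue per call, so close runs every time.
func (s *session) buggyClose() error { return sync.OnceValue(s.close)() }

func main() {
	good, bad := newSession(), newSession()
	for i := 0; i < 3; i++ {
		_ = good.Close()
		_ = bad.buggyClose()
	}
	fmt.Println(good.closeCount, bad.closeCount)
}
```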
Devin Carr d779394748 TUN-8748: Migrated datagram V3 flows to use migrated context
Previously, during local flow migration the current connection context
was not part of the migration, so the flow would keep listening
on the context of the old connection (before the migration).
This meant that if a flow was migrated from connection 0 to
connection 1, and connection 0 went away, the flow would be incorrectly
terminated early along with the context lifetime of connection 0.

The new connection context is provided during migration of a flow
and will trigger the observe loop for the flow lifetime to be rebound
to this provided context.
Closes TUN-8748
2024-11-21 12:56:47 -08:00
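A minimal sketch of the rebinding described in the commit above (d779394748): the flow's observe loop watches the context of whichever connection currently owns it and swaps to a new context when one is delivered during migration. The `flow` type, its channels, and `demo` are hypothetical and leave out everything else a real flow does.

```go
package main

import (
	"context"
	"fmt"
)

type flow struct {
	migrate chan context.Context // delivers the new connection's context
	closed  chan struct{}        // closed when the flow terminates
}

func newFlow() *flow {
	return &flow{migrate: make(chan context.Context), closed: make(chan struct{})}
}

// observe ties the flow's lifetime to the context of the connection that
// currently owns it, rebinding whenever a migration supplies a new context.
func (f *flow) observe(connCtx context.Context) {
	defer close(f.closed)
	for {
		select {
		case <-connCtx.Done():
			return
		case connCtx = <-f.migrate:
			// From here on, only the new connection's context is observed.
		}
	}
}

func (f *flow) isClosed() bool {
	select {
	case <-f.closed:
		return true
	default:
		return false
	}
}

// demo migrates a flow from connection 0 to connection 1, cancels
// connection 0, and then cancels connection 1.
func demo() (aliveAfterOldConnClosed, closedAfterNewConnClosed bool) {
	conn0, cancel0 := context.WithCancel(context.Background())
	conn1, cancel1 := context.WithCancel(context.Background())
	f := newFlow()
	go f.observe(conn0)

	f.migrate <- conn1 // unbuffered: returns once the loop has taken conn1
	cancel0()
	alive := !f.isClosed()

	cancel1()
	<-f.closed
	return alive, f.isClosed()
}

func main() {
	fmt.Println(demo())
}
```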
chungthuang a26b2a0097
Merge pull request #1355 from pkillarjun/fuzzing
add: new go-fuzz targets
2024-11-18 09:53:12 -06:00
Devin Carr ab3dc5f8fa TUN-8701: Simplify flow registration logs for datagram v3
To help reduce the volume of logs during the happy path of flow registration, there will only be one log message reported when a flow is completed.

There are additional fields added to all flow log messages:
1. `src`: local address
2. `dst`: origin address
3. `durationMS`: capturing the total duration of the flow in milliseconds

Additional logs were added to capture when a flow was migrated or when cloudflared sent off a registration response retry.

Closes TUN-8701
2024-11-12 10:54:37 -08:00
Arjun 53c523444e add: new go-fuzz targets
Signed-off-by: Arjun <pkillarjun@protonmail.com>
2024-11-11 20:45:49 +05:30
Devin Carr 1f3e3045ad TUN-8701: Add metrics and adjust logs for datagram v3
Closes TUN-8701
2024-11-07 11:02:55 -08:00
Devin Carr 952622a965 TUN-8709: Add session migration for datagram v3
When a registration response from cloudflared gets lost on its way back to the edge, the edge service will retry and send another registration request. Since cloudflared has already bound the local UDP socket for the provided request id, we want to re-send the registration response.

There are three types of retries that the edge will send:

1. A retry from the same QUIC connection index; cloudflared will just respond back with a registration response and reset the idle timer for the session.
2. A retry from a different QUIC connection index; cloudflared will need to migrate the current session connection to this new QUIC connection and reset the idle timer for the session.
3. A retry to a different cloudflared connector; cloudflared will eventually time the session out since no further packets will arrive to the session at the original connector.

Closes TUN-8709
2024-11-06 12:06:07 -08:00
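The first two retry cases listed in the commit above (952622a965) can be sketched as a lookup on the request id followed by a comparison of connection indexes. The types, the outcome strings, and the `idleResets` counter (standing in for resetting an idle timer) are hypothetical. The third case needs no code on the original connector, since its session simply times out.

```go
package main

import "fmt"

type session struct {
	connIndex  uint8
	idleResets int // counts idle-timer resets in place of a real timer
}

type manager struct {
	sessions map[string]*session
}

type outcome string

const (
	registered outcome = "registered"
	resent     outcome = "response re-sent"
	migrated   outcome = "migrated and response re-sent"
)

// handleRegistration processes a registration request for requestID arriving
// on the QUIC connection identified by connIndex.
func (m *manager) handleRegistration(requestID string, connIndex uint8) outcome {
	s, ok := m.sessions[requestID]
	if !ok {
		m.sessions[requestID] = &session{connIndex: connIndex}
		return registered
	}
	// Any retry for a known session resets its idle timer.
	s.idleResets++
	if s.connIndex == connIndex {
		return resent // case 1: retry on the same connection
	}
	s.connIndex = connIndex // case 2: move the session to the new connection
	return migrated
}

func main() {
	m := &manager{sessions: map[string]*session{}}
	fmt.Println(m.handleRegistration("req-1", 0))
	fmt.Println(m.handleRegistration("req-1", 0))
	fmt.Println(m.handleRegistration("req-1", 1))
}
```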
Gonçalo Garcia 3d33f559b1 TUN-8641: Expose methods to simplify V3 Datagram parsing on the edge 2024-11-04 15:23:36 -08:00
Devin Carr 5891c0d955 TUN-8700: Add datagram v3 muxer
The datagram muxer wraps a QUIC connection's datagram read and write operations, unmarshalling datagrams arriving from the connection and handing them to the session manager on their way to the origin. Incoming datagram session registration operations will create new UDP sockets for sessions to proxy UDP packets between the edge and the origin. The muxer is also responsible for marshalling UDP packets and operations into datagrams for communication over the QUIC connection towards the edge.

Closes TUN-8700
2024-11-04 11:20:35 -08:00
Devin Carr 6a6c890700 TUN-8667: Add datagram v3 session manager
The new session manager provides functionality similar to what was previously
provided with datagram v2, with the distinct difference that the sessions
are registered via QUIC Datagrams and unregistered via timeouts only;
cloudflared will no longer attempt to unregister sessions remotely with the
edge service.

The Session Manager is shared across all QUIC connections that cloudflared
uses to connect to the edge (typically 4). This will help cloudflared be
able to monitor all sessions across the connections and help correlate
in the future if sessions migrate across connections.

The UDP payload size is still limited to 1280 bytes across all OSes. Any
UDP packet with a payload larger than 1280 bytes will cause
cloudflared to log an error (as it currently does) and drop the packet.

Closes TUN-8667
2024-10-31 14:05:15 -07:00
Devin Carr abb3466c31 TUN-8638: Add datagram v3 serializers and deserializers
Closes TUN-8638
2024-10-16 12:05:55 -07:00
chungthuang 0b62d45738 TUN-8456: Update quic-go to 0.45 and collect mtu and congestion control metrics 2024-06-17 15:28:56 +00:00
chungthuang a16532dbbb TUN-8451: Log QUIC flow control frames and transport parameters received 2024-06-12 19:23:39 +00:00
Devin Carr eb2e4349e8 TUN-8415: Refactor capnp rpc into a single module
Combines the tunnelrpc and quic/schema capnp files into the same module.

To help reduce future issues with capnp id generation, capnpids are
provided in the capnp files from the existing capnp struct ids generated
in the go files.

Reduces the overall interface of the Capnp methods to the rest of
the code by providing an interface that will handle the quic protocol
selection.

Introduces a new `rpc-timeout` config that will allow all of the
SessionManager and ConfigurationManager RPC requests to have a timeout.
The timeout for these values is set to 5 seconds as none of these operations
for the managers should take a long time to complete.

Removed the RPC-specific logger as it never provided good debugging value
as the RPC method names were not visible in the logs.
2024-05-17 11:22:07 -07:00
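An RPC timeout like the one described in the commit above (eb2e4349e8) is commonly applied by deriving a context with a deadline for each request. The sketch below shows that pattern with the standard `context` package. The helper names are hypothetical, and `slowRPC` only simulates a request that never completes.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// defaultRPCTimeout mirrors the 5 second value mentioned in the commit.
const defaultRPCTimeout = 5 * time.Second

// callWithTimeout runs rpc with a context that expires after timeout.
func callWithTimeout(ctx context.Context, timeout time.Duration, rpc func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	return rpc(ctx)
}

// slowRPC simulates a request that hangs until its context ends.
func slowRPC(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// timedOut runs slowRPC with a short timeout so the example finishes quickly.
func timedOut() bool {
	err := callWithTimeout(context.Background(), 10*time.Millisecond, slowRPC)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	quick := func(context.Context) error { return nil }
	fmt.Println("quick call error:", callWithTimeout(context.Background(), defaultRPCTimeout, quick))
	fmt.Println("slow call timed out:", timedOut())
}
```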
João "Pisco" Fernandes da6fac4133 TUN-8297: Improve write timeout logging on safe_stream.go
## Summary:
In order to properly monitor what is happening with the new write timeouts that we introduced
in TUN-8244 we need proper logging. Until now we were logging write timeouts when the safe
stream was being closed, which was misleading, so this commit
prevents that by adding a flag that allows us to know whether we are closing the stream or not.
2024-03-13 13:30:45 +00:00
João "Pisco" Fernandes 4f7165530c TUN-8275: Skip write timeout log on "no network activity"
## Summary
To avoid overly verbose logs we should only log when an
actual issue occurred. Therefore, we will skip any error
logging if the write timeout is caused by no network activity,
which just means that nothing is being sent through the stream.
2024-03-06 16:05:48 +00:00
chungthuang 34a876e4e7 TUN-8243: Collect metrics on the number of QUIC frames sent/received
This commit also removed the server metrics that are no longer used
2024-02-19 10:09:14 +00:00
João "Pisco" Fernandes 76badfa01b TUN-8236: Add write timeout to quic and tcp connections
## Summary
To prevent bad eyeballs and servers from being able to exhaust the quic
control flows we are adding the possibility of having a timeout
for a write operation to be acknowledged. This will prevent hanging
connections from exhausting the quic control flows, creating a DDoS.
2024-02-15 17:54:52 +00:00
chungthuang 8e69f41833 TUN-7934: Update quic-go to a version that queues datagrams for better throughput and drops large datagram
Remove TestUnregisterUdpSession
2024-01-03 13:01:01 +00:00
Chung-Ting 8068cdebb6 TUN-8006: Update quic-go to latest upstream 2023-12-04 17:09:40 +00:00
Sudarsan Reddy 1abd22ef0a TUN-7480: Added a timeout for unregisterUDP.
I deliberately kept this as an unregistertimeout because that was the
intent. In the future we could change this to a UDPConnConfig if we want
to pass multiple values here.

The idea of this PR is simply to add a configurable unregister UDP
timeout.
2023-06-20 06:20:09 +00:00
João Oliveirinha 20e36c5bf3 TUN-7468: Increase the limit of incoming streams 2023-06-19 10:41:56 +00:00
Devin Carr 9426b60308 TUN-7227: Migrate to devincarr/quic-go
The lucas-clemente/quic-go package moved namespaces and our branch
went stale. This new fork provides support for the new quic-go repo
and applies the max datagram frame size change.

Until the max datagram frame size support gets upstreamed into quic-go,
this can be used to unblock go 1.20 support as the old
lucas-clemente/quic-go will not get go 1.20 support.
2023-05-10 19:44:15 +00:00
João Oliveirinha 0be1ed5284 TUN-7398: Add support for quic safe stream to set deadline 2023-04-27 19:49:56 +01:00
João Oliveirinha 7ef9bb89d3 TUN-7000: Reduce metric cardinality of closedConnections metric by removing error as tag 2022-12-07 11:09:16 +00:00
cthuang 225c344ceb TUN-6855: Add DatagramV2Type for IP packet with trace and tracing spans 2022-10-17 19:45:01 +01:00
cthuang be0305ec58 TUN-6741: ICMP proxy tries to listen on specific IPv4 & IPv6 when possible
If it cannot determine the correct interface IP, it will fall back to all interfaces.
This commit also introduces the icmpv4-src and icmpv6-src flags
2022-09-26 11:37:08 +01:00
Devin Carr f5f3e6a453 TUN-6689: Utilize new RegisterUDPSession to begin tracing 2022-09-13 14:56:08 +00:00
Devin Carr e380333520 TUN-6688: Update RegisterUdpSession capnproto to include trace context 2022-09-08 21:50:58 +00:00
Chung-Ting Huang 3e0ff3a771 TUN-6531: Implement ICMP proxy for Windows using IcmpSendEcho 2022-09-07 19:18:06 +00:00
cthuang faa86ffeca TUN-6737: Fix datagramV2Type should be declared in its own block so it starts at 0 2022-09-05 15:09:53 +01:00
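The title of the commit above (faa86ffeca) points at a common Go pitfall: `iota` counts constant specifications within a `const` block, so a typed enumeration that shares a block with other constants does not start at 0. The sketch below shows the effect with hypothetical names. The actual constants in cloudflared are not shown in this log.

```go
package main

import "fmt"

type datagramType byte

// Sharing a block: iota is already 1 by the time the enumeration begins.
const (
	headerLen              = 1    // iota is 0 on this line
	sharedUDP datagramType = iota // iota is 1 here
	sharedIP                      // 2
)

// Own block: the enumeration starts at 0 as intended.
const (
	ownUDP datagramType = iota // 0
	ownIP                      // 1
)

func main() {
	fmt.Println(sharedUDP, sharedIP, ownUDP, ownIP)
}
```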
Nuno Diegues 7ca5f7569a TUN-6726: Fix maxDatagramPayloadSize for Windows QUIC datagrams 2022-09-01 21:32:59 +00:00
João Oliveirinha e131125558 TUN-6699: Add metric for packet too big dropped 2022-08-26 16:02:43 +00:00
cthuang 59f5b0df83 TUN-6530: Implement ICMPv4 proxy
This proxy uses an unprivileged datagram-oriented endpoint and is shared by all quic connections
2022-08-24 17:33:03 +01:00
cthuang d2bc15e224 TUN-6667: DatagramMuxerV2 provides a method to receive RawPacket 2022-08-24 14:56:08 +01:00
cthuang bad2e8e812 TUN-6666: Define packet package
This package defines IP and ICMP packets, along with decoders, an encoder, and flows
2022-08-24 11:36:57 +01:00