## Summary
This PR will add a new endpoint, "diag/system" to the metrics server that collects system information from different operating systems.
Closes TUN-8731
## Summary
Update how metrics server binds to a listener by using a known set of ports whenever the default address is used with the fallback to a random port in case all address are already in use. The default address changes at compile time in order to bind to a different default address when the final deliverable is a docker image.
Refactor ReadyServer tests.
Closes TUN-8737
Since legacy tunnels have been removed for a while now, we can remove
many of the capnp rpc interfaces that are no longer leveraged by the
legacy tunnel registration and authentication mechanisms.
Combines the tunnelrpc and quic/schema capnp files into the same module.
To help reduce future issues with capnp id generation, capnpids are
provided in the capnp files from the existing capnp struct ids generated
in the go files.
Reduces the overall interface of the Capnp methods to the rest of
the code by providing an interface that will handle the quic protocol
selection.
Introduces a new `rpc-timeout` config that will allow all of the
SessionManager and ConfigurationManager RPC requests to have a timeout.
The timeout for these values is set to 5 seconds as non of these operations
for the managers should take a long time to complete.
Removed the RPC-specific logger as it never provided good debugging value
as the RPC method names were not visible in the logs.
This commit makes the remote diagnostics enabled by default, which is
a useful feature when debugging cloudflared issues without manual intervention from users.
Users can still opt-out by disabling the feature flag.
## Summary
To prevent bad eyeballs and severs to be able to exhaust the quic
control flows we are adding the possibility of having a timeout
for a write operation to be acknowledged. This will prevent hanging
connections from exhausting the quic control flows, creating a DDoS.
When embedding the tunnel command inside another CLI, it
became difficult to test shutdown behavior due to this leaking
tunnel. By using the command context, we're able to shutdown
gracefully.
This commit implements the option to disable PTMU discovery for QUIC
connections.
QUIC finds the PMTU during startup by increasing Ping packet frames
until Ping responses are not received anymore, and it seems to stick
with that PMTU forever.
This is no problem if the PTMU doesn't change over time, but if it does
it may case packet drops.
We add this hidden flag for debugging purposes in such situations as a
quick way to validate if problems that are being seen can be solved by
reducing the packet size to the edge.
Note however, that this option may impact UDP proxying since we expect
being able to send UDP packets of 1280 bytes over QUIC.
So, this option should not be used when tunnel is being used for UDP
proxying.
With the new flag --management-diagnostics (an opt-in flag)
cloudflared's will be able to report additional diagnostic information
over the management.argotunnel.com request path.
Additions include the /metrics prometheus endpoint; which is already
bound to a local port via --metrics.
/debug/pprof/(goroutine|heap) are also provided to allow for remotely
retrieving heap information from a running cloudflared connector.
I deliberately kept this as an unregistertimeout because that was the
intent. In the future we could change this to a UDPConnConfig if we want
to pass multiple values here.
The idea of this PR is simply to add a configurable unregister UDP
timeout.
It might make sense for users to sometimes name their cloudflared
connectors to make identification easier than relying on hostnames that
TUN-7360 provides. This PR provides a new --label option to cloudflared
tunnel that a user could provide to give custom names to their
connectors.
With the management tunnels work, we allow calls to our edge service
using an access JWT provided by Tunnelstore. Given a connector ID,
this request is then proxied to the appropriate Cloudflare Tunnel.
This PR takes advantage of this flow and adds a new host_details
endpoint. Calls to this endpoint will result in cloudflared gathering
some details about the host: hostname (os.hostname()) and ip address
(localAddr in a dial).
Note that the mini spec lists 4 alternatives and this picks alternative
3 because:
1. Ease of implementation: This is quick and non-intrusive to any of our
code path. We expect to change how connection tracking works and
regardless of the direction we take, it may be easy to keep, morph
or throw this away.
2. The cloudflared part of this round trip takes some time with a
hostname call and a dial. But note that this is off the critical path
and not an API that will be exercised often.
cloudflared tail will now fetch the management token from by making
a request to the Cloudflare API using the cert.pem (acquired from
cloudflared login).
Refactored some of the credentials code into it's own package as
to allow for easier use between subcommands outside of
`cloudflared tunnel`.
This PR starts a separate server for proxy-dns if the configuration is
available. This fixes a problem on cloudflared not starting in proxy-dns
mode if the url flag (which isn't necessary for proxy-dns) is not
provided. Note: This is still being supported for legacy reasons and
since proxy-dns is not a tunnel and should not be part of the
cloudflared tunnel group of commands.
This PR does two things:
It changes how we fallback to a lower protocol: The current state
is to try connecting with a protocol. If it fails, fall back to a
lower protocol. And try connecting with that and so on. With this PR,
if we fail to connect with a protocol, we will try to connect to other
edge addresses first. Only if we fail to connect to those will we
fall back to a lower protocol.
It fixes a behaviour where if we fail to connect to an edge addr,
we keep re-trying the same address over and over again.
This PR now switches between edge addresses on subsequent connecton attempts.
Note that through these switches, it still respects the backoff time.
(We are connecting to a different edge, but this helps to not bombard an edge
address with connect requests if a particular edge addresses stops working).
Before this change when running cloudflare tunnel command without any
subcommand and without any additional flag, we would spin up a
QuickTunnel.
This is really a strange behaviour because we can easily create unwanted
tunnels and results in bad user experience.
This also has the side effect on putting more burden in our services
that are probably just mistakes.
This commit fixes that by requiring user to specify the url command
flag.
Running cloudflared tunnel alone will result in an error message
instead.
cloudflared shows possible directories for config files to be present if
it doesn't see one when starting up. For remotely configured files, it
may not be necessary to have a config file present. This PR looks to see
if a token flag was provided, and if yes, does not log this message.
Previously allowing the reconnect signal forcibly close the connection
caused a race condition on which error was returned by the errgroup
in the tunnel connection. Allowing the signal to return and provide
a context cancel to the connection provides a safer shutdown of the
tunnel for this test-only scenario.