Since legacy tunnels have been removed for a while now, we can remove
many of the capnp rpc interfaces that are no longer leveraged by the
legacy tunnel registration and authentication mechanisms.
## Summary
To prevent bad eyeballs and severs to be able to exhaust the quic
control flows we are adding the possibility of having a timeout
for a write operation to be acknowledged. This will prevent hanging
connections from exhausting the quic control flows, creating a DDoS.
With the new flag --management-diagnostics (an opt-in flag)
cloudflared's will be able to report additional diagnostic information
over the management.argotunnel.com request path.
Additions include the /metrics prometheus endpoint; which is already
bound to a local port via --metrics.
/debug/pprof/(goroutine|heap) are also provided to allow for remotely
retrieving heap information from a running cloudflared connector.
In the streambased origin proxy flow (example ssh over access), there is
a chance when we do not flush on http.ResponseWriter writes. This PR
guarantees that the response writer passed to proxy stream has a flusher
embedded after writes. This means we write much more often back to the
ResponseWriter and are not waiting. Note, this is only something we do
when proxyHTTP-ing to a StreamBasedOriginProxy because we do not want to
have situations where we are not sending information that is needed by
the other side (eyeball).
We need to set the default configuration to -1 to accommodate local
to remote configuration migrations that will set the configuration
version to 0. This make's sure to override the local configuration
with the new remote configuration when sent as it does a check against
the local current configuration version.
It might make sense for users to sometimes name their cloudflared
connectors to make identification easier than relying on hostnames that
TUN-7360 provides. This PR provides a new --label option to cloudflared
tunnel that a user could provide to give custom names to their
connectors.
With the management tunnels work, we allow calls to our edge service
using an access JWT provided by Tunnelstore. Given a connector ID,
this request is then proxied to the appropriate Cloudflare Tunnel.
This PR takes advantage of this flow and adds a new host_details
endpoint. Calls to this endpoint will result in cloudflared gathering
some details about the host: hostname (os.hostname()) and ip address
(localAddr in a dial).
Note that the mini spec lists 4 alternatives and this picks alternative
3 because:
1. Ease of implementation: This is quick and non-intrusive to any of our
code path. We expect to change how connection tracking works and
regardless of the direction we take, it may be easy to keep, morph
or throw this away.
2. The cloudflared part of this round trip takes some time with a
hostname call and a dial. But note that this is off the critical path
and not an API that will be exercised often.
I can only reproduce the flakiness, which is the hello world still
responding when it should be shut down already, in Windows (both in
TeamCity as well as my local VM). Locally, it only happens when the
machine is under high load.
Anyway, it's valid that the proxies take some time to shut down since
they handle that via channels asynchronously with regards to the event
that updates the configuration.
Hence, nothing is wrong, as long as they eventually shut down, which the
test still verifies.
For WARP routing the defaults for these new settings are 5 seconds for connect timeout and 30 seconds for keep-alive timeout. These values can be configured either remotely or locally. Local config lives under "warp-routing" section in config.yaml.
For websocket-based proxy, the defaults come from originConfig settings (either global or per-service) and use the same defaults as HTTP proxying.