Commit Graph

146 Commits (01185e436fd39c2aa499b3c56bcb08d6c4dc7b84)

Author SHA1 Message Date
Brad Fitzpatrick c76a6e5167 derp: track client-advertised non-ideal DERP connections in more places
In f77821fd63 (released in v1.72.0), we made the client tell a DERP server
when the connection was not its ideal choice (the first node in its region).

But we didn't do anything with that information until now. This adds a
metric about how many such connections are on a given derper, and also
adds a bit to the PeerPresentFlags bitmask so watchers can identify
(and rebalance) them.

Updates tailscale/corp#372

Change-Id: Ief8af448750aa6d598e5939a57c062f4e55962be
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 month ago
Jordan Whited bb60da2764
derp: add sclient write deadline timeout metric (#13831)
Write timeouts can be indicative of stalled TCP streams. Understanding
changes in the rate of such events can be helpful in an ops context.

Updates tailscale/corp#23668

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 month ago
Brad Fitzpatrick 18fc093c0d derp: give trusted mesh peers longer write timeouts
Updates tailscale/corp#24014

Change-Id: I700872be48ab337dce8e11cabef7f82b97f0422a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 month ago
Brad Fitzpatrick f3de4e96a8 derp: fix omitted word in comment
Fix comment just added in 38f236c725.

Updates tailscale/corp#23668
Updates #cleanup

Change-Id: Icbe112e24fcccf8c61c759c631ad09f3e5480547
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Brad Fitzpatrick 38f236c725 derp: add server metric for batch write sizes
Updates tailscale/corp#23668

Change-Id: Ie6268c4035a3b29fd53c072c5793e4cbba93d031
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Brad Fitzpatrick 8012bb4216 derp: refactor DERP server's peer-gone watch mechanism
In prep for upcoming flow tracking & mutex contention optimization
changes, this change refactors (subjectively simplifying) how the DERP
Server accounts for which peers have written to which other peers, to
be able to send PeerGoneReasonDisconnected messages to writes to
uncache their DRPO (DERP Return Path Optimization) routes.

Notably, this removes the Server.sentTo field which was guarded by
Server.mu and checked on all packet sends. Instead, the accounting is
moved to each sclient's sendLoop goroutine and now only needs to
acquire Server.mu for newly seen senders, the first time a peer sends
a packet to that sclient.

This change reduces the number of reasons to acquire Server.mu
per-packet from two to one. Removing the last one is the subject of an
upcoming change.

Updates #3560
Updates #150

Change-Id: Id226216d6629d61254b6bfd532887534ac38586c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Brad Fitzpatrick cec779e771 util/slicesx: add FirstElementEqual and LastElementEqual
And update a few callers as examples of motivation. (there are a
couple others, but these are the ones where it's prettier)

Updates #cleanup

Change-Id: Ic8c5cb7af0a59c6e790a599136b591ebe16d38eb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Brad Fitzpatrick 910462a8e0 derp: unify server's clientSet interface into concrete type
73280595a8 for #2751 added a "clientSet" interface to
distinguish the two cases of a client being singly connected (the
common case) vs tolerating multiple connections from the client at
once. At the time (three years ago) it was kinda an experiment
and we didn't know whether it'd stop the reconnect floods we saw
from certain clients. It did.

So this promotes it to a be first-class thing a bit, removing the
interface. The old tests from 73280595a were invaluable in ensuring
correctness while writing this change (they failed a bunch).

But the real motivation for this change is that it'll permit a future
optimization to add flow tracking for stats & performance where we
don't contend on Server.mu for each packet sent via DERP. Instead,
each client can track its active flows and hold on to a *clientSet and
ask the clientSet per packet what the active client is via one atomic
load rather than a mutex. And if the atomic load returns nil, we'll
know we need to ask the server to see if they died and reconnected and
got a new clientSet. But that's all coming later.

Updates #3560

Change-Id: I9ccda3e5381226563b5ec171ceeacf5c210e1faf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Brad Fitzpatrick 210264f942 cmd/derper: clarify that derper and tailscaled need to be in sync
Fixes #12617

Change-Id: Ifc87b7d9cf699635087afb57febd01fb9a6d11b7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
Brad Fitzpatrick d91e5c25ce derp: redo, simplify how mesh update writes are queued/written
I couldn't convince myself the old way was safe and couldn't lose
writes.

And it seemed too complicated.

Updates tailscale/corp#21104

Change-Id: I17ba7c7d6fd83458a311ac671146a1f6a458a5c1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
Brad Fitzpatrick ded7734c36 derp: account for increased size of peerPresent messages in mesh updates
sendMeshUpdates tries to write as much as possible without blocking,
being careful to check the bufio.Writer.Available size before writes.

Except that regressed in 6c791f7d60 which made those messages larger, which
meants we were doing network I/O with the Server mutex held.

Updates tailscale/corp#13945

Change-Id: Ic327071d2e37de262931b9b390cae32084811919
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
Brad Fitzpatrick 5ffb2668ef derp: add PeerPresentFlags bitmask to Watch messages
Updates tailscale/corp#17816

Change-Id: Ib5baf6c981a6a4c279f8bbfef02048cfbfb3323b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
Brad Fitzpatrick 25eeafde23 derp: don't verify mesh peers when --verify-clients is set
Updates tailscale/corp#20654

Change-Id: I33c7ca3c7a3c4e492797b73c66eefb699376402c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
Brad Fitzpatrick 4b39b6f7ce derp: fix fmt verb for nodekeys
It was hex-ifying the String() form of key.NodePublic, which was already hex.
I noticed in some logs:

    "client 6e6f64656b65793a353537353..."

And thought that 6x6x6x6x looked strange. It's "nodekey:" in hex.

Updates tailscale/corp#20844

Change-Id: Ib9f2d63b37e324420b86efaa680668a9b807e465
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
James Tucker 87c5ad4c2c derp: add a verifyClients check to the consistency check
Only implemented for the local tailscaled variant for now.

Updates tailscale/corp#20844

Signed-off-by: James Tucker <james@tailscale.com>
5 months ago
Brad Fitzpatrick 6908fb0de3 ipn/localapi,client/tailscale,cmd/derper: add WhoIs lookup by nodekey, use in derper
Fixes #12465

Change-Id: I9b7c87315a3d2b2ecae2b8db9e94b4f5a1eef74a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
Maisem Ali 4a8cb1d9f3 all: use math/rand/v2 more
Updates #11058

Signed-off-by: Maisem Ali <maisem@tailscale.com>
6 months ago
Brad Fitzpatrick f227083539 derp: add some guardrails for derpReason metrics getting out of sync
The derp metrics got out of sync in 74eb99aed1 (2023-03).

They were fixed in 0380cbc90d (2024-05).

This adds some further guardrails (atop the previous fix) to make sure
they don't get out of sync again.

Updates #12288

Change-Id: I809061a81f8ff92f45054d0253bc13871fc71634
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Spike Curtis 0380cbc90d
derp: fix dropReason metrics labels (#12288)
Updates #2745
Updates #7552

Signed-off-by: Spike Curtis <spike@coder.com>
6 months ago
Andrew Dunham c6d42b1093 derp: remove stats goroutine, use a timer
Without changing behaviour, don't create a goroutine per connection that
sits and sleeps, but rather use a timer that wakes up and gathers
statistics on a regular basis.

Fixes #12127

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibc486447e403070bdc3c2cd8ae340e7d02854f21
6 months ago
Maisem Ali a49ed2e145 derp,ipn/ipnlocal: stop calling rand.Seed
It's deprecated and using it gets us the old slow behavior
according to https://go.dev/blog/randv2.

> Having eliminated repeatability of the global output stream, Go 1.20
> was also able to make the global generator scale better in programs
> that don’t call rand.Seed, replacing the Go 1 generator with a very
> cheap per-thread wyrand generator already used inside the Go
> runtime. This removed the global mutex and made the top-level
> functions scale much better. Programs that do call rand.Seed fall
> back to the mutex-protected Go 1 generator.

Updates #7123

Change-Id: Ia5452e66bd16b5457d4b1c290a59294545e13291
Signed-off-by: Maisem Ali <maisem@tailscale.com>
7 months ago
Brad Fitzpatrick 10d130b845 cmd/derper, derp, tailcfg: add admission controller URL option
So derpers can check an external URL for whether to permit access
to a certain public key.

Updates tailscale/corp#17693

Change-Id: I8594de58f54a08be3e2dbef8bcd1ff9b728ab297
Co-authored-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
9 months ago
Brad Fitzpatrick 2988c1ec52 derp: plumb context to Server.verifyClient
Updates tailscale/corp#17693

Change-Id: If17e02c77d5ad86b820e639176da2d3e61296bae
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
9 months ago
James Tucker ee20327496 derp: remove unused per-client struct field
Updates #self

Signed-off-by: James Tucker <james@tailscale.com>
11 months ago
James Tucker a7f65b40c5 derp: optimize field order to reduce GC cost
See the field alignment lints for more information.
Reductions are 64->24 and 64->32 respectively.

Updates #self

Signed-off-by: James Tucker <james@tailscale.com>
11 months ago
James Tucker ff9c1ebb4a derp: reduce excess goroutines blocking on broadcasts
Observed on one busy derp node, there were 600 goroutines blocked
writing to this channel, which represents not only more blocked routines
than we need, but also excess wake-ups downstream as the latent
goroutines writes represent no new work.

Updates #self

Signed-off-by: James Tucker <james@tailscale.com>
11 months ago
Andrew Lytvynov 2716250ee8
all: cleanup unused code, part 2 (#10670)
And enable U1000 check in staticcheck.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
11 months ago
Brad Fitzpatrick 6c791f7d60 derp: include src IPs in mesh watch messages
Updates tailscale/corp#13945

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Claire Wang 90a7d3066c
derp: use tstime (#8634)
Updates #8587

Signed-off-by: Claire Wang <claire@tailscale.com>
1 year ago
valscale 5ae786988c
derp: remove default logging of disconnecting clients (#8163)
~97% of the log messages derper outputs are related to the normal
non-error state of a client disconnecting in some manner. Add a
verbose logging feature that only logs these messages when enabled.

Fixes #8024

Signed-off-by: Val <valerie@tailscale.com>
2 years ago
valscale 74eb99aed1
derp, derphttp, magicsock: send new unknown peer frame when destination is unknown (#7552)
* wgengine/magicsock: add envknob to send CallMeMaybe to non-existent peer

For testing older client version responses to the PeerGone packet format change.

Updates #4326

Signed-off-by: Val <valerie@tailscale.com>

* derp: remove dead sclient struct member replaceLimiter

Leftover from an previous solution to the duplicate client problem.

Updates #2751

Signed-off-by: Val <valerie@tailscale.com>

* derp, derp/derphttp, wgengine/magicsock: add new PeerGone message type Not Here

Extend the PeerGone message type by adding a reason byte. Send a
PeerGone "Not Here" message when an endpoint sends a disco message to
a peer that this server has no record of.

Fixes #4326

Signed-off-by: Val <valerie@tailscale.com>

---------

Signed-off-by: Val <valerie@tailscale.com>
2 years ago
Anton Tolchanov 654b5a0616 derp: add optional debug logging for prober clients
This allows tracking packet flow via logs for prober clients. Note that
the new sclient.debug() function is called on every received packet, but
will do nothing for most clients.

I have adjusted sclient logging to print public keys in short format
rather than full. This takes effect even for existing non-debug logging
(mostly client disconnect messages).

Example logs for a packet being sent from client [SbsJn] (connected to
derper [dM2E3]) to client [10WOo] (connected to derper [AVxvv]):

```
derper [dM2E3]:
derp client 10.0.0.1:35470[SbsJn]: register single client mesh("10.0.1.1"): 4 peers
derp client 10.0.0.1:35470[SbsJn]: read frame type 4 len 40 err <nil>
derp client 10.0.0.1:35470[SbsJn]: SendPacket for [10WOo], forwarding via <derphttp_client.Client [AVxvv] url=https://10.0.1.1/derp>: <nil>
derp client 10.0.0.1:35470[SbsJn]: read frame type 0 len 0 err EOF
derp client 10.0.0.1:35470[SbsJn]: read EOF
derp client 10.0.0.1:35470[SbsJn]: sender failed: context canceled
derp client 10.0.0.1:35470[SbsJn]: removing connection

derper [AVxvv]:
derp client 10.0.1.1:50650[10WOo]: register single client
derp client 10.0.1.1:50650[10WOo]: received forwarded packet from [SbsJn] via [dM2E3]
derp client 10.0.1.1:50650[10WOo]: sendPkt attempt 0 enqueued
derp client 10.0.1.1:50650[10WOo]: sendPacket from [SbsJn]: <nil>
derp client 10.0.1.1:50650[10WOo]: read frame type 0 len 0 err EOF
derp client 10.0.1.1:50650[10WOo]: read EOF
derp client 10.0.1.1:50650[10WOo]: sender failed: context canceled
derp client 10.0.1.1:50650[10WOo]: removing connection
```

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2 years ago
David Anderson 8b2ae47c31 version: unexport all vars, turn Short/Long into funcs
The other formerly exported values aren't used outside the package,
so just unexport them.

Signed-off-by: David Anderson <danderson@tailscale.com>
2 years ago
Will Norris 947c14793a all: update tools that manage copyright headers
Update all code generation tools, and those that check for license
headers to use the new standard header.

Also update copyright statement in LICENSE file.

Fixes #6865

Signed-off-by: Will Norris <will@tailscale.com>
2 years ago
Will Norris 71029cea2d all: update copyright and license headers
This updates all source files to use a new standard header for copyright
and license declaration.  Notably, copyright no longer includes a date,
and we now use the standard SPDX-License-Identifier header.

This commit was done almost entirely mechanically with perl, and then
some minimal manual fixes.

Updates #6865

Signed-off-by: Will Norris <will@tailscale.com>
2 years ago
Anton Tolchanov 6cc6c70d70 derp: prevent concurrent access to multiForwarder map
Instead of iterating over the map to determine the preferred forwarder
on every packet (which could happen concurrently with map mutations),
store it separately in an atomic variable.

Fixes #6445

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2 years ago
Maisem Ali 1f4669a380 all: standardize on LocalAPI
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick f4ff26f577 types/pad32: delete package
Use Go 1.19's new 64-bit alignment ~hidden feature instead.

Fixes #5356

Change-Id: Ifcbcb115875a7da01df3bc29e9e7feadce5bc956
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham 64ea60aaa3
derp: add TCP RTT metric on Linux (#5949)
Periodically poll the TCP RTT metric from all open TCP connections and
update a (bucketed) histogram metric.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I6214902196b05bf7829c9d0ea501ce0e13d984cf
2 years ago
Eng Zer Jun f0347e841f refactor: move from io/ioutil to io and os packages
The io/ioutil package has been deprecated as of Go 1.16 [1]. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.

Reference: https://golang.org/doc/go1.16#ioutil
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2 years ago
Brad Fitzpatrick 74674b110d envknob: support changing envknobs post-init
Updates #5114

Change-Id: Ia423fc7486e1b3f3180a26308278be0086fae49b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 4950fe60bd syncs, all: move to using Go's new atomic types instead of ours
Fixes #5185

Change-Id: I850dd532559af78c3895e2924f8237ccc328449d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick a12aad6b47 all: convert more code to use net/netip directly
perl -i -npe 's,netaddr.IPPrefixFrom,netip.PrefixFrom,' $(git grep -l -F netaddr.)
    perl -i -npe 's,netaddr.IPPortFrom,netip.AddrPortFrom,' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IPPrefix,netip.Prefix,g' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IPPort,netip.AddrPort,g' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IP\b,netip.Addr,g' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IPv6Raw\b,netip.AddrFrom16,g' $(git grep -l -F netaddr. )
    goimports -w .

Then delete some stuff from the net/netaddr shim package which is no
longer neeed.

Updates #5162

Change-Id: Ia7a86893fe21c7e3ee1ec823e8aba288d4566cd8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 6a396731eb all: use various net/netip parse funcs directly
Mechanical change with perl+goimports.

Changed {Must,}Parse{IP,IPPrefix,IPPort} to their netip variants, then
goimports -d .

Finally, removed the net/netaddr wrappers, to prevent future use.

Updates #5162

Change-Id: I59c0e38b5fbca5a935d701645789cddf3d7863ad
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 7eaf5e509f net/netaddr: start migrating to net/netip via new netaddr adapter package
Updates #5162

Change-Id: Id7bdec303b25471f69d542f8ce43805328d56c12
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Charlotte Brandhorst-Satzkorn 4c0feba38e
derp: plumb '/derp' request context through (#5083)
This change is required to implement tracing for derp.

Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
2 years ago
Maisem Ali 9f604f2bd3
derp: add (*Server).IsClientConnectedForTest func. (#4331)
This allows tests to verfiy that a DERP connection was actually
established.

Related to #4326
Updates tailscale/corp#2579

Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Brad Fitzpatrick 18818763d1 derp: set Basic Constraints on metacert
See https://github.com/golang/go/issues/51759#issuecomment-1071147836

Once we deploy this, tailscaled should work again for macOS users with
Go 1.18.

Updates golang/go#51759

Change-Id: I869b6ddc556a2de885e96ccf9f335dfc8f6f6a7e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder 0868329936 all: use any instead of interface{}
My favorite part of generics.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 41fd4eab5c envknob: add new package for all the strconv.ParseBool(os.Getenv(..))
A new package can also later record/report which knobs are checked and
set. It also makes the code cleaner & easier to grep for env knobs.

Change-Id: Id8a123ab7539f1fadbd27e0cbeac79c2e4f09751
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago