Commit Graph

879 Commits (alexc/tailscale-lock-status-json)

Author SHA1 Message Date
Claus Lensbøl c54d243690
net/tstun: add TSMPDiscoAdvertisement to TSMPPing (#17995)
Adds a new types of TSMP messages for advertising disco keys keys
to/from a peer, and implements the advertising triggered by a TSMP ping.

Needed as part of the effort to cache the netmap and still let clients
connect without control being reachable.

Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Co-authored-by: James Tucker <james@tailscale.com>
1 week ago
James Tucker c09c95ef67 types/key,wgengine/magicsock,control/controlclient,ipn: add debug disco key rotation
Adds the ability to rotate discovery keys on running clients, needed for
testing upcoming disco key distribution changes.

Introduces key.DiscoKey, an atomic container for a disco private key,
public key, and the public key's ShortString, replacing the prior
separate atomic fields.

magicsock.Conn has a new RotateDiscoKey method, and access to this is
provided via localapi and a CLI debug command.

Note that this implementation is primarily for testing as it stands, and
regular use should likely introduce an additional mechanism that allows
the old key to be used for some time, to provide a seamless key rotation
rather than one that invalidates all sessions.

Updates tailscale/corp#34037

Signed-off-by: James Tucker <james@tailscale.com>
2 weeks ago
Brad Fitzpatrick bd29b189fe types/netmap,*: remove some redundant fields from NetMap
Updates #12639

Change-Id: Ia50b15529bd1c002cdd2c937cdfbe69c06fa2dc8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 weeks ago
Alex Chan c2e474e729 all: rename variables with lowercase-l/uppercase-I
See http://go/no-ell

Signed-off-by: Alex Chan <alexc@tailscale.com>

Updates #cleanup

Change-Id: I8c976b51ce7a60f06315048b1920516129cc1d5d
2 weeks ago
Andrew Lytvynov d01081683c
go.mod: bump golang.org/x/crypto (#17907)
Pick up a fix for https://pkg.go.dev/vuln/GO-2025-4116 (even though
we're not affected).

Updates #cleanup

Change-Id: I9f2571b17c1f14db58ece8a5a34785805217d9dd

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2 weeks ago
Alex Chan 200383dce5 various: add more missing apostrophes in comments
Updates #cleanup

Change-Id: I79a0fda9783064a226ee9bcee2c1148212f6df7b
Signed-off-by: Alex Chan <alexc@tailscale.com>
2 weeks ago
Brad Fitzpatrick 99b06eac49 syncs: add Mutex/RWMutex alias/wrappers for future mutex debugging
Updates #17852

Change-Id: I477340fb8e40686870e981ade11cd61597c34a20
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 weeks ago
Brad Fitzpatrick 653d0738f9 types/netmap: remove PrivateKey from NetworkMap
It's an unnecessary nuisance having it. We go out of our way to redact
it in so many places when we don't even need it there anyway.

Updates #12639

Change-Id: I5fc72e19e9cf36caeb42cf80ba430873f67167c3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 weeks ago
Brad Fitzpatrick f387b1010e wgengine/wgcfg: remove two unused Config fields
They distracted me in some refactoring. They're set but never used.

Updates #17858

Change-Id: I6ec7d6841ab684a55bccca7b7cbf7da9c782694f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Brad Fitzpatrick 42ce5c88be wgengine/magicsock: unblock Conn.Synchronize on Conn.Close
I noticed a deadlock in a test in a in-development PR where during a
shutdown storm of things (from a tsnet.Server.Close), LocalBackend was
trying to call magicsock.Conn.Synchronize but the magicsock and/or
eventbus was already shut down and no longer processing events.

Updates #16369

Change-Id: I58b1f86c8959303c3fb46e2e3b7f38f6385036f1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Jordan Whited 2ad2d4d409
wgengine/magicsock: fix UDPRelayAllocReq/Resp deadlock (#17831)
Updates #17830

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
Jordan Whited 18806de400
wgengine/magicsock: validate endpoint.derpAddr in Conn.onUDPRelayAllocResp (#17828)
Otherwise a zero value will panic in Conn.sendUDPStd.

Updates #17827

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
Jordan Whited e059382174
wgengine/magicsock: clean up determineEndpoints docs (#17822)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
Brad Fitzpatrick edb11e0e60 wgengine/magicsock: fix js/wasm crash regression loading non-existent portmapper
Thanks for the report, @Need-an-AwP!

Fixes #17681
Updates #9394

Change-Id: I2e0b722ef9b460bd7e79499192d1a315504ca84c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 month ago
Alex Chan 8d119f62ee wgengine/magicsock: minor tidies in Test_endpoint_maybeProbeUDPLifetimeLocked
* Remove a couple of single-letter `l` variables
* Use named struct parameters in the test cases for readability
* Delete `wantAfterInactivityForFn` parameter when it returns the
  default zero

Updates #cleanup

Signed-off-by: Alex Chan <alexc@tailscale.com>
2 months ago
Joe Tsai e804b64358
wgengine/netlog: merge connstats into package (#17557)
Merge the connstats package into the netlog package
and unexport all of its declarations.

Remove the buildfeatures.HasConnStats and use HasNetLog instead.

Updates tailscale/corp#33352

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 months ago
Joe Tsai e75f13bd93
net/connstats: prepare to remove package (#17554)
The connstats package was an unnecessary layer of indirection.
It was seperated out of wgengine/netlog so that net/tstun and
wgengine/magicsock wouldn't need a depenedency on the concrete
implementation of network flow logging.

Instead, we simply register a callback for counting connections.
This PR does the bare minimum work to prepare tstun and magicsock
to only care about that callback.

A future PR will delete connstats and merge it into netlog.

Updates tailscale/corp#33352

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 months ago
Jordan Whited af15ee9c5f
wgengine/magicsock: add clientmetrics for TX bytes/packets by af & conn type (#17515)
Updates tailscale/corp#33206

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2 months ago
M. J. Fromberger 154d36f73d
wgengine/magicsock: do not apply node view updates to a closed Conn (#17517)
Fixes #17516

Change-Id: Iae2dab42d6f7bc618478d360a1005537c1fa1bbd
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2 months ago
Jordan Whited 16a05c7680
wgengine/magicsock: fix docs for send clientmetrics (#17514)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2 months ago
Jordan Whited adf308a064
wgengine/magicsock: add clientmetrics for RX bytes by af & conn type (#17512)
Updates tailscale/corp#33206

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2 months ago
Jordan Whited d72370a6eb
wgengine/magicsock: remove unused arg in deregisterMetrics (#17513)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2 months ago
Jordan Whited 4543ea5c8a
wgengine/magicsock: start peer relay path discovery sooner (#17485)
This commit also shuffles the hasPeerRelayServers atomic load
to happen sooner, reducing the cost for clients with no peer relay
servers.

Updates tailscale/corp#33099

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2 months ago
M. J. Fromberger 241ea1c98b wgengine/magicsock: use eventbus.SubscribeFunc in Conn
Updates #15160
Updates #17487

Change-Id: Ic9eb8d82b21d9dc38cb3c681b87101dfbc95af16
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2 months ago
Claus Lensbøl 63f7a400a8
wgengine/{magicsock,userspace,router}: move portupdates to the eventbus (#17423)
Also pull out interface method only needed in Linux.

Instead of having userspace do the call into the router, just let the
router pick up the change itself.

Updates #15160

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2 months ago
Brad Fitzpatrick 059f53e67a feature/condlite/expvar: add expvar stub package when metrics not needed
Saves ~53 KB from the min build.

Updates #12614

Change-Id: I73f9544a9feea06027c6ebdd222d712ada851299
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Jordan Whited 192f8d2804
wgengine/magicsock: add more handleNewServerEndpointRunLoop tests (#17469)
Updates tailscale/corp#32978

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2 months ago
Brad Fitzpatrick cf520a3371 feature/featuretags: add LazyWG modular feature
Due to iOS memory limitations in 2020 (see
https://tailscale.com/blog/go-linker, etc) and wireguard-go using
multiple goroutines per peer, commit 16a9cfe2f4 introduced some
convoluted pathsways through Tailscale to look at packets before
they're delivered to wireguard-go and lazily reconfigure wireguard on
the fly before delivering a packet, only telling wireguard about peers
that are active.

We eventually want to remove that code and integrate wireguard-go's
configuration with Tailscale's existing netmap tracking.

To make it easier to find that code later, this makes it modular. It
saves 12 KB (of disk) to turn it off (at the expense of lots of RAM),
but that's not really the point. The point is rather making it obvious
(via the new constants) where this code even is.

Updates #12614

Change-Id: I113b040f3e35f7d861c457eaa710d35f47cee1cb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Jordan Whited e44e28efcd
wgengine/magicsock: fix relayManager deadlock (#17449)
Updates tailscale/corp#32978

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2 months ago
Jordan Whited 3aa8b6d683
wgengine/magicsock: remove misleading unexpected log message (#17445)
Switching to a Geneve-encapsulated (peer relay) path in
endpoint.handlePongConnLocked is expected around port rebinds, which end
up clearing endpoint.bestAddr.

Fixes tailscale/corp#33036

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2 months ago
Brad Fitzpatrick 3c7e351671 net/connstats: make it modular (omittable)
Saves only 12 KB, but notably removes some deps on packages that future
changes can then eliminate entirely.

Updates #12614

Change-Id: Ibf830d3ee08f621d0a2011b1d4cd175427ef50df
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Brad Fitzpatrick 447cbdd1d0 health: make it omittable
Saves 86 KB.

And stop depending on expvar and usermetrics when disabled,
in prep to removing all the expvar/metrics/tsweb stuff.

Updates #12614

Change-Id: I35d2479ddd1d39b615bab32b1fa940ae8cbf9b11
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Brad Fitzpatrick c45f8813b4 feature/featuretags, all: add build features, use existing ones in more places
Saves 270 KB.

Updates #12614

Change-Id: I4c3fe06d32c49edb3a4bb0758a8617d83f291cf5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Brad Fitzpatrick ee034d48fc feature/featuretags: add a catch-all "Debug" feature flag
Saves 168 KB.

Updates #12614

Change-Id: Iaab3ae3efc6ddc7da39629ef13e5ec44976952ba
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Brad Fitzpatrick 01e645fae1 util/backoff: rename logtail/backoff package to util/backoff
It has nothing to do with logtail and is confusing named like that.

Updates #cleanup
Updates #17323

Change-Id: Idd34587ba186a2416725f72ffc4c5778b0b9db4a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Alex Chan 002ecb78d0 all: don't rebind variables in for loops
See https://tip.golang.org/wiki/LoopvarExperiment#does-this-mean-i-dont-have-to-write-x--x-in-my-loops-anymore

Updates https://github.com/tailscale/tailscale/issues/11058

Signed-off-by: Alex Chan <alexc@tailscale.com>
2 months ago
James Tucker 8b3e88cd09
wgengine/magicsock: fix rebind debouncing (#17282)
On platforms that are causing EPIPE at a high frequency this is
resulting in non-working connections, for example when Apple decides to
forcefully close UDP sockets due to an unsoliced packet rejection in the
firewall.

Too frequent rebinds cause a failure to solicit the endpoints triggering
the rebinds, that would normally happen via CallMeMaybe.

Updates #14551
Updates tailscale/corp#25648

Signed-off-by: James Tucker <james@tailscale.com>
2 months ago
Simon Law 34242df51b
derp/derpserver: clean up extraction of derp.Server (#17264)
PR #17258 extracted `derp.Server` into `derp/derpserver.Server`.

This followup patch adds the following cleanups:
1. Rename `derp_server*.go` files to `derpserver*.go` to match
   the package name.
2. Rename the `derpserver.NewServer` constructor to `derpserver.New`
   to reduce stuttering.
3. Remove the unnecessary `derpserver.Conn` type alias.

Updates #17257
Updates #cleanup

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2 months ago
Brad Fitzpatrick 21dc5f4e21 derp/derpserver: split off derp.Server out of derp into its own package
This exports a number of things from the derp (generic + client) package
to be used by the new derpserver package, as now used by cmd/derper.

And then enough other misc changes to lock in that cmd/tailscaled can
be configured to not bring in tailscale.com/client/local. (The webclient
in particular, even when disabled, was bringing it in, so that's now fixed)

Fixes #17257

Change-Id: I88b6c7958643fb54f386dd900bddf73d2d4d96d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Brad Fitzpatrick b54cdf9f38 all: use buildfeatures.HasCapture const in a handful of places
Help out the linker's dead code elimination.

Updates #12614

Change-Id: I6c13cb44d3250bf1e3a01ad393c637da4613affb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Jonathan Nobels 4af15a1148
magicsock: fix deadlock in SetStaticEndpoints (#17247)
updates tailscale/corp#32600

A localAPI/cli call to reload-config can end up leaving magicsock's mutex
locked.   We were missing an unlock for the early exit where there's no change in
the static endpoints when the disk-based config is loaded.  This is not likely
the root cause of the linked issue - just noted during investigation.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2 months ago
M. J. Fromberger 2b6bc11586
wgengine: use eventbus.Client.Monitor to simplify subscriber maintenance (#17203)
This commit does not change the order or meaning of any eventbus activity, it
only updates the way the plumbing is set up.

Updates #15160

Change-Id: I40c23b183c2a6a6ea3feec7767c8e5417019fc07
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
3 months ago
Brad Fitzpatrick 99b3f69126 feature/portmapper: make the portmapper & its debugging tools modular
Starting at a minimal binary and adding one feature back...
    tailscaled tailscale combined (linux/amd64)
     30073135  17451704  31543692 omitting everything
    +  480302 +   10258 +  493896 .. add debugportmapper
    +  475317 +  151943 +  467660 .. add portmapper
    +  500086 +  162873 +  510511 .. add portmapper+debugportmapper

Fixes #17148

Change-Id: I90bd0e9d1bd8cbe64fa2e885e9afef8fb5ee74b1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 months ago
M. J. Fromberger 8608e42103
feature,ipn/ipnlocal,wgengine: improve how eventbus shutdown is handled (#17156)
Instead of waiting for a designated subscription to close as a canary for the
bus being stopped, use the bus Client's own signal for closure added in #17118.

Updates #cleanup

Change-Id: I384ea39f3f1f6a030a6282356f7b5bdcdf8d7102
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
3 months ago
Claus Lensbøl 2015ce4081
health,ipn/ipnlocal: introduce eventbus in heath.Tracker (#17085)
The Tracker was using direct callbacks to ipnlocal. This PR moves those
to be triggered via the eventbus.

Additionally, the eventbus is now closed on exit from tailscaled
explicitly, and health is now a SubSystem in tsd.

Updates #15160

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
3 months ago
Brad Fitzpatrick 8b48f3847d net/netmon, wgengine/magicsock: simplify LinkChangeLogLimiter signature
Remove the need for the caller to hold on to and call an unregister
function. Both two callers (one real, one test) already have a context
they can use. Use context.AfterFunc instead. There are no observable
side effects from scheduling too late if the goroutine doesn't run sync.

Updates #17148

Change-Id: Ie697dae0e797494fa8ef27fbafa193bfe5ceb307
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 months ago
Alex Chan 5c24f0ed80 wgengine/magicsock: send a valid payload in TestNetworkDownSendErrors
This test ostensibly checks whether we record an error metric if a packet
is dropped because the network is down, but the network connectivity is
irrelevant -- the send error is actually because the arguments to Send()
are invalid:

    RebindingUDPConn.WriteWireGuardBatchTo:
    [unexpected] offset (0) != Geneve header length (8)

This patch changes the test so we try to send a valid packet, and we
verify this by sending it once before taking the network down.  The new
error is:

    magicsock: network down

which is what we're trying to test.

We then test sending an invalid payload as a separate test case.

Updates tailscale/corp#22075

Signed-off-by: Alex Chan <alexc@tailscale.com>
3 months ago
Jordan Whited 998a667cd5
wgengine/magicsock: don't add DERP addrs to endpointState (#17147)
endpointState is used for tracking UDP direct connection candidate
addresses. If it contains a DERP addr, then direct connection path
discovery will always send a wasteful disco ping over it. Additionally,
CLI "tailscale ping" via peer relay will race over DERP, leading to a
misleading result if pong arrives via DERP first.

Disco pongs arriving via DERP never influence path selection. Disco
ping/pong via DERP only serves "tailscale ping" reporting.

Updates #17121

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 months ago
Jordan Whited fb9d9ba86e
wgengine/magicsock: add TS_DEBUG_NEVER_DIRECT_UDP debug knob (#17094)
Updates tailscale/corp#30903

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 months ago
Jordan Whited 6feb6f3c75
wgengine/magicsock: add relayManager event logs (#17091)
These are gated behind magicsock component debug logging.

Updates tailscale/corp#30818

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 months ago