Commit Graph

9963 Commits (cc5bc518746bb84ad724932dbe04032003a8e71f)
 

Author SHA1 Message Date
Andrew Dunham 208a32af5b logpolicy: fix nil pointer dereference with invalid TS_LOG_TARGET
When TS_LOG_TARGET is set to an invalid URL, url.Parse returns an error
and nil pointer, which caused a panic when accessing u.Host.

Now we check the error from url.Parse and log a helpful message while
falling back to the default log host.

Fixes #17792

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
3 weeks ago
Brad Fitzpatrick 052602752f control/controlclient: make Observer optional
As a baby step towards eventbus-ifying controlclient, make the
Observer optional.

This also means callers that don't care (like this network lock test,
and some tests in other repos) can omit it, rather than passing in a
no-op one.

Updates #12639

Change-Id: Ibd776b45b4425c08db19405bc3172b238e87da4e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Jordan Whited 0285e1d5fb
feature/relayserver: fix Shutdown() deadlock (#17898)
Updates #17894

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
James 'zofrex' Sanderson 124301fbb6
ipn/ipnlocal: log prefs changes and reason in Start (#17876)
Updates tailscale/corp#34238

Signed-off-by: James Sanderson <jsanderson@tailscale.com>
3 weeks ago
Alex Chan b5cd29932e tka: add a test for unmarshaling existing AUMs
Updates https://github.com/tailscale/tailscale/issues/17613

Change-Id: I693a580949eef59263353af6e7e03a7af9bbaa0b
Signed-off-by: Alex Chan <alexc@tailscale.com>
3 weeks ago
Jordan Whited 9e4d1fd87f
feature/relayserver,ipn/ipnlocal,net/udprelay: plumb DERPMap (#17881)
This commit replaces usage of local.Client in net/udprelay with DERPMap
plumbing over the eventbus. This has been a longstanding TODO. This work
was also accelerated by a memory leak in net/http when using
local.Client over long periods of time. So, this commit also addresses
said leak.

Updates #17801

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
Brad Fitzpatrick 146ea42822 ipn/ipnlocal: remove all the weird locking (LockedOnEntry, UnlockEarly, etc)
Fixes #11649
Updates #16369

Co-authored-by: James Sanderson <jsanderson@tailscale.com>
Change-Id: I63eaa18fe870ddf81d84b949efac4d1b44c3db86
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Andrew Dunham 08e74effc0 cmd/cloner: support cloning arbitrarily-nested maps
Fixes #17870

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
3 weeks ago
Naman Sood ca9b68aafd
cmd/tailscale/cli: remove service flag from funnel command (#17850)
Fixes #17849.

Signed-off-by: Naman Sood <mail@nsood.in>
3 weeks ago
Andrew Dunham 6ac80b7334 cmd/{cloner,viewer}: handle maps of views
Instead of trying to call View() on something that's already a View
type (or trying to Clone the view unnecessarily), we can re-use the
existing View values in a map[T]ViewType.

Fixes #17866

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
3 weeks ago
Jordan Whited f4f9dd7f8c
net/udprelay: replace VNI pool with selection algorithm (#17868)
This reduces memory usage when tailscaled is acting as a peer relay.

Updates #17801

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
License Updater 31fe75ad9e licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
3 weeks ago
Fran Bull 37aa7e6935 util/dnsname: fix test error message
Updates #17788

Signed-off-by: Fran Bull <fran@tailscale.com>
3 weeks ago
Brad Fitzpatrick f387b1010e wgengine/wgcfg: remove two unused Config fields
They distracted me in some refactoring. They're set but never used.

Updates #17858

Change-Id: I6ec7d6841ab684a55bccca7b7cbf7da9c782694f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Fran Bull 27a0168cdc util/dnsname: increase maxNameLength to account for trailing dot
Fixes #17788

Signed-off-by: Fran Bull <fran@tailscale.com>
3 weeks ago
Jonathan Nobels e8d2f96449
ipn/ipnlocal, net/netns: add node cap to disable netns interface binding on netext Apple clients (#17691)
updates tailscale/corp#31571

It appears that on the latest macOS, iOS and tVOS versions, the work
that netns is doing to bind outgoing connections to the default interface (and all
of the trimmings and workarounds in netmon et al that make that work) are
not needed. The kernel is extension-aware and doing nothing, is the right
thing.  This is, however, not the case for tailscaled (which is not a
special process).

To allow us to test this assertion (and where it might break things), we add a
new node cap that turns this behaviour off only for network-extension equipped clients,
making it possible to turn this off tailnet-wide, without breaking any tailscaled
macos nodes.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
3 weeks ago
Sachin Iyer 16e90dcb27
net/batching: fix gro size handling for misordered UDP_GRO messages (#17842)
Fixes #17835

Signed-off-by: Sachin Iyer <siyer@detail.dev>
3 weeks ago
Sachin Iyer d37884c734
cmd/k8s-operator: remove early return in ingress matching (#17841)
Fixes #17834

Signed-off-by: Sachin Iyer <siyer@detail.dev>
3 weeks ago
Sachin Iyer 85cb64c4ff
wf: correct IPv6 link-local range from ff80::/10 to fe80::/10 (#17840)
Fixes #17833

Signed-off-by: Sachin Iyer <siyer@detail.dev>
3 weeks ago
Sachin Iyer 3280dac797 wgengine/router/osrouter: fix linux magicsock port changing
Fixes #17837

Signed-off-by: Sachin Iyer <siyer@detail.dev>
3 weeks ago
Brad Fitzpatrick 1eba5b0cbd util/eventbus: log goroutine stacks when hung in CI
Updates #17680

Change-Id: Ie48dc2d64b7583d68578a28af52f6926f903ca4f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Brad Fitzpatrick 42ce5c88be wgengine/magicsock: unblock Conn.Synchronize on Conn.Close
I noticed a deadlock in a test in a in-development PR where during a
shutdown storm of things (from a tsnet.Server.Close), LocalBackend was
trying to call magicsock.Conn.Synchronize but the magicsock and/or
eventbus was already shut down and no longer processing events.

Updates #16369

Change-Id: I58b1f86c8959303c3fb46e2e3b7f38f6385036f1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Jordan Whited 2ad2d4d409
wgengine/magicsock: fix UDPRelayAllocReq/Resp deadlock (#17831)
Updates #17830

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
Jordan Whited 18806de400
wgengine/magicsock: validate endpoint.derpAddr in Conn.onUDPRelayAllocResp (#17828)
Otherwise a zero value will panic in Conn.sendUDPStd.

Updates #17827

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
Brad Fitzpatrick 4650061326 ipn/ipnlocal: fix state_test data race seen in CI
Unfortunately I closed the tab and lost it in my sea of CI failures
I'm currently fighting.

Updates #cleanup

Change-Id: I4e3a652d57d52b75238f25d104fc1987add64191
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Brad Fitzpatrick 6e24f50946 tsnet: add tstest.Shard on the slow tests
So they're not all run N times on the sharded oss builders
and are only run one time each.

Updates tailscale/corp#28679

Change-Id: Ie21e84b06731fdc8ec3212eceb136c8fc26b0115
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Brad Fitzpatrick 8ed6bb3198 ipn/ipnlocal: move vipServiceHash etc to serve.go, out of local.go
Updates #12614

Change-Id: I3c16b94fcb997088ff18d5a21355e0279845ed7e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Brad Fitzpatrick e0e8731130 feature, ipn/ipnlocal: add, use feature.CanSystemdStatus for more DCE
When systemd notification support was omitted from the build, or on
non-Linux systems, we were unnecessarily emitting code and generating
garbage stringifying addresses upon transition to the Running state.

Updates #12614

Change-Id: If713f47351c7922bb70e9da85bf92725b25954b9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Jordan Whited e059382174
wgengine/magicsock: clean up determineEndpoints docs (#17822)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
Brad Fitzpatrick fe5501a4e9 wgengine: make getStatus a bit cheaper (less alloc-y)
This removes one of the O(n=peers) allocs in getStatus, as
Engine.getStatus happens more often than Reconfig.

Updates #17814

Change-Id: I8a87fbebbecca3aedadba38e46cc418fd163c2b0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Alex Chan 4c67df42f6 tka: log a better error if there are no chain candidates
Previously if `chains` was empty, it would be passed to `computeActiveAncestor()`,
which would fail with the misleading error "multiple distinct chains".

Updates tailscale/corp#33846

Signed-off-by: Alex Chan <alexc@tailscale.com>
Change-Id: Ib93a755dbdf4127f81cbf69f3eece5a388db31c8
3 weeks ago
Alex Chan c7dbd3987e tka: remove an unused parameter from `computeActiveAncestor`
Updates #cleanup

Change-Id: I86ee7a0d048dafc8c0d030291261240050451721
Signed-off-by: Alex Chan <alexc@tailscale.com>
3 weeks ago
Andrew Lytvynov ae3dff15e4
ipn/ipnlocal: clean up some of the weird locking (#17802)
* lock released early just to call `b.send` when it can call
  `b.sendToLocked` instead
* `UnlockEarly` called to release the lock before trivially fast
  operations, we can wait for a defer there

Updates #11649

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
3 weeks ago
Brad Fitzpatrick 2e265213fd tsnet: fix TestConn to be fast, not flaky
Fixes #17805

Change-Id: I36e37cb0cfb2ea7b2341fd4b9809fbf1dd46d991
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Brad Fitzpatrick de733c5951 tailcfg: kill off rest of HairPinning symbols
It was disabled in May 2024 in #12205 (9eb72bb51).

This removes the unused symbols.

Updates #188
Updates tailscale/corp#19106
Updates tailscale/corp#19116

Change-Id: I5208b7b750b18226ed703532ed58c4ea17195a8e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Brad Fitzpatrick 875a9c526d tsnet: skip a 30s long flaky-ish test on macOS
Updates #17805

Change-Id: I540f50d067eee12e430dfd9de6871dc784fffb8a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 weeks ago
Raj Singh bab5e68d0a
net/udprelay: use GetGlobalAddrs and add local port endpoint (#17797)
Use GetGlobalAddrs() to discover all STUN endpoints, handling bad NATs
that create multiple mappings. When MappingVariesByDestIP is true, also
add the first STUN IPv4 address with the relay's local port for static
port mapping scenarios.

Updates #17796

Signed-off-by: Raj Singh <raj@tailscale.com>
4 weeks ago
Tom Proctor d4c5b278b3 cmd/k8s-operator: support workload identity federation
The feature is currently in private alpha, so requires a tailnet feature
flag. Initially focuses on supporting the operator's own auth, because the
operator is the only device we maintain that uses static long-lived
credentials. All other operator-created devices use single-use auth keys.

Testing steps:

* Create a cluster with an API server accessible over public internet
* kubectl get --raw /.well-known/openid-configuration | jq '.issuer'
* Create a federated OAuth client in the Tailscale admin console with:
  * The issuer from the previous step
  * Subject claim `system:serviceaccount:tailscale:operator`
  * Write scopes services, devices:core, auth_keys
  * Tag tag:k8s-operator
* Allow the Tailscale control plane to get the public portion of
  the ServiceAccount token signing key without authentication:
  * kubectl create clusterrolebinding oidc-discovery \
      --clusterrole=system:service-account-issuer-discovery \
      --group=system:unauthenticated
* helm install --set oauth.clientId=... --set oauth.audience=...

Updates #17457

Change-Id: Ib29c85ba97b093c70b002f4f41793ffc02e6c6e9
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
4 weeks ago
Tom Proctor 1ed117dbc0 cmd/k8s-operator: remove Services feature flag detection
Now that the feature is in beta, no one should encounter this error.

Updates #cleanup

Change-Id: I69ed3f460b7f28c44da43ce2f552042f980a0420
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
4 weeks ago
Joe Tsai 5b40f0bc54
cmd/vet: add static vet checker that runs jsontags (#17778)
This starts running the jsontags vet checker on the module.
All existing findings are adding to an allowlist.

Updates tailscale/corp#791

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
4 weeks ago
Joe Tsai 446752687c
cmd/vet: move jsontags into vet (#17777)
The cmd/jsontags is non-idiomatic since it is not a main binary.
Move it to a vet directory, which will eventually contain a vettool binary.

Update tailscale/corp#791

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
4 weeks ago
Joe Tsai 77123a569b
wgengine/netlog: include node OS in logged attributes (#17755)
Include the node's OS with network flow log information.

Refactor the JSON-length computation to be a bit more precise.

Updates tailscale/corp#33352
Fixes tailscale/corp#34030

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
4 weeks ago
Andrew Lytvynov db7dcd516f
Revert "control/controlclient: back out HW key attestation (#17664)" (#17732)
This reverts commit a760cbe33f.

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
1 month ago
M. J. Fromberger 4c856078e4
util/eventbus: block for the subscriber during SubscribeFunc close (#17642)
Prior to this change a SubscriberFunc treated the call to the subscriber's
function as the completion of delivery. But that means when we are closing the
subscriber, that callback could continue to execute for some time after the
close returns.

For channel-based subscribers that works OK because the close takes effect
before the subscriber ever sees the event. To make the two subscriber types
symmetric, we should also wait for the callback to finish before returning.
This ensures that a Close of the client means the same thing with both kinds of
subscriber.

Updates #17638

Change-Id: I82fd31bcaa4e92fab07981ac0e57e6e3a7d9d60b
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
1 month ago
M. J. Fromberger 061e6266cf
util/eventbus: allow logging of slow subscribers (#17705)
Add options to the eventbus.Bus to plumb in a logger.

Route that logger in to the subscriber machinery, and trigger a log message to
it when a subscriber fails to respond to its delivered events for 5s or more.

The log message includes the package, filename, and line number of the call
site that created the subscription.

Add tests that verify this works.

Updates #17680

Change-Id: I0546516476b1e13e6a9cf79f19db2fe55e56c698
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
1 month ago
Andrew Lytvynov f522b9dbb7
feature/tpm: protect all TPM handle operations with a mutex (#17708)
In particular on Windows, the `transport.TPMCloser` we get is not safe
for concurrent use. This is especially noticeable because
`tpm.attestationKey.Clone` uses the same open handle as the original
key. So wrap the operations on ak.tpm with a mutex and make a deep copy
with a new connection in Clone.

Updates #15830
Updates #17662
Updates #17644

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
1 month ago
James 'zofrex' Sanderson b6c6960e40
control/controlclient: remove unused reference to mapCtx (#17614)
Updates #cleanup

Signed-off-by: James Sanderson <jsanderson@tailscale.com>
1 month ago
Gesa Stupperich adee8b9180 cmd/tailscale/cli/serve_v2: improve validation error
Specify the app apability that failed the test, instead of the
entire comma-separated list.

Fixes #cleanup

Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
1 month ago
M. J. Fromberger 95426b79a9
logtail: avoid racing eventbus subscriptions with shutdown (#17695)
In #17639 we moved the subscription into NewLogger to ensure we would not race
subscribing with shutdown of the eventbus client. Doing so fixed that problem,
but exposed another: As we were only servicing events occasionally when waiting
for the network to come up, we could leave the eventbus to stall in cases where
a number of network deltas arrived later and weren't processed.

To address that, let's separate the concerns: As before, we'll Subscribe early
to avoid conflicts with shutdown; but instead of using the subscriber directly
to determine readiness, we'll keep track of the last-known network state in a
selectable condition that the subscriber updates for us.  When we want to wait,
we'll wait on that condition (or until our context ends), ensuring all the
events get processed in a timely manner.

Updates #17638
Updates #15160

Change-Id: I28339a372be4ab24be46e2834a218874c33a0d2d
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
1 month ago
Fernando Serboncini d68513b0db
ipn: add support for HTTP Redirects (#17594)
Adds a new Redirect field to HTTPHandler for serving HTTP redirects
from the Tailscale serve config. The redirect URL supports template
variables ${HOST} and ${REQUEST_URI} that are resolved per request.

By default, it redirects using HTTP Status 302 (Found). For another
redirect status, like 301 - Moved Permanently, pass the HTTP status
code followed by ':' on Redirect, like: "301:https://tailscale.com"

Updates #11252
Updates #11330

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
1 month ago