Commit Graph

9879 Commits (139c395d7df2479657867e24f3a75a1608b6fa6f)
 

Author SHA1 Message Date
Alex Chan 139c395d7d cmd/tailscale/cli: stabilise the output of `tailscale lock log --json`
This patch changes the behaviour of `tailscale lock log --json` to make
it more useful for users. It also introduces versioning of our JSON output.

## Changes to `tailscale lock log --json`

Previously this command would print the hash and base64-encoded bytes of
each AUM, and users would need their own CBOR decoder to interpret it in
a useful way:

```json
[
  {
    "Hash": [
      80,
      136,
      151,
      …
    ],
    "Change": "checkpoint",
    "Raw": "pAEFAvYFpQH2AopYIAkPN+8V3cJpkoC5ZY2+RI2Bcg2q5G7tRAQQd67W3YpnWCDPOo4KGeQBd8hdGsjoEQpSXyiPdlm+NXAlJ5dS1qEbFlggylNJDQM5ZQ2ULNsXxg2ZBFkPl/D93I1M56/rowU+UIlYIPZ/SxT9EA2Idy9kaCbsFzjX/s3Ms7584wWGbWd/f/QAWCBHYZzYiAPpQ+NXN+1Wn2fopQYk4yl7kNQcMXUKNAdt1lggcfjcuVACOH0J9pRNvYZQFOkbiBmLOW1hPKJsbC1D1GdYIKrJ38XMgpVMuTuBxM4YwoLmrK/RgXQw1uVEL3cywl3QWCA0FilVVv8uys8BNhS62cfNvCew1Pw5wIgSe3Prv8d8pFggQrwIt6ldYtyFPQcC5V18qrCnt7VpThACaz5RYzpx7RNYIKskOA7UoNiVtMkOrV2QoXv6EvDpbO26a01lVeh8UCeEA4KjAQECAQNYIORIdNHqSOzz1trIygnP5w3JWK2DtlY5NDIBbD7SKcjWowEBAgEDWCD27LpxiZNiA19k0QZhOWmJRvBdK2mz+dHu7rf0iGTPFwQb69Gt42fKNn0FGwRUiav/k6dDF4GiAVgg5Eh00epI7PPW2sjKCc/nDclYrYO2Vjk0MgFsPtIpyNYCWEDzIAooc+m45ay5PB/OB4AA9Fdki4KJq9Ll+PF6IJHYlOVhpTbc3E0KF7ODu1WURd0f7PXnW72dr89CSfGxIHAF"
  }
]
```

Now we print the AUM in an expanded form that can be easily read by scripts,
although we include the raw bytes for verification and auditing.

```json
{
  "SchemaVersion": "1",
  "Messages": [
    {
      "Hash": "KCEJPRKNSXJG2TPH3EHQRLJNLIIK2DV53FUNPADWA7BZJWBDRXZQ",
      "AUM": {
        "MessageKind": "checkpoint",
        "PrevAUMHash": null,
        "Key": null,
        "KeyID": null,
        "State": {
          …
        },
        "Votes": null,
        "Meta": null,
        "Signatures": [
          {
            "KeyID": "tlpub:e44874d1ea48ecf3d6dac8ca09cfe70dc958ad83b656393432016c3ed229c8d6",
            "Signature": "8yAKKHPpuOWsuTwfzgeAAPRXZIuCiavS5fjxeiCR2JTlYaU23NxNChezg7tVlEXdH+z151u9na/PQknxsSBwBQ=="
          }
        ]
      },
      "Raw": "pAEFAvYFpQH2AopYIAkPN-8V3cJpkoC5ZY2-RI2Bcg2q5G7tRAQQd67W3YpnWCDPOo4KGeQBd8hdGsjoEQpSXyiPdlm-NXAlJ5dS1qEbFlggylNJDQM5ZQ2ULNsXxg2ZBFkPl_D93I1M56_rowU-UIlYIPZ_SxT9EA2Idy9kaCbsFzjX_s3Ms7584wWGbWd_f_QAWCBHYZzYiAPpQ-NXN-1Wn2fopQYk4yl7kNQcMXUKNAdt1lggcfjcuVACOH0J9pRNvYZQFOkbiBmLOW1hPKJsbC1D1GdYIKrJ38XMgpVMuTuBxM4YwoLmrK_RgXQw1uVEL3cywl3QWCA0FilVVv8uys8BNhS62cfNvCew1Pw5wIgSe3Prv8d8pFggQrwIt6ldYtyFPQcC5V18qrCnt7VpThACaz5RYzpx7RNYIKskOA7UoNiVtMkOrV2QoXv6EvDpbO26a01lVeh8UCeEA4KjAQECAQNYIORIdNHqSOzz1trIygnP5w3JWK2DtlY5NDIBbD7SKcjWowEBAgEDWCD27LpxiZNiA19k0QZhOWmJRvBdK2mz-dHu7rf0iGTPFwQb69Gt42fKNn0FGwRUiav_k6dDF4GiAVgg5Eh00epI7PPW2sjKCc_nDclYrYO2Vjk0MgFsPtIpyNYCWEDzIAooc-m45ay5PB_OB4AA9Fdki4KJq9Ll-PF6IJHYlOVhpTbc3E0KF7ODu1WURd0f7PXnW72dr89CSfGxIHAF"
    }
  ]
}
```

This output was previously marked as unstable, and it wasn't very useful,
so changing it should be fine.

## Versioning our JSON output

This patch introduces a way to version our JSON output on the CLI, so we
can make backwards-incompatible changes in future without breaking existing
scripts or integrations.

You can run this command in two ways:

```
tailscale lock log --json
tailscale lock log --json=1
```

Passing an explicit version number allows you to pick a specific JSON schema.
If we ever want to change the schema, we increment the version number and
users must opt-in to the new output.

A bare `--json` flag will always return schema version 1, for compatibility
with existing scripts.

Updates https://github.com/tailscale/tailscale/issues/17613
Updates https://github.com/tailscale/corp/issues/23258

Signed-off-by: Alex Chan <alexc@tailscale.com>

Change-Id: I897f78521cc1a81651f5476228c0882d7b723606
2 weeks ago
Brad Fitzpatrick 99b06eac49 syncs: add Mutex/RWMutex alias/wrappers for future mutex debugging
Updates #17852

Change-Id: I477340fb8e40686870e981ade11cd61597c34a20
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 weeks ago
Andrew Dunham 3a41c0c585 ipn/ipnlocal: add PROXY protocol support to Funnel/Serve
This adds the --proxy-protocol flag to 'tailscale serve' and
'tailscale funnel', which tells the Tailscale client to prepend a PROXY
protocol[1] header when making connections to the proxied-to backend.

I've verified that this works with our existing funnel servers without
additional work, since they pass along source address information via
PeerAPI already.

Updates #7747

[1]: https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt

Change-Id: I647c24d319375c1b33e995555a541b7615d2d203
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2 weeks ago
Brad Fitzpatrick 653d0738f9 types/netmap: remove PrivateKey from NetworkMap
It's an unnecessary nuisance having it. We go out of our way to redact
it in so many places when we don't even need it there anyway.

Updates #12639

Change-Id: I5fc72e19e9cf36caeb42cf80ba430873f67167c3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 weeks ago
Brad Fitzpatrick 98aadbaf54 util/cache: remove unused code
Updates #cleanup

Change-Id: I9be7029c5d2a7d6297125d0147e93205a7c68989
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Brad Fitzpatrick 4e01e8a66e wgengine/netlog: fix send to closed channel in test
Fixes #17922

Change-Id: I2cd600b0ecda389079f2004985ac9a25ffbbfdd1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Avery Palmer 8aa46a3956 util/clientmetric: fix regression causing Metric.v to be uninitialised
m.v was uninitialised when Tailscale built with ts_omit_logtail
Fixes #17918

Signed-off-by: Avery Palmer <quagsirus@catpowered.net>
3 weeks ago
Xinyu Kuo 8444659ed8 cmd/tailscale/cli: fix panic in netcheck with mismatched DERP region IDs
Fixes #17564

Signed-off-by: Xinyu Kuo <gxylong@126.com>
3 weeks ago
Jordan Whited e1f0ad7a05
net/udprelay: implement Server.SetStaticAddrPorts (#17909)
Only used in tests for now.

Updates tailscale/corp#31489

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
James Tucker a96ef432cf control/controlclient,ipn/ipnlocal: replace State enum with boolean flags
Remove the State enum (StateNew, StateNotAuthenticated, etc.) from
controlclient and replace it with two explicit boolean fields:
- LoginFinished: indicates successful authentication
- Synced: indicates we've received at least one netmap

This makes the state more composable and easier to reason about, as
multiple conditions can be true independently rather than being
encoded in a single enum value.

The State enum was originally intended as the state machine for the
whole client, but that abstraction moved to ipn.Backend long ago.
This change continues moving away from the legacy state machine by
representing state as a combination of independent facts.

Also adds test helpers in ipnlocal that check independent, observable
facts (hasValidNetMap, needsLogin, etc.) rather than relying on
derived state enums, making tests more robust.

Updates #12639

Signed-off-by: James Tucker <james@tailscale.com>
3 weeks ago
Andrew Lytvynov c5919b4ed1
feature/tpm: check IsZero in clone instead of just nil (#17884)
The key.NewEmptyHardwareAttestationKey hook returns a non-nil empty
attestationKey, which means that the nil check in Clone doesn't trigger
and proceeds to try and clone an empty key. Check IsZero instead to
reduce log spam from Clone.

As a drive-by, make tpmAvailable check a sync.Once because the result
won't change.

Updates #17882

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
3 weeks ago
Andrew Lytvynov 888a5d4812
ipn/localapi: use constant-time comparison for RequiredPassword (#17906)
Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
3 weeks ago
Alex Chan 9134440008 various: adds missing apostrophes to comments
Updates #cleanup

Change-Id: I7bf29cc153c3c04e087f9bdb146c3437bed0129a
Signed-off-by: Alex Chan <alexc@tailscale.com>
3 weeks ago
Simon Law bd36817e84
scripts/installer.sh: compare major versions numerically (#17904)
Most /etc/os-release files set the VERSION_ID to a `MAJOR.MINOR`
string, but we were trying to compare this numerically against a major
version number. I can only assume that Linux Mint used switched from a
plain integer, since shells only do integer comparisons.

This patch extracts a VERSION_MAJOR from the VERSION_ID using
parameter expansion and unifies all the other ad-hoc comparisons to
use it.

Fixes #15841

Signed-off-by: Simon Law <sfllaw@tailscale.com>
Co-authored-by: Xavier <xhienne@users.noreply.github.com>
3 weeks ago
M. J. Fromberger ab4b990d51
net/netmon: do not abandon a subscriber when exiting early (#17899)
LinkChangeLogLimiter keeps a subscription to track rate limits for log
messages.  But when its context ended, it would exit the subscription loop,
leaving the subscriber still alive. Ensure the subscriber gets cleaned up
when the context ends, so we don't stall event processing.

Updates tailscale/corp#34311

Change-Id: I82749e482e9a00dfc47f04afbc69dd0237537cb2
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
3 weeks ago
Brad Fitzpatrick ce10f7c14c wgengine/wgcfg/nmcfg: reduce wireguard reconfig log spam
On the corp tailnet (using Mullvad exit nodes + bunch of expired
devices + subnet routers), these were generating big ~35 KB blobs of
logging regularly.

This logging shouldn't even exist at this level, and should be rate
limited at a higher level, but for now as a bandaid, make it less
spammy.

Updates #cleanup

Change-Id: I0b5e9e6e859f13df5f982cd71cd5af85b73f0c0a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Andrew Dunham 208a32af5b logpolicy: fix nil pointer dereference with invalid TS_LOG_TARGET
When TS_LOG_TARGET is set to an invalid URL, url.Parse returns an error
and nil pointer, which caused a panic when accessing u.Host.

Now we check the error from url.Parse and log a helpful message while
falling back to the default log host.

Fixes #17792

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
3 weeks ago
Brad Fitzpatrick 052602752f control/controlclient: make Observer optional
As a baby step towards eventbus-ifying controlclient, make the
Observer optional.

This also means callers that don't care (like this network lock test,
and some tests in other repos) can omit it, rather than passing in a
no-op one.

Updates #12639

Change-Id: Ibd776b45b4425c08db19405bc3172b238e87da4e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Jordan Whited 0285e1d5fb
feature/relayserver: fix Shutdown() deadlock (#17898)
Updates #17894

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
James 'zofrex' Sanderson 124301fbb6
ipn/ipnlocal: log prefs changes and reason in Start (#17876)
Updates tailscale/corp#34238

Signed-off-by: James Sanderson <jsanderson@tailscale.com>
3 weeks ago
Alex Chan b5cd29932e tka: add a test for unmarshaling existing AUMs
Updates https://github.com/tailscale/tailscale/issues/17613

Change-Id: I693a580949eef59263353af6e7e03a7af9bbaa0b
Signed-off-by: Alex Chan <alexc@tailscale.com>
3 weeks ago
Jordan Whited 9e4d1fd87f
feature/relayserver,ipn/ipnlocal,net/udprelay: plumb DERPMap (#17881)
This commit replaces usage of local.Client in net/udprelay with DERPMap
plumbing over the eventbus. This has been a longstanding TODO. This work
was also accelerated by a memory leak in net/http when using
local.Client over long periods of time. So, this commit also addresses
said leak.

Updates #17801

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
Brad Fitzpatrick 146ea42822 ipn/ipnlocal: remove all the weird locking (LockedOnEntry, UnlockEarly, etc)
Fixes #11649
Updates #16369

Co-authored-by: James Sanderson <jsanderson@tailscale.com>
Change-Id: I63eaa18fe870ddf81d84b949efac4d1b44c3db86
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Andrew Dunham 08e74effc0 cmd/cloner: support cloning arbitrarily-nested maps
Fixes #17870

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
3 weeks ago
Naman Sood ca9b68aafd
cmd/tailscale/cli: remove service flag from funnel command (#17850)
Fixes #17849.

Signed-off-by: Naman Sood <mail@nsood.in>
3 weeks ago
Andrew Dunham 6ac80b7334 cmd/{cloner,viewer}: handle maps of views
Instead of trying to call View() on something that's already a View
type (or trying to Clone the view unnecessarily), we can re-use the
existing View values in a map[T]ViewType.

Fixes #17866

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
3 weeks ago
Jordan Whited f4f9dd7f8c
net/udprelay: replace VNI pool with selection algorithm (#17868)
This reduces memory usage when tailscaled is acting as a peer relay.

Updates #17801

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
License Updater 31fe75ad9e licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
3 weeks ago
Fran Bull 37aa7e6935 util/dnsname: fix test error message
Updates #17788

Signed-off-by: Fran Bull <fran@tailscale.com>
3 weeks ago
Brad Fitzpatrick f387b1010e wgengine/wgcfg: remove two unused Config fields
They distracted me in some refactoring. They're set but never used.

Updates #17858

Change-Id: I6ec7d6841ab684a55bccca7b7cbf7da9c782694f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Fran Bull 27a0168cdc util/dnsname: increase maxNameLength to account for trailing dot
Fixes #17788

Signed-off-by: Fran Bull <fran@tailscale.com>
3 weeks ago
Jonathan Nobels e8d2f96449
ipn/ipnlocal, net/netns: add node cap to disable netns interface binding on netext Apple clients (#17691)
updates tailscale/corp#31571

It appears that on the latest macOS, iOS and tVOS versions, the work
that netns is doing to bind outgoing connections to the default interface (and all
of the trimmings and workarounds in netmon et al that make that work) are
not needed. The kernel is extension-aware and doing nothing, is the right
thing.  This is, however, not the case for tailscaled (which is not a
special process).

To allow us to test this assertion (and where it might break things), we add a
new node cap that turns this behaviour off only for network-extension equipped clients,
making it possible to turn this off tailnet-wide, without breaking any tailscaled
macos nodes.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
3 weeks ago
Sachin Iyer 16e90dcb27
net/batching: fix gro size handling for misordered UDP_GRO messages (#17842)
Fixes #17835

Signed-off-by: Sachin Iyer <siyer@detail.dev>
3 weeks ago
Sachin Iyer d37884c734
cmd/k8s-operator: remove early return in ingress matching (#17841)
Fixes #17834

Signed-off-by: Sachin Iyer <siyer@detail.dev>
3 weeks ago
Sachin Iyer 85cb64c4ff
wf: correct IPv6 link-local range from ff80::/10 to fe80::/10 (#17840)
Fixes #17833

Signed-off-by: Sachin Iyer <siyer@detail.dev>
3 weeks ago
Sachin Iyer 3280dac797 wgengine/router/osrouter: fix linux magicsock port changing
Fixes #17837

Signed-off-by: Sachin Iyer <siyer@detail.dev>
3 weeks ago
Brad Fitzpatrick 1eba5b0cbd util/eventbus: log goroutine stacks when hung in CI
Updates #17680

Change-Id: Ie48dc2d64b7583d68578a28af52f6926f903ca4f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Brad Fitzpatrick 42ce5c88be wgengine/magicsock: unblock Conn.Synchronize on Conn.Close
I noticed a deadlock in a test in a in-development PR where during a
shutdown storm of things (from a tsnet.Server.Close), LocalBackend was
trying to call magicsock.Conn.Synchronize but the magicsock and/or
eventbus was already shut down and no longer processing events.

Updates #16369

Change-Id: I58b1f86c8959303c3fb46e2e3b7f38f6385036f1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Jordan Whited 2ad2d4d409
wgengine/magicsock: fix UDPRelayAllocReq/Resp deadlock (#17831)
Updates #17830

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
Jordan Whited 18806de400
wgengine/magicsock: validate endpoint.derpAddr in Conn.onUDPRelayAllocResp (#17828)
Otherwise a zero value will panic in Conn.sendUDPStd.

Updates #17827

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
Brad Fitzpatrick 4650061326 ipn/ipnlocal: fix state_test data race seen in CI
Unfortunately I closed the tab and lost it in my sea of CI failures
I'm currently fighting.

Updates #cleanup

Change-Id: I4e3a652d57d52b75238f25d104fc1987add64191
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Brad Fitzpatrick 6e24f50946 tsnet: add tstest.Shard on the slow tests
So they're not all run N times on the sharded oss builders
and are only run one time each.

Updates tailscale/corp#28679

Change-Id: Ie21e84b06731fdc8ec3212eceb136c8fc26b0115
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Brad Fitzpatrick 8ed6bb3198 ipn/ipnlocal: move vipServiceHash etc to serve.go, out of local.go
Updates #12614

Change-Id: I3c16b94fcb997088ff18d5a21355e0279845ed7e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Brad Fitzpatrick e0e8731130 feature, ipn/ipnlocal: add, use feature.CanSystemdStatus for more DCE
When systemd notification support was omitted from the build, or on
non-Linux systems, we were unnecessarily emitting code and generating
garbage stringifying addresses upon transition to the Running state.

Updates #12614

Change-Id: If713f47351c7922bb70e9da85bf92725b25954b9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Jordan Whited e059382174
wgengine/magicsock: clean up determineEndpoints docs (#17822)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 weeks ago
Brad Fitzpatrick fe5501a4e9 wgengine: make getStatus a bit cheaper (less alloc-y)
This removes one of the O(n=peers) allocs in getStatus, as
Engine.getStatus happens more often than Reconfig.

Updates #17814

Change-Id: I8a87fbebbecca3aedadba38e46cc418fd163c2b0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago
Alex Chan 4c67df42f6 tka: log a better error if there are no chain candidates
Previously if `chains` was empty, it would be passed to `computeActiveAncestor()`,
which would fail with the misleading error "multiple distinct chains".

Updates tailscale/corp#33846

Signed-off-by: Alex Chan <alexc@tailscale.com>
Change-Id: Ib93a755dbdf4127f81cbf69f3eece5a388db31c8
3 weeks ago
Alex Chan c7dbd3987e tka: remove an unused parameter from `computeActiveAncestor`
Updates #cleanup

Change-Id: I86ee7a0d048dafc8c0d030291261240050451721
Signed-off-by: Alex Chan <alexc@tailscale.com>
3 weeks ago
Andrew Lytvynov ae3dff15e4
ipn/ipnlocal: clean up some of the weird locking (#17802)
* lock released early just to call `b.send` when it can call
  `b.sendToLocked` instead
* `UnlockEarly` called to release the lock before trivially fast
  operations, we can wait for a defer there

Updates #11649

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
3 weeks ago
Brad Fitzpatrick 2e265213fd tsnet: fix TestConn to be fast, not flaky
Fixes #17805

Change-Id: I36e37cb0cfb2ea7b2341fd4b9809fbf1dd46d991
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 weeks ago