7553 Commits (1002d10dcd56e21c9d73e491bbbbdf6f501346fa)
 

Author SHA1 Message Date
Charlotte Brandhorst-Satzkorn acb611f034
ipn/localipn: introduce logs for tailfs (#11496)
This change introduces some basic logging into the access and share
pathways for tailfs.

Updates tailscale/corp#17818

Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
7 months ago
Irbe Krumina 4cbef20569
cmd/k8s-operator: redact auth key from debug logs (#11523)
Updates #cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
7 months ago
Brad Fitzpatrick 55baf9474f metrics, tsweb/varz: add multi-label map metrics
Updates tailscale/corp#18640

Change-Id: Ia9ae25956038e9d3266ea165537ac6f02485b74c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Flakes Updater 90a4d6ce69 go.mod.sri: update SRI hash for go.mod changes
Signed-off-by: Flakes Updater <noreply+flakes-updater@tailscale.com>
7 months ago
Brad Fitzpatrick 6d90966c1f logtail: move a scratch buffer to Logger
Rather than pass around a scratch buffer, put it on the Logger.

This is a baby step towards removing the background uploading
goroutine and starting it as needed.

Updates tailscale/corp#18514 (insofar as it led me to look at this code)

Change-Id: I6fd94581c28bde40fdb9fca788eb9590bcedae1b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Irbe Krumina 06e22a96b1
.github/workflows: fix path filter for 'Kubernetes manifests' test job (#11520)
Updates #cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
7 months ago
Chris Milson-Tokunaga b6dfd7443a
Change type of installCRDs (#11478)
Including the double quotes (`"`) around the value made it appear like the helm chart should expect a string value for `installCRDs`.

Signed-off-by: Chris Milson-Tokunaga <chris.w.milson@gmail.com>
7 months ago
Percy Wegmann 8b8b315258 net/tstun: use gaissmai/bart instead of tempfork/device
This implementation uses less memory than tempfork/device,
which helps avoid OOM conditions in the iOS VPN extension when
switching to a Tailnet with ExitNode routing enabled.

Updates tailscale/corp#18514

Signed-off-by: Percy Wegmann <percy@tailscale.com>
7 months ago
Andrew Lytvynov 1e7050e73a
go.mod: bump github.com/docker/docker (#11515)
There's a vulnerability https://pkg.go.dev/vuln/GO-2024-2659 that
govulncheck flags, even though it's only reachable from tests and
cmd/sync-containers and cannot be exploited there.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
7 months ago
Brad Fitzpatrick a36cfb4d3d tailcfg, ipn/ipnlocal, wgengine/magicsock: add only-tcp-443 node attr
Updates tailscale/corp#17879

Change-Id: I0dc305d147b76c409cf729b599a94fa723aef0e0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick 7b34154df2 all: deprecate Node.Capabilities (more), remove PeerChange.Capabilities [capver 89]
First we had Capabilities []string. Then
https://tailscale.com/blog/acl-grants (#4217) brought CapMap, a
superset of Capabilities. Except we never really finished the
transition inside the codebase to go all-in on CapMap. This does so.

Notably, this converts Capabilities on the wire early to CapMap
internally so the code can only deal in CapMap, even against an old
control server.

In the process, this removes PeerChange.Capabilities support, which no
known control plane sent anyway. They can and should use
PeerChange.CapMap instead.

Updates #11508
Updates #4217

Change-Id: I872074e226b873f9a578d9603897b831d50b25d9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick 4992aca6ec tsweb/varz: flesh out munging of expvar keys into valid Prometheus metrics
From a problem we hit with how badger registers expvars; it broke
trunkd's exported metrics.

Updates tailscale/corp#1297

Change-Id: I42e1552e25f734c6f521b6e993d57a82849464b2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick b104688e04 ipn/ipnlocal, types/netmap: replace hasCapability with set lookup on NetworkMap
When node attributes were super rare, the O(n) slice scans looking for
node attributes were more acceptable. But now more code and more users
are using increasingly more node attributes. Time to make it a map.

Noticed while working on tailscale/corp#17879

Updates #cleanup

Change-Id: Ic17c80341f418421002fbceb47490729048756d2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
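
The trade-off described in b104688e04 can be illustrated with a minimal sketch. The names `hasAttrSlice` and `attrSet` are hypothetical and are not the identifiers used in `types/netmap`; the sketch only contrasts an O(n) slice scan with a set lookup that is built once and queried many times.

```go
package main

import "fmt"

// hasAttrSlice reports whether attrs contains want by scanning the slice.
// The cost grows linearly with the number of attributes.
func hasAttrSlice(attrs []string, want string) bool {
	for _, a := range attrs {
		if a == want {
			return true
		}
	}
	return false
}

// attrSet is a set of attribute names (hypothetical type for illustration).
type attrSet map[string]struct{}

// newAttrSet builds the set once, for example when a new network map arrives.
func newAttrSet(attrs []string) attrSet {
	s := make(attrSet, len(attrs))
	for _, a := range attrs {
		s[a] = struct{}{}
	}
	return s
}

// Has reports whether the set contains want using a single map lookup.
func (s attrSet) Has(want string) bool {
	_, ok := s[want]
	return ok
}

func main() {
	attrs := []string{"only-tcp-443", "disable-web-client"}
	set := newAttrSet(attrs)
	fmt.Println(hasAttrSlice(attrs, "only-tcp-443"), set.Has("only-tcp-443"))
	fmt.Println(hasAttrSlice(attrs, "missing"), set.Has("missing"))
}
```

The set costs one extra allocation per network map, which is likely to pay off only when lookups are frequent, as the commit message suggests they now are.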
Percy Wegmann 8c88853db6 ipn/ipnlocal: add c2n /debug/pprof/allocs endpoint
This behaves the same as typical debug/pprof/allocs.

Updates tailscale/corp#18514

Signed-off-by: Percy Wegmann <percy@tailscale.com>
7 months ago
Brad Fitzpatrick f45594d2c9 control/controlclient: free memory on iOS before full netmap work
Updates tailscale/corp#18514

Change-Id: I8d0330334b030ed8692b25549a0ee887ac6d7188
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
James Tucker e0f97738ee localapi: reduce garbage production in bus watcher
Updates #optimization

Signed-off-by: James Tucker <james@tailscale.com>
7 months ago
James Tucker 3f7313dbdb util/linuxfw,wgengine/router: enable IPv6 configuration when netfilter is disabled
Updates #11434

Signed-off-by: James Tucker <james@tailscale.com>
7 months ago
Brad Fitzpatrick 8444937c89 control/controlclient: fix panic regression from earlier load balancer hint header
In the recent 20e9f3369 we made HealthChangeRequest machine requests
include a NodeKey, as it was the oddball machine request that didn't
include one. Unfortunately, that code was sometimes being called (at
least in some of our integration tests) without a node key due to its
registration with health.RegisterWatcher(direct.ReportHealthChange).

Fortunately tests in corp caught this before we cut a release. It's
possible this only affects this particular integration test's
environment, but still worth fixing.

Updates tailscale/corp#1297

Change-Id: I84046779955105763dc1be5121c69fec3c138672
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Joe Tsai 85febda86d
all: use zstdframe where sensible (#11491)
Use the zstdframe package where sensible instead of plumbing
around our own zstd.Encoder just for stateless operations.

This causes logtail to have a dependency on zstd,
but that's arguably okay since zstd support is implicit
to the protocol between a client and the logging service.
Also, virtually every caller to logger.NewLogger was
manually setting up a zstd.Encoder anyways,
meaning that zstd was functionally always a dependency.

Updates #cleanup
Updates tailscale/corp#18514

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
7 months ago
Joe Tsai d4bfe34ba7
util/zstdframe: add package for stateless zstd compression (#11481)
The Go zstd package is not friendly for stateless zstd compression.
Passing around multiple zstd.Encoder just for stateless compression
is a waste of memory since the memory is never freed and seldom
used if no compression operations are happening.

For performance, we pool the relevant Encoder/Decoder
with the specific options set.

Functionally, this package is a wrapper over the Go zstd package
with a more ergonomic API for stateless operations.

This package can be used to clean up various pre-existing zstd.Encoder
pools or one-off handlers spread throughout our codebases.

Performance:

	BenchmarkEncode/Best               1690        610926 ns/op      25.78 MB/s           1 B/op          0 allocs/op
	    zstd_test.go:137: memory: 50.336 MiB
	    zstd_test.go:138: ratio:  3.269x
	BenchmarkEncode/Better            10000        100939 ns/op     156.04 MB/s           0 B/op          0 allocs/op
	    zstd_test.go:137: memory: 20.399 MiB
	    zstd_test.go:138: ratio:  3.131x
	BenchmarkEncode/Default            15775         74976 ns/op     210.08 MB/s         105 B/op          0 allocs/op
	    zstd_test.go:137: memory: 1.586 MiB
	    zstd_test.go:138: ratio:  3.064x
	BenchmarkEncode/Fastest            23222         53977 ns/op     291.81 MB/s          26 B/op          0 allocs/op
	    zstd_test.go:137: memory: 599.458 KiB
	    zstd_test.go:138: ratio:  2.898x
	BenchmarkEncode/FastestLowMemory                   23361         50789 ns/op     310.13 MB/s          15 B/op          0 allocs/op
	    zstd_test.go:137: memory: 334.458 KiB
	    zstd_test.go:138: ratio:  2.898x
	BenchmarkEncode/FastestNoChecksum                  23086         50253 ns/op     313.44 MB/s          26 B/op          0 allocs/op
	    zstd_test.go:137: memory: 599.458 KiB
	    zstd_test.go:138: ratio:  2.900x

	BenchmarkDecode/Checksum                           70794         17082 ns/op     300.96 MB/s           4 B/op          0 allocs/op
	    zstd_test.go:163: memory: 316.438 KiB
	BenchmarkDecode/NoChecksum                         74935         15990 ns/op     321.51 MB/s           4 B/op          0 allocs/op
	    zstd_test.go:163: memory: 316.438 KiB
	BenchmarkDecode/LowMemory                          71043         16739 ns/op     307.13 MB/s           0 B/op          0 allocs/op
	    zstd_test.go:163: memory: 79.347 KiB

We can see that the options are taking effect where compression ratio improves
with higher levels and compression speed diminishes.
We can also see that LowMemory takes effect where the pooled coder object
references less memory than other cases.
We can see that the pooling is taking effect as there are 0 amortized allocations.

Additional performance:

	BenchmarkEncodeParallel/zstd-24                     1857        619264 ns/op        1796 B/op         49 allocs/op
	BenchmarkEncodeParallel/zstdframe-24                1954        532023 ns/op        4293 B/op         49 allocs/op
	BenchmarkDecodeParallel/zstd-24                     5288        197281 ns/op        2516 B/op         49 allocs/op
	BenchmarkDecodeParallel/zstdframe-24                6441        196254 ns/op        2513 B/op         49 allocs/op

In concurrent usage, handling the pooling in this package
has a marginal benefit over the zstd package,
which relies on a Go channel as the pooling mechanism.
In particular, coders can be freed by the GC when not in use.
Coders can be shared throughout the program if they use this package
instead of multiple independent pools doing the same thing.
The allocations are unrelated to pooling as they're caused by the spawning of goroutines.

Updates #cleanup
Updates tailscale/corp#18514
Updates tailscale/corp#17653
Updates tailscale/corp#18005

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
7 months ago
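
The pooling idea in d4bfe34ba7 can be sketched with the standard library alone. This is not the `util/zstdframe` API: it swaps in `compress/gzip` for zstd so the example has no third-party dependency, and the function names `compress` and `decompress` are assumptions made for illustration. What it shows is the shape of a stateless helper backed by a `sync.Pool`, whose idle entries the GC may reclaim.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"sync"
)

// writerPool holds reusable gzip writers. Unlike a channel-based pool,
// idle entries in a sync.Pool may be freed by the garbage collector.
var writerPool = sync.Pool{
	New: func() any { return gzip.NewWriter(io.Discard) },
}

// compress is a stateless helper: callers never hold an encoder themselves.
func compress(src []byte) ([]byte, error) {
	zw := writerPool.Get().(*gzip.Writer)
	defer writerPool.Put(zw)

	var buf bytes.Buffer
	zw.Reset(&buf) // rebind the pooled writer to this call's output
	if _, err := zw.Write(src); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompress reverses compress. Readers are not pooled here, for brevity.
func decompress(src []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(src))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	packed, err := compress([]byte("stateless stateless stateless"))
	if err != nil {
		panic(err)
	}
	plain, err := decompress(packed)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d compressed bytes, round trip: %q\n", len(packed), plain)
}
```

Because every caller goes through the same pool, encoders are shared across the program instead of each subsystem keeping its own.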
Brad Fitzpatrick 6a860cfb35 ipn/ipnlocal: add c2n pprof option to force a GC
Like net/http/pprof has.

Updates tailscale/corp#18514

Change-Id: I264adb6dcf5732d19707783b29b7273b4ca69cf4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick 5d1c72f76b wgengine/magicsock: don't use endpoint debug ringbuffer on mobile.
Save some memory.

Updates tailscale/corp#18514

Change-Id: Ibcaf3c6d8e5cc275c81f04141d0f176e2249509b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Andrew Dunham 512fc0b502 util/reload: add new package to handle periodic value loading
This can be used to reload a value periodically, whether from disk or
another source, while handling jitter and graceful shutdown.

Updates tailscale/corp#1297

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Iee2b4385c9abae59805f642a7308837877cb5b3f
8 months ago
Adrian Dewhurst 2f7e7be2ea control/controlclient: do not alias peer CapMap
Updates #cleanup

Change-Id: I10fd5e04310cdd7894a3caa3045b86eb0a06b6a0
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
8 months ago
Percy Wegmann 067ed0bf6f ipnlocal: ensure TailFS share notifications are non-nil
This allows the UI to distinguish between 'no shares' versus
'not being notified about shares'.

Updates ENG-2843

Signed-off-by: Percy Wegmann <percy@tailscale.com>
8 months ago
Brad Fitzpatrick 20e9f3369d control/controlclient: send load balancing hint HTTP request header
Updates tailscale/corp#1297

Change-Id: I0b102081e81dfc1261f4b05521ab248a2e4a1298
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Percy Wegmann 15c58cb77c tailfs: include whitespace in test share and filenames
Since TailFS allows spaces in folder and file names, test with spaces.

Updates tailscale/corp#16827

Signed-off-by: Percy Wegmann <percy@tailscale.com>
8 months ago
James Tucker e37eded256
tool/gocross: add android autoflags (#11465)
Updates tailscale/corp#18202

Signed-off-by: James Tucker <james@tailscale.com>
8 months ago
Claire Wang 221de01745
control/controlclient: fix sending peer capmap changes (#11457)
Instead of just checking if a peer capmap is nil, compare the previous
state peer capmap with the new peer capmap.
Updates tailscale/corp#17516

Signed-off-by: Claire Wang <claire@tailscale.com>
8 months ago
Andrew Dunham 6da1dc84de wgengine: fix logger data race in tests
Observed in:
    https://github.com/tailscale/tailscale/actions/runs/8350904950/job/22858266932?pr=11463

Updates #11226

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I9b57db4b34b6ad91d240cd9fa7e344fc0376d52d
8 months ago
Andrew Dunham e382e4cee6 syncs: add Swap method
To mimic sync.Map.Swap, sync/atomic.Value.Swap, etc.

Updates tailscale/corp#1297

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: If7627da1bce8b552873b21d7e5ebb98904e9a650
8 months ago
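
For e382e4cee6, the commit message does not say which type in `syncs` gained the method, so the following is only a sketch under that caveat. `lockedMap` is a hypothetical mutex-guarded generic map, and its `Swap` follows the return convention of `sync.Map.Swap` (previous value plus a `loaded` flag).

```go
package main

import (
	"fmt"
	"sync"
)

// lockedMap is a hypothetical mutex-guarded map used only for illustration.
type lockedMap[K comparable, V any] struct {
	mu sync.Mutex
	m  map[K]V
}

// Swap stores value under key and returns the previous value, if any.
// loaded reports whether the key was present before the call.
func (lm *lockedMap[K, V]) Swap(key K, value V) (previous V, loaded bool) {
	lm.mu.Lock()
	defer lm.mu.Unlock()
	if lm.m == nil {
		lm.m = make(map[K]V)
	}
	previous, loaded = lm.m[key]
	lm.m[key] = value
	return previous, loaded
}

func main() {
	var lm lockedMap[string, int]
	fmt.Println(lm.Swap("a", 1))
	fmt.Println(lm.Swap("a", 2))
}
```

Doing the read and the write under one lock is the point: a separate load followed by a store would let another goroutine slip in between.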
Andrea Gottardo 6288c9b41e
version/prop: remove IsMacAppSandboxEnabled (#11461)
Fixes tailscale/corp#18441

For a few days, IsMacAppStore() has been returning `false` on App Store builds (IPN-macOS target in Xcode).

I regressed this in #11369 by introducing logic to detect the sandbox by checking for the APP_SANDBOX_CONTAINER_ID environment variable. I thought that was a more robust approach than checking the name of the executable. However, it appears that on recent macOS versions this environment variable is no longer getting set, so we should go back to the previous logic that checks for the executable path, or HOME containing references to macsys.

This PR also adds additional checks to the logic by also checking XPC_SERVICE_NAME in addition to HOME where possible. That environment variable is set inside the network extension, either macos or macsys and is good to look at if for any reason HOME is not set.
8 months ago
Mario Minardi 68d9e49a5b
api.md: add missing backtick to GET searchpaths doc (#11459)
Add missing backtick to GET searchpaths api documentation.

Updates #cleanup

Signed-off-by: Mario Minardi <mario@tailscale.com>
8 months ago
Will Norris 349799a1ba api.md: format API docs with prettier
Mostly inconsequential minor fixes for consistency.  A couple of changes
to actual JSON examples, but all still very readable, so I think it's
fine.

Updates #cleanup

Signed-off-by: Will Norris <will@tailscale.com>
8 months ago
Irbe Krumina b0c3e6f6c5
cmd/k8s-operator,ipn/conf.go: fix --accept-routes for proxies (#11453)
Fix a bug where all proxies got configured with --accept-routes set to true.
The bug was introduced in https://github.com/tailscale/tailscale/pull/11238.

Updates #cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
8 months ago
James Tucker 7fe4cbbaf3
types/views: optimize slices contains under some conditions (#11449)
In control there are conditions where the leaf functions are not being
optimized away (i.e. At is not inlined), resulting in undesirable time
spent copying during SliceContains. This optimization is likely
irrelevant to simpler code or smaller structures.

Updates #optimization

Signed-off-by: James Tucker <james@tailscale.com>
8 months ago
Mario Minardi d2ccfa4edd
cmd/tailscale,ipn/ipnlocal: enable web client over quad 100 by default (#11419)
Enable the web client over 100.100.100.100 by default. Accepting traffic
from [tailnet IP]:5252 still requires setting the `webclient` user pref.

Updates https://github.com/tailscale/tailscale/issues/10261

Signed-off-by: Mario Minardi <mario@tailscale.com>
8 months ago
Will Norris 4d747c1833 api.md: document device expiration endpoint
This was originally built for testing node expiration flows, but is also
useful for customers to force device re-auth without actually deleting
the device from the tailnet.

Updates tailscale/corp#18408

Signed-off-by: Will Norris <will@tailscale.com>
8 months ago
Mario Minardi e0886ad167
ipn/ipnlocal, tailcfg: add disable-web-client node attribute (#11418)
Add a disable-web-client node attribute and add handling for disabling
the web client when this node attribute is set.

Updates https://github.com/tailscale/tailscale/issues/10261

Signed-off-by: Mario Minardi <mario@tailscale.com>
8 months ago
Marwan Sulaiman da7c3d1753 envknob: ensure f is not nil before using it
This PR fixes a panic that I saw in the mac app where
parsing the env file fails but we don't get to see the
error due to the panic of using f.Name()

Fixes #11425

Signed-off-by: Marwan Sulaiman <marwan@tailscale.com>
8 months ago
Andrea Gottardo 08ebac9acb
version,cli,safesocket: detect non-sandboxed macOS GUI (#11369)
Updates ENG-2848

We can safely disable the App Sandbox for our macsys GUI, allowing us to use `tailscale ssh` and do a few other things that we've wanted to do for a while. This PR:

- allows Tailscale SSH to be used from the macsys GUI binary when called from a CLI
- tweaks the detection of client variants in prop.go, with new functions `IsMacSys()`, `IsMacSysApp()` and `IsMacAppSandboxEnabled()`

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
8 months ago
Irbe Krumina ea55f96310
cmd/tailscale/cli: fix configuring partially empty kubeconfig (#11417)
When a user deletes the last cluster/user/context from their
kubeconfig via 'kubectl delete-[cluster|user|context]' command,
kubectx sets the relevant field in kubeconfig to 'null'.
This was breaking our conversion logic that was assuming that the field
is either nonexistent or is an array.

Updates tailscale/corp#18320

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
8 months ago
Anton Tolchanov cf8948da5f net/routetable: increase route limit used by the test
I was running all tests while preparing a recent stable release, and
this was failing because my computer is connected to a fairly large
tailnet.

```
--- FAIL: TestGetRouteTable (0.01s)
    routetable_linux_test.go:32: expected at least one default route;
    ...
```

```
$ ip route show table 52  | wc -l
1051
```

Updates #cleanup

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
8 months ago
Andrew Lytvynov decd9893e4
ipn/ipnlocal: validate domain of PopBrowserURL on default control URL (#11394)
If the client uses the default Tailscale control URL, validate that all
PopBrowserURLs are under tailscale.com or *.tailscale.com. This reduces
the risk of a compromised control plane opening phishing pages for
example.

The client trusts control for many other things, but this is one easy
way to reduce that trust a bit.

Fixes #11393

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
8 months ago
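
The domain check described in decd9893e4 can be sketched as follows. `isTailscaleURL` is a hypothetical helper, not the function added in `ipn/ipnlocal`, and it covers only the host rule stated in the commit message (tailscale.com or *.tailscale.com); any further checks the real code performs are not shown.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isTailscaleURL reports whether rawURL's host is tailscale.com or a
// subdomain of it. (Hypothetical helper for illustration.)
func isTailscaleURL(rawURL string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	// Match the suffix with its leading dot so that look-alike hosts
	// such as "eviltailscale.com" are rejected.
	return host == "tailscale.com" || strings.HasSuffix(host, ".tailscale.com")
}

func main() {
	for _, s := range []string{
		"https://login.tailscale.com/a/123",
		"https://eviltailscale.com/",
		"https://tailscale.com.evil.example/",
	} {
		fmt.Println(s, isTailscaleURL(s))
	}
}
```

Comparing the parsed hostname, rather than searching the raw string, keeps userinfo tricks such as `https://login.tailscale.com@evil.example/` from passing the check.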
Andrew Lytvynov 48eef9e6eb
clientupdate: do not allow msiexec to reboot the OS (#11409)
According to
https://learn.microsoft.com/en-us/windows/win32/msi/standard-installer-command-line-options#promptrestart,
`/promptrestart` is ignored when `/quiet` is set, so msiexec.exe can
sometimes silently trigger a reboot. The best we can do to reduce
unexpected disruption is to just prevent restarts, until the user
chooses to do it. Restarts aren't normally needed for Tailscale updates,
but there seem to be some situations where it's triggered.

Updates #18254

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
8 months ago
Anton Tolchanov da3cf12194 VERSION.txt: this is v1.63.0
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
8 months ago
Anton Tolchanov f12d2557f9 prober: add a DERP bandwidth probe
Updates tailscale/corp#17912

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
8 months ago
Anton Tolchanov 5018683d58 prober: remove unused derp prober latency measurements
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
8 months ago
Anton Tolchanov 205a10b51a prober: export probe counters and cumulative latency
Updates #cleanup

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
8 months ago
Andrew Dunham 7429e8912a wgengine/netstack: fix bug with duplicate SYN packets in client limit
This fixes a bug that was introduced in #11258 where the handling of the
per-client limit didn't properly account for the fact that the gVisor
TCP forwarder will return 'true' to indicate that it's handled a
duplicate SYN packet, but not launch the handler goroutine.

In such a case, we neither decremented our per-client limit in the
wrapper function, nor did we do so in the handler function, leading to
our per-client limit table slowly filling up without bound.

Fix this by doing the same duplicate-tracking logic that the TCP
forwarder does so we can detect such cases and appropriately decrement
our in-flight counter.

Updates tailscale/corp#12184

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib6011a71d382a10d68c0802593f34b8153d06892
8 months ago
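
The accounting problem fixed in 7429e8912a can be sketched in simplified form. This is not the gVisor forwarder's logic or the code in `wgengine/netstack`; `limiter`, `connID`, `admit`, and `done` are hypothetical names. The sketch tracks in-flight connection IDs so that a duplicate SYN is recognized and never leaves the per-client counter permanently raised.

```go
package main

import (
	"fmt"
	"sync"
)

// connID identifies a connection attempt (hypothetical, simplified).
type connID struct{ src, dst string }

// limiter caps in-flight connection attempts per client address.
type limiter struct {
	mu        sync.Mutex
	max       int
	inFlight  map[connID]struct{} // attempts currently being handled
	perClient map[string]int      // in-flight count keyed by client
}

func newLimiter(max int) *limiter {
	return &limiter{
		max:       max,
		inFlight:  make(map[connID]struct{}),
		perClient: make(map[string]int),
	}
}

// admit reports whether a handler should be started for this SYN.
// A duplicate of an in-flight attempt is not counted a second time.
func (l *limiter) admit(id connID) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, dup := l.inFlight[id]; dup {
		return false // no new handler will run, so no new count
	}
	if l.perClient[id.src] >= l.max {
		return false
	}
	l.inFlight[id] = struct{}{}
	l.perClient[id.src]++
	return true
}

// done releases the slot taken by admit; the handler calls it on exit.
func (l *limiter) done(id connID) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, ok := l.inFlight[id]; !ok {
		return
	}
	delete(l.inFlight, id)
	if l.perClient[id.src]--; l.perClient[id.src] == 0 {
		delete(l.perClient, id.src)
	}
}

// inFlightFor returns the current in-flight count for a client.
func (l *limiter) inFlightFor(src string) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.perClient[src]
}

func main() {
	l := newLimiter(2)
	id := connID{src: "100.64.0.1:40000", dst: "100.64.0.2:80"}
	fmt.Println(l.admit(id), l.admit(id), l.inFlightFor(id.src))
	l.done(id)
	fmt.Println(l.inFlightFor(id.src))
}
```

Every increment has exactly one matching decrement, and empty entries are deleted, so the per-client table cannot grow without bound.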