Commit Graph

278 Commits (be107f92d349d19d5b5f461ec3aa10ef6b551ffe)

Author SHA1 Message Date
Andrew Dunham be107f92d3 wgengine/magicsock: track per-endpoint changes in ringbuffer
This change adds a ringbuffer to each magicsock endpoint that keeps a
fixed set of "changes"–debug information about what updates have been
made to that endpoint.

Additionally, this adds a LocalAPI endpoint and associated
"debug peer-status" CLI subcommand to fetch the set of changes for a given
IP or hostname.

Updates tailscale/corp#9364

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I34f726a71bddd0dfa36ec05ebafffb24f6e0516a
1 year ago
Andrew Dunham 73fa7dd7af util/slicesx: add package for generic slice functions, use
Now that we're using rand.Shuffle in a few locations, create a generic
shuffle function and use it instead. While we're at it, move the
interleaveSlices function to the same package for use.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I0b00920e5b3eea846b6cedc30bd34d978a049fd3
1 year ago
Mihai Parparita 9cb332f0e2 sockstats: instrument networking code paths
Uses the hooks added by tailscale/go#45 to instrument the reads and
writes on the major code paths that do network I/O in the client. The
convention is to use "<package>.<type>:<label>" as the annotation for
the responsible code path.

Enabled on iOS, macOS and Android only, since mobile platforms are the
ones we're most interested in, and we are less sensitive to any
throughput degradation due to the per-I/O callback overhead (macOS is
also enabled for ease of testing during development).

For now just exposed as counters on a /v0/sockstats PeerAPI endpoint.

We also keep track of the current interface so that we can break out
the stats by interface.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
1 year ago
Joe Tsai 0d19f5d421
all: replace logtail.{Public,Private}ID with logid.{Public,Private}ID (#7404)
The log ID types were moved to a separate package so that
code that only depend on log ID types do not need to link
in the logic for the logtail client itself.
Not all code need the logtail client.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
1 year ago
David Anderson 70a2929a12 version: make all exported funcs compile-time constant or lazy
Signed-off-by: David Anderson <danderson@tailscale.com>
1 year ago
Tom DNetto 99b9d7a621 all: implement pcap streaming for datapath debugging
Updates: tailscale/corp#8470

Signed-off-by: Tom DNetto <tom@tailscale.com>
1 year ago
Brad Fitzpatrick 03645f0c27 net/{netns,netstat}: use new x/sys/cpu.IsBigEndian
See golang/go#57237

Change-Id: If47ab6de7c1610998a5808e945c4177c561eab45
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick b1248442c3 all: update to Go 1.20, use strings.CutPrefix/Suffix instead of our fork
Updates #7123
Updates #5309

Change-Id: I90bcd87a2fb85a91834a0dd4be6e03db08438672
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Will Norris 51e1ab5560 fixup! util/vizerror: add new package for visible errors
Signed-off-by: Will Norris <will@tailscale.com>
1 year ago
Brad Fitzpatrick a1b4ab34e6 util/httpm: add new package for prettier HTTP method constants
See package doc.

Change-Id: Ibbfc8e1f98294217c56f3a9452bd93ffa3103572
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 6793685bba go.mod: bump AWS SDK past a breaking API change of theirs
They changed a type in their SDK which meant others using the AWS APIs
in their Go programs (with newer AWS modules in their caller go.mod)
and then depending on Tailscale (for e.g. tsnet) then couldn't compile
ipn/store/awsstore.

Thanks to @thisisaaronland for bringing this up.

Fixes #7019

Change-Id: I8d2919183dabd6045a96120bb52940a9bb27193b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Tom DNetto 673b3d8dbd net/dns,userspace: remove unused DNS paths, normalize query limit on iOS
With a42a594bb3, iOS uses netstack and
hence there are no longer any platforms which use the legacy MagicDNS path. As such, we remove it.

We also normalize the limit for max in-flight DNS queries on iOS (it was 64, now its 256 as per other platforms).
It was 64 for the sake of being cautious about memory, but now we have 50Mb (iOS-15 and greater) instead of 15Mb
so we have the spare headroom.

Signed-off-by: Tom DNetto <tom@tailscale.com>
1 year ago
andig 14e8afe444 go.mod, etc: bump gvisor
Fixes #6554

Change-Id: Ia04ae37a47b67fa57091c9bfe1d45a1842589aa8
Signed-off-by: andig <cpuidle@gmx.de>
1 year ago
Jordan Whited 914d115f65
go.mod: bump tailscale/wireguard-go for big-endian fix (#6785)
Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Brad Fitzpatrick c02ccf6424 go.mod: bump dhcp dep to remove another endian package from our tree
To pull in insomniacslk/dhcp#484 to pull in u-root/uio#8

Updates golang/go#57237

Change-Id: I1e56656e0dc9ec0b870f799fe3bc18b3caac1ee4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick ca08e316af util/endian: delete package; use updated josharian/native instead
See josharian/native#3

Updates golang/go#57237

Change-Id: I238c04c6654e5b9e7d9cfb81a7bbc5e1043a84a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Jordan Whited ea5ee6f87c
all: update golang.zx2c4.com/wireguard to github.com/tailscale/wireguard-go (#6692)
This is temporary while we work to upstream performance work in
https://github.com/WireGuard/wireguard-go/pull/64. A replace directive
is less ideal as it breaks dependent code without duplication of the
directive.

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Jordan Whited 76389d8baf
net/tstun, wgengine/magicsock: enable vectorized I/O on Linux (#6663)
This commit updates the wireguard-go dependency and implements the
necessary changes to the tun.Device and conn.Bind implementations to
support passing vectors of packets in tailscaled. This significantly
improves throughput performance on Linux.

Updates #414

Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Co-authored-by: James Tucker <james@tailscale.com>
1 year ago
Aaron Klotz 98f21354c6 cmd/tailscaled: add a special command to tailscaled's Windows service for removing WinTun
WinTun is installed lazily by tailscaled while it is running as LocalSystem.
Based upon what we're seeing in bug reports and support requests, removing
WinTun as a lesser user may fail under certain Windows versions, even when that
user is an Administrator.

By adding a user-defined command code to tailscaled, we can ask the service to
do the removal on our behalf while it is still running as LocalSystem.

* The uninstall code is basically the same as it is in corp;
* The command code will be sent as a service control request and is protected by
  the SERVICE_USER_DEFINED_CONTROL access right, which requires Administrator.

I'll be adding follow-up patches in corp to engage this functionality.

Updates https://github.com/tailscale/tailscale/issues/6433

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
1 year ago
Andrew Dunham a887ca7efe ipn/ipnlocal: improve redactErr to handle more cases
This handles the case where the inner *os.PathError is wrapped in
another error type, and additionally will redact errors of type
*os.LinkError. Finally, add tests to verify that redaction works.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ie83424ff6c85cdb29fb48b641330c495107aab7c
1 year ago
Brad Fitzpatrick 197a4f1ae8 types/ptr: move all the ptrTo funcs to one new package's ptr.To
Change-Id: Ia0b820ffe7aa72897515f19bd415204b6fe743c7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Maisem Ali adc302f428 all: use named pipes on windows
Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Joe Tsai 2e5d08ec4f
net/connstats: invert network logging data flow (#6272)
Previously, tstun.Wrapper and magicsock.Conn managed their
own statistics data structure and relied on an external call to
Extract to extract (and reset) the statistics.
This makes it difficult to ensure a maximum size on the statistics
as the caller has no introspection into whether the number
of unique connections is getting too large.

Invert the control flow such that a *connstats.Statistics
is registered with tstun.Wrapper and magicsock.Conn.
Methods on non-nil *connstats.Statistics are called for every packet.
This allows the implementation of connstats.Statistics (in the future)
to better control when it needs to flush to ensure
bounds on maximum sizes.

The value registered into tstun.Wrapper and magicsock.Conn could
be an interface, but that has two performance detriments:

1. Method calls on interface values are more expensive since
they must go through a virtual method dispatch.

2. The implementation would need a sync.Mutex to protect the
statistics value instead of using an atomic.Pointer.

Given that methods on constats.Statistics are called for every packet,
we want reduce the CPU cost on this hot path.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
1 year ago
Joe Tsai 35c10373b5
types/logid: move logtail ID types here (#6336)
Many packages reference the logtail ID types,
but unfortunately pull in the transitive dependencies of logtail.
Fix this problem by putting the log ID types in its own package
with minimal dependencies.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
1 year ago
Brad Fitzpatrick ea25ef8236 util/set: add new set package for SetHandle type
We use this pattern in a number of places (in this repo and elsewhere)
and I was about to add a fourth to this repo which was crossing the line.
Add this type instead so they're all the same.

Also, we have another Set type (SliceSet, which tracks its keys in
order) in another repo we can move to this package later.

Change-Id: Ibbdcdba5443fae9b6956f63990bdb9e9443cefa9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick e8cc78b1af ipn/ipnserver: change Server to let LocalBackend be supplied async
This is step 1 of de-special-casing of Windows and letting the
LocalAPI HTTP server start serving immediately, even while the rest of
the world (notably the Engine and its TUN device) are being created,
which can take a few to dozens of seconds on Windows.

With this change, the ipnserver.New function changes to not take an
Engine and to return immediately, not returning an error, and let its
Run run immediately. If its ServeHTTP is called when it doesn't yet
have a LocalBackend, it returns an error. A TODO in there shows where
a future handler will serve status before an engine is available.

Future changes will:

* delete a bunch of tailscaled_windows.go code and use this new API
* add the ipnserver.Server ServerHTTP handler to await the engine
  being available
* use that handler in the Windows GUI client

Updates #6522

Change-Id: Iae94e68c235e850b112a72ea24ad0e0959b568ee
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick f45106d47c ipn/ipnserver: move Windows-specific code to tailscaled_windows.go
We'll eventually remove it entirely, but for now move get it out of ipnserver
where it's distracting and move it to its sole caller.

Updates #6522

Change-Id: I9c6f6a91bf9a8e3c5ea997952b7c08c81723d447
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick f3ba268a96 ipn/ipnserver: move BabysitProc to tailscaled_windows.go
It's only used by Windows. No need for it to be in ipn/ipnserver,
which we're trying to trim down.

Change-Id: Idf923ac8b6cdae8b5338ec26c16fb8b5ea548071
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 0a842f353c ipn/ipnserver: move more connection acceptance logic to LocalBackend
Follow-up to #6467 and #6506.

LocalBackend knows the server-mode state, so move more auth checking
there, removing some bookkeeping from ipnserver.Server.

Updates #6417
Updates tailscale/corp#8051

Change-Id: Ic5d14a077bf0dccc92a3621bd2646bab2cc5b837
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Aaron Klotz 033bd94d4c cmd/tailscaled, wgengine/router: use wingoes/com for COM initialization instead of go-ole
This patch removes the crappy, half-backed COM initialization used by `go-ole`
and replaces that with the `StartRuntime` function from `wingoes`, a library I
have started which, among other things, initializes COM properly.

In particular, we should always be initializing COM to use the multithreaded
apartment. Every single OS thread in the process becomes implicitly initialized
as part of the MTA, so we do not need to concern ourselves as to whether or not
any particular OS thread has initialized COM. Furthermore, we no longer need to
lock the OS thread when calling methods on COM interfaces.

Single-threaded apartments are designed solely for working with Win32 threads
that have a message pump; any other use of the STA is invalid.

Fixes https://github.com/tailscale/tailscale/issues/3137

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
1 year ago
Brad Fitzpatrick 7bff7345cc ipn/ipnauth: start splitting ipnserver into new ipnauth package
We're trying to gut 90% of the ipnserver package. A lot will get
deleted, some will move to LocalBackend, and a lot is being moved into
this new ipn/ipnauth package which will be leaf-y and testable.

This is a baby step towards moving some stuff to ipnauth.

Update #6417
Updates tailscale/corp#8051

Change-Id: I28bc2126764f46597d92a2d72565009dc6927ee0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 001f482aca net/dns: make "direct" mode on Linux warn on resolv.conf fights
Run an inotify goroutine and watch if another program takes over
/etc/inotify.conf. Log if so.

For now this only logs. In the future I want to wire it up into the
health system to warn (visible in "tailscale status", etc) about the
situation, with a short URL to more info about how you should really
be using systemd-resolved if you want programs to not fight over your
DNS files on Linux.

Updates #4254 etc etc

Change-Id: I86ad9125717d266d0e3822d4d847d88da6a0daaa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 08e110ebc5 cmd/tailscale: make "up", "status" warn if routes and --accept-routes off
Example output:

    # Health check:
    #     - Some peers are advertising routes but --accept-routes is false

Also, move "tailscale status" health checks to the bottom, where they
won't be lost in large netmaps.

Updates #2053
Updates #6266

Change-Id: I5ae76a0cd69a452ce70063875cd7d974bfeb8f1a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 2daf0f146c ipn/ipnlocal, wgengine/netstack: start handling ports for future serving
Updates tailscale/corp#7515

Change-Id: I966e936e72a2ee99be8d0f5f16872b48cc150258
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 6d8320a6e9 ipn/{ipnlocal,localapi}: move most of cert.go to ipnlocal
Leave only the HTTP/auth bits in localapi.

Change-Id: I8e23fb417367f1e0e31483e2982c343ca74086ab
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick db2cc393af util/dirwalk, metrics, portlist: add new package for fast directory walking
This is similar to the golang.org/x/tools/internal/fastwalk I'd
previously written but not recursive and using mem.RO.

The metrics package already had some Linux-specific directory reading
code in it. Move that out to a new general package that can be reused
by portlist too, which helps its scanning of all /proc files:

    name                old time/op    new time/op    delta
    FindProcessNames-8    2.79ms ± 6%    2.45ms ± 7%  -12.11%  (p=0.000 n=10+10)

    name                old alloc/op   new alloc/op   delta
    FindProcessNames-8    62.9kB ± 0%    33.5kB ± 0%  -46.76%  (p=0.000 n=9+10)

    name                old allocs/op  new allocs/op  delta
    FindProcessNames-8     2.25k ± 0%     0.38k ± 0%  -82.98%  (p=0.000 n=9+10)

Change-Id: I75db393032c328f12d95c39f71c9742c375f207a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick f4ff26f577 types/pad32: delete package
Use Go 1.19's new 64-bit alignment ~hidden feature instead.

Fixes #5356

Change-Id: Ifcbcb115875a7da01df3bc29e9e7feadce5bc956
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Joe Tsai c21a3c4733
types/netlogtype: new package for network logging types (#6092)
The netlog.Message type is useful to depend on from other packages,
but doing so would transitively cause gvisor and other large packages
to be linked in.

Avoid this problem by moving all network logging types to a single package.

We also update staticcheck to take in:

	003d277bcf

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Brad Fitzpatrick 35bee36549 portlist: use win32 calls instead of running netstat process [windows]
Turns out using win32 instead of shelling out to child processes is a
bit faster:

    name                  old time/op    new time/op    delta
    GetListIncremental-4     278ms ± 2%       0ms ± 7%  -99.93%  (p=0.000 n=8+10)

    name                  old alloc/op   new alloc/op   delta
    GetListIncremental-4     238kB ± 0%       9kB ± 0%  -96.12%  (p=0.000 n=10+8)

    name                  old allocs/op  new allocs/op  delta
    GetListIncremental-4     1.19k ± 0%     0.02k ± 0%  -98.49%  (p=0.000 n=10+10)

Fixes #3876 (sadly)

Change-Id: I1195ac5de21a8a8b3cdace5871d263e81aa27e91
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 3697609aaa portlist: remove unix.Readlink allocs on Linux
name       old time/op    new time/op    delta
    GetList-8    11.2ms ± 5%    11.1ms ± 3%     ~     (p=0.661 n=10+9)

    name       old alloc/op   new alloc/op   delta
    GetList-8    83.3kB ± 1%    67.4kB ± 1%  -19.05%  (p=0.000 n=10+10)

    name       old allocs/op  new allocs/op  delta
    GetList-8     2.89k ± 2%     2.19k ± 1%  -24.24%  (p=0.000 n=10+10)

(real issue is we're calling this code as much as we are, but easy
enough to make it efficient because it'll still need to be called
sometimes in any case)

Updates #5958

Change-Id: I90c20278d73e80315a840aed1397d24faa308d93
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Mihai Parparita 9d04ffc782 net/wsconn: add back custom wrapper for turning a websocket.Conn into a net.Conn
We removed it in #4806 in favor of the built-in functionality from the
nhooyr.io/websocket package. However, it has an issue with deadlines
that has not been fixed yet (see nhooyr/websocket#350). Temporarily
go back to using a custom wrapper (using the fix from our fork) so that
derpers will stop closing connections too aggressively.

Updates #5921

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Joe Tsai f9120eee57
wgengine: start network logger in Userspace.Reconfig (#5908)
If the wgcfg.Config is specified with network logging arguments,
then Userspace.Reconfig starts up an asynchronous network logger,
which is shutdown either upon Userspace.Close or when Userspace.Reconfig
is called again without network logging or route arguments.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Joe Tsai 24ebf161e8
net/tstun: instrument Wrapper with statistics gathering (#5847)
If Wrapper.StatisticsEnable is enabled,
then per-connection counters are maintained.
If enabled, Wrapper.StatisticsExtract must be periodically called
otherwise there is unbounded memory growth.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Andrew Dunham e5636997c5
wgengine: don't re-allocate trimmedNodes map (#5825)
Change-Id: I512945b662ba952c47309d3bf8a1b243e05a4736
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
Aaron Klotz 44f13d32d7 cmd/tailscaled, util/winutil: log Windows service diagnostics when the wintun device fails to install
I added new functions to winutil to obtain the state of a service and all
its depedencies, serialize them to JSON, and write them to a Logf.

When tstun.New returns a wrapped ERROR_DEVICE_NOT_AVAILABLE, we know that wintun
installation failed. We then log the service graph rooted at "NetSetupSvc".
We are interested in that specific service because network devices will not
install if that service is not running.

Updates https://github.com/tailscale/tailscale/issues/5531

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2 years ago
Brad Fitzpatrick 9bdf0cd8cd ipn/ipnlocal: add c2n /debug/{goroutines,prefs,metrics}
* and move goroutine scrubbing code to its own package for reuse
* bump capver to 45

Change-Id: I9b4dfa5af44d2ecada6cc044cd1b5674ee427575
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham b1867457a6
doctor: add package for running in-depth healthchecks; use in bugreport (#5413)
Change-Id: Iaa4e5b021a545447f319cfe8b3da2bd3e5e5782b
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2 years ago
Brad Fitzpatrick 58abae1f83 net/dns/{publicdns,resolver}: add NextDNS DoH support
NextDNS is unique in that users create accounts and then get
user-specific DNS IPs & DoH URLs.

For DoH, the customer ID is in the URL path.

For IPv6, the IP address includes the customer ID in the lower bits.

For IPv4, there's a fragile "IP linking" mechanism to associate your
public IPv4 with an assigned NextDNS IPv4 and that tuple maps to your
customer ID.

We don't use the IP linking mechanism.

Instead, NextDNS is DoH-only. Which means using NextDNS necessarily
shunts all DNS traffic through 100.100.100.100 (programming the OS to
use 100.100.100.100 as the global resolver) because operating systems
can't usually do DoH themselves.

Once it's in Tailscale's DoH client, we then connect out to the known
NextDNS IPv4/IPv6 anycast addresses.

If the control plane sends the client a NextDNS IPv6 address, we then
map it to the corresponding NextDNS DoH with the same client ID, and
we dial that DoH server using the combination of v4/v6 anycast IPs.

Updates #2452

Change-Id: I3439d798d21d5fc9df5a2701839910f5bef85463
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham c72caa6672 wgengine/magicsock: use AF_PACKET socket + BPF to read disco messages
This is entirely optional (i.e. failing in this code is non-fatal) and
only enabled on Linux for now. Additionally, this new behaviour can be
disabled by setting the TS_DEBUG_DISABLE_AF_PACKET environment variable.

Updates #3824
Replaces #5474

Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: David Anderson <danderson@tailscale.com>
2 years ago
James Tucker ad1cc6cff9 wgengine: use Go API rather than UAPI for status
Signed-off-by: James Tucker <james@tailscale.com>
2 years ago