Commit Graph

449 Commits (be107f92d349d19d5b5f461ec3aa10ef6b551ffe)

Author SHA1 Message Date
Andrew Dunham be107f92d3 wgengine/magicsock: track per-endpoint changes in ringbuffer
This change adds a ringbuffer to each magicsock endpoint that keeps a
fixed set of "changes"–debug information about what updates have been
made to that endpoint.

Additionally, this adds a LocalAPI endpoint and associated
"debug peer-status" CLI subcommand to fetch the set of changes for a given
IP or hostname.

Updates tailscale/corp#9364

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I34f726a71bddd0dfa36ec05ebafffb24f6e0516a
1 year ago
Mihai Parparita 6ac6ddbb47 sockstats: switch label to enum
Makes it cheaper/simpler to persist values, and encourages reuse of
labels as opposed to generating an arbitrary number.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
1 year ago
Andrew Dunham 3f8e8b04fd cmd/tailscale, cmd/tailscaled: move portmapper debugging into tailscale CLI
The debug flag on tailscaled isn't available in the macOS App Store
build, since we don't have a tailscaled binary; move it to the
'tailscale debug' CLI that is available on all platforms instead,
accessed over LocalAPI.

Updates #7377

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I47bffe4461e036fab577c2e51e173f4003592ff7
1 year ago
Mihai Parparita 9cb332f0e2 sockstats: instrument networking code paths
Uses the hooks added by tailscale/go#45 to instrument the reads and
writes on the major code paths that do network I/O in the client. The
convention is to use "<package>.<type>:<label>" as the annotation for
the responsible code path.

Enabled on iOS, macOS and Android only, since mobile platforms are the
ones we're most interested in, and we are less sensitive to any
throughput degradation due to the per-I/O callback overhead (macOS is
also enabled for ease of testing during development).

For now just exposed as counters on a /v0/sockstats PeerAPI endpoint.

We also keep track of the current interface so that we can break out
the stats by interface.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
1 year ago
Denton Gentry fdc2018d67 wgengine/magicsock: remove superfluous "discokey" from log
The stringification of the discokey type now prepends "discokey:"
Fixes https://github.com/tailscale/tailscale/issues/3074

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
1 year ago
Tom DNetto 2ca6dd1f1d wgengine: start logging DISCO frames to pcap stream
Signed-off-by: Tom DNetto <tom@tailscale.com>
1 year ago
Andrew Dunham 7393ce5e4f wgengine/magicsock: add envknob to print information about port selection
To aid in debugging where a customer has static port-forwards set up and
there are issues establishing a connection through that port.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ic5558bcdb40c9119b83f79dcacf2233b07777f2a
1 year ago
Jordan Whited 39e52c4b0a
wgengine/magicsock: fix de-dup disco ping handling for netmap endpoints (#7118)
Fixes #7116

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Jordan Whited 7921198c05
wgengine/magicsock: de-dup disco pings (#7093)
Fixes #7078

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Will Norris 947c14793a all: update tools that manage copyright headers
Update all code generation tools, and those that check for license
headers to use the new standard header.

Also update copyright statement in LICENSE file.

Fixes #6865

Signed-off-by: Will Norris <will@tailscale.com>
1 year ago
Will Norris 71029cea2d all: update copyright and license headers
This updates all source files to use a new standard header for copyright
and license declaration.  Notably, copyright no longer includes a date,
and we now use the standard SPDX-License-Identifier header.

This commit was done almost entirely mechanically with perl, and then
some minimal manual fixes.

Updates #6865

Signed-off-by: Will Norris <will@tailscale.com>
1 year ago
Brad Fitzpatrick d8feeeee4c wgengine/magicsock: fix buggy fast path in Conn.SetNetworkMap
Spotted by Maisem.

Fixes #6680

Change-Id: I5fdc01de8b006a1c43a2a4848f69397f54b4453a
Co-authored-By: Maisem Ali <maisem@tailscale.com>
Co-authored-By: Andrew Dunham <andrew@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Jordan Whited fec888581a
wgengine/magicsock: retry failed single packet ops across rebinds (#6990)
The single packet WriteTo() through RebindingUDPConn.WriteBatch() was
not checking for a rebind between loading the PacketConn and writing to
it. Same with ReadFrom()/ReadBatch().

Fixes #6989

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Brad Fitzpatrick 2df38b1feb wgengine/magicsock: quiet log flood at tailscaled shutdown
When you hit control-C on a tailscaled (notably in dev mode, but
also on any systemctl stop/restart), there is a flood of messages like:

magicsock: doing cleanup for discovery key d:aa9c92321db0807f
magicsock: doing cleanup for discovery key d:bb0f16aacadbfd46
magicsock: doing cleanup for discovery key d:b5b2d386296536f2
magicsock: doing cleanup for discovery key d:3b640649f6796c91
magicsock: doing cleanup for discovery key d:71d7b1afbcce52cd
magicsock: doing cleanup for discovery key d:315b61d7e0111377
magicsock: doing cleanup for discovery key d:9301f63dce69bf45
magicsock: doing cleanup for discovery key d:376141884d6fe072
....

It can be hundreds or even tens of thousands.

So don't do that. Not a useful log message during shutdown.

Change-Id: I029a8510741023f740877df28adff778246c18e5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Andrew Dunham 11ce5b7e57 ipn/ipnlocal, wgengine/magicsock: check Expired bool on Node; print error in Ping
Change-Id: Ic5f533f175a6e1bb73d4957d8c3f970add42e82e
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
1 year ago
Brad Fitzpatrick be10b529ec wgengine/magicsock: add TS_DISCO_PONG_IPV4_DELAY knob to bias IPv6 paths
Fixes #6818

Change-Id: I71597a045c5b4117af69fba869cb616271c0dfe1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 2eff9c8277 wgengine/magicsock: avoid ReadBatch/WriteBatch on old Linux kernels
Fixes #6807

Change-Id: I161424ef8a7338e1941d5e43d72dc6529993a0e3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 0f604923d3 ipn/ipnlocal: fix StatusWithoutPeers not populating parts of Status
Fixes #4311

Change-Id: Iaae0615148fa7154f4ef8f66b455e3a6c2fa9df3
Co-authored-by: Claire Wang <claire@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 44be59c15a wgengine/magicsock: fix panic in wireguard-go rate limiting path
Fixes #6686

Change-Id: I1055a87141b07261afed8e36c963a69f3be26088
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Jordan Whited ea5ee6f87c
all: update golang.zx2c4.com/wireguard to github.com/tailscale/wireguard-go (#6692)
This is temporary while we work to upstream performance work in
https://github.com/WireGuard/wireguard-go/pull/64. A replace directive
is less ideal as it breaks dependent code without duplication of the
directive.

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Jordan Whited 76389d8baf
net/tstun, wgengine/magicsock: enable vectorized I/O on Linux (#6663)
This commit updates the wireguard-go dependency and implements the
necessary changes to the tun.Device and conn.Bind implementations to
support passing vectors of packets in tailscaled. This significantly
improves throughput performance on Linux.

Updates #414

Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Co-authored-by: James Tucker <james@tailscale.com>
1 year ago
Mihai Parparita bdc45b9066 wgengine/magicsock: fix panic when rebinding fails
We would replace the existing real implementation of nettype.PacketConn
with a blockForeverConn, but that violates the contract of atomic.Value
(where the type cannot change). Fix by switching to a pointer value
(atomic.Pointer[nettype.PacketConn]).

A longstanding issue, but became more prevalent when we started binding
connections to interfaces on macOS and iOS (#6566), which could lead to
the bind call failing if the interface was no longer available.

Fixes #6641

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
1 year ago
Joe Tsai 2e5d08ec4f
net/connstats: invert network logging data flow (#6272)
Previously, tstun.Wrapper and magicsock.Conn managed their
own statistics data structure and relied on an external call to
Extract to extract (and reset) the statistics.
This makes it difficult to ensure a maximum size on the statistics
as the caller has no introspection into whether the number
of unique connections is getting too large.

Invert the control flow such that a *connstats.Statistics
is registered with tstun.Wrapper and magicsock.Conn.
Methods on non-nil *connstats.Statistics are called for every packet.
This allows the implementation of connstats.Statistics (in the future)
to better control when it needs to flush to ensure
bounds on maximum sizes.

The value registered into tstun.Wrapper and magicsock.Conn could
be an interface, but that has two performance detriments:

1. Method calls on interface values are more expensive since
they must go through a virtual method dispatch.

2. The implementation would need a sync.Mutex to protect the
statistics value instead of using an atomic.Pointer.

Given that methods on constats.Statistics are called for every packet,
we want reduce the CPU cost on this hot path.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
1 year ago
Brad Fitzpatrick 3a168cc1ff wgengine/magicsock: ignore pre-disco (pre-0.100) peers
There aren't any in the wild, other than one we ran on purpose to keep
us honest, but we can bump that one forward to 0.100.

Change-Id: I129e70724b2d3f8edf3b496dc01eba3ac5a2a907
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
phirework a011320370
magicsock: cleanup canp2p (#6391)
This renames canP2P in magicsock to canP2PLocked to reflect
expectation of mutex lock, fixes a race we discovered in the meantime,
and updates the current stats.

Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jenny Zhang <jz@tailscale.com>
1 year ago
Joe Tsai 81fd259133
wgengine/magicsock: gather physical-layer statistics (#5925)
There is utility in logging traffic statistics that occurs at the physical layer.
That is, in order to send packets virtually to a particular tailscale IP address,
what physical endpoints did we need to communicate with?

This functionality logs IP addresses identical to
what had always been logged in magicsock prior to #5823,
so there is no increase in PII being logged.

ExtractStatistics returns a mapping of connections to counts.
The source is always a Tailscale IP address (without port),
while the destination is some endpoint reachable on WAN or LAN.
As a special case, traffic routed through DERP will use 127.3.3.40
as the destination address with the port being the DERP region.

This entire feature is only enabled if data-plane audit logging
is enabled on the tailnet (by default it is disabled).

Example of type of information logged:

	------------------------------------  Tx[P/s]    Tx[B/s]  Rx[P/s]   Rx[B/s]
	PhysicalTraffic:                       25.80      3.39Ki   38.80     5.57Ki
	    100.1.2.3 -> 143.11.22.33:41641    15.40      2.00Ki   23.20     3.37Ki
	    100.4.5.6 -> 192.168.0.100:41641   10.20      1.38Ki   15.60     2.20Ki
	    100.7.8.9 -> 127.3.3.40:2           0.20      6.40      0.00     0.00

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Andrew Dunham 74693793be net/netcheck, tailcfg: track whether OS supports IPv6
We had previously added this to the netcheck report in #5087 but never
copied it into the NetInfo struct. Additionally, add it to log lines so
it's visible to support.

Change-Id: Ib6266f7c6aeb2eb2a28922aeafd950fe1bf5627e
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
phirework d13c9cdfb4
wgengine/magicsock: set up pathfinder (#5994)
Sets up new file for separate silent disco goroutine, tentatively named
pathfinder for now.

Updates #540

Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jenny Zhang <jz@tailscale.com>
2 years ago
Brad Fitzpatrick deac82231c wgengine/magicsock: add start of alternate send path
During development of silent disco (#540), an alternate send policy
for magicsock that doesn't wake up the radio frequently with
heartbeats, we want the old & new policies to coexist, like we did
previously pre- and post-disco.

We started to do that earlier in 5c42990c2f but only set up the
env+control knob plumbing to set a bool about which path should be
used.

This starts to add a way for the silent disco code to update the send
path from a separate goroutine. (Part of the effort is going to
de-state-machinify the event based soup that is the current disco
code and make it more Go synchronous style.)

So far this does nothing. (It does add an atomic load on each send
but that should be noise in the grand scheme of things, and a even more
rare atomic store of nil on node config changes.)

Baby steps.

Updates #540

Co-authored-by: Jenny Zhang <jz@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Joe Tsai 14100c0985
wgengine/magicsock: restore allocation-free endpoint.DstToString (#5971)
The wireguard-go code unfortunately calls this unconditionally
even when verbose logging is disabled.

Partial revert of #5911.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Joe Tsai 9ee3df02ee
wgengine/magicsock: remove endpoint.wgEndpoint (#5911)
This field seems seldom used and the documentation is wrong.
It is simpler to just derive its original value dynamically
when endpoint.DstToString is called.

This method is potentially used by wireguard-go,
but not in any code path is performance sensitive.
All calls to it use it in conjunction with fmt.Printf,
which is going to be slow anyways since it uses Go reflection.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
James Tucker 539c073cf0 wgengine/magicsock: set UDP socket buffer sizes to 7MB
- At high data rates more buffer space is required in order to avoid
  packet loss during any cause of delay.
- On slower machines more buffer space is required in order to avoid
  packet loss while decryption & tun writing is underway.
- On higher latency network paths more buffer space is required in order
  to overcome BDP.
- On Linux set with SO_*BUFFORCE to bypass net.core.{r,w}mem_max.
- 7MB is the current default maximum on macOS 12.6
- Windows test is omitted, as Windows does not support getsockopt for
  these options.

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Brad Fitzpatrick 1841d0bf98 wgengine/magicsock: make debug-level stuff not logged by default
And add a CLI/localapi and c2n mechanism to enable it for a fixed
amount of time.

Updates #1548

Change-Id: I71674aaf959a9c6761ff33bbf4a417ffd42195a7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Josh Soref d4811f11a0 all: fix spelling mistakes
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2 years ago
Kyle Carberry 91794f6498 wgengine/magicsock: move firstDerp check after nil derpMap check
This fixes a race condition which caused `c.muCond.Broadcast()` to
never fire in the `firstDerp` if block. It resulted in `Close()`
hanging forever.

Signed-off-by: Kyle Carberry <kyle@carberry.com>
2 years ago
Brad Fitzpatrick 832031d54b wgengine/magicsock: fix recently introduced data race
From 5c42990c2f, not yet released in a stable build.
Caught by existing tests.

Fixes #5685

Change-Id: Ia76bb328809d9644e8b96910767facf627830600
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
phirework 5c42990c2f
wgengine/magicsock: add client flag and envknob to disable heartbeat (#5638)
Baby steps towards turning off heartbeat pings entirely as per #540.
This doesn't change any current magicsock functionality and requires additional
changes to send/disco paths before the flag can be turned on.

Updates #540

Change-Id: Idc9a72748e74145b068d67e6dd4a4ffe3932efd0
Signed-off-by: Jenny Zhang <jz@tailscale.com>

Signed-off-by: Jenny Zhang <jz@tailscale.com>
2 years ago
Brad Fitzpatrick 74674b110d envknob: support changing envknobs post-init
Updates #5114

Change-Id: Ia423fc7486e1b3f3180a26308278be0086fae49b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
James Tucker 672c2c8de8 wgengine/magicsock: add filter to ignore disco to old/other ports
Incoming disco packets are now dropped unless they match one of the
current bound ports, or have a zero port*.

The BPF filter passes all packets with a disco header to the raw packet
sockets regardless of destination port (in order to avoid needing to
reconfigure BPF on rebind).

If a BPF enabled node has just rebound, due to restart or rebind, it may
receive and reply to disco ping packets destined for ports other than
those which are presently bound. If the pong is accepted, the pinging
node will now assume that it can send WireGuard traffic to the pinged
port - such traffic will not reach the node as it is not destined for a
bound port.

*The zero port is ignored, if received. This is a speculative defense
and would indicate a problem in the receive path, or the BPF filter.
This condition is allowed to pass as it may enable traffic to flow,
however it will also enable problems with the same symptoms this patch
otherwise fixes.

Fixes #5536

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
James Tucker be140add75 wgengine/magicsock: fix regression in initial bind for js
1f959edeb0 introduced a regression for JS
where the initial bind no longer occurred at all for JS.

The condition is moved deeper in the call tree to avoid proliferation of
higher level conditions.

Updates #5537

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
James Tucker 1f959edeb0 wgengine/magicksock: remove nullability of RebindingUDPConns
Both RebindingUDPConns now always exist. the initial bind (which now
just calls rebind) now ensures that bind is called for both, such that
they both at least contain a blockForeverConn. Calling code no longer
needs to assert their state.

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Brad Fitzpatrick e470893ba0 wgengine/magicsock: use mak in another spot
Change-Id: I0a46d6243371ae6d126005a2bd63820cb2d1db6b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham c72caa6672 wgengine/magicsock: use AF_PACKET socket + BPF to read disco messages
This is entirely optional (i.e. failing in this code is non-fatal) and
only enabled on Linux for now. Additionally, this new behaviour can be
disabled by setting the TS_DEBUG_DISABLE_AF_PACKET environment variable.

Updates #3824
Replaces #5474

Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: David Anderson <danderson@tailscale.com>
2 years ago
Andrew Dunham e945d87d76
util/uniq: use generics instead of reflect (#5491)
This takes 75% less time per operation per some benchmarks on my mac.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2 years ago
Kris Brandow 5d559141d5 wgengine/magicsock: remove mention of Start
The Start method was removed in 4c27e2fa22, but the comment on NewConn
still mentioned it doesn't do anything until this method is called.

Signed-off-by: Kris Brandow <kris.brandow@gmail.com>
2 years ago
Andrew Dunham f0d6f173c9
net/netcheck: try ICMP if UDP is blocked (#5056)
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2 years ago
Maisem Ali a9f6cd41fd all: use syncs.AtomicValue
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick 4950fe60bd syncs, all: move to using Go's new atomic types instead of ours
Fixes #5185

Change-Id: I850dd532559af78c3895e2924f8237ccc328449d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Maisem Ali 9bb5a038e5 all: use atomic.Pointer
Also add some missing docs.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick 5381437664 logtail, net/portmapper, wgengine/magicsock: use fmt.Appendf
Fixes #5206

Change-Id: I490bb92e774ce7c044040537e2cd864fcf1dbe5a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago