Commit Graph

692 Commits (main)

Author SHA1 Message Date
Jordan Whited 76389d8baf
net/tstun, wgengine/magicsock: enable vectorized I/O on Linux (#6663)
This commit updates the wireguard-go dependency and implements the
necessary changes to the tun.Device and conn.Bind implementations to
support passing vectors of packets in tailscaled. This significantly
improves throughput performance on Linux.

Updates #414

Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Co-authored-by: James Tucker <james@tailscale.com>
1 year ago
Mihai Parparita bdc45b9066 wgengine/magicsock: fix panic when rebinding fails
We would replace the existing real implementation of nettype.PacketConn
with a blockForeverConn, but that violates the contract of atomic.Value
(where the type cannot change). Fix by switching to a pointer value
(atomic.Pointer[nettype.PacketConn]).

A longstanding issue, but became more prevalent when we started binding
connections to interfaces on macOS and iOS (#6566), which could lead to
the bind call failing if the interface was no longer available.

Fixes #6641

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
1 year ago
Joe Tsai 2e5d08ec4f
net/connstats: invert network logging data flow (#6272)
Previously, tstun.Wrapper and magicsock.Conn managed their
own statistics data structure and relied on an external call to
Extract to extract (and reset) the statistics.
This makes it difficult to ensure a maximum size on the statistics
as the caller has no introspection into whether the number
of unique connections is getting too large.

Invert the control flow such that a *connstats.Statistics
is registered with tstun.Wrapper and magicsock.Conn.
Methods on non-nil *connstats.Statistics are called for every packet.
This allows the implementation of connstats.Statistics (in the future)
to better control when it needs to flush to ensure
bounds on maximum sizes.

The value registered into tstun.Wrapper and magicsock.Conn could
be an interface, but that has two performance detriments:

1. Method calls on interface values are more expensive since
they must go through a virtual method dispatch.

2. The implementation would need a sync.Mutex to protect the
statistics value instead of using an atomic.Pointer.

Given that methods on constats.Statistics are called for every packet,
we want reduce the CPU cost on this hot path.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
1 year ago
Brad Fitzpatrick 3a168cc1ff wgengine/magicsock: ignore pre-disco (pre-0.100) peers
There aren't any in the wild, other than one we ran on purpose to keep
us honest, but we can bump that one forward to 0.100.

Change-Id: I129e70724b2d3f8edf3b496dc01eba3ac5a2a907
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
phirework a011320370
magicsock: cleanup canp2p (#6391)
This renames canP2P in magicsock to canP2PLocked to reflect
expectation of mutex lock, fixes a race we discovered in the meantime,
and updates the current stats.

Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jenny Zhang <jz@tailscale.com>
2 years ago
Brad Fitzpatrick da8def8e13 all: remove old +build tags
The //go:build syntax was introduced in Go 1.17:

https://go.dev/doc/go1.17#build-lines

gofmt has kept the +build and go:build lines in sync since
then, but enough time has passed. Time to remove them.

Done with:

    perl -i -npe 's,^// \+build.*\n,,' $(git grep -l -F '+build')

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Joe Tsai 81fd259133
wgengine/magicsock: gather physical-layer statistics (#5925)
There is utility in logging traffic statistics that occurs at the physical layer.
That is, in order to send packets virtually to a particular tailscale IP address,
what physical endpoints did we need to communicate with?

This functionality logs IP addresses identical to
what had always been logged in magicsock prior to #5823,
so there is no increase in PII being logged.

ExtractStatistics returns a mapping of connections to counts.
The source is always a Tailscale IP address (without port),
while the destination is some endpoint reachable on WAN or LAN.
As a special case, traffic routed through DERP will use 127.3.3.40
as the destination address with the port being the DERP region.

This entire feature is only enabled if data-plane audit logging
is enabled on the tailnet (by default it is disabled).

Example of type of information logged:

	------------------------------------  Tx[P/s]    Tx[B/s]  Rx[P/s]   Rx[B/s]
	PhysicalTraffic:                       25.80      3.39Ki   38.80     5.57Ki
	    100.1.2.3 -> 143.11.22.33:41641    15.40      2.00Ki   23.20     3.37Ki
	    100.4.5.6 -> 192.168.0.100:41641   10.20      1.38Ki   15.60     2.20Ki
	    100.7.8.9 -> 127.3.3.40:2           0.20      6.40      0.00     0.00

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Andrew Dunham 74693793be net/netcheck, tailcfg: track whether OS supports IPv6
We had previously added this to the netcheck report in #5087 but never
copied it into the NetInfo struct. Additionally, add it to log lines so
it's visible to support.

Change-Id: Ib6266f7c6aeb2eb2a28922aeafd950fe1bf5627e
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
phirework d13c9cdfb4
wgengine/magicsock: set up pathfinder (#5994)
Sets up new file for separate silent disco goroutine, tentatively named
pathfinder for now.

Updates #540

Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jenny Zhang <jz@tailscale.com>
2 years ago
Brad Fitzpatrick deac82231c wgengine/magicsock: add start of alternate send path
During development of silent disco (#540), an alternate send policy
for magicsock that doesn't wake up the radio frequently with
heartbeats, we want the old & new policies to coexist, like we did
previously pre- and post-disco.

We started to do that earlier in 5c42990c2f but only set up the
env+control knob plumbing to set a bool about which path should be
used.

This starts to add a way for the silent disco code to update the send
path from a separate goroutine. (Part of the effort is going to
de-state-machinify the event based soup that is the current disco
code and make it more Go synchronous style.)

So far this does nothing. (It does add an atomic load on each send
but that should be noise in the grand scheme of things, and a even more
rare atomic store of nil on node config changes.)

Baby steps.

Updates #540

Co-authored-by: Jenny Zhang <jz@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Joe Tsai 14100c0985
wgengine/magicsock: restore allocation-free endpoint.DstToString (#5971)
The wireguard-go code unfortunately calls this unconditionally
even when verbose logging is disabled.

Partial revert of #5911.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Joe Tsai 9ee3df02ee
wgengine/magicsock: remove endpoint.wgEndpoint (#5911)
This field seems seldom used and the documentation is wrong.
It is simpler to just derive its original value dynamically
when endpoint.DstToString is called.

This method is potentially used by wireguard-go,
but not in any code path is performance sensitive.
All calls to it use it in conjunction with fmt.Printf,
which is going to be slow anyways since it uses Go reflection.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
James Tucker 539c073cf0 wgengine/magicsock: set UDP socket buffer sizes to 7MB
- At high data rates more buffer space is required in order to avoid
  packet loss during any cause of delay.
- On slower machines more buffer space is required in order to avoid
  packet loss while decryption & tun writing is underway.
- On higher latency network paths more buffer space is required in order
  to overcome BDP.
- On Linux set with SO_*BUFFORCE to bypass net.core.{r,w}mem_max.
- 7MB is the current default maximum on macOS 12.6
- Windows test is omitted, as Windows does not support getsockopt for
  these options.

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Brad Fitzpatrick 1841d0bf98 wgengine/magicsock: make debug-level stuff not logged by default
And add a CLI/localapi and c2n mechanism to enable it for a fixed
amount of time.

Updates #1548

Change-Id: I71674aaf959a9c6761ff33bbf4a417ffd42195a7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Josh Soref d4811f11a0 all: fix spelling mistakes
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2 years ago
Kyle Carberry 91794f6498 wgengine/magicsock: move firstDerp check after nil derpMap check
This fixes a race condition which caused `c.muCond.Broadcast()` to
never fire in the `firstDerp` if block. It resulted in `Close()`
hanging forever.

Signed-off-by: Kyle Carberry <kyle@carberry.com>
2 years ago
Brad Fitzpatrick 832031d54b wgengine/magicsock: fix recently introduced data race
From 5c42990c2f, not yet released in a stable build.
Caught by existing tests.

Fixes #5685

Change-Id: Ia76bb328809d9644e8b96910767facf627830600
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
phirework 5c42990c2f
wgengine/magicsock: add client flag and envknob to disable heartbeat (#5638)
Baby steps towards turning off heartbeat pings entirely as per #540.
This doesn't change any current magicsock functionality and requires additional
changes to send/disco paths before the flag can be turned on.

Updates #540

Change-Id: Idc9a72748e74145b068d67e6dd4a4ffe3932efd0
Signed-off-by: Jenny Zhang <jz@tailscale.com>

Signed-off-by: Jenny Zhang <jz@tailscale.com>
2 years ago
Eng Zer Jun f0347e841f refactor: move from io/ioutil to io and os packages
The io/ioutil package has been deprecated as of Go 1.16 [1]. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.

Reference: https://golang.org/doc/go1.16#ioutil
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2 years ago
Brad Fitzpatrick 74674b110d envknob: support changing envknobs post-init
Updates #5114

Change-Id: Ia423fc7486e1b3f3180a26308278be0086fae49b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
David Anderson 7c49db02a2 wgengine/magicsock: don't use BPF receive when SO_MARK doesn't work.
Fixes #5607

Signed-off-by: David Anderson <danderson@tailscale.com>
2 years ago
Colin Adler 9c8bbc7888 wgengine/magicsock: fix panic in http debug server
Fixes an panic in `(*magicsock.Conn).ServeHTTPDebug` when the
`recentPongs` ring buffer for an endpoint wraps around.

Signed-off-by: Colin Adler <colin1adler@gmail.com>
2 years ago
James Tucker 672c2c8de8 wgengine/magicsock: add filter to ignore disco to old/other ports
Incoming disco packets are now dropped unless they match one of the
current bound ports, or have a zero port*.

The BPF filter passes all packets with a disco header to the raw packet
sockets regardless of destination port (in order to avoid needing to
reconfigure BPF on rebind).

If a BPF enabled node has just rebound, due to restart or rebind, it may
receive and reply to disco ping packets destined for ports other than
those which are presently bound. If the pong is accepted, the pinging
node will now assume that it can send WireGuard traffic to the pinged
port - such traffic will not reach the node as it is not destined for a
bound port.

*The zero port is ignored, if received. This is a speculative defense
and would indicate a problem in the receive path, or the BPF filter.
This condition is allowed to pass as it may enable traffic to flow,
however it will also enable problems with the same symptoms this patch
otherwise fixes.

Fixes #5536

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
James Tucker be140add75 wgengine/magicsock: fix regression in initial bind for js
1f959edeb0 introduced a regression for JS
where the initial bind no longer occurred at all for JS.

The condition is moved deeper in the call tree to avoid proliferation of
higher level conditions.

Updates #5537

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
James Tucker 1f959edeb0 wgengine/magicksock: remove nullability of RebindingUDPConns
Both RebindingUDPConns now always exist. the initial bind (which now
just calls rebind) now ensures that bind is called for both, such that
they both at least contain a blockForeverConn. Calling code no longer
needs to assert their state.

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Brad Fitzpatrick e470893ba0 wgengine/magicsock: use mak in another spot
Change-Id: I0a46d6243371ae6d126005a2bd63820cb2d1db6b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham c72caa6672 wgengine/magicsock: use AF_PACKET socket + BPF to read disco messages
This is entirely optional (i.e. failing in this code is non-fatal) and
only enabled on Linux for now. Additionally, this new behaviour can be
disabled by setting the TS_DEBUG_DISABLE_AF_PACKET environment variable.

Updates #3824
Replaces #5474

Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: David Anderson <danderson@tailscale.com>
2 years ago
Andrew Dunham e945d87d76
util/uniq: use generics instead of reflect (#5491)
This takes 75% less time per operation per some benchmarks on my mac.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2 years ago
Kris Brandow 5d559141d5 wgengine/magicsock: remove mention of Start
The Start method was removed in 4c27e2fa22, but the comment on NewConn
still mentioned it doesn't do anything until this method is called.

Signed-off-by: Kris Brandow <kris.brandow@gmail.com>
2 years ago
Andrew Dunham f0d6f173c9
net/netcheck: try ICMP if UDP is blocked (#5056)
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2 years ago
Maisem Ali a9f6cd41fd all: use syncs.AtomicValue
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick 4950fe60bd syncs, all: move to using Go's new atomic types instead of ours
Fixes #5185

Change-Id: I850dd532559af78c3895e2924f8237ccc328449d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Maisem Ali 9bb5a038e5 all: use atomic.Pointer
Also add some missing docs.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick 5381437664 logtail, net/portmapper, wgengine/magicsock: use fmt.Appendf
Fixes #5206

Change-Id: I490bb92e774ce7c044040537e2cd864fcf1dbe5a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 5f6abcfa6f all: migrate code from netaddr.FromStdAddr to Go 1.18
With caveat https://github.com/golang/go/issues/53607#issuecomment-1203466984
that then requires a new wrapper. But a simpler one at least.

Updates #5162

Change-Id: I0a5265065bfcd7f21e8dd65b2bd74cae90d76090
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 8725b14056 all: migrate more code code to net/netip directly
Instead of going through the tailscale.com/net/netaddr transitional
wrappers.

Updates #5162

Change-Id: I3dafd1c2effa1a6caa9b7151ecf6edd1a3fda3dd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick fb82299f5a wgengine/magicsock: avoid RebindingUDPConn mutex in common read/write case
Change-Id: I209fac567326f2e926bace2582dbc67a8bc94c78
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 116f55ff66 all: gofmt for Go 1.19
Updates #5210

Change-Id: Ib02cd5e43d0a8db60c1f09755a8ac7b140b670be
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick a12aad6b47 all: convert more code to use net/netip directly
perl -i -npe 's,netaddr.IPPrefixFrom,netip.PrefixFrom,' $(git grep -l -F netaddr.)
    perl -i -npe 's,netaddr.IPPortFrom,netip.AddrPortFrom,' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IPPrefix,netip.Prefix,g' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IPPort,netip.AddrPort,g' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IP\b,netip.Addr,g' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IPv6Raw\b,netip.AddrFrom16,g' $(git grep -l -F netaddr. )
    goimports -w .

Then delete some stuff from the net/netaddr shim package which is no
longer neeed.

Updates #5162

Change-Id: Ia7a86893fe21c7e3ee1ec823e8aba288d4566cd8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 6a396731eb all: use various net/netip parse funcs directly
Mechanical change with perl+goimports.

Changed {Must,}Parse{IP,IPPrefix,IPPort} to their netip variants, then
goimports -d .

Finally, removed the net/netaddr wrappers, to prevent future use.

Updates #5162

Change-Id: I59c0e38b5fbca5a935d701645789cddf3d7863ad
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 7eaf5e509f net/netaddr: start migrating to net/netip via new netaddr adapter package
Updates #5162

Change-Id: Id7bdec303b25471f69d542f8ce43805328d56c12
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick d8cb5aae17 tailcfg, control/controlclient: add tailcfg.PeersChangedPatch [capver 33]
This adds a lighter mechanism for endpoint updates from control.

Change-Id: If169c26becb76d683e9877dc48cfb35f90cc5f24
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Mihai Parparita 27a1ad6a70
wasm: exclude code that's not used on iOS for Wasm too
It has similar size constraints. Saves ~1.9MB from the Wasm build.

Updates #3157

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Mihai Parparita 561f7be434 wgengine/magicsock: remove unused metric
We don't increment the metricRecvData anywhere, just the per-protocol
ones.

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
James Tucker f9e86e64b7 *: use WireGuard where logged, printed or named
Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Maisem Ali 2265587d38 wgengine/{,magicsock}: add metrics for rebinds and restuns
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick 910ae68e0b util/mak: move tailssh's mapSet into a new package for reuse elsewhere
Change-Id: Idfe95db82275fd2be6ca88f245830731a0d5aecf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick f2041c9088 all: use strings.Cut even more
Change-Id: I943ce72c6f339589235bddbe10d07799c4e37979
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Josh Bleecher Snyder 0868329936 all: use any instead of interface{}
My favorite part of generics.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2 years ago
Josh Bleecher Snyder 5f176f24db go.mod: upgrade to the latest wireguard-go
This pulls in a handful of fixes and an update to Go 1.18.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2 years ago
Josh Bleecher Snyder 1b57b0380d wgengine/magicsock: remove final alloc from ReceiveFrom
And now that we don't have to play escape analysis and inlining games,
simplify the code.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2 years ago
Josh Bleecher Snyder 08cf54f386 wgengine/magicsock: fix goMajorVersion for 1.18 ts release
The version string changed slightly. Adapt.
And always check the current Go version to prevent future
accidental regressions. I would have missed this one had
I not explicitly manually checked it.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2 years ago
Maisem Ali 72d8672ef7 tailcfg: make Node.Hostinfo a HostinfoView
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick 2db6cd1025 ipn/ipnlocal, wgengine/magicsock, logpolicy: quiet more logs
Updates #1548

Change-Id: Ied169f872e93be2857890211f2e018307d4aeadc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 730aa1c89c derp/derphttp, wgengine/magicsock: prefer IPv6 to DERPs when IPv6 works
Fixes #3838

Change-Id: Ie47a2a30c7e8e431512824798d2355006d72fb6a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
David Anderson 7a18fe3dca wgengine/magicsock: make debugUseDerpRoute an opt.Bool.
Can still be constant, just needs the extra methods.

Fixes #3812

Signed-off-by: David Anderson <danderson@tailscale.com>
2 years ago
Brad Fitzpatrick 41fd4eab5c envknob: add new package for all the strconv.ParseBool(os.Getenv(..))
A new package can also later record/report which knobs are checked and
set. It also makes the code cleaner & easier to grep for env knobs.

Change-Id: Id8a123ab7539f1fadbd27e0cbeac79c2e4f09751
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Josh Bleecher Snyder de4696da10 wgengine/magicsock: fix deadlock on shutdown
This fixes a deadlock on shutdown.
One goroutine is waiting to send on c.derpRecvCh before unlocking c.mu.
The other goroutine is waiting to lock c.mu before receiving from c.derpRecvCh.

#3736 has a more detailed explanation of the sequence of events.

Fixes #3736

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2 years ago
Brad Fitzpatrick 5404a0557b wgengine/magicsock: remove a per-DERP-packet map lookup in common case
Updates #150

Change-Id: Iffb6eccbe7ca97af97d29be63b7e37d487b3ba28
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 5a317d312d wgengine/magicsock: enable DERP Return Path Optimization (DRPO)
Turning this on at the beginning of the 1.21.x dev cycle, for 1.22.

Updates #150

Change-Id: I1de567cfe0be3df5227087de196ab88e60c9eb56
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick c6c39930cc wgengine/magicsock: fix lock ordering deadlock with derphttp
Fixes #3726

Change-Id: I32631a44dcc1da3ae47764728ec11ace1c78190d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick addda5b96f wgengine/magicsock: fix watchdog timeout on Close when IPv6 not available
The blockForeverConn was only using its sync.Cond one side. Looks like it
was just forgotten.

Fixes #3671

Change-Id: I4ed0191982cdd0bfd451f133139428a4fa48238c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 28bf53f502 wgengine/magicsock: reduce disco ping heartbeat aggressiveness a bit
Bigger changes coming later, but this should improve things a bit in
the meantime.

Rationale:

* 2 minutes -> 45 seconds: 2 minutes was overkill and never considered
  phones/battery at the time. It was totally arbitrary. 45 seconds is
  also arbitrary but is less than 2 minutes.

* heartbeat from 2 seconds to 3 seconds: in practice this meant two
  packets per second (2 pings and 2 pongs every 2 seconds) because the
  other side was also pinging us every 2 seconds on their own.
  That's just overkill. (see #540 too)

So in the worst case before: when we sent a single packet (say: a DNS
packet), we ended up sending 61 packets over 2 minutes: the 1 DNS
query and then then 60 disco pings (2 minutes / 2 seconds) & received
the same (1 DNS response + 60 pongs).  Now it's 15. In 1.22 we plan to
remove this whole timer-based heartbeat mechanism entirely.

The 5 seconds to 6.5 seconds change is just stretching out that
interval so you can still miss two heartbeats (other 3 + 3 seconds
would be greater than 5 seconds). This means that if your peer moves
without telling you, you can have a path out for 6.5 seconds
now instead of 5 seconds before disco finds a new one. That will also
improve in 1.22 when we start doing UDP+DERP at the same time
when confidence starts to go down on a UDP path.

Updates #3363

Change-Id: Ic2314bbdaf42edcdd7103014b775db9cf4facb47
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick a201b89e4a wgengine/magicsock: reconnect to DERP when its definition changes
Change-Id: I7c560feb9e4a6e155a35ec764a68354f19f694e4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 7d9b1de3aa netcheck,portmapper,magicsock: ignore some UDP write errors on Linux
Treat UDP send EPERM errors as a lost UDP packet, not something super
fatal. That's just the Linux firewall preventing it from going out.

And add a leaf package net/neterror for that (and future) policy that
all three packages can share, with tests.

Updates #3619

Change-Id: Ibdb838c43ee9efe70f4f25f7fc7fdf4607ba9c1d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 2c94e3c4ad wgengine/magicsock: don't unconditionally close DERP connections on rebind
Only if the source address isn't on the currently active interface or
a ping of the DERP server fails.

Updates #3619

Change-Id: I6bf06503cff4d781f518b437c8744ac29577acc8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick ae319b4636 wgengine/magicsock: add HTML debug handler to see magicsock state
Change-Id: Ibc46f4e9651e1c86ec6f5d139f5e9bdc7a488415
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick c7f5bc0f69 wgengine/magicsock: add metrics for sent disco messages
We only tracked the transport type (UDP vs DERP), not what they were.

Change-Id: Ia4430c1c53afd4634e2d9893d96751a885d77955
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 486059589b all: gofmt -w -s (simplify) tests
And it updates the build tag style on a couple files.

Change-Id: I84478d822c8de3f84b56fa1176c99d2ea5083237
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 7b9c7bc42b ipn/ipnstate: remove old deprecated TailAddr IPv4-only field
It's been a bunch of releases now since the TailscaleIPs slice
replacement was added.

Change-Id: I3bd80e1466b3d9e4a4ac5bedba8b4d3d3e430a03
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
David Anderson 33c541ae30 ipn/ipnlocal: populate self status from netmap in ipnlocal, not magicsock.
Fixes #1933

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Josh Bleecher Snyder 758c37b83d net/netns: thread logf into control functions
So that darwin can log there without panicking during tests.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 8ec44d0d5f wgengine/magicsock: remove some log spam
Fixes tailscale/corp#3070

Change-Id: Ie50031800ec8669e0596ad6d59d1e329a5c88516
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder b3d6704aa3 wgengine/magicsock: fix data race on endpoint.discoKey
endpoint.discoKey is protected by endpoint.mu.
endpoint.sendDiscoMessage was reading it without holding the lock.
This showed up in a CI failure and is readily reproducible locally.

The fix is in two parts.

First, for Conn.enqueueCallMeMaybe, eliminate the one-line helper method endpoint.sendDiscoMessage; call Conn.sendDiscoMessage directly.
This makes it more natural to read endpoint.discoKey in a context
in which endpoint.mu is already held.

Second, for endpoint.sendDiscoPing, explicitly pass the disco key
as an argument. Again, this makes it easier to read endpoint.discoKey
in a context in which endpoint.mu is already held.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 7901289578 wgengine/magicsock: add a stress test
And add a peerMap validate method that checks its internal invariants.

Updates tailscale/corp#3016

Change-Id: I23708e68ed44d81986d9e2be82029d4555547592
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 5a60781919 wgengine/magicsock: increase TestDiscokeyChange connection timeout
I believe that this should eliminate the flakiness.
If GitHub CI manages to be even slower that can be believed
(and I can believe a lot at this point),
then we should roll this back and make some more invasive changes.

Updates #654
Fixes #3247 (I hope)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 773af7292b wgengine/magicsock: simplify peerMap.upsertEndpoint
We can do the "maybe delete" check unilaterally:
In the case of an insert, both oldDiscoKey
and ep.discoKey will be the zero value.

And since we don't use pi again, we can skip
giving it a name, which makes scoping clearer.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 9da22dac3d wgengine/magicsock: fix bug in peerMap.upsertEndpoint
Found by inspection by David Crawshaw while
investigating tailscale/corp#3016.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 16870cb754 wgengine/magicsock: fix typo in comment
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
David Anderson 41da7620af go.mod: update wireguard-go to pick up roaming toggle
wgengine/wgcfg: introduce wgcfg.NewDevice helper to disable roaming
at all call sites (one real plus several tests).

Fixes tailscale/corp#3016.

Signed-off-by: David Anderson <danderson@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 24ea365d48 netcheck, controlclient, magicsock: add more metrics
Updates #3307

Change-Id: Ibb33425764a75bde49230632f1b472f923551126
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 57b039c51d util/clientmetrics: add new package to add metrics to the client
And annotate magicsock as a start.

And add localapi and debug handlers with the Prometheus-format
exporter.

Updates #3307

Change-Id: I47c5d535fe54424741df143d052760387248f8d3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson 0532eb30db all: replace tailcfg.DiscoKey with key.DiscoPublic.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 7e6a1ef4f1 tailcfg: use key.NodePublic in wire protocol types.
Updates #3206.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 72ace0acba wgengine/magicsock: use key.NodePublic instead of tailcfg.NodeKey.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson d6e7cec6a7 types/netmap: use key.NodePublic instead of tailcfg.NodeKey.
Update #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 84c3a09a8d types/key: export constants for key size, not a method.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 6422789ea0 disco: use key.NodePublic instead of tailcfg.NodeKey.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 418adae379 various: use NodePublic.AsNodeKey() instead of tailcfg.NodeKeyFromNodePublic()
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson eeb97fd89f various: remove remaining uses of key.NewPrivate.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson ef241f782e wgengine/magicsock: remove uses of tailcfg.DiscoKey.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 55b6753c11 wgengine/magicsock: remove use of key.{Public,Private}.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson c1d009b9e9 ipn/ipnstate: use key.NodePublic instead of the generic key.Public.
Updates #3206.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 37c150aee1 derp: use new node key type.
Update #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson e03fda7ae6 wgengine/magicsock: remove test uses of wgkey.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Josh Bleecher Snyder 94fb42d4b2 all: use testingutil.MinAllocsPerRun
There are a few remaining uses of testing.AllocsPerRun:
Two in which we only log the number of allocations,
and one in which dynamically calculate the allocations
target based on a different AllocsPerRun run.

This also allows us to tighten the "no allocs"
test in wgengine/filter.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 1df865a580 wgengine/magicsock: allow even fewer allocs per UDP receive
We improved things again for Go 1.18. Lock that in.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder c1d377078d wgengine/magicsock: use testingutil.MinAllocsPerRun
This speeds up and deflakes the test.

Fixes #2826 (again)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
David Anderson c9bf773312 wgengine/magicsock: replace use of wgkey with new node key type.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 6e5175373e types/netmap: use new node key type.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson a9c78910bd wgengine/wgcfg: convert to use new node key type.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Brad Fitzpatrick b0b0a80318 net/netcheck: implement netcheck for js/wasm clients
And the derper change to add a CORS endpoint for latency measurement.

And a little magicsock change to cut down some log spam on js/wasm.

Updates #3157

Change-Id: I5fd9e6f5098c815116ddc8ac90cbcd0602098a48
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Crawshaw 0b62f26349 magicsock: remove test data race
Speculative, I haven't been able to replicate it locally.

Fixes #3156

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
3 years ago
Brad Fitzpatrick ed3fb197ad wgengine/magicsock: fix/disable a few misc things to get js/wasm working
Updates #3157

Change-Id: Ie9e3a772bb9878584080bb257b32150492e26eaf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick e25afc6656 wgengine/magicsock: don't try to determine endpoints on js/wasm
Avoid netcheck, LocalAddr, etc.

Updates #3157

Change-Id: Ibc875c787c0e101b8076e64833f4fcc809372815
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 6cb2705833 wgengine/magicsock: don't run UDP listeners on js/wasm
Be DERP-only for now. (WebRTC can come later :))

Updates #3157

Change-Id: I56ebb3d914e37e8f4ab651306fd705b817ca381c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c30fa5903d wgengine/magicsock: remove peerMap.byDiscoKey map
No longer used.

Updates #3088

Change-Id: I0ced3f87baa4053d3838d3c4a828ed0293923825
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Crawshaw 3552d86525 wgengine/magicsock: turn down timeouts in tests
Before:

	--- PASS: TestActiveDiscovery (11.78s)
	    --- PASS: TestActiveDiscovery/facing_easy_firewalls (5.89s)
	    --- PASS: TestActiveDiscovery/facing_nats (5.89s)
	    --- PASS: TestActiveDiscovery/simple_internet (0.89s)

After:

	--- PASS: TestActiveDiscovery (1.98s)
	    --- PASS: TestActiveDiscovery/facing_easy_firewalls (0.99s)
	    --- PASS: TestActiveDiscovery/facing_nats (0.99s)
	    --- PASS: TestActiveDiscovery/simple_internet (0.89s)

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
3 years ago
David Anderson b956139b0c wgengine/magicsock: track IP<>node mappings without relying on discokeys.
Updates #3088.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Brad Fitzpatrick 7a243ae5b1 wgengine/magicsock: finish TODO to speed up peerMap.forEachEndpointWithDiscoKey
Now that peerMap tracks the set of nodes for a DiscoKey.

Updates #3088

Change-Id: I927bf2bdfd2b8126475f6b6acc44bc799fcb489f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 11fdb14c53 wgengine/magicsock: don't check always-non-nil endpoint for nil-ness
Continuation of 2aa5df7ac1, remove nil
check because it can never be nil. (It previously was able to be nil.)

Change-Id: I59cd9ad611dbdcbfba680ed9b22e841b00c9d5e6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson e7eb46bced wgengine/magicsock: add an explicit else branch to peerMap update.
Clarifies that the replace+delete of peerinfo data is only when peerInfo
already exists.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 2aa5df7ac1 wgengine/magicsock: document and enforce that peerInfo.ep is non-nil.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 521b44e653 wgengine/magicsock: move discoKey fields to the mutex-protected section.
Fixes #3106

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Brad Fitzpatrick a6d02dc122 wgengine/magicsock: track which NodeKey each DiscoKey was last for
This adds new fields (currently unused) to discoInfo to track what the
last verified (unambiguous) NodeKey a DiscoKey last mapped to, and
when.

Then on CallMeMaybe, Pong and on most Pings, we update the mapping
from DiscoKey to the current NodeKey for that DiscoKey.

Updates #3088

Change-Id: Idc4261972084dec71cf8ec7f9861fb9178eb0a4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c759fcc7d3 wgengine/magicsock: fix data race with sync.Pool in error+logging path
Fixes #3122

Change-Id: Ib52e84f9bd5813d6cf2e80ce5b2296912a48e064
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 75a7779b42 disco, wgengine/magicsock: send self node key in disco pings
This lets clients quickly (sub-millisecond within a local LAN) map
from an ambiguous disco key to a node key without waiting for a
CallMeMaybe (over relatively high latency DERP).

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Denton Gentry def650b3e8 wgengine/magicsock: don't Rebind after STUN error if closed.
https://github.com/tailscale/tailscale/pull/3014 added a
rebind on STUN failure, which means there can now be a
tailscale.com/wgengine/magicsock.(*RebindingUDPConn).ReadFromNetaddr
in progress at the end of the test waiting for a STUN
response which will never arrive.

This causes a test flake due to the resource leak in those
cases where the Conn decided to rebind. For whatever reason,
it mostly flakes with Windows.

If the Conn is closed, don't Rebind after a send error.

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
3 years ago
Brad Fitzpatrick f55c2bccf5 wgengine/magicsock: don't call setAddrToDiscoLocked on DERP ping
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 569f70abfd wgengine/magicsock: finish some renamings of discoEndpoint to endpoint
Renames only; continuation of earlier 8049063d35

These kept confusing me while working on #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 695df497ba wgengine/magicsock: delete peerMap.endpointForDiscoKey, remove remaining caller
The one remaining caller of peerMap.endpointForDiscoKey was making the
improper assumption that there's exactly 1 node with a given DiscoKey
in the network. That was the cause of #3088.

Now that all the other callers have been updated to not use
endpointForDiscoKey, there's no need to try to keep maintaining that
prone-to-misuse index.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 04fd94acd6 wgengine/magicsock: remove endpointForDiscoKey call from handleDiscoMessage
A DiscoKey maps 1:n to endpoints. When we get a disco pong, we don't
necessarily know which endpoint sent it to us. Ask them all. There
will only usually be 1 (and in rare circumstances 2). So it's easier
to ask all two rather than building new maps from the random ping TxID
to its endpoint.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 151b4415ca wgengine/magicsock: remove endpoint parameter from handlePingLocked
We can reply to a ping without knowing which exact node it's from.  As
long as it's in our netmap, it's safe to reply. If there's more than
one node with that discokey, it doesn't matter who we're relpying to.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick d86081f353 wgengine/magicsock: add new discoInfo type for DiscoKey state, move some fields
As more prep for removing the false assumption that you're able to
map from DiscoKey to a single peer, move the lastPingFrom and lastPingTime
fields from the endpoint type to a new discoInfo type, effectively upgrading
the old sharedDiscoKey map (which only held a *[32]byte nacl precomputed key
as its value) to discoInfo which then includes that naclbox key.

Then start plumbing it into handlePing in prep for removing the need
for handlePing to take an endpoint parameter.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick e5779f019e wgengine/magicsock: move temporary endpoint lookup later, add TODO to remove
Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 36a07089ee wgengine/magicsock: remove redundant/wrong sharedDiscoKey delete
The pass just after in this method handles cleaning up sharedDiscoKey.
No need to do it wrong (assuming DiscoKey => 1 node) earlier.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 3e80806804 wgengine/magicsock: pass src NodeKey to handleDiscoMessage for DERP disco msgs
And then use it to avoid another lookup-by-DiscoKey.

Updates #3088
3 years ago
Brad Fitzpatrick 82fa15fa3b wgengine/magicsock: start removing endpointForDiscoKey
It's not valid to assume that a discokey is globally unique.

This removes the first two of the four callers.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Avery Pennarun 0d4a0bf60e magicsock: if STUN failed to send before, rebind before STUNning again.
On iOS (and possibly other platforms), sometimes our UDP socket would
get stuck in a state where it was bound to an invalid interface (or no
interface) after a network reconfiguration. We can detect this by
actually checking the error codes from sending our STUN packets.

If we completely fail to send any STUN packets, we know something is
very broken. So on the next STUN attempt, let's rebind the UDP socket
to try to correct any problems.

This fixes a problem where iOS would sometimes get stuck using DERP
instead of direct connections until the backend was restarted.

Fixes #2994

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
3 years ago
David Anderson 830f641c6b wgengine/magicsock: update discokeys on netmap change.
Fixes #3008.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Josh Bleecher Snyder a722e48cef wgengine/magicsock: skip alloc test with -race
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 31c1331415 wgengine/magicsock: deflake TestReceiveFromAllocs
100 iterations isn't enough with background allocs happening
apparently. 1000 seems to be reliable.

Fixes #2826

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 2238814b99 wgengine/magicsock: fix crash introduced in recent cleanups
Fixes #2801

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 640134421e all: update tests to use tstest.MemLogger
And give MemLogger a mutex, as one caller had, which does match the logf
contract better.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson efe8020dfa wgengine/magicsock: fix race condition in tests.
AFAICT this was always present, the log read mid-execution was never safe.
But it seems like the recent magicsock refactoring made the race much
more likely.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Brad Fitzpatrick 5bacbf3744 wgengine/magicsock, health, ipn/ipnstate: track DERP-advertised health
And add health check errors to ipnstate.Status (tailscale status --json).

Updates #2746
Updates #2775

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson bb10443edf wgengine/wgcfg: use just the hexlified node key as the WireGuard endpoint.
The node key is all magicsock needs to find the endpoint that WireGuard
needs.

Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson d00341360f wgengine/magicsock: remove unused debug knob.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson dfd978f0f2 wgengine/magicsock: use NodeKey, not DiscoKey, as the trigger for lazy reconfig.
Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 4c27e2fa22 wgengine/magicsock: remove Start method from Conn.
Over time, other magicsock refactors have made Start effectively a
no-op, except that some other functions choose to panic if called
before Start.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 1a899344bd wgengine/magicsock: don't store tailcfg.Nodes alongside endpoints.
Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson b2181608b5 wgengine/magicsock: eagerly create endpoints in SetNetworkMap.
Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Emmanuel T Odeke 0daa32943e all: add (*testing.B).ReportAllocs() to every benchmark
This ensures that we can properly track and catch allocation
slippages that could otherwise have been missed.

Fixes #2748
3 years ago
David Anderson 44d71d1e42 wgengine/magicsock: fix race in test shutdown, again.
We were returning an error almost, but not quite like errConnClosed in
a single codepath, which could still trip the panic on reconfig in the
test logic.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson f09ede9243 wgengine/magicsock: don't configure eager WireGuard handshaking in tests.
Our prod code doesn't eagerly handshake, because our disco layer enables
on-demand handshaking. Configuring both peers to eagerly handshake leads
to WireGuard handshake races that make TestTwoDevicePing flaky.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 86d1c4eceb wgengine/magicsock: ignore close races even harder.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 8bacfe6a37 wgengine/magicsock: remove unused sendLogLimit limiter.
Magicsock these days gets its logs limited by the global log limiter.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson e151b74f93 wgengine/magicsock: remove opts.SimulatedNetwork.
It only existed to override one test-only behavior with a
different test-only behavior, in both cases working around
an annoying feature of our CI environments. Instead, handle
that weirdness entirely in the test code, with a tweaked
TestOnlyPacketListener that gets injected.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 58c1f7d51a wgengine/magicsock: rename opts.PacketListener to TestOnlyPacketListener.
The docstring said it was meant for use in tests, but it's specifically a
special codepath that is _only_ used in tests, so make the claim stronger.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 8049063d35 wgengine/magicsock: rename discoEndpoint to just endpoint.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson f2d949e2db wgengine/magicsock: fold findEndpoint into its only remaining caller.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson fe2f89deab wgengine/magicsock: fix rare shutdown race in test.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 97693f2e42 wgengine/magicsock: delete legacy AddrSet endpoints.
Instead of using the legacy codepath, teach discoEndpoint to handle
peers that have a home DERP, but no disco key. We can still communicate
with them, but only over DERP.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
slowy07 ac0353e982 fix: typo spelling grammar
Signed-off-by: slowy07 <slowy.arfy@gmail.com>
3 years ago
Brad Fitzpatrick 37053801bb wgengine/magicsock: restore a bit of logging on node becoming active
Fixes #2695

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 39610aeb09 wgengine/magicsock: move debug knobs to their own file, compile out on iOS
No need for these knobs on iOS where you can set the environment
variables anyway.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick f3c96df162 ipn/ipnstate: move tailscale status "active" determination to tailscaled
Fixes #2579

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick b622c60ed0 derp,wgengine/magicsock: don't assume stringer is in $PATH for go:generate
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder 8a3d52e882 wgengine/magicsock: use mono.Time
magicsock makes multiple calls to Now per packet.
Move to mono.Now. Changing some of the calls to
use package mono has a cascading effect,
causing non-per-packet call sites to also switch.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 4dbbd0aa4a cmd/addlicense: add command to add licenseheaders to generated code
And use it to make our stringer invocations match the existing code.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder c179580599 wgengine/magicsock: add debug envvar to force all traffic over DERP
This would have been useful during debugging DERP issues recently.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 4f4dae32dd wgengine/magicsock: fix latent data race in test
logBufWriter had no serialization.
It just so happens that none of its users currently ever log concurrently.
Make it safe for concurrent use.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 7e7c4c1bbe tailcfg: break DERPNode.DERPTestPort into DERPPort & InsecureForTests
The DERPTestPort int meant two things before: which port to use, and
whether to disable TLS verification. Users would like to set the port
without disabling TLS, so break it into two options.

Updates #1264

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 92077ae78c wgengine/magicsock: make portmapping async
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
julianknodt 506c2fe8e2 cmd/tailscale: make netcheck use active DERP map, delete static copy
After allowing for custom DERP maps, it's convenient to be able to see their latency in
netcheck. This adds a query to the local tailscaled for the current DERPMap.

Updates #1264

Signed-off-by: julianknodt <julianknodt@gmail.com>
3 years ago
David Crawshaw 5f8ffbe166 magicsock: add SetPreferredPort method
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
3 years ago
Josh Bleecher Snyder ddf6c8c729 wgengine/magicsock: delete dead code
Co-authored-by: Adrian Dewhurst <adrian@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 1ece91cede go.mod: upgrade wireguard-windows, de-fork wireguard-go
Pull in the latest version of wireguard-windows.

Switch to upstream wireguard-go.
This requires reverting all of our import paths.

Unfortunately, this has to happen at the same time.
The wireguard-go change is very low risk,
as that commit matches our fork almost exactly.
(The only changes are import paths, CI files, and a go.mod entry.)
So if there are issues as a result of this commit,
the first place to look is wireguard-windows changes.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 25df067dd0 all: adapt to opaque netaddr types
This commit is a mishmash of automated edits using gofmt:

gofmt -r 'netaddr.IPPort{IP: a, Port: b} -> netaddr.IPPortFrom(a, b)' -w .
gofmt -r 'netaddr.IPPrefix{IP: a, Port: b} -> netaddr.IPPrefixFrom(a, b)' -w .

gofmt -r 'a.IP.Is4 -> a.IP().Is4' -w .
gofmt -r 'a.IP.As16 -> a.IP().As16' -w .
gofmt -r 'a.IP.Is6 -> a.IP().Is6' -w .
gofmt -r 'a.IP.As4 -> a.IP().As4' -w .
gofmt -r 'a.IP.String -> a.IP().String' -w .

And regexps:

\w*(.*)\.Port = (.*)  ->  $1 = $1.WithPort($2)
\w*(.*)\.IP = (.*)  ->  $1 = $1.WithIP($2)

And lots of manual fixups.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder ebcd7ab890 wgengine: remove wireguard-go DeviceOptions
We no longer need them.
This also removes the 32 bytes of prefix junk before endpoints.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder aacb2107ae all: add extra information to serialized endpoints
magicsock.Conn.ParseEndpoint requires a peer's public key,
disco key, and legacy ip/ports in order to do its job.
We currently accomplish that by:

* adding the public key in our wireguard-go fork
* encoding the disco key as magic hostname
* using a bespoke comma-separated encoding

It's a bit messy.

Instead, switch to something simpler: use a json-encoded struct
containing exactly the information we need, in the form we use it.

Our wireguard-go fork still adds the public key to the
address when it passes it to ParseEndpoint, but now the code
compensating for that is just a couple of simple, well-commented lines.
Once this commit is in, we can remove that part of the fork
and remove the compensating code.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder ddd85b9d91 wgengine/magicsock: rename discoEndpoint.wgEndpointHostPort to wgEndpoint
Fields rename only.

Part of the general effort to make our code agnostic about endpoint formatting.
It's just a name, but it will soon be a misleading one; be more generic.
Do this as a separate commit because it generates a lot of whitespace changes.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder e0bd3cc70c wgengine/magicsock: use netaddr.MustParseIPPrefix
Delete our bespoke helper.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder bc68e22c5b all: s/CreateEndpoint/ParseEndpoint/ in docs
Upstream wireguard-go renamed the interface method
from CreateEndpoint to ParseEndpoint.
I missed some comments. Fix them.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder a0dacba877 wgengine/magicsock: simplify legacy endpoint DstToString
Legacy endpoints (addrSet) currently reconstruct their dst string when requested.

Instead, store the dst string we were given to begin with.
In addition to being simpler and cheaper, this makes less code
aware of how to interpret endpoint strings.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 64047815b0 wgenengine/magicsock: delete cursed tests
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 7ee891f5fd all: delete wgcfg.Key and wgcfg.PrivateKey
For historical reasons, we ended up with two near-duplicate
copies of curve25519 key types, one in the wireguard-go module
(wgcfg) and one in the tailscale module (types/wgkey).
Then we moved wgcfg to the tailscale module.
We can now remove the wgcfg key type in favor of wgkey.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 9d542e08e2 wgengine/magicsock: always run ReceiveIPv6
One of the consequences of the 	bind refactoring in 6f23087175
is that attempting to bind an IPv6 socket will always
result in c.pconn6.pconn being non-nil.
If the bind fails, it'll be set to a placeholder packet conn
that blocks forever.

As a result, we can always run ReceiveIPv6 and health check it.
This removes IPv4/IPv6 asymmetry and also will allow health checks
to detect any IPv6 receive func failures.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder fe50ded95c health: track whether we have a functional udp4 bind
Suggested-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 7dc7078d96 wgengine/magicsock: use netaddr.IP in listenPacket
It must be an IP address; enforce that at the type level.

Suggested-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 3c543c103a wgengine/magicsock: unify initial bind and rebind
We had two separate code paths for the initial UDP listener bind
and any subsequent rebinds.

IPv6 got left out of the rebind code.
Rather than duplicate it there, unify the two code paths.
Then improve the resulting code:

* Rebind had nested listen attempts to try the user-specified port first,
  and then fall back to :0 if that failed. Convert that into a loop.
* Initial bind tried only the user-specified port.
  Rebind tried the user-specified port and 0.
  But there are actually three ports of interest:
  The one the user specified, the most recent port in use, and 0.
  We now try all three in order, as appropriate.
* In the extremely rare case in which binding to port 0 fails,
  use a dummy net.PacketConn whose reads block until close.
  This will keep the wireguard-go receive func goroutine alive.

As a pleasant side-effect of this, if we decide that
we need to resuscitate #1796, it will now be much easier.

Fixes #1799

Co-authored-by: David Anderson <danderson@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 8fb66e20a4 wgengine/magicsock: remove DefaultPort const
Assume it'll stay at 0 forever, so hard-code it
and delete code conditional on it being non-0.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder a8f61969b9 wgengine/magicsock: remove context arg from listenPacket
It was set to context.Background by all callers, for the same reasons.
Set it locally instead, to simplify call sites.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 744de615f1 health, wgenegine: fix receive func health checks for the fourth time
The old implementation knew too much about how wireguard-go worked.
As a result, it missed genuine problems that occurred due to unrelated bugs.

This fourth attempt to fix the health checks takes a black box approach.
A receive func is healthy if one (or both) of these conditions holds:

* It is currently running and blocked.
* It has been executed recently.

The second condition is required because receive functions
are not continuously executing. wireguard-go calls them and then
processes their results before calling them again.

There is a theoretical false positive if wireguard-go go takes
longer than one minute to process the results of a receive func execution.
If that happens, we have other problems.

Updates #1790

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 0d4c8cb2e1 health: delete ReceiveFunc health checks
They were not doing their job.
They need yet another conceptual re-think.
Start by clearing the decks.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 8d7f7fc7ce health, wgenegine: fix receive func health checks yet again
The existing implementation was completely, embarrassingly conceptually broken.

We aren't able to see whether wireguard-go's receive function goroutines
are running or not. All we can do is model that based on what we have done.
This commit fixes that model.

Fixes #1781

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 5835a3f553 health, wgengine/magicsock: avoid receive function false positives
Avery reported a sub-ms health transition from "receiveIPv4 not running" to "ok".

To avoid these transient false-positives, be more precise about
the expected lifetime of receive funcs. The problematic case is one in which
they were started but exited prior to a call to connBind.Close.
Explicitly represent started vs running state, taking care with the order of updates.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder f845aae761 health: track whether magicsock receive functions are running
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 48e30bb8de wgengine/magicsock: remove named return
Doesn't add anything.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder a2a2c0ce1c wgengine/magicsock: fix two comments
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder b1e624ef04 wgengine/magicsock: remove unnecessary type assertions
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 98714e784b wgengine/magicsock: improve Rebind logging
We were accidentally logging oldPort -> oldPort.

Log oldPort as well as c.port; if we failed to get the preferred port
in a previous rebind, oldPort might differ from c.port.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 15ceacc4c5 wgengine/magicsock: accept a host and port instead of an addr in listenPacket
This simplifies call sites and prevents accidental failure to use net.JoinHostPort.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Brad Fitzpatrick b993d9802a ipn/ipnlocal, etc: require file sharing capability to send/recv files
tailscale/corp#1582

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 762180595d ipn/ipnstate: add PeerStatus.TailscaleIPs slice, deprecate TailAddr
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 34d2f5a3d9 tailcfg: add Endpoint, EndpointType, MapRequest.EndpointType
Track endpoints internally with a new tailcfg.Endpoint type that
includes a typed netaddr.IPPort (instead of just a string) and
includes a type for how that endpoint was discovered (STUN, local,
etc).

Use []tailcfg.Endpoint instead of []string internally.

At the last second, send it to the control server as the existing
[]string for endpoints, but also include a new parallel
MapRequest.EndpointType []tailcfg.EndpointType, so the control server
can start filtering out less-important endpoint changes from
new-enough clients. Notably, STUN-discovered endpoints can be filtered
out from 1.6+ clients, as they can discover them amongst each other
via CallMeMaybe disco exchanges started over DERP. And STUN endpoints
change a lot, causing a lot of MapResposne updates. But portmapped
endpoints are worth keeping for now, as they they work right away
without requiring the firewall traversal extra RTT dance.

End result will be less control->client bandwidth. (despite negligible
increase in client->control bandwidth)

Updates tailscale/corp#1543

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder ba72126b72 wgengine/magicsock: remove RebindingUDPConn.FakeClosed
It existed to work around the frequent opening and closing
of the conn.Bind done by wireguard-go.
The preceding commit removed that behavior,
so we can simply close the connections
when we are done with them.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 69cdc30c6d wgengine/wgcfg: remove Config.ListenPort
We don't use the port that wireguard-go passes to us (via magicsock.connBind.Open).
We ignore it entirely and use the port we selected.

When we tell wireguard-go that we're changing the listen_port,
it calls connBind.Close and then connBind.Open.
And in the meantime, it stops calling the receive functions,
which means that we stop receiving and processing UDP and DERP packets.
And that is Very Bad.

That was never a problem prior to b3ceca1dd7,
because we passed the SkipBindUpdate flag to our wireguard-go fork,
which told wireguard-go not to re-bind on listen_port changes.
That commit eliminated the SkipBindUpdate flag.

We could write a bunch of code to work around the gap.
We could add background readers that process UDP and DERP packets when wireguard-go isn't.
But it's simpler to never create the conditions in which wireguard-go rebinds.

The other scenario in which wireguard-go re-binds is device.Down.
Conveniently, we never call device.Down. We go from device.Up to device.Close,
and the latter only when we're shutting down a magicsock.Conn completely.

Rubber-ducked-by: Avery Pennarun <apenwarr@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder b3ceca1dd7 wgengine/...: split into multiple receive functions
Upstream wireguard-go has changed its receive model.
NewDevice now accepts a conn.Bind interface.

The conn.Bind is stateless; magicsock.Conns are stateful.
To work around this, we add a connBind type that supports
cheap teardown and bring-up, backed by a Conn.

The new conn.Bind allows us to specify a set of receive functions,
rather than having to shoehorn everything into ReceiveIPv4 and ReceiveIPv6.
This lets us plumbing DERP messages directly into wireguard-go,
instead of having to mux them via ReceiveIPv4.

One consequence of the new conn.Bind layer is that
closing the wireguard-go device is now indistinguishable
from the routine bring-up and tear-down normally experienced
by a conn.Bind. We thus have to explicitly close the magicsock.Conn
when the close the wireguard-go device.

One downside of this change is that we are reliant on wireguard-go
to call receiveDERP to process DERP messages. This is fine for now,
but is perhaps something we should fix in the future.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 34d4943357 all: gofmt -s
The code is not obviously better or worse, but this makes the little warning
triangle in my editor go away, and the distraction removal is worth it.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago