Commit Graph

409 Commits (640134421eb922e8b85c3686f9966d7001a02879)

Author SHA1 Message Date
Josh Bleecher Snyder 744de615f1 health, wgenegine: fix receive func health checks for the fourth time
The old implementation knew too much about how wireguard-go worked.
As a result, it missed genuine problems that occurred due to unrelated bugs.

This fourth attempt to fix the health checks takes a black box approach.
A receive func is healthy if one (or both) of these conditions holds:

* It is currently running and blocked.
* It has been executed recently.

The second condition is required because receive functions
are not continuously executing. wireguard-go calls them and then
processes their results before calling them again.

There is a theoretical false positive if wireguard-go go takes
longer than one minute to process the results of a receive func execution.
If that happens, we have other problems.

Updates #1790

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 0d4c8cb2e1 health: delete ReceiveFunc health checks
They were not doing their job.
They need yet another conceptual re-think.
Start by clearing the decks.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 8d7f7fc7ce health, wgenegine: fix receive func health checks yet again
The existing implementation was completely, embarrassingly conceptually broken.

We aren't able to see whether wireguard-go's receive function goroutines
are running or not. All we can do is model that based on what we have done.
This commit fixes that model.

Fixes #1781

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 5835a3f553 health, wgengine/magicsock: avoid receive function false positives
Avery reported a sub-ms health transition from "receiveIPv4 not running" to "ok".

To avoid these transient false-positives, be more precise about
the expected lifetime of receive funcs. The problematic case is one in which
they were started but exited prior to a call to connBind.Close.
Explicitly represent started vs running state, taking care with the order of updates.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder f845aae761 health: track whether magicsock receive functions are running
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 48e30bb8de wgengine/magicsock: remove named return
Doesn't add anything.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder a2a2c0ce1c wgengine/magicsock: fix two comments
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder b1e624ef04 wgengine/magicsock: remove unnecessary type assertions
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 98714e784b wgengine/magicsock: improve Rebind logging
We were accidentally logging oldPort -> oldPort.

Log oldPort as well as c.port; if we failed to get the preferred port
in a previous rebind, oldPort might differ from c.port.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 15ceacc4c5 wgengine/magicsock: accept a host and port instead of an addr in listenPacket
This simplifies call sites and prevents accidental failure to use net.JoinHostPort.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Brad Fitzpatrick b993d9802a ipn/ipnlocal, etc: require file sharing capability to send/recv files
tailscale/corp#1582

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 762180595d ipn/ipnstate: add PeerStatus.TailscaleIPs slice, deprecate TailAddr
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 34d2f5a3d9 tailcfg: add Endpoint, EndpointType, MapRequest.EndpointType
Track endpoints internally with a new tailcfg.Endpoint type that
includes a typed netaddr.IPPort (instead of just a string) and
includes a type for how that endpoint was discovered (STUN, local,
etc).

Use []tailcfg.Endpoint instead of []string internally.

At the last second, send it to the control server as the existing
[]string for endpoints, but also include a new parallel
MapRequest.EndpointType []tailcfg.EndpointType, so the control server
can start filtering out less-important endpoint changes from
new-enough clients. Notably, STUN-discovered endpoints can be filtered
out from 1.6+ clients, as they can discover them amongst each other
via CallMeMaybe disco exchanges started over DERP. And STUN endpoints
change a lot, causing a lot of MapResposne updates. But portmapped
endpoints are worth keeping for now, as they they work right away
without requiring the firewall traversal extra RTT dance.

End result will be less control->client bandwidth. (despite negligible
increase in client->control bandwidth)

Updates tailscale/corp#1543

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder ba72126b72 wgengine/magicsock: remove RebindingUDPConn.FakeClosed
It existed to work around the frequent opening and closing
of the conn.Bind done by wireguard-go.
The preceding commit removed that behavior,
so we can simply close the connections
when we are done with them.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 69cdc30c6d wgengine/wgcfg: remove Config.ListenPort
We don't use the port that wireguard-go passes to us (via magicsock.connBind.Open).
We ignore it entirely and use the port we selected.

When we tell wireguard-go that we're changing the listen_port,
it calls connBind.Close and then connBind.Open.
And in the meantime, it stops calling the receive functions,
which means that we stop receiving and processing UDP and DERP packets.
And that is Very Bad.

That was never a problem prior to b3ceca1dd7,
because we passed the SkipBindUpdate flag to our wireguard-go fork,
which told wireguard-go not to re-bind on listen_port changes.
That commit eliminated the SkipBindUpdate flag.

We could write a bunch of code to work around the gap.
We could add background readers that process UDP and DERP packets when wireguard-go isn't.
But it's simpler to never create the conditions in which wireguard-go rebinds.

The other scenario in which wireguard-go re-binds is device.Down.
Conveniently, we never call device.Down. We go from device.Up to device.Close,
and the latter only when we're shutting down a magicsock.Conn completely.

Rubber-ducked-by: Avery Pennarun <apenwarr@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder b3ceca1dd7 wgengine/...: split into multiple receive functions
Upstream wireguard-go has changed its receive model.
NewDevice now accepts a conn.Bind interface.

The conn.Bind is stateless; magicsock.Conns are stateful.
To work around this, we add a connBind type that supports
cheap teardown and bring-up, backed by a Conn.

The new conn.Bind allows us to specify a set of receive functions,
rather than having to shoehorn everything into ReceiveIPv4 and ReceiveIPv6.
This lets us plumbing DERP messages directly into wireguard-go,
instead of having to mux them via ReceiveIPv4.

One consequence of the new conn.Bind layer is that
closing the wireguard-go device is now indistinguishable
from the routine bring-up and tear-down normally experienced
by a conn.Bind. We thus have to explicitly close the magicsock.Conn
when the close the wireguard-go device.

One downside of this change is that we are reliant on wireguard-go
to call receiveDERP to process DERP messages. This is fine for now,
but is perhaps something we should fix in the future.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 34d4943357 all: gofmt -s
The code is not obviously better or worse, but this makes the little warning
triangle in my editor go away, and the distraction removal is worth it.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 1df162b05b wgengine/magicsock: adapt CreateEndpoint signature to match wireguard-go
Part of a temporary change to make merging wireguard-go easier.
See https://github.com/tailscale/wireguard-go/pull/45.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 36a85e1760 wgengine/magicsock: don't call t.Fatal in magicStack.IP
It can end up executing an a new goroutine,
at which point instead of immediately stopping test execution, it hangs.
Since this is unexpected anyway, panic instead.
As a bonus, it makes call sites nicer and removes a kludge comment.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
David Anderson 016de16b2e net/tstun: rename TUN to Wrapper.
The tstun packagen contains both constructors for generic tun
Devices, and a wrapper that provides additional functionality.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 588b70f468 net/tstun: merge in wgengine/tstun.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Brad Fitzpatrick 81143b6d9a ipn/ipnlocal: start of peerapi between nodes
Also some necessary refactoring of the ipn/ipnstate too.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder 28af46fb3b wgengine: pass logger as a separate arg to device.NewDevice
Adapt to minor API changes in wireguard-go.
And factor out device.DeviceOptions variables.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 4b77eca2de wgengine/magicsock: check returned error in addTestEndpoint
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick c99f260e40 wgengine/magicsock: prefer IPv6 transport if roughly equivalent latency
Fixes #1566

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 9643d8b34d wgengine/magicsock: add an addrLatency type to combine an IPPort+time.Duration
Updates #1566 (but no behavior changes as of this change)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 0994a9f7c4 wgengine{,/magicsock}: fix, improve "tailscale ping" to default routes and subnets
e.g.

$ tailscale ping 1.1.1.1
exit node found but not enabled

$ tailscale ping 10.2.200.2
node "tsbfvlan2" found, but not using its 10.2.200.0/24 route

$ sudo tailscale  up --accept-routes
$ tailscale ping 10.2.200.2
pong from tsbfvlan2 (100.124.196.94) via 10.2.200.34:41641 in 1ms

$ tailscale ping mon.ts.tailscale.com
pong from monitoring (100.88.178.64) via DERP(sfo) in 83ms
pong from monitoring (100.88.178.64) via DERP(sfo) in 21ms
pong from monitoring (100.88.178.64) via [2604:a880:4:d1::37:d001]:41641 in 22ms

This necessarily moves code up from magicsock to wgengine, so we can
look at the actual wireguard config.

Fixes #1564

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 7e0d12e7cc wgengine/magicsock: don't update control if only endpoint order changes
Updates #1559

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 32562a82a9 wgengine/magicsock: annotate a few more disco logs as verbose
Fixes #1540

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c19ed37b0f wgengine/magicsock: mark some legacy debug log output as verbose
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick ba8c6d0775 health, controlclient, ipn, magicsock: tell health package state of things
Not yet checking anything. Just plumbing states into the health package.

Updates #1505

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 44ab0acbdb net/portmapper, wgengine/monitor: cache gateway IP info until link changes
Cuts down allocs & CPU in steady state (on regular STUN probes) when network
is unchanging.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c81814e4f8 derp{,/derphttp},magicsock: tell DERP server when ping acks can be expected
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c576fea60e wgengine/magicsock: delete unused WhoIs method that was moved elsewhere
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick ef7bac2895 tailcfg, net/portmapper, wgengine/magicsock: add NetInfo.HavePortMap
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 79d8288f0a wgengine/magicsock, derp, derp/derphttp: respond to DERP server->client pings
No server support yet, but we want Tailscale 1.6 clients to be able to respond
to them when the server can do it.

Updates #1310

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 387e83c8fe wgengine/magicsock: fix Conn.Rebind race that let ErrClosed errors be read
There was a logical race where Conn.Rebind could acquire the
RebindingUDPConn mutex, close the connection, fail to rebind, release
the mutex, and then because the mutex was no longer held, ReceiveIPv4
wouldn't retry reads that failed with net.ErrClosed, letting that
error back to wireguard-go, which would then stop running that receive
IP goroutine.

Instead, keep the RebindingUDPConn mutex held for the entirety of the
replacement in all cases.

Updates tailscale/corp#1289

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c445e3d327 wgengine/magicsock: fix typo in comment 3 years ago
Brad Fitzpatrick a6d098c750 wgengine/magicsock: log when DERP connection succeeds
Updates #1310
3 years ago
Brad Fitzpatrick 829eb8363a net/interfaces: sort returned addresses from LocalAddresses
Also change the type to netaddr.IP while here, because it made sorting
easier.

Updates tailscale/corp#1397

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c3e5903b91 wgengine/magicsock: remove leftover portmapper debug logging
It's already logged at the right time in logEndpointChange.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick ea3715e3ce wgengine/magicsock: remove TODO about endpoints-over-DERP
It was done in Tailscale 1.4 with CallMeMaybe disco messages
containing endpoints.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson 2404c0ffad ipn/ipnlocal: only filter out default routes when computing the local wg config.
UIs need to see the full unedited netmap in order to know what exit nodes they
can offer to the user.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Brad Fitzpatrick e9e4f1063d wgengine/magicsock: fix discoEndpoint caching bug when a node key changes
Fixes #1391

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c64bd587ae net/portmapper: add NAT-PMP client, move port mapping service probing
* move probing out of netcheck into new net/portmapper package
* use PCP ANNOUNCE op codes for PCP discovery, rather than causing
  short-lived (sub-second) side effects with a 1-second-expiring map +
  delete.
* track when we heard things from the router so we can be less wasteful
  in querying the router's port mapping services in the future
* use portmapper from magicsock to map a public port

Fixes #1298
Fixes #1080
Fixes #1001
Updates #864

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder c7e5ab8094 wgengine/magicsock: retry and re-send packets in TestTwoDevicePing
When a handshake race occurs, a queued data packet can get lost.
TestTwoDevicePing expected that the very first data packet would arrive.
This caused occasional flakes.

Change TestTwoDevicePing to repeatedly re-send packets
and succeed when one of them makes it through.

This is acceptable (vs making WireGuard not drop the packets)
because this only affects communication with extremely old clients.
And those extremely old clients will eventually connect,
because the kernel will retry sends on timeout.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 1632f9fd6b wgengine/magicsock: reduce log spam during tests
Only do the type assertion to *net.UDPAddr when addr is non-nil.
This prevents a bunch of log spam during tests.
3 years ago
Josh Bleecher Snyder 88586ec4a4 wgengine/magicsock: remove an alloc from ReceiveIPvN
We modified the standard net package to not allocate a *net.UDPAddr
during a call to (*net.UDPConn).ReadFromUDP if the caller's use
of the *net.UDPAddr does not cause it to escape.
That is https://golang.org/cl/291390.

This is the companion change to magicsock.
There are two changes required.
First, call ReadFromUDP instead of ReadFrom, if possible.
ReadFrom returns a net.Addr, which is an interface, which always allocates.
Second, reduce the lifetime of the returned *net.UDPAddr.
We do this by immediately converting it into a netaddr.IPPort.

We left the existing RebindingUDPConn.ReadFrom method in place,
as it is required to satisfy the net.PacketConn interface.

With the upstream change and both of these fixes in place,
we have removed one large allocation per packet received.

name           old time/op    new time/op    delta
ReceiveFrom-8    16.7µs ± 5%    16.4µs ± 8%     ~     (p=0.310 n=5+5)

name           old alloc/op   new alloc/op   delta
ReceiveFrom-8      112B ± 0%       64B ± 0%  -42.86%  (p=0.008 n=5+5)

name           old allocs/op  new allocs/op  delta
ReceiveFrom-8      3.00 ± 0%      2.00 ± 0%  -33.33%  (p=0.008 n=5+5)

Co-authored-by: Sonia Appasamy <sonia@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 0c673c1344 wgengine/magicsock: unify on netaddr types in addrSet
addrSet maintained duplicate lists of netaddr.IPPorts and net.UDPAddrs.
Unify to use the netaddr type only.

This makes (*Conn).ReceiveIPvN a bit uglier,
but that'll be cleaned up in a subsequent commit.

This is preparatory work to remove an allocation from ReceiveIPv4.

Co-authored-by: Sonia Appasamy <sonia@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 4cd9218351 wgengine/magicsock: prevent logging while running benchmarks
Co-authored-by: Sonia Appasamy <sonia@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 635e4c7435 wgengine/magicsock: increase legacy ping timeout again
I based my estimation of the required timeout based on locally
observed behavior. But CI machines are worse than my local machine.
16s was enough to reduce flakiness but not eliminate it. Bump it up again.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 7e201806b1 wgengine/magicsock: reconnect to DERP home after network comes back up
Updates #1310
3 years ago
Brad Fitzpatrick 9b4e50cec0 wgengine/magicsock: fix typo in comment 3 years ago
Brad Fitzpatrick 6b365b0239 wgengine/magicsock: fix DERP reader hang regression during concurrent reads
Fixes #1282

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder e1f773ebba wgengine/magicsock: allow more time for pings to transit
We removed the "fast retry" code from our wireguard-go fork.
As a result, pings can take longer to transit when retries are required. 
Allow that.

Fixes #1277

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 6d2b8df06d wgengine/magicsock: add disabled failing (deadlocking) test for #1282
The fix can make this test run unconditionally.

This moves code from 5c619882bc for
testability but doesn't fix it yet. The #1282 problem remains (when I
wrote its wake-up mechanism, I forgot there were N DERP readers
funneling into 1 UDP reader, and the code just isn't correct at all
for that case).

Also factor out some test helper code from BenchmarkReceiveFrom.

The refactoring in magicsock.go for testability should have no
behavior change.
3 years ago
Brad Fitzpatrick 1e7a35b225 types/netmap: split controlclient.NetworkMap off into its own leaf package
Updates #1278

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 6064b6ff47 wgengine/wgcfg/nmcfg: split control/controlclient/netmap.go into own package
It couldn't move to ipnlocal due to test dependency cycles.

Updates #1278

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson ace57d7627 wgengine/magicsock: set a dummy private key in benchmark.
Magicsock started dropping all traffic internally when Tailscale is
shut down, to avoid spurious wireguard logspam. This made the benchmark
not receive anything. Setting a dummy private key is sufficient to get
magicsock to pass traffic for benchmarking purposes.

Fixes #1270.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Brad Fitzpatrick f7eed25bb9 wgengine/magicsock: filter disco packets and packets when stopped from wireguard
Fixes #1167
Fixes tailscale/corp#219

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder e8cd7bb66f tstest: simplify goroutine leak tests
Use tb.Cleanup to simplify both the API and the implementation.

One behavior change: When the number of goroutines shrinks, don't log.
I've never found these logs to be useful, and they frequently add noise.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder dd10babaed wgenginer/magicsock: remove Addrs methods
They are now unused.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder fe7c3e9c17 all: move wgcfg from wireguard-go
This is mostly code movement from the wireguard-go repo.

Most of the new wgcfg package corresponds to the wireguard-go wgcfg package.

wgengine/wgcfg/device{_test}.go was device/config{_test}.go.
There were substantive but simple changes to device_test.go to remove
internal package device references.

The API of device.Config (now wgcfg.DeviceConfig) grew an error return;
we previously logged the error and threw it away.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder d5baeeed5c wgengine: use Tailscale-style peer identifiers in logs
Rewrite log lines on the fly, based on the set of known peers.

This enables us to use upstream wireguard-go logging,
but maintain the Tailscale-style peer public key identifiers
that the rest of our systems (and people) expect.

Fixes #1183

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 9541886856 wgengine/magicsock: disable regular STUNs for all platforms by default
Reduces background CPU & network.

Updates #1034

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c55d26967b wgengine/magicsock: log more details of endpoints learned over disco
Also, don't try to use IPv6 LinkLocalUnicast addresses for now. Like endpoints
exchanged with control, we share them but don't yet use them.

Updates #1172
3 years ago
Brad Fitzpatrick 359055d3fa wgengine/magicsock: fix logging regression
c8c493f3d9 made it always say
`created=false` which scared me when I saw it, as that would've implied
things were broken much worse. Fortunately the logging was just wrong.
3 years ago
Brad Fitzpatrick edf64e0901 wgengine/magicsock: send, use endpoints in CallMeMaybe messages
Fixes #1172

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick b5b4992eff disco: support parsing/encoding endpoints in call-me-maybe frames
Updates #1172

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder d3dd7c6270 wgengine/magicsock: make legacy DstToString match Addrs
DstToString is used in two places in wireguard-go: Logging and uapi.

We are switching to use uapi for wireguard-go config.
To preserve existing behavior, we need the full set of addrs.

And for logging, having the full set of addrs seems useful.

(The Addrs method itself is slated for removal. When that happens,
the implementation will move to DstToString.)


Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 187e22a756 wgengine/magicsock: don't run the DERP cleanup so often
To save CPU and wakeups, don't run the DERP cleanup timer regularly
unless there is a non-home DERP connection open.

Also eliminates the goroutine, moving to a time.AfterFunc.

Updates #1034

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder 5fe5402fcd Revert "wgengine/magicsock: shortcircuit discoEndpoint.heartbeat when its connection is closed"
This reverts commit 08baa17d9a.
It caused deadlocks due to lock ordering violations.
It was not the right fix, and thus should simply be reverted
while we look for the right fix (if we haven't already found it
in the interim; we've fixed other logging-after-test issues).

Fixes #1161
3 years ago
Josh Bleecher Snyder e4c075cd95 wgengine/magicsock: prevent log-after-test in TestTwoDevicePing 3 years ago
Brad Fitzpatrick edce91a8a6 wgengine/magicsock: fix a naked return bug/crash where we returned (nil, true)
The 'ok' from 'ipp, ok :=' above was the result parameter ok. Whoops.
3 years ago
Brad Fitzpatrick 51bd1feae4 wgengine/magicsock: add single element IPPort->endpoint cache in receive path
name           old time/op  new time/op  delta
ReceiveFrom-4  21.8µs ± 2%  20.9µs ± 2%  -4.27%  (p=0.000 n=10+10)

Updates #414

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 5c619882bc wgengine/magicsock: simplify ReceiveIPv4+DERP path
name           old time/op  new time/op  delta
ReceiveFrom-4  35.8µs ± 3%  21.9µs ± 5%  -38.92%  (p=0.008 n=5+5)

Fixes #1145
Updates #414

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 3fa86a8b23 wgengine/magicsock: use relatively new netaddr.IPPort.IsZero method 3 years ago
Brad Fitzpatrick 4811236189 wgengine/magicsock: speed up BenchmarkReceiveFrom, store context.Done chan
context.cancelCtx.Done involves a mutex and isn't as cheap as I
previously assumed. Convert the donec method into a struct field and
store the channel value once. Our one magicsock.Conn gets one pointer
larger, but it cuts ~1% of the CPU time of the ReceiveFrom benchmark
and removes a bubble from the --svg output :)
3 years ago
Josh Bleecher Snyder ed2169ae99 wgengine/magicsock: prevent logging after TestActiveDiscovery completes 3 years ago
Josh Bleecher Snyder 63af950d8c wgengine/magicsock: adapt to wireguard-go without UpdateDst
22507adf54 stopped relying on
our fork of wireguard-go's UpdateDst callback.
As a result, we can unwind that code,
and the extra return value of ReceiveIPv{4,6}.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Denton Gentry 23c2dc2165
magicksock: remove TestConnClosing. (#1140)
Test is flakey, remove it and figure out what to do differently later.

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
3 years ago
David Anderson e23b4191c4 wgengine/magicsock: disable legacy networking everywhere except TwoDevicePing.
TwoDevicePing is explicitly testing the behavior of the legacy codepath, everything
else is happy to assume that code no longer exists.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 0733c5d2e0 wgengine/magicsock: disable legacy behavior in a few more tests.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 57d95dd005 wgengine/magicsock: default legacy networking to off for some tests.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson a2463e8948 wgengine/magicsock: add an option to disable legacy peer handling.
Used in tests to ensure we're not relying on behavior we're going
to remove eventually.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson d456bfdc6d wgengine/magicsock: fix BenchmarkReceiveFrom.
Previously, this benchmark relied on behavior of the legacy
receive codepath, which I changed in 22507adf. With this
change, the benchmark instead relies on the new active discovery
path.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Josh Bleecher Snyder 2d837f79dc wgengine/magicsock: close test loggers once we're done with them
This is a big hammer approach to helping with #1132.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 08baa17d9a wgengine/magicsock: shortcircuit discoEndpoint.heartbeat when its connection is closed
This prevents us from continuing to do unnecessary work
(including logging) after the connection has closed.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 7c76435bf7 wgengine/magicsock: simplify
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder d2529affa2 wgengine/magicsock: quiet wireguard-go logging in tests
We already do this in newUserspaceEngineAdvanced.
Apply it to newMagicStack as well.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 654b5f1570 all: convert from []wgcfg.Endpoint to string
This eliminates a dependency on wgcfg.Endpoint,
as part of the effort to eliminate our wireguard-go fork.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
David Anderson 9abcb18061 wgengine/magicsock: import more of wireguard-go, update docstrings.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 22507adf54 wgengine/magicsock: stop depending on UpdateDst in legacy codepaths.
This makes connectivity between ancient and new tailscale nodes slightly
worse in some cases, but only in cases where the ancient version would
likely have failed to get connectivity anyway.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Denton Gentry 8349e10907 magicsock: add description of testClosingContext
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
4 years ago
Denton Gentry 2e9728023b magicsock: test error case in sendDiscoMessage
In sendDiscoMessage there is a check of whether the connection is
closed, which is not being reliably exercised by other tests.
This shows up in code coverage reports, the lines of code in
sendDiscoMessage are alternately added and subtracted from
code coverage.

Add a test to specifically exercise and verify this code path.

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
4 years ago
Denton Gentry 0aa55bffce magicsock: test error case in derpWriteChanOfAddr
In derpWriteChanOfAddr when we call derphttp.NewRegionClient(),
there is a check of whether the connection is already errored and
if so it returns before grabbing the lock. The lock might already
be held and would be a deadlock.

This corner case is not being reliably exercised by other tests.
This shows up in code coverage reports, the lines of code in
derpWriteChanOfAddr are alternately added and subtracted from
code coverage.

Add a test to specifically exercise this code path, and verify that
it doesn't deadlock.

This is the best tradeoff I could come up with:
+ the moment code calls Err() to check if there is an error, we
  grab the lock to make sure it would deadlock if it tries to grab
  the lock itself.
+ if a new call to Err() is added in this code path, only the
  first one will be covered and the rest will not be tested.
+ this test doesn't verify whether code is checking for Err() in
  the right place, which ideally I guess it would.

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
4 years ago
Brad Fitzpatrick 85e54af0d7 wgengine: on TCP connect fail/timeout, log some clues about why it failed
So users can see why things aren't working.

A start. More diagnostics coming.

Updates #1094
4 years ago
Brad Fitzpatrick f85769b1ed wgengine/magicsock: drop netaddr.IPPort cache
netaddr.IP no longer allocates, so don't need a cache or all its associated
code/complexity.

This totally removes groupcache/lru from the deps.

Also go mod tidy.
4 years ago
Brad Fitzpatrick 5aa5db89d6 cmd/tailscaled, wgengine/netstack: add start of gvisor userspace netstack work
Not usefully functional yet (mostly a proof of concept), but getting
it submitted for some work @namansood is going to do atop this.

Updates #707
Updates #634
Updates #48
Updates #835
4 years ago
Brad Fitzpatrick 5efb0a8bca cmd/tailscale: change formatting of "tailscale status"
* show DNS name over hostname, removing domain's common MagicDNS suffix.
  only show hostname if there's no DNS name.
  but still show shared devices' MagicDNS FQDN.

* remove nerdy low-level details by default: endpoints, DERP relay,
  public key.  They're available in JSON mode still for those who need
  them.

* only show endpoint or DERP relay when it's active with the goal of
  making debugging easier. (so it's easier for users to understand
  what's happening) The asterisks are gone.

* remove Tx/Rx numbers by default for idle peers; only show them when
  there's traffic.

* include peers' owner login names

* add CLI option to not show peers (matching --self=true, --peers= also
  defaults to true)

* sort by DNS/host name, not public key

* reorder columns
4 years ago