Commit Graph

433 Commits (7a243ae5b1968727e3f06ffd15c5c40ad8ab0026)

Author SHA1 Message Date
Brad Fitzpatrick 7a243ae5b1 wgengine/magicsock: finish TODO to speed up peerMap.forEachEndpointWithDiscoKey
Now that peerMap tracks the set of nodes for a DiscoKey.

Updates #3088

Change-Id: I927bf2bdfd2b8126475f6b6acc44bc799fcb489f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 11fdb14c53 wgengine/magicsock: don't check always-non-nil endpoint for nil-ness
Continuation of 2aa5df7ac1, remove nil
check because it can never be nil. (It previously was able to be nil.)

Change-Id: I59cd9ad611dbdcbfba680ed9b22e841b00c9d5e6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson e7eb46bced wgengine/magicsock: add an explicit else branch to peerMap update.
Clarifies that the replace+delete of peerinfo data is only when peerInfo
already exists.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 2aa5df7ac1 wgengine/magicsock: document and enforce that peerInfo.ep is non-nil.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 521b44e653 wgengine/magicsock: move discoKey fields to the mutex-protected section.
Fixes #3106

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Brad Fitzpatrick a6d02dc122 wgengine/magicsock: track which NodeKey each DiscoKey was last for
This adds new fields (currently unused) to discoInfo to track what the
last verified (unambiguous) NodeKey a DiscoKey last mapped to, and
when.

Then on CallMeMaybe, Pong and on most Pings, we update the mapping
from DiscoKey to the current NodeKey for that DiscoKey.

Updates #3088

Change-Id: Idc4261972084dec71cf8ec7f9861fb9178eb0a4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c759fcc7d3 wgengine/magicsock: fix data race with sync.Pool in error+logging path
Fixes #3122

Change-Id: Ib52e84f9bd5813d6cf2e80ce5b2296912a48e064
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 75a7779b42 disco, wgengine/magicsock: send self node key in disco pings
This lets clients quickly (sub-millisecond within a local LAN) map
from an ambiguous disco key to a node key without waiting for a
CallMeMaybe (over relatively high latency DERP).

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Denton Gentry def650b3e8 wgengine/magicsock: don't Rebind after STUN error if closed.
https://github.com/tailscale/tailscale/pull/3014 added a
rebind on STUN failure, which means there can now be a
tailscale.com/wgengine/magicsock.(*RebindingUDPConn).ReadFromNetaddr
in progress at the end of the test waiting for a STUN
response which will never arrive.

This causes a test flake due to the resource leak in those
cases where the Conn decided to rebind. For whatever reason,
it mostly flakes with Windows.

If the Conn is closed, don't Rebind after a send error.

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
3 years ago
Brad Fitzpatrick f55c2bccf5 wgengine/magicsock: don't call setAddrToDiscoLocked on DERP ping
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 569f70abfd wgengine/magicsock: finish some renamings of discoEndpoint to endpoint
Renames only; continuation of earlier 8049063d35

These kept confusing me while working on #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 695df497ba wgengine/magicsock: delete peerMap.endpointForDiscoKey, remove remaining caller
The one remaining caller of peerMap.endpointForDiscoKey was making the
improper assumption that there's exactly 1 node with a given DiscoKey
in the network. That was the cause of #3088.

Now that all the other callers have been updated to not use
endpointForDiscoKey, there's no need to try to keep maintaining that
prone-to-misuse index.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 04fd94acd6 wgengine/magicsock: remove endpointForDiscoKey call from handleDiscoMessage
A DiscoKey maps 1:n to endpoints. When we get a disco pong, we don't
necessarily know which endpoint sent it to us. Ask them all. There
will only usually be 1 (and in rare circumstances 2). So it's easier
to ask all two rather than building new maps from the random ping TxID
to its endpoint.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 151b4415ca wgengine/magicsock: remove endpoint parameter from handlePingLocked
We can reply to a ping without knowing which exact node it's from.  As
long as it's in our netmap, it's safe to reply. If there's more than
one node with that discokey, it doesn't matter who we're relpying to.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick d86081f353 wgengine/magicsock: add new discoInfo type for DiscoKey state, move some fields
As more prep for removing the false assumption that you're able to
map from DiscoKey to a single peer, move the lastPingFrom and lastPingTime
fields from the endpoint type to a new discoInfo type, effectively upgrading
the old sharedDiscoKey map (which only held a *[32]byte nacl precomputed key
as its value) to discoInfo which then includes that naclbox key.

Then start plumbing it into handlePing in prep for removing the need
for handlePing to take an endpoint parameter.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick e5779f019e wgengine/magicsock: move temporary endpoint lookup later, add TODO to remove
Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 36a07089ee wgengine/magicsock: remove redundant/wrong sharedDiscoKey delete
The pass just after in this method handles cleaning up sharedDiscoKey.
No need to do it wrong (assuming DiscoKey => 1 node) earlier.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 3e80806804 wgengine/magicsock: pass src NodeKey to handleDiscoMessage for DERP disco msgs
And then use it to avoid another lookup-by-DiscoKey.

Updates #3088
3 years ago
Brad Fitzpatrick 82fa15fa3b wgengine/magicsock: start removing endpointForDiscoKey
It's not valid to assume that a discokey is globally unique.

This removes the first two of the four callers.

Updates #3088

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Avery Pennarun 0d4a0bf60e magicsock: if STUN failed to send before, rebind before STUNning again.
On iOS (and possibly other platforms), sometimes our UDP socket would
get stuck in a state where it was bound to an invalid interface (or no
interface) after a network reconfiguration. We can detect this by
actually checking the error codes from sending our STUN packets.

If we completely fail to send any STUN packets, we know something is
very broken. So on the next STUN attempt, let's rebind the UDP socket
to try to correct any problems.

This fixes a problem where iOS would sometimes get stuck using DERP
instead of direct connections until the backend was restarted.

Fixes #2994

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
3 years ago
David Anderson 830f641c6b wgengine/magicsock: update discokeys on netmap change.
Fixes #3008.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Josh Bleecher Snyder a722e48cef wgengine/magicsock: skip alloc test with -race
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 31c1331415 wgengine/magicsock: deflake TestReceiveFromAllocs
100 iterations isn't enough with background allocs happening
apparently. 1000 seems to be reliable.

Fixes #2826

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 2238814b99 wgengine/magicsock: fix crash introduced in recent cleanups
Fixes #2801

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 640134421e all: update tests to use tstest.MemLogger
And give MemLogger a mutex, as one caller had, which does match the logf
contract better.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson efe8020dfa wgengine/magicsock: fix race condition in tests.
AFAICT this was always present, the log read mid-execution was never safe.
But it seems like the recent magicsock refactoring made the race much
more likely.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Brad Fitzpatrick 5bacbf3744 wgengine/magicsock, health, ipn/ipnstate: track DERP-advertised health
And add health check errors to ipnstate.Status (tailscale status --json).

Updates #2746
Updates #2775

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson bb10443edf wgengine/wgcfg: use just the hexlified node key as the WireGuard endpoint.
The node key is all magicsock needs to find the endpoint that WireGuard
needs.

Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson d00341360f wgengine/magicsock: remove unused debug knob.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson dfd978f0f2 wgengine/magicsock: use NodeKey, not DiscoKey, as the trigger for lazy reconfig.
Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 4c27e2fa22 wgengine/magicsock: remove Start method from Conn.
Over time, other magicsock refactors have made Start effectively a
no-op, except that some other functions choose to panic if called
before Start.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 1a899344bd wgengine/magicsock: don't store tailcfg.Nodes alongside endpoints.
Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson b2181608b5 wgengine/magicsock: eagerly create endpoints in SetNetworkMap.
Updates #2752

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Emmanuel T Odeke 0daa32943e all: add (*testing.B).ReportAllocs() to every benchmark
This ensures that we can properly track and catch allocation
slippages that could otherwise have been missed.

Fixes #2748
3 years ago
David Anderson 44d71d1e42 wgengine/magicsock: fix race in test shutdown, again.
We were returning an error almost, but not quite like errConnClosed in
a single codepath, which could still trip the panic on reconfig in the
test logic.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson f09ede9243 wgengine/magicsock: don't configure eager WireGuard handshaking in tests.
Our prod code doesn't eagerly handshake, because our disco layer enables
on-demand handshaking. Configuring both peers to eagerly handshake leads
to WireGuard handshake races that make TestTwoDevicePing flaky.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 86d1c4eceb wgengine/magicsock: ignore close races even harder.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 8bacfe6a37 wgengine/magicsock: remove unused sendLogLimit limiter.
Magicsock these days gets its logs limited by the global log limiter.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson e151b74f93 wgengine/magicsock: remove opts.SimulatedNetwork.
It only existed to override one test-only behavior with a
different test-only behavior, in both cases working around
an annoying feature of our CI environments. Instead, handle
that weirdness entirely in the test code, with a tweaked
TestOnlyPacketListener that gets injected.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 58c1f7d51a wgengine/magicsock: rename opts.PacketListener to TestOnlyPacketListener.
The docstring said it was meant for use in tests, but it's specifically a
special codepath that is _only_ used in tests, so make the claim stronger.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 8049063d35 wgengine/magicsock: rename discoEndpoint to just endpoint.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson f2d949e2db wgengine/magicsock: fold findEndpoint into its only remaining caller.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson fe2f89deab wgengine/magicsock: fix rare shutdown race in test.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 97693f2e42 wgengine/magicsock: delete legacy AddrSet endpoints.
Instead of using the legacy codepath, teach discoEndpoint to handle
peers that have a home DERP, but no disco key. We can still communicate
with them, but only over DERP.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
slowy07 ac0353e982 fix: typo spelling grammar
Signed-off-by: slowy07 <slowy.arfy@gmail.com>
3 years ago
Brad Fitzpatrick 37053801bb wgengine/magicsock: restore a bit of logging on node becoming active
Fixes #2695

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 39610aeb09 wgengine/magicsock: move debug knobs to their own file, compile out on iOS
No need for these knobs on iOS where you can set the environment
variables anyway.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick f3c96df162 ipn/ipnstate: move tailscale status "active" determination to tailscaled
Fixes #2579

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick b622c60ed0 derp,wgengine/magicsock: don't assume stringer is in $PATH for go:generate
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder 8a3d52e882 wgengine/magicsock: use mono.Time
magicsock makes multiple calls to Now per packet.
Move to mono.Now. Changing some of the calls to
use package mono has a cascading effect,
causing non-per-packet call sites to also switch.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago