Commit Graph

396 Commits (8bacfe6a37c390337a23d5c94a820b671476c2c7)

Author SHA1 Message Date
Brad Fitzpatrick fa412c8760 wgengine/filter, wgengine/magicsock: use new IP.BitLen to simplify some code 4 years ago
David Anderson 9cee0bfa8c wgengine/magicsock: sprinkle more docstrings.
Magicsock is too damn big, but this might help me page it back
in faster next time.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick 7b92f8e718 wgengine/magicsock: add start of magicsock benchmarks (Conn.ReceiveIPv4 for now)
And only single-threaded for now. Will get fancier later.

Updates #414
4 years ago
Brad Fitzpatrick 713cbe84c1 wgengine/magicsock: use net.JoinHostPort when host might have colons (udp6)
Only affected tests. (where it just generated log spam)
4 years ago
Brad Fitzpatrick 450cfedeba wgengine/magicsock: quiet an IPv6 warning in tests
In tests, we force binding to localhost to avoid OS firewall warning
dialogs.

But for IPv6, we were trying (and failing) to bind to 127.0.0.1.

You'd think we'd just say "localhost", but that's apparently ill
defined. See
https://tools.ietf.org/html/draft-ietf-dnsop-let-localhost-be-localhost
and golang/go#22826. (It's bitten me in the past, but I can't
remember specific bugs.)

So use "::1" explicitly for "udp6", which makes the test quieter.
4 years ago
David Anderson 7a54910990 wgengine/filter: remove helper vars, mark NewAllowAll test-only.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson b3634f020d wgengine/filter: use netaddr types in public API.
We still use the packet.* alloc-free types in the data path, but
the compilation from netaddr to packet happens within the filter
package.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick fd2a30cd32 wgengine/magicsock: make test pass on Windows and without firewall dialog box
Updates #50
4 years ago
Brad Fitzpatrick ac866054c7 wgengine/magicsock: add a backoff on DERP reconnects
Fixes #808
4 years ago
Brad Fitzpatrick 105a820622 wgengine/magicsock: skip an endpoint update at start-up
At startup the client doesn't yet have the DERP map so can't do STUN
queries against DERP servers, so it only knows it local interface
addresses, not its STUN-mapped addresses.

We were reporting the interface-local addresses to control, getting
the DERP map, and then immediately reporting the full set of
updates. That was an extra HTTP request to control, but worse: it was
an extra broadcast from control out to all the peers in the network.

Now, skip the initial update if there are no stun results and we don't
have a DERP map.

More work remains optimizing start-up requests/map updates, but this
is a start.

Updates tailscale/corp#557
4 years ago
Brad Fitzpatrick 2076a50862 wgengine/magicsock: finish a comment sentence that ended prematurely 4 years ago
Brad Fitzpatrick 3e4c46259d wgengine/magicsock: don't do netchecks either when network is down
A continuation of 6ee219a25d

Updates #640
4 years ago
Brad Fitzpatrick 6ee219a25d ipn, wgengine, magicsock, tsdns: be quieter and less aggressive when offline
If no interfaces are up, calm down and stop spamming so much. It was
noticed as especially bad on Windows, but probably was bad
everywhere. I just have the best network conditions testing on a
Windows VM.

Updates #604
4 years ago
Brad Fitzpatrick c86761cfd1 Remove tuntap references. We only use TUN. 4 years ago
Christina Wen 48fbe93e72
wgengine/magicsock: clarify pre-disco 'tailscale ping' error message
This change clarifies the error message when a user pings a peer that is using an outdated version of Tailscale.
4 years ago
Josh Bleecher Snyder a5d701095b wgengine/magicsock: increase test timeout to reduce flakiness
Updates #654. See that issue for a discussion of why
this timeout reduces flakiness, and what next steps are.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
Josh Bleecher Snyder 0c0239242c wgengine/magicsock: make discoPingPurpose a stringer
It was useful for debugging once, it'll probably be useful again.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
Josh Bleecher Snyder 6e38d29485 wgengine/magicsock: improve test logging output
This fixes line numbers and reduces timestamp
precision to overwhelming the output.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
Josh Bleecher Snyder 57e642648f wgengine/magicsock: fix typo in comment 4 years ago
Brad Fitzpatrick 756d6a72bd wgengine: lazily create peer wireguard configs more explicitly
Rather than consider bigs jumps in last-received-from activity as a
signal to possibly reconfigure the set of wireguard peers to have
configured, instead just track the set of peers that are currently
excluded from the configuration. Easier to reason about.

Also adds a bit more logging.

This might fix an error we saw on a machine running a recent unstable
build:

2020-08-26 17:54:11.528033751 +0000 UTC: 8.6M/92.6M magicsock: [unexpected] lazy endpoint not created for [UcppE], d:42a770f678357249
2020-08-26 17:54:13.691305296 +0000 UTC: 8.7M/92.6M magicsock: DERP packet received from idle peer [UcppE]; created=false
2020-08-26 17:54:13.691383687 +0000 UTC: 8.7M/92.6M magicsock: DERP packet from unknown key: [UcppE]

If it does happen again, though, we'll have more logs.
4 years ago
halulu f27a57911b
cmd/tailscale: add derp and endpoints status (#703)
cmd/tailscale: add local node's information to status output (by default)

RELNOTE=yes

Updates #477

Signed-off-by: Halulu <lzjluzijie@gmail.com>
4 years ago
David Crawshaw dd2c61a519 magicsock: call RequestStatus when DERP connects
Second attempt.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
4 years ago
David Crawshaw a67b174da1 Revert "magicsock: call RequestStatus when DERP connects"
Seems to break linux CI builder. Cannot reproduce locally,
so attempting a rollback.

This reverts commit cd7bc02ab1.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
4 years ago
David Crawshaw cd7bc02ab1 magicsock: call RequestStatus when DERP connects
Without this, a freshly started ipn client will be stuck in the
"Starting" state until something triggers a call to RequestStatus.
Usually a UI does this, but until then we can sit in this state
until poked by an external event, as is evidenced by our e2e tests
locking up when DERP is attached.

(This only recently became a problem when we enabled lazy handshaking
everywhere, otherwise the wireugard tunnel creation would also
trigger a RequestStatus.)

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
4 years ago
Brad Fitzpatrick f6dc47efe4 tailcfg, controlclient, magicsock: add control feature flag to enable DRPO
Updates #150
4 years ago
Brad Fitzpatrick 85c3d17b3c wgengine/magicsock: use disco ping src as a candidate endpoint
Consider:

   Hard NAT (A) <---> Hard NAT w/ mapped port (B)

If A sends a packet to B's mapped port, A can disco ping B directly,
with low latency, without DERP.

But B couldn't establish a path back to A and needed to use DERP,
despite already logging about A's endpoint and adding a mapping to it
for other purposes (the wireguard conn.Endpoint lookup also needed
it).

This adds the tracking to discoEndpoint too so it'll be used for
finding a path back.

Fixes tailscale/corp#556

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 0512fd89a1 wgengine/magicsock: simplify handlePingLocked
It's no longer true that 'de may be nil'
4 years ago
Brad Fitzpatrick 84dc891843 cmd/tailscale/cli: add ping subcommand
For example:

$ tailscale ping -h
USAGE
  ping <hostname-or-IP>

FLAGS
  -c 10                   max number of pings to send
  -stop-once-direct true  stop once a direct path is established
  -verbose false          verbose output

$ tailscale ping mon.ts.tailscale.com
pong from monitoring (100.88.178.64) via DERP(sfo) in 65ms
pong from monitoring (100.88.178.64) via DERP(sfo) in 252ms
pong from monitoring (100.88.178.64) via [2604:a880:2:d1::36:d001]:41641 in 33ms

Fixes #661

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 9a346fd8b4 wgengine,magicsock: fix two lazy wireguard config issues
1) we weren't waking up a discoEndpoint that once existed and
   went idle for 5 minutes and then got a disco message again.

2) userspaceEngine.noteReceiveActivity had a buggy check; fixed
   and added a test
4 years ago
Brad Fitzpatrick 41c4560592 control/controlclient: remove unused NetworkMap.UAPI method
And remove last remaining use of wgcfg.ToUAPI in a test's debug
output; replace it with JSON.
4 years ago
Brad Fitzpatrick cff737786e wgengine/magicsock: fix lazy config deadlock, document more lock ordering
This removes the atomic bool that tried to track whether we needed to acquire
the lock on a future recursive call back into magicsock. Unfortunately that
hack doesn't work because we also had a lock ordering issue between magicsock
and userspaceEngine (see issue). This documents that too.

Fixes #644
4 years ago
Brad Fitzpatrick 2bd9ad4b40 wgengine: fix deadlock between engine and magicsock 4 years ago
Brad Fitzpatrick 7c38db0c97 wgengine/magicsock: don't deadlock on pre-disco Endpoints w/ lazy wireguard configs
Fixes tailscale/tailscale#637
4 years ago
Brad Fitzpatrick 4987a7d46c wgengine/magicsock: when hard NAT, add stun-ipv4:static-port as candidate
If a node is behind a hard NAT and is using an explicit local port
number, assume they might've mapped a port and add their public IPv4
address with the local tailscaled's port number as a candidate endpoint.
4 years ago
Brad Fitzpatrick bfcb0aa0be wgengine/magicsock: deflake tests, Close deadlock again
Better fix than 37903a9056

Fixes tailscale/corp#533
4 years ago
Dmytro Shynkevych 28e52a0492
all: dns refactor, add Proxied and PerDomain flags from control (#615)
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Brad Fitzpatrick cb970539a6 wgengine/magicsock: remove TODO comment that's no longer applicable 4 years ago
Brad Fitzpatrick 915f65ddae wgengine/magicsock: stop disco activity on IPN stop
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c180abd7cf wgengine/magicsock: merge errClosed and errConnClosed 4 years ago
Brad Fitzpatrick d55fdd4669 wgengine/magicsock: update, flesh out a TODO 4 years ago
David Anderson f8e4c75f6b wgengine/magicsock: check slightly less aggressively for connectivity.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick 58b721f374 wgengine/magicsock: deflake some tests with an ugly hack
Starting with fe68841dc7, some e2e tests
got flaky. Rather than debug them (they're gnarly), just revert to the old
behavior as far as those tests are concerned. The tests were somehow
using magicsock without a private key and expecting it to do ... something.

My goal with fe68841dc7 was to stop log spam
and unnecessary work I saw on the iOS app when when stopping the app.

Instead, only stop doing that work on any transition from
once-had-a-private-key to no-longer-have-a-private-key. That fixes
what I wanted to fix while still making the mysterious e2e tests
happy.
4 years ago
David Anderson 41d0c81859 wgengine/magicsock: make disco subtest name more precise.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 9beea8b314 wgengine/magicsock: remove unnecessary use of context.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson b62341d308 wgengine/magicsock: add docstring.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 9265296b33 wgengine/magicsock: don't deadlock on shutdown if sending blocks.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 0249236cc0 ipn/ipnstate: record assigned Tailscale IPs.
wgengine/magicsock: use ipnstate to find assigned Tailscale IPs.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson c3958898f1 tstest/natlab: be a bit more lenient during test shutdown.
There is a race in natlab where we might start shutdown while natlab is still running
a goroutine or two to deliver packets. This adds a small grace period to try and receive
it before continuing shutdown.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 7578c815be wgengine/magicsock: give pinger a more generous packet timeout.
The first packet to transit may take several seconds to do so, because
setup rates in wgengine may result in the initial WireGuard handshake
init to get dropped. So, we have to wait at least long enough for a
retransmit to correct the fault.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson c3994fd77c derp: remove OnlyDisco option.
Active discovery lets us introspect the state of the network stack precisely
enough that it's unnecessary, and dropping the initial DERP packets greatly
slows down tests. Additionally, it's unrealistic since our production network
will never deliver _only_ discovery packets, it'll be all or nothing.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 5455c64f1d wgengine/magicsock: add a test for two facing endpoint-independent NATs.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson f794493b4f wgengine/magicsock: explicitly check path discovery, add a firewall test.
The test proves that active discovery can traverse two facing firewalls.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson f582eeabd1 wgengine/magicsock: add a test for active path discovery.
Uses natlab only, because the point of this active discovery test is going to be
that it should get through a lot of obstacles.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick 37903a9056 wgengine/magicsock: fix occasional deadlock on Conn.Close on c.derpStarted
The deadlock was:

* Conn.Close was called, which acquired c.mu
* Then this goroutine scheduled:

    if firstDerp {
        startGate = c.derpStarted
        go func() {
            dc.Connect(ctx)
            close(c.derpStarted)
        }()
    }

* The getRegion hook for that derphttp.Client then ran, which also
  tries to acquire c.mu.

This change makes that hook first see if we're already in a closing
state and then it can pretend that region doesn't exist.
4 years ago
Brad Fitzpatrick fe68841dc7 wgengine/magicsock: log better with less spam on transition to stopped state
Required a minor test update too, which now needs a private key to get far
enough to test the thing being tested.
4 years ago
Brad Fitzpatrick e298327ba8 wgengine/magicsock: remove overkill, slow reflect.DeepEqual of NetworkMap
No need to allocate or compare all the fields we don't care about.
4 years ago
David Anderson 3669296cef wgengine/magicsock: refactor twoDevicePing to make stack construction cleaner.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick 16a9cfe2f4 wgengine: configure wireguard peers lazily, as needed
wireguard-go uses 3 goroutines per peer (with reasonably large stacks
& buffers).

Rather than tell wireguard-go about all our peers, only tell it about
peers we're actively communicating with. That means we need hooks into
magicsock's packet receiving path and tstun's packet sending path to
lazily create a wireguard peer on demand from the network map.

This frees up lots of memory for iOS (where we have almost nothing
left for larger domains with many users).

We should ideally do this in wireguard-go itself one day, but that'd
be a pretty big change.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 5066b824a6 wgengine/magicsock: don't log about disco ping timeouts if we have a working address
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c06d2a8513 wgengine/magicsock: fix typo in comment 4 years ago
Brad Fitzpatrick bf195cd3d8 wgengine/magicsock: reduce log verbosity of discovery messages
Don't log heartbeat pings & pongs. Track the reason for pings and then
only log the ping/pong traffic if it was for initial path discovery.
4 years ago
Brad Fitzpatrick a6559a8924 wgengine/magicsock: run test DERP in mode where only disco packets allowed
So we don't accidentally pass a NAT traversal test by having DERP pick up our slack
when we really just wanted DERP as an OOB messaging channel.
4 years ago
Brad Fitzpatrick 10ac066013 all: fix vet warnings 4 years ago
Brad Fitzpatrick d74c9aa95b wgengine/magicsock: update comment, fix earlier commit
891898525c had a continue that meant the didCopy synchronization never ran.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c976264bd1 wgengine/magicsock: gofmt 4 years ago
Dmytro Shynkevych f3e2b65637
wgengine/magicsock: time.Sleep -> time.After
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Dmytro Shynkevych 380ee76d00
wgengine/magicsock: make time.Sleep in runDerpReader respect cancellation.
Before this patch, the 250ms sleep would not be interrupted by context cancellation,
which would result in the goroutine sometimes lingering in tests (100ms grace period).

Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Dmytro Shynkevych 891898525c
wgengine/magicsock: make receive from didCopy respect cancellation.
Very rarely, cancellation occurs between a successful send on derpRecvCh
and a call to copyBuf on the receiving side.
Without this patch, this situation results in <-copyBuf blocking indefinitely.

Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Brad Fitzpatrick a2267aae99 wgengine: only launch pingers for peers predating the discovery protocol
Peers advertising a discovery key know how to speak the discovery
protocol and do their own heartbeats to get through NATs and keep NATs
open. No need for the pinger except for with legacy peers.
4 years ago
David Anderson 45578b47f3 tstest/natlab: refactor PacketHandler into a larger interface.
The new interface lets implementors more precisely distinguish
local traffic from forwarded traffic, and applies different
forwarding logic within Machines for each type. This allows
Machines to be packet forwarders, which didn't quite work
with the implementation of Inject.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Dmytro Shynkevych 2f15894a10 wgengine/magicsock: wait for derphttp client goroutine to exit
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
David Anderson 88e8456e9b wgengine/magicsock: add a connectivity test for facing firewalls.
The test demonstrates that magicsock can traverse two stateful
firewalls facing each other, that each require localhost to
initiate connections.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 1f7b1a4c6c wgengine/magicsock: rearrange TwoDevicePing test for future natlab tests.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 977381f9cc wgengine/magicsock: make trivial natlab test pass.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick 6c74065053 wgengine/magicsock, tstest/natlab: start hooking up natlab to magicsock
Also adds ephemeral port support to natlab.

Work in progress.

Pairing with @danderson.
4 years ago
Brad Fitzpatrick bd59bba8e6 wgengine/magicsock: stop discoEndpoint timers on Close
And add some defensive early returns on c.closed.
4 years ago
Brad Fitzpatrick de875a4d87 wgengine/magicsock: remove DisableSTUNForTesting 4 years ago
Brad Fitzpatrick 5c6d8e3053 netcheck, tailcfg, interfaces, magicsock: survey UPnP, NAT-PMP, PCP
Don't do anything with UPnP, NAT-PMP, PCP yet, but see how common they
are in the wild.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 6196b7e658 wgengine/magicsock: change API to not permit disco key changes
Generate the disco key ourselves and give out the public half instead.

Fixes #525
4 years ago
Brad Fitzpatrick 5132edacf7 wgengine/magicsock: fix data race from undocumented wireguard-go requirement
Endpoints need to be Stringers apparently.

Fixes tailscale/corp#422
4 years ago
Brad Fitzpatrick 630379a1d0 cmd/tailscale: add tailscale status region name, last write, consistently star
There's a lot of confusion around what tailscale status shows, so make it better:
show region names, last write time, and put stars around DERP too if active.

Now stars are always present if activity, and always somewhere.
4 years ago
Brad Fitzpatrick 9a8700b02a wgengine/magicsock: add discoEndpoint heartbeat
Updates #483
4 years ago
Brad Fitzpatrick 9f930ef2bf wgengine/magicsock: remove the discoEndpoint.timers map
It ended up being more complicated than it was worth.
4 years ago
Brad Fitzpatrick f5f3885b5b wgengine/magicsock: bunch of misc discovery path cleanups
* fix tailscale status for peers using discovery
* as part of that, pull out disco address selection into reusable
  and testable discoEndpoint.addrForSendLocked
* truncate ping/pong logged hex txids in half to eliminate noise
* move a bunch of random time constants into named constants
  with docs
* track a history of per-endpoint pong replies for future use &
  status display
* add "send" and " got" prefix to discovery message logging
  immediately before the frame type so it's easier to read than
  searching for the "<-" or "->" arrows earlier in the line; but keep
  those as the more reasily machine readable part for later.

Updates #483
4 years ago
Brad Fitzpatrick 6c70cf7222 wgengine/magicsock: stop ping timeout timer on pong receipt, misc log cleanup
Updates #483
4 years ago
Brad Fitzpatrick c52905abaa wgengine/magicsock: log less on no-op disco route switches
Also, renew trustBestAddrUntil even if latency isn't better.
4 years ago
Brad Fitzpatrick 0f0ed3dca0 wgengine/magicsock: clean up discovery logging
Updates #483
4 years ago
Brad Fitzpatrick 056fbee4ef wgengine/magicsock: add TS_DEBUG_OMIT_LOCAL_ADDRS knob to force STUN use only
For debugging.
4 years ago
Brad Fitzpatrick e03cc2ef57 wgengine/magicsock: populate discoOfAddr upon receiving ping frames
Updates #483
4 years ago
Brad Fitzpatrick 275a20f817 wgengine/magicsock: keep discoOfAddr populated, use it for findEndpoint
Update the mapping from ip:port to discokey, so when we retrieve a
packet from the network, we can find the same conn.Endpoint that we
gave to wireguard-go previously, without making it think we've
roamed. (We did, but we're not using its roaming.)

Updates #483
4 years ago
Brad Fitzpatrick 77e89c4a72 wgengine/magicsock: handle CallMeMaybe discovery mesages
Roughly feature complete now. Testing and polish remains.

Updates #483
4 years ago
Brad Fitzpatrick 710ee88e94 wgengine/magicsock: add timeout on discovery pings, clean up state
Updates #483
4 years ago
Brad Fitzpatrick 77d3ef36f4 wgengine/magicsock: hook up discovery messages, upgrade to LAN works
Ping messages now go out somewhat regularly, pong replies are sent,
and pong replies are now partially handled enough to upgrade off DERP
to LAN.

CallMeMaybe packets are sent & received over DERP, but aren't yet
handled. That's next (and regular maintenance timers), and then WAN
should work.

Updates #483
4 years ago
Brad Fitzpatrick 9b8ca219a1 wgengine/magicsock: remove allocs in UDP write, use new netaddr.PutUDPAddr
The allocs were only introduced yesterday with a TODO. Now they're gone again.
4 years ago
Brad Fitzpatrick 7b3c0bb7f6 wgengine/magicsock: fix crash reading DERP packet
Starting at yesterday's e96f22e560 (convering some UDPAddrs to
IPPorts), Conn.ReceiveIPv4 could return a nil addr, which would make
its way through wireguard-go and blow up later. The DERP read path
wasn't initializing the addr result parameter any more, and wgRecvAddr
wasn't checking it either.

Fixes #515
4 years ago
Brad Fitzpatrick 47b4a19786 wgengine/magicsock: use netaddr.ParseIPPort instead of net.ResolveUDPAddr 4 years ago
Brad Fitzpatrick f7124c7f06 wgengine/magicsock: start of discoEndpoint state tracking
Updates #483
4 years ago
Brad Fitzpatrick 92252b0988 wgengine/magicsock: add a little LRU cache for netaddr.IPPort lookups
And while plumbing, a bit of discovery work I'll need: the
endpointOfAddr map to map from validated paths to the discoEndpoint.
Not being populated yet.

Updates #483
4 years ago
Brad Fitzpatrick 2d6e84e19e net/netcheck, wgengine/magicsock: replace more UDPAddr with netaddr.IPPort 4 years ago
Brad Fitzpatrick 9070aacdee wgengine/magicsock: minor comments & logging & TODO changes 4 years ago
Brad Fitzpatrick e96f22e560 wgengine/magicsock: start handling disco message, use netaddr.IPPort more
Updates #483
4 years ago
Brad Fitzpatrick a83ca9e734 wgengine/magicsock: cache precomputed nacl/box shared keys
Updates #483
4 years ago
Brad Fitzpatrick a975e86bb8 wgengine/magicsock: add new endpoint type used for discovery-supporting peers
This adds a new magicsock endpoint type only used when both sides
support discovery (that is, are advertising a discovery
key). Otherwise the old code is used.

So far the new code only communicates over DERP as proof that the new
code paths are wired up. None of the actually discovery messaging is
implemented yet.

Support for discovery (generating and advertising a key) are still
behind an environment variable for now.

Updates #483
4 years ago
Brad Fitzpatrick 103c06cc68 wgengine/magicsock: open discovery naclbox messages from known peers
And track known peers.

Doesn't yet do anything with the messages. (nor does it send any yet)

Start of docs on the message format. More will come in subsequent changes.

Updates #483
4 years ago
Brad Fitzpatrick 23e74a0f7a wgengine, magicsock, tstun: don't regularly STUN when idle (mobile only for now)
If there's been 5 minutes of inactivity, stop doing STUN lookups. That
means NAT mappings will expire, but they can resume later when there's
activity again.

We'll do this for all platforms later.

Updates tailscale/corp#320

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick fe50cd0c48 ipn, wgengine: plumb NetworkMap down to magicsock
Now we can have magicsock make decisions based on tailcfg.Debug
settings sent by the server.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Dmytro Shynkevych de5f6d70a8 magicsock: eliminate logging race in test
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Brad Fitzpatrick 53fb25fc2f all: generate discovery key, plumb it around
Not actually used yet.

Updates #483
4 years ago
Brad Fitzpatrick abd79ea368 derp: reduce DERP memory use; don't require callers to pass in memory to use
The magicsock derpReader was holding onto 65KB for each DERP
connection forever, just in case.

Make the derp{,http}.Client be in charge of memory instead. It can
reuse its bufio.Reader buffer space.
4 years ago
Brad Fitzpatrick 280e8884dd wgengine/magicsock: limit redundant log spam on packets from low-pri addresses
Fixes #407

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 9e5d79e2f1 wgengine/magicsock: drop a bytes.Buffer sync.Pool, use logger.ArgWriter instead 5 years ago
Brad Fitzpatrick becce82246 net/netns, misc tests: remove TestOnlySkipPrivilegedOps, argv checks
The netns UID check is sufficient for now. We can do something else
later if/when needed.
5 years ago
David Anderson 5114df415e net/netns: set the bypass socket mark on linux.
This allows tailscaled's own traffic to bypass Tailscale-managed routes,
so that things like tailscale-provided default routes don't break
tailscaled itself.

Progress on #144.

Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
Brad Fitzpatrick db2a216561 wgengine/magicsock: don't log on UDP send errors if address family known missing
Fixes #376
5 years ago
Brad Fitzpatrick 9e3ad4f79f net/netns: add package for start of network namespace support
And plumb in netcheck STUN packets.

TODO: derphttp, logs, control.

Updates #144

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick a428656280 wgengine/magicsock: don't report v4 localhost addresses on IPv6-only systems
Updates #376
5 years ago
Avery Pennarun 30e5c19214 magicsock: work around race condition initializing .Regions[].
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
5 years ago
Brad Fitzpatrick b0c10fa610 stun, netcheck: move under net 5 years ago
Brad Fitzpatrick e6b84f2159 all: make client use server-provided DERP map, add DERP region support
Instead of hard-coding the DERP map (except for cmd/tailscale netcheck
for now), get it from the control server at runtime.

And make the DERP map support multiple nodes per region with clients
picking the first one that's available. (The server will balance the
order presented to clients for load balancing)

This deletes the stunner package, merging it into the netcheck package
instead, to minimize all the config hooks that would've been
required.

Also fix some test flakes & races.

Fixes #387 (Don't hard-code the DERP map)
Updates #388 (Add DERP region support)
Fixes #399 (wgengine: flaky tests)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Anderson e8b3a5e7a1 wgengine/filter: implement a destination IP pre-filter.
Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
Brad Fitzpatrick e6d0c92b1d wgengine/magicsock: clean up earlier fix a bit
Move WaitReady from fc88e34f42 into the
test code, and keep the derp-reading goroutine named for debugging.
5 years ago
Avery Pennarun fc88e34f42 wgengine/magicsock/tests: wait for home DERP connection before sending packets.
This fixes an elusive test flake. Fixes #161.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
5 years ago
Avery Pennarun 4f128745d8 magicsock/test: oops, fix a data race in nested-test logf hack.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
5 years ago
Avery Pennarun 42a0e0c601 wgengine/magicsock/tests: call tstest.ResourceCheck for each test.
This didn't catch anything yet, but it's good practice for detecting
goroutine leaks that we might not find otherwise.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
5 years ago
Avery Pennarun 08acb502e5 Add tstest.PanicOnLog(), and fix various problems detected by this.
If a test calls log.Printf, 'go test' horrifyingly rearranges the
output to no longer be in chronological order, which makes debugging
virtually impossible. Let's stop that from happening by making
log.Printf panic if called from any module, no matter how deep, during
tests.

This required us to change the default error handler in at least one
http.Server, as well as plumbing a bunch of logf functions around,
especially in magicsock and wgengine, but also in logtail and backoff.

To add insult to injury, 'go test' also rearranges the output when a
parent test has multiple sub-tests (all the sub-test's t.Logf is always
printed after all the parent tests t.Logf), so we need to screw around
with a special Logf that can point at the "current" t (current_t.Logf)
in some places. Probably our entire way of using subtests is wrong,
since 'go test' would probably like to run them all in parallel if you
called t.Parallel(), but it definitely can't because the're all
manipulating the shared state created by the parent test. They should
probably all be separate toplevel tests instead, with common
setup/teardown logic. But that's a job for another time.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
5 years ago
Dmytro Shynkevych 33b2f30cea
wgengine: wrap tun.Device to support filtering and packet injection (#358)
Right now, filtering and packet injection in wgengine depend
on a patch to wireguard-go that probably isn't suitable for upstreaming.

This need not be the case: wireguard-go/tun.Device is an interface.
For example, faketun.go implements it to mock a TUN device for testing.

This patch implements the same interface to provide filtering
and packet injection at the tunnel device level,
at which point the wireguard-go patch should no longer be necessary.

This patch has the following performance impact on i7-7500U @ 2.70GHz,
tested in the following namespace configuration:
┌────────────────┐    ┌─────────────────────────────────┐     ┌────────────────┐
│      $ns1      │    │               $ns0              │     │      $ns2      │
│    client0     │    │      tailcontrol, logcatcher    │     │     client1    │
│  ┌─────┐       │    │  ┌──────┐         ┌──────┐      │     │  ┌─────┐       │
│  │vethc│───────┼────┼──│vethrc│         │vethrs│──────┼─────┼──│veths│       │
│  ├─────┴─────┐ │    │  ├──────┴────┐    ├──────┴────┐ │     │  ├─────┴─────┐ │
│  │10.0.0.2/24│ │    │  │10.0.0.1/24│    │10.0.1.1/24│ │     │  │10.0.1.2/24│ │
│  └───────────┘ │    │  └───────────┘    └───────────┘ │     │  └───────────┘ │
└────────────────┘    └─────────────────────────────────┘     └────────────────┘
Before:
---------------------------------------------------
| TCP send               | UDP send               |
|------------------------|------------------------|
| 557.0 (±8.5) Mbits/sec | 3.03 (±0.02) Gbits/sec |
---------------------------------------------------
After:
---------------------------------------------------
| TCP send               | UDP send               |
|------------------------|------------------------|
| 544.8 (±1.6) Mbits/sec | 3.13 (±0.02) Gbits/sec |
---------------------------------------------------
The impact on receive performance is similar.

Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
5 years ago
Brad Fitzpatrick fefd7e10dc types/structs: add structs.Incomparable annotation, use it where applicable
Shotizam before and output queries:

sqlite> select sum(size) from bin where func like 'type..%';
129067
=>
120216
5 years ago
Brad Fitzpatrick e1526b796e ipn: don't listen on the unspecified address in test
To avoid the Mac firewall dialog of (test) death.

See 4521a59f30
which I added to help debug this.
5 years ago
Brad Fitzpatrick 18017f7630 ipn, wgengine/magicsock: be more idle when in Stopped state with no peers
(Previously as #288, but with some more.)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Anderson 9669b85b41 wgengine/magicsock: wait for endpoint updater goroutine when closing.
Fixes #204.

Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
Brad Fitzpatrick 268d331cb5 wgengine/magicsock: prune key.Public-keyed on peer removals
Fixes #215
5 years ago
Brad Fitzpatrick 00d053e25a wgengine/magicsock: fix slow memory leak as peer endpoints move around
Updates #215
5 years ago
Brad Fitzpatrick 7fc97c5493 wgengine/magicsock: use netaddr more
In prep for deleting from the ever-growing maps.
5 years ago
Brad Fitzpatrick 6fb30ff543 wgengine/magicsock: start using inet.af/netaddr a bit 5 years ago
Brad Fitzpatrick a7e7c7b548 wgengine/magicsock: close derp connections on rebind
Fixes #276

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 614261d00d wgengine/magicsock: reset AddrSet states on Rebind
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Blake Gentry e19287f60f wgengine/magicsock: fix Conn docs type reference
The docs on magicsock.Conn stated that they implemented the
wireguard/device.Bind interface, yet this type does not exist. In
reality, the Conn type implements the wireguard/conn.Bind interface.

I also fixed a small typo in the same file.

Signed-off-by: Blake Gentry <blakesgentry@gmail.com>
5 years ago
Brad Fitzpatrick 322499473e cmd/tailscaled, wgengine, ipn: add /debug/ipn handler with world state
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 2d48f92a82 wgengine/magicsock: re-stun every [20,27] sec, not 28
28 is cutting it close, and we think jitter will help some spikes
we're seeing.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 577f321c38 wgengine/magicsock: revise derp fallback logic
Revision to earlier 6284454ae5

Don't be sticky if we have no peers.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 6284454ae5 wgengine/magicsock: if UDP blocked, pick DERP where most peers are
Updates #207

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick d321190578 wgengine/magicsock: stringify [IPv6]:port normally in AddrSet.String 5 years ago
Brad Fitzpatrick 3c3ea8bc8a wgengine/magicsock: finish IPv6 transport support
DEBUG_INCLUDE_IPV6=1 is still required, but works now.

Updates #18 (fixes it, once env var gate is removed)
5 years ago
Brad Fitzpatrick 82ed7e527e wgengine/magicsock: remove log allocation
This was the whole point but I goofed at the last line.
5 years ago
Brad Fitzpatrick 8454bbbda5 wgengine/magicsock: more logging improvements
* remove endpoint discovery noise when results unchanged
* consistently spell derp nodes as "derp-N"
* replace "127.3.3.40:" with "derp-" in CreateEndpoint log output
* stop early DERP setup before SetPrivateKey is called;
  it just generates log nosie
* fix stringification of peer ShortStrings (it had an old %x on it,
  rendering it garbage)
* describe why derp routes are changing, with one of:
  shared home, their home, our home, alt
5 years ago
Brad Fitzpatrick 680311b3df wgengine/magicsock: fix few remaining logs without package prefix 5 years ago
Brad Fitzpatrick c473927558 wgengine/magicsock: clean up, add, improve DERP logs 5 years ago
Brad Fitzpatrick ea9310403d wgengine/magicsock: re-STUN on DERP connection death
Fixes #201
5 years ago
Brad Fitzpatrick 1ab5b31c4b derp, magicsock: send new "peer gone" frames when previous sender disconnects
Updates #150 (not yet enabled by default in magicsock)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick b6f77cc48d wgengine/magicsock: return early, outdent in derpWriteChanOfAddr 5 years ago
Brad Fitzpatrick dd31285ad4 wgengine/magicsock: send IPv6 using pconn6, if available
In prep for IPv6 support. Nothing should make it this far yet.
5 years ago
Brad Fitzpatrick af277a6762 controlclient, magicsock: add debug knob to request IPv6 endpoints
Add opt-in method to request IPv6 endpoints from the control plane.
For now they should just be skipped. A previous version of this CL was
unconditional and reportedly had problems that I can't reproduce. So
make it a knob until the mystery is solved.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 221e7d7767 wgengine/magicsock: make log message include DERP port (node) 5 years ago
Brad Fitzpatrick 33bdcabf03 wgengine/magicsock: call stun callback w/ only valid part of STUN packet 5 years ago
David Anderson 0be475ba46 Revert "tailcfg, controlclient, magicsock: request IPv6 endpoints, but ignore them"
Breaks something deep in wireguard or magicsock's brainstem, no packets at all
can flow. All received packets fail decryption with "invalid mac1".

This reverts commit 94024355ed.

Signed-off-by: David Anderson <dave@natulte.net>
5 years ago
Brad Fitzpatrick 94024355ed tailcfg, controlclient, magicsock: request IPv6 endpoints, but ignore them
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 60ea635c6d wgengine/magicsock: delete inaccurate comment
I meant to include this in the earlier commit.
5 years ago
Brad Fitzpatrick a184e05290 wgengine/magicsock: listen on udp6, use it for STUN, report endpoint
More steps towards IPv6 transport.

We now send it to tailcontrol, which ignores it.

But it doesn't actually actually support IPv6 yet (outside of STUN).

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 7caa288213 wgengine/magicsock: rename pconn field to pconn4, in prep for pconn6 5 years ago
David Crawshaw addbdce296 wgengine, ipn: include number of active DERPs in status
Use this when making the ipn state transition from Starting to
Running. This way a network of quiet nodes with no active
handshaking will still transition to Active.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
David Crawshaw 1ad78ce698 magicsock: reconnect to home DERP on key change
Typically the home DERP server is found and set on startup before
magicsock's SetPrivateKey can be called, so no DERP connection is
established. Make sure one is by kicking the home DERP tires in
SetPrivateKey.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
David Crawshaw 455ba751d9 magicsock: start connection to HOME derp immediately
The code as written intended to do this, but it repeated the
comparison of derpNum and c.myDerp after c.myDerp had been
updated, so it never executed.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
David Anderson 315a5e5355 scripts: add a license header checker.
Signed-off-by: David Anderson <dave@natulte.net>
5 years ago
Brad Fitzpatrick e085aec8ef all: update to wireguard-go API changes
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick db2436c7ff wgengine/magicsock: don't interrupt endpoint updates, merge all mutex into one
Before, endpoint updates were constantly being interrupted and resumed
on Linux due to tons of LinkChange messages from over-zealous Linux
netlink messages (from router_linux.go)

Now that endpoint updates are fast and bounded in time anyway, just
let them run to completion, but note that another needs to be
scheduled after.

Now logs went from pages of noise to just:

root@taildoc:~# grep -i -E 'stun|endpoint update' log
2020/03/13 08:51:29 magicsock.Conn: starting endpoint update (initial)
2020/03/13 08:51:30 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")
2020/03/13 08:51:31 magicsock.Conn: starting endpoint update (link-change-minor)
2020/03/13 08:51:31 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")
2020/03/13 08:51:33 magicsock.Conn: starting endpoint update (link-change-minor)
2020/03/13 08:51:33 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")
2020/03/13 08:51:35 magicsock.Conn: starting endpoint update (link-change-minor)
2020/03/13 08:51:35 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")

Or, seen in another run:

2020/03/13 08:45:41 magicsock.Conn: starting endpoint update (periodic)
2020/03/13 08:46:09 magicsock.Conn: starting endpoint update (periodic)
2020/03/13 08:46:21 magicsock.Conn: starting endpoint update (link-change-major)
2020/03/13 08:46:37 magicsock.Conn: starting endpoint update (periodic)
2020/03/13 08:47:05 magicsock.Conn: starting endpoint update (periodic)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick db31550854 wgengine: don't Reconfig on boring link changes
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick b9c6d3ceb8 netcheck: work behind UDP-blocked networks again, add tests
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick bc73dcf204 wgengine/magicsock: don't block in Send waiting for derphttp.Send
Fixes #137
Updates #109
Updates #162
Updates #163

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 8807913be9 wgengine/magicsock: wait for previous DERP goroutines to end before new ones
Updates #109 (hopefully fixes, will wait for graphs to be happy)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick eff6dcdb4e wgengine/magicsock: log more about why we're re-STUNing 5 years ago
Brad Fitzpatrick b3ddf51a15 wgengine/magicsock: add a pointer value for logging
Updates #109

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick b0f8931d26 wgengine/magicsock: make a test signature a bit more explicit 5 years ago
David Crawshaw 7ec54e0064 wgengine/magicsock: remove TODO
The TODO above derphttp.NewClient suggests it does network I/O,
but the derphttp client connects lazily and so creating one is
very cheap.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
Brad Fitzpatrick 01b4bec33f stunner: re-do how Stunner works
It used to make assumptions based on having Anycast IPs that are super
near. Now we're intentionally going to a bunch of different distant
IPs to measure latency.

Also, optimize how the hairpin detection works. No need to STUN on
that socket. Just use that separate socket for sending, once we know
the other UDP4 socket's endpoint. The trick is: make our test probe
also a STUN packet, so it fits through magicsock's existing STUN
routing.

This drops netcheck from ~5 seconds to ~250-500ms.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Anderson 77af7e5436 wgengine/magicsock: mark test logfunc as a helper.
Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson 7eda3af034 wgengine/magicsock: clean up derp http servers on shutdown.
Failure to do this leads to fd exhaustion at -count=10000,
and increasingly poor execution north of -count=100.

Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson d651715528 wgengine/magicsock: synchronize test STUN shutdown.
Failure to do so triggers either a data race or a panic
in the testing package, due to racey use of t.Logf.

Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson 86baf60bd4 wgengine/magicsock: synchronize epUpdate cleanup on shutdown.
Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
Brad Fitzpatrick 023df9239e Move linkstate boring change filtering to magicsock
So we can at least re-STUN on boring updates.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Anderson 592fec7606 wgengine/magicsock: move device close to uncursed portion of test.
Device close used to suffer from deadlocks, but no longer.

Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
Brad Fitzpatrick a265d7cbff wgengine/magicsock: in STUN-disabled test mode, let endpoint discovery proceed 5 years ago
Brad Fitzpatrick 5c1e443d34 wgengine/monitor: don't call LinkChange when interfaces look unchanged
Basically, don't trust the OS-level link monitor to only tell you
interesting things. Sanity check it.

Also, move the interfaces package into the net directory now that we
have it.
5 years ago
Brad Fitzpatrick 39c0ae1dba derp/derpmap: new DERP config package, merge netcheck into magicsock more
Fixes #153
Updates #162
Updates #163

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 4800926006 wgengine/magicsock: add AddrSet appendDests+UpdateDst tests 5 years ago
David Crawshaw e201f63230 magicsock: unskip tests that are reliable
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
David Anderson bb93d7aaba wgengine/magicsock: plumb logf throughout, and expose in Options.
Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
Brad Fitzpatrick f42b9b6c9a wgengine/magicsock: don't discard UDP packet on UDP+DERP race
Fixes #155

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Anderson e3172ae267 wgengine/magicsock: uncurse TestDeviceStartStop, let CI run it.
Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson f265603110 wgengine/magicsock: fix data race in ReceiveIPv4.
The UDP reader goroutine was clobbering `n` and `err` from the
main goroutine, whose accesses are not synchronized the way `b` is.

Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson 77354d4617 wgengine/magicsock: unblock wireguard-go's read on magicsock shutdown.
wireguard-go closes magicsock, and expects this to unblock reads
so that its internal goroutines can wind down. We were incorrectly
blocking the read indefinitey and breaking this contract.

Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson fdee5fb639 wgengine/magicsock: don't mutexly reach inside Conn to tweak DERP settings.
Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson 643bf14653 wgengine/magicsock: disable the new ping test.
It's extremely flaky in several dimensions, as well as very slow.
It's making the CI completely red all the time without telling us
useful information.

Set RUN_CURSED_TESTS=1 to run locally.
5 years ago
David Anderson c8ebac2def wgengine/magicsock: try deflaking again.
This change just alters the semantics of the one flaky test, without
trying to speed up timeouts on the others. Empirically, speeding up
the timeouts causes _more_ flakes right now :(
5 years ago
David Anderson cd1ac63b4c Revert "wgengine/magicsock: temporarily deflake."
This reverts commit c5835c6ced.
5 years ago
David Anderson c5835c6ced wgengine/magicsock: temporarily deflake.
The remaining flake occurs due to a mysterious packet loss. This
doesn't affect normal tailscaled operations, so until I track down
where the loss occurs and fix it, the flaky test is going to be
lenient about packet loss (but not about whether the spray logic
worked).

Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
Brad Fitzpatrick 61d83f759b wgengine/magicsock: remove redundant derpMagicIP comparison
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Anderson bd60a750e8 wgengine/magicsock: fix packet spraying test to (mostly) pass.
It previously passed incorrectly due to bugs. With those fixed,
it becomes flaky for 2 reasons. One of them is the wireguard handshake
race, which can eat the 1st sprayed packet and prevent roamAddr
discovery. This change fixes that failure, by spreading the test
traffic out enough that additional spraying occurs.

Signed-Off-By: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson ef31dd7bb5 wgengine/magicsock: check all 3 fast paths independently.
The previous code would skip the DERP short-circuit if roamAddr
was set, which is not what we wanted. More generally, hitting
any of the fast path conditions is a direct return, so we can
just have 3 standalone branches rather than 'else if' stuff.

Signed-Off-By: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson 05a52746a4 wgengine/magicsock: fix destination selection logic to work with DERP.
The effect is subtle: when we're not spraying packets, and have not yet
figured out a curAddr, and we're not spraying, we end up sending to
whatever the first IP is in the iteration order. In English, that
means "when we have no idea where to send packets, and we've given
up on sending to everyone, just send to the first addr we see in
the list."

This is, in general, what we want, because the addrs are in sorted
preference order, low to high, and DERP is the least preferred
destination. So, when we have no idea where to send, send to DERP,
right?

... Except for very historical reasons, appendDests iterated through
addresses in _reverse_ order, most preferred to least preferred.
crawshaw@ believes this was part of the earliest handshaking
algorithm magicsock had, where it slowly iterated through possible
destinations and poked handshakes to them one at a time.

Anyway, because of this historical reverse iteration, in the case
described above of "we have no idea where to send", the code would
end up sending to the _most_ preferred candidate address, rather
than the _least_ preferred. So when in doubt, we'd end up firing
packets into the blackhole of some LAN address that doesn't work,
and connectivity would not work.

This case only comes up if all your non-DERP connectivity options
have failed, so we more or less failed to detect it because we
didn't have a pathological test box deployed. Worse, codependent
bug 2839854994 made DERP accidentally
work sometimes anyway by incorrectly exploiting roamAddr behavior,
albeit at the cost of making DERP traffic symmetric. In fixing
DERP to once again be asymmetric, we effectively removed the
bandaid that was concealing this bug.

Signed-Off-By: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson 97e58ad44d wgengine/magicsock: only set addrByKey once in CreateEndpoint.
Signed-Off-By: David Anderson <danderson@tailscale.com>
5 years ago