Commit Graph

457 Commits (c573bef0aa088d519f6fd4cde10f2b42fa221c39)

Author SHA1 Message Date
Brad Fitzpatrick c576fea60e wgengine/magicsock: delete unused WhoIs method that was moved elsewhere
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick ef7bac2895 tailcfg, net/portmapper, wgengine/magicsock: add NetInfo.HavePortMap
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 79d8288f0a wgengine/magicsock, derp, derp/derphttp: respond to DERP server->client pings
No server support yet, but we want Tailscale 1.6 clients to be able to respond
to them when the server can do it.

Updates #1310

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 387e83c8fe wgengine/magicsock: fix Conn.Rebind race that let ErrClosed errors be read
There was a logical race where Conn.Rebind could acquire the
RebindingUDPConn mutex, close the connection, fail to rebind, release
the mutex, and then because the mutex was no longer held, ReceiveIPv4
wouldn't retry reads that failed with net.ErrClosed, letting that
error back to wireguard-go, which would then stop running that receive
IP goroutine.

Instead, keep the RebindingUDPConn mutex held for the entirety of the
replacement in all cases.

Updates tailscale/corp#1289

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c445e3d327 wgengine/magicsock: fix typo in comment 4 years ago
Brad Fitzpatrick a6d098c750 wgengine/magicsock: log when DERP connection succeeds
Updates #1310
4 years ago
Brad Fitzpatrick 829eb8363a net/interfaces: sort returned addresses from LocalAddresses
Also change the type to netaddr.IP while here, because it made sorting
easier.

Updates tailscale/corp#1397

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c3e5903b91 wgengine/magicsock: remove leftover portmapper debug logging
It's already logged at the right time in logEndpointChange.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick ea3715e3ce wgengine/magicsock: remove TODO about endpoints-over-DERP
It was done in Tailscale 1.4 with CallMeMaybe disco messages
containing endpoints.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick e9e4f1063d wgengine/magicsock: fix discoEndpoint caching bug when a node key changes
Fixes #1391

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c64bd587ae net/portmapper: add NAT-PMP client, move port mapping service probing
* move probing out of netcheck into new net/portmapper package
* use PCP ANNOUNCE op codes for PCP discovery, rather than causing
  short-lived (sub-second) side effects with a 1-second-expiring map +
  delete.
* track when we heard things from the router so we can be less wasteful
  in querying the router's port mapping services in the future
* use portmapper from magicsock to map a public port

Fixes #1298
Fixes #1080
Fixes #1001
Updates #864

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Josh Bleecher Snyder 1632f9fd6b wgengine/magicsock: reduce log spam during tests
Only do the type assertion to *net.UDPAddr when addr is non-nil.
This prevents a bunch of log spam during tests.
4 years ago
Josh Bleecher Snyder 88586ec4a4 wgengine/magicsock: remove an alloc from ReceiveIPvN
We modified the standard net package to not allocate a *net.UDPAddr
during a call to (*net.UDPConn).ReadFromUDP if the caller's use
of the *net.UDPAddr does not cause it to escape.
That is https://golang.org/cl/291390.

This is the companion change to magicsock.
There are two changes required.
First, call ReadFromUDP instead of ReadFrom, if possible.
ReadFrom returns a net.Addr, which is an interface, which always allocates.
Second, reduce the lifetime of the returned *net.UDPAddr.
We do this by immediately converting it into a netaddr.IPPort.

We left the existing RebindingUDPConn.ReadFrom method in place,
as it is required to satisfy the net.PacketConn interface.

With the upstream change and both of these fixes in place,
we have removed one large allocation per packet received.

name           old time/op    new time/op    delta
ReceiveFrom-8    16.7µs ± 5%    16.4µs ± 8%     ~     (p=0.310 n=5+5)

name           old alloc/op   new alloc/op   delta
ReceiveFrom-8      112B ± 0%       64B ± 0%  -42.86%  (p=0.008 n=5+5)

name           old allocs/op  new allocs/op  delta
ReceiveFrom-8      3.00 ± 0%      2.00 ± 0%  -33.33%  (p=0.008 n=5+5)

Co-authored-by: Sonia Appasamy <sonia@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
Josh Bleecher Snyder 0c673c1344 wgengine/magicsock: unify on netaddr types in addrSet
addrSet maintained duplicate lists of netaddr.IPPorts and net.UDPAddrs.
Unify to use the netaddr type only.

This makes (*Conn).ReceiveIPvN a bit uglier,
but that'll be cleaned up in a subsequent commit.

This is preparatory work to remove an allocation from ReceiveIPv4.

Co-authored-by: Sonia Appasamy <sonia@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
Brad Fitzpatrick 7e201806b1 wgengine/magicsock: reconnect to DERP home after network comes back up
Updates #1310
4 years ago
Brad Fitzpatrick 9b4e50cec0 wgengine/magicsock: fix typo in comment 4 years ago
Brad Fitzpatrick 6b365b0239 wgengine/magicsock: fix DERP reader hang regression during concurrent reads
Fixes #1282

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 6d2b8df06d wgengine/magicsock: add disabled failing (deadlocking) test for #1282
The fix can make this test run unconditionally.

This moves code from 5c619882bc for
testability but doesn't fix it yet. The #1282 problem remains (when I
wrote its wake-up mechanism, I forgot there were N DERP readers
funneling into 1 UDP reader, and the code just isn't correct at all
for that case).

Also factor out some test helper code from BenchmarkReceiveFrom.

The refactoring in magicsock.go for testability should have no
behavior change.
4 years ago
Brad Fitzpatrick 1e7a35b225 types/netmap: split controlclient.NetworkMap off into its own leaf package
Updates #1278

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 6064b6ff47 wgengine/wgcfg/nmcfg: split control/controlclient/netmap.go into own package
It couldn't move to ipnlocal due to test dependency cycles.

Updates #1278

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick f7eed25bb9 wgengine/magicsock: filter disco packets and packets when stopped from wireguard
Fixes #1167
Fixes tailscale/corp#219

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Josh Bleecher Snyder dd10babaed wgenginer/magicsock: remove Addrs methods
They are now unused.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
Brad Fitzpatrick 9541886856 wgengine/magicsock: disable regular STUNs for all platforms by default
Reduces background CPU & network.

Updates #1034

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c55d26967b wgengine/magicsock: log more details of endpoints learned over disco
Also, don't try to use IPv6 LinkLocalUnicast addresses for now. Like endpoints
exchanged with control, we share them but don't yet use them.

Updates #1172
4 years ago
Brad Fitzpatrick 359055d3fa wgengine/magicsock: fix logging regression
c8c493f3d9 made it always say
`created=false` which scared me when I saw it, as that would've implied
things were broken much worse. Fortunately the logging was just wrong.
4 years ago
Brad Fitzpatrick edf64e0901 wgengine/magicsock: send, use endpoints in CallMeMaybe messages
Fixes #1172

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick b5b4992eff disco: support parsing/encoding endpoints in call-me-maybe frames
Updates #1172

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 187e22a756 wgengine/magicsock: don't run the DERP cleanup so often
To save CPU and wakeups, don't run the DERP cleanup timer regularly
unless there is a non-home DERP connection open.

Also eliminates the goroutine, moving to a time.AfterFunc.

Updates #1034

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Josh Bleecher Snyder 5fe5402fcd Revert "wgengine/magicsock: shortcircuit discoEndpoint.heartbeat when its connection is closed"
This reverts commit 08baa17d9a.
It caused deadlocks due to lock ordering violations.
It was not the right fix, and thus should simply be reverted
while we look for the right fix (if we haven't already found it
in the interim; we've fixed other logging-after-test issues).

Fixes #1161
4 years ago
Brad Fitzpatrick edce91a8a6 wgengine/magicsock: fix a naked return bug/crash where we returned (nil, true)
The 'ok' from 'ipp, ok :=' above was the result parameter ok. Whoops.
4 years ago
Brad Fitzpatrick 51bd1feae4 wgengine/magicsock: add single element IPPort->endpoint cache in receive path
name           old time/op  new time/op  delta
ReceiveFrom-4  21.8µs ± 2%  20.9µs ± 2%  -4.27%  (p=0.000 n=10+10)

Updates #414

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 5c619882bc wgengine/magicsock: simplify ReceiveIPv4+DERP path
name           old time/op  new time/op  delta
ReceiveFrom-4  35.8µs ± 3%  21.9µs ± 5%  -38.92%  (p=0.008 n=5+5)

Fixes #1145
Updates #414

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 3fa86a8b23 wgengine/magicsock: use relatively new netaddr.IPPort.IsZero method 4 years ago
Brad Fitzpatrick 4811236189 wgengine/magicsock: speed up BenchmarkReceiveFrom, store context.Done chan
context.cancelCtx.Done involves a mutex and isn't as cheap as I
previously assumed. Convert the donec method into a struct field and
store the channel value once. Our one magicsock.Conn gets one pointer
larger, but it cuts ~1% of the CPU time of the ReceiveFrom benchmark
and removes a bubble from the --svg output :)
4 years ago
Josh Bleecher Snyder 63af950d8c wgengine/magicsock: adapt to wireguard-go without UpdateDst
22507adf54 stopped relying on
our fork of wireguard-go's UpdateDst callback.
As a result, we can unwind that code,
and the extra return value of ReceiveIPv{4,6}.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
David Anderson 57d95dd005 wgengine/magicsock: default legacy networking to off for some tests.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson a2463e8948 wgengine/magicsock: add an option to disable legacy peer handling.
Used in tests to ensure we're not relying on behavior we're going
to remove eventually.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson d456bfdc6d wgengine/magicsock: fix BenchmarkReceiveFrom.
Previously, this benchmark relied on behavior of the legacy
receive codepath, which I changed in 22507adf. With this
change, the benchmark instead relies on the new active discovery
path.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Josh Bleecher Snyder 08baa17d9a wgengine/magicsock: shortcircuit discoEndpoint.heartbeat when its connection is closed
This prevents us from continuing to do unnecessary work
(including logging) after the connection has closed.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
Josh Bleecher Snyder 7c76435bf7 wgengine/magicsock: simplify
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
Josh Bleecher Snyder 654b5f1570 all: convert from []wgcfg.Endpoint to string
This eliminates a dependency on wgcfg.Endpoint,
as part of the effort to eliminate our wireguard-go fork.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
David Anderson 22507adf54 wgengine/magicsock: stop depending on UpdateDst in legacy codepaths.
This makes connectivity between ancient and new tailscale nodes slightly
worse in some cases, but only in cases where the ancient version would
likely have failed to get connectivity anyway.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Denton Gentry 0aa55bffce magicsock: test error case in derpWriteChanOfAddr
In derpWriteChanOfAddr when we call derphttp.NewRegionClient(),
there is a check of whether the connection is already errored and
if so it returns before grabbing the lock. The lock might already
be held and would be a deadlock.

This corner case is not being reliably exercised by other tests.
This shows up in code coverage reports, the lines of code in
derpWriteChanOfAddr are alternately added and subtracted from
code coverage.

Add a test to specifically exercise this code path, and verify that
it doesn't deadlock.

This is the best tradeoff I could come up with:
+ the moment code calls Err() to check if there is an error, we
  grab the lock to make sure it would deadlock if it tries to grab
  the lock itself.
+ if a new call to Err() is added in this code path, only the
  first one will be covered and the rest will not be tested.
+ this test doesn't verify whether code is checking for Err() in
  the right place, which ideally I guess it would.

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
4 years ago
Brad Fitzpatrick 85e54af0d7 wgengine: on TCP connect fail/timeout, log some clues about why it failed
So users can see why things aren't working.

A start. More diagnostics coming.

Updates #1094
4 years ago
Brad Fitzpatrick f85769b1ed wgengine/magicsock: drop netaddr.IPPort cache
netaddr.IP no longer allocates, so don't need a cache or all its associated
code/complexity.

This totally removes groupcache/lru from the deps.

Also go mod tidy.
4 years ago
Brad Fitzpatrick 5aa5db89d6 cmd/tailscaled, wgengine/netstack: add start of gvisor userspace netstack work
Not usefully functional yet (mostly a proof of concept), but getting
it submitted for some work @namansood is going to do atop this.

Updates #707
Updates #634
Updates #48
Updates #835
4 years ago
Brad Fitzpatrick 5efb0a8bca cmd/tailscale: change formatting of "tailscale status"
* show DNS name over hostname, removing domain's common MagicDNS suffix.
  only show hostname if there's no DNS name.
  but still show shared devices' MagicDNS FQDN.

* remove nerdy low-level details by default: endpoints, DERP relay,
  public key.  They're available in JSON mode still for those who need
  them.

* only show endpoint or DERP relay when it's active with the goal of
  making debugging easier. (so it's easier for users to understand
  what's happening) The asterisks are gone.

* remove Tx/Rx numbers by default for idle peers; only show them when
  there's traffic.

* include peers' owner login names

* add CLI option to not show peers (matching --self=true, --peers= also
  defaults to true)

* sort by DNS/host name, not public key

* reorder columns
4 years ago
Brad Fitzpatrick b5b9866ba2 wgengine/magicsock: copy self DNS name to PeerStatus, re-fill OS
The OS used to be sent back from the server but that has since
been removed as being redundant.
4 years ago
David Anderson 86fe22a1b1 Update netaddr, and adjust wgengine/magicsock due to API change.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Josh Bleecher Snyder 56a7652dc9 wgkey: new package
This is a replacement for the key-related parts
of the wireguard-go wgcfg package.

This is almost a straight copy/paste from the wgcfg package.
I have slightly changed some of the exported functions and types
to avoid stutter, added and tweaked some comments,
and removed some now-unused code.

To avoid having wireguard-go depend on this new package,
wgcfg will keep its key types.

We translate into and out of those types at the last minute.
These few remaining uses will be eliminated alongside
the rest of the wgcfg package.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
Josh Bleecher Snyder 2fe770ed72 all: replace wgcfg.IP and wgcfg.CIDR with netaddr types
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
Brad Fitzpatrick 053a1d1340 all: annotate log verbosity levels on most egregiously spammy log prints
Fixes #924
Fixes #282

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
David Anderson 294ceb513c ipn, wgengine/magicsock: fix `tailscale status` display.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson c8c493f3d9 wgengine/magicsock: make ReceiveIPv4 a little easier to follow.
The previous code used a lot of whole-function variables and shared
behavior that only triggered based on prior action from a single codepath.
Instead of that, move the small amounts of "shared" code into each switch
case.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 0ad109f63d wgengine/magicsock: move legacy endpoint creation into legacy.go.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson f873da5b16 wgengine/magicsock: move more legacy endpoint handling.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 58fcd103c4 wgengine/magicsock: move legacy sending code to legacy.go.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 65ae66260f wgengine/magicsock: unexport AddrSet.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson c9b9afd761 wgengine/magicsock: move most legacy nat traversal bits to another file.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 554a20becb wgengine/magicsock: only log about lazy config when actually doing lazy config.
Before, tailscaled would log every 10 seconds when the periodic noteRecvActivity
call happens. This is noisy, but worse it's misleading, because the message
suggests that the disco code is starting a lazy config run for a missing peer,
whereas in fact it's just an internal piece of keepalive logic.

With this change, we still log when going from 0->1 tunnel for the peer, but
not every 10s thereafter.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick fa412c8760 wgengine/filter, wgengine/magicsock: use new IP.BitLen to simplify some code 4 years ago
David Anderson 9cee0bfa8c wgengine/magicsock: sprinkle more docstrings.
Magicsock is too damn big, but this might help me page it back
in faster next time.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick 713cbe84c1 wgengine/magicsock: use net.JoinHostPort when host might have colons (udp6)
Only affected tests. (where it just generated log spam)
4 years ago
Brad Fitzpatrick 450cfedeba wgengine/magicsock: quiet an IPv6 warning in tests
In tests, we force binding to localhost to avoid OS firewall warning
dialogs.

But for IPv6, we were trying (and failing) to bind to 127.0.0.1.

You'd think we'd just say "localhost", but that's apparently ill
defined. See
https://tools.ietf.org/html/draft-ietf-dnsop-let-localhost-be-localhost
and golang/go#22826. (It's bitten me in the past, but I can't
remember specific bugs.)

So use "::1" explicitly for "udp6", which makes the test quieter.
4 years ago
Brad Fitzpatrick fd2a30cd32 wgengine/magicsock: make test pass on Windows and without firewall dialog box
Updates #50
4 years ago
Brad Fitzpatrick ac866054c7 wgengine/magicsock: add a backoff on DERP reconnects
Fixes #808
4 years ago
Brad Fitzpatrick 105a820622 wgengine/magicsock: skip an endpoint update at start-up
At startup the client doesn't yet have the DERP map so can't do STUN
queries against DERP servers, so it only knows it local interface
addresses, not its STUN-mapped addresses.

We were reporting the interface-local addresses to control, getting
the DERP map, and then immediately reporting the full set of
updates. That was an extra HTTP request to control, but worse: it was
an extra broadcast from control out to all the peers in the network.

Now, skip the initial update if there are no stun results and we don't
have a DERP map.

More work remains optimizing start-up requests/map updates, but this
is a start.

Updates tailscale/corp#557
4 years ago
Brad Fitzpatrick 2076a50862 wgengine/magicsock: finish a comment sentence that ended prematurely 4 years ago
Brad Fitzpatrick 3e4c46259d wgengine/magicsock: don't do netchecks either when network is down
A continuation of 6ee219a25d

Updates #640
4 years ago
Brad Fitzpatrick 6ee219a25d ipn, wgengine, magicsock, tsdns: be quieter and less aggressive when offline
If no interfaces are up, calm down and stop spamming so much. It was
noticed as especially bad on Windows, but probably was bad
everywhere. I just have the best network conditions testing on a
Windows VM.

Updates #604
4 years ago
Christina Wen 48fbe93e72
wgengine/magicsock: clarify pre-disco 'tailscale ping' error message
This change clarifies the error message when a user pings a peer that is using an outdated version of Tailscale.
4 years ago
Josh Bleecher Snyder 0c0239242c wgengine/magicsock: make discoPingPurpose a stringer
It was useful for debugging once, it'll probably be useful again.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
Josh Bleecher Snyder 57e642648f wgengine/magicsock: fix typo in comment 4 years ago
Brad Fitzpatrick 756d6a72bd wgengine: lazily create peer wireguard configs more explicitly
Rather than consider bigs jumps in last-received-from activity as a
signal to possibly reconfigure the set of wireguard peers to have
configured, instead just track the set of peers that are currently
excluded from the configuration. Easier to reason about.

Also adds a bit more logging.

This might fix an error we saw on a machine running a recent unstable
build:

2020-08-26 17:54:11.528033751 +0000 UTC: 8.6M/92.6M magicsock: [unexpected] lazy endpoint not created for [UcppE], d:42a770f678357249
2020-08-26 17:54:13.691305296 +0000 UTC: 8.7M/92.6M magicsock: DERP packet received from idle peer [UcppE]; created=false
2020-08-26 17:54:13.691383687 +0000 UTC: 8.7M/92.6M magicsock: DERP packet from unknown key: [UcppE]

If it does happen again, though, we'll have more logs.
4 years ago
halulu f27a57911b
cmd/tailscale: add derp and endpoints status (#703)
cmd/tailscale: add local node's information to status output (by default)

RELNOTE=yes

Updates #477

Signed-off-by: Halulu <lzjluzijie@gmail.com>
4 years ago
David Crawshaw dd2c61a519 magicsock: call RequestStatus when DERP connects
Second attempt.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
4 years ago
David Crawshaw a67b174da1 Revert "magicsock: call RequestStatus when DERP connects"
Seems to break linux CI builder. Cannot reproduce locally,
so attempting a rollback.

This reverts commit cd7bc02ab1.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
4 years ago
David Crawshaw cd7bc02ab1 magicsock: call RequestStatus when DERP connects
Without this, a freshly started ipn client will be stuck in the
"Starting" state until something triggers a call to RequestStatus.
Usually a UI does this, but until then we can sit in this state
until poked by an external event, as is evidenced by our e2e tests
locking up when DERP is attached.

(This only recently became a problem when we enabled lazy handshaking
everywhere, otherwise the wireugard tunnel creation would also
trigger a RequestStatus.)

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
4 years ago
Brad Fitzpatrick f6dc47efe4 tailcfg, controlclient, magicsock: add control feature flag to enable DRPO
Updates #150
4 years ago
Brad Fitzpatrick 85c3d17b3c wgengine/magicsock: use disco ping src as a candidate endpoint
Consider:

   Hard NAT (A) <---> Hard NAT w/ mapped port (B)

If A sends a packet to B's mapped port, A can disco ping B directly,
with low latency, without DERP.

But B couldn't establish a path back to A and needed to use DERP,
despite already logging about A's endpoint and adding a mapping to it
for other purposes (the wireguard conn.Endpoint lookup also needed
it).

This adds the tracking to discoEndpoint too so it'll be used for
finding a path back.

Fixes tailscale/corp#556

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 0512fd89a1 wgengine/magicsock: simplify handlePingLocked
It's no longer true that 'de may be nil'
4 years ago
Brad Fitzpatrick 84dc891843 cmd/tailscale/cli: add ping subcommand
For example:

$ tailscale ping -h
USAGE
  ping <hostname-or-IP>

FLAGS
  -c 10                   max number of pings to send
  -stop-once-direct true  stop once a direct path is established
  -verbose false          verbose output

$ tailscale ping mon.ts.tailscale.com
pong from monitoring (100.88.178.64) via DERP(sfo) in 65ms
pong from monitoring (100.88.178.64) via DERP(sfo) in 252ms
pong from monitoring (100.88.178.64) via [2604:a880:2:d1::36:d001]:41641 in 33ms

Fixes #661

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 9a346fd8b4 wgengine,magicsock: fix two lazy wireguard config issues
1) we weren't waking up a discoEndpoint that once existed and
   went idle for 5 minutes and then got a disco message again.

2) userspaceEngine.noteReceiveActivity had a buggy check; fixed
   and added a test
4 years ago
Brad Fitzpatrick cff737786e wgengine/magicsock: fix lazy config deadlock, document more lock ordering
This removes the atomic bool that tried to track whether we needed to acquire
the lock on a future recursive call back into magicsock. Unfortunately that
hack doesn't work because we also had a lock ordering issue between magicsock
and userspaceEngine (see issue). This documents that too.

Fixes #644
4 years ago
Brad Fitzpatrick 2bd9ad4b40 wgengine: fix deadlock between engine and magicsock 4 years ago
Brad Fitzpatrick 7c38db0c97 wgengine/magicsock: don't deadlock on pre-disco Endpoints w/ lazy wireguard configs
Fixes tailscale/tailscale#637
4 years ago
Brad Fitzpatrick 4987a7d46c wgengine/magicsock: when hard NAT, add stun-ipv4:static-port as candidate
If a node is behind a hard NAT and is using an explicit local port
number, assume they might've mapped a port and add their public IPv4
address with the local tailscaled's port number as a candidate endpoint.
4 years ago
Brad Fitzpatrick bfcb0aa0be wgengine/magicsock: deflake tests, Close deadlock again
Better fix than 37903a9056

Fixes tailscale/corp#533
4 years ago
Brad Fitzpatrick cb970539a6 wgengine/magicsock: remove TODO comment that's no longer applicable 4 years ago
Brad Fitzpatrick 915f65ddae wgengine/magicsock: stop disco activity on IPN stop
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c180abd7cf wgengine/magicsock: merge errClosed and errConnClosed 4 years ago
Brad Fitzpatrick d55fdd4669 wgengine/magicsock: update, flesh out a TODO 4 years ago
Brad Fitzpatrick 58b721f374 wgengine/magicsock: deflake some tests with an ugly hack
Starting with fe68841dc7, some e2e tests
got flaky. Rather than debug them (they're gnarly), just revert to the old
behavior as far as those tests are concerned. The tests were somehow
using magicsock without a private key and expecting it to do ... something.

My goal with fe68841dc7 was to stop log spam
and unnecessary work I saw on the iOS app when when stopping the app.

Instead, only stop doing that work on any transition from
once-had-a-private-key to no-longer-have-a-private-key. That fixes
what I wanted to fix while still making the mysterious e2e tests
happy.
4 years ago
David Anderson 0249236cc0 ipn/ipnstate: record assigned Tailscale IPs.
wgengine/magicsock: use ipnstate to find assigned Tailscale IPs.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson f582eeabd1 wgengine/magicsock: add a test for active path discovery.
Uses natlab only, because the point of this active discovery test is going to be
that it should get through a lot of obstacles.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick 37903a9056 wgengine/magicsock: fix occasional deadlock on Conn.Close on c.derpStarted
The deadlock was:

* Conn.Close was called, which acquired c.mu
* Then this goroutine scheduled:

    if firstDerp {
        startGate = c.derpStarted
        go func() {
            dc.Connect(ctx)
            close(c.derpStarted)
        }()
    }

* The getRegion hook for that derphttp.Client then ran, which also
  tries to acquire c.mu.

This change makes that hook first see if we're already in a closing
state and then it can pretend that region doesn't exist.
4 years ago
Brad Fitzpatrick fe68841dc7 wgengine/magicsock: log better with less spam on transition to stopped state
Required a minor test update too, which now needs a private key to get far
enough to test the thing being tested.
4 years ago
Brad Fitzpatrick e298327ba8 wgengine/magicsock: remove overkill, slow reflect.DeepEqual of NetworkMap
No need to allocate or compare all the fields we don't care about.
4 years ago
Brad Fitzpatrick 16a9cfe2f4 wgengine: configure wireguard peers lazily, as needed
wireguard-go uses 3 goroutines per peer (with reasonably large stacks
& buffers).

Rather than tell wireguard-go about all our peers, only tell it about
peers we're actively communicating with. That means we need hooks into
magicsock's packet receiving path and tstun's packet sending path to
lazily create a wireguard peer on demand from the network map.

This frees up lots of memory for iOS (where we have almost nothing
left for larger domains with many users).

We should ideally do this in wireguard-go itself one day, but that'd
be a pretty big change.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 5066b824a6 wgengine/magicsock: don't log about disco ping timeouts if we have a working address
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c06d2a8513 wgengine/magicsock: fix typo in comment 4 years ago
Brad Fitzpatrick bf195cd3d8 wgengine/magicsock: reduce log verbosity of discovery messages
Don't log heartbeat pings & pongs. Track the reason for pings and then
only log the ping/pong traffic if it was for initial path discovery.
4 years ago
Brad Fitzpatrick 10ac066013 all: fix vet warnings 4 years ago
Brad Fitzpatrick d74c9aa95b wgengine/magicsock: update comment, fix earlier commit
891898525c had a continue that meant the didCopy synchronization never ran.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c976264bd1 wgengine/magicsock: gofmt 4 years ago
Dmytro Shynkevych f3e2b65637
wgengine/magicsock: time.Sleep -> time.After
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Dmytro Shynkevych 380ee76d00
wgengine/magicsock: make time.Sleep in runDerpReader respect cancellation.
Before this patch, the 250ms sleep would not be interrupted by context cancellation,
which would result in the goroutine sometimes lingering in tests (100ms grace period).

Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Dmytro Shynkevych 891898525c
wgengine/magicsock: make receive from didCopy respect cancellation.
Very rarely, cancellation occurs between a successful send on derpRecvCh
and a call to copyBuf on the receiving side.
Without this patch, this situation results in <-copyBuf blocking indefinitely.

Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Brad Fitzpatrick a2267aae99 wgengine: only launch pingers for peers predating the discovery protocol
Peers advertising a discovery key know how to speak the discovery
protocol and do their own heartbeats to get through NATs and keep NATs
open. No need for the pinger except for with legacy peers.
4 years ago
Dmytro Shynkevych 2f15894a10 wgengine/magicsock: wait for derphttp client goroutine to exit
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Brad Fitzpatrick 6c74065053 wgengine/magicsock, tstest/natlab: start hooking up natlab to magicsock
Also adds ephemeral port support to natlab.

Work in progress.

Pairing with @danderson.
4 years ago
Brad Fitzpatrick bd59bba8e6 wgengine/magicsock: stop discoEndpoint timers on Close
And add some defensive early returns on c.closed.
4 years ago
Brad Fitzpatrick de875a4d87 wgengine/magicsock: remove DisableSTUNForTesting 4 years ago
Brad Fitzpatrick 5c6d8e3053 netcheck, tailcfg, interfaces, magicsock: survey UPnP, NAT-PMP, PCP
Don't do anything with UPnP, NAT-PMP, PCP yet, but see how common they
are in the wild.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 6196b7e658 wgengine/magicsock: change API to not permit disco key changes
Generate the disco key ourselves and give out the public half instead.

Fixes #525
4 years ago
Brad Fitzpatrick 5132edacf7 wgengine/magicsock: fix data race from undocumented wireguard-go requirement
Endpoints need to be Stringers apparently.

Fixes tailscale/corp#422
4 years ago
Brad Fitzpatrick 630379a1d0 cmd/tailscale: add tailscale status region name, last write, consistently star
There's a lot of confusion around what tailscale status shows, so make it better:
show region names, last write time, and put stars around DERP too if active.

Now stars are always present if activity, and always somewhere.
4 years ago
Brad Fitzpatrick 9a8700b02a wgengine/magicsock: add discoEndpoint heartbeat
Updates #483
4 years ago
Brad Fitzpatrick 9f930ef2bf wgengine/magicsock: remove the discoEndpoint.timers map
It ended up being more complicated than it was worth.
4 years ago
Brad Fitzpatrick f5f3885b5b wgengine/magicsock: bunch of misc discovery path cleanups
* fix tailscale status for peers using discovery
* as part of that, pull out disco address selection into reusable
  and testable discoEndpoint.addrForSendLocked
* truncate ping/pong logged hex txids in half to eliminate noise
* move a bunch of random time constants into named constants
  with docs
* track a history of per-endpoint pong replies for future use &
  status display
* add "send" and " got" prefix to discovery message logging
  immediately before the frame type so it's easier to read than
  searching for the "<-" or "->" arrows earlier in the line; but keep
  those as the more reasily machine readable part for later.

Updates #483
4 years ago
Brad Fitzpatrick 6c70cf7222 wgengine/magicsock: stop ping timeout timer on pong receipt, misc log cleanup
Updates #483
4 years ago
Brad Fitzpatrick c52905abaa wgengine/magicsock: log less on no-op disco route switches
Also, renew trustBestAddrUntil even if latency isn't better.
4 years ago
Brad Fitzpatrick 0f0ed3dca0 wgengine/magicsock: clean up discovery logging
Updates #483
4 years ago
Brad Fitzpatrick 056fbee4ef wgengine/magicsock: add TS_DEBUG_OMIT_LOCAL_ADDRS knob to force STUN use only
For debugging.
4 years ago
Brad Fitzpatrick e03cc2ef57 wgengine/magicsock: populate discoOfAddr upon receiving ping frames
Updates #483
4 years ago
Brad Fitzpatrick 275a20f817 wgengine/magicsock: keep discoOfAddr populated, use it for findEndpoint
Update the mapping from ip:port to discokey, so when we retrieve a
packet from the network, we can find the same conn.Endpoint that we
gave to wireguard-go previously, without making it think we've
roamed. (We did, but we're not using its roaming.)

Updates #483
4 years ago
Brad Fitzpatrick 77e89c4a72 wgengine/magicsock: handle CallMeMaybe discovery mesages
Roughly feature complete now. Testing and polish remains.

Updates #483
4 years ago
Brad Fitzpatrick 710ee88e94 wgengine/magicsock: add timeout on discovery pings, clean up state
Updates #483
4 years ago
Brad Fitzpatrick 77d3ef36f4 wgengine/magicsock: hook up discovery messages, upgrade to LAN works
Ping messages now go out somewhat regularly, pong replies are sent,
and pong replies are now partially handled enough to upgrade off DERP
to LAN.

CallMeMaybe packets are sent & received over DERP, but aren't yet
handled. That's next (and regular maintenance timers), and then WAN
should work.

Updates #483
4 years ago
Brad Fitzpatrick 9b8ca219a1 wgengine/magicsock: remove allocs in UDP write, use new netaddr.PutUDPAddr
The allocs were only introduced yesterday with a TODO. Now they're gone again.
4 years ago
Brad Fitzpatrick 7b3c0bb7f6 wgengine/magicsock: fix crash reading DERP packet
Starting at yesterday's e96f22e560 (convering some UDPAddrs to
IPPorts), Conn.ReceiveIPv4 could return a nil addr, which would make
its way through wireguard-go and blow up later. The DERP read path
wasn't initializing the addr result parameter any more, and wgRecvAddr
wasn't checking it either.

Fixes #515
4 years ago
Brad Fitzpatrick 47b4a19786 wgengine/magicsock: use netaddr.ParseIPPort instead of net.ResolveUDPAddr 4 years ago
Brad Fitzpatrick f7124c7f06 wgengine/magicsock: start of discoEndpoint state tracking
Updates #483
4 years ago
Brad Fitzpatrick 92252b0988 wgengine/magicsock: add a little LRU cache for netaddr.IPPort lookups
And while plumbing, a bit of discovery work I'll need: the
endpointOfAddr map to map from validated paths to the discoEndpoint.
Not being populated yet.

Updates #483
4 years ago
Brad Fitzpatrick 2d6e84e19e net/netcheck, wgengine/magicsock: replace more UDPAddr with netaddr.IPPort 4 years ago
Brad Fitzpatrick 9070aacdee wgengine/magicsock: minor comments & logging & TODO changes 4 years ago
Brad Fitzpatrick e96f22e560 wgengine/magicsock: start handling disco message, use netaddr.IPPort more
Updates #483
4 years ago
Brad Fitzpatrick a83ca9e734 wgengine/magicsock: cache precomputed nacl/box shared keys
Updates #483
4 years ago
Brad Fitzpatrick a975e86bb8 wgengine/magicsock: add new endpoint type used for discovery-supporting peers
This adds a new magicsock endpoint type only used when both sides
support discovery (that is, are advertising a discovery
key). Otherwise the old code is used.

So far the new code only communicates over DERP as proof that the new
code paths are wired up. None of the actually discovery messaging is
implemented yet.

Support for discovery (generating and advertising a key) are still
behind an environment variable for now.

Updates #483
4 years ago
Brad Fitzpatrick 103c06cc68 wgengine/magicsock: open discovery naclbox messages from known peers
And track known peers.

Doesn't yet do anything with the messages. (nor does it send any yet)

Start of docs on the message format. More will come in subsequent changes.

Updates #483
4 years ago
Brad Fitzpatrick 23e74a0f7a wgengine, magicsock, tstun: don't regularly STUN when idle (mobile only for now)
If there's been 5 minutes of inactivity, stop doing STUN lookups. That
means NAT mappings will expire, but they can resume later when there's
activity again.

We'll do this for all platforms later.

Updates tailscale/corp#320

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick fe50cd0c48 ipn, wgengine: plumb NetworkMap down to magicsock
Now we can have magicsock make decisions based on tailcfg.Debug
settings sent by the server.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 53fb25fc2f all: generate discovery key, plumb it around
Not actually used yet.

Updates #483
4 years ago
Brad Fitzpatrick abd79ea368 derp: reduce DERP memory use; don't require callers to pass in memory to use
The magicsock derpReader was holding onto 65KB for each DERP
connection forever, just in case.

Make the derp{,http}.Client be in charge of memory instead. It can
reuse its bufio.Reader buffer space.
4 years ago
Brad Fitzpatrick 280e8884dd wgengine/magicsock: limit redundant log spam on packets from low-pri addresses
Fixes #407

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 9e5d79e2f1 wgengine/magicsock: drop a bytes.Buffer sync.Pool, use logger.ArgWriter instead 4 years ago
Brad Fitzpatrick db2a216561 wgengine/magicsock: don't log on UDP send errors if address family known missing
Fixes #376
5 years ago
Brad Fitzpatrick 9e3ad4f79f net/netns: add package for start of network namespace support
And plumb in netcheck STUN packets.

TODO: derphttp, logs, control.

Updates #144

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick a428656280 wgengine/magicsock: don't report v4 localhost addresses on IPv6-only systems
Updates #376
5 years ago
Avery Pennarun 30e5c19214 magicsock: work around race condition initializing .Regions[].
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
5 years ago
Brad Fitzpatrick b0c10fa610 stun, netcheck: move under net 5 years ago
Brad Fitzpatrick e6b84f2159 all: make client use server-provided DERP map, add DERP region support
Instead of hard-coding the DERP map (except for cmd/tailscale netcheck
for now), get it from the control server at runtime.

And make the DERP map support multiple nodes per region with clients
picking the first one that's available. (The server will balance the
order presented to clients for load balancing)

This deletes the stunner package, merging it into the netcheck package
instead, to minimize all the config hooks that would've been
required.

Also fix some test flakes & races.

Fixes #387 (Don't hard-code the DERP map)
Updates #388 (Add DERP region support)
Fixes #399 (wgengine: flaky tests)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick e6d0c92b1d wgengine/magicsock: clean up earlier fix a bit
Move WaitReady from fc88e34f42 into the
test code, and keep the derp-reading goroutine named for debugging.
5 years ago
Avery Pennarun fc88e34f42 wgengine/magicsock/tests: wait for home DERP connection before sending packets.
This fixes an elusive test flake. Fixes #161.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
5 years ago
Avery Pennarun 08acb502e5 Add tstest.PanicOnLog(), and fix various problems detected by this.
If a test calls log.Printf, 'go test' horrifyingly rearranges the
output to no longer be in chronological order, which makes debugging
virtually impossible. Let's stop that from happening by making
log.Printf panic if called from any module, no matter how deep, during
tests.

This required us to change the default error handler in at least one
http.Server, as well as plumbing a bunch of logf functions around,
especially in magicsock and wgengine, but also in logtail and backoff.

To add insult to injury, 'go test' also rearranges the output when a
parent test has multiple sub-tests (all the sub-test's t.Logf is always
printed after all the parent tests t.Logf), so we need to screw around
with a special Logf that can point at the "current" t (current_t.Logf)
in some places. Probably our entire way of using subtests is wrong,
since 'go test' would probably like to run them all in parallel if you
called t.Parallel(), but it definitely can't because the're all
manipulating the shared state created by the parent test. They should
probably all be separate toplevel tests instead, with common
setup/teardown logic. But that's a job for another time.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
5 years ago
Brad Fitzpatrick fefd7e10dc types/structs: add structs.Incomparable annotation, use it where applicable
Shotizam before and output queries:

sqlite> select sum(size) from bin where func like 'type..%';
129067
=>
120216
5 years ago
Brad Fitzpatrick e1526b796e ipn: don't listen on the unspecified address in test
To avoid the Mac firewall dialog of (test) death.

See 4521a59f30
which I added to help debug this.
5 years ago
Brad Fitzpatrick 18017f7630 ipn, wgengine/magicsock: be more idle when in Stopped state with no peers
(Previously as #288, but with some more.)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Anderson 9669b85b41 wgengine/magicsock: wait for endpoint updater goroutine when closing.
Fixes #204.

Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
Brad Fitzpatrick 268d331cb5 wgengine/magicsock: prune key.Public-keyed on peer removals
Fixes #215
5 years ago
Brad Fitzpatrick 00d053e25a wgengine/magicsock: fix slow memory leak as peer endpoints move around
Updates #215
5 years ago
Brad Fitzpatrick 7fc97c5493 wgengine/magicsock: use netaddr more
In prep for deleting from the ever-growing maps.
5 years ago
Brad Fitzpatrick 6fb30ff543 wgengine/magicsock: start using inet.af/netaddr a bit 5 years ago
Brad Fitzpatrick a7e7c7b548 wgengine/magicsock: close derp connections on rebind
Fixes #276

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 614261d00d wgengine/magicsock: reset AddrSet states on Rebind
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Blake Gentry e19287f60f wgengine/magicsock: fix Conn docs type reference
The docs on magicsock.Conn stated that they implemented the
wireguard/device.Bind interface, yet this type does not exist. In
reality, the Conn type implements the wireguard/conn.Bind interface.

I also fixed a small typo in the same file.

Signed-off-by: Blake Gentry <blakesgentry@gmail.com>
5 years ago
Brad Fitzpatrick 322499473e cmd/tailscaled, wgengine, ipn: add /debug/ipn handler with world state
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 2d48f92a82 wgengine/magicsock: re-stun every [20,27] sec, not 28
28 is cutting it close, and we think jitter will help some spikes
we're seeing.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 577f321c38 wgengine/magicsock: revise derp fallback logic
Revision to earlier 6284454ae5

Don't be sticky if we have no peers.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 6284454ae5 wgengine/magicsock: if UDP blocked, pick DERP where most peers are
Updates #207

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick d321190578 wgengine/magicsock: stringify [IPv6]:port normally in AddrSet.String 5 years ago
Brad Fitzpatrick 3c3ea8bc8a wgengine/magicsock: finish IPv6 transport support
DEBUG_INCLUDE_IPV6=1 is still required, but works now.

Updates #18 (fixes it, once env var gate is removed)
5 years ago
Brad Fitzpatrick 82ed7e527e wgengine/magicsock: remove log allocation
This was the whole point but I goofed at the last line.
5 years ago
Brad Fitzpatrick 8454bbbda5 wgengine/magicsock: more logging improvements
* remove endpoint discovery noise when results unchanged
* consistently spell derp nodes as "derp-N"
* replace "127.3.3.40:" with "derp-" in CreateEndpoint log output
* stop early DERP setup before SetPrivateKey is called;
  it just generates log nosie
* fix stringification of peer ShortStrings (it had an old %x on it,
  rendering it garbage)
* describe why derp routes are changing, with one of:
  shared home, their home, our home, alt
5 years ago
Brad Fitzpatrick 680311b3df wgengine/magicsock: fix few remaining logs without package prefix 5 years ago
Brad Fitzpatrick c473927558 wgengine/magicsock: clean up, add, improve DERP logs 5 years ago
Brad Fitzpatrick ea9310403d wgengine/magicsock: re-STUN on DERP connection death
Fixes #201
5 years ago
Brad Fitzpatrick 1ab5b31c4b derp, magicsock: send new "peer gone" frames when previous sender disconnects
Updates #150 (not yet enabled by default in magicsock)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick b6f77cc48d wgengine/magicsock: return early, outdent in derpWriteChanOfAddr 5 years ago
Brad Fitzpatrick dd31285ad4 wgengine/magicsock: send IPv6 using pconn6, if available
In prep for IPv6 support. Nothing should make it this far yet.
5 years ago
Brad Fitzpatrick af277a6762 controlclient, magicsock: add debug knob to request IPv6 endpoints
Add opt-in method to request IPv6 endpoints from the control plane.
For now they should just be skipped. A previous version of this CL was
unconditional and reportedly had problems that I can't reproduce. So
make it a knob until the mystery is solved.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 221e7d7767 wgengine/magicsock: make log message include DERP port (node) 5 years ago
Brad Fitzpatrick 33bdcabf03 wgengine/magicsock: call stun callback w/ only valid part of STUN packet 5 years ago
David Anderson 0be475ba46 Revert "tailcfg, controlclient, magicsock: request IPv6 endpoints, but ignore them"
Breaks something deep in wireguard or magicsock's brainstem, no packets at all
can flow. All received packets fail decryption with "invalid mac1".

This reverts commit 94024355ed.

Signed-off-by: David Anderson <dave@natulte.net>
5 years ago
Brad Fitzpatrick 94024355ed tailcfg, controlclient, magicsock: request IPv6 endpoints, but ignore them
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 60ea635c6d wgengine/magicsock: delete inaccurate comment
I meant to include this in the earlier commit.
5 years ago
Brad Fitzpatrick a184e05290 wgengine/magicsock: listen on udp6, use it for STUN, report endpoint
More steps towards IPv6 transport.

We now send it to tailcontrol, which ignores it.

But it doesn't actually actually support IPv6 yet (outside of STUN).

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 7caa288213 wgengine/magicsock: rename pconn field to pconn4, in prep for pconn6 5 years ago
David Crawshaw addbdce296 wgengine, ipn: include number of active DERPs in status
Use this when making the ipn state transition from Starting to
Running. This way a network of quiet nodes with no active
handshaking will still transition to Active.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
David Crawshaw 1ad78ce698 magicsock: reconnect to home DERP on key change
Typically the home DERP server is found and set on startup before
magicsock's SetPrivateKey can be called, so no DERP connection is
established. Make sure one is by kicking the home DERP tires in
SetPrivateKey.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
David Crawshaw 455ba751d9 magicsock: start connection to HOME derp immediately
The code as written intended to do this, but it repeated the
comparison of derpNum and c.myDerp after c.myDerp had been
updated, so it never executed.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
David Anderson 315a5e5355 scripts: add a license header checker.
Signed-off-by: David Anderson <dave@natulte.net>
5 years ago
Brad Fitzpatrick db2436c7ff wgengine/magicsock: don't interrupt endpoint updates, merge all mutex into one
Before, endpoint updates were constantly being interrupted and resumed
on Linux due to tons of LinkChange messages from over-zealous Linux
netlink messages (from router_linux.go)

Now that endpoint updates are fast and bounded in time anyway, just
let them run to completion, but note that another needs to be
scheduled after.

Now logs went from pages of noise to just:

root@taildoc:~# grep -i -E 'stun|endpoint update' log
2020/03/13 08:51:29 magicsock.Conn: starting endpoint update (initial)
2020/03/13 08:51:30 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")
2020/03/13 08:51:31 magicsock.Conn: starting endpoint update (link-change-minor)
2020/03/13 08:51:31 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")
2020/03/13 08:51:33 magicsock.Conn: starting endpoint update (link-change-minor)
2020/03/13 08:51:33 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")
2020/03/13 08:51:35 magicsock.Conn: starting endpoint update (link-change-minor)
2020/03/13 08:51:35 magicsock.Conn.ReSTUN: endpoint update active, need another later ("link-change-minor")

Or, seen in another run:

2020/03/13 08:45:41 magicsock.Conn: starting endpoint update (periodic)
2020/03/13 08:46:09 magicsock.Conn: starting endpoint update (periodic)
2020/03/13 08:46:21 magicsock.Conn: starting endpoint update (link-change-major)
2020/03/13 08:46:37 magicsock.Conn: starting endpoint update (periodic)
2020/03/13 08:47:05 magicsock.Conn: starting endpoint update (periodic)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick db31550854 wgengine: don't Reconfig on boring link changes
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick bc73dcf204 wgengine/magicsock: don't block in Send waiting for derphttp.Send
Fixes #137
Updates #109
Updates #162
Updates #163

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 8807913be9 wgengine/magicsock: wait for previous DERP goroutines to end before new ones
Updates #109 (hopefully fixes, will wait for graphs to be happy)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick eff6dcdb4e wgengine/magicsock: log more about why we're re-STUNing 5 years ago
Brad Fitzpatrick b3ddf51a15 wgengine/magicsock: add a pointer value for logging
Updates #109

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Crawshaw 7ec54e0064 wgengine/magicsock: remove TODO
The TODO above derphttp.NewClient suggests it does network I/O,
but the derphttp client connects lazily and so creating one is
very cheap.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
Brad Fitzpatrick 01b4bec33f stunner: re-do how Stunner works
It used to make assumptions based on having Anycast IPs that are super
near. Now we're intentionally going to a bunch of different distant
IPs to measure latency.

Also, optimize how the hairpin detection works. No need to STUN on
that socket. Just use that separate socket for sending, once we know
the other UDP4 socket's endpoint. The trick is: make our test probe
also a STUN packet, so it fits through magicsock's existing STUN
routing.

This drops netcheck from ~5 seconds to ~250-500ms.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Anderson 86baf60bd4 wgengine/magicsock: synchronize epUpdate cleanup on shutdown.
Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
Brad Fitzpatrick 023df9239e Move linkstate boring change filtering to magicsock
So we can at least re-STUN on boring updates.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick a265d7cbff wgengine/magicsock: in STUN-disabled test mode, let endpoint discovery proceed 5 years ago
Brad Fitzpatrick 5c1e443d34 wgengine/monitor: don't call LinkChange when interfaces look unchanged
Basically, don't trust the OS-level link monitor to only tell you
interesting things. Sanity check it.

Also, move the interfaces package into the net directory now that we
have it.
5 years ago
Brad Fitzpatrick 39c0ae1dba derp/derpmap: new DERP config package, merge netcheck into magicsock more
Fixes #153
Updates #162
Updates #163

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 4800926006 wgengine/magicsock: add AddrSet appendDests+UpdateDst tests 5 years ago
David Anderson bb93d7aaba wgengine/magicsock: plumb logf throughout, and expose in Options.
Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
Brad Fitzpatrick f42b9b6c9a wgengine/magicsock: don't discard UDP packet on UDP+DERP race
Fixes #155

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Anderson f265603110 wgengine/magicsock: fix data race in ReceiveIPv4.
The UDP reader goroutine was clobbering `n` and `err` from the
main goroutine, whose accesses are not synchronized the way `b` is.

Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson 77354d4617 wgengine/magicsock: unblock wireguard-go's read on magicsock shutdown.
wireguard-go closes magicsock, and expects this to unblock reads
so that its internal goroutines can wind down. We were incorrectly
blocking the read indefinitey and breaking this contract.

Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson fdee5fb639 wgengine/magicsock: don't mutexly reach inside Conn to tweak DERP settings.
Signed-off-by: David Anderson <danderson@tailscale.com>
5 years ago
Brad Fitzpatrick 61d83f759b wgengine/magicsock: remove redundant derpMagicIP comparison
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Anderson ef31dd7bb5 wgengine/magicsock: check all 3 fast paths independently.
The previous code would skip the DERP short-circuit if roamAddr
was set, which is not what we wanted. More generally, hitting
any of the fast path conditions is a direct return, so we can
just have 3 standalone branches rather than 'else if' stuff.

Signed-Off-By: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson 05a52746a4 wgengine/magicsock: fix destination selection logic to work with DERP.
The effect is subtle: when we're not spraying packets, and have not yet
figured out a curAddr, and we're not spraying, we end up sending to
whatever the first IP is in the iteration order. In English, that
means "when we have no idea where to send packets, and we've given
up on sending to everyone, just send to the first addr we see in
the list."

This is, in general, what we want, because the addrs are in sorted
preference order, low to high, and DERP is the least preferred
destination. So, when we have no idea where to send, send to DERP,
right?

... Except for very historical reasons, appendDests iterated through
addresses in _reverse_ order, most preferred to least preferred.
crawshaw@ believes this was part of the earliest handshaking
algorithm magicsock had, where it slowly iterated through possible
destinations and poked handshakes to them one at a time.

Anyway, because of this historical reverse iteration, in the case
described above of "we have no idea where to send", the code would
end up sending to the _most_ preferred candidate address, rather
than the _least_ preferred. So when in doubt, we'd end up firing
packets into the blackhole of some LAN address that doesn't work,
and connectivity would not work.

This case only comes up if all your non-DERP connectivity options
have failed, so we more or less failed to detect it because we
didn't have a pathological test box deployed. Worse, codependent
bug 2839854994 made DERP accidentally
work sometimes anyway by incorrectly exploiting roamAddr behavior,
albeit at the cost of making DERP traffic symmetric. In fixing
DERP to once again be asymmetric, we effectively removed the
bandaid that was concealing this bug.

Signed-Off-By: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson 97e58ad44d wgengine/magicsock: only set addrByKey once in CreateEndpoint.
Signed-Off-By: David Anderson <danderson@tailscale.com>
5 years ago
Brad Fitzpatrick fbab12c94c wgengine/magicsock: skip netcheck if external STUN aren't in use
Updates #146 (not a complete fix yet probably)
5 years ago
Brad Fitzpatrick fe0051fafd wgengine/magicsock: expand AddrSet.addrs comment 5 years ago
David Anderson 2839854994 wgengine/magicsock: never set a DERP server as a roamAddr.
DERP traffic is asymmetric by design, with nodes always sending
to their peer's home DERP server. However, if roamAddr is set,
magicsock will always push data there, rather than let DERP
server selection do its thing, so we end up accidentally
creating a symmetric flow.

Signed-Off-By: David Anderson <danderson@tailscale.com>
5 years ago
David Anderson 4f5c0da1ae wgengine/magicsock: log when home DERP server changes. 5 years ago
Brad Fitzpatrick 6978b93bdd derp, magicsock: track home (preferred) vs visiting connections for stats 5 years ago
Brad Fitzpatrick 12b77f30ad wgengine/magicsock: close stale DERP connections 5 years ago
Brad Fitzpatrick 2cff9016e4 net/dnscache: add overly simplistic DNS cache package for selective use
I started to write a full DNS caching resolver and I realized it was
overkill and wouldn't work on Windows even in Go 1.14 yet, so I'm
doing this tiny one instead for now, just for all our netcheck STUN
derp lookups, and connections to DERP servers. (This will be caching a
exactly 8 DNS entries, all ours.)

Fixes #145 (can be better later, of course)
5 years ago
Brad Fitzpatrick a36ccb8525 wgengine/magicsock: actually add to the activeDerp map
Fixes bug just introduced in 8f9849c140; not tested enough :(
5 years ago
Brad Fitzpatrick 8f9849c140 wgengine/magicsock: collapse three DERP maps down into one 5 years ago
Brad Fitzpatrick 40ebba1373 magicsock: use [unexpected] convention more
Fixes #136 (not entirely, but we have a convention now)
5 years ago
Brad Fitzpatrick 848a2bddf0 wgengine/magicsock: update set of DERP nodes 5 years ago
David Crawshaw 7932481b95 magicsock: lookup AddrSet by key from DERP
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
Brad Fitzpatrick eac62ec5ff ipn, wgengine/magicsock: add ipn.Prefs.DisableDERP bool
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick bf704a5218 derp: protocol negotiation, add v2: send src pub keys to clients in packets
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Crawshaw a33419167b magicsock: plumb through derpTLSConfig variable (for testing)
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
David Crawshaw 34859f8e7d wgengine, magicsock: add a CreateBind method
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
Brad Fitzpatrick b27d4c017a magicsock, wgengine, ipn, controlclient: plumb regular netchecks to map poll
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 724c37fb41 wgengine/magicsock: start tracking nearest DERP node 5 years ago
Brad Fitzpatrick 4675c70464 wgengine/magicsock: check STUN regularly 5 years ago
Brad Fitzpatrick bc7bc43fb8 magicsock, interfaces: move some code from magicsock to interfaces 5 years ago
Brad Fitzpatrick af7a01d6f0 wgengine/magicsock: drop donec channel, rename epUpdateCtx to serve its purpose 5 years ago
David Crawshaw cc4afa775f magicsock: rate limit send error log messages
The x/time/rate dependency adds 24kb to tailscaled binary size.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
David Crawshaw 0752c77dc2 magicsock: keep DERP magic IPs out of the address map
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
David Crawshaw c6550135d5 magicsock: remove the index from indexedAddrs
The value predates the introduction of AddrSet which replaces
the index by tracking curAddr directly.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
Brad Fitzpatrick 1abf2da392 wgengine/magicsock: reset favorite address on handshakes
Updates #92 (not a complete fix; could be better/faster?)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 67ede8d6d2 wgengine, magicsock: fix SetPrivateKey data race
Updates #112

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick c185e6b4b0 stunner: support IPv6, add latency info to callbacks, use unique TxIDs per retry
And some more docs.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Crawshaw a6ad3c46e2 magicsock: spray some normal packets after a handshake
In particular, this is designed to catch the case where a
HandshakeInitiation packet is sent out but the intermediate NATs
have not been primed, so the packet passes over DERP.
In that case, the HandshakeResponse also comes back over DERP,
and the connection proceeds via DERP without ever trying to punch
through the NAT.

With this change, the HandshakeResponse (which was sprayed out
and so primed one NAT) triggers an UpdateDst, which triggers
the extra spray logic.

(For this to work, there has to be an initial supply of packets
to send on to a peer for the three seconds following a handshake.
The source of these packets is left as a future exercise.)

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
Brad Fitzpatrick 8696b17b5f wgengine/magicsock: turn off DERP log spamminess by default
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 7a3b91390b wgengine/magicsock: fix crash in Send when Endpoint isn't an AddrSet
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
David Crawshaw 868cfae84f wgengine, magicsock: adjust for wireguard-go conn/device package split
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
5 years ago
Brad Fitzpatrick cc7b9b0dff control/controlclient: fix priority of DERP server, add comment 5 years ago
Brad Fitzpatrick c02f4b5a1f control/controlclient: add temporary mechanism to force derp on
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago
Brad Fitzpatrick 525bf1f3d2 wgengine/magicsock: remember fixed port number preference
So LinkChange events rebind to the same port when possible.
5 years ago
Brad Fitzpatrick 379a3125fd derp, wgengine/magicsock: support more than just packets from Client.Recv
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 years ago