Commit Graph

191 Commits (22024a38c3ef6a1aa656e664ec0dfa7b790fd9fe)

Author SHA1 Message Date
Brad Fitzpatrick 105a820622 wgengine/magicsock: skip an endpoint update at start-up
At startup the client doesn't yet have the DERP map so can't do STUN
queries against DERP servers, so it only knows it local interface
addresses, not its STUN-mapped addresses.

We were reporting the interface-local addresses to control, getting
the DERP map, and then immediately reporting the full set of
updates. That was an extra HTTP request to control, but worse: it was
an extra broadcast from control out to all the peers in the network.

Now, skip the initial update if there are no stun results and we don't
have a DERP map.

More work remains optimizing start-up requests/map updates, but this
is a start.

Updates tailscale/corp#557
4 years ago
Brad Fitzpatrick 2076a50862 wgengine/magicsock: finish a comment sentence that ended prematurely 4 years ago
Brad Fitzpatrick 3e4c46259d wgengine/magicsock: don't do netchecks either when network is down
A continuation of 6ee219a25d

Updates #640
4 years ago
Brad Fitzpatrick 6ee219a25d ipn, wgengine, magicsock, tsdns: be quieter and less aggressive when offline
If no interfaces are up, calm down and stop spamming so much. It was
noticed as especially bad on Windows, but probably was bad
everywhere. I just have the best network conditions testing on a
Windows VM.

Updates #604
4 years ago
Christina Wen 48fbe93e72
wgengine/magicsock: clarify pre-disco 'tailscale ping' error message
This change clarifies the error message when a user pings a peer that is using an outdated version of Tailscale.
4 years ago
Josh Bleecher Snyder 0c0239242c wgengine/magicsock: make discoPingPurpose a stringer
It was useful for debugging once, it'll probably be useful again.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
4 years ago
Josh Bleecher Snyder 57e642648f wgengine/magicsock: fix typo in comment 4 years ago
Brad Fitzpatrick 756d6a72bd wgengine: lazily create peer wireguard configs more explicitly
Rather than consider bigs jumps in last-received-from activity as a
signal to possibly reconfigure the set of wireguard peers to have
configured, instead just track the set of peers that are currently
excluded from the configuration. Easier to reason about.

Also adds a bit more logging.

This might fix an error we saw on a machine running a recent unstable
build:

2020-08-26 17:54:11.528033751 +0000 UTC: 8.6M/92.6M magicsock: [unexpected] lazy endpoint not created for [UcppE], d:42a770f678357249
2020-08-26 17:54:13.691305296 +0000 UTC: 8.7M/92.6M magicsock: DERP packet received from idle peer [UcppE]; created=false
2020-08-26 17:54:13.691383687 +0000 UTC: 8.7M/92.6M magicsock: DERP packet from unknown key: [UcppE]

If it does happen again, though, we'll have more logs.
4 years ago
halulu f27a57911b
cmd/tailscale: add derp and endpoints status (#703)
cmd/tailscale: add local node's information to status output (by default)

RELNOTE=yes

Updates #477

Signed-off-by: Halulu <lzjluzijie@gmail.com>
4 years ago
David Crawshaw dd2c61a519 magicsock: call RequestStatus when DERP connects
Second attempt.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
4 years ago
David Crawshaw a67b174da1 Revert "magicsock: call RequestStatus when DERP connects"
Seems to break linux CI builder. Cannot reproduce locally,
so attempting a rollback.

This reverts commit cd7bc02ab1.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
4 years ago
David Crawshaw cd7bc02ab1 magicsock: call RequestStatus when DERP connects
Without this, a freshly started ipn client will be stuck in the
"Starting" state until something triggers a call to RequestStatus.
Usually a UI does this, but until then we can sit in this state
until poked by an external event, as is evidenced by our e2e tests
locking up when DERP is attached.

(This only recently became a problem when we enabled lazy handshaking
everywhere, otherwise the wireugard tunnel creation would also
trigger a RequestStatus.)

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
4 years ago
Brad Fitzpatrick f6dc47efe4 tailcfg, controlclient, magicsock: add control feature flag to enable DRPO
Updates #150
4 years ago
Brad Fitzpatrick 85c3d17b3c wgengine/magicsock: use disco ping src as a candidate endpoint
Consider:

   Hard NAT (A) <---> Hard NAT w/ mapped port (B)

If A sends a packet to B's mapped port, A can disco ping B directly,
with low latency, without DERP.

But B couldn't establish a path back to A and needed to use DERP,
despite already logging about A's endpoint and adding a mapping to it
for other purposes (the wireguard conn.Endpoint lookup also needed
it).

This adds the tracking to discoEndpoint too so it'll be used for
finding a path back.

Fixes tailscale/corp#556

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 0512fd89a1 wgengine/magicsock: simplify handlePingLocked
It's no longer true that 'de may be nil'
4 years ago
Brad Fitzpatrick 84dc891843 cmd/tailscale/cli: add ping subcommand
For example:

$ tailscale ping -h
USAGE
  ping <hostname-or-IP>

FLAGS
  -c 10                   max number of pings to send
  -stop-once-direct true  stop once a direct path is established
  -verbose false          verbose output

$ tailscale ping mon.ts.tailscale.com
pong from monitoring (100.88.178.64) via DERP(sfo) in 65ms
pong from monitoring (100.88.178.64) via DERP(sfo) in 252ms
pong from monitoring (100.88.178.64) via [2604:a880:2:d1::36:d001]:41641 in 33ms

Fixes #661

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 9a346fd8b4 wgengine,magicsock: fix two lazy wireguard config issues
1) we weren't waking up a discoEndpoint that once existed and
   went idle for 5 minutes and then got a disco message again.

2) userspaceEngine.noteReceiveActivity had a buggy check; fixed
   and added a test
4 years ago
Brad Fitzpatrick cff737786e wgengine/magicsock: fix lazy config deadlock, document more lock ordering
This removes the atomic bool that tried to track whether we needed to acquire
the lock on a future recursive call back into magicsock. Unfortunately that
hack doesn't work because we also had a lock ordering issue between magicsock
and userspaceEngine (see issue). This documents that too.

Fixes #644
4 years ago
Brad Fitzpatrick 2bd9ad4b40 wgengine: fix deadlock between engine and magicsock 4 years ago
Brad Fitzpatrick 7c38db0c97 wgengine/magicsock: don't deadlock on pre-disco Endpoints w/ lazy wireguard configs
Fixes tailscale/tailscale#637
4 years ago
Brad Fitzpatrick 4987a7d46c wgengine/magicsock: when hard NAT, add stun-ipv4:static-port as candidate
If a node is behind a hard NAT and is using an explicit local port
number, assume they might've mapped a port and add their public IPv4
address with the local tailscaled's port number as a candidate endpoint.
4 years ago
Brad Fitzpatrick bfcb0aa0be wgengine/magicsock: deflake tests, Close deadlock again
Better fix than 37903a9056

Fixes tailscale/corp#533
4 years ago
Brad Fitzpatrick cb970539a6 wgengine/magicsock: remove TODO comment that's no longer applicable 4 years ago
Brad Fitzpatrick 915f65ddae wgengine/magicsock: stop disco activity on IPN stop
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c180abd7cf wgengine/magicsock: merge errClosed and errConnClosed 4 years ago
Brad Fitzpatrick d55fdd4669 wgengine/magicsock: update, flesh out a TODO 4 years ago
Brad Fitzpatrick 58b721f374 wgengine/magicsock: deflake some tests with an ugly hack
Starting with fe68841dc7, some e2e tests
got flaky. Rather than debug them (they're gnarly), just revert to the old
behavior as far as those tests are concerned. The tests were somehow
using magicsock without a private key and expecting it to do ... something.

My goal with fe68841dc7 was to stop log spam
and unnecessary work I saw on the iOS app when when stopping the app.

Instead, only stop doing that work on any transition from
once-had-a-private-key to no-longer-have-a-private-key. That fixes
what I wanted to fix while still making the mysterious e2e tests
happy.
4 years ago
David Anderson 0249236cc0 ipn/ipnstate: record assigned Tailscale IPs.
wgengine/magicsock: use ipnstate to find assigned Tailscale IPs.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson f582eeabd1 wgengine/magicsock: add a test for active path discovery.
Uses natlab only, because the point of this active discovery test is going to be
that it should get through a lot of obstacles.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick 37903a9056 wgengine/magicsock: fix occasional deadlock on Conn.Close on c.derpStarted
The deadlock was:

* Conn.Close was called, which acquired c.mu
* Then this goroutine scheduled:

    if firstDerp {
        startGate = c.derpStarted
        go func() {
            dc.Connect(ctx)
            close(c.derpStarted)
        }()
    }

* The getRegion hook for that derphttp.Client then ran, which also
  tries to acquire c.mu.

This change makes that hook first see if we're already in a closing
state and then it can pretend that region doesn't exist.
4 years ago
Brad Fitzpatrick fe68841dc7 wgengine/magicsock: log better with less spam on transition to stopped state
Required a minor test update too, which now needs a private key to get far
enough to test the thing being tested.
4 years ago
Brad Fitzpatrick e298327ba8 wgengine/magicsock: remove overkill, slow reflect.DeepEqual of NetworkMap
No need to allocate or compare all the fields we don't care about.
4 years ago
Brad Fitzpatrick 16a9cfe2f4 wgengine: configure wireguard peers lazily, as needed
wireguard-go uses 3 goroutines per peer (with reasonably large stacks
& buffers).

Rather than tell wireguard-go about all our peers, only tell it about
peers we're actively communicating with. That means we need hooks into
magicsock's packet receiving path and tstun's packet sending path to
lazily create a wireguard peer on demand from the network map.

This frees up lots of memory for iOS (where we have almost nothing
left for larger domains with many users).

We should ideally do this in wireguard-go itself one day, but that'd
be a pretty big change.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 5066b824a6 wgengine/magicsock: don't log about disco ping timeouts if we have a working address
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c06d2a8513 wgengine/magicsock: fix typo in comment 4 years ago
Brad Fitzpatrick bf195cd3d8 wgengine/magicsock: reduce log verbosity of discovery messages
Don't log heartbeat pings & pongs. Track the reason for pings and then
only log the ping/pong traffic if it was for initial path discovery.
4 years ago
Brad Fitzpatrick 10ac066013 all: fix vet warnings 4 years ago
Brad Fitzpatrick d74c9aa95b wgengine/magicsock: update comment, fix earlier commit
891898525c had a continue that meant the didCopy synchronization never ran.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c976264bd1 wgengine/magicsock: gofmt 4 years ago
Dmytro Shynkevych f3e2b65637
wgengine/magicsock: time.Sleep -> time.After
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Dmytro Shynkevych 380ee76d00
wgengine/magicsock: make time.Sleep in runDerpReader respect cancellation.
Before this patch, the 250ms sleep would not be interrupted by context cancellation,
which would result in the goroutine sometimes lingering in tests (100ms grace period).

Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Dmytro Shynkevych 891898525c
wgengine/magicsock: make receive from didCopy respect cancellation.
Very rarely, cancellation occurs between a successful send on derpRecvCh
and a call to copyBuf on the receiving side.
Without this patch, this situation results in <-copyBuf blocking indefinitely.

Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Brad Fitzpatrick a2267aae99 wgengine: only launch pingers for peers predating the discovery protocol
Peers advertising a discovery key know how to speak the discovery
protocol and do their own heartbeats to get through NATs and keep NATs
open. No need for the pinger except for with legacy peers.
4 years ago
Dmytro Shynkevych 2f15894a10 wgengine/magicsock: wait for derphttp client goroutine to exit
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Brad Fitzpatrick 6c74065053 wgengine/magicsock, tstest/natlab: start hooking up natlab to magicsock
Also adds ephemeral port support to natlab.

Work in progress.

Pairing with @danderson.
4 years ago
Brad Fitzpatrick bd59bba8e6 wgengine/magicsock: stop discoEndpoint timers on Close
And add some defensive early returns on c.closed.
4 years ago
Brad Fitzpatrick de875a4d87 wgengine/magicsock: remove DisableSTUNForTesting 4 years ago
Brad Fitzpatrick 5c6d8e3053 netcheck, tailcfg, interfaces, magicsock: survey UPnP, NAT-PMP, PCP
Don't do anything with UPnP, NAT-PMP, PCP yet, but see how common they
are in the wild.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 6196b7e658 wgengine/magicsock: change API to not permit disco key changes
Generate the disco key ourselves and give out the public half instead.

Fixes #525
4 years ago
Brad Fitzpatrick 5132edacf7 wgengine/magicsock: fix data race from undocumented wireguard-go requirement
Endpoints need to be Stringers apparently.

Fixes tailscale/corp#422
4 years ago