Commit Graph

366 Commits (cff737786ed9907fa3c9da9d6001b5d6c8f1a315)

Author SHA1 Message Date
Brad Fitzpatrick cff737786e wgengine/magicsock: fix lazy config deadlock, document more lock ordering
This removes the atomic bool that tried to track whether we needed to acquire
the lock on a future recursive call back into magicsock. Unfortunately that
hack doesn't work because we also had a lock ordering issue between magicsock
and userspaceEngine (see issue). This documents that too.

Fixes #644
4 years ago
Brad Fitzpatrick 43bc86588e wgengine/monitor: log RTM_DELROUTE details, fix format strings
Updates #643
4 years ago
Brad Fitzpatrick 2bd9ad4b40 wgengine: fix deadlock between engine and magicsock 4 years ago
Brad Fitzpatrick 7c38db0c97 wgengine/magicsock: don't deadlock on pre-disco Endpoints w/ lazy wireguard configs
Fixes tailscale/tailscale#637
4 years ago
Brad Fitzpatrick 4987a7d46c wgengine/magicsock: when hard NAT, add stun-ipv4:static-port as candidate
If a node is behind a hard NAT and is using an explicit local port
number, assume they might've mapped a port and add their public IPv4
address with the local tailscaled's port number as a candidate endpoint.
4 years ago
Brad Fitzpatrick bfcb0aa0be wgengine/magicsock: deflake tests, Close deadlock again
Better fix than 37903a9056

Fixes tailscale/corp#533
4 years ago
Brad Fitzpatrick da3b50ad88 wgengine/filter: omit logging for all v6 multicast, remove debug panic :( 4 years ago
Dmytro Shynkevych 28e52a0492
all: dns refactor, add Proxied and PerDomain flags from control (#615)
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Brad Fitzpatrick 3e493e0417 wgengine: fix lazy wireguard config bug on sent packet minute+ later
A comparison operator was backwards.

The bad case went:

* device A send packet to B at t=1s
* B gets added to A's wireguard config
* B gets packet

(5 minutes pass)

* some other activity happens, causing B to expire
  to be removed from A's network map, since it's
  been over 5 minutes since sent or received activity
* device A sends packet to B at t=5m1s
* normally, B would get added back, but the old send
  time was not zero (we sent earlier!) and the time
  comparison was backwards, so we never regenerated
  the wireguard config.

This also refactors the code for legibility and moves constants up
top, with comments.
4 years ago
Dmytro Shynkevych 8c850947db
router: split off sandboxed path from router_darwin (#624)
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Brad Fitzpatrick cb970539a6 wgengine/magicsock: remove TODO comment that's no longer applicable 4 years ago
Brad Fitzpatrick 915f65ddae wgengine/magicsock: stop disco activity on IPN stop
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick c180abd7cf wgengine/magicsock: merge errClosed and errConnClosed 4 years ago
Brad Fitzpatrick 7cc8fcb784 wgengine/filter: remove leftover debug knob that staticcheck doesn't like 4 years ago
Brad Fitzpatrick b4d97d2532 wgengine/filter: fix IPv4 IGMP spam omission, also omit ff02::16 spam
And add tests.

Fixes #618
Updates #402
4 years ago
Dmytro Shynkevych 2ce2b63239
router: stop iOS subprocess sandbox violations (#617)
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Dmytro Shynkevych 154d1cde05
router: reload systemd-resolved after changing /etc/resolv.conf (#619)
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Brad Fitzpatrick b3fc61b132 wgengine: disable wireguard config trimming for now except iOS w/ many peers
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick d55fdd4669 wgengine/magicsock: update, flesh out a TODO 4 years ago
Brad Fitzpatrick d96d26c22a wgengine/filter: don't spam logs on dropped outgoing IPv6 ICMP or IPv4 IGMP
The OS (tries) to send these but we drop them. No need to worry the
user with spam that we're dropping it.

Fixes #402

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Dmytro Shynkevych c7582dc234
ipn: fix netmap change tracking and dns map generation (#609)
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
Brad Fitzpatrick 3e3c24b8f6 wgengine/packet: add IPVersion field, don't use IPProto to note version
As prep for IPv6 log spam fixes in a future change.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
David Anderson f8e4c75f6b wgengine/magicsock: check slightly less aggressively for connectivity.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick 58b721f374 wgengine/magicsock: deflake some tests with an ugly hack
Starting with fe68841dc7, some e2e tests
got flaky. Rather than debug them (they're gnarly), just revert to the old
behavior as far as those tests are concerned. The tests were somehow
using magicsock without a private key and expecting it to do ... something.

My goal with fe68841dc7 was to stop log spam
and unnecessary work I saw on the iOS app when when stopping the app.

Instead, only stop doing that work on any transition from
once-had-a-private-key to no-longer-have-a-private-key. That fixes
what I wanted to fix while still making the mysterious e2e tests
happy.
4 years ago
David Anderson 41d0c81859 wgengine/magicsock: make disco subtest name more precise.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 9beea8b314 wgengine/magicsock: remove unnecessary use of context.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson b62341d308 wgengine/magicsock: add docstring.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 9265296b33 wgengine/magicsock: don't deadlock on shutdown if sending blocks.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 0249236cc0 ipn/ipnstate: record assigned Tailscale IPs.
wgengine/magicsock: use ipnstate to find assigned Tailscale IPs.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson c3958898f1 tstest/natlab: be a bit more lenient during test shutdown.
There is a race in natlab where we might start shutdown while natlab is still running
a goroutine or two to deliver packets. This adds a small grace period to try and receive
it before continuing shutdown.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 7578c815be wgengine/magicsock: give pinger a more generous packet timeout.
The first packet to transit may take several seconds to do so, because
setup rates in wgengine may result in the initial WireGuard handshake
init to get dropped. So, we have to wait at least long enough for a
retransmit to correct the fault.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson c3994fd77c derp: remove OnlyDisco option.
Active discovery lets us introspect the state of the network stack precisely
enough that it's unnecessary, and dropping the initial DERP packets greatly
slows down tests. Additionally, it's unrealistic since our production network
will never deliver _only_ discovery packets, it'll be all or nothing.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 5455c64f1d wgengine/magicsock: add a test for two facing endpoint-independent NATs.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson f794493b4f wgengine/magicsock: explicitly check path discovery, add a firewall test.
The test proves that active discovery can traverse two facing firewalls.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson f582eeabd1 wgengine/magicsock: add a test for active path discovery.
Uses natlab only, because the point of this active discovery test is going to be
that it should get through a lot of obstacles.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick 37903a9056 wgengine/magicsock: fix occasional deadlock on Conn.Close on c.derpStarted
The deadlock was:

* Conn.Close was called, which acquired c.mu
* Then this goroutine scheduled:

    if firstDerp {
        startGate = c.derpStarted
        go func() {
            dc.Connect(ctx)
            close(c.derpStarted)
        }()
    }

* The getRegion hook for that derphttp.Client then ran, which also
  tries to acquire c.mu.

This change makes that hook first see if we're already in a closing
state and then it can pretend that region doesn't exist.
4 years ago
Brad Fitzpatrick fe68841dc7 wgengine/magicsock: log better with less spam on transition to stopped state
Required a minor test update too, which now needs a private key to get far
enough to test the thing being tested.
4 years ago
Brad Fitzpatrick e298327ba8 wgengine/magicsock: remove overkill, slow reflect.DeepEqual of NetworkMap
No need to allocate or compare all the fields we don't care about.
4 years ago
Brad Fitzpatrick 4970e771ab wgengine: add debug knob to disable the watchdog during debugging
It launches goroutines and interferes with panic-based debugging,
obscuring stacks.
4 years ago
David Anderson 3669296cef wgengine/magicsock: refactor twoDevicePing to make stack construction cleaner.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick 16a9cfe2f4 wgengine: configure wireguard peers lazily, as needed
wireguard-go uses 3 goroutines per peer (with reasonably large stacks
& buffers).

Rather than tell wireguard-go about all our peers, only tell it about
peers we're actively communicating with. That means we need hooks into
magicsock's packet receiving path and tstun's packet sending path to
lazily create a wireguard peer on demand from the network map.

This frees up lots of memory for iOS (where we have almost nothing
left for larger domains with many users).

We should ideally do this in wireguard-go itself one day, but that'd
be a pretty big change.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick 5066b824a6 wgengine/magicsock: don't log about disco ping timeouts if we have a working address
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Brad Fitzpatrick a89d610a3d wgengine/tstun: move sync.Pool to package global
sync.Pools should almost always be packate globals, even though in this
case we only have exactly 1 TUN device anyway, so it matters less.
Still, it's unusual to see a Pool that's not a package global, so move it.
4 years ago
Dmytro Shynkevych c53ab3111d wgengine/router: support legacy resolvconf
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago
David Anderson 189d86cce5 wgengine/router: don't use 88 or 8888 as table/rule numbers.
We originally picked those numbers somewhat at random, but with the idea
that 8 is a traditionally lucky number in Chinese culture. Unfortunately,
"88" is also neo-nazi shorthand language.

Use 52 instead, because those are the digits above the letters
"TS" (tailscale) on a qwerty keyboard, so we're unlikely to collide with
other users. 5, 2 and 52 are also pleasantly culturally meaningless.

Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson 972a42cb33 wgengine/router: fix router_test to match the new marks.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
David Anderson d60917c0f1 wgengine/router: switch packet marks to avoid conflict with Weave Net.
Signed-off-by: David Anderson <danderson@tailscale.com>
4 years ago
Brad Fitzpatrick c06d2a8513 wgengine/magicsock: fix typo in comment 4 years ago
Brad Fitzpatrick bf195cd3d8 wgengine/magicsock: reduce log verbosity of discovery messages
Don't log heartbeat pings & pongs. Track the reason for pings and then
only log the ping/pong traffic if it was for initial path discovery.
4 years ago
Dmytro Shynkevych a3e7252ce6 wgengine/router: use better NetworkManager API
Signed-off-by: Dmytro Shynkevych <dmytro@tailscale.com>
4 years ago