1443 Commits (13f8a669d5cd3bd2af39c693f4878ecbbb4d56d1)

Author SHA1 Message Date
Jordan Whited 8b47322acc
wgengine/magicsock: implement probing of UDP path lifetime (#10844)
This commit implements probing of UDP path lifetime on the tail end of
an active direct connection. Probing configuration has two parts -
Cliffs, which are various timeout cliffs of interest, and
CycleCanStartEvery, which limits how often a probing cycle can start,
per-endpoint. Initially a statically defined default configuration will
be used. The default configuration has cliffs of 10s, 30s, and 60s,
with a CycleCanStartEvery of 24h. Probing results are communicated via
clientmetric counters. Probing is off by default, and can be enabled
via control knob. Probing is purely informational and does not yet
drive any magicsock behaviors.

Updates #540

Signed-off-by: Jordan Whited <jordan@tailscale.com>
4 months ago
James Tucker 7e3bcd297e go.mod,wgengine/netstack: bump gvisor
Updates #8043

Signed-off-by: James Tucker <james@tailscale.com>
4 months ago
Claire Wang 213d696db0
magicsock: mute noisy expected peer mtu related error (#10870)
4 months ago
Andrew Dunham 7a0392a8a3 wgengine/netstack: expose gVisor metrics through expvar
When tailscaled is run with "-debug 127.0.0.1:12345", these metrics are
available at:
    http://localhost:12345/debug/metrics

Updates #8210

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I19db6c445ac1f8344df2bc1066a3d9c9030606f8
4 months ago
Andrew Dunham 6540d1f018 wgengine/router: look up absolute path to netsh.exe on Windows
This is in response to logs from a customer that show that we're unable
to run netsh due to the following error:

    router: firewall: adding Tailscale-Process rule to allow UDP for "C:\\Program Files\\Tailscale\\tailscaled.exe" ...
    router: firewall: error adding Tailscale-Process rule: exec: "netsh": cannot run executable found relative to current directory:

There's approximately no reason to ever dynamically look up the path of
a system utility like netsh.exe, so instead let's first look for it
in the System32 directory and only if that fails fall back to the
previous behaviour.

Updates #10804

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I68cfeb4cab091c79ccff3187d35f50359a690573
5 months ago
Jordan Whited b084888e4d
wgengine/magicsock: fix typos in docs (#10729)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
5 months ago
Andrew Lytvynov 2716250ee8
all: cleanup unused code, part 2 (#10670)
And enable U1000 check in staticcheck.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
5 months ago
Andrew Lytvynov 1302bd1181
all: cleanup unused code, part 1 (#10661)
Run `staticcheck` with `U1000` to find unused code. This cleans up about
half of it. I'll do the other half separately to keep PRs manageable.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
5 months ago
Jordan Whited 685b853763
wgengine/magicsock: fix handling of derp.PeerGoneMessage (#10589)
The switch in Conn.runDerpReader() on the derp.ReceivedMessage type
contained cases other than derp.ReceivedPacket that fell through to
writing to c.derpRecvCh, which should only be reached for
derp.ReceivedPacket. This can result in the last/previous
derp.ReceivedPacket being re-handled, effectively creating a duplicate
packet. If the last derp.ReceivedPacket happens to be a
disco.CallMeMaybe it may result in a disco ping scan towards the
originating peer on the endpoints contained.

The change in this commit moves the channel write on c.derpRecvCh and
subsequent select awaiting the result into the derp.ReceivedMessage
case, preventing it from being reached from any other case. Explicit
continue statements are also added to non-derp.ReceivedPacket cases
where they were missing, in order to signal intent to the reader.

Fixes #10586

Signed-off-by: Jordan Whited <jordan@tailscale.com>
6 months ago
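The shape of the fix in 685b853763 can be illustrated with a simplified read loop. The types and the `readLoop` function below are hypothetical stand-ins, not the derp or magicsock APIs; the point is only that the channel write lives inside the packet case and every other case continues explicitly.

```go
package main

import "fmt"

// Hypothetical stand-ins for the message types named in the commit message.
type receivedMessage interface{}

type receivedPacket struct{ payload string }

type peerGoneMessage struct{ peer string }

// readLoop forwards only packets to out. Non-packet messages never reach
// the channel write, so a previous packet cannot be handled twice.
func readLoop(in []receivedMessage, out chan<- receivedPacket) {
	for _, msg := range in {
		switch m := msg.(type) {
		case receivedPacket:
			// The channel write is scoped to this case only.
			out <- m
		case peerGoneMessage:
			// Peer bookkeeping would happen here; nothing is forwarded.
			continue
		default:
			// Unknown message kinds are skipped explicitly.
			continue
		}
	}
	close(out)
}

func main() {
	in := []receivedMessage{
		receivedPacket{payload: "a"},
		peerGoneMessage{peer: "p1"},
		receivedPacket{payload: "b"},
	}
	out := make(chan receivedPacket, len(in)) // buffered so the sketch needs no goroutine
	readLoop(in, out)
	for p := range out {
		fmt.Println("forwarded:", p.payload)
	}
}
```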
Andrew Dunham 727acf96a6 net/netcheck: use DERP frames as a signal for home region liveness
This uses the fact that we've received a frame from a given DERP region
within a certain time as a signal that the region is still present (and
thus can still be a node's PreferredDERP / home region) even if we don't
get a STUN response from that region during a netcheck.

This should help avoid DERP flaps that occur due to losing STUN probes
while still having a valid and active TCP connection to the DERP server.

RELNOTE=Reduce home DERP flapping when there's still an active connection

Updates #8603

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: If7da6312581e1d434d5c0811697319c621e187a0
6 months ago
Naman Sood 97f84200ac
wgengine/router: implement UpdateMagicsockPort for CallbackRouter (#10494)
Updates #9084.

Signed-off-by: Naman Sood <mail@nsood.in>
6 months ago
Naman Sood d46a4eced5
util/linuxfw, wgengine: allow ingress to magicsock UDP port on Linux (#10370)
* util/linuxfw, wgengine: allow ingress to magicsock UDP port on Linux

Updates #9084.

Currently, we have to tell users to manually open UDP ports on Linux when
certain firewalls (like ufw) are enabled. This change automates the process of
adding and updating those firewall rules as magicsock changes what port it
listens on.

Signed-off-by: Naman Sood <mail@nsood.in>
6 months ago
Naman Sood 0a59754eda linuxfw,wgengine/route,ipn: add c2n and nodeattrs to control linux netfilter
Updates tailscale/corp#14029.

Signed-off-by: Naman Sood <mail@nsood.in>
6 months ago
James Tucker 215f657a5e wgengine/router: create netfilter runner in setNetfilterMode
This will enable the runner to be replaced as a configuration side
effect in a later change.

Updates tailscale/corp#14029

Signed-off-by: James Tucker <james@tailscale.com>
6 months ago
Andrew Lytvynov 263e01c47b
wgengine/filter: add protocol-agnostic packet checker (#10446)
For use in ACL tests, we need a way to check whether a packet is allowed
not just with TCP, but any protocol.

Updates #3561

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
6 months ago
Jordan Whited 5e861c3871
wgengine/netstack: disable RACK on Windows (#10402)
Updates #9707

Signed-off-by: Jordan Whited <jordan@tailscale.com>
6 months ago
Jordan Whited 1af7f5b549
wgengine/magicsock: fix typo in Conn.handlePingLocked() (#10365)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
6 months ago
Brad Fitzpatrick 4d196c12d9 health: don't report a warning in DERP homeless mode
Updates #3363
Updates tailscale/corp#396

Change-Id: Ibfb0496821cb58a78399feb88d4206d81e95ca0f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick 3bd382f369 wgengine/magicsock: add DERP homeless debug mode for testing
In DERP homeless mode, a DERP home connection is not sought or
maintained and the local node is not reachable.

Updates #3363
Updates tailscale/corp#396

Change-Id: Ibc30488ac2e3cfe4810733b96c2c9f10a51b8331
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Jordan Whited 2ff54f9d12
wgengine/magicsock: move trustBestAddrUntil forward on non-disco rx (#10274)
This is gated behind the silent disco control knob, which is still in
its infancy. Prior to this change disco pong reception was the only
event that could move trustBestAddrUntil forward, so even though we
weren't heartbeating, we would kick off discovery pings every
trustUDPAddrDuration and mirror to DERP.

Updates #540

Signed-off-by: Jordan Whited <jordan@tailscale.com>
7 months ago
Jordan Whited c99488ea19
wgengine/magicsock: fix typo in endpoint.sendDiscoPing() docs (#10232)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
7 months ago
Jordan Whited e848736927
control/controlknobs,wgengine/magicsock: implement SilentDisco toggle (#10195)
This change exposes SilentDisco as a control knob, and plumbs it down to
magicsock.endpoint. No changes are being made to magicsock.endpoint
disco behavior, yet.

Updates #540

Signed-off-by: Jordan Whited <jordan@tailscale.com>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Charlotte Brandhorst-Satzkorn 839fee9ef4 wgengine/magicsock: handle wireguard only clean up and log messages
This change updates log messaging when cleaning up wireguard only peers.
This change also stops us unnecessarily attempting to clean up disco
pings for wireguard only endpoints.

Updates #7826

Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
7 months ago
Maisem Ali d0f2c0664b wgengine/netstack: standardize var names in UpdateNetstackIPs
Updates #cleanup

Signed-off-by: Maisem Ali <maisem@tailscale.com>
7 months ago
Maisem Ali eaf8aa63fc wgengine/netstack: remove unnecessary map in UpdateNetstackIPs
Updates #cleanup

Signed-off-by: Maisem Ali <maisem@tailscale.com>
7 months ago
Maisem Ali d601c81c51 wgengine/netstack: use netip.Prefix as map keys
Updates #cleanup

Signed-off-by: Maisem Ali <maisem@tailscale.com>
7 months ago
James Tucker 6f69fe8ad7 wgnengine: remove unused field in userspaceEngine
Updates #cleanup

Signed-off-by: James Tucker <james@tailscale.com>
7 months ago
Brad Fitzpatrick 514539b611 wgengine/magicsock: close disco listeners on Conn.Close, fix Linux root TestNewConn
TestNewConn now passes as root on Linux. It wasn't closing the BPF
listeners and their goroutines.

The code is still a mess of two Close overlapping code paths, but that
can be refactored later. For now, make the two close paths more similar.

Updates #9945

Change-Id: I8a3cf5fb04d22ba29094243b8e645de293d9ed85
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
James Tucker 7df6f8736a wgengine/netstack: only add addresses to correct protocols
Prior to an earlier netstack bump this code used a string conversion
path to cover multiple cases of behavior seemingly checking for
unspecified addresses, adding unspecified addresses to v6. The behavior
is now crashy in netstack, as it is enforcing address length in various
areas of the API, one in particular being address removal.

As netstack is now protocol specific, we must not create invalid
protocol addresses - an address is v4 or v6, and the address value
contained inside must match. If a control path attempts to do something
otherwise it is now logged and skipped rather than incorrect addressing
being added.

Fixes tailscale/corp#15377

Signed-off-by: James Tucker <james@tailscale.com>
7 months ago
Jordan Whited 891d964bd4
wgengine/magicsock: simplify tryEnableUDPOffload() (#9872)
Don't assume Linux lacks UDP_GRO support if it lacks UDP_SEGMENT
support. This mirrors a similar change in wireguard/wireguard-go@177caa7
for consistency's sake. We haven't found any issues here, just being
overly paranoid.

Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
8 months ago
Brad Fitzpatrick c363b9055d tstest/integration: add tests for tun mode (requiring root)
Updates #7894

Change-Id: Iff0b07b21ae28c712dd665b12918fa28d6f601d0
Co-authored-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Brad Fitzpatrick a6270826a3 wgengine/magicsock: fix data race regression in disco ping callbacks
Regression from c15997511d. The callback could be run multiple times
from different endpoints.

Fixes #9801

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Maisem Ali 5297bd2cff cmd/tailscaled,net/tstun: fix data race on start-up in TUN mode
Fixes #7894

Change-Id: Ice3f8019405714dd69d02bc07694f3872bb598b8

Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Maisem Ali <maisem@tailscale.com>
8 months ago
Brad Fitzpatrick 8b630c91bc wgengine/filter: use slices.Contains in another place
We keep finding these.

Updates #cleanup

Change-Id: Iabc049b0f8da07341011356f0ecd5315c33ff548
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Maisem Ali fbfee6a8c0 cmd/containerboot: use linuxfw.NetfilterRunner
This migrates containerboot to reuse the NetfilterRunner used
by tailscaled instead of manipulating iptables rules itself.
This has the added advantage of now working with nftables and
we can potentially drop the `iptables` command from the container
image in the future.

Updates #9310

Co-authored-by: Irbe Krumina <irbe@tailscale.com>
Signed-off-by: Maisem Ali <maisem@tailscale.com>
8 months ago
Maisem Ali 05a1f5bf71 util/linuxfw: move detection logic
Just a refactor to consolidate the firewall detection logic in a single
package so that it can be reused in a later commit by containerboot.

Updates #9310

Signed-off-by: Maisem Ali <maisem@tailscale.com>
8 months ago
Val 249edaa349 wgengine/magicsock: add probed MTU metrics
Record the number of MTU probes sent, the total bytes sent, the number of times
we got a successful return from an MTU probe of a particular size, and the max
MTU recorded.

Updates #311

Signed-off-by: Val <valerie@tailscale.com>
8 months ago
Val 893bdd729c disco,net/tstun,wgengine/magicsock: probe peer MTU
Automatically probe the path MTU to a peer when peer MTU is enabled, but do not
use the MTU information for anything yet.

Updates #311

Signed-off-by: Val <valerie@tailscale.com>
8 months ago
Brad Fitzpatrick 6f36f8842c cmd/tailscale, magicsock: add debug command to flip DERP homes
For testing netmap patchification server-side.

Updates #1909

Change-Id: Ib1d784bd97b8d4a31e48374b4567404aae5280cc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Brad Fitzpatrick f991c8a61f tstest: make ResourceCheck panic on parallel tests
To find potential flakes earlier.

Updates #deflake-effort

Change-Id: I52add6111d660821c3a23d4b1dd032821344bc48
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Jordan Whited eb22c0dfc7
wgengine/magicsock: use binary.NativeEndian for UDP GSO control data (#9640)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
8 months ago
Val 4130851f12 wgengine/magicsock: probe but don't use path MTU from CLI ping
When sending a CLI ping with a specific size, continue to probe all possible UDP
paths to the peer until we find one with a large enough MTU to accommodate the
ping. Record any peer path MTU information we discover (but don't use it for
anything other than CLI pings).

Updates #311

Signed-off-by: Val <valerie@tailscale.com>
8 months ago
Val 67926ede39 wgengine/magicsock: add MTU to addrLatency and rename to addrQuality
Add a field to record the wire MTU of the path to this address to the
addrLatency struct and rename it addrQuality.

Updates #311

Signed-off-by: Val <valerie@tailscale.com>
8 months ago
Brad Fitzpatrick 425cf9aa9d tailcfg, all: use []netip.AddrPort instead of []string for Endpoints
It's JSON wire compatible.

Updates #cleanup

Change-Id: Ifa5c17768fec35b305b06d75eb5f0611c8a135a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
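The "JSON wire compatible" remark in 425cf9aa9d follows from `netip.AddrPort` implementing text marshaling, so `encoding/json` encodes it as a string. The sketch below checks this for canonically formatted inputs; `sameJSON` is a hypothetical helper written for this illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/netip"
)

// sameJSON reports whether strs and the parsed []netip.AddrPort encode to
// identical JSON. This holds when the strings are already in the canonical
// form that netip prints; non-canonical spellings would be normalized.
func sameJSON(strs []string) (bool, error) {
	aps := make([]netip.AddrPort, 0, len(strs))
	for _, s := range strs {
		ap, err := netip.ParseAddrPort(s)
		if err != nil {
			return false, err
		}
		aps = append(aps, ap)
	}
	a, err := json.Marshal(strs)
	if err != nil {
		return false, err
	}
	b, err := json.Marshal(aps)
	if err != nil {
		return false, err
	}
	return bytes.Equal(a, b), nil
}

func main() {
	same, err := sameJSON([]string{"100.64.0.1:41641", "[fd7a:115c:a1e0::1]:41641"})
	fmt.Println(same, err)
}
```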
Brad Fitzpatrick 5f5c9142cc util/slicesx: add EqualSameNil, like slices.Equal but same nilness
Then use it in tailcfg which had it duplicated a couple times.

I think we have it a few other places too.

And use slices.Equal in wgengine/router too. (found while looking for callers)

Updates #cleanup

Change-Id: If5350eee9b3ef071882a3db29a305081e4cd9d23
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
James Tucker 324f0d5f80 cmd/cloner,*: revert: optimize nillable slice cloner
This reverts commit ee90cd02fd.

The outcome is not identical for empty slices. Cloner really needs
tests!

Updates #9601

Signed-off-by: James Tucker <james@tailscale.com>
8 months ago
James Tucker ee90cd02fd cmd/cloner,*: optimize nillable slice cloner
A wild @josharian appears with a good suggestion for a refactor, thanks
Josh!

Updates #9410
Signed-off-by: James Tucker <james@tailscale.com>
8 months ago
Jordan Whited 16fa3c24ea
wgengine/magicsock: use x/sys/unix constants for UDP GSO (#9597)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
8 months ago
Andrea Barisani f50b2a87ec wgengine/netstack: refactor address construction and conversion
Updates #9252
Updates #9253

Signed-off-by: Andrea Barisani <andrea@inversepath.com>
Signed-off-by: James Tucker <james@tailscale.com>
8 months ago
Andrea Barisani b5b4298325 go.mod,*: bump gvisor
Updates #9253

Signed-off-by: Andrea Barisani <andrea@inversepath.com>
Signed-off-by: James Tucker <james@tailscale.com>
8 months ago