Commit Graph

53 Commits (88133c361e8cc267b9e45c90f357f96084c60a0c)

Author SHA1 Message Date
Brad Fitzpatrick 1336fb740b wgengine/netstack: make netstack MTU be 1280 also
Updates #3878

Change-Id: I1850085b32c8a40d85607b4ad433622c97d96a8d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Mihai Parparita 86069874c9 net/tstun, wgengine: use correct type for counter metrics
We were marking them as gauges, but they are only ever incremented,
thus counter is more appropriate.

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
James Tucker f9e86e64b7 *: use WireGuard where logged, printed or named
Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
James Tucker ae483d3446 wgengine, net/packet, cmd/tailscale: add ICMP echo
Updates tailscale/corp#754

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Tom DNetto 7f45734663 assorted: documentation and readability fixes
This were intended to be pushed to #4408, but in my excitement I
forgot to git push :/ better late than never.

Signed-off-by: Tom DNetto <tom@tailscale.com>
2 years ago
Tom DNetto 9e77660931 net/tstun,wgengine/{.,netstack}: handle UDP magicDNS traffic in netstack
This change wires netstack with a hook for traffic coming from the host
into the tun, allowing interception and handling of traffic to quad-100.

With this hook wired, magicDNS queries over UDP are now handled within
netstack. The existing logic in wgengine to handle magicDNS remains for now,
but its hook operates after the netstack hook so the netstack implementation
takes precedence. This is done in case we need to support platforms with
netstack longer than expected.

Signed-off-by: Tom DNetto <tom@tailscale.com>
2 years ago
Tom DNetto dc71d3559f net/tstun,wgengine: split PreFilterOut into multiple hooks
A subsequent commit implements handling of magicDNS traffic via netstack.
Implementing this requires a hook for traffic originating from the host and
hitting the tun, so we make another hook to support this.

Signed-off-by: Tom DNetto <tom@tailscale.com>
2 years ago
James Tucker 445c04c938
wgengine: inject packetbuffers rather than bytes (#4220)
Plumb the outbound injection path to allow passing netstack
PacketBuffers down to the tun Read, where they are decref'd to enable
buffer re-use. This removes one packet alloc & copy, and reduces GC
pressure by pooling outbound injected packets.

Fixes #2741
Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Josh Bleecher Snyder 0868329936 all: use any instead of interface{}
My favorite part of generics.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2 years ago
Dmytro Shynkevych d9a7205be5 net/tstun: set link speed to SPEED_UNKNOWN
Fixes #3933.

Signed-off-by: Dmytro Shynkevych <dm.shynk@gmail.com>
2 years ago
Brad Fitzpatrick 8267ea0f80 net/tstun: remove TODO that's done
This TODO was both added and fixed in 506c727e3.

As I recall, I wasn't originally going to do it because it seemed
annoying, so I wrote the TODO, but then I felt bad about it and just
did it, but forgot to remove the TODO.

Change-Id: I8f3514809ad69b447c62bfeb0a703678c1aec9a3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 1af26222b6 go.mod: bump netstack, switch to upstream netstack
Now that Go 1.17 has module graph pruning
(https://go.dev/doc/go1.17#go-command), we should be able to use
upstream netstack without breaking our private repo's build
that then depends on the tailscale.com Go module.

This is that experiment.

Updates #1518 (the original bug to break out netstack to own module)
Updates #2642 (this updates netstack, but doesn't remove workaround)

Change-Id: I27a252c74a517053462e5250db09f379de8ac8ff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 41fd4eab5c envknob: add new package for all the strconv.ParseBool(os.Getenv(..))
A new package can also later record/report which knobs are checked and
set. It also makes the code cleaner & easier to grep for env knobs.

Change-Id: Id8a123ab7539f1fadbd27e0cbeac79c2e4f09751
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 506c727e30 ipnlocal, net/{dns,tsaddr,tstun}, wgengine: support MagicDNS on IPv6
Fixes #3660

RELNOTE=MagicDNS now works over IPv6 when CGNAT IPv4 is disabled.

Change-Id: I001e983df5feeb65289abe5012dedd177b841b45
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Josh Bleecher Snyder 73beaaf360 net/tstun: rate limit "self disco out packet" logging
When this happens, it is incredibly noisy in the logs.
It accounts for about a third of all remaining
"unexpected" log lines from a recent investigation.

It's not clear that we know how to fix this,
we have a functioning workaround,
and we now have a (cheap and efficient) metric for this
that we can use for measurements.

So reduce the logging to approximately once per minute.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick cf06f9df37 net/tstun, wgengine: add packet-level and drop metrics
Primarily tstun work, but some MagicDNS stuff spread into wgengine.

No wireguard reconfig metrics (yet).

Updates #3307

Change-Id: Ide768848d7b7d0591e558f118b553013d1ec94ad
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson 0532eb30db all: replace tailcfg.DiscoKey with key.DiscoPublic.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Josh Bleecher Snyder 94fb42d4b2 all: use testingutil.MinAllocsPerRun
There are a few remaining uses of testing.AllocsPerRun:
Two in which we only log the number of allocations,
and one in which dynamically calculate the allocations
target based on a different AllocsPerRun run.

This also allows us to tighten the "no allocs"
test in wgengine/filter.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 9b101bd6af net/tstun: don't compile the code New constructor on js/wasm
Updates #3157

Change-Id: I81603edf3e69e6f1517b0074eef6b648f2981c50
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Aaron Klotz 1991a1ac6a net/tstun: update tun_windows for wintun 0.14 API revisions, update wireguard-go dependency to 82d2aa87aa623cb5143a41c3345da4fb875ad85d
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
3 years ago
Brad Fitzpatrick 080381c79f net/tstun: block looped disco traffic, take 17
It was in the wrong filter direction before, per CPU profiles
we now have.

Updates #1526 (maybe fixes? time will tell)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick dabeda21e0 net/tstun: block looped disco traffic
Updates #1526 (maybe fixes? time will tell)

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Emmanuel T Odeke 0daa32943e all: add (*testing.B).ReportAllocs() to every benchmark
This ensures that we can properly track and catch allocation
slippages that could otherwise have been missed.

Fixes #2748
3 years ago
Maisem Ali 1f006025c2 net/tstun: fix build on arm
Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Matt Layher 8ab44b339e net/tstun: use unix.Ifreq type for Linux TAP interface configuration
Signed-off-by: Matt Layher <mdlayher@gmail.com>
3 years ago
Brad Fitzpatrick 833200da6f net/tstun: don't exec uname -r on Linux in TUN failure diagnostics
Fixes https://twitter.com/zekjur/status/1425557520513486848

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick e804ab29fd net/tstun: move TUN failure diagnostics to OS-specific files
Mostly so the Linux one can use Linux-specific stuff in package
syscall and not use os/exec for uname for portability.

But also it helps deps a tiny bit on iOS.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder a5da4ed981 all: gofmt with Go 1.17
This adds "//go:build" lines and tidies up existing "// +build" lines.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick a729070252 net/tstun: add start of Linux TAP support, with DHCP+ARP server
Still very much a prototype (hard-coded IPs, etc) but should be
non-invasive enough to submit at this point and iterate from here.

Updates #2589

Co-Author: David Crawshaw <crawshaw@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder c2202cc27c net/tstun: use mono.Time
There's a call to Now once per packet.
Move to mono.Now.

Though the current implementation provides high precision,
we document it to be coarse, to preserve the ability
to switch to a coarse monotonic time later.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 1034b17bc7 net/tstun: buffer outbound channel
The handoff between tstun.Wrap's Read and poll methods
is one of the per-packet hotspots. It shows up in pprof.

Making outbound buffered increases throughput.

It is hard to measure exactly how much, because the numbers
are highly variable, but I'd estimate it at about 1%,
using the best observed max throughput across three runs.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 965dccd4fc net/tstun: buffer outbound channel
The handoff between tstun.Wrap's Read and poll methods
is one of the per-packet hotspots. It shows up in pprof.

Making outbound buffered increases throughput.

It is hard to measure exactly how much, because the numbers
are highly variable, but I'd estimate it at about 1%,
using the best observed max throughput across three runs.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 0ad92b89a6 net/tstun: fix data races
To remove some multi-case selects, we intentionally allowed
sends on closed channels (cc23049cd2).

However, we also introduced concurrent sends and closes,
which is a data race.

This commit fixes the data race. The mutexes here are uncontended,
and thus very cheap.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder c35a832de6 net/tstun: add inner loop to poll
This avoids re-enqueuing to t.bufferConsumed,
which makes the code a bit clearer.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder a4cc7b6d54 net/tstun: simplify code
Calculate whether the packet is injected directly,
rather than via an else branch.

Unify the exit paths. It is easier here than duplicating them.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder cc23049cd2 net/tstun: remove multi-case selects from hot code
Every TUN Read went through several multi-case selects.
We know from past experience with wireguard-go that these are slow
and cause scheduler churn.

The selects served two purposes: they separated errors from data and
gracefully handled shutdown. The first is fairly easy to replace by sending
errors and data over a single channel. The second, less so.

We considered a few approaches: Intricate webs of channels,
global condition variables. They all get ugly fast.

Instead, let's embrace the ugly and handle shutdown ungracefully.
It's horrible, but the horror is simple and localized.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
David Anderson df8a5d09c3 net/tstun: add a debug envvar to override tun MTU.
Signed-off-by: David Anderson <dave@natulte.net>
3 years ago
Brad Fitzpatrick a321c24667 go.mod: update netaddr
Involves minor IPSetBuilder.Set API change.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder 1ece91cede go.mod: upgrade wireguard-windows, de-fork wireguard-go
Pull in the latest version of wireguard-windows.

Switch to upstream wireguard-go.
This requires reverting all of our import paths.

Unfortunately, this has to happen at the same time.
The wireguard-go change is very low risk,
as that commit matches our fork almost exactly.
(The only changes are import paths, CI files, and a go.mod entry.)
So if there are issues as a result of this commit,
the first place to look is wireguard-windows changes.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 25df067dd0 all: adapt to opaque netaddr types
This commit is a mishmash of automated edits using gofmt:

gofmt -r 'netaddr.IPPort{IP: a, Port: b} -> netaddr.IPPortFrom(a, b)' -w .
gofmt -r 'netaddr.IPPrefix{IP: a, Port: b} -> netaddr.IPPrefixFrom(a, b)' -w .

gofmt -r 'a.IP.Is4 -> a.IP().Is4' -w .
gofmt -r 'a.IP.As16 -> a.IP().As16' -w .
gofmt -r 'a.IP.Is6 -> a.IP().Is6' -w .
gofmt -r 'a.IP.As4 -> a.IP().As4' -w .
gofmt -r 'a.IP.String -> a.IP().String' -w .

And regexps:

\w*(.*)\.Port = (.*)  ->  $1 = $1.WithPort($2)
\w*(.*)\.IP = (.*)  ->  $1 = $1.WithIP($2)

And lots of manual fixups.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 7f2eb1d87a net/tstun: fix TUN log spam when ACLs drop a packet
Whenever we dropped a packet due to ACLs, wireguard-go was logging:

Failed to write packet to TUN device: packet dropped by filter

Instead, just lie to wireguard-go and pretend everything is okay.

Fixes #1229

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder 42c8b9ad53 net/tstun: remove unnecessary break statement
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Josh Bleecher Snyder 99705aa6b7 net/tstun: split TUN events channel into up/down and MTU
We had a long-standing bug in which our TUN events channel
was being received from simultaneously in two places.

The first is wireguard-go.

At wgengine/userspace.go:366, we pass e.tundev to wireguard-go,
which starts a goroutine (RoutineTUNEventReader)
that receives from that channel and uses events to adjust the MTU
and bring the device up/down.

At wgengine/userspace.go:374, we launch a goroutine that
receives from e.tundev, logs MTU changes, and triggers
state updates when up/down changes occur.

Events were getting delivered haphazardly between the two of them.

We don't really want wireguard-go to receive the up/down events;
we control the state of the device explicitly by calling device.Up.
And the userspace.go loop MTU logging duplicates logging that
wireguard-go does when it received MTU updates.

So this change splits the single TUN events channel into up/down
and other (aka MTU), and sends them to the parties that ought
to receive them.

I'm actually a bit surprised that this hasn't caused more visible trouble.
If a down event went to wireguard-go but the subsequent up event
went to userspace.go, we could end up with the wireguard-go device disappearing.

I believe that this may also (somewhat accidentally) be a fix for #1790.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
3 years ago
Avery Pennarun a92b9647c5 wgengine/bench: speed test for channels, sockets, and wireguard-go.
This tries to generate traffic at a rate that will saturate the
receiver, without overdoing it, even in the event of packet loss. It's
unrealistically more aggressive than TCP (which will back off quickly
in case of packet loss) but less silly than a blind test that just
generates packets as fast as it can (which can cause all the CPU to be
absorbed by the transmitter, giving an incorrect impression of how much
capacity the total system has).

Initial indications are that a syscall about every 10 packets (TCP bulk
delivery) is roughly the same speed as sending every packet through a
channel. A syscall per packet is about 5x-10x slower than that.

The whole tailscale wireguard-go + magicsock + packet filter
combination is about 4x slower again, which is better than I thought
we'd do, but probably has room for improvement.

Note that in "full" tailscale, there is also a tundev read/write for
every packet, effectively doubling the syscall overhead per packet.

Given these numbers, it seems like read/write syscalls are only 25-40%
of the total CPU time used in tailscale proper, so we do have
significant non-syscall optimization work to do too.

Sample output:

$ GOMAXPROCS=2 go test -bench . -benchtime 5s ./cmd/tailbench
goos: linux
goarch: amd64
pkg: tailscale.com/cmd/tailbench
cpu: Intel(R) Core(TM) i7-4785T CPU @ 2.20GHz
BenchmarkTrivialNoAlloc/32-2         	56340248	        93.85 ns/op	 340.98 MB/s	         0 %lost	       0 B/op	       0 allocs/op
BenchmarkTrivialNoAlloc/124-2        	57527490	        99.27 ns/op	1249.10 MB/s	         0 %lost	       0 B/op	       0 allocs/op
BenchmarkTrivialNoAlloc/1024-2       	52537773	       111.3 ns/op	9200.39 MB/s	         0 %lost	       0 B/op	       0 allocs/op
BenchmarkTrivial/32-2                	41878063	       135.6 ns/op	 236.04 MB/s	         0 %lost	       0 B/op	       0 allocs/op
BenchmarkTrivial/124-2               	41270439	       138.4 ns/op	 896.02 MB/s	         0 %lost	       0 B/op	       0 allocs/op
BenchmarkTrivial/1024-2              	36337252	       154.3 ns/op	6635.30 MB/s	         0 %lost	       0 B/op	       0 allocs/op
BenchmarkBlockingChannel/32-2           12171654	       494.3 ns/op	  64.74 MB/s	         0 %lost	    1791 B/op	       0 allocs/op
BenchmarkBlockingChannel/124-2          12149956	       507.8 ns/op	 244.17 MB/s	         0 %lost	    1792 B/op	       1 allocs/op
BenchmarkBlockingChannel/1024-2         11034754	       528.8 ns/op	1936.42 MB/s	         0 %lost	    1792 B/op	       1 allocs/op
BenchmarkNonlockingChannel/32-2          8960622	      2195 ns/op	  14.58 MB/s	         8.825 %lost	    1792 B/op	       1 allocs/op
BenchmarkNonlockingChannel/124-2         3014614	      2224 ns/op	  55.75 MB/s	        11.18 %lost	    1792 B/op	       1 allocs/op
BenchmarkNonlockingChannel/1024-2        3234915	      1688 ns/op	 606.53 MB/s	         3.765 %lost	    1792 B/op	       1 allocs/op
BenchmarkDoubleChannel/32-2          	 8457559	       764.1 ns/op	  41.88 MB/s	         5.945 %lost	    1792 B/op	       1 allocs/op
BenchmarkDoubleChannel/124-2         	 5497726	      1030 ns/op	 120.38 MB/s	        12.14 %lost	    1792 B/op	       1 allocs/op
BenchmarkDoubleChannel/1024-2        	 7985656	      1360 ns/op	 752.86 MB/s	        13.57 %lost	    1792 B/op	       1 allocs/op
BenchmarkUDP/32-2                    	 1652134	      3695 ns/op	   8.66 MB/s	         0 %lost	     176 B/op	       3 allocs/op
BenchmarkUDP/124-2                   	 1621024	      3765 ns/op	  32.94 MB/s	         0 %lost	     176 B/op	       3 allocs/op
BenchmarkUDP/1024-2                  	 1553750	      3825 ns/op	 267.72 MB/s	         0 %lost	     176 B/op	       3 allocs/op
BenchmarkTCP/32-2                    	11056336	       503.2 ns/op	  63.60 MB/s	         0 %lost	       0 B/op	       0 allocs/op
BenchmarkTCP/124-2                   	11074869	       533.7 ns/op	 232.32 MB/s	         0 %lost	       0 B/op	       0 allocs/op
BenchmarkTCP/1024-2                  	 8934968	       671.4 ns/op	1525.20 MB/s	         0 %lost	       0 B/op	       0 allocs/op
BenchmarkWireGuardTest/32-2          	 1403702	      4547 ns/op	   7.04 MB/s	        14.37 %lost	     467 B/op	       3 allocs/op
BenchmarkWireGuardTest/124-2         	  780645	      7927 ns/op	  15.64 MB/s	         1.537 %lost	     420 B/op	       3 allocs/op
BenchmarkWireGuardTest/1024-2        	  512671	     11791 ns/op	  86.85 MB/s	         0.5206 %lost	     411 B/op	       3 allocs/op
PASS
ok  	tailscale.com/wgengine/bench	195.724s

Updates #414.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
3 years ago
Brad Fitzpatrick 939861773d net/tstun: accept peerapi connections through the filter
Fixes tailscale/corp#1545

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson bc4381447f net/tstun: return the real interface name at device creation.
This is usually the same as the requested interface, but on some
unixes can vary based on device number allocation, and on Windows
it's the GUID instead of the pretty name, since everything relating
to configuration wants the GUID.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Brad Fitzpatrick 41e4e02e57 net/{packet,tstun}: send peerapi port in TSMP pongs
For discovery when an explicit hostname/IP is known. We'll still
also send it via control for finding peers by a list.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson 25e0bb0a4e net/tstun: rename wrap_windows.go to tun_windows.go.
The code has nothing to do with wrapping, it's windows-specific
driver initialization code.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 22d53fe784 net/tstun: document exported function.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 016de16b2e net/tstun: rename TUN to Wrapper.
The tstun packagen contains both constructors for generic tun
Devices, and a wrapper that provides additional functionality.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago