tailscale

Commit Graph

Author	SHA1	Message	Date
Josh Bleecher Snyder	7ee891f5fd	all: delete wgcfg.Key and wgcfg.PrivateKey For historical reasons, we ended up with two near-duplicate copies of curve25519 key types, one in the wireguard-go module (wgcfg) and one in the tailscale module (types/wgkey). Then we moved wgcfg to the tailscale module. We can now remove the wgcfg key type in favor of wgkey. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	9d542e08e2	wgengine/magicsock: always run ReceiveIPv6 One of the consequences of the bind refactoring in `6f23087175` is that attempting to bind an IPv6 socket will always result in c.pconn6.pconn being non-nil. If the bind fails, it'll be set to a placeholder packet conn that blocks forever. As a result, we can always run ReceiveIPv6 and health check it. This removes IPv4/IPv6 asymmetry and also will allow health checks to detect any IPv6 receive func failures. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	fe50ded95c	health: track whether we have a functional udp4 bind Suggested-by: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	7dc7078d96	wgengine/magicsock: use netaddr.IP in listenPacket It must be an IP address; enforce that at the type level. Suggested-by: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	3c543c103a	wgengine/magicsock: unify initial bind and rebind We had two separate code paths for the initial UDP listener bind and any subsequent rebinds. IPv6 got left out of the rebind code. Rather than duplicate it there, unify the two code paths. Then improve the resulting code: * Rebind had nested listen attempts to try the user-specified port first, and then fall back to :0 if that failed. Convert that into a loop. * Initial bind tried only the user-specified port. Rebind tried the user-specified port and 0. But there are actually three ports of interest: The one the user specified, the most recent port in use, and 0. We now try all three in order, as appropriate. * In the extremely rare case in which binding to port 0 fails, use a dummy net.PacketConn whose reads block until close. This will keep the wireguard-go receive func goroutine alive. As a pleasant side-effect of this, if we decide that we need to resuscitate #1796, it will now be much easier. Fixes #1799 Co-authored-by: David Anderson <danderson@tailscale.com> Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
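The port fallback described in 3c543c103a can be sketched roughly as follows. This is a minimal illustration, not the magicsock code: the names `candidatePorts`, `bind`, and `blockForeverConn` are hypothetical, the loopback address is chosen only to keep the example self-contained, and the placeholder's behavior on writes is an assumption (the commit message only says that its reads block until close).

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"sync"
	"time"
)

// errPlaceholder is returned by writes on the placeholder conn (assumed behavior).
var errPlaceholder = errors.New("placeholder conn: not bound")

// blockForeverConn is a hypothetical stand-in net.PacketConn whose reads
// block until Close is called, keeping a receive goroutine alive.
type blockForeverConn struct {
	once   sync.Once
	closed chan struct{}
}

func newBlockForeverConn() *blockForeverConn {
	return &blockForeverConn{closed: make(chan struct{})}
}

func (c *blockForeverConn) ReadFrom(p []byte) (int, net.Addr, error) {
	<-c.closed // block until Close
	return 0, nil, net.ErrClosed
}

func (c *blockForeverConn) WriteTo(p []byte, addr net.Addr) (int, error) {
	return 0, errPlaceholder
}

func (c *blockForeverConn) Close() error {
	c.once.Do(func() { close(c.closed) })
	return nil
}

func (c *blockForeverConn) LocalAddr() net.Addr                { return &net.UDPAddr{} }
func (c *blockForeverConn) SetDeadline(t time.Time) error      { return nil }
func (c *blockForeverConn) SetReadDeadline(t time.Time) error  { return nil }
func (c *blockForeverConn) SetWriteDeadline(t time.Time) error { return nil }

// candidatePorts lists the ports to try, in order: the user-specified port,
// the most recently used port, and finally 0 (let the OS choose).
// Unset (zero) and duplicate entries are skipped.
func candidatePorts(userPort, lastPort uint16) []uint16 {
	var ports []uint16
	if userPort != 0 {
		ports = append(ports, userPort)
	}
	if lastPort != 0 && lastPort != userPort {
		ports = append(ports, lastPort)
	}
	return append(ports, 0)
}

// bind tries each candidate port in turn and falls back to the blocking
// placeholder if even port 0 cannot be bound.
func bind(network string, userPort, lastPort uint16) net.PacketConn {
	for _, port := range candidatePorts(userPort, lastPort) {
		pc, err := net.ListenPacket(network, net.JoinHostPort("127.0.0.1", fmt.Sprint(port)))
		if err == nil {
			return pc
		}
	}
	return newBlockForeverConn()
}

func main() {
	pc := bind("udp4", 0, 0)
	defer pc.Close()
	fmt.Println("bound to", pc.LocalAddr())
}
```

Using one loop for both the initial bind and later rebinds means IPv4 and IPv6 can share the same path, which is the point of the unification.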
Josh Bleecher Snyder	8fb66e20a4	wgengine/magicsock: remove DefaultPort const Assume it'll stay at 0 forever, so hard-code it and delete code conditional on it being non-0. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	a8f61969b9	wgengine/magicsock: remove context arg from listenPacket It was set to context.Background by all callers, for the same reasons. Set it locally instead, to simplify call sites. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Brad Fitzpatrick	bb2141e0cf	wgengine: periodically poll engine status for logging side effect Fixes tailscale/corp#1560 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	3 years ago
Brad Fitzpatrick	3c9dea85e6	wgengine: update a log line from 'weird' to conventional 'unexpected' Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	3 years ago
Josh Bleecher Snyder	744de615f1	health, wgengine: fix receive func health checks for the fourth time The old implementation knew too much about how wireguard-go worked. As a result, it missed genuine problems that occurred due to unrelated bugs. This fourth attempt to fix the health checks takes a black box approach. A receive func is healthy if one (or both) of these conditions holds: * It is currently running and blocked. * It has been executed recently. The second condition is required because receive functions are not continuously executing. wireguard-go calls them and then processes their results before calling them again. There is a theoretical false positive if wireguard-go takes longer than one minute to process the results of a receive func execution. If that happens, we have other problems. Updates #1790 Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	0d4c8cb2e1	health: delete ReceiveFunc health checks They were not doing their job. They need yet another conceptual re-think. Start by clearing the decks. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	99705aa6b7	net/tstun: split TUN events channel into up/down and MTU We had a long-standing bug in which our TUN events channel was being received from simultaneously in two places. The first is wireguard-go. At wgengine/userspace.go:366, we pass e.tundev to wireguard-go, which starts a goroutine (RoutineTUNEventReader) that receives from that channel and uses events to adjust the MTU and bring the device up/down. At wgengine/userspace.go:374, we launch a goroutine that receives from e.tundev, logs MTU changes, and triggers state updates when up/down changes occur. Events were getting delivered haphazardly between the two of them. We don't really want wireguard-go to receive the up/down events; we control the state of the device explicitly by calling device.Up. And the userspace.go loop MTU logging duplicates logging that wireguard-go does when it receives MTU updates. So this change splits the single TUN events channel into up/down and other (aka MTU), and sends them to the parties that ought to receive them. I'm actually a bit surprised that this hasn't caused more visible trouble. If a down event went to wireguard-go but the subsequent up event went to userspace.go, we could end up with the wireguard-go device disappearing. I believe that this may also (somewhat accidentally) be a fix for #1790. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
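The problem in 99705aa6b7 is that two goroutines receiving from one Go channel each see only an arbitrary subset of the values. A minimal sketch of the fix, splitting one event stream into two dedicated channels, might look like this. The `event` type, its constants, and `splitEvents` are hypothetical stand-ins, not the net/tstun or wireguard-go definitions, and the output buffering is an assumption made to keep the example simple.

```go
package main

import "fmt"

// event is a hypothetical bit-flag event type, defined here only to keep
// the sketch self-contained.
type event int

const (
	eventUp event = 1 << iota
	eventDown
	eventMTUUpdate
)

// splitEvents reads every event from src exactly once and routes it to one
// of two output channels: up/down events to the first, everything else
// (such as MTU updates) to the second. Both outputs are closed when src is
// closed. buf sets the buffer size of each output channel.
func splitEvents(src <-chan event, buf int) (upDown, other <-chan event) {
	ud := make(chan event, buf)
	o := make(chan event, buf)
	go func() {
		defer close(ud)
		defer close(o)
		for ev := range src {
			if ev&(eventUp|eventDown) != 0 {
				ud <- ev
			} else {
				o <- ev
			}
		}
	}()
	return ud, o
}

func main() {
	src := make(chan event, 3)
	src <- eventUp
	src <- eventMTUUpdate
	src <- eventDown
	close(src)

	upDown, other := splitEvents(src, 3)
	for ev := range upDown {
		fmt.Println("up/down consumer got", ev)
	}
	for ev := range other {
		fmt.Println("MTU consumer got", ev)
	}
}
```

With a single reader doing the routing, each consumer sees all events of its kind, so a down event and the following up event can no longer land in different places.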
Avery Pennarun	a7fe1d7c46	wgengine/bench: improved rate selection. The old decay-based one took a while to converge. This new one (based very loosely on TCP BBR) seems to converge quickly on what seems to be the best speed. Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>	3 years ago
Avery Pennarun	a92b9647c5	wgengine/bench: speed test for channels, sockets, and wireguard-go. This tries to generate traffic at a rate that will saturate the receiver, without overdoing it, even in the event of packet loss. It's unrealistically more aggressive than TCP (which will back off quickly in case of packet loss) but less silly than a blind test that just generates packets as fast as it can (which can cause all the CPU to be absorbed by the transmitter, giving an incorrect impression of how much capacity the total system has). Initial indications are that a syscall about every 10 packets (TCP bulk delivery) is roughly the same speed as sending every packet through a channel. A syscall per packet is about 5x-10x slower than that. The whole tailscale wireguard-go + magicsock + packet filter combination is about 4x slower again, which is better than I thought we'd do, but probably has room for improvement. Note that in "full" tailscale, there is also a tundev read/write for every packet, effectively doubling the syscall overhead per packet. Given these numbers, it seems like read/write syscalls are only 25-40% of the total CPU time used in tailscale proper, so we do have significant non-syscall optimization work to do too. Sample output:
$ GOMAXPROCS=2 go test -bench . -benchtime 5s ./cmd/tailbench
goos: linux
goarch: amd64
pkg: tailscale.com/cmd/tailbench
cpu: Intel(R) Core(TM) i7-4785T CPU @ 2.20GHz
BenchmarkTrivialNoAlloc/32-2 56340248 93.85 ns/op 340.98 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivialNoAlloc/124-2 57527490 99.27 ns/op 1249.10 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivialNoAlloc/1024-2 52537773 111.3 ns/op 9200.39 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivial/32-2 41878063 135.6 ns/op 236.04 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivial/124-2 41270439 138.4 ns/op 896.02 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivial/1024-2 36337252 154.3 ns/op 6635.30 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkBlockingChannel/32-2 12171654 494.3 ns/op 64.74 MB/s 0 %lost 1791 B/op 0 allocs/op
BenchmarkBlockingChannel/124-2 12149956 507.8 ns/op 244.17 MB/s 0 %lost 1792 B/op 1 allocs/op
BenchmarkBlockingChannel/1024-2 11034754 528.8 ns/op 1936.42 MB/s 0 %lost 1792 B/op 1 allocs/op
BenchmarkNonlockingChannel/32-2 8960622 2195 ns/op 14.58 MB/s 8.825 %lost 1792 B/op 1 allocs/op
BenchmarkNonlockingChannel/124-2 3014614 2224 ns/op 55.75 MB/s 11.18 %lost 1792 B/op 1 allocs/op
BenchmarkNonlockingChannel/1024-2 3234915 1688 ns/op 606.53 MB/s 3.765 %lost 1792 B/op 1 allocs/op
BenchmarkDoubleChannel/32-2 8457559 764.1 ns/op 41.88 MB/s 5.945 %lost 1792 B/op 1 allocs/op
BenchmarkDoubleChannel/124-2 5497726 1030 ns/op 120.38 MB/s 12.14 %lost 1792 B/op 1 allocs/op
BenchmarkDoubleChannel/1024-2 7985656 1360 ns/op 752.86 MB/s 13.57 %lost 1792 B/op 1 allocs/op
BenchmarkUDP/32-2 1652134 3695 ns/op 8.66 MB/s 0 %lost 176 B/op 3 allocs/op
BenchmarkUDP/124-2 1621024 3765 ns/op 32.94 MB/s 0 %lost 176 B/op 3 allocs/op
BenchmarkUDP/1024-2 1553750 3825 ns/op 267.72 MB/s 0 %lost 176 B/op 3 allocs/op
BenchmarkTCP/32-2 11056336 503.2 ns/op 63.60 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTCP/124-2 11074869 533.7 ns/op 232.32 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTCP/1024-2 8934968 671.4 ns/op 1525.20 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkWireGuardTest/32-2 1403702 4547 ns/op 7.04 MB/s 14.37 %lost 467 B/op 3 allocs/op
BenchmarkWireGuardTest/124-2 780645 7927 ns/op 15.64 MB/s 1.537 %lost 420 B/op 3 allocs/op
BenchmarkWireGuardTest/1024-2 512671 11791 ns/op 86.85 MB/s 0.5206 %lost 411 B/op 3 allocs/op
PASS
ok tailscale.com/wgengine/bench 195.724s
Updates #414. Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>	3 years ago
Maisem Ali	590792915a	wgengine/router{win}: ignore broadcast routes added by Windows when removing routes. Signed-off-by: Maisem Ali <maisem@tailscale.com>	3 years ago
Josh Bleecher Snyder	8d7f7fc7ce	health, wgengine: fix receive func health checks yet again The existing implementation was completely, embarrassingly conceptually broken. We aren't able to see whether wireguard-go's receive function goroutines are running or not. All we can do is model that based on what we have done. This commit fixes that model. Fixes #1781 Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	5835a3f553	health, wgengine/magicsock: avoid receive function false positives Avery reported a sub-ms health transition from "receiveIPv4 not running" to "ok". To avoid these transient false-positives, be more precise about the expected lifetime of receive funcs. The problematic case is one in which they were started but exited prior to a call to connBind.Close. Explicitly represent started vs running state, taking care with the order of updates. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	f845aae761	health: track whether magicsock receive functions are running Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Brad Fitzpatrick	12b4672add	wgengine: quiet connection failure diagnostics for exit nodes The connection failure diagnostic code was never updated enough for exit nodes, so disable its misleading output when the node it picks (incorrectly) to diagnose is only an exit node. Fixes #1754 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	3 years ago
Josh Bleecher Snyder	a29b0cf55f	wgengine/wglog: allow wireguard-go receive routines to log I've spent two days searching for a theoretical wireguard-go bug around receive functions exiting early. I've found many bugs, but none of the flavor we're looking for. Restore wireguard-go's logging around starting and stopping receive functions, so that we can definitively rule in or out this particular theory. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	eb2a9d4ce3	wgengine/netstack: log error when acceptUDP fails I see a bunch of these in some logs I'm looking at, separated only by a few seconds. Log the error so we can tell what's going on here. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Naman Sood	4a90a91d29	wgengine/netstack: log ForwarderRequest in readable form, only in debug mode (#1758) Fixes #1757 Signed-off-by: Naman Sood <mail@nsood.in>	3 years ago
Josh Bleecher Snyder	07c95a0219	wgengine/wgcfg/nmcfg: consolidate exit node log lines These were getting rate-limited for nodes with many peers. Consolidate the output into single lines, which are nicer anyway. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	48e30bb8de	wgengine/magicsock: remove named return Doesn't add anything. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	a2a2c0ce1c	wgengine/magicsock: fix two comments Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	b1e624ef04	wgengine/magicsock: remove unnecessary type assertions Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	98714e784b	wgengine/magicsock: improve Rebind logging We were accidentally logging oldPort -> oldPort. Log oldPort as well as c.port; if we failed to get the preferred port in a previous rebind, oldPort might differ from c.port. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Josh Bleecher Snyder	15ceacc4c5	wgengine/magicsock: accept a host and port instead of an addr in listenPacket This simplifies call sites and prevents accidental failure to use net.JoinHostPort. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>	3 years ago
Brad Fitzpatrick	b993d9802a	ipn/ipnlocal, etc: require file sharing capability to send/recv files tailscale/corp#1582 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	4 years ago
Maisem Ali	4f3203556d	wgengine/router: add the Tailscale ULA route on darwin. Signed-off-by: Maisem Ali <maisem@tailscale.com>	4 years ago
Brad Fitzpatrick	762180595d	ipn/ipnstate: add PeerStatus.TailscaleIPs slice, deprecate TailAddr Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	4 years ago
Brad Fitzpatrick	34d2f5a3d9	tailcfg: add Endpoint, EndpointType, MapRequest.EndpointType Track endpoints internally with a new tailcfg.Endpoint type that includes a typed netaddr.IPPort (instead of just a string) and includes a type for how that endpoint was discovered (STUN, local, etc). Use []tailcfg.Endpoint instead of []string internally. At the last second, send it to the control server as the existing []string for endpoints, but also include a new parallel MapRequest.EndpointType []tailcfg.EndpointType, so the control server can start filtering out less-important endpoint changes from new-enough clients. Notably, STUN-discovered endpoints can be filtered out from 1.6+ clients, as they can discover them amongst each other via CallMeMaybe disco exchanges started over DERP. And STUN endpoints change a lot, causing a lot of MapResponse updates. But portmapped endpoints are worth keeping for now, as they work right away without requiring the firewall traversal extra RTT dance. End result will be less control->client bandwidth (despite a negligible increase in client->control bandwidth). Updates tailscale/corp#1543 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	4 years ago
Maisem Ali	1b9d8771dc	ipn/ipnlocal,wgengine/router,cmd/tailscale: add flag to allow local lan access when routing traffic via an exit node. For #1527 Signed-off-by: Maisem Ali <maisem@tailscale.com>	4 years ago
David Anderson	854d5d36a1	net/dns: return error from NewOSManager, use it to initialize NM. Signed-off-by: David Anderson <danderson@tailscale.com>	4 years ago
Brad Fitzpatrick	d5d70ae9ea	wgengine/monitor: reduce Linux log spam on down Fixes #1689 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	4 years ago
David Anderson	84430cdfa1	net/dns: improve NetworkManager detection, using more DBus. Signed-off-by: David Anderson <danderson@tailscale.com>	4 years ago
David Anderson	19eca34f47	wgengine/router: fix FreeBSD configuration failure on the v6 /48. On FreeBSD, we add the interface IP as a /48 to work around a kernel bug, so we mustn't then try to add a /48 route to the Tailscale ULA, since that will fail as a dupe. Signed-off-by: David Anderson <danderson@tailscale.com>	4 years ago
David Anderson	4a64d2a603	net/dns: some post-review cleanups. Signed-off-by: David Anderson <danderson@tailscale.com>	4 years ago
David Anderson	68f76e9aa1	net/dns: add GetBaseConfig to OSConfigurator interface. Part of #953, required to make split DNS work on more basic platforms. Signed-off-by: David Anderson <danderson@tailscale.com>	4 years ago
Brad Fitzpatrick	d488678fdc	cmd/tailscaled, wgengine{,/netstack}: add netstack hybrid mode, add to Windows For #707 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	4 years ago
Denton Gentry	3089081349	monitor/polling: reduce Cloud Run polling interval. Cloud Run's routes never change at runtime. Don't poll it for route changes very often. Signed-off-by: Denton Gentry <dgentry@tailscale.com>	4 years ago
David Anderson	de6dc4c510	net/dns: add a Primary field to OSConfig. Currently ignored. Signed-off-by: David Anderson <danderson@tailscale.com>	4 years ago
David Anderson	7d84ee6c98	net/dns: unify the OS manager and internal resolver. Signed-off-by: David Anderson <danderson@tailscale.com>	4 years ago
David Anderson	1bf91c8123	net/dns/resolver: remove unused err return value. Signed-off-by: David Anderson <danderson@tailscale.com>	4 years ago
David Anderson	f007a9dd6b	health: add DNS subsystem and plumb errors in. Signed-off-by: David Anderson <danderson@tailscale.com>	4 years ago
David Anderson	4c61ebacf4	wgengine: move DNS configuration out of wgengine/router. Signed-off-by: David Anderson <danderson@tailscale.com>	4 years ago
Josh Bleecher Snyder	ba72126b72	wgengine/magicsock: remove RebindingUDPConn.FakeClosed It existed to work around the frequent opening and closing of the conn.Bind done by wireguard-go. The preceding commit removed that behavior, so we can simply close the connections when we are done with them. Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>	4 years ago
Josh Bleecher Snyder	69cdc30c6d	wgengine/wgcfg: remove Config.ListenPort We don't use the port that wireguard-go passes to us (via magicsock.connBind.Open). We ignore it entirely and use the port we selected. When we tell wireguard-go that we're changing the listen_port, it calls connBind.Close and then connBind.Open. And in the meantime, it stops calling the receive functions, which means that we stop receiving and processing UDP and DERP packets. And that is Very Bad. That was never a problem prior to `b3ceca1dd7`, because we passed the SkipBindUpdate flag to our wireguard-go fork, which told wireguard-go not to re-bind on listen_port changes. That commit eliminated the SkipBindUpdate flag. We could write a bunch of code to work around the gap. We could add background readers that process UDP and DERP packets when wireguard-go isn't. But it's simpler to never create the conditions in which wireguard-go rebinds. The other scenario in which wireguard-go re-binds is device.Down. Conveniently, we never call device.Down. We go from device.Up to device.Close, and the latter only when we're shutting down a magicsock.Conn completely. Rubber-ducked-by: Avery Pennarun <apenwarr@tailscale.com> Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>	4 years ago
David Anderson	27a1a2976a	wgengine/router: add a CallbackRouter shim. The shim implements both network and DNS configurators, and feeds both into a single callback that receives both configs. Signed-off-by: David Anderson <danderson@tailscale.com>	4 years ago
Josh Bleecher Snyder	b3ceca1dd7	wgengine/...: split into multiple receive functions Upstream wireguard-go has changed its receive model. NewDevice now accepts a conn.Bind interface. The conn.Bind is stateless; magicsock.Conns are stateful. To work around this, we add a connBind type that supports cheap teardown and bring-up, backed by a Conn. The new conn.Bind allows us to specify a set of receive functions, rather than having to shoehorn everything into ReceiveIPv4 and ReceiveIPv6. This lets us plumb DERP messages directly into wireguard-go, instead of having to mux them via ReceiveIPv4. One consequence of the new conn.Bind layer is that closing the wireguard-go device is now indistinguishable from the routine bring-up and tear-down normally experienced by a conn.Bind. We thus have to explicitly close the magicsock.Conn when we close the wireguard-go device. One downside of this change is that we are reliant on wireguard-go to call receiveDERP to process DERP messages. This is fine for now, but is perhaps something we should fix in the future. Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>	4 years ago


805 Commits (f342d10dc5e8b0494ba069b380e9b8c00d2792c8)