Don't set all the *.arpa. reverse DNS lookup domains if systemd-resolved
is old and can't handle them.
Fixes#3188
Change-Id: I283f8ce174daa8f0a972ac7bfafb6ff393dde41d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There are a few remaining uses of testing.AllocsPerRun:
Two in which we only log the number of allocations,
and one in which dynamically calculate the allocations
target based on a different AllocsPerRun run.
This also allows us to tighten the "no allocs"
test in wgengine/filter.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Now that we multicast the SSDP query, we can get IGD offers from
devices other than the current device's default gateway. We don't want
to accidentally bind ourselves to those.
Updates #3197
Signed-off-by: David Anderson <danderson@tailscale.com>
And the derper change to add a CORS endpoint for latency measurement.
And a little magicsock change to cut down some log spam on js/wasm.
Updates #3157
Change-Id: I5fd9e6f5098c815116ddc8ac90cbcd0602098a48
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There are /etc/resolv.conf files out there where resolvconf wrote
the file but pointed to systemd-resolved as the nameserver.
We're better off handling those as systemd-resolved.
> # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
> # DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
> # 127.0.0.53 is the systemd-resolved stub resolver.
> # run "systemd-resolve --status" to see details about the actual nameservers.
Fixes https://github.com/tailscale/tailscale/issues/3026
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
In some containers, /etc/resolv.conf is a bind-mount from outside the container.
This prevents renaming to or from /etc/resolv.conf, because it's on a different
filesystem from linux's perspective. It also prevents removing /etc/resolv.conf,
because doing so would break the bind-mount.
If we find ourselves within this environment, fall back to using copy+delete when
renaming to /etc/resolv.conf, and copy+truncate when renaming from /etc/resolv.conf.
Fixes#3000
Co-authored-by: Denton Gentry <dgentry@tailscale.com>
Signed-off-by: David Anderson <danderson@tailscale.com>
The "go generate" command blindly looks for "//go:generate" anywhere
in the file regardless of whether it is truly a comment.
Prevent this false positive in cloner.go by mangling the string
to look less like "//go:generate".
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
When a DNS server claims to be unable or unwilling to handle a request,
instead of passing that refusal along to the client, just treat it as
any other error trying to connect to the DNS server. This prevents DNS
requests from failing based on if a server can respond with a transient
error before another server is able to give an actual response. DNS
requests only failing *sometimes* is really hard to find the cause of
(#1033).
Signed-off-by: Smitty <me@smitop.com>
"skipping portmap; gateway range likely lacks support" is really
spammy on cloud systems, and not very useful in debugging.
Fixes https://github.com/tailscale/tailscale/issues/3034
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
We added the initial handling only for macOS and iOS.
With 1.16.0 now released, suppress forwarding DNS-SD
on all platforms to test it through the 1.17.x cycle.
Updates #2442
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
On iOS (and possibly other platforms), sometimes our UDP socket would
get stuck in a state where it was bound to an invalid interface (or no
interface) after a network reconfiguration. We can detect this by
actually checking the error codes from sending our STUN packets.
If we completely fail to send any STUN packets, we know something is
very broken. So on the next STUN attempt, let's rebind the UDP socket
to try to correct any problems.
This fixes a problem where iOS would sometimes get stuck using DERP
instead of direct connections until the backend was restarted.
Fixes#2994
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
I forgot to include this file in the earlier
7cf8ec8108 commit.
This exists purely to keep "go mod tidy" happy.
Updates #1609
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We still try the host's x509 roots first, but if that fails (like if
the host is old), we fall back to using LetsEncrypt's root and
retrying with that.
tlsdial was used in the three main places: logs, control, DERP. But it
was missing in dnsfallback. So added it there too, so we can run fine
now on a machine with no DNS config and no root CAs configured.
Also, move SSLKEYLOGFILE support out of DERP. tlsdial is the logical place
for that support.
Fixes#1609
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
DNSSEC is an availability issue, as recently demonstrated by the
Slack issue, with limited security advantage. DoH on the other hand
is a critical security upgrade. This change adds DoH support for the
non-DNSSEC endpoints of Quad9.
https://www.quad9.net/service/service-addresses-and-features#unsec
Signed-off-by: Filippo Valsorda <hi@filippo.io>
It was in the wrong filter direction before, per CPU profiles
we now have.
Updates #1526 (maybe fixes? time will tell)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Windows has a public dns.Flush used in router_windows.go.
However that won't work for platforms like Linux, where
we need a different flush mechanism for resolved versus
other implementations.
We're instead adding a FlushCaches method to the dns Manager,
which can be made to work on all platforms as needed.
Fixes https://github.com/tailscale/tailscale/issues/2132
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
The earlier 382b349c54 was too late,
as engine creation itself needed to listen on things.
Fixes#2827
Updates #2822
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We currently plumb full URLs for DNS resolvers from the control server
down to the client. But when we pass the values into the net/dns
package, we throw away any URL that isn't a bare IP. This commit
continues the plumbing, and gets the URL all the way to the built in
forwarder. (It stops before plumbing URLs into the OS configurations
that can handle them.)
For #2596
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
Reported on IRC: in an edge case, you can end up with a directManager DNS
manager and --accept-dns=false, in which case we should do nothing, but
actually end up restarting resolved whenever the netmap changes, even though
the user told us to not manage DNS.
Signed-off-by: David Anderson <danderson@tailscale.com>
Reported on IRC: a resolv.conf that contained two entries for
"nameserver 127.0.0.53", which defeated our "is resolved actually
in charge" check. Relax that check to allow any number of nameservers,
as long as they're all 127.0.0.53.
Signed-off-by: David Anderson <danderson@tailscale.com>
It wasn't using the right metric. Apparently you're supposed to sum the route
metric and interface metric. Whoops.
While here, optimize a few little things too, not that this code
should be too hot.
Fixes#2707 (at least; probably dups but I'm failing to find)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Now that we have the easier-to-parse go:build build tags,
it is straightforward to simplify them. Yay.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Mostly so the Linux one can use Linux-specific stuff in package
syscall and not use os/exec for uname for portability.
But also it helps deps a tiny bit on iOS.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This logs some basic statistics for UPnP, so that tailscale can better understand what routers
are being used and how to connect to them.
Signed-off-by: julianknodt <julianknodt@gmail.com>
This adds a PCP test to the IGD test server, by hardcoding in a few observed packets from
Denton's box.
Signed-off-by: julianknodt <julianknodt@gmail.com>
And use dynamic port numbers in tests, as Linux on GitHub Actions and
Windows in general have things running on these ports.
Co-Author: Julian Knodt <julianknodt@gmail.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously, we hashed the question and combined it with the original
txid which was useful when concurrent queries were multiplexed on a
single local source port. We encountered some situations where the DNS
server canonicalizes the question in the response (uppercase converted
to lowercase in this case), which resulted in responses that we couldn't
match to the original request due to hash mismatches. This includes a
new test to cover that situation.
Fixes#2597
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
PCP handles external IPs by allowing the client to specify them in the packet, which is more
explicit than requiring 2 packets from PMP, so allow for future changes to add it in easily.
Signed-off-by: julianknodt <julianknodt@gmail.com>
Still very much a prototype (hard-coded IPs, etc) but should be
non-invasive enough to submit at this point and iterate from here.
Updates #2589
Co-Author: David Crawshaw <crawshaw@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Prior to Tailscale 1.12 it detected UPnP on any port.
Starting with Tailscale 1.11.x, it stopped detecting UPnP on all ports.
Then start plumbing its discovered Location header port number to the
code that was assuming port 5000.
Fixes#2109
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There's a call to Now once per packet.
Move to mono.Now.
Though the current implementation provides high precision,
we document it to be coarse, to preserve the ability
to switch to a coarse monotonic time later.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Go 1.17 switches to a register ABI on amd64 platforms.
Part of that switch is that go and defer calls use an argument-less
closure, which allocates. This means that we have an extra
alloc in some DNS work. That's unfortunate but not a showstopper,
and I don't see a clear path to fixing it.
The other performance benefits from the register ABI will all
but certainly outweigh this extra alloc.
Fixes#2545
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
I don't know how to get access to a real packet. Basing this commit
entirely off:
+------------+--------------+------------------------------+
| Field Name | Field Type | Description |
+------------+--------------+------------------------------+
| NAME | domain name | MUST be 0 (root domain) |
| TYPE | u_int16_t | OPT (41) |
| CLASS | u_int16_t | requestor's UDP payload size |
| TTL | u_int32_t | extended RCODE and flags |
| RDLEN | u_int16_t | length of all RDATA |
| RDATA | octet stream | {attribute,value} pairs |
+------------+--------------+------------------------------+
From https://datatracker.ietf.org/doc/html/rfc6891#section-6.1.2
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
The handoff between tstun.Wrap's Read and poll methods
is one of the per-packet hotspots. It shows up in pprof.
Making outbound buffered increases throughput.
It is hard to measure exactly how much, because the numbers
are highly variable, but I'd estimate it at about 1%,
using the best observed max throughput across three runs.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
The handoff between tstun.Wrap's Read and poll methods
is one of the per-packet hotspots. It shows up in pprof.
Making outbound buffered increases throughput.
It is hard to measure exactly how much, because the numbers
are highly variable, but I'd estimate it at about 1%,
using the best observed max throughput across three runs.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Tested manually with:
$ go test -v ./net/dnscache/ -dial-test=bogusplane.dev.tailscale.com:80
Where bogusplane has three A records, only one of which works.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Instead of blasting away at all upstream resolvers at the same time,
make a timing plan upon reconfiguration and have each upstream have an
associated start delay, depending on the overall forwarding config.
So now if you have two or four upstream Google or Cloudflare DNS
servers (e.g. two IPv4 and two IPv6), we now usually only send a
query, not four.
This is especially nice on iOS where we start fewer DoH queries and
thus fewer HTTP/1 requests (because we still disable HTTP/2 on iOS),
fewer sockets, fewer goroutines, and fewer associated HTTP buffers,
etc, saving overall memory burstiness.
Fixes#2436
Updates tailscale/corp#2250
Updates tailscale/corp#2238
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a place to hang state in a future change for #2436.
For now this just simplifies the send signature without
any functional change.
Updates #2436
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously, this was incorrectly returning the internal port, and using that with the external
exposed IP when it did not use WANIPConnection2. In the case when we must provide a port, we
return it instead.
Noticed this while implementing the integration test for upnp.
Signed-off-by: julianknodt <julianknodt@gmail.com>
It was a huge chunk of the overall log output and made debugging
difficult. Omit and summarize the spammy *.arpa parts instead.
Fixestailscale/corp#2066 (to which nobody had opinions, so)
Add in UPnP portmapping, using goupnp library in order to get the UPnP client and run the
portmapping functions. This rips out anywhere where UPnP used to be in portmapping, and has a
flow separate from PMP and PCP.
RELNOTE=portmapper now supports UPnP mappings
Fixes#682
Updates #2109
Signed-off-by: julianknodt <julianknodt@gmail.com>
Recognize Cloudflare, Google, Quad9 which are by far the
majority of upstream DNS servers that people use.
RELNOTE=MagicDNS now uses DNS-over-HTTPS when querying popular upstream resolvers,
so DNS queries aren't sent in the clear over the Internet.
Updates #915 (might fix it?)
Updates #988 (gets us closer, if it fixes Android)
Updates #74 (not yet configurable, but progress)
Updates #2056 (not yet configurable, dup of #74?)
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Added the net/speedtest package that contains code for starting up a
speedtest server and a client. The speedtest command for starting a
client takes in a duration for the speedtest as well as the host and
port of the speedtest server to connect to. The speedtest command for
starting a server takes in a host:port pair to listen on.
Signed-off-by: Aaditya Chaudhary <32117362+AadityaChaudhary@users.noreply.github.com>
With netns handling localhost now, existing tests no longer
need special handling. The tests set up their connections to
localhost, and the connections work without fuss.
Remove the special handling for tests.
Also remove the hostinfo.TestCase support, since this was
the only use of it. It can be added back later if really
needed, but it would be better to try to make tests work
without special cases.
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
netns_linux checked whether "ip rule" could run to determine
whether to use SO_MARK for network namespacing. However in
Linux environments which lack CAP_NET_ADMIN, such as various
container runtimes, the "ip rule" command succeeds but SO_MARK
fails due to lack of permission. SO_BINDTODEVICE would work in
these environments, but isn't tried.
In addition to running "ip rule" check directly whether SO_MARK
works or not. Among others, this allows Microsoft Azure App
Service and AWS App Runner to work.
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
Connections to a control server or log server on localhost,
used in a number of tests, are working right now because the
calls to SO_MARK in netns fail for non-root but then we ignore
the failure when running in tests.
Unfortunately that failure in SO_MARK also affects container
environments without CAP_NET_ADMIN, breaking Tailscale
connectivity. We're about to fix netns to recognize when SO_MARK
doesn't work and use SO_BINDTODEVICE instead. Doing so makes
tests fail, as their sockets now BINDTODEVICE of the default
route and cannot connect to localhost.
Add support to skip namespacing for localhost connections,
which Darwin and Windows already do. This is not conditional
on running within a test, if you tell tailscaled to connect
to localhost it will automatically use a non-namespaced
socket to do so.
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
To remove some multi-case selects, we intentionally allowed
sends on closed channels (cc23049cd2).
However, we also introduced concurrent sends and closes,
which is a data race.
This commit fixes the data race. The mutexes here are uncontended,
and thus very cheap.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Calculate whether the packet is injected directly,
rather than via an else branch.
Unify the exit paths. It is easier here than duplicating them.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
Every TUN Read went through several multi-case selects.
We know from past experience with wireguard-go that these are slow
and cause scheduler churn.
The selects served two purposes: they separated errors from data and
gracefully handled shutdown. The first is fairly easy to replace by sending
errors and data over a single channel. The second, less so.
We considered a few approaches: Intricate webs of channels,
global condition variables. They all get ugly fast.
Instead, let's embrace the ugly and handle shutdown ungracefully.
It's horrible, but the horror is simple and localized.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
We also have to make a one-off change to /etc/wsl.conf to stop every
invocation of wsl.exe clobbering the /etc/resolv.conf. This appears to
be a safe change to make permanently, as even though the resolv.conf is
constantly clobbered, it is always the same stable internal IP that is
set as a nameserver. (I believe the resolv.conf clobbering predates the
MS stub resolver.)
Tested on WSL2, should work for WSL1 too.
Fixes#775
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
This is preliminary work for using the directManager as
part of a wslManager on windows, where in addition to configuring
windows we'll use wsl.exe to edit the linux file system and modify the
system resolv.conf.
The pinholeFS is a little funky, but it's designed to work through
simple unix tools via wsl.exe without invoking bash. I would not have
thought it would stand on its own like this, but it turns out it's
useful for writing a test for the directManager.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
This has been bothering me for a while, but everytime I run format from the root directory
it also formats this file. I didn't want to add it to my other PRs but it's annoying to have to
revert it every time.
Signed-off-by: julianknodt <julianknodt@gmail.com>
Move derpmap.Prod to a static JSON file (go:generate'd) instead,
to make its role explicit. And add a TODO about making dnsfallback
use an update-over-time DERP map file instead of a baked-in one.
Updates #1264
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This change (subject to some limitations) looks for the EDNS OPT record
in queries and responses, clamping the size field to fit within our DNS
receive buffer. If the size field is smaller than the DNS receive buffer
then it is left unchanged.
I think we will eventually need to transition to fully processing the
DNS queries to handle all situations, but this should cover the most
common case.
Mostly fixes#2066
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
Windows 8.1 incorrectly handles search paths on an interface with no
associated resolver, so we have to provide a full primary DNS config
rather than use Windows 8.1's nascent-but-present NRPT functionality.
Fixes#2237.
Signed-off-by: David Anderson <danderson@tailscale.com>
The only connectivity an AWS Lambda container has is an IPv4 link-local
169.254.x.x address using NAT:
12: vtarget_1@if11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
qdisc noqueue state UP group default qlen 1000
link/ether 7e:1c:3f:00:00:00 brd ff:ff:ff:ff:ff:ff link-netnsid 1
inet 169.254.79.1/32 scope global vtarget_1
valid_lft forever preferred_lft forever
If there are no other IPv4/v6 addresses available, and we are running
in AWS Lambda, allow IPv4 169.254.x.x addresses to be used.
----
Similarly, a Google Cloud Run container's only connectivity is
a Unique Local Address fddf:3978:feb1:d745::c001/128.
If there are no other addresses available then allow IPv6
Unique Local Addresses to be used.
We actually did this in an earlier release, but now refactor it to
work the same way as the IPv4 link-local support is being done.
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
Split out of Denton's #2164, to make that diff smaller to review.
This change has no behavior changes.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It's possible to install a configuration that passes our current checks
for systemd-resolved, without actually pointing to systemd-resolved. In
that case, we end up programming DNS in resolved, but that config never
applies to any name resolution requests on the system.
This is quite a far-out edge case, but there's a simple additional check
we can do: if the header comment names systemd-resolved, there should be
a single nameserver in resolv.conf pointing to 127.0.0.53. If not, the
configuration should be treated as an unmanaged resolv.conf.
Fixes#2136.
Signed-off-by: David Anderson <danderson@tailscale.com>
This raises the maximum DNS response message size from 512 to 4095. This
should be large enough for almost all situations that do not need TCP.
We still do not recognize EDNS, so we will still forward requests that
claim support for a larger response size than 4095 (that will be solved
later). For now, when a response comes back that is too large to fit in
our receive buffer, we now set the truncation flag in the DNS header,
which is an improvement from before but will prompt attempts to use TCP
which isn't supported yet.
On Windows, WSARecvFrom into a buffer that's too small returns an error
in addition to the data. On other OSes, the extra data is silently
discarded. In this case, we prefer the latter so need to catch the error
on Windows.
Partially addresses #1123
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
We used to use "redo" for that, but it was pretty vague.
Also, fix the build tags broken in interfaces_default_route_test.go from
a9745a0b68, moving those Linux-specific
tests to interfaces_linux_test.go.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
netaddr allocated at the time this was written. No longer.
name old time/op new time/op delta
TailscaleServiceAddr-4 5.46ns ± 4% 1.83ns ± 3% -66.52% (p=0.008 n=5+5)
A bunch of the others can probably be simplified too, but this
was the only one with just an IP and not an IPPrefix.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Pull in the latest version of wireguard-windows.
Switch to upstream wireguard-go.
This requires reverting all of our import paths.
Unfortunately, this has to happen at the same time.
The wireguard-go change is very low risk,
as that commit matches our fork almost exactly.
(The only changes are import paths, CI files, and a go.mod entry.)
So if there are issues as a result of this commit,
the first place to look is wireguard-windows changes.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
This leads to a cleaner separation of intent vs. implementation
(Routes is now the only place specifying who handles DNS requests),
and allows for cleaner expression of a configuration that creates
MagicDNS records without serving them to the OS.
Signed-off-by: David Anderson <danderson@tailscale.com>
interfaces.Tailscale only returns an interface if it has at least one Tailscale
IP assigned to it. In the resolved DNS manager, when we're called upon to tear
down DNS config, the interface no longer has IPs.
Instead, look up the interface index on construction and reuse it throughout
the daemon lifecycle.
Fixes#1892.
Signed-off-by: David Anderson <dave@natulte.net>
This reverts commit 7d16c8228b.
I have no idea how I ended up here. The bug I was fixing with this change
fails to reproduce on Ubuntu 18.04 now, and this change definitely does
break 20.04, 20.10, and Debian Buster. So, until we can reliably reproduce
the problem this was meant to fix, reverting.
Part of #1875
Signed-off-by: David Anderson <dave@natulte.net>
Whenever we dropped a packet due to ACLs, wireguard-go was logging:
Failed to write packet to TUN device: packet dropped by filter
Instead, just lie to wireguard-go and pretend everything is okay.
Fixes#1229
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We had a long-standing bug in which our TUN events channel
was being received from simultaneously in two places.
The first is wireguard-go.
At wgengine/userspace.go:366, we pass e.tundev to wireguard-go,
which starts a goroutine (RoutineTUNEventReader)
that receives from that channel and uses events to adjust the MTU
and bring the device up/down.
At wgengine/userspace.go:374, we launch a goroutine that
receives from e.tundev, logs MTU changes, and triggers
state updates when up/down changes occur.
Events were getting delivered haphazardly between the two of them.
We don't really want wireguard-go to receive the up/down events;
we control the state of the device explicitly by calling device.Up.
And the userspace.go loop MTU logging duplicates logging that
wireguard-go does when it received MTU updates.
So this change splits the single TUN events channel into up/down
and other (aka MTU), and sends them to the parties that ought
to receive them.
I'm actually a bit surprised that this hasn't caused more visible trouble.
If a down event went to wireguard-go but the subsequent up event
went to userspace.go, we could end up with the wireguard-go device disappearing.
I believe that this may also (somewhat accidentally) be a fix for #1790.
Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
This tries to generate traffic at a rate that will saturate the
receiver, without overdoing it, even in the event of packet loss. It's
unrealistically more aggressive than TCP (which will back off quickly
in case of packet loss) but less silly than a blind test that just
generates packets as fast as it can (which can cause all the CPU to be
absorbed by the transmitter, giving an incorrect impression of how much
capacity the total system has).
Initial indications are that a syscall about every 10 packets (TCP bulk
delivery) is roughly the same speed as sending every packet through a
channel. A syscall per packet is about 5x-10x slower than that.
The whole tailscale wireguard-go + magicsock + packet filter
combination is about 4x slower again, which is better than I thought
we'd do, but probably has room for improvement.
Note that in "full" tailscale, there is also a tundev read/write for
every packet, effectively doubling the syscall overhead per packet.
Given these numbers, it seems like read/write syscalls are only 25-40%
of the total CPU time used in tailscale proper, so we do have
significant non-syscall optimization work to do too.
Sample output:
$ GOMAXPROCS=2 go test -bench . -benchtime 5s ./cmd/tailbench
goos: linux
goarch: amd64
pkg: tailscale.com/cmd/tailbench
cpu: Intel(R) Core(TM) i7-4785T CPU @ 2.20GHz
BenchmarkTrivialNoAlloc/32-2 56340248 93.85 ns/op 340.98 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivialNoAlloc/124-2 57527490 99.27 ns/op 1249.10 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivialNoAlloc/1024-2 52537773 111.3 ns/op 9200.39 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivial/32-2 41878063 135.6 ns/op 236.04 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivial/124-2 41270439 138.4 ns/op 896.02 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTrivial/1024-2 36337252 154.3 ns/op 6635.30 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkBlockingChannel/32-2 12171654 494.3 ns/op 64.74 MB/s 0 %lost 1791 B/op 0 allocs/op
BenchmarkBlockingChannel/124-2 12149956 507.8 ns/op 244.17 MB/s 0 %lost 1792 B/op 1 allocs/op
BenchmarkBlockingChannel/1024-2 11034754 528.8 ns/op 1936.42 MB/s 0 %lost 1792 B/op 1 allocs/op
BenchmarkNonlockingChannel/32-2 8960622 2195 ns/op 14.58 MB/s 8.825 %lost 1792 B/op 1 allocs/op
BenchmarkNonlockingChannel/124-2 3014614 2224 ns/op 55.75 MB/s 11.18 %lost 1792 B/op 1 allocs/op
BenchmarkNonlockingChannel/1024-2 3234915 1688 ns/op 606.53 MB/s 3.765 %lost 1792 B/op 1 allocs/op
BenchmarkDoubleChannel/32-2 8457559 764.1 ns/op 41.88 MB/s 5.945 %lost 1792 B/op 1 allocs/op
BenchmarkDoubleChannel/124-2 5497726 1030 ns/op 120.38 MB/s 12.14 %lost 1792 B/op 1 allocs/op
BenchmarkDoubleChannel/1024-2 7985656 1360 ns/op 752.86 MB/s 13.57 %lost 1792 B/op 1 allocs/op
BenchmarkUDP/32-2 1652134 3695 ns/op 8.66 MB/s 0 %lost 176 B/op 3 allocs/op
BenchmarkUDP/124-2 1621024 3765 ns/op 32.94 MB/s 0 %lost 176 B/op 3 allocs/op
BenchmarkUDP/1024-2 1553750 3825 ns/op 267.72 MB/s 0 %lost 176 B/op 3 allocs/op
BenchmarkTCP/32-2 11056336 503.2 ns/op 63.60 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTCP/124-2 11074869 533.7 ns/op 232.32 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkTCP/1024-2 8934968 671.4 ns/op 1525.20 MB/s 0 %lost 0 B/op 0 allocs/op
BenchmarkWireGuardTest/32-2 1403702 4547 ns/op 7.04 MB/s 14.37 %lost 467 B/op 3 allocs/op
BenchmarkWireGuardTest/124-2 780645 7927 ns/op 15.64 MB/s 1.537 %lost 420 B/op 3 allocs/op
BenchmarkWireGuardTest/1024-2 512671 11791 ns/op 86.85 MB/s 0.5206 %lost 411 B/op 3 allocs/op
PASS
ok tailscale.com/wgengine/bench 195.724s
Updates #414.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
NetworkManager fixed the bug that forced us to use NetworkManager
if it's programming systemd-resolved, and in the same release also
made NetworkManager ignore DNS settings provided for unmanaged
interfaces... Which breaks what we used to do. So, with versions
1.26.6 and above, we MUST NOT use NetworkManager to indirectly
program systemd-resolved, but thankfully we can talk to resolved
directly and get the right outcome.
Fixes#1788
Signed-off-by: David Anderson <danderson@tailscale.com>
Clear LLMNR and mdns flags, update reasoning for our settings,
and set our override priority harder than before when we want
to be primary resolver.
Signed-off-by: David Anderson <danderson@tailscale.com>
Debian resolvconf is not legacy, it's alive and well,
just historically before the other implementations.
Signed-off-by: David Anderson <danderson@tailscale.com>
This allows split-DNS configurations to not break clients on OSes that
haven't yet been ported to understand split DNS, by falling back to quad-9
as a global resolver when handed an "impossible to implement"
split-DNS config.
Part of #953. Needs to be removed before shipping 1.8.
Signed-off-by: David Anderson <danderson@tailscale.com>
With this change, all OSes can sort-of do split DNS, except that the
default upstream is hardcoded to 8.8.8.8 pending further plumbing.
Additionally, Windows 8-10 can do split DNS fully correctly, without
the 8.8.8.8 hack.
Part of #953.
Signed-off-by: David Anderson <danderson@tailscale.com>
It seems that all the setups that support split DNS understand
this distinction, and it's an important one when translating
high-level configuration.
Part of #953.
Signed-off-by: David Anderson <danderson@tailscale.com>
Correctly reports that Win7 cannot do split DNS, and has a helper to
discover the "base" resolvers for the system.
Part of #953
Signed-off-by: David Anderson <danderson@tailscale.com>
OS implementations are going to support split DNS soon.
Until they're all in place, hardcode Primary=true to get
the old behavior.
Signed-off-by: David Anderson <danderson@tailscale.com>
This is usually the same as the requested interface, but on some
unixes can vary based on device number allocation, and on Windows
it's the GUID instead of the pretty name, since everything relating
to configuration wants the GUID.
Signed-off-by: David Anderson <danderson@tailscale.com>
wgengine/router.CallbackRouter needs to support both the Router
and OSConfigurator interfaces, so the setters can't both be called
Set.
Signed-off-by: David Anderson <danderson@tailscale.com>
They need some rework to do the right thing, in the meantime the direct
and resolvconf managers will work out.
The resolved implementation was never selected due to control-side settings.
The networkmanager implementation mostly doesn't get selected due to
unforeseen interactions with `resolvconf` on many platforms.
Both implementations also need rework to support the various routing modes
they're capable of.
Signed-off-by: David Anderson <danderson@tailscale.com>
It's only use to skip some optional initialization during cleanup,
but that work is very minor anyway, and about to change drastically.
Signed-off-by: David Anderson <danderson@tailscale.com>
It's currently unused, and no longer makes sense with the upcoming
DNS infrastructure. Keep it in tailcfg for now, since we need protocol
compat for a bit longer.
Signed-off-by: David Anderson <danderson@tailscale.com>
The resolver still only supports a single upstream config, and
ipn/wgengine still have to split up the DNS config, but this moves
closer to unifying the DNS configs.
As a handy side-effect of the refactor, IPv6 MagicDNS records exist
now.
Signed-off-by: David Anderson <danderson@tailscale.com>
They're only used internally and in tests, and have surprising
semantics in that they only resolve MagicDNS names, not upstream
resolver queries.
Signed-off-by: David Anderson <danderson@tailscale.com>
Work around https://github.com/google/gvisor/issues/5732
by trying to read /proc/net/route with a larger bufsize if
it fails the first time.
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
IPv6 Unique Local Addresses are sometimes used with Network
Prefix Translation to reach the Internet. In that respect
their use is similar to the private IPv4 address ranges
10/8, 172.16/12, and 192.168/16.
Treat them as sufficient for AnyInterfaceUp(), but specifically
exclude Tailscale's own IPv6 ULA prefix to avoid mistakenly
trying to bootstrap Tailscale using Tailscale.
This helps in supporting Google Cloud Run, where the addresses
are 169.254.8.1/32 and fddf:3978:feb1:d745::c001/128 on eth1.
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
For discovery when an explicit hostname/IP is known. We'll still
also send it via control for finding peers by a list.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The tstun packagen contains both constructors for generic tun
Devices, and a wrapper that provides additional functionality.
Signed-off-by: David Anderson <danderson@tailscale.com>
Now callers (wgengine/monitor) don't need to mutate the state to remove
boring interfaces before calling State.Equal. Instead, the methods
to remove boring interfaces from the State are removed, as is
the reflect-using Equal method itself, and in their place is
a new EqualFiltered method that takes a func predicate to match
interfaces to compare.
And then the FilterInteresting predicate is added for use
with EqualFiltered to do the job that that wgengine/monitor
previously wanted.
Now wgengine/monitor can keep the full interface state around,
including the "boring" interfaces, which we'll need for peerapi on
macOS/iOS to bind to the interface index of the utunN device.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We have it already but threw it away. But macOS/iOS code will
be needing the interface index, so hang on to it.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add proto to flowtrack.Tuple.
Add types/ipproto leaf package to break a cycle.
Server-side ACL work remains.
Updates #1516
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We strip them control-side anyway, and we already strip IPv4 link
local, so there's no point uploading them. And iOS has a ton of them,
which results in somewhat silly amount of traffic in the MapRequest.
We'll be doing same-LAN-inter-tailscaled link-local traffic a
different way, with same-LAN discovery.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We basically already had the RIB-parsing Go code for this in both
net/interfaces and wgengine/monitor, for other reasons.
Fixes#1426Fixes#1471
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So a region can be used if needed, but won't be STUN-probed or used as
its home.
This gives us another possible debugging mechanism for #1310, or can
be used as a short-term measure against DERP flip-flops for people
equidistant between regions if our hysteresis still isn't good enough.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
interfaces.State.String tries to print a concise summary of the
network state, removing any interfaces that don't have any or any
interesting IP addresses. On macOS and iOS, for instance, there are a
ton of misc things.
But the link monitor based its are-there-changes decision on
interfaces.State.Equal, which just used reflect.DeepEqual, including
comparing all the boring interfaces. On macOS, when turning wifi on or off, there
are a ton of misc boring interface changes, resulting in hitting an earlier
check I'd added on suspicion this was happening:
[unexpected] network state changed, but stringification didn't
This fixes that by instead adding a new
interfaces.State.RemoveUninterestingInterfacesAndAddresses method that
does, uh, that. Then use that in the monitor. So then when Equal is
used later, it's DeepEqualing the already-cleaned version with only
interesting interfaces.
This makes cmd/tailscaled debug --monitor much less noisy.
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Not beautiful, but I'm debugging connectivity problems on
NEProvider.sleep+wake and need more clues.
Updates #1426
Updates tailscale/corp#1289
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>