Commit Graph

78 Commits (6a19d58ffaaa7fb7de2484c8be71429fcfb5e626)

Author SHA1 Message Date
Andrew Dunham f97d0ac994 net/dns/resolver: add better error wrapping
To aid in debugging exactly what's going wrong, instead of the
not-particularly-useful "dns udp query: context deadline exceeded" error
that we currently get.

Updates #3786
Updates #10768
Updates #11620
(etc.)

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I76334bf0681a8a2c72c90700f636c4174931432c
6 months ago
Brad Fitzpatrick 3672f29a4e net/netns, net/dns/resolver, etc: make netmon required in most places
The goal is to move more network state accessors to netmon.Monitor
where they can be cheaper/cached. But first (this change and others)
we need to make sure the one netmon.Monitor is plumbed everywhere.

Some notable bits:

* tsdial.NewDialer is added, taking a now-required netmon

* because a tsdial.Dialer always has a netmon, anything taking both
  a Dialer and a NetMon is now redundant; take only the Dialer and
  get the NetMon from that if/when needed.

* netmon.NewStatic is added, primarily for tests

Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299

Change-Id: I877f9cb87618c4eb037cee098241d18da9c01691
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Andrew Dunham b85c2b2313 net/dns/resolver: use SystemDial in DoH forwarder
This ensures that we close the underlying connection(s) when a major
link change happens. If we don't do this, on mobile platforms switching
between WiFi and cellular can result in leftover connections in the
http.Client's connection pool which are bound to the "wrong" interface.

Updates #10821
Updates tailscale/corp#19124

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibd51ce2efcaf4bd68e14f6fdeded61d4e99f9a01
7 months ago
James Tucker 8d0d46462b net/dns: timeout DOH requests after 10s without response headers
If a client socket is remotely lost but the client is not sent an RST in
response to the next request, the socket might sit in RTO for extended
lengths of time, resulting in "no internet" for users. Instead, timeout
after 10s, which will close the underlying socket, recovering from the
situation more promptly.

Updates #10967

Signed-off-by: James Tucker <james@tailscale.com>
8 months ago
Andrew Dunham 35c303227a net/dns/resolver: add ID to verbose logs in forwarder
To make it easier to correlate the starting/ending log messages.

Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2802d53ad98e19bc8914bc58f8c04d4443227b26
10 months ago
Andrew Dunham 286c6ce27c
net/dns/resolver: race UDP and TCP queries (#9544)
Instead of just falling back to making a TCP query to an upstream DNS
server when the UDP query returns a truncated query, also start a TCP
query in parallel with the UDP query after a given race timeout. This
ensures that if the upstream DNS server does not reply over UDP (or if
the response packet is blocked, or there's an error), we can still make
queries if the server replies to TCP queries.

This also adds a new package, util/race, to contain the logic required for
racing two different functions and returning the first non-error answer.

Updates tailscale/corp#14809

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I4311702016c1093b1beaa31b135da1def6d86316
1 year ago
Andrew Dunham 530aaa52f1 net/dns: retry forwarder requests over TCP
We weren't correctly retrying truncated requests to an upstream DNS
server with TCP. Instead, we'd return a truncated request to the user,
even if the user was querying us over TCP and thus able to handle a
large response.

Also, add an envknob and controlknob to allow users/us to disable this
behaviour if it turns out to be buggy ( DNS ).

Updates #9264

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ifb04b563839a9614c0ba03e9c564e8924c1a2bfd
1 year ago
Mihai Parparita 7330aa593e all: avoid repeated default interface lookups
On some platforms (notably macOS and iOS) we look up the default
interface to bind outgoing connections to. This is both duplicated
work and results in logspam when the default interface is not available
(i.e. when a phone has no connectivity, we log an error and thus cause
more things that we will try to upload and fail).

Fixed by passing around a netmon.Monitor to more places, so that we can
use its cached interface state.

Fixes #7850
Updates #7621

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Mihai Parparita 4722f7e322 all: move network monitoring from wgengine/monitor to net/netmon
We're using it in more and more places, and it's not really specific to
our use of Wireguard (and does more just link/interface monitoring).

Also removes the separate interface we had for it in sockstats -- it's
a small enough package (we already pull in all of its dependencies
via other paths) that it's not worth the extra complexity.

Updates #7621
Updates #7850

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Brad Fitzpatrick 10f1c90f4d wgengine/magicsock, types/nettype, etc: finish ReadFromUDPAddrPort netip migration
So we're staying within the netip.Addr/AddrPort consistently and
avoiding allocs/conversions to the legacy net addr types.

Updates #5162

Change-Id: I59feba60d3de39f773e68292d759766bac98c917
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Mihai Parparita edb02b63f8 net/sockstats: pass in logger to sockstats.WithSockStats
Using log.Printf may end up being printed out to the console, which
is not desirable. I noticed this when I was investigating some client
logs with `sockstats: trace "NetcheckClient" was overwritten by another`.
That turns to be harmless/expected (the netcheck client will fall back
to the DERP client in some cases, which does its own sockstats trace).

However, the log output could be visible to users if running the
`tailscale netcheck` CLI command, which would be needlessly confusing.

Updates tailscale/corp#9230

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Andrew Dunham 83fa17d26c various: pass logger.Logf through to more places
Updates #7537

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id89acab70ea678c8c7ff0f44792d54c7223337c6
2 years ago
Mihai Parparita 6ac6ddbb47 sockstats: switch label to enum
Makes it cheaper/simpler to persist values, and encourages reuse of
labels as opposed to generating an arbitrary number.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Mihai Parparita 9cb332f0e2 sockstats: instrument networking code paths
Uses the hooks added by tailscale/go#45 to instrument the reads and
writes on the major code paths that do network I/O in the client. The
convention is to use "<package>.<type>:<label>" as the annotation for
the responsible code path.

Enabled on iOS, macOS and Android only, since mobile platforms are the
ones we're most interested in, and we are less sensitive to any
throughput degradation due to the per-I/O callback overhead (macOS is
also enabled for ease of testing during development).

For now just exposed as counters on a /v0/sockstats PeerAPI endpoint.

We also keep track of the current interface so that we can break out
the stats by interface.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
David Anderson 8b2ae47c31 version: unexport all vars, turn Short/Long into funcs
The other formerly exported values aren't used outside the package,
so just unexport them.

Signed-off-by: David Anderson <danderson@tailscale.com>
2 years ago
Mihai Parparita 0e3fb91a39 net/dns/resolver: remove maxDoHInFlight
It was originally added to control memory use on iOS (#2490), but then
was relaxed conditionally when running on iOS 15 (#3098). Now that we
require iOS 15, there's no need for the limit at all, so simplify back
to the original state.

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Will Norris 71029cea2d all: update copyright and license headers
This updates all source files to use a new standard header for copyright
and license declaration.  Notably, copyright no longer includes a date,
and we now use the standard SPDX-License-Identifier header.

This commit was done almost entirely mechanically with perl, and then
some minimal manual fixes.

Updates #6865

Signed-off-by: Will Norris <will@tailscale.com>
2 years ago
Josh Soref d4811f11a0 all: fix spelling mistakes
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2 years ago
Eng Zer Jun f0347e841f refactor: move from io/ioutil to io and os packages
The io/ioutil package has been deprecated as of Go 1.16 [1]. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.

Reference: https://golang.org/doc/go1.16#ioutil
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2 years ago
Brad Fitzpatrick 74674b110d envknob: support changing envknobs post-init
Updates #5114

Change-Id: Ia423fc7486e1b3f3180a26308278be0086fae49b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Mihai Parparita 82e82d9b7a net/dns/resolver: remove unused responseTimeout constant
Timeout is now enforced elsewhere, see discussion in https://github.com/tailscale/tailscale/pull/4408#discussion_r970092333.

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Brad Fitzpatrick e7376aca25 net/dns/resolver: set DNS-over-HTTPS Accept and User-Agent header on requests
Change-Id: I14b821771681e70405a507f43229c694159265ff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 58abae1f83 net/dns/{publicdns,resolver}: add NextDNS DoH support
NextDNS is unique in that users create accounts and then get
user-specific DNS IPs & DoH URLs.

For DoH, the customer ID is in the URL path.

For IPv6, the IP address includes the customer ID in the lower bits.

For IPv4, there's a fragile "IP linking" mechanism to associate your
public IPv4 with an assigned NextDNS IPv4 and that tuple maps to your
customer ID.

We don't use the IP linking mechanism.

Instead, NextDNS is DoH-only. Which means using NextDNS necessarily
shunts all DNS traffic through 100.100.100.100 (programming the OS to
use 100.100.100.100 as the global resolver) because operating systems
can't usually do DoH themselves.

Once it's in Tailscale's DoH client, we then connect out to the known
NextDNS IPv4/IPv6 anycast addresses.

If the control plane sends the client a NextDNS IPv6 address, we then
map it to the corresponding NextDNS DoH with the same client ID, and
we dial that DoH server using the combination of v4/v6 anycast IPs.

Updates #2452

Change-Id: I3439d798d21d5fc9df5a2701839910f5bef85463
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Nahum Shalman 214242ff62 net/dns/publicdns: Add Mullvad DoH
See https://mullvad.net/en/help/dns-over-https-and-dns-over-tls/

The Mullvad DoH servers appear to only speak HTTP/2 and
the use of a non-nil DialContext in the http.Transport
means that ForceAttemptHTTP2 must be set to true to be
able to use them.

Signed-off-by: Nahum Shalman <nahamu@gmail.com>
2 years ago
Maisem Ali 3bb57504af net/dns/resolver: add comments clarifying nil error returns
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali 4497bb0b81 net/dns/resolver: return SERVFAIL when no upstream resolvers set
Otherwise we just keep looping over the same thing again and again.

```
dns udp query: upstream nameservers not set
dns udp query: upstream nameservers not set
dns udp query: upstream nameservers not set
```

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali 9bb5a038e5 all: use atomic.Pointer
Also add some missing docs.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick a12aad6b47 all: convert more code to use net/netip directly
perl -i -npe 's,netaddr.IPPrefixFrom,netip.PrefixFrom,' $(git grep -l -F netaddr.)
    perl -i -npe 's,netaddr.IPPortFrom,netip.AddrPortFrom,' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IPPrefix,netip.Prefix,g' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IPPort,netip.AddrPort,g' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IP\b,netip.Addr,g' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IPv6Raw\b,netip.AddrFrom16,g' $(git grep -l -F netaddr. )
    goimports -w .

Then delete some stuff from the net/netaddr shim package which is no
longer neeed.

Updates #5162

Change-Id: Ia7a86893fe21c7e3ee1ec823e8aba288d4566cd8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 7eaf5e509f net/netaddr: start migrating to net/netip via new netaddr adapter package
Updates #5162

Change-Id: Id7bdec303b25471f69d542f8ce43805328d56c12
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Tom DNetto d6817d0f22 net/dns/resolver: respond with SERVFAIL if all upstreams fail
Fixes #4722

Signed-off-by: Tom DNetto <tom@tailscale.com>
2 years ago
Brad Fitzpatrick aa37aece9c ipn/ipnlocal, net/dns*, util/cloudenv: add AWS DNS support
And remove the GCP special-casing from ipn/ipnlocal; do it only in the
forwarder for *.internal.

Fixes #4980
Fixes #4981

Change-Id: I5c481e96d91f3d51d274a80fbd37c38f16dfa5cb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 88c2afd1e3 ipn/ipnlocal, net/dns*, util/cloudenv: specialize DNS config on Google Cloud
This does three things:

* If you're on GCP, it adds a *.internal DNS split route to the
  metadata server, so we never break GCP DNS names. This lets people
  have some Tailscale nodes on GCP and some not (e.g. laptops at home)
  without having to add a Tailnet-wide *.internal DNS route.
  If you already have such a route, though, it won't overwrite it.

* If the 100.100.100.100 DNS forwarder has nowhere to forward to,
  it forwards it to the GCP metadata IP, which forwards to 8.8.8.8.
  This means there are never errNoUpstreams ("upstream nameservers not set")
  errors on GCP due to e.g. mangled /etc/resolv.conf (GCP default VMs
  don't have systemd-resolved, so it's likely a DNS supremacy fight)

* makes the DNS fallback mechanism use the GCP metadata IP as a
  fallback before our hosted HTTP-based fallbacks

I created a default GCP VM from their web wizard. It has no
systemd-resolved.

I then made its /etc/resolv.conf be empty and deleted its GCP
hostnames in /etc/hosts.

I then logged in to a tailnet with no global DNS settings.

With this, tailscaled writes /etc/resolv.conf (direct mode, as no
systemd-resolved) and sets it to 100.100.100.100, which then has
regular DNS via the metadata IP and *.internal DNS via the metadata IP
as well. If the tailnet configures explicit DNS servers, those are used
instead, except for *.internal.

This also adds a new util/cloudenv package based on version/distro
where the cloud type is only detected once. We'll likely expand it in
the future for other clouds, doing variants of this change for other
popular cloud environments.

Fixes #4911

RELNOTES=Google Cloud DNS improvements

Change-Id: I19f3c2075983669b2b2c0f29a548da8de373c7cf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Maisem Ali fd99c54e10 tailcfg,all: change structs to []*dnstype.Resolver
Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Tom DNetto 5b85f848dd net/dns,net/dns/resolver: refactor channels/magicDNS out of Resolver
Moves magicDNS-specific handling out of Resolver & into dns.Manager. This
greatly simplifies the Resolver to solely issuing queries and returning
responses, without channels.

Enforcement of max number of in-flight magicDNS queries, assembly of
synthetic UDP datagrams, and integration with wgengine for
recieving/responding to magicDNS traffic is now entirely in Manager.
This path is being kept around, but ultimately aims to be deleted and
replaced with a netstack-based path.

This commit is part of a series to implement magicDNS using netstack.

Signed-off-by: Tom DNetto <tom@tailscale.com>
3 years ago
Tom DNetto 5fb8e01a8b net/dns/resolver: add metric for number of truncated dns packets
Updates #2067

This should help us determine if more robust control of edns parameters
+ implementing answer truncation is warranted, given its likely complexity.

Signed-off-by: Tom DNetto <tom@tailscale.com>
3 years ago
Brad Fitzpatrick cc575fe4d6 net/dns: schedule DoH upgrade explicitly, fix Resolver.Addr confusion
Two changes in one:

* make DoH upgrades an explicitly scheduled send earlier, when we come
  up with the resolvers-and-delay send plan. Previously we were
  getting e.g.  four Google DNS IPs and then spreading them out in
  time (for back when we only did UDP) but then later we added DoH
  upgrading at the UDP packet layer, which resulted in sometimes
  multiple DoH queries to the same provider running (each doing happy
  eyeballs dialing to 4x IPs themselves) for each of the 4 source IPs.
  Instead, take those 4 Google/Cloudflare IPs and schedule 5 things:
  first the DoH query (which can use all 4 IPs), and then each of the
  4 IPs as UDP later.

* clean up the dnstype.Resolver.Addr confusion; half the code was
  using it as an IP string (as documented) as half was using it as
  an IP:port (from some prior type we used), primarily for tests.
  Instead, document it was being primarily an IP string but also
  accepting an IP:port for tests, then add an accessor method on it
  to get the IPPort and use that consistently everywhere.

Change-Id: Ifdd72b9e45433a5b9c029194d50db2b9f9217b53
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick e3a4952527 net/dns/resolver: count errors when racing DNS queries, fail earlier
If all N queries failed, we waited until context timeout (in 5
seconds) to return.

This makes (*forwarder).forward fail fast when the network's
unavailable.

Change-Id: Ibbb3efea7ed34acd3f3b29b5fee00ba8c7492569
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick ecea6cb994 net/dns/resolver: make DoH dialer use existing dnscache happy eyeball dialer
Simplify the ability to reason about the DoH dialing code by reusing the
dnscache's dialer we already have.

Also, reduce the scope of the "ip" variable we don't want to close over.

This necessarily adds a new field to dnscache.Resolver:
SingleHostStaticResult, for when the caller already knows the IPs to be
returned.

Change-Id: I9f2aef7926f649137a5a3e63eebad6a3fffa48c0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
phirework 83c734a6e0
net/dns, util/publicdns: extract public DNS mapping into own package (#4405)
This extracts DOH mapping of known public DNS providers in
forwarder.go into its own package, to be consumed by other repos

Signed-off-by: Jenny Zhang <jz@tailscale.com>
3 years ago
Brad Fitzpatrick f2041c9088 all: use strings.Cut even more
Change-Id: I943ce72c6f339589235bddbe10d07799c4e37979
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 2513d2d728 net/{neterror,dns/resolver}: move PacketWasTruncated to neterror from DNS code
And delete the unused code in net/dns/resolver/neterr_*.go.

Change-Id: Ibe62c486bacce2733eb9968c96a98cbbdb2758bd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 0aa4c6f147 net/dns/resolver: add debug HTML handler to see what DNS traffic was forwarded
Change-Id: I6b790e92dcc608515ac8b178f2271adc9fd98f78
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 39f22a357d net/dns/resolver: send NXDOMAIN to iOS DNS-SD/Bonjour queries
Don't just ignore them. See if this makes them calm down.

Updates #3363

Change-Id: Id1d66308e26660d26719b2538b577522a1e36b63
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c7052154d5 net/dns/resolver: fix the subject in a func comment
Change-Id: I519268c20dbd2c2da92da565839d3c1c84612dcc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c37af58ea4 net/tsdial: move more weirdo dialing into new tsdial package, plumb
Not done yet, but this move more of the outbound dial special casing
from random packages into tsdial, which aspires to be the one unified
place for all outbound dialing shenanigans.

Then this plumbs it all around, so everybody is ultimately
holding on to the same dialer.

As of this commit, macOS/iOS using an exit node should be able to
reach to the exit node's DoH DNS proxy over peerapi, doing the sockopt
to stay within the Network Extension.

A number of steps remain, including but limited to:

* move a bunch more random dialing stuff

* make netstack-mode tailscaled be able to use exit node's DNS proxy,
  teaching tsdial's resolver to use it when an exit node is in use.

Updates #1713

Change-Id: I1e8ee378f125421c2b816f47bc2c6d913ddcd2f5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 3ae6f898cf ipn/ipnlocal, net/dns/resolver: use exit node's DoH proxy when available
Updates #1713

Change-Id: I3695a40ec12d2b4e6dac41cf4559daca6dddd68e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 135580a5a8 tailcfg, ipn/ipnlocal, net/dns: forward exit node DNS on Unix to system DNS
Updates #1713

Change-Id: I4c073fec0992d9e01a9a4ce97087d5af0efdc68d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 78b0bd2957 net/dns/resolver: add clientmetrics for DNS
Fixes tailscale/corp#1811

Change-Id: I864d11e0332a177e8c5ff403591bff6fec548f5a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 25525b7754 net/dns/resolver, ipn/ipnlocal: wire up peerapi DoH server to DNS forwarder
Updates #1713

Change-Id: Ia4ed9d8c9cef0e70aa6d30f2852eaab80f5f695a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder 758c37b83d net/netns: thread logf into control functions
So that darwin can log there without panicking during tests.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago