Commit Graph

418 Commits (01e645fae1d3e97d1b43a78ad9b6e5cf5d390c74)

Author SHA1 Message Date
Jonathan Nobels 19b0c8a024
net/dns, health: raise health warning for failing forwarded DNS queries (#12888)
updates tailscale/corp#21823

Misconfigured, broken, or blocked DNS will often present as
"internet is broken'" to the end user.  This  plumbs the health tracker
into the dns manager and forwarder and adds a health warning
with a 5 second delay that is raised on failures in the forwarder and
lowered on successes.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
1 year ago
KevinLiang10 8d7b78f3f7 net/dns/publicdns: remove additional information in DOH URL passed to IPv6 address generation for controlD.
This commit truncates any additional information (mainly hostnames) that's passed to controlD via DOH URL in DoHIPsOfBase.
This change is to make sure only resolverID is passed to controlDv6Gen but not the additional information.

Updates: #7946
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
1 year ago
Brad Fitzpatrick c6af5bbfe8 all: add test for package comments, fix, add comments as needed
Updates #cleanup

Change-Id: Ic4304e909d2131a95a38b26911f49e7b1729aaef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Nick Khyl 8bd442ba8c util/winutil/gp, net/dns: add package for Group Policy API
This adds a package with GP-related functions and types to be used in the future PRs.
It also updates nrptRuleDatabase to use the new package instead of its own gpNotificationWatcher implementation.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Jonathan Nobels 4e5ef5b628
net/dns: fix broken dns benchmark tests (#12686)
Updates tailscale/corp#20677

The recover function wasn't getting set in the benchmark
tests.  Default changed to an empty func.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
1 year ago
Brad Fitzpatrick 9766f0e110 net/dns: move mutex before the field it guards
And some misc doc tweaks for idiomatic Go style.

Updates #cleanup

Change-Id: I3ca45f78aaca037f433538b847fd6a9571a2d918
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Andrew Dunham 53a5d00fff net/dns: ensure /etc/resolv.conf is world-readable even with a umask
Previously, if we had a umask set (e.g. 0027) that prevented creating a
world-readable file, /etc/resolv.conf would be created without the o+r
bit and thus other users may be unable to resolve DNS.

Since a umask only applies to file creation, chmod the file after
creation and before renaming it to ensure that it has the appropriate
permissions.

Updates #12609

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2a05d64f4f3a8ee8683a70be17a7da0e70933137
1 year ago
Andrew Dunham a475c435ec net/dns/resolver: fix test failure
Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I0e815a69ee44ca0ff7c0ea0ca3c6904bbf67ed1f
1 year ago
Jonathan Nobels 27033c6277
net/dns: recheck DNS config on SERVFAIL errors (#12547)
Fixes tailscale/corp#20677

Replaces the original attempt to rectify this (by injecting a netMon
event) which was both heavy handed, and missed cases where the
netMon event was "minor".

On apple platforms, the fetching the interface's nameservers can
and does return an empty list in certain situations.   Apple's API
in particular is very limiting here.  The header hints at notifications
for dns changes which would let us react ahead of time, but it's all
private APIs.

To avoid remaining in the state where we end up with no
nameservers but we absolutely need them, we'll react
to a lack of upstream nameservers by attempting to re-query
the OS.

We'll rate limit this to space out the attempts.   It seems relatively
harmless to attempt a reconfig every 5 seconds (triggered
by an incoming query) if the network is in this broken state.

Missing nameservers might possibly be a persistent condition
(vs a transient error), but that would  also imply that something
out of our control is badly misconfigured.

Tested by randomly returning [] for the nameservers.   When switching
between Wifi networks, or cell->wifi, this will randomly trigger
the bug, and we appear to reliably heal the DNS state.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
1 year ago
Aaron Klotz d7a4f9d31c net/dns: ensure multiple hosts with the same IP address are combined into a single HostEntry
This ensures that each line has a unique IP address.

Fixes #11939

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
1 year ago
Nick Khyl c32efd9118 various: create a catch-all NRPT rule when "Override local DNS" is enabled on Windows
Without this rule, Windows 8.1 and newer devices issue parallel DNS requests to DNS servers
associated with all network adapters, even when "Override local DNS" is enabled and/or
a Mullvad exit node is being used, resulting in DNS leaks.

This also adds "disable-local-dns-override-via-nrpt" nodeAttr that can be used to disable
the new behavior if needed.

Fixes tailscale/corp#20718

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Andrea Gottardo a8ee83e2c5
health: begin work to use structured health warnings instead of strings, pipe changes into ipn.Notify (#12406)
Updates tailscale/tailscale#4136

This PR is the first round of work to move from encoding health warnings as strings and use structured data instead. The current health package revolves around the idea of Subsystems. Each subsystem can have (or not have) a Go error associated with it. The overall health of the backend is given by the concatenation of all these errors.

This PR polishes the concept of Warnable introduced by @bradfitz a few weeks ago. Each Warnable is a component of the backend (for instance, things like 'dns' or 'magicsock' are Warnables). Each Warnable has a unique identifying code. A Warnable is an entity we can warn the user about, by setting (or unsetting) a WarningState for it. Warnables have:

- an identifying Code, so that the GUI can track them as their WarningStates come and go
- a Title, which the GUIs can use to tell the user what component of the backend is broken
- a Text, which is a function that is called with a set of Args to generate a more detailed error message to explain the unhappy state

Additionally, this PR also begins to send Warnables and their WarningStates through LocalAPI to the clients, using ipn.Notify messages. An ipn.Notify is only issued when a warning is added or removed from the Tracker.

In a next PR, we'll get rid of subsystems entirely, and we'll start using structured warnings for all errors affecting the backend functionality.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Jonathan Nobels 02e3c046aa
net/dns: re-query system resolvers on no-upstream resolver failure on apple platforms (#12398)
Fixes tailscale/corp#20677

On macOS sleep/wake, we're encountering a condition where reconfigure the network
a little bit too quickly - before apple has set the nameservers for our interface.
This results in a persistent condition where we have no upstream resolver and
fail all forwarded DNS queries.

No upstream nameservers is a legitimate configuration, and we have no  (good) way
of determining when Apple is ready - but if we need to forward a query, and we
have no nameservers, then something has gone badly wrong and the network is
very broken.

A simple fix here is to simply inject a netMon event, which will go through the
configuration dance again when we hit the SERVFAIL condition.

Tested by artificially/randomly returning [] for the list of nameservers in the bespoke
ipn-bridge code responsible for getting the nameservers.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
1 year ago
Aaron Klotz 3511d1f8a2 cmd/tailscaled, net/dns, wgengine/router: start Windows child processes with DETACHED_PROCESS when I/O is being piped
When we're starting child processes on Windows that are CLI programs that
don't need to output to a console, we should pass in DETACHED_PROCESS as a
CreationFlag on SysProcAttr. This prevents the OS from even creating a console
for the child (and paying the associated time/space penalty for new conhost
processes). This is more efficient than letting the OS create the console
window and then subsequently trying to hide it, which we were doing at a few
callsites.

Fixes #12270

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2 years ago
Nick Khyl 4cdc4ed7db net/dns/resolver: return an empty successful response instead of NXDomain when resolving A records for 4via6 domains
As quad-100 is an authoritative server for 4via6 domains, it should always return responses
with a response code of 0 (indicating no error) when resolving records for these domains.
If there's no resource record of the specified type (e.g. A), it should return a response
with an empty answer section rather than NXDomain. Such a response indicates that there
is at least one RR of a different type (e.g., AAAA), suggesting the Windows stub resolver
to look for it.

Fixes tailscale/corp#20767

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2 years ago
Brad Fitzpatrick 916c4db75b net/dns: fix crash in tests
Looks like #12346 as submitted with failing tests.

Updates #12346

Change-Id: I582cd0dfb117686330d935d763d972373c5ae598
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrea Gottardo b65221999c
tailcfg,net/dns: add controlknob to disable battery split DNS on iOS (#12346)
Updates corp#15802.

Adds the ability for control to disable the recently added change that uses split DNS in more cases on iOS. This will allow us to disable the feature if it leads to regression in production. We plan to remove this knob once we've verified that the feature works properly.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2 years ago
Andrea Gottardo d636407f14
net/dns: don't set MatchDomains on Apple platforms when no upstream nameservers available (#12334)
This PR addresses a DNS issue on macOS as discussed this morning.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
2 years ago
Andrew Dunham ced9a0d413
net/dns: fix typo in OSConfig logging (#12330)
Updates tailscale/corp#20530

Change-Id: I48834a0a5944ed35509c63bdd2830aa34e1bddeb

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2 years ago
Andrea Gottardo dd77111462
xcode/iOS: set MatchDomains when no route requires a custom DNS resolver (#10576)
Updates https://github.com/tailscale/corp/issues/15802.

On iOS exclusively, this PR adds logic to use a split DNS configuration in more cases, with the goal of improving battery life. Acting as the global DNS resolver on iOS should be avoided, as it leads to frequent wakes of IPNExtension.

We try to determine if we can have Tailscale only handle DNS queries for resources inside the tailnet, that is, all routes in the DNS configuration do not require a custom resolver (this is the case for app connectors, for instance).

If so, we set all Routes as MatchDomains. This enables a split DNS configuration which will help preserve battery life. Effectively, for the average Tailscale user who only relies on MagicDNS to resolve *.ts.net domains, this means that Tailscale DNS will only be used for those domains.

This PR doesn't affect users with Override Local DNS enabled. For these users, there should be no difference and Tailscale will continue acting as a global DNS resolver.

Signed-off-by: Andrea Gottardo <andrea@tailscale.com>
2 years ago
Kevin Liang 7f83f9fc83 Net/DNS/Publicdns: update the IPv6 range that we use to recreate route endpoint for control D
In this commit I updated the Ipv6 range we use to generate Control D DOH ip, we were using the NextDNSRanges to generate Control D DOH ip, updated to use the correct range.

Updates: #7946
Signed-off-by: Kevin Liang <kevinliang@tailscale.com>
2 years ago
Nick Khyl f62e678df8 net/dns/resolver, control/controlknobs, tailcfg: use UserDial instead of SystemDial to dial DNS servers
Now that tsdial.Dialer.UserDial has been updated to honor the configured routes
and dial external network addresses without going through Tailscale, while also being
able to dial a node/subnet router on the tailnet, we can start using UserDial to forward
DNS requests. This is primarily needed for DNS over TCP when forwarding requests
to internal DNS servers, but we also update getKnownDoHClientForProvider to use it.

Updates tailscale/corp#18725

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2 years ago
Andrew Dunham f97d0ac994 net/dns/resolver: add better error wrapping
To aid in debugging exactly what's going wrong, instead of the
not-particularly-useful "dns udp query: context deadline exceeded" error
that we currently get.

Updates #3786
Updates #10768
Updates #11620
(etc.)

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I76334bf0681a8a2c72c90700f636c4174931432c
2 years ago
Brad Fitzpatrick 3672f29a4e net/netns, net/dns/resolver, etc: make netmon required in most places
The goal is to move more network state accessors to netmon.Monitor
where they can be cheaper/cached. But first (this change and others)
we need to make sure the one netmon.Monitor is plumbed everywhere.

Some notable bits:

* tsdial.NewDialer is added, taking a now-required netmon

* because a tsdial.Dialer always has a netmon, anything taking both
  a Dialer and a NetMon is now redundant; take only the Dialer and
  get the NetMon from that if/when needed.

* netmon.NewStatic is added, primarily for tests

Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299

Change-Id: I877f9cb87618c4eb037cee098241d18da9c01691
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 745931415c health, all: remove health.Global, finish plumbing health.Tracker
Updates #11874
Updates #4136

Change-Id: I414470f71d90be9889d44c3afd53956d9f26cd61
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 5b32264033 health: break Warnable into a global and per-Tracker value halves
Previously it was both metadata about the class of warnable item as
well as the value.

Now it's only metadata and the value is per-Tracker.

Updates #11874
Updates #4136

Change-Id: Ia1ed1b6c95d34bc5aae36cffdb04279e6ba77015
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick ebc552d2e0 health: add Tracker type, in prep for removing global variables
This moves most of the health package global variables to a new
`health.Tracker` type.

But then rather than plumbing the Tracker in tsd.System everywhere,
this only goes halfway and makes one new global Tracker
(`health.Global`) that all the existing callers now use.

A future change will eliminate that global.

Updates #11874
Updates #4136

Change-Id: I6ee27e0b2e35f68cb38fecdb3b2dc4c3f2e09d68
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham b85c2b2313 net/dns/resolver: use SystemDial in DoH forwarder
This ensures that we close the underlying connection(s) when a major
link change happens. If we don't do this, on mobile platforms switching
between WiFi and cellular can result in leftover connections in the
http.Client's connection pool which are bound to the "wrong" interface.

Updates #10821
Updates tailscale/corp#19124

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibd51ce2efcaf4bd68e14f6fdeded61d4e99f9a01
2 years ago
Brad Fitzpatrick 7c1d6e35a5 all: use Go 1.22 range-over-int
Updates #11058

Change-Id: I35e7ef9b90e83cac04ca93fd964ad00ed5b48430
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Aaron Klotz 4d5d669cd5 net/dns: unconditionally write NRPT rules to local settings
We were being too aggressive when deciding whether to write our NRPT rules
to the local registry key or the group policy registry key.

After once again reviewing the document which calls itself a spec
(see issue), it is clear that the presence of the DnsPolicyConfig subkey
is the important part, not the presence of values set in the DNSClient
subkey. Furthermore, a footnote indicates that the presence of
DnsPolicyConfig in the GPO key will always override its counterpart in
the local key. The implication of this is important: we may unconditionally
write our NRPT rules to the local key. We copy our rules to the policy
key only when it contains NRPT rules belonging to somebody other than us.

Fixes https://github.com/tailscale/corp/issues/19071

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2 years ago
James Tucker db760d0bac cmd/tailscaled: move cleanup to an implicit action during startup
This removes a potentially increased boot delay for certain boot
topologies where they block on ExecStartPre that may have socket
activation dependencies on other system services (such as
systemd-resolved and NetworkManager).

Also rename cleanup to clean up in affected/immediately nearby places
per code review commentary.

Fixes #11599

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
alexelisenko fe22032fb3
net/dns/{publicdns,resolver}: add start of Control D support
Updates #7946

[@bradfitz fixed up version of #8417]

Change-Id: I1dbf6fa8d525b25c0d7ad5c559a7f937c3cd142a
Signed-off-by: alexelisenko <39712468+alexelisenko@users.noreply.github.com>
Signed-off-by: Alex Paguis <alex@windscribe.com>
2 years ago
Brad Fitzpatrick 8d7894c68e clientupdate, net/dns: fix some "tailsacle" typos
Updates #cleanup

Change-Id: I982175e74b0c8c5b3e01a573e5785e6596b7ac39
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Asutorufa e20ce7bf0c net/dns: close ctx when close dns directManager
Signed-off-by: Asutorufa <16442314+Asutorufa@users.noreply.github.com>
2 years ago
Nick Khyl 7ef1fb113d cmd/tailscaled, ipn/ipnlocal, wgengine: shutdown tailscaled if wgdevice is closed
Tailscaled becomes inoperative if the Tailscale Tunnel wintun adapter is abruptly removed.
wireguard-go closes the device in case of a read error, but tailscaled keeps running.
This adds detection of a closed WireGuard device, triggering a graceful shutdown of tailscaled.
It is then restarted by the tailscaled watchdog service process.

Fixes #11222

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2 years ago
Nick Khyl b42b9817b0 net/dns: do not wait for the interface registry key to appear if the windowsManager is being closed
The WinTun adapter may have been removed by the time we're closing
the dns.windowsManager, and its associated interface registry key might
also have been deleted. We shouldn't use winutil.OpenKeyWait and wait
for the interface key to appear when performing a cleanup as a part of
the windowsManager shutdown.

Updates #11222

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2 years ago
mrrfv ff1391a97e net/dns/publicdns: add Mullvad family DNS to the list of known DoH servers
Adds the new Mullvad family DNS server to the known DNS over HTTPS server list.

Signed-off-by: mrrfv <rm-rfv-no-preserve-root@protonmail.com>
2 years ago
James Tucker 8d0d46462b net/dns: timeout DOH requests after 10s without response headers
If a client socket is remotely lost but the client is not sent an RST in
response to the next request, the socket might sit in RTO for extended
lengths of time, resulting in "no internet" for users. Instead, timeout
after 10s, which will close the underlying socket, recovering from the
situation more promptly.

Updates #10967

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Andrew Dunham 70b7201744 net/dns: fix infinite loop when run on Amazon Linux 2023
This fixes an infinite loop caused by the configuration of
systemd-resolved on Amazon Linux 2023 and how that interacts with
Tailscale's "direct" mode. We now drop the Tailscale service IP from the
OS's "base configuration" when we detect this configuration.

Updates #7816

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I73a4ea8e65571eb368c7e179f36af2c049a588ee
2 years ago
Andrew Dunham b0e96a6c39 net/dns: log more info when openresolv commands fail
Updates #11129

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ic594868ba3bc31f6d3b0721ecba4090749a81f7f
2 years ago
Andrew Dunham 35c303227a net/dns/resolver: add ID to verbose logs in forwarder
To make it easier to correlate the starting/ending log messages.

Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2802d53ad98e19bc8914bc58f8c04d4443227b26
2 years ago
Andrew Lytvynov 2716250ee8
all: cleanup unused code, part 2 (#10670)
And enable U1000 check in staticcheck.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2 years ago
Aaron Klotz 64a26b221b net/dns: use an additional registry setting to disable dynamic DNS updates for our interface on Windows
Fixes #9775

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2 years ago
Juergen Knaack c27aa9e7ff net/dns: fix darwin dns resolver files
putting each nameserver on one line in /etc/resolver/<domain>

fixes: #10134
Signed-off-by: Juergen Knaack <jk@jk-1.de>
2 years ago
Ryan Petris c4855fe0ea Fix Empty Resolver Set
Config.singleResolverSet returns true if all routes have the same resolvers,
even if the routes have no resolvers. If none of the routes have a specific
resolver, the default should be used instead. Therefore, check for more than
0 instead of nil.

Signed-off-by: Ryan Petris <ryan@petris.net>
2 years ago
James Tucker b48b7d82d0 appc,ipn/ipnlocal,net/dns/resolver: add App Connector wiring when enabled in prefs
An EmbeddedAppConnector is added that when configured observes DNS
responses from the PeerAPI. If a response is found matching a configured
domain, routes are advertised when necessary.

The wiring from a configuration in the netmap capmap is not yet done, so
while the connector can be enabled, no domains can yet be added.

Updates tailscale/corp#15437

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Andrew Dunham 57c5b5a77e net/dns/recursive: update IP for b.root-servers.net
As of 2023-11-27, the official IP addresses for b.root-servers.net will
change to a new set, with the older IP addresses supported for at least
a year after that date. These IPs are already active and returning
results, so update these in our recursive DNS resolver package so as to
be ready for the switchover.

See: https://b.root-servers.org/news/2023/05/16/new-addresses.html

Fixes #9994

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I29e2fe9f019163c9ec0e62bdb286e124aa90a487
2 years ago
Denton Gentry 97ee3891f1 net/dns: use direct when NetworkManager has no systemd-resolved
Endeavour OS, at least, uses NetworkManager 1.44.2 and does
not use systemd-resolved behind the scenes at all. If we
find ourselves in that situation, return "direct" not
"systemd-resolved"

Fixes https://github.com/tailscale/tailscale/issues/9687

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2 years ago
Galen Guyer 04a8b8bb8e net/dns: properly detect newer debian resolvconf
Tailscale attempts to determine if resolvconf or openresolv
is in use by running `resolvconf --version`, under the assumption
this command will error when run with Debian's resolvconf. This
assumption is no longer true and leads to the wrong commands being
run on newer versions of Debian with resolvconf >= 1.90. We can
now check if the returned version string starts with "Debian resolvconf"
if the command is successful.

Fixes #9218

Signed-off-by: Galen Guyer <galen@galenguyer.com>
2 years ago
Brad Fitzpatrick 7868393200 net/dns/resolver, ipnlocal: fix ExitDNS on Android and iOS
Advertise it on Android (it looks like it already works once advertised).

And both advertise & likely fix it on iOS. Yet untested.

Updates #9672

Change-Id: If3b7e97f011dea61e7e75aff23dcc178b6cf9123
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham 91b9899402 net/dns/resolver: fix flaky test
Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2d073220bb6ac78ba88d8be35085cc23b727d69f
2 years ago
Andrew Dunham 286c6ce27c
net/dns/resolver: race UDP and TCP queries (#9544)
Instead of just falling back to making a TCP query to an upstream DNS
server when the UDP query returns a truncated query, also start a TCP
query in parallel with the UDP query after a given race timeout. This
ensures that if the upstream DNS server does not reply over UDP (or if
the response packet is blocked, or there's an error), we can still make
queries if the server replies to TCP queries.

This also adds a new package, util/race, to contain the logic required for
racing two different functions and returning the first non-error answer.

Updates tailscale/corp#14809

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I4311702016c1093b1beaa31b135da1def6d86316
2 years ago
Andrew Dunham 530aaa52f1 net/dns: retry forwarder requests over TCP
We weren't correctly retrying truncated requests to an upstream DNS
server with TCP. Instead, we'd return a truncated request to the user,
even if the user was querying us over TCP and thus able to handle a
large response.

Also, add an envknob and controlknob to allow users/us to disable this
behaviour if it turns out to be buggy ( DNS ).

Updates #9264

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ifb04b563839a9614c0ba03e9c564e8924c1a2bfd
2 years ago
Brad Fitzpatrick 3b32d6c679 wgengine/magicsock, controlclient, net/dns: reduce some logspam
Updates #cleanup

Change-Id: I78b0697a01e94baa33f3de474b591e616fa5e6af
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
James Tucker 1c88a77f68 net/dns/publicdns: update Quad9 addresses and references
One Quad9 IPv6 address was incorrect, and an additional group needed
adding. Additionally I checked Cloudflare and included source reference
URLs for both.

Updates #cleanup
Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
James Tucker 4c693d2ee8 net/dns/publicdns: update Mullvad DoH server list
The following IPs are not used anymore: 193.19.108.2 and 193.19.108.3.
All of the servers are now named consistently under dns.mullvad.net.
Several new servers were added.

https://mullvad.net/en/help/dns-over-https-and-dns-over-tls/

Updates #5416
Updates #9345

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Brad Fitzpatrick dc7aa98b76 all: use set.Set consistently instead of map[T]struct{}
I didn't clean up the more idiomatic map[T]bool with true values, at
least yet.  I just converted the relatively awkward struct{}-valued
maps.

Updates #cleanup

Change-Id: I758abebd2bb1f64bc7a9d0f25c32298f4679c14f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham d413dd7ee5 net/dns/publicdns: add support for Wikimedia DNS
RELNOTE=Adds support for Wikimedia DNS

Updates #9255

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I4213c29e0f91ea5aa0304a5a026c32b6690fead9
2 years ago
Brad Fitzpatrick 11ece02f52 net/{interfaces,netmon}: remove "interesting", EqualFiltered API
This removes a lot of API from net/interfaces (including all the
filter types, EqualFiltered, active Tailscale interface func, etc) and
moves the "major" change detection to net/netmon which knows more
about the world and the previous/new states.

Updates #9040

Change-Id: I7fe66a23039c6347ae5458745b709e7ebdcce245
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick e8551d6b40 all: use Go 1.21 slices, maps instead of x/exp/{slices,maps}
Updates #8419

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick be4eb6a39e derp, net/dns/recursive: use Go 1.21 min
Updates #8419

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
David Anderson 52212f4323 all: update exp/slices and fix call sites
slices.SortFunc suffered a late-in-cycle API breakage.

Updates #cleanup

Signed-off-by: David Anderson <danderson@tailscale.com>
2 years ago
Michael Stapelberg 2a6c237d4c net/dns: overwrite /tmp/resolv.conf on gokrazy
Appliances built using https://gokrazy.org/ have a read-only root file system,
including /etc/resolv.conf, which is a symlink to /tmp/resolv.conf.

The system’s dhcp client overwrites /tmp/resolv.conf instead,
so we need to use this path in Tailscale, too.

related to https://github.com/gokrazy/gokrazy/issues/209

fixes https://github.com/tailscale/tailscale/issues/8689

Signed-off-by: Michael Stapelberg <michael@stapelberg.de>
2 years ago
Anton Tolchanov 388b124513 net/dns: detect when libnss_resolve is used
Having `127.0.0.53` is not the only way to use `systemd-resolved`. An
alternative way is to enable `libnss_resolve` module, which seems to now
be used by default on Debian 12 bookworm.

Fixes #8549

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2 years ago
Adrian Dewhurst cd4c71c122 tstest: prepare for Clock API changes
This change introduces tstime.NewClock and tstime.ClockOpts as a new way
to construct tstime.Clock. This is a subset of #8464 as a stepping stone
so that we can update our internal code to use the new API before making
the second round of changes.

Updates #8463

Change-Id: Ib26edb60e5355802aeca83ed60e4fdf806c90e27
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
2 years ago
Brad Fitzpatrick a874f1afd8 all: adjust case of "IPv4" and "IPv6"
Updates #docs

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham b6d20e6f8f go.mod, net/dns/recursive: update github.com/miekg/dns
Updates #cleanup

Change-Id: If4de6a84448a17dd81cc2a8af788bd18c3d0bbe3
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2 years ago
Andrew Dunham f077b672e4 net/dns/recursive: add initial implementation of recursive DNS resolver
We've talked in the past about reworking how bootstrap DNS works to
instead do recursive DNS resolution from the root; this would better
support on-prem customers and Headscale users where the DERP servers
don't currently resolve their DNS server. This package is an initial
implementation of recursive resolution for A and AAAA records.

Updates #5853

Change-Id: Ibe974d78709b4b03674b47c4ef61f9a00addf8b4
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
3 years ago
Mihai Parparita 7330aa593e all: avoid repeated default interface lookups
On some platforms (notably macOS and iOS) we look up the default
interface to bind outgoing connections to. This is both duplicated
work and results in logspam when the default interface is not available
(i.e. when a phone has no connectivity, we log an error and thus cause
more things that we will try to upload and fail).

Fixed by passing around a netmon.Monitor to more places, so that we can
use its cached interface state.

Fixes #7850
Updates #7621

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
3 years ago
Mihai Parparita 4722f7e322 all: move network monitoring from wgengine/monitor to net/netmon
We're using it in more and more places, and it's not really specific to
our use of Wireguard (and does more just link/interface monitoring).

Also removes the separate interface we had for it in sockstats -- it's
a small enough package (we already pull in all of its dependencies
via other paths) that it's not worth the extra complexity.

Updates #7621
Updates #7850

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
3 years ago
Andrew Dunham f85dc6f97c
ci: add more lints (#7909)
This is a follow-up to #7905 that adds two more linters and fixes the corresponding findings. As per the previous PR, this only flags things that are "obviously" wrong, and fixes the issues found.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8739bdb7bc4f75666a7385a7a26d56ec13741b7c
3 years ago
Andrew Dunham 280255acae
various: add golangci-lint, fix issues (#7905)
This adds an initial and intentionally minimal configuration for
golang-ci, fixes the issues reported, and adds a GitHub Action to check
new pull requests against this linter configuration.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8f38fbc315836a19a094d0d3e986758b9313f163
3 years ago
Brad Fitzpatrick 10f1c90f4d wgengine/magicsock, types/nettype, etc: finish ReadFromUDPAddrPort netip migration
So we're staying within the netip.Addr/AddrPort consistently and
avoiding allocs/conversions to the legacy net addr types.

Updates #5162

Change-Id: I59feba60d3de39f773e68292d759766bac98c917
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Mihai Parparita edb02b63f8 net/sockstats: pass in logger to sockstats.WithSockStats
Using log.Printf may end up being printed out to the console, which
is not desirable. I noticed this when I was investigating some client
logs with `sockstats: trace "NetcheckClient" was overwritten by another`.
That turns to be harmless/expected (the netcheck client will fall back
to the DERP client in some cases, which does its own sockstats trace).

However, the log output could be visible to users if running the
`tailscale netcheck` CLI command, which would be needlessly confusing.

Updates tailscale/corp#9230

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
3 years ago
Andrew Dunham 33b359642e net/dns: don't send on closed channel in resolvedManager
Fixes #7686

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibffb05539ab876b12407d77dcf2201d467895981
3 years ago
Andrew Dunham 83fa17d26c various: pass logger.Logf through to more places
Updates #7537

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id89acab70ea678c8c7ff0f44792d54c7223337c6
3 years ago
Mihai Parparita 6ac6ddbb47 sockstats: switch label to enum
Makes it cheaper/simpler to persist values, and encourages reuse of
labels as opposed to generating an arbitrary number.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
3 years ago
Aaron Klotz 9687f3700d net/dns: deal with Windows wsl.exe hangs
Despite the fact that WSL configuration is still disabled by default, we
continue to log the machine's list of WSL distros as a diagnostic measure.

Unfortunately I have seen the "wsl.exe -l" command hang indefinitely. This patch
adds a (more than reasonable) 10s timeout to ensure that tailscaled does not get
stuck while executing this operation.

I also modified the Windows implementation of NewOSConfigurator to do the
logging asynchronously, since that information is not required in order to
continue starting up.

Fixes #7476

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
3 years ago
Maisem Ali 1a30b2d73f all: use tstest.Replace more
Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Mihai Parparita 9cb332f0e2 sockstats: instrument networking code paths
Uses the hooks added by tailscale/go#45 to instrument the reads and
writes on the major code paths that do network I/O in the client. The
convention is to use "<package>.<type>:<label>" as the annotation for
the responsible code path.

Enabled on iOS, macOS and Android only, since mobile platforms are the
ones we're most interested in, and we are less sensitive to any
throughput degradation due to the per-I/O callback overhead (macOS is
also enabled for ease of testing during development).

For now just exposed as counters on a /v0/sockstats PeerAPI endpoint.

We also keep track of the current interface so that we can break out
the stats by interface.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
3 years ago
David Anderson 8b2ae47c31 version: unexport all vars, turn Short/Long into funcs
The other formerly exported values aren't used outside the package,
so just unexport them.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Mihai Parparita 0e3fb91a39 net/dns/resolver: remove maxDoHInFlight
It was originally added to control memory use on iOS (#2490), but then
was relaxed conditionally when running on iOS 15 (#3098). Now that we
require iOS 15, there's no need for the limit at all, so simplify back
to the original state.

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
3 years ago
Andrew Dunham 880a41bfcc net/dns/resolver: add envknob to debug exit node DNS queries on on Windows
Add the envknob TS_DEBUG_EXIT_NODE_DNS_NET_PKG, which enables more
verbose debug logging when calling the handleExitNodeDNSQueryWithNetPkg
function. This function is currently only called on Windows and Android.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ieb3ca7b98837d7dc69cd9ca47609c1c52e3afd7b
3 years ago
Brad Fitzpatrick b1248442c3 all: update to Go 1.20, use strings.CutPrefix/Suffix instead of our fork
Updates #7123
Updates #5309

Change-Id: I90bcd87a2fb85a91834a0dd4be6e03db08438672
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Will Norris 71029cea2d all: update copyright and license headers
This updates all source files to use a new standard header for copyright
and license declaration.  Notably, copyright no longer includes a date,
and we now use the standard SPDX-License-Identifier header.

This commit was done almost entirely mechanically with perl, and then
some minimal manual fixes.

Updates #6865

Signed-off-by: Will Norris <will@tailscale.com>
3 years ago
Brad Fitzpatrick 3addcacfe9 net/dns: fix recently added URL scheme from http to https
I typoed/brainoed in the earlier 3582628691

Change-Id: Ic198a6f9911f195d9da9fc5259b5784a4b15e5e3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 3582628691 net/dns/resolvconffile: link to FAQ about resolv.conf being overwritten
Add link to new http://tailscale.com/s/resolvconf-overwrite page,
added in tailscale/tailscale-www#2243

Change-Id: I9718399487f2ed18bf1a112581fd168aea30f232
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Tom DNetto 673b3d8dbd net/dns,userspace: remove unused DNS paths, normalize query limit on iOS
With a42a594bb3, iOS uses netstack and
hence there are no longer any platforms which use the legacy MagicDNS path. As such, we remove it.

We also normalize the limit for max in-flight DNS queries on iOS (it was 64, now its 256 as per other platforms).
It was 64 for the sake of being cautious about memory, but now we have 50Mb (iOS-15 and greater) instead of 15Mb
so we have the spare headroom.

Signed-off-by: Tom DNetto <tom@tailscale.com>
3 years ago
Brad Fitzpatrick ea70aa3d98 net/dns/resolvconffile: fix handling of multiple search domains
Fixes #6875

Change-Id: I57eb9312c9a1c81792ce2b5a0a0f254213b05df2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Andrew Dunham 0372e14d79 net/dns: bump DNS-over-TCP size limit to 4k
We saw a few cases where we hit this limit; bumping to 4k seems
relatively uncontroversial.

Change-Id: I218fee3bc0d2fa5fde16eddc36497a73ebd7cbda
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
3 years ago
Andrew Dunham 3f4d51c588 net/dns: don't send on closed channel when message too large
Previously, if a DNS-over-TCP message was received while there were
existing queries in-flight, and it was over the size limit, we'd close
the 'responses' channel. This would cause those in-flight queries to
send on the closed channel and panic.

Instead, don't close the channel at all and rely on s.ctx being
canceled, which will ensure that in-flight queries don't hang.

Fixes #6725

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8267728ac37ed7ae38ddd09ce2633a5824320097
3 years ago
Brad Fitzpatrick ca08e316af util/endian: delete package; use updated josharian/native instead
See josharian/native#3

Updates golang/go#57237

Change-Id: I238c04c6654e5b9e7d9cfb81a7bbc5e1043a84a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Maisem Ali 99aa335923 net/dns: [linux] log and add metric for dnsMode
I couldn't find any logs that indicated which mode it was running in so adding that.
Also added a gauge metric for dnsMode.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Andrew Dunham ec790e58df net/dns: retry overwriting hosts file on Windows
Updates #5753

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I60f81bd3325d5ba5383b947c7a7aaa5b14e460f6
3 years ago
Mihai Parparita 33520920c3 all: use strs.CutPrefix and strs.CutSuffix more
Updates places where we use HasPrefix + TrimPrefix to use the combined
function.

Updates #5309

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
3 years ago
Aaron Klotz 41e1d336cc net/dns: change windows DNS manager to use pointer receiver
This is safer given that we need to close the NRPT database.

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
3 years ago
Brad Fitzpatrick 90bd74fc05 net/dns: add a health warning when Linux /etc/resolv.conf is overwritten
Change-Id: I925b4d904bc7ed920bc5afee11e6dcb2ffc2fbfd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 001f482aca net/dns: make "direct" mode on Linux warn on resolv.conf fights
Run an inotify goroutine and watch if another program takes over
/etc/inotify.conf. Log if so.

For now this only logs. In the future I want to wire it up into the
health system to warn (visible in "tailscale status", etc) about the
situation, with a short URL to more info about how you should really
be using systemd-resolved if you want programs to not fight over your
DNS files on Linux.

Updates #4254 etc etc

Change-Id: I86ad9125717d266d0e3822d4d847d88da6a0daaa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 66b4a363bd net/dns/resolver: add yet another 4via6 DNS form that's hopefully more robust
$ dig +short @100.100.100.100 aaaa 10-2-5-3-via-7.foo-bar.ts.net
fd7a:115c:a1e0:b1a:0:7:a02:503

$ dig +short @100.100.100.100 aaaa 10-2-5-3-via-7
fd7a:115c:a1e0:b1a:0:7:a02:503

$ ping 10-2-5-3-via-7
PING 10-2-5-3-via-7(fd7a:115c:a1e0:b1a:0:7:a02:503 (fd7a:115c:a1e0:b1a:0:7:a02:503)) 56 data bytes
...

Change-Id: Ice8f954518a6a2fca8b2c04da7f31f61d78cdec4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick da8def8e13 all: remove old +build tags
The //go:build syntax was introduced in Go 1.17:

https://go.dev/doc/go1.17#build-lines

gofmt has kept the +build and go:build lines in sync since
then, but enough time has passed. Time to remove them.

Done with:

    perl -i -npe 's,^// \+build.*\n,,' $(git grep -l -F '+build')

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago