Commit Graph

1177 Commits (c76d0754723f10f6d7100972bbddaf6edae4e57c)

Author SHA1 Message Date
James Scott ebaf33a80c
net/tsaddr: extract IsTailscaleIPv4 from IsTailscaleIP (#14169)
Extracts tsaddr.IsTailscaleIPv4 out of tsaddr.IsTailscaleIP.

This will allow for checking valid Tailscale assigned IPv4 addresses
without checking IPv6 addresses.

Updates #14168
Updates tailscale/corp#24620

Signed-off-by: James Scott <jim@tailscale.com>
1 year ago
Brad Fitzpatrick 3b93fd9c44 net/captivedetection: replace 10k log lines with ... less
We see tons of logs of the form:

    2024/11/15 19:57:29 netcheck: [v2] 76 available captive portal detection endpoints: [Endpoint{URL="http://192.73.240.161/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://192.73.240.121/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://192.73.240.132/generate_204", StatusCode=204, ExpectedContent="",
11:58SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://209.177.158.246/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://209.177.158.15/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://199.38.182.118/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://192.73.243.135/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://192.73.243.229/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://192.73.243.141/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://45.159.97.144/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://45.159.97.61/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://45.159.97.233/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://45.159.98.196/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://45.159.98.253/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://45.159.98.145/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://68.183.90.120/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://209.177.156.94/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://192.73.248.83/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://209.177.156.197/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://199.38.181.104/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://209.177.145.120/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://199.38.181.93/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://199.38.181.103/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://102.67.165.90/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://102.67.165.185/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://102.67.165.36/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://176.58.90.147/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://176.58.90.207/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://176.58.90.104/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://162.248.221.199/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://162.248.221.215/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://162.248.221.248/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://185.34.3.232/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://185.34.3.207/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://185.34.3.75/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://208.83.234.151/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://208.83.233.233/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://208.72.155.133/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://185.40.234.219/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://185.40.234.113/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://185.40.234.77/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://43.245.48.220/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://43.245.48.50/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://43.245.48.250/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://192.73.252.65/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://192.73.252.134/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://208.111.34.178/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://43.245.49.105/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://43.245.49.83/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://43.245.49.144/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://176.58.92.144/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://176.58.88.183/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://176.58.92.254/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://148.163.220.129/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://148.163.220.134/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://148.163.220.210/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://192.73.242.187/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://192.73.242.28/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://192.73.242.204/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://176.58.93.248/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://176.58.93.147/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://176.58.93.154/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://192.73.244.245/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://208.111.40.12/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://208.111.40.216/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://103.6.84.152/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://205.147.105.30/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://205.147.105.78/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://102.67.167.245/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://102.67.167.37/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://102.67.167.188/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://103.84.155.178/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://103.84.155.188/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://103.84.155.46/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=true, Provider=DERPMapOther} Endpoint{URL="http://controlplane.tailscale.com/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=false, Provider=Tailscale} Endpoint{URL="http://login.tailscale.com/generate_204", StatusCode=204, ExpectedContent="", SupportsTailscaleChallenge=false, Provider=Tailscale}]

That can be much shorter.

Also add a fast exit path to the concurrency on match. Doing 5 all at
once is still pretty gratuitous, though.

Updates #1634
Fixes #13019

Change-Id: Icdbb16572fca4477b0ee9882683a3ac6eb08e2f2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 4e0fc037e6 all: use iterators over slice views more
This gets close to all of the remaining ones.

Updates #12912

Change-Id: I9c672bbed2654a6c5cab31e0cbece6c107d8c6fa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 01185e436f types/result, util/lineiter: add package for a result type, use it
This adds a new generic result type (motivated by golang/go#70084) to
try it out, and uses it in the new lineutil package (replacing the old
lineread package), changing that package to return iterators:
sometimes over []byte (when the input is all in memory), but sometimes
iterators over results of []byte, if errors might happen at runtime.

Updates #12912
Updates golang/go#70084

Change-Id: Iacdc1070e661b5fb163907b1e8b07ac7d51d3f83
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
VimT 43138c7a5c net/socks5: optimize UDP relay
Key changes:
- No mutex for every udp package: replace syncs.Map with regular map for udpTargetConns
- Use socksAddr as map key for better type safety
- Add test for multi udp target

Updates #7581

Change-Id: Ic3d384a9eab62dcbf267d7d6d268bf242cc8ed3c
Signed-off-by: VimT <me@vimt.me>
1 year ago
VimT b0626ff84c net/socks5: fix UDP relay in userspace-networking mode
This commit addresses an issue with the SOCKS5 UDP relay functionality
when using the --tun=userspace-networking option. Previously, UDP packets
were not being correctly routed into the Tailscale network in this mode.

Key changes:
- Replace single UDP connection with a map of connections per target
- Use c.srv.dial for creating connections to ensure proper routing

Updates #7581

Change-Id: Iaaa66f9de6a3713218014cf3f498003a7cac9832
Signed-off-by: VimT <me@vimt.me>
1 year ago
Jordan Whited 49de23cf1b
net/netcheck: add addReportHistoryAndSetPreferredDERP() test case (#13989)
Add an explicit case for exercising preferred DERP hysteresis around
the branch that compares latencies on a percentage basis.

Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Andrea Gottardo 6985369479
net/sockstats: prevent crash in setNetMon (#13985) 1 year ago
Anton Tolchanov b4f46c31bb wgengine/magicsock: export packet drop metric for outbound errors
This required sharing the dropped packet metric between two packages
(tstun and magicsock), so I've moved its definition to util/usermetric.

Updates tailscale/corp#22075

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
1 year ago
James Tucker e1e22785b4 net/netcheck: ensure prior preferred DERP is always in netchecks
In an environment with unstable latency, such as upstream bufferbloat,
there are cases where a full netcheck could drop the prior preferred
DERP (likely home DERP) from future netcheck probe plans. This will then
likely result in a home DERP having a missing sample on the next
incremental netcheck, ultimately resulting in a home DERP move.

This change does not fix our overall response to highly unstable
latency, but it is an incremental improvement to prevent single spurious
samples during a full netcheck from alone triggering a flapping
condition, as now the prior changes to include historical latency will
still provide the desired resistance, and the home DERP should not move
unless latency is consistently worse over a 5 minute period.

Note that there is a nomenclature and semantics issue remaining in the
difference between a report preferred DERP and a home DERP. A report
preferred DERP is aspirational, it is what will be picked as a home DERP
if a home DERP connection needs to be established. A nodes home DERP may
be different than a recent preferred DERP, in which case a lot of
netcheck logic is fallible. In future enhancements much of the DERP move
logic should move to consider the home DERP, rather than recent report
preferred DERP.

Updates #8603
Updates #13969

Signed-off-by: James Tucker <james@tailscale.com>
1 year ago
Renato Aguiar 5d07c17b93
net/dns: fix blank lines being added to resolv.conf on OpenBSD (#13928)
During resolv.conf update, old 'search' lines are cleared but '\n' is not
deleted, leaving behind a new blank line on every update.

This adds 's' flag to regexp, so '\n' is included in the match and deleted when
old lines are cleared.

Also, insert missing `\n` when updated 'search' line is appended to resolv.conf.

Signed-off-by: Renato Aguiar <renato@renatoaguiar.net>
1 year ago
Andrew Dunham 7fe6e50858 net/dns/resolver: fix test flake
Updates #13902

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib2def19caad17367e9a31786ac969278e65f51c6
1 year ago
Andrew Dunham b2665d9b89 net/netcheck: add a Now field to the netcheck Report
This allows us to print the time that a netcheck was run, which is
useful in debugging.

Updates #10972

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id48d30d4eb6d5208efb2b1526a71d83fe7f9320b
1 year ago
Maisem Ali 85241f8408 net/tstun: use /10 as subnet for TAP mode; read IP from netmap
Few changes to resolve TODOs in the code:
- Instead of using a hardcoded IP, get it from the netmap.
- Use 100.100.100.100 as the gateway IP
- Use the /10 CGNAT range instead of a random /24

Updates #2589

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Maisem Ali d4d21a0bbf net/tstun: restore tap mode functionality
It had bit-rotted likely during the transition to vector io in
76389d8baf. Tested on Ubuntu 24.04
by creating a netns and doing the DHCP dance to get an IP.

Updates #2589

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Andrea Gottardo f8f53bb6d4
health: remove SysDNSOS, add two Warnables for read+set system DNS config (#13874) 1 year ago
Andrea Gottardo fd77965f23
net/tlsdial: call out firewalls blocking Tailscale in health warnings (#13840)
Updates tailscale/tailscale#13839

Adds a new blockblame package which can detect common MITM SSL certificates used by network appliances. We use this in `tlsdial` to display a dedicated health warning when we cannot connect to control, and a network appliance MITM attack is detected.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Jordan Whited 877fa504b4
net/netcheck: remove arbitrary deadlines from GetReport() tests (#13832)
GetReport() may have side effects when the caller enforces a deadline
that is shorter than ReportTimeout.

Updates #13783
Updates #13394

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Kristoffer Dalby e0d711c478 {net/connstats,wgengine/magicsock}: fix packet counting in connstats
connstats currently increments the packet counter whenever it is called
to store a length of data, however when udp batch sending was introduced
we pass the length for a series of packages, and it is only incremented
ones, making it count wrongly if we are on a platform supporting udp
batches.

Updates tailscale/corp#22075

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
1 year ago
Nick Khyl f07ff47922 net/dns/resolver: add tests for using a forwarder with multiple upstream resolvers
If multiple upstream DNS servers are available, quad-100 sends requests to all of them
and forwards the first successful response, if any. If no successful responses are received,
it propagates the first failure from any of them.

This PR adds some test coverage for these scenarios.

Updates #13571

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Nick Hill c2144c44a3 net/dns/resolver: update (*forwarder).forwardWithDestChan to always return an error unless it sends a response to responseChan
We currently have two executions paths where (*forwarder).forwardWithDestChan
returns nil, rather than an error, without sending a DNS response to responseChan.

These paths are accompanied by a comment that reads:
// Returning an error will cause an internal retry, there is
// nothing we can do if parsing failed. Just drop the packet.
But it is not (or no longer longer) accurate: returning an error from forwardWithDestChan
does not currently cause a retry.

Moreover, although these paths are currently unreachable due to implementation details,
if (*forwarder).forwardWithDestChan were to return nil without sending a response to
responseChan, it would cause a deadlock at one call site and a panic at another.

Therefore, we update (*forwarder).forwardWithDestChan to return errors in those two paths
and remove comments that were no longer accurate and misleading.

Updates #cleanup
Updates #13571

Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
1 year ago
Nick Hill e7545f2eac net/dns/resolver: translate 5xx DoH server errors into SERVFAIL DNS responses
If a DoH server returns an HTTP server error, rather than a SERVFAIL within
a successful HTTP response, we should handle it in the same way as SERVFAIL.

Updates #13571

Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
1 year ago
Nick Hill 17335d2104 net/dns/resolver: forward SERVFAIL responses over PeerDNS
As per the docstring, (*forwarder).forwardWithDestChan should either send to responseChan
and returns nil, or returns a non-nil error (without sending to the channel).
However, this does not hold when all upstream DNS servers replied with an error.

We've been handling this special error path in (*Resolver).Query but not in (*Resolver).HandlePeerDNSQuery.
As a result, SERVFAIL responses from upstream servers were being converted into HTTP 503 responses,
instead of being properly forwarded as SERVFAIL within a successful HTTP response, as per RFC 8484, section 4.2.1:
A successful HTTP response with a 2xx status code (see Section 6.3 of [RFC7231]) is used for any valid DNS response,
regardless of the DNS response code. For example, a successful 2xx HTTP status code is used even with a DNS message
whose DNS response code indicates failure, such as SERVFAIL or NXDOMAIN.

In this PR we fix (*forwarder).forwardWithDestChan to no longer return an error when it sends a response to responseChan,
and remove the special handling in (*Resolver).Query, as it is no longer necessary.

Updates #13571

Signed-off-by: Nick Hill <mykola.khyl@gmail.com>
1 year ago
Jordan Whited 33029d4486
net/netcheck: fix netcheck cli-triggered nil pointer deref (#13782)
Updates #13780

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Brad Fitzpatrick 841eaacb07 net/sockstats: quiet some log spam in release builds
Updates #13731

Change-Id: Ibee85426827ebb9e43a1c42a9c07c847daa50117
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Andrew Dunham 8ee7f82bf4 net/netcheck: don't panic if a region has no Nodes
Updates #13728

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1e8319d6b2da013ae48f15113b30c9333e69cc0b
1 year ago
Andrea Gottardo 6de6ab015f
net/dns: tweak DoH timeout, limit MaxConnsPerHost, require TLS 1.3 (#13564)
Updates tailscale/tailscale#6148

This is the result of some observations we made today with @raggi. The DNS over HTTPS client currently doesn't cap the number of connections it uses, either in-use or idle. A burst of DNS queries will open multiple connections. Idle connections remain open for 30 seconds (this interval is defined in the dohTransportTimeout constant). For DoH providers like NextDNS which send keep-alives, this means the cellular modem will remain up more than expected to send ACKs if any keep-alives are received while a connection remains idle during those 30 seconds. We can set the IdleConnTimeout to 10 seconds to ensure an idle connection is terminated if no other DNS queries come in after 10 seconds. Additionally, we can cap the number of connections to 1. This ensures that at all times there is only one open DoH connection, either active or idle. If idle, it will be terminated within 10 seconds from the last query.

We also observed all the DoH providers we support are capable of TLS 1.3. We can force this TLS version to reduce the number of packets sent/received each time a TLS connection is established.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Brad Fitzpatrick f49d218cfe net/dnscache: don't fall back to an IPv6 dial if we don't have IPv6
I noticed while debugging a test failure elsewhere that our failure
logs (when verbosity is cranked up) were uselessly attributing dial
failures to failure to dial an invalid IP address (this IPv6 address
we didn't have), rather than showing me the actual IPv4 connection
failure.

Updates #13597 (tangentially)

Change-Id: I45ffbefbc7e25ebfb15768006413a705b941dae5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Andrea Gottardo ed1ac799c8
net/captivedetection: set Timeout on net.Dialer (#13613)
Updates tailscale/tailscale#1634
Updates tailscale/tailscale#13265

Captive portal detection uses a custom `net.Dialer` in its `http.Client`. This custom Dialer ensures that the socket is bound specifically to the Wi-Fi interface. This is crucial because without it, if any default routes are set, the outgoing requests for detecting a captive portal would bypass Wi-Fi and go through the default route instead.

The Dialer did not have a Timeout property configured, so the default system timeout was applied. This caused issues in #13265, where we attempted to make captive portal detection requests over an IPsec interface used for Wi-Fi Calling. The call to `connect()` would fail and remain blocked until the system timeout (approximately 1 minute) was reached.

In #13598, I simply excluded the IPsec interface from captive portal detection. This was a quick and safe mitigation for the issue. This PR is a follow-up to make the process more robust, by setting a 3 seconds timeout on any connection establishment on any interface (this is the same timeout interval we were already setting on the HTTP client).

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Brad Fitzpatrick 262c526c4e net/portmapper: don't treat 0.0.0.0 as a valid IP
Updates tailscale/corp#23538

Change-Id: I58b8c30abe43f1d1829f01eb9fb2c1e6e8db9476
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Andrew Dunham 16ef88754d net/portmapper: don't return unspecified/local external IPs
We were previously not checking that the external IP that we got back
from a UPnP portmap was a valid endpoint; add minimal validation that
this endpoint is something that is routeable by another host.

Updates tailscale/corp#23538

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id9649e7683394aced326d5348f4caa24d0efd532
1 year ago
Andrea Gottardo 69be54c7b6
net/captivedetection: exclude ipsec interfaces from captive portal detection (#13598)
Updates tailscale/tailscale#1634

Logs from some iOS users indicate that we're pointlessly performing captive portal detection on certain interfaces named ipsec*. These are tunnels with the cellular carrier that do not offer Internet access, and are only used to provide internet calling functionality (VoLTE / VoWiFi).

```
attempting to do captive portal detection on interface ipsec1
attempting to do captive portal detection on interface ipsec6
```

This PR excludes interfaces with the `ipsec` prefix from captive portal detection.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Kristoffer Dalby 7d1160ddaa {ipn,net,tsnet}: use tsaddr helpers
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
1 year ago
Kristoffer Dalby 3dc33a0a5b net/tsaddr: add WithoutExitRoutes and IsExitRoute
Updates #cleanup

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
1 year ago
Kristoffer Dalby 0e0e53d3b3 util/usermetrics: make usermetrics non-global
this commit changes usermetrics to be non-global, this is a building
block for correct metrics if a go process runs multiple tsnets or
in tests.

Updates #13420
Updates tailscale/corp#22075

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
1 year ago
Andrea Gottardo 8a6f48b455
cli: add `tailscale dns query` (#13368)
Updates tailscale/tailscale#13326

Adds a CLI subcommand to perform DNS queries using the internal DNS forwarder and observe its internals (namely, which upstream resolvers are being used).

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
James Tucker af5a845a87 net/dns/resolver: fix dns-sd NXDOMAIN responses from quad-100
mdnsResponder at least as of macOS Sequoia does not find NXDOMAIN
responses to these dns-sd PTR queries acceptable unless they include the
question section in the response. This was found debugging #13511, once
we turned on additional diagnostic reporting from mdnsResponder we
witnessed:

```
Received unacceptable 12-byte response from 100.100.100.100 over UDP via utun6/27 -- id: 0x7F41 (32577), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 0/0/0/0,
```

If the response includes a question section, the resposnes are
acceptable, e.g.:

```
Received acceptable 59-byte response from 8.8.8.8 over UDP via en0/17 -- id: 0x2E55 (11861), flags: 0x8183 (R/Query, RD, RA, NXDomain), counts: 1/0/0/0,
```

This may be contributing to an issue under diagnosis in #13511 wherein
some combination of conditions results in mdnsResponder no longer
answering DNS queries correctly to applications on the system for
extended periods of time (multiple minutes), while dig against quad-100
provides correct responses for those same domains. If additional debug
logging is enabled in mdnsResponder we see it reporting:

```
Penalizing server 100.100.100.100 for 60 seconds
```

It is also possible that the reason that macOS & iOS never "stopped
spamming" these queries is that they have never been replied to with
acceptable responses. It is not clear if this special case handling of
dns-sd PTR queries was ever beneficial, and given this evidence may have
always been harmful. If we subsequently observe that the queries settle
down now that they have acceptable responses, we should remove these
special cases - making upstream queries very occasionally isn't a lot of
battery, so we should be better off having to maintain less special
cases and avoid bugs of this class.

Updates #2442
Updates #3025
Updates #3363
Updates #3594
Updates #13511

Signed-off-by: James Tucker <james@tailscale.com>
1 year ago
Jordan Whited 951884b077
net/netcheck,wgengine/magicsock: plumb OnlyTCP443 controlknob through netcheck (#13491)
Updates tailscale/corp#17879

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Jordan Whited afec2d41b4
wgengine/magicsock: remove redundant deadline from netcheck report call (#13395)
netcheck.Client.GetReport() applies its own deadlines. This 2s deadline
was causing GetReport() to never fall back to HTTPS/ICMP measurements
as it was shorter than netcheck.stunProbeTimeout, leaving no time
for fallbacks.

Updates #13394
Updates #6187

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Nick Khyl 4dfde7bffc net/dns: disable DNS registration for Tailscale interface on Windows
We already disable dynamic updates by setting DisableDynamicUpdate to 1 for the Tailscale interface.
However, this does not prevent non-dynamic DNS registration from happening when `ipconfig /registerdns`
runs and in similar scenarios. Notably, dns/windowsManager.SetDNS runs `ipconfig /registerdns`,
triggering DNS registration for all interfaces that do not explicitly disable it.

In this PR, we update dns/windowsManager.disableDynamicUpdates to also set RegistrationEnabled to 0.

Fixes #13411

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Jordan Whited 7aa766ee65
net/tstun: probe TCP GRO (#13376)
Disable TCP & UDP GRO if the probe fails.

torvalds/linux@e269d79c7d broke virtio_net
TCP & UDP GRO causing GRO writes to return EINVAL. The bug was then
resolved later in
torvalds/linux@89add40066. The offending
commit was pulled into various LTS releases.

Updates #13041

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Andrew Dunham 7dcf65a10a net/dns: fix IsZero and Equal methods on OSConfig
Discovered this while investigating the following issue; I think it's
unrelated, but might as well fix it. Also, add a test helper for
checking things that have an IsZero method using the reflect package.

Updates tailscale/support-escalations#55

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I57b7adde43bcef9483763b561da173b4c35f49e2
1 year ago
Brad Fitzpatrick 3d401c11fa all: use new Go 1.23 slices.Sorted more
Updates #12912

Change-Id: If1294e5bc7b5d3cf0067535ae10db75e8b988d8b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Andrea Gottardo d060b3fa02
cli: implement `tailscale dns status` (#13353)
Updates tailscale/tailscale#13326

This PR begins implementing a `tailscale dns` command group in the Tailscale CLI. It provides an initial implementation of `tailscale dns status` which dumps the state of the internal DNS forwarder.

Two new endpoints were added in LocalAPI to support the CLI functionality:

- `/netmap`: dumps a copy of the last received network map (because the CLI shouldn't have to listen to the ipn bus for a copy)
- `/dns-osconfig`: dumps the OS DNS configuration (this will be very handy for the UI clients as well, as they currently do not display this information)

My plan is to implement other subcommands mentioned in tailscale/tailscale#13326, such as `query`, in later PRs.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Andrea Gottardo 0112da6070
net/dns: support GetBaseConfig on Darwin OSS tailscaled (#13351)
Updates tailscale/tailscale#177

It appears that the OSS distribution of `tailscaled` is currently unable to get the current system base DNS configuration, as GetBaseConfig() in manager_darwin.go is unimplemented. This PR adds a basic implementation that reads the current values in `/etc/resolv.conf`, to at least unblock DNS resolution via Quad100 if `--accept-dns` is enabled.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Andrew Dunham 1c972bc7cb wgengine/magicsock: actually use AF_PACKET socket for raw disco
Previously, despite what the commit said, we were using a raw IP socket
that was *not* an AF_PACKET socket, and thus was subject to the host
firewall rules. Switch to using a real AF_PACKET socket to actually get
the functionality we want.

Updates #13140

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: If657daeeda9ab8d967e75a4f049c66e2bca54b78
1 year ago
Jordan Whited 45c97751fb
net/tstun: clarify GROFilterFunc *gro.GRO usage (#13318)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Andrea Gottardo a584d04f8a
dns: increase TimeToVisible before DNS unavailable warning (#13317)
Updates tailscale/tailscale#13314

Some users are reporting 'DNS unavailable' spurious (?) warnings, especially on Android:

https://old.reddit.com/r/Tailscale/comments/1f2ow3w/health_warning_dns_unavailable_on_tailscale/
https://old.reddit.com/r/Tailscale/comments/1f3l2il/health_warnings_dns_unavailable_what_does_it_mean/

I suspect this is caused by having a too low TimeToVisible setting on the Warnable, which triggers the unhealthy state during slow network transitions.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Jordan Whited 0926954cf5
net/tstun,wgengine/netstack: implement TCP GRO for local services (#13315)
Throughput improves substantially when measured via netstack loopback
(TS_DEBUG_NETSTACK_LOOPBACK_PORT).

Before (d21ebc2):
jwhited@i5-12400-2:~$ iperf3 -V -c 100.100.100.100
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  5.77 GBytes  4.95 Gbits/sec    0 sender
[  5]   0.00-10.01  sec  5.77 GBytes  4.95 Gbits/sec      receiver

After:
jwhited@i5-12400-2:~$ iperf3 -V -c 100.100.100.100
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  12.7 GBytes  10.9 Gbits/sec    0 sender
[  5]   0.00-10.00  sec  12.7 GBytes  10.9 Gbits/sec      receiver

Updates tailscale/corp#22754

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Jordan Whited 31cdbd68b1
net/tstun: fix gvisor inbound GSO packet injection (#13283)
buffs[0] was not sized to hold pkt with GSO, resulting in a panic.

Updates tailscale/corp#22511

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Kristoffer Dalby a2c42d3cd4 usermetric: add initial user-facing metrics
This commit adds a new usermetric package and wires
up metrics across the tailscale client.

Updates tailscale/corp#22075

Co-authored-by: Anton Tolchanov <anton@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
1 year ago
Jordan Whited d097096ddc
net/tstun,wgengine/netstack: make inbound synthetic packet injection GSO-aware (#13266)
Updates tailscale/corp#22511

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Ilarion Kovalchuk 0cb7eb9b75 net/dns: updated gonotify dependency to v2 that supports closable context
Signed-off-by: Ilarion Kovalchuk <illarion.kovalchuk@gmail.com>
1 year ago
Brad Fitzpatrick 0ff474ff37 all: fix new lint warnings from bumping staticcheck
In prep for updating to new staticcheck required for Go 1.23.

Updates #12912

Change-Id: If77892a023b79c6fa798f936fc80428fd4ce0673
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Jordan Whited df6014f1d7
net/tstun,wgengine{/netstack/gro}: refactor and re-enable gVisor GRO for Linux (#13172)
In 2f27319baf we disabled GRO due to a
data race around concurrent calls to tstun.Wrapper.Write(). This commit
refactors GRO to be thread-safe, and re-enables it on Linux.

This refactor now carries a GRO type across tstun and netstack APIs
with a lifetime that is scoped to a single tstun.Wrapper.Write() call.

In 25f0a3fc8f we used build tags to
prevent importation of gVisor's GRO package on iOS as at the time we
believed it was contributing to additional memory usage on that
platform. It wasn't, so this commit simplifies and removes those
build tags.

Updates tailscale/corp#22353
Updates tailscale/corp#22125
Updates #6816

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Andrea Gottardo 5cbbb48c2e
health/dns: reduce severity of DNS unavailable warning (#13152)
`DNS unavailable` was marked as a high severity warning. On Android (and other platforms), these trigger a system notification. Here we reduce the severity level to medium. A medium severity warning will still display the warning icon on platforms with a tray icon because of the `ImpactsConnectivity=true` flag being set here, but it won't show a notification anymore. If people enter an area with bad cellular reception, they're bound to receive so many of these notifications and we need to reduce notification fatigue.

Signed-off-by: Andrea Gottardo <andrea@tailscale.com>
1 year ago
Kyle Carberry 6c852fa817 go.{mod,sum}: migrate from nhooyr.io/websocket to github.com/coder/websocket
Coder has just adopted nhooyr/websocket which unfortunately changes the import path.

`github.com/coder/coder` imports `tailscale.com/net/wsconn` which was still pointing
to `nhooyr.io/websocket`, but this change updates it.

See https://coder.com/blog/websocket

Updates #13154

Change-Id: I3dec6512472b14eae337ae22c5bcc1e3758888d5
Signed-off-by: Kyle Carberry <kyle@carberry.com>
1 year ago
Brad Fitzpatrick a61825c7b8 cmd/tta, vnet: add host firewall, env var support, more tests
In particular, tests showing that #3824 works. But that test doesn't
actually work yet; it only gets a DERP connection. (why?)

Updates #13038

Change-Id: Ie1fd1b6a38d4e90fae7e72a0b9a142a95f0b2e8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Nick Khyl f23932bd98 net/dns/resolver: log forwarded query details when TS_DEBUG_DNS_FORWARD_SEND is enabled
Troubleshooting DNS resolution issues often requires additional information.
This PR expands the effect of the TS_DEBUG_DNS_FORWARD_SEND envknob to forwarder.forwardWithDestChan,
and includes the request type, domain name length, and the first 3 bytes of the domain's SHA-256 hash in the output.

Fixes #13070

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Anton Tolchanov 227509547f {control,net}: close idle connections of custom transports
I noticed a few places with custom http.Transport where we are not
closing idle connections when transport is no longer used.

Updates tailscale/corp#21609

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
1 year ago
VimT e3f047618b net/socks5: support UDP
Updates #7581

Signed-off-by: VimT <me@vimt.me>
1 year ago
Jordan Whited 17c88a19be
net/captivedetection: mark TestAllEndpointsAreUpAndReturnExpectedResponse flaky (#13021)
Updates #13019

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Maisem Ali f205efcf18 net/packet/checksum: fix v6 NAT
We were copying 12 out of the 16 bytes which meant that
the 1:1 NAT required would only work if the last 4 bytes
happened to match between the new and old address, something
that our tests accidentally had. Fix it by copying the full
16 bytes and make the tests also verify the addr and use rand
addresses.

Updates #9511

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Andrea Gottardo 4055b63b9b
net/captivedetection: exclude cellular data interfaces (#13002)
Updates tailscale/tailscale#1634

This PR optimizes captive portal detection on Android and iOS by excluding cellular data interfaces (`pdp*` and `rmnet`). As cellular networks do not present captive portals, frequent network switches between Wi-Fi and cellular would otherwise trigger captive detection unnecessarily, causing battery drain.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Jordan Whited f0230ce0b5
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GRO for Linux (#12921)
This commit implements TCP GRO for packets being written to gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported IP checksum functions.
gVisor is updated in order to make use of newly exported
stack.PacketBuffer GRO logic.

TCP throughput towards gVisor, i.e. TUN write direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement, sometimes as high as 2x. High bandwidth-delay product
paths remain receive window limited, bottlenecked by gVisor's default
TCP receive socket buffer size. This will be addressed in a  follow-on
commit.

The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.

The first result is from commit 57856fc without TCP GRO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  4.77 GBytes  4.10 Gbits/sec   20 sender
[  5]   0.00-10.00  sec  4.77 GBytes  4.10 Gbits/sec      receiver

The second result is from this commit with TCP GRO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.6 GBytes  9.14 Gbits/sec   20 sender
[  5]   0.00-10.00  sec  10.6 GBytes  9.14 Gbits/sec      receiver

Updates #6816

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Aaron Klotz 655b4f8fc5 net/netns: remove some logspam by avoiding logging parse errors due to unspecified addresses
I updated the address parsing stuff to return a specific error for
unspecified hosts passed as empty strings, and look for that
when logging errors. I explicitly did not make parseAddress return a
netip.Addr containing an unspecified address because at this layer,
in the absence of any host, we don't necessarily know the address
family we're dealing with.

For the purposes of this code I think this is fine, at least until
we implement #12588.

Fixes #12979

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
1 year ago
Brad Fitzpatrick 004dded0a8 net/tlsdial: relax self-signed cert health warning
It seems some security software or macOS itself might be MITMing TLS
(for ScreenTime?), so don't warn unless it fails x509 validation
against system roots.

Updates #3198

Change-Id: I6ea381b5bb6385b3d51da4a1468c0d803236b7bf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Aaron Klotz 0def4f8e38 net/netns: on Windows, fall back to default interface index when unspecified address is passed to ControlC and bindToInterfaceByRoute is enabled
We were returning an error instead of binding to the default interface.

Updates #12979

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
1 year ago
Jordan Whited 7bc2ddaedc
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869)
This commit implements TCP GSO for packets being read from gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported GSO logic from its tun
package.

A new gVisor stack.LinkEndpoint implementation has been established
(linkEndpoint) that is loosely modeled after its predecessor
(channel.Endpoint). This new implementation supports GSO of monster TCP
segments up to 64K in size, whereas channel.Endpoint only supports up to
32K. linkEndpoint will also be required for GRO, which will be
implemented in a follow-on commit.

TCP throughput from gVisor, i.e. TUN read direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement through a wide range of RTT and loss conditions, sometimes
as high as 5x.

The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.

The first result is from commit 57856fc without TCP GSO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.51 GBytes  2.15 Gbits/sec  154 sender
[  5]   0.00-10.00  sec  2.49 GBytes  2.14 Gbits/sec      receiver

The second result is from this commit with TCP GSO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  12.6 GBytes  10.8 Gbits/sec    6 sender
[  5]   0.00-10.00  sec  12.6 GBytes  10.8 Gbits/sec      receiver

Updates #6816

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Andrea Gottardo 949b15d858
net/captivedetection: call SetHealthy once connectivity restored (#12974)
Fixes tailscale/tailscale#12973
Updates tailscale/tailscale#1634

There was a logic issue in the captive detection code we shipped in https://github.com/tailscale/tailscale/pull/12707.

Assume a captive portal has been detected, and the user notified. Upon switching to another Wi-Fi that does *not* have a captive portal, we were issuing a signal to interrupt any pending captive detection attempt. However, we were not also setting the `captive-portal-detected` warnable to healthy. The result was that any "captive portal detected" alert would not be cleared from the UI.

Also fixes a broken log statement value.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Jonathan Nobels 8a8ecac6a7
net/dns, cmd/tailscaled: plumb system health tracker into dns cleanup (#12969)
fixes tailscale#12968

The dns manager cleanup func was getting passed a nil
health tracker, which will panic.  Fixed to pass it
the system health tracker.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
1 year ago
Jonathan Nobels 19b0c8a024
net/dns, health: raise health warning for failing forwarded DNS queries (#12888)
updates tailscale/corp#21823

Misconfigured, broken, or blocked DNS will often present as
"internet is broken'" to the end user.  This  plumbs the health tracker
into the dns manager and forwarder and adds a health warning
with a 5 second delay that is raised on failures in the forwarder and
lowered on successes.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
1 year ago
Andrea Gottardo 6840f471c0
net/dnsfallback: set CanPort80 in static DERPMap (#12929)
Updates tailscale/corp#21949

As discussed with @raggi, this PR updates the static DERPMap embedded in the client to reflect the availability of HTTP on the DERP servers run by Tailscale.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Andrea Gottardo 90be06bd5b
health: introduce captive-portal-detected Warnable (#12707)
Updates tailscale/tailscale#1634

This PR introduces a new `captive-portal-detected` Warnable which is set to an unhealthy state whenever a captive portal is detected on the local network, preventing Tailscale from connecting.



ipn/ipnlocal: fix captive portal loop shutdown


Change-Id: I7cafdbce68463a16260091bcec1741501a070c95

net/captivedetection: fix mutex misuse

ipn/ipnlocal: ensure that we don't fail to start the timer


Change-Id: I3e43fb19264d793e8707c5031c0898e48e3e7465

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Jordan Whited f0b9d3f477
net/tstun: fix docstring for Wrapper.SetWGConfig (#12796)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
KevinLiang10 8d7b78f3f7 net/dns/publicdns: remove additional information in DOH URL passed to IPv6 address generation for controlD.
This commit truncates any additional information (mainly hostnames) that's passed to controlD via DOH URL in DoHIPsOfBase.
This change is to make sure only resolverID is passed to controlDv6Gen but not the additional information.

Updates: #7946
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
1 year ago
Brad Fitzpatrick c6af5bbfe8 all: add test for package comments, fix, add comments as needed
Updates #cleanup

Change-Id: Ic4304e909d2131a95a38b26911f49e7b1729aaef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Maisem Ali 2238ca8a05 go.mod: bump bart
Updates #bart

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Nick Khyl 8bd442ba8c util/winutil/gp, net/dns: add package for Group Policy API
This adds a package with GP-related functions and types to be used in the future PRs.
It also updates nrptRuleDatabase to use the new package instead of its own gpNotificationWatcher implementation.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Jonathan Nobels 4e5ef5b628
net/dns: fix broken dns benchmark tests (#12686)
Updates tailscale/corp#20677

The recover function wasn't getting set in the benchmark
tests.  Default changed to an empty func.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
1 year ago
James Tucker b565a9faa7 cmd/xdpderper: add autodetection for default interface name
This makes deployment easier in hetrogenous environments.

Updates ENG-4274

Signed-off-by: James Tucker <james@tailscale.com>
1 year ago
Brad Fitzpatrick 9766f0e110 net/dns: move mutex before the field it guards
And some misc doc tweaks for idiomatic Go style.

Updates #cleanup

Change-Id: I3ca45f78aaca037f433538b847fd6a9571a2d918
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Andrew Dunham 0323dd01b2 ci: enable checklocks workflow for specific packages
This turns the checklocks workflow into a real check, and adds
annotations to a few basic packages as a starting point.

Updates #12625

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2b0185bae05a843b5257980fc6bde732b1bdd93f
1 year ago
Andrew Dunham 53a5d00fff net/dns: ensure /etc/resolv.conf is world-readable even with a umask
Previously, if we had a umask set (e.g. 0027) that prevented creating a
world-readable file, /etc/resolv.conf would be created without the o+r
bit and thus other users may be unable to resolve DNS.

Since a umask only applies to file creation, chmod the file after
creation and before renaming it to ensure that it has the appropriate
permissions.

Updates #12609

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2a05d64f4f3a8ee8683a70be17a7da0e70933137
1 year ago
Andrew Dunham a475c435ec net/dns/resolver: fix test failure
Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I0e815a69ee44ca0ff7c0ea0ca3c6904bbf67ed1f
1 year ago
Jonathan Nobels 27033c6277
net/dns: recheck DNS config on SERVFAIL errors (#12547)
Fixes tailscale/corp#20677

Replaces the original attempt to rectify this (by injecting a netMon
event) which was both heavy handed, and missed cases where the
netMon event was "minor".

On apple platforms, the fetching the interface's nameservers can
and does return an empty list in certain situations.   Apple's API
in particular is very limiting here.  The header hints at notifications
for dns changes which would let us react ahead of time, but it's all
private APIs.

To avoid remaining in the state where we end up with no
nameservers but we absolutely need them, we'll react
to a lack of upstream nameservers by attempting to re-query
the OS.

We'll rate limit this to space out the attempts.   It seems relatively
harmless to attempt a reconfig every 5 seconds (triggered
by an incoming query) if the network is in this broken state.

Missing nameservers might possibly be a persistent condition
(vs a transient error), but that would  also imply that something
out of our control is badly misconfigured.

Tested by randomly returning [] for the nameservers.   When switching
between Wifi networks, or cell->wifi, this will randomly trigger
the bug, and we appear to reliably heal the DNS state.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
1 year ago
Aaron Klotz 7dd76c3411 net/netns: add Windows support for bind-to-interface-by-route
This is implemented via GetBestInterfaceEx. Should we encounter errors
or fail to resolve a valid, non-Tailscale interface, we fall back to
returning the index for the default interface instead.

Fixes #12551

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
1 year ago
Aaron Klotz d7a4f9d31c net/dns: ensure multiple hosts with the same IP address are combined into a single HostEntry
This ensures that each line has a unique IP address.

Fixes #11939

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
1 year ago
Brad Fitzpatrick 5ec01bf3ce wgengine/filter: support FilterRules matching on srcIP node caps [capver 100]
See #12542 for background.

Updates #12542

Change-Id: Ida312f700affc00d17681dc7551ee9672eeb1789
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 162d593514 net/flowtrack: fix, test String method
I meant to do this in the earlier change and had a git fail.

To atone, add a test too while I'm here.

Updates #12486
Updates #12507

Change-Id: I4943b454a2530cb5047636f37136aa2898d2ffc7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 9e0a5cc551 net/flowtrack: optimize Tuple type for use as map key
This gets UDP filter overhead closer to TCP. Still ~2x, but no longer ~3x.

    goos: darwin
    goarch: arm64
    pkg: tailscale.com/wgengine/filter
                                       │   before    │                after                │
                                       │   sec/op    │   sec/op     vs base                │
    FilterMatch/tcp-not-syn-v4-8         15.43n ± 3%   15.38n ± 5%        ~ (p=0.339 n=10)
    FilterMatch/udp-existing-flow-v4-8   42.45n ± 0%   34.77n ± 1%  -18.08% (p=0.000 n=10)
    geomean                              25.59n        23.12n        -9.65%

Updates #12486

Change-Id: I595cfadcc6b7234604bed9c4dd4261e087c0d4c4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick bd93c3067e wgengine/filter/filtertype: make Match.IPProto a view
I noticed we were allocating these every time when they could just
share the same memory. Rather than document ownership, just lock it
down with a view.

I was considering doing all of the fields but decided to just do this
one first as test to see how infectious it became.  Conclusion: not
very.

Updates #cleanup (while working towards tailscale/corp#20514)

Change-Id: I8ce08519de0c9a53f20292adfbecd970fe362de0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 1f6645b19f net/ipset: skip the loop over Prefixes when there's only one
For pprof cosmetic/confusion reasons more than performance, but it
might have tiny speed benefit.

Updates #12486

Change-Id: I40e03714f3afa3a7e7f5e1fa99b81c7e889b91b6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick bf2d13cfa0 net/ipset: return all closures from named wrappers
So profiles show more useful names than just func1, func2, func3, etc.
There will still be func1 on them all, but the symbol before will say
what the lookup type is.

Updates #12486

Change-Id: I910b024a7861394eb83d07f5a899eae338cb1f22
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 86e0f9b912 net/ipset, wgengine/filter/filtertype: add split-out packages
This moves NewContainsIPFunc from tsaddr to new ipset package.

And wgengine/filter types gets split into wgengine/filter/filtertype,
so netmap (and thus the CLI, etc) doesn't need to bring in ipset,
bart, etc.

Then add a test making sure the CLI deps don't regress.

Updates #1278

Change-Id: Ia246d6d9502bbefbdeacc4aef1bed9c8b24f54d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 64ac64fb66 net/tsaddr: use bart in NewContainsIPFunc, add tests, benchmarks
NewContainsIPFunc was previously documented as performing poorly if
there were many netip.Prefixes to search over. As such, we never it used it
in such cases.

This updates it to use bart at a certain threshold (over 6 prefixes,
currently), at which point the bart lookup overhead pays off.

This is currently kinda useless because we're not using it. But now we
can and get wins elsewhere. And we can remove the caveat in the docs.

    goos: darwin
    goarch: arm64
    pkg: tailscale.com/net/tsaddr
                                     │    before    │                after                 │
                                     │    sec/op    │    sec/op     vs base                │
    NewContainsIPFunc/empty-8          2.215n ± 11%   2.239n ±  1%   +1.08% (p=0.022 n=10)
    NewContainsIPFunc/cidr-list-1-8    17.44n ±  0%   17.59n ±  6%   +0.89% (p=0.000 n=10)
    NewContainsIPFunc/cidr-list-2-8    27.85n ±  0%   28.13n ±  1%   +1.01% (p=0.000 n=10)
    NewContainsIPFunc/cidr-list-3-8    36.05n ±  0%   36.56n ± 13%   +1.41% (p=0.000 n=10)
    NewContainsIPFunc/cidr-list-4-8    43.73n ±  0%   44.38n ±  1%   +1.50% (p=0.000 n=10)
    NewContainsIPFunc/cidr-list-5-8    51.61n ±  2%   51.75n ±  0%        ~ (p=0.101 n=10)
    NewContainsIPFunc/cidr-list-10-8   95.65n ±  0%   68.92n ±  0%  -27.94% (p=0.000 n=10)
    NewContainsIPFunc/one-ip-8         4.466n ±  0%   4.469n ±  1%        ~ (p=0.491 n=10)
    NewContainsIPFunc/two-ip-8         8.002n ±  1%   7.997n ±  4%        ~ (p=0.697 n=10)
    NewContainsIPFunc/three-ip-8       27.98n ±  1%   27.75n ±  0%   -0.82% (p=0.012 n=10)
    geomean                            19.60n         19.07n         -2.71%

Updates #12486

Change-Id: I2e2320cc4384f875f41721374da536bab995c1ce
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Nick Khyl c32efd9118 various: create a catch-all NRPT rule when "Override local DNS" is enabled on Windows
Without this rule, Windows 8.1 and newer devices issue parallel DNS requests to DNS servers
associated with all network adapters, even when "Override local DNS" is enabled and/or
a Mullvad exit node is being used, resulting in DNS leaks.

This also adds "disable-local-dns-override-via-nrpt" nodeAttr that can be used to disable
the new behavior if needed.

Fixes tailscale/corp#20718

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Andrea Gottardo a8ee83e2c5
health: begin work to use structured health warnings instead of strings, pipe changes into ipn.Notify (#12406)
Updates tailscale/tailscale#4136

This PR is the first round of work to move from encoding health warnings as strings and use structured data instead. The current health package revolves around the idea of Subsystems. Each subsystem can have (or not have) a Go error associated with it. The overall health of the backend is given by the concatenation of all these errors.

This PR polishes the concept of Warnable introduced by @bradfitz a few weeks ago. Each Warnable is a component of the backend (for instance, things like 'dns' or 'magicsock' are Warnables). Each Warnable has a unique identifying code. A Warnable is an entity we can warn the user about, by setting (or unsetting) a WarningState for it. Warnables have:

- an identifying Code, so that the GUI can track them as their WarningStates come and go
- a Title, which the GUIs can use to tell the user what component of the backend is broken
- a Text, which is a function that is called with a set of Args to generate a more detailed error message to explain the unhappy state

Additionally, this PR also begins to send Warnables and their WarningStates through LocalAPI to the clients, using ipn.Notify messages. An ipn.Notify is only issued when a warning is added or removed from the Tracker.

In a next PR, we'll get rid of subsystems entirely, and we'll start using structured warnings for all errors affecting the backend functionality.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Jordan Whited 65888d95c9
derp/xdp,cmd/xdpderper: initial skeleton (#12390)
This commit introduces a userspace program for managing an experimental
eBPF XDP STUN server program. derp/xdp contains the eBPF pseudo-C along
with a Go pkg for loading it and exporting its metrics.
cmd/xdpderper is a package main user of derp/xdp.

Updates tailscale/corp#20689

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Jonathan Nobels 02e3c046aa
net/dns: re-query system resolvers on no-upstream resolver failure on apple platforms (#12398)
Fixes tailscale/corp#20677

On macOS sleep/wake, we're encountering a condition where reconfigure the network
a little bit too quickly - before apple has set the nameservers for our interface.
This results in a persistent condition where we have no upstream resolver and
fail all forwarded DNS queries.

No upstream nameservers is a legitimate configuration, and we have no  (good) way
of determining when Apple is ready - but if we need to forward a query, and we
have no nameservers, then something has gone badly wrong and the network is
very broken.

A simple fix here is to simply inject a netMon event, which will go through the
configuration dance again when we hit the SERVFAIL condition.

Tested by artificially/randomly returning [] for the list of nameservers in the bespoke
ipn-bridge code responsible for getting the nameservers.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
1 year ago