Commit Graph

981 Commits (irbekrm/extsvcnftableslb)

Author SHA1 Message Date
Andrew Dunham b85c2b2313 net/dns/resolver: use SystemDial in DoH forwarder
This ensures that we close the underlying connection(s) when a major
link change happens. If we don't do this, on mobile platforms switching
between WiFi and cellular can result in leftover connections in the
http.Client's connection pool which are bound to the "wrong" interface.

Updates #10821
Updates tailscale/corp#19124

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibd51ce2efcaf4bd68e14f6fdeded61d4e99f9a01
1 month ago
Andrew Dunham 226486eb9a net/interfaces: handle removed interfaces in State.Equal
This wasn't previously handling the case where an interface in s2 was
removed and not present in s1, and would cause the Equal method to
incorrectly return that the states were equal.

Updates tailscale/corp#19124

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I3af22bc631015d1ddd0a1d01bfdf312161b9532d
2 months ago
Brad Fitzpatrick 7c1d6e35a5 all: use Go 1.22 range-over-int
Updates #11058

Change-Id: I35e7ef9b90e83cac04ca93fd964ad00ed5b48430
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Brad Fitzpatrick 068db1f972 net/interfaces: delete unused unexported function
It should've been deleted in 11ece02f52.

Updates #9040

Change-Id: If8a136bdb6c82804af658c9d2b0a8c63ce02d509
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Brad Fitzpatrick a1abd12f35 cmd/tailscaled, net/tstun: build for aix/ppc64
At least in userspace-networking mode.

Fixes #11361

Change-Id: I78d33f0f7e05fe9e9ee95b97c99b593f8fe498f2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Aaron Klotz 4d5d669cd5 net/dns: unconditionally write NRPT rules to local settings
We were being too aggressive when deciding whether to write our NRPT rules
to the local registry key or the group policy registry key.

After once again reviewing the document which calls itself a spec
(see issue), it is clear that the presence of the DnsPolicyConfig subkey
is the important part, not the presence of values set in the DNSClient
subkey. Furthermore, a footnote indicates that the presence of
DnsPolicyConfig in the GPO key will always override its counterpart in
the local key. The implication of this is important: we may unconditionally
write our NRPT rules to the local key. We copy our rules to the policy
key only when it contains NRPT rules belonging to somebody other than us.

Fixes https://github.com/tailscale/corp/issues/19071

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2 months ago
James Tucker db760d0bac cmd/tailscaled: move cleanup to an implicit action during startup
This removes a potentially increased boot delay for certain boot
topologies where they block on ExecStartPre that may have socket
activation dependencies on other system services (such as
systemd-resolved and NetworkManager).

Also rename cleanup to clean up in affected/immediately nearby places
per code review commentary.

Fixes #11599

Signed-off-by: James Tucker <james@tailscale.com>
2 months ago
Brad Fitzpatrick b0fbd85592 net/tsdial: partially fix "tailscale nc" (UserDial) on macOS
At least in the case of dialing a Tailscale IP.

Updates #4529

Change-Id: I9fd667d088a14aec4a56e23aabc2b1ffddafa3fe
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
alexelisenko fe22032fb3
net/dns/{publicdns,resolver}: add start of Control D support
Updates #7946

[@bradfitz fixed up version of #8417]

Change-Id: I1dbf6fa8d525b25c0d7ad5c559a7f937c3cd142a
Signed-off-by: alexelisenko <39712468+alexelisenko@users.noreply.github.com>
Signed-off-by: Alex Paguis <alex@windscribe.com>
2 months ago
James Tucker 6e334e64a1 net/netcheck,wgengine/magicsock: align DERP frame receive time heuristics
The netcheck package and the magicksock package coordinate via the
health package, but both sides have time based heuristics through
indirect dependencies. These were misaligned, so the implemented
heuristic aimed at reducing DERP moves while there is active traffic
were non-operational about 3/5ths of the time.

It is problematic to setup a good test for this integration presently,
so instead I added comment breadcrumbs along with the initial fix.

Updates #8603

Signed-off-by: James Tucker <james@tailscale.com>
2 months ago
Brad Fitzpatrick 8d7894c68e clientupdate, net/dns: fix some "tailsacle" typos
Updates #cleanup

Change-Id: I982175e74b0c8c5b3e01a573e5785e6596b7ac39
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
James Tucker f384742375 net/packet: allow more ICMP errors
We now allow some more ICMP errors to flow, specifically:

- ICMP parameter problem in both IPv4 and IPv6 (corrupt headers)
- ICMP Packet Too Big (for IPv6 PMTU)

Updates #311
Updates #8102
Updates #11002

Signed-off-by: James Tucker <james@tailscale.com>
2 months ago
Asutorufa e20ce7bf0c net/dns: close ctx when close dns directManager
Signed-off-by: Asutorufa <16442314+Asutorufa@users.noreply.github.com>
2 months ago
Percy Wegmann 8b8b315258 net/tstun: use gaissmai/bart instead of tempfork/device
This implementation uses less memory than tempfork/device,
which helps avoid OOM conditions in the iOS VPN extension when
switching to a Tailnet with ExitNode routing enabled.

Updates tailscale/corp#18514

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2 months ago
Brad Fitzpatrick a36cfb4d3d tailcfg, ipn/ipnlocal, wgengine/magicsock: add only-tcp-443 node attr
Updates tailscale/corp#17879

Change-Id: I0dc305d147b76c409cf729b599a94fa723aef0e0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Anton Tolchanov cf8948da5f net/routetable: increase route limit used by the test
I was running all tests while preparing a recent stable release, and
this was failing because my computer is connected to a fairly large
tailnet.

```
--- FAIL: TestGetRouteTable (0.01s)
    routetable_linux_test.go:32: expected at least one default route;
    ...
```

```
$ ip route show table 52  | wc -l
1051
```

Updates #cleanup

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
3 months ago
Andrew Dunham 9884d06b80 net/interfaces: fix test hang on Darwin
This test could hang because the subprocess was blocked on writing to
the stdout pipe if we find the address we're looking for early in the
output.

Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I68d82c22a5d782098187ae6d8577e43063b72573
3 months ago
Andrew Dunham 62cf83eb92 go.mod: bump gvisor
The `stack.PacketBufferPtr` type no longer exists; replace it with
`*stack.PacketBuffer` instead.

Updates #8043

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib56ceff09166a042aa3d9b80f50b2aa2d34b3683
3 months ago
Andrew Dunham 3dd8ae2f26 net/tstun: fix spelling of "WireGuard"
Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ida7e30f4689bc18f5f7502f53a0adb5ac3c7981a
3 months ago
Anton Tolchanov 8cc5c51888 health: warn about reverse path filtering and exit nodes
When reverse path filtering is in strict mode on Linux, using an exit
node blocks all network connectivity. This change adds a warning about
this to `tailscale status` and the logs.

Example in `tailscale status`:

```
- not connected to home DERP region 22
- The following issues on your machine will likely make usage of exit nodes impossible: [interface "eth0" has strict reverse-path filtering enabled], please set rp_filter=2 instead of rp_filter=1; see https://github.com/tailscale/tailscale/issues/3310
```

Example in the logs:
```
2024/02/21 21:17:07 health("overall"): error: multiple errors:
	not in map poll
	The following issues on your machine will likely make usage of exit nodes impossible: [interface "eth0" has strict reverse-path filtering enabled], please set rp_filter=2 instead of rp_filter=1; see https://github.com/tailscale/tailscale/issues/3310
```

Updates #3310

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
3 months ago
Nick Khyl 7ef1fb113d cmd/tailscaled, ipn/ipnlocal, wgengine: shutdown tailscaled if wgdevice is closed
Tailscaled becomes inoperative if the Tailscale Tunnel wintun adapter is abruptly removed.
wireguard-go closes the device in case of a read error, but tailscaled keeps running.
This adds detection of a closed WireGuard device, triggering a graceful shutdown of tailscaled.
It is then restarted by the tailscaled watchdog service process.

Fixes #11222

Signed-off-by: Nick Khyl <nickk@tailscale.com>
3 months ago
Nick Khyl b42b9817b0 net/dns: do not wait for the interface registry key to appear if the windowsManager is being closed
The WinTun adapter may have been removed by the time we're closing
the dns.windowsManager, and its associated interface registry key might
also have been deleted. We shouldn't use winutil.OpenKeyWait and wait
for the interface key to appear when performing a cleanup as a part of
the windowsManager shutdown.

Updates #11222

Signed-off-by: Nick Khyl <nickk@tailscale.com>
3 months ago
Brad Fitzpatrick e1bd7488d0 all: remove LenIter, use Go 1.22 range-over-int instead
Updates #11058
Updates golang/go#65685

Change-Id: Ibb216b346e511d486271ab3d84e4546c521e4e22
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 months ago
mrrfv ff1391a97e net/dns/publicdns: add Mullvad family DNS to the list of known DoH servers
Adds the new Mullvad family DNS server to the known DNS over HTTPS server list.

Signed-off-by: mrrfv <rm-rfv-no-preserve-root@protonmail.com>
3 months ago
James Tucker 8d0d46462b net/dns: timeout DOH requests after 10s without response headers
If a client socket is remotely lost but the client is not sent an RST in
response to the next request, the socket might sit in RTO for extended
lengths of time, resulting in "no internet" for users. Instead, timeout
after 10s, which will close the underlying socket, recovering from the
situation more promptly.

Updates #10967

Signed-off-by: James Tucker <james@tailscale.com>
3 months ago
James Tucker 651c4899ac net/interfaces: reduce & cleanup logs on iOS
We don't need a log line every time defaultRoute is read in the good
case, and we now only log default interface updates that are actually
changes.

Updates #3363

Signed-off-by: James Tucker <james@tailscale.com>
3 months ago
Andrew Dunham e8d2fc7f7f net/tshttpproxy: log when we're using a proxy
Updates #11196

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id6334c10f52f4cfbda9f03dc8096ab7a6c54a088
3 months ago
James Tucker 8fe504241d net/ktimeout: add a package to set TCP user timeout
Setting a user timeout will be a more practical tuning knob for a number
of endpoints, this provides a way to set it.

Updates tailscale/corp#17587

Signed-off-by: James Tucker <james@tailscale.com>
3 months ago
Andrew Dunham 70b7201744 net/dns: fix infinite loop when run on Amazon Linux 2023
This fixes an infinite loop caused by the configuration of
systemd-resolved on Amazon Linux 2023 and how that interacts with
Tailscale's "direct" mode. We now drop the Tailscale service IP from the
OS's "base configuration" when we detect this configuration.

Updates #7816

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I73a4ea8e65571eb368c7e179f36af2c049a588ee
4 months ago
Andrew Dunham b0e96a6c39 net/dns: log more info when openresolv commands fail
Updates #11129

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ic594868ba3bc31f6d3b0721ecba4090749a81f7f
4 months ago
Brad Fitzpatrick 2bd3c1474b util/cmpx: delete now that we're using Go 1.22
Updates #11058

Change-Id: I09dea8e86f03ec148b715efca339eab8b1f0f644
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 months ago
Andrew Dunham fd94d96e2b net/portmapper: support legacy "urn:dslforum-org" portmapping services
These are functionally the same as the "urn:schemas-upnp-org" services
with a few minor changes, and are still used by older devices. Support
them to improve our ability to obtain an external IP on such networks.

Updates #10911

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I05501fad9d6f0a3b8cf19fc95eee80e7d16cc2cf
4 months ago
Andrew Dunham b45089ad85 net/portmapper: handle cases where we have no supported clients
This no longer results in a nil pointer exception when we get a valid
UPnP response with no supported clients.

Updates #10911

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I6e3715a49a193ff5261013871ad7fff197a4d77e
4 months ago
kari-ts c9fd166cc6
net/netmon: when a new network is added, trigger netmon update (#10840)
Fixes #10107
5 months ago
Andrew Dunham 20f3f706a4 net/netutil: allow 16-bit 4via6 site IDs
The prefix has space for 32-bit site IDs, but the validateViaPrefix
function would previously have disallowed site IDs greater than 255.

Fixes tailscale/corp#16470

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I4cdb0711dafb577fae72d86c4014cf623fa538ef
5 months ago
James Tucker 953fa80c6f cmd/{derper,stund},net/stunserver: add standalone stun server
Add a standalone server for STUN that can be hosted independently of the
derper, and factor that back into the derper.

Fixes #8434
Closes #8435
Closes #10745

Signed-off-by: James Tucker <james@tailscale.com>
5 months ago
Andrew Dunham 35c303227a net/dns/resolver: add ID to verbose logs in forwarder
To make it easier to correlate the starting/ending log messages.

Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I2802d53ad98e19bc8914bc58f8c04d4443227b26
5 months ago
Andrea Gottardo d9aeb30281
net/interfaces: handle iOS network transitions (#10680)
Updates #8022
Updates #6075

On iOS, we currently rely on delegated interface information to figure out the default route interface.  The NetworkExtension framework in iOS seems to set the delegate interface only once, upon the *creation* of the VPN tunnel. If a network transition (e.g. from Wi-Fi to Cellular) happens while the tunnel is connected, it will be ignored and we will still try to set Wi-Fi as the default route because the delegated interface is not getting updated as connectivity transitions.

Here we work around this on the Swift side with a NWPathMonitor instance that observes the interface name of the first currently satisfied network path. Our Swift code will call into `UpdateLastKnownDefaultRouteInterface`, so we can rely on that when it is set.

If for any reason the Swift machinery didn't work and we don't get any updates, here we also have some fallback logic: we try finding a hardcoded Wi-Fi interface called en0. If en0 is down, we fall back to cellular (pdp_ip0) as a last resort. This doesn't handle all edge cases like USB-Ethernet adapters or multiple Ethernet interfaces, but it is good enough to ensure connectivity isn't broken.

I tested this on iPhones and iPads running iOS 17.1 and it appears to work. Switching between different cellular plans on a dual SIM configuration also works (the interface name remains pdp_ip0).

Signed-off-by: Andrea Gottardo <andrea@tailscale.com>
5 months ago
Andrew Dunham fa3639783c net/portmapper: check returned epoch from PMP and PCP protocols
If the epoch that we see during a Probe is less than the existing epoch,
it means that the gateway has either restarted or reset its
configuration, and an existing mapping is no longer valid. Reset any
saved mapping(s) if we detect this case so that a future
createOrGetMapping will not attempt to re-use it.

Updates #10597

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ie3cddaf625cb94a29885f7a1eeea25dbf6b97b47
5 months ago
Andrew Lytvynov 2716250ee8
all: cleanup unused code, part 2 (#10670)
And enable U1000 check in staticcheck.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
5 months ago
Nick Khyl c9836b454d net/netmon: fix goroutine leak in winMon if the monitor is never started
When the portable Monitor creates a winMon via newOSMon, we register
address and route change callbacks with Windows. Once a callback is hit,
it starts a goroutine that attempts to send the event into messagec and returns.
The newly started goroutine then blocks until it can send to the channel.
However, if the monitor is never started and winMon.Receive is never called,
the goroutines remain indefinitely blocked, leading to goroutine leaks and
significant memory consumption in the tailscaled service process on Windows.
Unlike the tailscaled subprocess, the service process creates but never starts
a Monitor.

This PR adds a check within the callbacks to confirm the monitor's active status,
and exits immediately if the monitor hasn't started.

Updates #9864

Signed-off-by: Nick Khyl <nickk@tailscale.com>
5 months ago
Andrew Lytvynov 1302bd1181
all: cleanup unused code, part 1 (#10661)
Run `staticcheck` with `U1000` to find unused code. This cleans up about
a half of it. I'll do the other half separately to keep PRs manageable.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
5 months ago
Andrew Dunham 3c333f6341 net/portmapper: add logs about obtained mapping(s)
This logs additional information about what mapping(s) are obtained
during the creation process, including whether we return an existing
cached mapping.

Updates #10597

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I9ff25071f064c91691db9ab0b9365ccc5f948d6e
5 months ago
Andrew Dunham 01286af82b net/interfaces: better handle multiple interfaces in LikelyHomeRouterIP
Currently, we get the "likely home router" gateway IP and then iterate
through all IPs for all interfaces trying to match IPs to determine the
source IP. However, on many platforms we know what interface the gateway
is through, and thus we don't need to iterate through all interfaces
checking IPs. Instead, use the IP address of the associated interface.

This better handles the case where we have multiple interfaces on a
system all connected to the same gateway, and where the first interface
that we visit (as iterated by ForeachInterfaceAddress) isn't also the
default internet route.

Updates #8992

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8632f577f1136930f4ec60c76376527a19a47d1f
5 months ago
Andrew Dunham 09136e5995
net/netutil: add function to check rp_filter value (#5703)
Updates #4432


Change-Id: Ifc332a5747fc1feffdbb87437308cf8ecb21b0b0

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
5 months ago
Andrew Dunham d05a572db4 net/portmapper: handle multiple UPnP discovery responses
Instead of taking the first UPnP response we receive and using that to
create port mappings, store all received UPnP responses, sort and
deduplicate them, and then try all of them to obtain an external
address.

Updates #10602

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I783ccb1834834ee2a9ecbae2b16d801f2354302f
6 months ago
Andrew Dunham 727acf96a6 net/netcheck: use DERP frames as a signal for home region liveness
This uses the fact that we've received a frame from a given DERP region
within a certain time as a signal that the region is stil present (and
thus can still be a node's PreferredDERP / home region) even if we don't
get a STUN response from that region during a netcheck.

This should help avoid DERP flaps that occur due to losing STUN probes
while still having a valid and active TCP connection to the DERP server.

RELNOTE=Reduce home DERP flapping when there's still an active connection

Updates #8603

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: If7da6312581e1d434d5c0811697319c621e187a0
6 months ago
Andrew Dunham bac4890467 net/portmapper: be smarter about selecting a UPnP device
Previously, we would select the first WANIPConnection2 (and related)
client from the root device, without any additional checks. However,
some routers expose multiple UPnP devices in various states, and simply
picking the first available one can result in attempting to perform a
portmap with a device that isn't functional.

Instead, mimic what the miniupnpc code does, and prefer devices that are
(a) reporting as Connected, and (b) have a valid external IP address.
For our use-case, we additionally prefer devices that have an external
IP address that's a public address, to increase the likelihood that we
can obtain a direct connection from peers.

Finally, we split out fetching the root device (getUPnPRootDevice) from
selecting the best service within that root device (selectBestService),
and add some extensive tests for various UPnP server behaviours.

RELNOTE=Improve UPnP portmapping when multiple UPnP services exist

Updates #8364

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I71795cd80be6214dfcef0fe83115a5e3fe4b8753
6 months ago
Andrea Barisani affe11c503 net/netcheck: only run HTTP netcheck for tamago clients
Signed-off-by: Andrea Barisani <andrea@inversepath.com>
6 months ago
Denton Gentry 137e9f4c46 net/portmap: add test of Mikrotik Root Desc XML.
Unfortunately in the test we can't reproduce the failure seen
in the real system ("SOAP fault: UPnPError")

Updates https://github.com/tailscale/tailscale/issues/8364

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
6 months ago