Commit Graph

1485 Commits (78dc8622d7e19082d94877bfa12a8801be9ee9b9)

Author SHA1 Message Date
James Tucker a2eb1c22b0 wgengine/magicsock: allow disco communication without known endpoints
Just because we don't have known endpoints for a peer does not mean that
the peer should become unreachable. If we know the peers key, it should
be able to call us, then we can talk back via whatever path it called us
on. First step - don't drop the packet in this context.

Updates tailscale/corp#19106

Signed-off-by: James Tucker <james@tailscale.com>
2 months ago
James Tucker db760d0bac cmd/tailscaled: move cleanup to an implicit action during startup
This removes a potentially increased boot delay for certain boot
topologies where they block on ExecStartPre that may have socket
activation dependencies on other system services (such as
systemd-resolved and NetworkManager).

Also rename cleanup to clean up in affected/immediately nearby places
per code review commentary.

Fixes #11599

Signed-off-by: James Tucker <james@tailscale.com>
2 months ago
Maisem Ali 3f4c5daa15 wgengine/netstack: remove SubnetRouterWrapper
It was used when we only supported subnet routers on linux
and would nil out the SubnetRoutes slice as no other router
worked with it, but now we support subnet routers on ~all platforms.

The field it was setting to nil is now only used for network logging
and nowhere else, so keep the field but drop the SubnetRouterWrapper
as it's not useful.

Updates #cleanup

Change-Id: Id03f9b6ec33e47ad643e7b66e07911945f25db79
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 months ago
James Tucker 6e334e64a1 net/netcheck,wgengine/magicsock: align DERP frame receive time heuristics
The netcheck package and the magicksock package coordinate via the
health package, but both sides have time based heuristics through
indirect dependencies. These were misaligned, so the implemented
heuristic aimed at reducing DERP moves while there is active traffic
were non-operational about 3/5ths of the time.

It is problematic to setup a good test for this integration presently,
so instead I added comment breadcrumbs along with the initial fix.

Updates #8603

Signed-off-by: James Tucker <james@tailscale.com>
2 months ago
Joonas Kuorilehto fe0cfec4ad wgengine/router: enable ip forwarding on gokrazy
Only on Gokrazy, set sysctls to enable IP forwarding so subnet routing
and advertised exit node works.

Fixes #11405

Signed-off-by: Joonas Kuorilehto <joneskoo@derbian.fi>
2 months ago
Percy Wegmann 853e3e29a0 wgengine/router: provide explicit hook to signal Android when VPN needs to be reconfigured
This allows clients to avoid establishing their VPN multiple times when
both routes and DNS are changing in rapid succession.

Updates tailscale/corp#18928

Signed-off-by: Percy Wegmann <percy@tailscale.com>
2 months ago
Charlotte Brandhorst-Satzkorn 93618a3518
tailscale: update tailfs functions and vars to use drive naming (#11597)
This change updates all tailfs functions and the majority of the tailfs
variables to use the new drive naming.

Updates tailscale/corp#16827

Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
2 months ago
Charlotte Brandhorst-Satzkorn 14683371ee
tailscale: update tailfs file and package names (#11590)
This change updates the tailfs file and package names to their new
naming convention.

Updates #tailscale/corp#16827

Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
2 months ago
Irbe Krumina 5fb721d4ad
util/linuxfw,wgengine/router: skip IPv6 firewall configuration in partial iptables mode (#11546)
We have hosts that support IPv6, but not IPv6 firewall configuration
in iptables mode.
We also have hosts that have some support for IPv6 firewall
configuration in iptables mode, but do not have iptables filter table.
We should:
- configure ip rules for all hosts that support IPv6
- only configure firewall rules in iptables mode if the host
has iptables filter table.

Updates tailscale/tailscale#11540

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 months ago
Brad Fitzpatrick a36cfb4d3d tailcfg, ipn/ipnlocal, wgengine/magicsock: add only-tcp-443 node attr
Updates tailscale/corp#17879

Change-Id: I0dc305d147b76c409cf729b599a94fa723aef0e0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
James Tucker 3f7313dbdb util/linuxfw,wgengine/router: enable IPv6 configuration when netfilter is disabled
Updates #11434

Signed-off-by: James Tucker <james@tailscale.com>
2 months ago
Joe Tsai 85febda86d
all: use zstdframe where sensible (#11491)
Use the zstdframe package where sensible instead of plumbing
around our own zstd.Encoder just for stateless operations.

This causes logtail to have a dependency on zstd,
but that's arguably okay since zstd support is implicit
to the protocol between a client and the logging service.
Also, virtually every caller to logger.NewLogger was
manually setting up a zstd.Encoder anyways,
meaning that zstd was functionally always a dependency.

Updates #cleanup
Updates tailscale/corp#18514

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 months ago
Brad Fitzpatrick 5d1c72f76b wgengine/magicsock: don't use endpoint debug ringbuffer on mobile.
Save some memory.

Updates tailscale/corp#18514

Change-Id: Ibcaf3c6d8e5cc275c81f04141d0f176e2249509b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 months ago
Andrew Dunham 6da1dc84de wgengine: fix logger data race in tests
Observed in:
    https://github.com/tailscale/tailscale/actions/runs/8350904950/job/22858266932?pr=11463

Updates #11226

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I9b57db4b34b6ad91d240cd9fa7e344fc0376d52d
2 months ago
Andrew Dunham 7429e8912a wgengine/netstack: fix bug with duplicate SYN packets in client limit
This fixes a bug that was introduced in #11258 where the handling of the
per-client limit didn't properly account for the fact that the gVisor
TCP forwarder will return 'true' to indicate that it's handled a
duplicate SYN packet, but not launch the handler goroutine.

In such a case, we neither decremented our per-client limit in the
wrapper function, nor did we do so in the handler function, leading to
our per-client limit table slowly filling up without bound.

Fix this by doing the same duplicate-tracking logic that the TCP
forwarder does so we can detect such cases and appropriately decrement
our in-flight counter.

Updates tailscale/corp#12184

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib6011a71d382a10d68c0802593f34b8153d06892
3 months ago
Andrew Dunham f072d017bd wgengine/magicsock: don't change DERP home when not connected to control
This pretty much always results in an outage because peers won't
discover our new home region and thus won't be able to establish
connectivity.

Updates tailscale/corp#18095

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ic0d09133f198b528dd40c6383b16d7663d9d37a7
3 months ago
Andrew Dunham 62cf83eb92 go.mod: bump gvisor
The `stack.PacketBufferPtr` type no longer exists; replace it with
`*stack.PacketBuffer` instead.

Updates #8043

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib56ceff09166a042aa3d9b80f50b2aa2d34b3683
3 months ago
Andrew Dunham 4338db28f7 wgengine/magicsock: prefer link-local addresses to private ones
Since link-local addresses are definitionally more likely to be a direct
(lower-latency, more reliable) connection than a non-link-local private
address, give those a bit of a boost when selecting endpoints.

Updates #8097

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I93fdeb07de55ba39ba5fcee0834b579ca05c2a4e
3 months ago
Brad Fitzpatrick f18f591bc6 wgengine: plumb the PeerByKey from wgengine to magicsock
This was just added in 69f4b459 which doesn't yet use it. This still
doesn't yet use it. It just pushes it down deeper into magicsock where
it'll used later.

Updates #7617

Change-Id: If2f8fd380af150ffc763489e1ff4f8ca2899fac6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 months ago
Percy Wegmann 2d5d6f5403 ipn,wgengine: only intercept TailFS traffic on quad 100
This fixes a regression introduced with 993acf4 and released in
v1.60.0.

The regression caused us to intercept all userspace traffic to port
8080 which prevented users from exposing their own services to their
tailnet at port 8080.

Now, we only intercept traffic to port 8080 if it's bound for
100.100.100.100 or fd7a:115c:a1e0::53.

Fixes #11283

Signed-off-by: Percy Wegmann <percy@tailscale.com>
(cherry picked from commit 17cd0626f3)
3 months ago
Brad Fitzpatrick 69f4b4595a wgengine{,/wgint}: add wgint.Peer wrapper type, add to wgengine.Engine
This adds a method to wgengine.Engine and plumbed down into magicsock
to add a way to get a type-safe Tailscale-safe wrapper around a
wireguard-go device.Peer that only exposes methods that are safe for
Tailscale to use internally.

It also removes HandshakeAttempts from PeerStatusLite that was just
added as it wasn't needed yet and is now accessible ala cart as needed
from the Peer type accessor.

None of this is used yet.

Updates #7617

Change-Id: I07be0c4e6679883e6eeddf8dbed7394c9e79c5f4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 months ago
Brad Fitzpatrick b4ff9a578f wgengine: rename local variable from 'found' to conventional 'ok'
Updates #cleanup

Change-Id: I799dc86ea9e4a3a949592abdd8e74282e7e5d086
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 months ago
Brad Fitzpatrick a8a525282c wgengine: use slices.Clone in two places
Updates #cleanup

Change-Id: I1cb30efb6d09180e82b807d6146f37897ef99307
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 months ago
Brad Fitzpatrick 74b8985e19 ipn/ipnstate, wgengine: make PeerStatusLite.LastHandshake zero Time means none
... rather than 1970. Code was using IsZero against the 1970 team
(which isn't a zero value), but fortunately not anywhere that seems to
have mattered.

Updates #cleanup

Change-Id: I708a3f2a9398aaaedc9503678b4a8a311e0e019e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 months ago
Andrew Dunham 3dd8ae2f26 net/tstun: fix spelling of "WireGuard"
Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ida7e30f4689bc18f5f7502f53a0adb5ac3c7981a
3 months ago
Andrew Dunham c5abbcd4b4 wgengine/netstack: add a per-client limit for in-flight TCP forwards
This is a fun one. Right now, when a client is connecting through a
subnet router, here's roughly what happens:

1. The client initiates a connection to an IP address behind a subnet
   router, and sends a TCP SYN
2. The subnet router gets the SYN packet from netstack, and after
   running through acceptTCP, starts DialContext-ing the destination IP,
   without accepting the connection¹
3. The client retransmits the SYN packet a few times while the dial is
   in progress, until either...
4. The subnet router successfully establishes a connection to the
   destination IP and sends the SYN-ACK back to the client, or...
5. The subnet router times out and sends a RST to the client.
6. If the connection was successful, the client ACKs the SYN-ACK it
   received, and traffic starts flowing

As a result, the notification code in forwardTCP never notices when a
new connection attempt is aborted, and it will wait until either the
connection is established, or until the OS-level connection timeout is
reached and it aborts.

To mitigate this, add a per-client limit on how many in-flight TCP
forwarding connections can be in-progress; after this, clients will see
a similar behaviour to the global limit, where new connection attempts
are aborted instead of waiting. This prevents a single misbehaving
client from blocking all other clients of a subnet router by ensuring
that it doesn't starve the global limiter.

Also, bump the global limit again to a higher value.

¹ We can't accept the connection before establishing a connection to the
remote server since otherwise we'd be opening the connection and then
immediately closing it, which breaks a bunch of stuff; see #5503 for
more details.

Updates tailscale/corp#12184

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I76e7008ddd497303d75d473f534e32309c8a5144
3 months ago
Brad Fitzpatrick 1cf85822d0 ipn/ipnstate, wgengine/wgint: add handshake attempts accessors
Not yet used. This is being made available so magicsock/wgengine can
use it to ignore certain sends (UDP + DERP) later on at least mobile,
letting wireguard-go think it's doing its full attempt schedule, but
we can cut it short conditionally based on what we know from the
control plane.

Updates #7617

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: Ia367cf6bd87b2aeedd3c6f4989528acdb6773ca7
3 months ago
Brad Fitzpatrick eb28818403 wgengine: make pendOpen time later, after dup check
Otherwise on OS retransmits, we'd make redundant timers in Go's timer
heap that upon firing just do nothing (well, grab a mutex and check a
map and see that there's nothing to do).

Updates #cleanup

Change-Id: Id30b8b2d629cf9c7f8133a3f7eca5dc79e81facb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 months ago
Brad Fitzpatrick 219efebad4 wgengine: reduce critical section
No need to hold wgLock while using the device to LookupPeer;
that has its own mutex already.

Updates #cleanup

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: Ib56049fcc7163cf5a2c2e7e12916f07b4f9d67cb
3 months ago
Nick Khyl 7ef1fb113d cmd/tailscaled, ipn/ipnlocal, wgengine: shutdown tailscaled if wgdevice is closed
Tailscaled becomes inoperative if the Tailscale Tunnel wintun adapter is abruptly removed.
wireguard-go closes the device in case of a read error, but tailscaled keeps running.
This adds detection of a closed WireGuard device, triggering a graceful shutdown of tailscaled.
It is then restarted by the tailscaled watchdog service process.

Fixes #11222

Signed-off-by: Nick Khyl <nickk@tailscale.com>
3 months ago
Anton Tolchanov cd9cf93de6 wgengine/netstack: expose TCP forwarder drops via clientmetrics
- add a clientmetric with a counter of TCP forwarder drops due to the
  max attempts;
- fix varz metric types, as they are all counters.

Updates #8210

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
3 months ago
Brad Fitzpatrick e1bd7488d0 all: remove LenIter, use Go 1.22 range-over-int instead
Updates #11058
Updates golang/go#65685

Change-Id: Ibb216b346e511d486271ab3d84e4546c521e4e22
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 months ago
Brad Fitzpatrick 6ad6d6b252 wgengine/wglog: add TS_DEBUG_RAW_WGLOG envknob for raw wg logs
Updates #7617 (part of debugging it)

Change-Id: I1bcbdcf0f929e3bcf83f244b1033fd438aa6dac1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 months ago
Brad Fitzpatrick 8b9474b06a wgengine/wgcfg: don't send UAPI to disable keep-alives on new peers
That's already the default. Avoid the overhead of writing it on one
side and reading it on the other to do nothing.

Updates #cleanup (noticed while researching something else)

Change-Id: I449c88a022271afb9be5da876bfaf438fe5d3f58
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 months ago
James Tucker 131f9094fd wgengine/wglog: quieten WireGuard logs for allowedips
An increasing number of users have very large subnet route
configurations, which can produce very large amounts of log data when
WireGuard is reconfigured. The logs don't contain the actual routes, so
they're largely useless for diagnostics, so we'll just suppress them.

Fixes tailscale/corp#17532

Signed-off-by: James Tucker <james@tailscale.com>
3 months ago
Jason Barnett 4d668416b8 wgengine/router: fix ip rule restoration
Fixes #10857

Signed-off-by: Jason Barnett <J@sonBarnett.com>
4 months ago
Aaron Klotz f7acbefbbb wgengine/router: make the Windows ifconfig implementation reuse existing MibIPforwardRow2 when possible
Looking at profiles, we spend a lot of time in winipcfg.LUID.DeleteRoute
looking up the routing table entry for the provided RouteData.

But we already have the row! We previously obtained that data via the full
table dump we did in getInterfaceRoutes. We can make this a lot faster by
hanging onto a reference to the wipipcfg.MibIPforwardRow2 and executing
the delete operation directly on that.

Fixes #11123

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
4 months ago
Percy Wegmann c42a4e407a tailfs: listen for local clients only on 100.100.100.100
FileSystemForLocal was listening on the node's Tailscale address,
which potentially exposes the user's view of TailFS shares to other
Tailnet users. Remote nodes should connect to exported shares via
the peerapi.

This removes that code so that FileSystemForLocal is only avaialable
on 100.100.100.100:8080.

Updates tailscale/corp#16827

Signed-off-by: Percy Wegmann <percy@tailscale.com>
4 months ago
Percy Wegmann ddcffaef7a tailfs: disable TailFSForLocal via policy
Adds support for node attribute tailfs:access. If this attribute is
not present, Tailscale will not accept connections to the local TailFS
server at 100.100.100.100:8080.

Updates tailscale/corp#16827

Signed-off-by: Percy Wegmann <percy@tailscale.com>
4 months ago
Percy Wegmann abab0d4197 tailfs: clean up naming and package structure
- Restyles tailfs -> tailFS
- Defines interfaces for main TailFS types
- Moves implemenatation of TailFS into tailfsimpl package

Updates tailscale/corp#16827

Signed-off-by: Percy Wegmann <percy@tailscale.com>
4 months ago
Percy Wegmann 993acf4475 tailfs: initial implementation
Add a WebDAV-based folder sharing mechanism that is exposed to local clients at
100.100.100.100:8080 and to remote peers via a new peerapi endpoint at
/v0/tailfs.

Add the ability to manage folder sharing via the new 'share' CLI sub-command.

Updates tailscale/corp#16827

Signed-off-by: Percy Wegmann <percy@tailscale.com>
4 months ago
Joe Tsai 94a4f701c2
all: use reflect.TypeFor now available in Go 1.22 (#11078)
Updates #cleanup

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
4 months ago
Jordan Whited 8b47322acc
wgengine/magicsock: implement probing of UDP path lifetime (#10844)
This commit implements probing of UDP path lifetime on the tail end of
an active direct connection. Probing configuration has two parts -
Cliffs, which are various timeout cliffs of interest, and
CycleCanStartEvery, which limits how often a probing cycle can start,
per-endpoint. Initially a statically defined default configuration will
be used. The default configuration has cliffs of 10s, 30s, and 60s,
with a CycleCanStartEvery of 24h. Probing results are communicated via
clientmetric counters. Probing is off by default, and can be enabled
via control knob. Probing is purely informational and does not yet
drive any magicsock behaviors.

Updates #540

Signed-off-by: Jordan Whited <jordan@tailscale.com>
4 months ago
James Tucker 7e3bcd297e go.mod,wgengine/netstack: bump gvisor
Updates #8043

Signed-off-by: James Tucker <james@tailscale.com>
4 months ago
Claire Wang 213d696db0
magicsock: mute noisy expected peer mtu related error (#10870) 4 months ago
Andrew Dunham 7a0392a8a3 wgengine/netstack: expose gVisor metrics through expvar
When tailscaled is run with "-debug 127.0.0.1:12345", these metrics are
available at:
    http://localhost:12345/debug/metrics

Updates #8210

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I19db6c445ac1f8344df2bc1066a3d9c9030606f8
4 months ago
Andrew Dunham 6540d1f018 wgengine/router: look up absolute path to netsh.exe on Windows
This is in response to logs from a customer that show that we're unable
to run netsh due to the following error:

    router: firewall: adding Tailscale-Process rule to allow UDP for "C:\\Program Files\\Tailscale\\tailscaled.exe" ...
    router: firewall: error adding Tailscale-Process rule: exec: "netsh": cannot run executable found relative to current directory:

There's approximately no reason to ever dynamically look up the path of
a system utility like netsh.exe, so instead let's first look for it
in the System32 directory and only if that fails fall back to the
previous behaviour.

Updates #10804

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I68cfeb4cab091c79ccff3187d35f50359a690573
5 months ago
Jordan Whited b084888e4d
wgengine/magicsock: fix typos in docs (#10729)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
5 months ago
Andrew Lytvynov 2716250ee8
all: cleanup unused code, part 2 (#10670)
And enable U1000 check in staticcheck.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
5 months ago
Andrew Lytvynov 1302bd1181
all: cleanup unused code, part 1 (#10661)
Run `staticcheck` with `U1000` to find unused code. This cleans up about
a half of it. I'll do the other half separately to keep PRs manageable.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
5 months ago