Commit Graph

1563 Commits (c61826a790a2e456ed4f92f1f481d792c61e6a05)

Author SHA1 Message Date
Andrew Dunham 8f7f9ac17e wgengine/netstack: handle 4via6 routes that are advertised by the same node
Previously, a node that was advertising a 4via6 route wouldn't be able
to make use of that same route; the packet would be delivered to
Tailscale, but since we weren't accepting it in handleLocalPackets, the
packet wouldn't be delivered to netstack and would never hit the 4via6
logic. Let's add that support so that usage of 4via6 is consistent
regardless of where the connection is initiated from.

Updates #11304

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ic28dc2e58080d76100d73b93360f4698605af7cb
6 months ago
Brad Fitzpatrick 21509db121 ipn/ipnlocal, all: plumb health trackers in tests
I saw some panics in CI, like:

    2024-05-08T04:30:25.9553518Z ## WARNING: (non-fatal) nil health.Tracker (being strict in CI):
    2024-05-08T04:30:25.9554043Z goroutine 801 [running]:
    2024-05-08T04:30:25.9554489Z tailscale.com/health.(*Tracker).nil(0x0)
    2024-05-08T04:30:25.9555086Z 	tailscale.com/health/health.go:185 +0x70
    2024-05-08T04:30:25.9555688Z tailscale.com/health.(*Tracker).SetUDP4Unbound(0x0, 0x0)
    2024-05-08T04:30:25.9556373Z 	tailscale.com/health/health.go:532 +0x2f
    2024-05-08T04:30:25.9557296Z tailscale.com/wgengine/magicsock.(*Conn).bindSocket(0xc0003b4808, 0xc0003b4878, {0x1fbca53, 0x4}, 0x0)
    2024-05-08T04:30:25.9558301Z 	tailscale.com/wgengine/magicsock/magicsock.go:2481 +0x12c5
    2024-05-08T04:30:25.9559026Z tailscale.com/wgengine/magicsock.(*Conn).rebind(0xc0003b4808, 0x0)
    2024-05-08T04:30:25.9559874Z 	tailscale.com/wgengine/magicsock/magicsock.go:2510 +0x16f
    2024-05-08T04:30:25.9561038Z tailscale.com/wgengine/magicsock.NewConn({0xc000063c80, 0x0, 0xc000197930, 0xc000197950, 0xc000197960, {0x0, 0x0}, 0xc000197970, 0xc000198ee0, 0x0, ...})
    2024-05-08T04:30:25.9562402Z 	tailscale.com/wgengine/magicsock/magicsock.go:476 +0xd5f
    2024-05-08T04:30:25.9563779Z tailscale.com/wgengine.NewUserspaceEngine(0xc000063c80, {{0x22c8750, 0xc0001976b0}, 0x0, {0x22c3210, 0xc000063c80}, {0x22c31d8, 0x2d3c900}, 0x0, 0x0, ...})
    2024-05-08T04:30:25.9564982Z 	tailscale.com/wgengine/userspace.go:389 +0x159d
    2024-05-08T04:30:25.9565529Z tailscale.com/ipn/ipnlocal.newTestBackend(0xc000358b60)
    2024-05-08T04:30:25.9566086Z 	tailscale.com/ipn/ipnlocal/serve_test.go:675 +0x2a5
    2024-05-08T04:30:25.9566612Z ta

Updates #11874

Change-Id: I3432ed52d670743e532be4642f38dbd6e3763b1b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Maisem Ali af97e7a793 tailcfg,all: add/plumb Node.IsJailed
This adds a new bool that can be sent down from control
to do jailing on the client side. Previously this would
only be done from control by modifying the packet filter
we sent down to clients. This would result in a lot of
additional work/CPU on control, we could instead just
do this on the client. This has always been a TODO which
we keep putting off, might as well do it now.

Updates tailscale/corp#19623

Signed-off-by: Maisem Ali <maisem@tailscale.com>
6 months ago
Maisem Ali e67069550b ipn/ipnlocal,net/tstun,wgengine: create and plumb jailed packet filter
This plumbs a packet filter for jailed nodes through to the
tstun.Wrapper; the filter for a jailed node is equivalent to a "shields
up" filter. Currently a no-op as there is no way for control to
tell the client whether a peer is jailed.

Updates tailscale/corp#19623

Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Maisem Ali <maisem@tailscale.com>
Change-Id: I5ccc5f00e197fde15dd567485b2a99d8254391ad
6 months ago
Andrew Lytvynov c28f5767bf
various: implement stateful firewalling on Linux (#12025)
Updates https://github.com/tailscale/corp/issues/19623


Change-Id: I7980e1fb736e234e66fa000d488066466c96ec85

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
6 months ago
Claire Wang 35872e86d2
ipnlocal, magicsock: store last suggested exit node id in local backend (#11959)
Updates tailscale/corp#19681

Signed-off-by: Claire Wang <claire@tailscale.com>
6 months ago
Brad Fitzpatrick 46f3feae96 ssh/tailssh: plumb health.Tracker in test
In prep for it being required in more places.

Updates #11874

Change-Id: Ib743205fc2a6c6ff3d2c4ed3a2b28cac79156539
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Claire Wang e0287a4b33
wgengine: add exit destination logging enable for wgengine logger (#11952)
Updates tailscale/corp#18625
Co-authored-by: Kevin Liang <kevinliang@tailscale.com>
Signed-off-by: Claire Wang <claire@tailscale.com>
6 months ago
Andrew Dunham b2b49cb3d5 wgengine/wgcfg/nmcfg: skip expired peers
Updates tailscale/corp#19315

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1ad0c8796efe3dd456280e51efaf81f6d2049772
6 months ago
Brad Fitzpatrick b9adbe2002 net/{interfaces,netmon}, all: merge net/interfaces package into net/netmon
In prep for most of the package funcs in net/interfaces to become
methods in a long-lived netmon.Monitor that can cache things.  (Many
of the funcs are very heavy to call regularly, whereas the long-lived
netmon.Monitor can subscribe to things from the OS and remember
answers to questions it's asked regularly later)

Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299

Change-Id: Ie4e8dedb70136af2d611b990b865a822cd1797e5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Brad Fitzpatrick 3672f29a4e net/netns, net/dns/resolver, etc: make netmon required in most places
The goal is to move more network state accessors to netmon.Monitor
where they can be cheaper/cached. But first (this change and others)
we need to make sure the one netmon.Monitor is plumbed everywhere.

Some notable bits:

* tsdial.NewDialer is added, taking a now-required netmon

* because a tsdial.Dialer always has a netmon, anything taking both
  a Dialer and a NetMon is now redundant; take only the Dialer and
  get the NetMon from that if/when needed.

* netmon.NewStatic is added, primarily for tests

Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299

Change-Id: I877f9cb87618c4eb037cee098241d18da9c01691
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Brad Fitzpatrick 7a62dddeac net/netcheck, wgengine/magicsock: make netmon.Monitor required
This has been a TODO for ages. Time to do it.

The goal is to move more network state accessors to netmon.Monitor
where they can be cheaper/cached.

Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299

Change-Id: I60fc6508cd2d8d079260bda371fc08b6318bcaf1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Brad Fitzpatrick 7f587d0321 health, wgengine/magicsock: remove last of health package globals
Fixes #11874
Updates #4136

Change-Id: Ib70e6831d4c19c32509fe3d7eee4aa0e9f233564
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Brad Fitzpatrick 745931415c health, all: remove health.Global, finish plumbing health.Tracker
Updates #11874
Updates #4136

Change-Id: I414470f71d90be9889d44c3afd53956d9f26cd61
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Brad Fitzpatrick 6d69fc137f ipn/{ipnlocal,localapi},wgengine{,/magicsock}: plumb health.Tracker
Down to 25 health.Global users. After this remains controlclient &
net/dns & wgengine/router.

Updates #11874
Updates #4136

Change-Id: I6dd1856e3d9bf523bdd44b60fb3b8f7501d5dc0d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Brad Fitzpatrick 723c775dbb tsd, ipnlocal, etc: add tsd.System.HealthTracker, start some plumbing
This adds a health.Tracker to tsd.System, accessible via
a new tsd.System.HealthTracker method.

In the future, that new method will return a tsd.System-specific
HealthTracker, so multiple tsnet.Servers in the same process are
isolated. For now, though, it just always returns the temporary
health.Global value. That permits incremental plumbing over a number
of changes. When the second to last health.Global reference is gone,
then the tsd.System.HealthTracker implementation can return a private
Tracker.

The primary plumbing this does is adding it to LocalBackend and its
dozen and change health calls. A few misc other callers are also
plumbed. Subsequent changes will flesh out other parts of the tree
(magicsock, controlclient, etc).

Updates #11874
Updates #4136

Change-Id: Id51e73cfc8a39110425b6dc19d18b3975eac75ce
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Brad Fitzpatrick 5b32264033 health: break Warnable into a global and per-Tracker value halves
Previously it was both metadata about the class of warnable item as
well as the value.

Now it's only metadata and the value is per-Tracker.

Updates #11874
Updates #4136

Change-Id: Ia1ed1b6c95d34bc5aae36cffdb04279e6ba77015
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Brad Fitzpatrick ebc552d2e0 health: add Tracker type, in prep for removing global variables
This moves most of the health package global variables to a new
`health.Tracker` type.

But then rather than plumbing the Tracker in tsd.System everywhere,
this only goes halfway and makes one new global Tracker
(`health.Global`) that all the existing callers now use.

A future change will eliminate that global.

Updates #11874
Updates #4136

Change-Id: I6ee27e0b2e35f68cb38fecdb3b2dc4c3f2e09d68
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Percy Wegmann c8e912896e wgengine/router: consolidate routes before reconfiguring router for mobile clients
This helps reduce memory pressure on tailnets with large numbers
of routes.

Updates tailscale/corp#19332

Signed-off-by: Percy Wegmann <percy@tailscale.com>
6 months ago
Irbe Krumina 3af0f526b8
cmd{containerboot,k8s-operator},util/linuxfw: support ExternalName Services (#11802)
* cmd/containerboot,util/linuxfw: support proxy backends specified by DNS name

Adds support for optionally configuring containerboot to proxy
traffic to backends configured by passing TS_EXPERIMENTAL_DEST_DNS_NAME env var
to containerboot.
Containerboot will periodically (every 10 minutes) attempt to resolve
the DNS name and ensure that all traffic sent to the node's
tailnet IP gets forwarded to the resolved backend IP addresses.

Currently:
- if the firewall mode is iptables, traffic will be load balanced
accross the backend IP addresses using round robin. There are
no health checks for whether the IPs are reachable.
- if the firewall mode is nftables traffic will only be forwarded
to the first IP address in the list. This is to be improved.

* cmd/k8s-operator: support ExternalName Services

 Adds support for exposing endpoints, accessible from within
a cluster to the tailnet via DNS names using ExternalName Services.
This can be done by annotating the ExternalName Service with
tailscale.com/expose: "true" annotation.
The operator will deploy a proxy configured to route tailnet
traffic to the backend IPs that service.spec.externalName
resolves to. The backend IPs must be reachable from the operator's
namespace.

Updates tailscale/tailscale#10606

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
6 months ago
Nick Khyl 9e1c86901b wgengine\router: fix the Tailscale-In firewall rule to work on domain networks
The Network Location Awareness service identifies networks authenticated against
an Active Directory domain and categorizes them as "Domain Authenticated".
This includes the Tailscale network if a Domain Controller is reachable through it.

If a network is categories as NLM_NETWORK_CATEGORY_DOMAIN_AUTHENTICATED,
it is not possible to override its category, and we shouldn't attempt to do so.
Additionally, our Windows Firewall rules should be compatible with both private
and domain networks.

This fixes both issues.

Fixes #11813

Signed-off-by: Nick Khyl <nickk@tailscale.com>
7 months ago
Brad Fitzpatrick 03d5d1f0f9 wgengine/magicsock: disable portmapper in tunchan-faked tests
Most of the magicsock tests fake the network, simulating packets going
out and coming in. There's no reason to actually hit your router to do
UPnP/NAT-PMP/PCP during in tests. But while debugging thousands of
iterations of tests to deflake some things, I saw it slamming my
router. This stops that.

Updates #11762

Change-Id: I59b9f48f8f5aff1fa16b4935753d786342e87744
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick 7c1d6e35a5 all: use Go 1.22 range-over-int
Updates #11058

Change-Id: I35e7ef9b90e83cac04ca93fd964ad00ed5b48430
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick 0fba9e7570 cmd/tailscale/cli: prevent concurrent Start calls in 'up'
Seems to deflake tstest/integration tests. I can't reproduce it
anymore on one of my VMs that was consistently flaking after a dozen
runs before. Now I can run hundreds of times.

Updates #11649
Fixes #7036

Change-Id: I2f7d4ae97500d507bdd78af9e92cd1242e8e44b8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick dd6c76ea24 ipn: remove unused Options.LegacyMigrationPrefs
I'm on a mission to simplify LocalBackend.Start and its locking
and deflake some tests.

I noticed this hasn't been used since March 2023 when it was removed
from the Windows client in corp 66be796d33c.

So, delete.

Updates #11649

Change-Id: I40f2cb75fb3f43baf23558007655f65a8ec5e1b2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Claire Wang 9171b217ba
cmd/tailscale, ipn/ipnlocal: add suggest exit node CLI option (#11407)
Updates tailscale/corp#17516

Signed-off-by: Claire Wang <claire@tailscale.com>
7 months ago
Charlotte Brandhorst-Satzkorn 449f46c207
wgengine/magicsock: rebind/restun if a syscall.EPERM error is returned (#11711)
We have seen in macOS client logs that the "operation not permitted", a
syscall.EPERM error, is being returned when traffic is attempted to be
sent. This may be caused by security software on the client.

This change will perform a rebind and restun if we receive a
syscall.EPERM error on clients running darwin. Rebinds will only be
called if we haven't performed one specifically for an EPERM error in
the past 5 seconds.

Updates #11710

Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
7 months ago
Brad Fitzpatrick 952e06aa46 wgengine/router: don't attempt route cleanup on Synology
Trying to run iptables/nftables on Synology pauses for minutes with
lots of errors and ultimately does nothing as it's not used and we
lack permissions.

This fixes a regression from db760d0bac (#11601) that landed
between Synology testing on unstable 1.63.110 and 1.64.0 being cut.

Fixes #11737

Change-Id: Iaf9563363b8e45319a9b6fe94c8d5ffaecc9ccef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
James Tucker a2eb1c22b0 wgengine/magicsock: allow disco communication without known endpoints
Just because we don't have known endpoints for a peer does not mean that
the peer should become unreachable. If we know the peers key, it should
be able to call us, then we can talk back via whatever path it called us
on. First step - don't drop the packet in this context.

Updates tailscale/corp#19106

Signed-off-by: James Tucker <james@tailscale.com>
7 months ago
James Tucker db760d0bac cmd/tailscaled: move cleanup to an implicit action during startup
This removes a potentially increased boot delay for certain boot
topologies where they block on ExecStartPre that may have socket
activation dependencies on other system services (such as
systemd-resolved and NetworkManager).

Also rename cleanup to clean up in affected/immediately nearby places
per code review commentary.

Fixes #11599

Signed-off-by: James Tucker <james@tailscale.com>
7 months ago
Maisem Ali 3f4c5daa15 wgengine/netstack: remove SubnetRouterWrapper
It was used when we only supported subnet routers on linux
and would nil out the SubnetRoutes slice as no other router
worked with it, but now we support subnet routers on ~all platforms.

The field it was setting to nil is now only used for network logging
and nowhere else, so keep the field but drop the SubnetRouterWrapper
as it's not useful.

Updates #cleanup

Change-Id: Id03f9b6ec33e47ad643e7b66e07911945f25db79
Signed-off-by: Maisem Ali <maisem@tailscale.com>
7 months ago
James Tucker 6e334e64a1 net/netcheck,wgengine/magicsock: align DERP frame receive time heuristics
The netcheck package and the magicksock package coordinate via the
health package, but both sides have time based heuristics through
indirect dependencies. These were misaligned, so the implemented
heuristic aimed at reducing DERP moves while there is active traffic
were non-operational about 3/5ths of the time.

It is problematic to setup a good test for this integration presently,
so instead I added comment breadcrumbs along with the initial fix.

Updates #8603

Signed-off-by: James Tucker <james@tailscale.com>
7 months ago
Joonas Kuorilehto fe0cfec4ad wgengine/router: enable ip forwarding on gokrazy
Only on Gokrazy, set sysctls to enable IP forwarding so subnet routing
and advertised exit node works.

Fixes #11405

Signed-off-by: Joonas Kuorilehto <joneskoo@derbian.fi>
7 months ago
Percy Wegmann 853e3e29a0 wgengine/router: provide explicit hook to signal Android when VPN needs to be reconfigured
This allows clients to avoid establishing their VPN multiple times when
both routes and DNS are changing in rapid succession.

Updates tailscale/corp#18928

Signed-off-by: Percy Wegmann <percy@tailscale.com>
7 months ago
Charlotte Brandhorst-Satzkorn 93618a3518
tailscale: update tailfs functions and vars to use drive naming (#11597)
This change updates all tailfs functions and the majority of the tailfs
variables to use the new drive naming.

Updates tailscale/corp#16827

Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
7 months ago
Charlotte Brandhorst-Satzkorn 14683371ee
tailscale: update tailfs file and package names (#11590)
This change updates the tailfs file and package names to their new
naming convention.

Updates #tailscale/corp#16827

Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
7 months ago
Irbe Krumina 5fb721d4ad
util/linuxfw,wgengine/router: skip IPv6 firewall configuration in partial iptables mode (#11546)
We have hosts that support IPv6, but not IPv6 firewall configuration
in iptables mode.
We also have hosts that have some support for IPv6 firewall
configuration in iptables mode, but do not have iptables filter table.
We should:
- configure ip rules for all hosts that support IPv6
- only configure firewall rules in iptables mode if the host
has iptables filter table.

Updates tailscale/tailscale#11540

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
7 months ago
Brad Fitzpatrick a36cfb4d3d tailcfg, ipn/ipnlocal, wgengine/magicsock: add only-tcp-443 node attr
Updates tailscale/corp#17879

Change-Id: I0dc305d147b76c409cf729b599a94fa723aef0e0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
James Tucker 3f7313dbdb util/linuxfw,wgengine/router: enable IPv6 configuration when netfilter is disabled
Updates #11434

Signed-off-by: James Tucker <james@tailscale.com>
7 months ago
Joe Tsai 85febda86d
all: use zstdframe where sensible (#11491)
Use the zstdframe package where sensible instead of plumbing
around our own zstd.Encoder just for stateless operations.

This causes logtail to have a dependency on zstd,
but that's arguably okay since zstd support is implicit
to the protocol between a client and the logging service.
Also, virtually every caller to logger.NewLogger was
manually setting up a zstd.Encoder anyways,
meaning that zstd was functionally always a dependency.

Updates #cleanup
Updates tailscale/corp#18514

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
7 months ago
Brad Fitzpatrick 5d1c72f76b wgengine/magicsock: don't use endpoint debug ringbuffer on mobile.
Save some memory.

Updates tailscale/corp#18514

Change-Id: Ibcaf3c6d8e5cc275c81f04141d0f176e2249509b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Andrew Dunham 6da1dc84de wgengine: fix logger data race in tests
Observed in:
    https://github.com/tailscale/tailscale/actions/runs/8350904950/job/22858266932?pr=11463

Updates #11226

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I9b57db4b34b6ad91d240cd9fa7e344fc0376d52d
8 months ago
Andrew Dunham 7429e8912a wgengine/netstack: fix bug with duplicate SYN packets in client limit
This fixes a bug that was introduced in #11258 where the handling of the
per-client limit didn't properly account for the fact that the gVisor
TCP forwarder will return 'true' to indicate that it's handled a
duplicate SYN packet, but not launch the handler goroutine.

In such a case, we neither decremented our per-client limit in the
wrapper function, nor did we do so in the handler function, leading to
our per-client limit table slowly filling up without bound.

Fix this by doing the same duplicate-tracking logic that the TCP
forwarder does so we can detect such cases and appropriately decrement
our in-flight counter.

Updates tailscale/corp#12184

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib6011a71d382a10d68c0802593f34b8153d06892
8 months ago
Andrew Dunham f072d017bd wgengine/magicsock: don't change DERP home when not connected to control
This pretty much always results in an outage because peers won't
discover our new home region and thus won't be able to establish
connectivity.

Updates tailscale/corp#18095

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ic0d09133f198b528dd40c6383b16d7663d9d37a7
8 months ago
Andrew Dunham 62cf83eb92 go.mod: bump gvisor
The `stack.PacketBufferPtr` type no longer exists; replace it with
`*stack.PacketBuffer` instead.

Updates #8043

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib56ceff09166a042aa3d9b80f50b2aa2d34b3683
8 months ago
Andrew Dunham 4338db28f7 wgengine/magicsock: prefer link-local addresses to private ones
Since link-local addresses are definitionally more likely to be a direct
(lower-latency, more reliable) connection than a non-link-local private
address, give those a bit of a boost when selecting endpoints.

Updates #8097

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I93fdeb07de55ba39ba5fcee0834b579ca05c2a4e
8 months ago
Brad Fitzpatrick f18f591bc6 wgengine: plumb the PeerByKey from wgengine to magicsock
This was just added in 69f4b459 which doesn't yet use it. This still
doesn't yet use it. It just pushes it down deeper into magicsock where
it'll used later.

Updates #7617

Change-Id: If2f8fd380af150ffc763489e1ff4f8ca2899fac6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Percy Wegmann 2d5d6f5403 ipn,wgengine: only intercept TailFS traffic on quad 100
This fixes a regression introduced with 993acf4 and released in
v1.60.0.

The regression caused us to intercept all userspace traffic to port
8080 which prevented users from exposing their own services to their
tailnet at port 8080.

Now, we only intercept traffic to port 8080 if it's bound for
100.100.100.100 or fd7a:115c:a1e0::53.

Fixes #11283

Signed-off-by: Percy Wegmann <percy@tailscale.com>
(cherry picked from commit 17cd0626f3)
8 months ago
Brad Fitzpatrick 69f4b4595a wgengine{,/wgint}: add wgint.Peer wrapper type, add to wgengine.Engine
This adds a method to wgengine.Engine and plumbed down into magicsock
to add a way to get a type-safe Tailscale-safe wrapper around a
wireguard-go device.Peer that only exposes methods that are safe for
Tailscale to use internally.

It also removes HandshakeAttempts from PeerStatusLite that was just
added as it wasn't needed yet and is now accessible ala cart as needed
from the Peer type accessor.

None of this is used yet.

Updates #7617

Change-Id: I07be0c4e6679883e6eeddf8dbed7394c9e79c5f4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Brad Fitzpatrick b4ff9a578f wgengine: rename local variable from 'found' to conventional 'ok'
Updates #cleanup

Change-Id: I799dc86ea9e4a3a949592abdd8e74282e7e5d086
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Brad Fitzpatrick a8a525282c wgengine: use slices.Clone in two places
Updates #cleanup

Change-Id: I1cb30efb6d09180e82b807d6146f37897ef99307
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Brad Fitzpatrick 74b8985e19 ipn/ipnstate, wgengine: make PeerStatusLite.LastHandshake zero Time means none
... rather than 1970. Code was using IsZero against the 1970 team
(which isn't a zero value), but fortunately not anywhere that seems to
have mattered.

Updates #cleanup

Change-Id: I708a3f2a9398aaaedc9503678b4a8a311e0e019e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Andrew Dunham 3dd8ae2f26 net/tstun: fix spelling of "WireGuard"
Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ida7e30f4689bc18f5f7502f53a0adb5ac3c7981a
8 months ago
Andrew Dunham c5abbcd4b4 wgengine/netstack: add a per-client limit for in-flight TCP forwards
This is a fun one. Right now, when a client is connecting through a
subnet router, here's roughly what happens:

1. The client initiates a connection to an IP address behind a subnet
   router, and sends a TCP SYN
2. The subnet router gets the SYN packet from netstack, and after
   running through acceptTCP, starts DialContext-ing the destination IP,
   without accepting the connection¹
3. The client retransmits the SYN packet a few times while the dial is
   in progress, until either...
4. The subnet router successfully establishes a connection to the
   destination IP and sends the SYN-ACK back to the client, or...
5. The subnet router times out and sends a RST to the client.
6. If the connection was successful, the client ACKs the SYN-ACK it
   received, and traffic starts flowing

As a result, the notification code in forwardTCP never notices when a
new connection attempt is aborted, and it will wait until either the
connection is established, or until the OS-level connection timeout is
reached and it aborts.

To mitigate this, add a per-client limit on how many in-flight TCP
forwarding connections can be in-progress; after this, clients will see
a similar behaviour to the global limit, where new connection attempts
are aborted instead of waiting. This prevents a single misbehaving
client from blocking all other clients of a subnet router by ensuring
that it doesn't starve the global limiter.

Also, bump the global limit again to a higher value.

¹ We can't accept the connection before establishing a connection to the
remote server since otherwise we'd be opening the connection and then
immediately closing it, which breaks a bunch of stuff; see #5503 for
more details.

Updates tailscale/corp#12184

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I76e7008ddd497303d75d473f534e32309c8a5144
8 months ago
Brad Fitzpatrick 1cf85822d0 ipn/ipnstate, wgengine/wgint: add handshake attempts accessors
Not yet used. This is being made available so magicsock/wgengine can
use it to ignore certain sends (UDP + DERP) later on at least mobile,
letting wireguard-go think it's doing its full attempt schedule, but
we can cut it short conditionally based on what we know from the
control plane.

Updates #7617

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: Ia367cf6bd87b2aeedd3c6f4989528acdb6773ca7
8 months ago
Brad Fitzpatrick eb28818403 wgengine: make pendOpen time later, after dup check
Otherwise on OS retransmits, we'd make redundant timers in Go's timer
heap that upon firing just do nothing (well, grab a mutex and check a
map and see that there's nothing to do).

Updates #cleanup

Change-Id: Id30b8b2d629cf9c7f8133a3f7eca5dc79e81facb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Brad Fitzpatrick 219efebad4 wgengine: reduce critical section
No need to hold wgLock while using the device to LookupPeer;
that has its own mutex already.

Updates #cleanup

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: Ib56049fcc7163cf5a2c2e7e12916f07b4f9d67cb
8 months ago
Nick Khyl 7ef1fb113d cmd/tailscaled, ipn/ipnlocal, wgengine: shutdown tailscaled if wgdevice is closed
Tailscaled becomes inoperative if the Tailscale Tunnel wintun adapter is abruptly removed.
wireguard-go closes the device in case of a read error, but tailscaled keeps running.
This adds detection of a closed WireGuard device, triggering a graceful shutdown of tailscaled.
It is then restarted by the tailscaled watchdog service process.

Fixes #11222

Signed-off-by: Nick Khyl <nickk@tailscale.com>
8 months ago
Anton Tolchanov cd9cf93de6 wgengine/netstack: expose TCP forwarder drops via clientmetrics
- add a clientmetric with a counter of TCP forwarder drops due to the
  max attempts;
- fix varz metric types, as they are all counters.

Updates #8210

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
8 months ago
Brad Fitzpatrick e1bd7488d0 all: remove LenIter, use Go 1.22 range-over-int instead
Updates #11058
Updates golang/go#65685

Change-Id: Ibb216b346e511d486271ab3d84e4546c521e4e22
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Brad Fitzpatrick 6ad6d6b252 wgengine/wglog: add TS_DEBUG_RAW_WGLOG envknob for raw wg logs
Updates #7617 (part of debugging it)

Change-Id: I1bcbdcf0f929e3bcf83f244b1033fd438aa6dac1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
Brad Fitzpatrick 8b9474b06a wgengine/wgcfg: don't send UAPI to disable keep-alives on new peers
That's already the default. Avoid the overhead of writing it on one
side and reading it on the other to do nothing.

Updates #cleanup (noticed while researching something else)

Change-Id: I449c88a022271afb9be5da876bfaf438fe5d3f58
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
8 months ago
James Tucker 131f9094fd wgengine/wglog: quieten WireGuard logs for allowedips
An increasing number of users have very large subnet route
configurations, which can produce very large amounts of log data when
WireGuard is reconfigured. The logs don't contain the actual routes, so
they're largely useless for diagnostics, so we'll just suppress them.

Fixes tailscale/corp#17532

Signed-off-by: James Tucker <james@tailscale.com>
8 months ago
Jason Barnett 4d668416b8 wgengine/router: fix ip rule restoration
Fixes #10857

Signed-off-by: Jason Barnett <J@sonBarnett.com>
9 months ago
Aaron Klotz f7acbefbbb wgengine/router: make the Windows ifconfig implementation reuse existing MibIPforwardRow2 when possible
Looking at profiles, we spend a lot of time in winipcfg.LUID.DeleteRoute
looking up the routing table entry for the provided RouteData.

But we already have the row! We previously obtained that data via the full
table dump we did in getInterfaceRoutes. We can make this a lot faster by
hanging onto a reference to the wipipcfg.MibIPforwardRow2 and executing
the delete operation directly on that.

Fixes #11123

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
9 months ago
Percy Wegmann c42a4e407a tailfs: listen for local clients only on 100.100.100.100
FileSystemForLocal was listening on the node's Tailscale address,
which potentially exposes the user's view of TailFS shares to other
Tailnet users. Remote nodes should connect to exported shares via
the peerapi.

This removes that code so that FileSystemForLocal is only avaialable
on 100.100.100.100:8080.

Updates tailscale/corp#16827

Signed-off-by: Percy Wegmann <percy@tailscale.com>
9 months ago
Percy Wegmann ddcffaef7a tailfs: disable TailFSForLocal via policy
Adds support for node attribute tailfs:access. If this attribute is
not present, Tailscale will not accept connections to the local TailFS
server at 100.100.100.100:8080.

Updates tailscale/corp#16827

Signed-off-by: Percy Wegmann <percy@tailscale.com>
9 months ago
Percy Wegmann abab0d4197 tailfs: clean up naming and package structure
- Restyles tailfs -> tailFS
- Defines interfaces for main TailFS types
- Moves implemenatation of TailFS into tailfsimpl package

Updates tailscale/corp#16827

Signed-off-by: Percy Wegmann <percy@tailscale.com>
9 months ago
Percy Wegmann 993acf4475 tailfs: initial implementation
Add a WebDAV-based folder sharing mechanism that is exposed to local clients at
100.100.100.100:8080 and to remote peers via a new peerapi endpoint at
/v0/tailfs.

Add the ability to manage folder sharing via the new 'share' CLI sub-command.

Updates tailscale/corp#16827

Signed-off-by: Percy Wegmann <percy@tailscale.com>
9 months ago
Joe Tsai 94a4f701c2
all: use reflect.TypeFor now available in Go 1.22 (#11078)
Updates #cleanup

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
9 months ago
Jordan Whited 8b47322acc
wgengine/magicsock: implement probing of UDP path lifetime (#10844)
This commit implements probing of UDP path lifetime on the tail end of
an active direct connection. Probing configuration has two parts -
Cliffs, which are various timeout cliffs of interest, and
CycleCanStartEvery, which limits how often a probing cycle can start,
per-endpoint. Initially a statically defined default configuration will
be used. The default configuration has cliffs of 10s, 30s, and 60s,
with a CycleCanStartEvery of 24h. Probing results are communicated via
clientmetric counters. Probing is off by default, and can be enabled
via control knob. Probing is purely informational and does not yet
drive any magicsock behaviors.

Updates #540

Signed-off-by: Jordan Whited <jordan@tailscale.com>
9 months ago
James Tucker 7e3bcd297e go.mod,wgengine/netstack: bump gvisor
Updates #8043

Signed-off-by: James Tucker <james@tailscale.com>
10 months ago
Claire Wang 213d696db0
magicsock: mute noisy expected peer mtu related error (#10870) 10 months ago
Andrew Dunham 7a0392a8a3 wgengine/netstack: expose gVisor metrics through expvar
When tailscaled is run with "-debug 127.0.0.1:12345", these metrics are
available at:
    http://localhost:12345/debug/metrics

Updates #8210

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I19db6c445ac1f8344df2bc1066a3d9c9030606f8
10 months ago
Andrew Dunham 6540d1f018 wgengine/router: look up absolute path to netsh.exe on Windows
This is in response to logs from a customer that show that we're unable
to run netsh due to the following error:

    router: firewall: adding Tailscale-Process rule to allow UDP for "C:\\Program Files\\Tailscale\\tailscaled.exe" ...
    router: firewall: error adding Tailscale-Process rule: exec: "netsh": cannot run executable found relative to current directory:

There's approximately no reason to ever dynamically look up the path of
a system utility like netsh.exe, so instead let's first look for it
in the System32 directory and only if that fails fall back to the
previous behaviour.

Updates #10804

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I68cfeb4cab091c79ccff3187d35f50359a690573
10 months ago
Jordan Whited b084888e4d
wgengine/magicsock: fix typos in docs (#10729)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
10 months ago
Andrew Lytvynov 2716250ee8
all: cleanup unused code, part 2 (#10670)
And enable U1000 check in staticcheck.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
11 months ago
Andrew Lytvynov 1302bd1181
all: cleanup unused code, part 1 (#10661)
Run `staticcheck` with `U1000` to find unused code. This cleans up about
a half of it. I'll do the other half separately to keep PRs manageable.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
11 months ago
Jordan Whited 685b853763
wgengine/magicsock: fix handling of derp.PeerGoneMessage (#10589)
The switch in Conn.runDerpReader() on the derp.ReceivedMessage type
contained cases other than derp.ReceivedPacket that fell through to
writing to c.derpRecvCh, which should only be reached for
derp.ReceivedPacket. This can result in the last/previous
derp.ReceivedPacket to be re-handled, effectively creating a duplicate
packet. If the last derp.ReceivedPacket happens to be a
disco.CallMeMaybe it may result in a disco ping scan towards the
originating peer on the endpoints contained.

The change in this commit moves the channel write on c.derpRecvCh and
subsequent select awaiting the result into the derp.ReceivedMessage
case, preventing it from being reached from any other case. Explicit
continue statements are also added to non-derp.ReceivedPacket cases
where they were missing, in order to signal intent to the reader.

Fixes #10586

Signed-off-by: Jordan Whited <jordan@tailscale.com>
11 months ago
Andrew Dunham 727acf96a6 net/netcheck: use DERP frames as a signal for home region liveness
This uses the fact that we've received a frame from a given DERP region
within a certain time as a signal that the region is stil present (and
thus can still be a node's PreferredDERP / home region) even if we don't
get a STUN response from that region during a netcheck.

This should help avoid DERP flaps that occur due to losing STUN probes
while still having a valid and active TCP connection to the DERP server.

RELNOTE=Reduce home DERP flapping when there's still an active connection

Updates #8603

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: If7da6312581e1d434d5c0811697319c621e187a0
11 months ago
Naman Sood 97f84200ac
wgengine/router: implement UpdateMagicsockPort for CallbackRouter (#10494)
Updates #9084.

Signed-off-by: Naman Sood <mail@nsood.in>
11 months ago
Naman Sood d46a4eced5
util/linuxfw, wgengine: allow ingress to magicsock UDP port on Linux (#10370)
* util/linuxfw, wgengine: allow ingress to magicsock UDP port on Linux

Updates #9084.

Currently, we have to tell users to manually open UDP ports on Linux when
certain firewalls (like ufw) are enabled. This change automates the process of
adding and updating those firewall rules as magicsock changes what port it
listens on.

Signed-off-by: Naman Sood <mail@nsood.in>
11 months ago
Naman Sood 0a59754eda linuxfw,wgengine/route,ipn: add c2n and nodeattrs to control linux netfilter
Updates tailscale/corp#14029.

Signed-off-by: Naman Sood <mail@nsood.in>
11 months ago
James Tucker 215f657a5e wgengine/router: create netfilter runner in setNetfilterMode
This will enable the runner to be replaced as a configuration side
effect in a later change.

Updates tailscale/corp#14029

Signed-off-by: James Tucker <james@tailscale.com>
11 months ago
Andrew Lytvynov 263e01c47b
wgengine/filter: add protocol-agnostic packet checker (#10446)
For use in ACL tests, we need a way to check whether a packet is allowed
not just with TCP, but any protocol.

Updates #3561

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
11 months ago
Jordan Whited 5e861c3871
wgengine/netstack: disable RACK on Windows (#10402)
Updates #9707

Signed-off-by: Jordan Whited <jordan@tailscale.com>
11 months ago
Jordan Whited 1af7f5b549
wgengine/magicsock: fix typo in Conn.handlePingLocked() (#10365)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
11 months ago
Brad Fitzpatrick 4d196c12d9 health: don't report a warning in DERP homeless mode
Updates #3363
Updates tailscale/corp#396

Change-Id: Ibfb0496821cb58a78399feb88d4206d81e95ca0f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
12 months ago
Brad Fitzpatrick 3bd382f369 wgengine/magicsock: add DERP homeless debug mode for testing
In DERP homeless mode, a DERP home connection is not sought or
maintained and the local node is not reachable.

Updates #3363
Updates tailscale/corp#396

Change-Id: Ibc30488ac2e3cfe4810733b96c2c9f10a51b8331
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
12 months ago
Jordan Whited 2ff54f9d12
wgengine/magicsock: move trustBestAddrUntil forward on non-disco rx (#10274)
This is gated behind the silent disco control knob, which is still in
its infancy. Prior to this change disco pong reception was the only
event that could move trustBestAddrUntil forward, so even though we
weren't heartbeating, we would kick off discovery pings every
trustUDPAddrDuration and mirror to DERP.

Updates #540

Signed-off-by: Jordan Whited <jordan@tailscale.com>
12 months ago
Jordan Whited c99488ea19
wgengine/magicsock: fix typo in endpoint.sendDiscoPing() docs (#10232)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
12 months ago
Jordan Whited e848736927
control/controlknobs,wgengine/magicsock: implement SilentDisco toggle (#10195)
This change exposes SilentDisco as a control knob, and plumbs it down to
magicsock.endpoint. No changes are being made to magicsock.endpoint
disco behavior, yet.

Updates #540

Signed-off-by: Jordan Whited <jordan@tailscale.com>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
12 months ago
Charlotte Brandhorst-Satzkorn 839fee9ef4 wgengine/magicsock: handle wireguard only clean up and log messages
This change updates log messaging when cleaning up wireguard only peers.
This change also stops us unnecessarily attempting to clean up disco
pings for wireguard only endpoints.

Updates #7826

Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
1 year ago
Maisem Ali d0f2c0664b wgengine/netstack: standardize var names in UpdateNetstackIPs
Updates #cleanup

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Maisem Ali eaf8aa63fc wgengine/netstack: remove unnecessary map in UpdateNetstackIPs
Updates #cleanup

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Maisem Ali d601c81c51 wgengine/netstack: use netip.Prefix as map keys
Updates #cleanup

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
James Tucker 6f69fe8ad7 wgnengine: remove unused field in userspaceEngine
Updates #cleanup

Signed-off-by: James Tucker <james@tailscale.com>
1 year ago
Brad Fitzpatrick 514539b611 wgengine/magicsock: close disco listeners on Conn.Close, fix Linux root TestNewConn
TestNewConn now passes as root on Linux. It wasn't closing the BPF
listeners and their goroutines.

The code is still a mess of two Close overlapping code paths, but that
can be refactored later. For now, make the two close paths more similar.

Updates #9945

Change-Id: I8a3cf5fb04d22ba29094243b8e645de293d9ed85
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
James Tucker 7df6f8736a wgengine/netstack: only add addresses to correct protocols
Prior to an earlier netstack bump this code used a string conversion
path to cover multiple cases of behavior seemingly checking for
unspecified addresses, adding unspecified addresses to v6. The behavior
is now crashy in netstack, as it is enforcing address length in various
areas of the API, one in particular being address removal.

As netstack is now protocol specific, we must not create invalid
protocol addresses - an address is v4 or v6, and the address value
contained inside must match. If a control path attempts to do something
otherwise it is now logged and skipped rather than incorrect addressing
being added.

Fixes tailscale/corp#15377

Signed-off-by: James Tucker <james@tailscale.com>
1 year ago
Jordan Whited 891d964bd4
wgengine/magicsock: simplify tryEnableUDPOffload() (#9872)
Don't assume Linux lacks UDP_GRO support if it lacks UDP_SEGMENT
support. This mirrors a similar change in wireguard/wireguard-go@177caa7
for consistency sake. We haven't found any issues here, just being
overly paranoid.

Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago