Commit Graph

1439 Commits (38a1cf748a9920aa2cb49101730bba06db1fd839)

Author SHA1 Message Date
Andrew Dunham 11ce5b7e57 ipn/ipnlocal, wgengine/magicsock: check Expired bool on Node; print error in Ping
Change-Id: Ic5f533f175a6e1bb73d4957d8c3f970add42e82e
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2 years ago
Tom DNetto 2ac5474be1 net/flowtrack,wgengine/filter: refactor Cache to use generics
Signed-off-by: Tom DNetto <tom@tailscale.com>
2 years ago
Tom DNetto 673b3d8dbd net/dns,userspace: remove unused DNS paths, normalize query limit on iOS
With a42a594bb3, iOS uses netstack and
hence there are no longer any platforms which use the legacy MagicDNS path. As such, we remove it.

We also normalize the limit for max in-flight DNS queries on iOS (it was 64, now its 256 as per other platforms).
It was 64 for the sake of being cautious about memory, but now we have 50Mb (iOS-15 and greater) instead of 15Mb
so we have the spare headroom.

Signed-off-by: Tom DNetto <tom@tailscale.com>
2 years ago
Brad Fitzpatrick aad6830df0 util/codegen, all: use latest year, not time.Now, in generated files
Updates #6865

Change-Id: I6b86c646968ebbd4553cf37df5e5612fbf5c5f7d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Claire Wang a45c9f982a wgengine/netstack: change netstack API to require LocalBackend
The macOS client was forgetting to call netstack.Impl.SetLocalBackend.
Change the API so that it can't be started without one, eliminating this
class of bug. Then update all the callers.

Updates #6764

Change-Id: I2b3a4f31fdfd9fdbbbbfe25a42db0c505373562f
Signed-off-by: Claire Wang <claire@tailscale.com>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick caa2fe394f wgengine/netstack: delete some dead code, old comment, use atomic int types
Noticed while looking at something else; #cleanup.

Change-Id: Icde7749363014eab9bebe1dd80708f5491f933d1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham 1011e64ad7 wgengine/monitor: don't log unhandled RTM_{NEW,DEL}LINK messages
These aren't handled, but it's not an error to get one.

Fixes #6806

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1fcb9032ac36420aa72a048bf26f58360b9461f9
2 years ago
Brad Fitzpatrick be10b529ec wgengine/magicsock: add TS_DISCO_PONG_IPV4_DELAY knob to bias IPv6 paths
Fixes #6818

Change-Id: I71597a045c5b4117af69fba869cb616271c0dfe1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
andig 14e8afe444 go.mod, etc: bump gvisor
Fixes #6554

Change-Id: Ia04ae37a47b67fa57091c9bfe1d45a1842589aa8
Signed-off-by: andig <cpuidle@gmx.de>
2 years ago
Brad Fitzpatrick 2eff9c8277 wgengine/magicsock: avoid ReadBatch/WriteBatch on old Linux kernels
Fixes #6807

Change-Id: I161424ef8a7338e1941d5e43d72dc6529993a0e3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 0f604923d3 ipn/ipnlocal: fix StatusWithoutPeers not populating parts of Status
Fixes #4311

Change-Id: Iaae0615148fa7154f4ef8f66b455e3a6c2fa9df3
Co-authored-by: Claire Wang <claire@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Joe Tsai d9df023e6f
net/connstats: enforce maximum number of connections (#6760)
The Tailscale logging service has a hard limit on the maximum
log message size that can be accepted.
We want to ensure that netlog messages never exceed
this limit otherwise a client cannot transmit logs.

Move the goroutine for periodically dumping netlog messages
from wgengine/netlog to net/connstats.
This allows net/connstats to manage when it dumps messages,
either based on time or by size.

Updates tailscale/corp#8427

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Brad Fitzpatrick 44be59c15a wgengine/magicsock: fix panic in wireguard-go rate limiting path
Fixes #6686

Change-Id: I1055a87141b07261afed8e36c963a69f3be26088
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham 0d47cd2284 wgengine/monitor: fix panic due to race on Windows
It's possible for the 'somethingChanged' callback to be registered and
then trigger before the ctx field is assigned; move the assignment
earlier so this can't happen.

Change-Id: Ia7ee8b937299014a083ab40adf31a8b3e0db4ec5
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2 years ago
Jordan Whited ea5ee6f87c
all: update golang.zx2c4.com/wireguard to github.com/tailscale/wireguard-go (#6692)
This is temporary while we work to upstream performance work in
https://github.com/WireGuard/wireguard-go/pull/64. A replace directive
is less ideal as it breaks dependent code without duplication of the
directive.

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2 years ago
Andrew Dunham b63094431b wgengine/router: fix tests on systems with older Busybox 'ip' binary
Adjust the expected system output by removing the unsupported mask
component including and after the slash in expected output like:
  fwmask 0xabc/0xdef

This package's tests now pass in an Alpine container when the 'go' and
'iptables' packages are installed (and run as privileged so /dev/net/tun
exists).

Fixes #5928

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id1a3896282bfa36b64afaec7a47205e63ad88542
2 years ago
Jordan Whited 76389d8baf
net/tstun, wgengine/magicsock: enable vectorized I/O on Linux (#6663)
This commit updates the wireguard-go dependency and implements the
necessary changes to the tun.Device and conn.Bind implementations to
support passing vectors of packets in tailscaled. This significantly
improves throughput performance on Linux.

Updates #414

Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Co-authored-by: James Tucker <james@tailscale.com>
2 years ago
Mihai Parparita bdc45b9066 wgengine/magicsock: fix panic when rebinding fails
We would replace the existing real implementation of nettype.PacketConn
with a blockForeverConn, but that violates the contract of atomic.Value
(where the type cannot change). Fix by switching to a pointer value
(atomic.Pointer[nettype.PacketConn]).

A longstanding issue, but became more prevalent when we started binding
connections to interfaces on macOS and iOS (#6566), which could lead to
the bind call failing if the interface was no longer available.

Fixes #6641

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Joe Tsai 2e5d08ec4f
net/connstats: invert network logging data flow (#6272)
Previously, tstun.Wrapper and magicsock.Conn managed their
own statistics data structure and relied on an external call to
Extract to extract (and reset) the statistics.
This makes it difficult to ensure a maximum size on the statistics
as the caller has no introspection into whether the number
of unique connections is getting too large.

Invert the control flow such that a *connstats.Statistics
is registered with tstun.Wrapper and magicsock.Conn.
Methods on non-nil *connstats.Statistics are called for every packet.
This allows the implementation of connstats.Statistics (in the future)
to better control when it needs to flush to ensure
bounds on maximum sizes.

The value registered into tstun.Wrapper and magicsock.Conn could
be an interface, but that has two performance detriments:

1. Method calls on interface values are more expensive since
they must go through a virtual method dispatch.

2. The implementation would need a sync.Mutex to protect the
statistics value instead of using an atomic.Pointer.

Given that methods on constats.Statistics are called for every packet,
we want reduce the CPU cost on this hot path.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Joe Tsai 35c10373b5
types/logid: move logtail ID types here (#6336)
Many packages reference the logtail ID types,
but unfortunately pull in the transitive dependencies of logtail.
Fix this problem by putting the log ID types in its own package
with minimal dependencies.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Brad Fitzpatrick ea25ef8236 util/set: add new set package for SetHandle type
We use this pattern in a number of places (in this repo and elsewhere)
and I was about to add a fourth to this repo which was crossing the line.
Add this type instead so they're all the same.

Also, we have another Set type (SliceSet, which tracks its keys in
order) in another repo we can move to this package later.

Change-Id: Ibbdcdba5443fae9b6956f63990bdb9e9443cefa9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Aaron Klotz 033bd94d4c cmd/tailscaled, wgengine/router: use wingoes/com for COM initialization instead of go-ole
This patch removes the crappy, half-backed COM initialization used by `go-ole`
and replaces that with the `StartRuntime` function from `wingoes`, a library I
have started which, among other things, initializes COM properly.

In particular, we should always be initializing COM to use the multithreaded
apartment. Every single OS thread in the process becomes implicitly initialized
as part of the MTA, so we do not need to concern ourselves as to whether or not
any particular OS thread has initialized COM. Furthermore, we no longer need to
lock the OS thread when calling methods on COM interfaces.

Single-threaded apartments are designed solely for working with Win32 threads
that have a message pump; any other use of the STA is invalid.

Fixes https://github.com/tailscale/tailscale/issues/3137

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2 years ago
Maisem Ali 18c7c3981a ipn/ipnlocal: call checkPrefs in Start too
We were not calling checkPrefs on `opts.*Prefs` in (*LocalBackend).Start().

Updates #713

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick 3a168cc1ff wgengine/magicsock: ignore pre-disco (pre-0.100) peers
There aren't any in the wild, other than one we ran on purpose to keep
us honest, but we can bump that one forward to 0.100.

Change-Id: I129e70724b2d3f8edf3b496dc01eba3ac5a2a907
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
phirework a011320370
magicsock: cleanup canp2p (#6391)
This renames canP2P in magicsock to canP2PLocked to reflect
expectation of mutex lock, fixes a race we discovered in the meantime,
and updates the current stats.

Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jenny Zhang <jz@tailscale.com>
2 years ago
Brad Fitzpatrick 3f8e185003 health: add Warnable, move ownership of warnable items to callers
The health package was turning into a rando dumping ground. Make a new
Warnable type instead that callers can request an instance of, and
then Set it locally in their code without the health package being
aware of all the things that are warnable. (For plenty of things the
health package will want to know details of how Tailscale works so it
can better prioritize/suppress errors, but lots of the warnings are
pretty leaf-y and unrelated)

This just moves two of the health warnings. Can probably move more
later.

Change-Id: I51e50e46eb633f4e96ced503d3b18a1891de1452
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Maisem Ali d585cbf02a wgengine/router: [bsd/darwin] remove and readd routes on profile change
Noticed when testing FUS on tailscale-on-macOS, that routing would break
completely when switching between profiles. However, it would start working
again when going back to the original profile tailscaled started with.

Turns out that if we change the addrs on the interface we need to remove and readd
all the routes.

Updates #713

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali 4d330bac14 ipn/ipnlocal: add support for multiple user profiles
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick b683921b87 ipn/ipnlocal: add start of handling TCP proxying
Updates tailscale/corp#7515

Change-Id: I82d19b5864674b2169f25ec8e429f60a543e0c57
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 79472a4a6e wgengine/netstack: optimize shouldProcessInbound, avoiding 4via6 lookups
All IPv6 packets for the self address were doing netip.Prefix.Contains
lookups.

If if we know they're for a self address (which we already previously
computed and have sitting in a bool), then they can't be for a 4via6
range.

Change-Id: Iaaaf1248cb3fecec229935a80548ead0eb4cb892
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 2daf0f146c ipn/ipnlocal, wgengine/netstack: start handling ports for future serving
Updates tailscale/corp#7515

Change-Id: I966e936e72a2ee99be8d0f5f16872b48cc150258
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham acf5839dd2 wgengine/netstack: add tests for shouldProcessInbound
Inspired by #6235, let's explicitly test the behaviour of this function
to ensure that we're not processing things we don't expect to.

Change-Id: I158050a63be7410fb99452089ea607aaf89fe91a
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
Brad Fitzpatrick abfdcd0f70 wgengine/netstack: fix shouldProcessInbound peerapi non-SYN handling
It was eating TCP packets to peerapi ports to subnet routers.  Some of
the TCP flow's packets went onward, some got eaten.  So some TCP flows
to subnet routers, if they used an unfortunate TCP port number, got
broken.

Change-Id: Ifea036119ccfb081f4dfa18b892373416a5239f8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick da8def8e13 all: remove old +build tags
The //go:build syntax was introduced in Go 1.17:

https://go.dev/doc/go1.17#build-lines

gofmt has kept the +build and go:build lines in sync since
then, but enough time has passed. Time to remove them.

Done with:

    perl -i -npe 's,^// \+build.*\n,,' $(git grep -l -F '+build')

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 3367136d9e wgengine/netstack: remove old unused handleSSH hook
It's leftover from an earlier Tailscale SSH wiring and I forgot to
delete it apparently.

Change-Id: I14f071f450e272b98d90080a71ce68ba459168d1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Joe Tsai 2327c6b05f
wgengine/netlog: preserve Tailscale addresses for exit traffic (#6165)
Exit node traffic is aggregated to protect the privacy
of those using an exit node. However, it is reasonable to
at least log which nodes are making most use of an exit node.

For a node using an exit node,
the source will be the taiscale IP address of itself,
while the destination will be zeroed out.

For a node that serves as an exit node,
the source will be zeroed out,
while the destination will be tailscale IP address
of the node that initiated the exit traffic.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Joe Tsai 7d6775b082
wgengine: respect --no-logs-no-support flag for network logging (#6172)
In the future this will cause a node to be unable to join the tailnet
if network logging is enabled.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Maisem Ali 42f7ef631e wgengine/netstack: use 72h as the KeepAlive Idle time for Tailscale SSH
Setting TCP KeepAlives for Tailscale SSH connections results in them
unnecessarily disconnecting. However, we can't turn them off completely
as that would mean we start leaking sessions waiting for a peer to come
back which may have gone away forever (e.g. if the node was deleted from
the tailnet during a session).

Updates #5021

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Joe Tsai cfef47ddcc
wgengine: perform router reconfig for netlog-only changes (#6118)
If the network logging configruation changes (and nothing else)
we will tear down the network logger and start it back up.
However, doing so will lose the router configuration state.
Manually reconfigure it with the routing state.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Joe Tsai 48ddb3af2a
wgengine/netlog: enforce hard limit on network log message sizes (#6109)
This is a temporary hack to prevent logtail getting stuck
uploading the same excessive message over and over.
A better solution will be discussed and implemented.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Joe Tsai a3602c28bd
wgengine/netlog: embed the StableNodeID of the authoring node (#6105)
This allows network messages to be annotated with which node it came from.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Joe Tsai 81fd259133
wgengine/magicsock: gather physical-layer statistics (#5925)
There is utility in logging traffic statistics that occurs at the physical layer.
That is, in order to send packets virtually to a particular tailscale IP address,
what physical endpoints did we need to communicate with?

This functionality logs IP addresses identical to
what had always been logged in magicsock prior to #5823,
so there is no increase in PII being logged.

ExtractStatistics returns a mapping of connections to counts.
The source is always a Tailscale IP address (without port),
while the destination is some endpoint reachable on WAN or LAN.
As a special case, traffic routed through DERP will use 127.3.3.40
as the destination address with the port being the DERP region.

This entire feature is only enabled if data-plane audit logging
is enabled on the tailnet (by default it is disabled).

Example of type of information logged:

	------------------------------------  Tx[P/s]    Tx[B/s]  Rx[P/s]   Rx[B/s]
	PhysicalTraffic:                       25.80      3.39Ki   38.80     5.57Ki
	    100.1.2.3 -> 143.11.22.33:41641    15.40      2.00Ki   23.20     3.37Ki
	    100.4.5.6 -> 192.168.0.100:41641   10.20      1.38Ki   15.60     2.20Ki
	    100.7.8.9 -> 127.3.3.40:2           0.20      6.40      0.00     0.00

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Joe Tsai c21a3c4733
types/netlogtype: new package for network logging types (#6092)
The netlog.Message type is useful to depend on from other packages,
but doing so would transitively cause gvisor and other large packages
to be linked in.

Avoid this problem by moving all network logging types to a single package.

We also update staticcheck to take in:

	003d277bcf

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Aaron Klotz a44687e71f wgengine/winnet: invoke some COM methods directly instead of through IDispatch.
Intermittently in the wild we are seeing failures when calling
`INetworkConnection::GetNetwork`. It is unclear what the root cause is, but what
is clear is that the error is happening inside the object's `IDispatch` invoker
(as opposed to the method implementation itself).

This patch replaces our wrapper for `INetworkConnection::GetNetwork` with an
alternate implementation that directly invokes the method, instead of using
`IDispatch`. I also replaced the implementations of `INetwork::SetCategory` and
`INetwork::GetCategory` while I was there.

This patch is speculative and tightly-scoped so that we could possibly add it
to a dot-release if necessary.

Updates https://github.com/tailscale/tailscale/issues/4134
Updates https://github.com/tailscale/tailscale/issues/6037

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2 years ago
Jordan Whited a471681e28
wgengine/netstack: enable TCP SACK (#6066)
TCP selective acknowledgement can improve throughput by an order
of magnitude in the presence of loss.

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2 years ago
Andrew Dunham 74693793be net/netcheck, tailcfg: track whether OS supports IPv6
We had previously added this to the netcheck report in #5087 but never
copied it into the NetInfo struct. Additionally, add it to log lines so
it's visible to support.

Change-Id: Ib6266f7c6aeb2eb2a28922aeafd950fe1bf5627e
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
Maisem Ali 74637f2c15 wgengine/router: [linux] add before deleting interface addrs
Deleting may temporarily result in no addrs on the interface, which results in
all other rules (like routes) to get dropped by the OS.

I verified this fixes the problem.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
phirework d13c9cdfb4
wgengine/magicsock: set up pathfinder (#5994)
Sets up new file for separate silent disco goroutine, tentatively named
pathfinder for now.

Updates #540

Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jenny Zhang <jz@tailscale.com>
2 years ago
Brad Fitzpatrick deac82231c wgengine/magicsock: add start of alternate send path
During development of silent disco (#540), an alternate send policy
for magicsock that doesn't wake up the radio frequently with
heartbeats, we want the old & new policies to coexist, like we did
previously pre- and post-disco.

We started to do that earlier in 5c42990c2f but only set up the
env+control knob plumbing to set a bool about which path should be
used.

This starts to add a way for the silent disco code to update the send
path from a separate goroutine. (Part of the effort is going to
de-state-machinify the event based soup that is the current disco
code and make it more Go synchronous style.)

So far this does nothing. (It does add an atomic load on each send
but that should be noise in the grand scheme of things, and a even more
rare atomic store of nil on node config changes.)

Baby steps.

Updates #540

Co-authored-by: Jenny Zhang <jz@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Joe Tsai 14100c0985
wgengine/magicsock: restore allocation-free endpoint.DstToString (#5971)
The wireguard-go code unfortunately calls this unconditionally
even when verbose logging is disabled.

Partial revert of #5911.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Joe Tsai 9ee3df02ee
wgengine/magicsock: remove endpoint.wgEndpoint (#5911)
This field seems seldom used and the documentation is wrong.
It is simpler to just derive its original value dynamically
when endpoint.DstToString is called.

This method is potentially used by wireguard-go,
but not in any code path is performance sensitive.
All calls to it use it in conjunction with fmt.Printf,
which is going to be slow anyways since it uses Go reflection.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Maisem Ali 3555a49518 net/dns: always attempt to read the OS config on macOS/iOS
Also reconfigure DNS on iOS/macOS on link changes.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
James Tucker 539c073cf0 wgengine/magicsock: set UDP socket buffer sizes to 7MB
- At high data rates more buffer space is required in order to avoid
  packet loss during any cause of delay.
- On slower machines more buffer space is required in order to avoid
  packet loss while decryption & tun writing is underway.
- On higher latency network paths more buffer space is required in order
  to overcome BDP.
- On Linux set with SO_*BUFFORCE to bypass net.core.{r,w}mem_max.
- 7MB is the current default maximum on macOS 12.6
- Windows test is omitted, as Windows does not support getsockopt for
  these options.

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
James Tucker 4ec6d41682 wgengine/router: fix MTU configuration on Windows
Always set the MTU to the Tailscale default MTU. In practice we are
missing applying an MTU for IPv6 on Windows prior to this patch.

This is the simplest patch to fix the problem, the code in here needs
some more refactoring.

Fixes #5914

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Joe Tsai a1a43ed266
wgengine/netlog: add support for magicsock statistics (#5913)
This sets up Logger to handle statistics at the magicsock layer,
where we can correlate traffic between a particular tailscale IP address
and any number of physical endpoints used to contact the node
that hosts that tailscale address.

We also export Message and TupleCounts to better document the JSON format
that is being sent to the logging infrastructure.

This commit does NOT yet enable the actual logging of magicsock statistics.
That will be a future commit.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Joe Tsai f9120eee57
wgengine: start network logger in Userspace.Reconfig (#5908)
If the wgcfg.Config is specified with network logging arguments,
then Userspace.Reconfig starts up an asynchronous network logger,
which is shutdown either upon Userspace.Close or when Userspace.Reconfig
is called again without network logging or route arguments.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Joe Tsai 49bae7fd5c
wgengine: fix typo in Engine.PeerForIP (#5912)
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Joe Tsai 1b4e4cc1e8
wgengine/netlog: new package for traffic flow logging (#5864)
The Logger type managers a logtail.Logger for extracting
statistics from a tstun.Wrapper.
So long as Shutdown is called, it ensures that logtail
and statistic gathering resources are properly cleared up.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Emmanuel T Odeke 680f8d9793 all: fix more resource leaks found by staticmajor
Updates #5706

Signed-off-by: Emmanuel T Odeke <emmanuel@orijtech.com>
2 years ago
Joe Tsai 82f5f438e0
wgengine/wgcfg: plumb down audit log IDs (#5855)
The node and domain audit log IDs are provided in the map response,
but are ultimately going to be used in wgengine since
that's the layer that manages the tstun.Wrapper.

Do the plumbing work to get this field passed down the stack.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Brad Fitzpatrick 1841d0bf98 wgengine/magicsock: make debug-level stuff not logged by default
And add a CLI/localapi and c2n mechanism to enable it for a fixed
amount of time.

Updates #1548

Change-Id: I71674aaf959a9c6761ff33bbf4a417ffd42195a7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham e5636997c5
wgengine: don't re-allocate trimmedNodes map (#5825)
Change-Id: I512945b662ba952c47309d3bf8a1b243e05a4736
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
Mihai Parparita 8343b243e7 all: consistently initialize Logf when creating tsdial.Dialers
Most visible when using tsnet.Server, but could have resulted in dropped
messages in a few other places too.

Fixes #5743

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Josh Soref d4811f11a0 all: fix spelling mistakes
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2 years ago
Andrew Dunham 420d841292
wgengine: log subnet router decision at v1 if we have a BIRD client (#5786)
Updates tailscale/coral#82

Change-Id: I398d75f7e178ff7c531ca09899c82cf974fc30c9
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
Tom DNetto ab591906c8 wgengine/router: Increase range of rule priorities when detecting mwan3
Context: https://github.com/tailscale/tailscale/pull/5588#issuecomment-1260655929

It seems that if the interface at index 1 is down, the rule is not installed. As such,
we increase the range we detect up to 2004 in the hope that at least one of the interfaces
1-4 will be up.

Signed-off-by: Tom DNetto <tom@tailscale.com>
2 years ago
Emmanuel T Odeke f981b1d9da all: fix resource leaks with missing .Close() calls
Fixes #5706

Signed-off-by: Emmanuel T Odeke <emmanuel@orijtech.com>
2 years ago
Kyle Carberry 91794f6498 wgengine/magicsock: move firstDerp check after nil derpMap check
This fixes a race condition which caused `c.muCond.Broadcast()` to
never fire in the `firstDerp` if block. It resulted in `Close()`
hanging forever.

Signed-off-by: Kyle Carberry <kyle@carberry.com>
2 years ago
Andrew Dunham 0607832397
wgengine/netstack: always respond to 4via6 echo requests (#5712)
As the comment in the code says, netstack should always respond to ICMP
echo requests to a 4via6 address, even if the netstack instance isn't
normally processing subnet traffic.

Follow-up to #5709

Change-Id: I504d0776c5824071b2a2e0e687bc33e24f6c4746
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
Andrew Dunham b9b0bf65a0
wgengine/netstack: handle 4via6 packets when pinging (#5709)
Change-Id: Ib6ebbaa11219fb91b550ed7fc6ede61f83262e89
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
Brad Fitzpatrick 832031d54b wgengine/magicsock: fix recently introduced data race
From 5c42990c2f, not yet released in a stable build.
Caught by existing tests.

Fixes #5685

Change-Id: Ia76bb328809d9644e8b96910767facf627830600
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
phirework 5c42990c2f
wgengine/magicsock: add client flag and envknob to disable heartbeat (#5638)
Baby steps towards turning off heartbeat pings entirely as per #540.
This doesn't change any current magicsock functionality and requires additional
changes to send/disco paths before the flag can be turned on.

Updates #540

Change-Id: Idc9a72748e74145b068d67e6dd4a4ffe3932efd0
Signed-off-by: Jenny Zhang <jz@tailscale.com>

Signed-off-by: Jenny Zhang <jz@tailscale.com>
2 years ago
Eng Zer Jun f0347e841f refactor: move from io/ioutil to io and os packages
The io/ioutil package has been deprecated as of Go 1.16 [1]. This commit
replaces the existing io/ioutil functions with their new definitions in
io and os packages.

Reference: https://golang.org/doc/go1.16#ioutil
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2 years ago
Brad Fitzpatrick 74674b110d envknob: support changing envknobs post-init
Updates #5114

Change-Id: Ia423fc7486e1b3f3180a26308278be0086fae49b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 33ee2c058e wgengine: update comments, remove redundant code in forceFullWireguardConfig
Change-Id: I464a0bce36e3a362c7d7ace0e8d2dd77fa825ee2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Tom DNetto f6da2220d3 wgengine: set fwmark masks in netfilter & ip rules
This change masks the bitspace used when setting and querying the fwmark on packets. This allows
tailscaled to play nicer with other networking software on the host, assuming the other networking
software is also using fwmarks & a different mask.

IPTables / mark module has always supported masks, so this is safe on the netfilter front.

However, busybox only gained support for parsing + setting masks in 1.33.0, so we make sure we
arent such a version before we add the "/<mask>" syntax to an ip rule command.

Signed-off-by: Tom DNetto <tom@tailscale.com>
2 years ago
David Anderson 7c49db02a2 wgengine/magicsock: don't use BPF receive when SO_MARK doesn't work.
Fixes #5607

Signed-off-by: David Anderson <danderson@tailscale.com>
2 years ago
Tom DNetto ed2b8b3e1d wgengine/router: reduce routing rule priority for openWRT + mwan3
Fixes #3659

Signed-off-by: Tom DNetto <tom@tailscale.com>
Co-authored-by: Ian Foster <ian@vorsk.com>
2 years ago
Colin Adler 9c8bbc7888 wgengine/magicsock: fix panic in http debug server
Fixes an panic in `(*magicsock.Conn).ServeHTTPDebug` when the
`recentPongs` ring buffer for an endpoint wraps around.

Signed-off-by: Colin Adler <colin1adler@gmail.com>
2 years ago
Andrew Dunham 9240f5c1e2
wgengine/netstack: only accept connection after dialing (#5503)
If we accept a forwarded TCP connection before dialing, we can
erroneously signal to a client that we support IPv6 (or IPv4) without
that actually being possible. Instead, we only complete the client's TCP
handshake after we've dialed the outbound connection; if that fails, we
respond with a RST.

Updates #5425 (maybe fixes!)

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
James Tucker 672c2c8de8 wgengine/magicsock: add filter to ignore disco to old/other ports
Incoming disco packets are now dropped unless they match one of the
current bound ports, or have a zero port*.

The BPF filter passes all packets with a disco header to the raw packet
sockets regardless of destination port (in order to avoid needing to
reconfigure BPF on rebind).

If a BPF enabled node has just rebound, due to restart or rebind, it may
receive and reply to disco ping packets destined for ports other than
those which are presently bound. If the pong is accepted, the pinging
node will now assume that it can send WireGuard traffic to the pinged
port - such traffic will not reach the node as it is not destined for a
bound port.

*The zero port is ignored, if received. This is a speculative defense
and would indicate a problem in the receive path, or the BPF filter.
This condition is allowed to pass as it may enable traffic to flow,
however it will also enable problems with the same symptoms this patch
otherwise fixes.

Fixes #5536

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
James Tucker be140add75 wgengine/magicsock: fix regression in initial bind for js
1f959edeb0 introduced a regression for JS
where the initial bind no longer occurred at all for JS.

The condition is moved deeper in the call tree to avoid proliferation of
higher level conditions.

Updates #5537

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
James Tucker 1f959edeb0 wgengine/magicksock: remove nullability of RebindingUDPConns
Both RebindingUDPConns now always exist. the initial bind (which now
just calls rebind) now ensures that bind is called for both, such that
they both at least contain a blockForeverConn. Calling code no longer
needs to assert their state.

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Brad Fitzpatrick 56f6fe204b go.mod, wgengine/wgint: bump wireguard-go
For b51010ba13

Change-Id: Ibf767dfad98aef7e9f0505d91c0d26f924e046d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
James Tucker 265b008e49 wgengine: fix race on endpoints in getStatus
Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Brad Fitzpatrick e470893ba0 wgengine/magicsock: use mak in another spot
Change-Id: I0a46d6243371ae6d126005a2bd63820cb2d1db6b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham c72caa6672 wgengine/magicsock: use AF_PACKET socket + BPF to read disco messages
This is entirely optional (i.e. failing in this code is non-fatal) and
only enabled on Linux for now. Additionally, this new behaviour can be
disabled by setting the TS_DEBUG_DISABLE_AF_PACKET environment variable.

Updates #3824
Replaces #5474

Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: David Anderson <danderson@tailscale.com>
2 years ago
Brad Fitzpatrick 9bd9f37d29 go.mod: bump wireguard/windows, which moves to using net/netip
Updates #5162

Change-Id: If99a3f0000bce0c01bdf44da1d513f236fd7cdf8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham e945d87d76
util/uniq: use generics instead of reflect (#5491)
This takes 75% less time per operation per some benchmarks on my mac.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2 years ago
James Tucker 90dc0e1702 wgengine: remove unused singleflight group
Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Andrew Dunham d6c3588ed3
wgengine/wgcfg: only write peer headers if necessary (#5449)
On sufficiently large tailnets, even writing the peer header (~95 bytes)
can result in a large amount of data that needs to be serialized and
deserialized. Only write headers for peers that need to have their
configuration changed.

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
James Tucker 81dba3738e wgengine: remove all peer status from open timeout diagnostics
Avoid contention from fetching status for all peers, and instead fetch
status for a single peer.

Updates tailscale/coral#72
Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
James Tucker ad1cc6cff9 wgengine: use Go API rather than UAPI for status
Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Brad Fitzpatrick 08b3f5f070 wgengine/wgint: add shady temporary package to get at wireguard internals
For #5451

Change-Id: I43482289e323ba9142a446d551ab7a94a467c43a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham 9b77ac128a
wgengine: print in-flight operations on watchdog trigger (#5447)
In addition to printing goroutine stacks, explicitly track all in-flight
operations and print them when the watchdog triggers (along with the
time they were started at). This should make debugging watchdog failures
easier, since we can look at the longest-running operation(s) first.

Signed-off-by: Andrew Dunham <andrew@tailscale.com>

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
Andrew Dunham e8f09d24c7
wgengine: use a singleflight.Group to reduce status contention (#5450)
Updates tailscale/coral#72

Signed-off-by: Andrew Dunham <andrew@tailscale.com>

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
Kris Brandow 5d559141d5 wgengine/magicsock: remove mention of Start
The Start method was removed in 4c27e2fa22, but the comment on NewConn
still mentioned it doesn't do anything until this method is called.

Signed-off-by: Kris Brandow <kris.brandow@gmail.com>
2 years ago
Joe Tsai 32a1a3d1c0
util/deephash: avoid variadic argument for Update (#5372)
Hashing []any is slow since hashing of interfaces is slow.
Hashing of interfaces is slow since we pessimistically assume
that cycles can occur through them and start cycle tracking.

Drop the variadic signature of Update and fix callers to pass in
an anonymous struct so that we are hashing concrete types
near the root of the value tree.

Signed-off-by: Joe Tsai <joetsai@digital-static.net>

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Andrew Dunham f0d6f173c9
net/netcheck: try ICMP if UDP is blocked (#5056)
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2 years ago
Maisem Ali a9f6cd41fd all: use syncs.AtomicValue
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick 4950fe60bd syncs, all: move to using Go's new atomic types instead of ours
Fixes #5185

Change-Id: I850dd532559af78c3895e2924f8237ccc328449d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Maisem Ali 9bb5a038e5 all: use atomic.Pointer
Also add some missing docs.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick 5381437664 logtail, net/portmapper, wgengine/magicsock: use fmt.Appendf
Fixes #5206

Change-Id: I490bb92e774ce7c044040537e2cd864fcf1dbe5a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 5f6abcfa6f all: migrate code from netaddr.FromStdAddr to Go 1.18
With caveat https://github.com/golang/go/issues/53607#issuecomment-1203466984
that then requires a new wrapper. But a simpler one at least.

Updates #5162

Change-Id: I0a5265065bfcd7f21e8dd65b2bd74cae90d76090
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 8725b14056 all: migrate more code code to net/netip directly
Instead of going through the tailscale.com/net/netaddr transitional
wrappers.

Updates #5162

Change-Id: I3dafd1c2effa1a6caa9b7151ecf6edd1a3fda3dd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick fb82299f5a wgengine/magicsock: avoid RebindingUDPConn mutex in common read/write case
Change-Id: I209fac567326f2e926bace2582dbc67a8bc94c78
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 116f55ff66 all: gofmt for Go 1.19
Updates #5210

Change-Id: Ib02cd5e43d0a8db60c1f09755a8ac7b140b670be
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick a12aad6b47 all: convert more code to use net/netip directly
perl -i -npe 's,netaddr.IPPrefixFrom,netip.PrefixFrom,' $(git grep -l -F netaddr.)
    perl -i -npe 's,netaddr.IPPortFrom,netip.AddrPortFrom,' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IPPrefix,netip.Prefix,g' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IPPort,netip.AddrPort,g' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IP\b,netip.Addr,g' $(git grep -l -F netaddr. )
    perl -i -npe 's,netaddr.IPv6Raw\b,netip.AddrFrom16,g' $(git grep -l -F netaddr. )
    goimports -w .

Then delete some stuff from the net/netaddr shim package which is no
longer neeed.

Updates #5162

Change-Id: Ia7a86893fe21c7e3ee1ec823e8aba288d4566cd8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 6a396731eb all: use various net/netip parse funcs directly
Mechanical change with perl+goimports.

Changed {Must,}Parse{IP,IPPrefix,IPPort} to their netip variants, then
goimports -d .

Finally, removed the net/netaddr wrappers, to prevent future use.

Updates #5162

Change-Id: I59c0e38b5fbca5a935d701645789cddf3d7863ad
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 7eaf5e509f net/netaddr: start migrating to net/netip via new netaddr adapter package
Updates #5162

Change-Id: Id7bdec303b25471f69d542f8ce43805328d56c12
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Maisem Ali 9514ed33d2 go.mod: bump gvisor.dev/gvisor
Pick up https://github.com/google/gvisor/pull/7787

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick d8cb5aae17 tailcfg, control/controlclient: add tailcfg.PeersChangedPatch [capver 33]
This adds a lighter mechanism for endpoint updates from control.

Change-Id: If169c26becb76d683e9877dc48cfb35f90cc5f24
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 469c30c33b ipn/localapi: define a cert dir for Synology DSM6
Fixes #4060

Change-Id: I5f145d4f56f6edb14825268e858d419c55918673
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Mihai Parparita 06aa141632 wgengine/router: avoid unncessary routing configuration changes
The iOS and macOS networking extension API only exposes a single setter
for the entire routing and DNS configuration, and does not appear to
do any kind of diffing or deltas when applying changes. This results
in spurious "network changed" errors in Chrome, even when the
`OneCGNATRoute` flag from df9ce972c7 is
used (because we're setting the same configuration repeatedly).

Since we already keep track of the current routing and DNS configuration
in CallbackRouter, use that to detect if they're actually changing, and
only invoke the platform setter if it's actually necessary.

Updates #3102

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
kylecarbs 9280d39678 wgengine/netstack: close ipstack when netstack.Impl is closed
Fixes netstack.Impl leaking goroutines after shutdown.

Signed-off-by: kylecarbs <kyle@carberry.com>
2 years ago
James Tucker 76256d22d8 wgengine/router: windows: set SkipAsSource on IPv6 LL addresses
Link-local addresses on the Tailscale interface are not routable.
Ideally they would be removed, however, a concern exists that the
operating system will attempt to re-add them which would lead to
thrashing.

Setting SkipAsSource attempts to avoid production of packets using the
address as a source in any default behaviors.

Before, in powershell: `ping (hostname)` would ping the link-local
address of the Tailscale interface, and fail.
After: `ping (hostname)` now pings the link-local address on the next
highest priority metric local interface.

Fixes #4647
Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Mihai Parparita c41837842b wasm: drop pprof dependency
We can use the browser tools to profile, pprof adds 200K to the binary size.

Updates #3157

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
3 years ago
Mihai Parparita 27a1ad6a70
wasm: exclude code that's not used on iOS for Wasm too
It has similar size constraints. Saves ~1.9MB from the Wasm build.

Updates #3157

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
3 years ago
Brad Fitzpatrick 69b535c01f wgengine/netstack: replace a 1500 with a const + doc
Per post-submit code review feedback of 1336fb740b from @maisem.

Change-Id: Ic5c16306cbdee1029518448642304981f77ea1fd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 1336fb740b wgengine/netstack: make netstack MTU be 1280 also
Updates #3878

Change-Id: I1850085b32c8a40d85607b4ad433622c97d96a8d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Tom 2903d42921
wgengine/router: delete hardcoded link-local address on Windows (#4740)
Fixes #4647

It seems that Windows creates a link-local address for the TUN driver, seemingly
based on the (fixed) adapter GUID. This results in a fixed MAC address, which
for some reason doesn't handle loopback correctly. Given the derived link-local
address is preferred for lookups (thanks LLMNR), traffic which addresses the
current node by hostname uses this broken address and never works.

To address this, we remove the broken link-local address from the wintun adapter.

Signed-off-by: Tom DNetto <tom@tailscale.com>
3 years ago
Tom fc5839864b
wgengine/netstack: handle multiple magicDNS queries per UDP socket (#4708)
Fixes: #4686

Signed-off-by: Tom DNetto <tom@tailscale.com>
3 years ago
Tom 9343967317
wgengine/filter: preallocate some hot slices in MatchesFromFilterRules (#4672)
Profiling identified this as a fairly hot path for growing a slice.

Given this is only used in control & when a new packet filter is received, this shouldnt be hot in the client.
3 years ago
Mihai Parparita 561f7be434 wgengine/magicsock: remove unused metric
We don't increment the metricRecvData anywhere, just the per-protocol
ones.

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
3 years ago
Mihai Parparita 86069874c9 net/tstun, wgengine: use correct type for counter metrics
We were marking them as gauges, but they are only ever incremented,
thus counter is more appropriate.

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
3 years ago
Maisem Ali fd99c54e10 tailcfg,all: change structs to []*dnstype.Resolver
Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Maisem Ali e409e59a54 cmd/cloner,util/codegen: refactor cloner internals to allow reuse
Also run go generate again for Copyright updates.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Brad Fitzpatrick 35111061e9 wgengine/netstack, ipn/ipnlocal: serve http://100.100.100.100/
For future stuff.

Change-Id: I64615b8b2ab50b57e4eef1ca66fa72e3458cb4a9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Tom d1d6ab068e
net/dns, wgengine: implement DNS over TCP (#4598)
* net/dns, wgengine: implement DNS over TCP

Signed-off-by: Tom DNetto <tom@tailscale.com>

* wgengine/netstack: intercept only relevant port/protocols to quad-100

Signed-off-by: Tom DNetto <tom@tailscale.com>
3 years ago
James Tucker f9e86e64b7 *: use WireGuard where logged, printed or named
Signed-off-by: James Tucker <james@tailscale.com>
3 years ago
James Tucker ae483d3446 wgengine, net/packet, cmd/tailscale: add ICMP echo
Updates tailscale/corp#754

Signed-off-by: James Tucker <james@tailscale.com>
3 years ago
Tom DNetto 2a0b5c21d2 net/dns/{., resolver}, wgengine: fix goroutine leak on shutdown
Signed-off-by: Tom DNetto <tom@tailscale.com>
3 years ago
Tom DNetto 7f45734663 assorted: documentation and readability fixes
This were intended to be pushed to #4408, but in my excitement I
forgot to git push :/ better late than never.

Signed-off-by: Tom DNetto <tom@tailscale.com>
3 years ago
Tom DNetto 9e77660931 net/tstun,wgengine/{.,netstack}: handle UDP magicDNS traffic in netstack
This change wires netstack with a hook for traffic coming from the host
into the tun, allowing interception and handling of traffic to quad-100.

With this hook wired, magicDNS queries over UDP are now handled within
netstack. The existing logic in wgengine to handle magicDNS remains for now,
but its hook operates after the netstack hook so the netstack implementation
takes precedence. This is done in case we need to support platforms with
netstack longer than expected.

Signed-off-by: Tom DNetto <tom@tailscale.com>
3 years ago
Tom DNetto dc71d3559f net/tstun,wgengine: split PreFilterOut into multiple hooks
A subsequent commit implements handling of magicDNS traffic via netstack.
Implementing this requires a hook for traffic originating from the host and
hitting the tun, so we make another hook to support this.

Signed-off-by: Tom DNetto <tom@tailscale.com>
3 years ago
Tom DNetto 9dee6adfab cmd/tailscaled,ipn/ipnlocal,wgengine/...: pass dns.Manager into netstack
Needed for a following commit which moves magicDNS handling into
netstack.

Signed-off-by: Tom DNetto <tom@tailscale.com>
3 years ago
Brad Fitzpatrick 6bed781259 all: gofmt all
Well, goimports actually (which adds the normal import grouping order we do)

Change-Id: I0ce1b1c03185f3741aad67c14a7ec91a838de389
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
James Tucker 1aa75b1c9e wgengine/netstack: always set TCP keepalive
Setting keepalive ensures that idle connections will eventually be
closed. In userspace mode, any application configured TCP keepalive is
effectively swallowed by the host kernel, and is not easy to detect.
Failure to close connections when a peer tailscaled goes offline or
restarts may result in an otherwise indefinite connection for any
protocol endpoint that does not initiate new traffic.

This patch does not take any new opinion on a sensible default for the
keepalive timers, though as noted in the TODO, doing so likely deserves
further consideration.

Update #4522

Signed-off-by: James Tucker <james@tailscale.com>
3 years ago
Maisem Ali 80ba161c40 wgengine/monitor: do not ignore changes to pdp_ip*
One current theory (among other things) on battery consumption is that
magicsock is resorting to using the IPv6 over LTE even on WiFi.
One thing that could explain this is that we do not get link change updates
for the LTE modem as we ignore them in this list.
This commit makes us not ignore changes to `pdp_ip` as a test.

Updates #3363

Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Maisem Ali 2265587d38 wgengine/{,magicsock}: add metrics for rebinds and restuns
Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Brad Fitzpatrick 910ae68e0b util/mak: move tailssh's mapSet into a new package for reuse elsewhere
Change-Id: Idfe95db82275fd2be6ca88f245830731a0d5aecf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 53588f632d Revert "wgengine/router,util/kmod: load & log xt_mark"
This reverts commit 8d6793fd70.

Reason: breaks Android build (cgo/pthreads addition)

We can try again next cycle.

Change-Id: I5e7e1730a8bf399a8acfce546a6d22e11fb835d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
James Tucker 8d6793fd70 wgengine/router,util/kmod: load & log xt_mark
Attempt to load the xt_mark kernel module when it is not present. If the
load fails, log error information.

It may be tempting to promote this failure to an error once it has been
in use for some time, so as to avoid reaching an error with the iptables
invocation, however, there are conditions under which the two stages may
disagree - this change adds more useful breadcrumbs.

Example new output from tailscaled running under my WSL2:

```
router: ensure module xt_mark: "/usr/sbin/modprobe xt_mark" failed: exit status 1; modprobe: FATAL: Module xt_mark not found in directory /lib/modules/5.10.43.3-microsoft-standard-WSL2
```

Background:

There are two places to lookup modules, one is `/proc/modules` "old",
the other is `/sys/module/` "new".

There was query_modules(2) in linux <2.6, alas, it is gone.

In a docker container in the default configuration, you would get
/proc/modules and /sys/module/ both populated. lsmod may work file,
modprobe will fail with EPERM at `finit_module()` for an unpriviliged
container.

In a priviliged container the load may *succeed*, if some conditions are
met. This condition should be avoided, but the code landing in this
change does not attempt to avoid this scenario as it is both difficult
to detect, and has a very uncertain impact.

In an nspawn container `/proc/modules` is populated, but `/sys/module`
does not exist. Modern `lsmod` versions will fail to gather most module
information, without sysfs being populated with module information.

In WSL2 modules are likely missing, as the in-use kernel typically is
not provided by the distribution filesystem, and WSL does not mount in a
module filesystem of its own. Notably the WSL2 kernel supports iptables
marks without listing the xt_mark module in /sys/module, and
/proc/modules is empty.

On a recent kernel, we can ask the capabilities system about SYS_MODULE,
that will help to disambiguate between the non-privileged container case
and just being root. On older kernels these calls may fail.

Update #4329

Signed-off-by: James Tucker <james@tailscale.com>
3 years ago
Maisem Ali 136f30fc92 wgengine/monitor: split the unexpected stringification log line
It unfortuantely gets truncated because it's too long, split it into 3
different log lines to circumvent truncation.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Maisem Ali 8e40bfc6ea wgengine/monitor: ignore OS-specific uninteresting interfaces
Currently we ignore these interfaces in the darwin osMon but then would consider it
interesting when checking if anything had changed.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Brad Fitzpatrick 0ce67ccda6 wgengine/router: make supportsV6NAT check catch more cases
Updates #4459

Change-Id: Ic27621569d2739298e652769d10e38608c6012be
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Maisem Ali 3ffd88a84a wgengine/monitor: do not set timeJumped on iOS/Android
In `(*Mon).Start` we don't run a timer to update `(*Mon).lastWall` on iOS and
Android as their sleep patterns are bespoke. However, in the debounce
goroutine we would notice that the the wall clock hadn't been updated
since the last event would assume that a time jump had occurred. This would
result in non-events being considered as major-change events.

This commit makes it so that `(*Mon).timeJumped` is never set to `true`
on iOS and Android.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Brad Fitzpatrick 16f3520089 all: add arbitrary capability support
Updates #4217

RELNOTE=start of WhoIsResponse capability support

Change-Id: I6522998a911fe49e2f003077dad6164c017eed9b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 8ee044ea4a ssh/tailssh: make the SSH server a singleton, register with LocalBackend
Remove the weird netstack -> tailssh dependency and instead have tailssh
register itself with ipnlocal when linked.

This makes tailssh.server a singleton, so we can have a global map of
all sessions.

Updates #3802

Change-Id: Iad5caec3a26a33011796878ab66b8e7b49339f29
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick da14e024a8 tailcfg, ssh/tailssh: optionally support SSH public keys in wire policy
And clean up logging.

Updates #3802

Change-Id: I756dc2d579a16757537142283d791f1d0319f4f0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 3ae701f0eb net/tsaddr, wgengine/netstack: add IPv6 range that forwards to site-relative IPv4
This defines a new magic IPv6 prefix, fd7a:115c:a1e0:b1a::/64, a
subset of our existing /48, where the final 32 bits are an IPv4
address, and the middle 32 bits are a user-chosen "site ID". (which
must currently be 0000:00xx; the top 3 bytes must be zero for now)

e.g., I can say my home LAN's "site ID" is "0000:00bb" and then
advertise its 10.2.0.0/16 IPv4 range via IPv6, like:

    tailscale up --advertise-routes=fd7a:115c:a1e0:b1a::bb:10.2.0.0/112

(112 being /128 minuse the /96 v6 prefix length)

Then people in my tailnet can:

     $ curl '[fd7a:115c:a1e0:b1a::bb:10.2.0.230]'
     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" ....

Updates #3616, etc

RELNOTE=initial support for TS IPv6 addresses to route v4 "via" specific nodes

Change-Id: I9b49b6ad10410a24b5866b9fbc69d3cae1f600ef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
James Tucker f4aad61e67 wgengine/monitor: ignore duplicate RTM_NEWADDRs
Ignoring the events at this layer is the simpler path for right now, a
broader change should follow to suppress irrelevant change events in a
higher layer so as to avoid related problems with other monitoring paths
on other platforms.  This approach may also carry a small risk that it
applies an at-most-once invariant low in the chain that could be assumed
otherwise higher in the code.

I adjusted the newAddrMessage type to include interface index rather
than a label, as labels are not always supplied, and in particular on my
test hosts they were consistently missing for ipv6 address messages.

I adjusted the newAddrMessage.Addr field to be populated from
Attributes.Address rather than Attributes.Local, as again for ipv6
.Local was always empty, and with ipv4 the .Address and .Local contained
the same contents in each of my test environments.

Update #4282

Signed-off-by: James Tucker <james@tailscale.com>
3 years ago
James Tucker 2f69c383a5 wgengine/monitor: add envknob TS_DEBUG_NETLINK
While I trust the test behavior, I also want to assert the behavior in a
reproduction environment, this envknob gives me the log information I
need to do so.

Update #4282

Signed-off-by: James Tucker <james@tailscale.com>
3 years ago
Tom 24bdcbe5c7
net/dns, net/dns/resolver, wgengine: refactor DNS request path (#4364)
* net/dns, net/dns/resolver, wgengine: refactor DNS request path

Previously, method calls into the DNS manager/resolver types handled DNS
requests rather than DNS packets. This is fine for UDP as one packet
corresponds to one request or response, however will not suit an
implementation that supports DNS over TCP.

To support PRs implementing this in the future, wgengine delegates
all handling/construction of packets to the magic DNS endpoint, to
the DNS types themselves. Handling IP packets at this level enables
future support for both UDP and TCP.

Signed-off-by: Tom DNetto <tom@tailscale.com>
3 years ago
James Tucker c6ac29bcc4
wgengine/netstack: disable refsvfs2 leak tracking (#4378)
In addition an envknob (TS_DEBUG_NETSTACK_LEAK_MODE) now provides access
to set leak tracking to more useful values.

Fixes #4309

Signed-off-by: James Tucker <james@tailscale.com>
3 years ago
Brad Fitzpatrick e4d8d5e78b net/packet, wgengine/netstack: remove workaround for old gvisor ECN bug
Fixes #2642

Change-Id: Ic02251d24a4109679645d1c8336e0f961d0cce13
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Maisem Ali 6fecc16c3b ipn/ipnlocal: do not process old status messages received out of order
When `setWgengineStatus` is invoked concurrently from multiple
goroutines, it is possible that the call invoked with a newer status is
processed before a call with an older status. e.g. a status that has
endpoints might be followed by a status without endpoints. This causes
unnecessary work in the engine and can result in packet loss.

This patch adds an `AsOf time.Time` field to the status to specifiy when the
status was calculated, which later allows `setWgengineStatus` to ignore
any status messages it receives that are older than the one it has
already processed.

Updates tailscale/corp#2579

Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
James Tucker 445c04c938
wgengine: inject packetbuffers rather than bytes (#4220)
Plumb the outbound injection path to allow passing netstack
PacketBuffers down to the tun Read, where they are decref'd to enable
buffer re-use. This removes one packet alloc & copy, and reduces GC
pressure by pooling outbound injected packets.

Fixes #2741
Signed-off-by: James Tucker <james@tailscale.com>
3 years ago
Brad Fitzpatrick f2041c9088 all: use strings.Cut even more
Change-Id: I943ce72c6f339589235bddbe10d07799c4e37979
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder 0868329936 all: use any instead of interface{}
My favorite part of generics.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 5f176f24db go.mod: upgrade to the latest wireguard-go
This pulls in a handful of fixes and an update to Go 1.18.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 61ee72940c all: use Go 1.18's strings.Cut
More remain.

Change-Id: I6ec562cc1f687600758deae1c9d7dbd0d04004cb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder 1b57b0380d wgengine/magicsock: remove final alloc from ReceiveFrom
And now that we don't have to play escape analysis and inlining games,
simplify the code.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 08cf54f386 wgengine/magicsock: fix goMajorVersion for 1.18 ts release
The version string changed slightly. Adapt.
And always check the current Go version to prevent future
accidental regressions. I would have missed this one had
I not explicitly manually checked it.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Maisem Ali 07f48a7bfe wgengine: handle nil netmaps when assigning isSubnetRouter.
Fixes tailscale/coral#51

Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Brad Fitzpatrick 26f27a620a wgengine/router: delete legacy netfilter rule cleanup [Linux]
This was just cleanup for an ancient version of Tailscale. Any such machines
have upgraded since then.

Change-Id: Iadcde05b37c2b867f92e02ec5d2b18bf2b8f653a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c9eca9451a ssh: make it build on darwin
For local dev testing initially. Product-wise, it'll probably only be
workable on the two unsandboxed builds.

Updates #3802

Change-Id: Ic352f966e7fb29aff897217d79b383131bf3f92b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Maisem Ali 72d8672ef7 tailcfg: make Node.Hostinfo a HostinfoView
Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Brad Fitzpatrick 1b87e025e9 ssh/tailssh: move SSH code from wgengine/netstack to this new package
Still largely incomplete, but in a better home now.

Updates #3802

Change-Id: I46c5ffdeb12e306879af801b06266839157bc624
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 2db6cd1025 ipn/ipnlocal, wgengine/magicsock, logpolicy: quiet more logs
Updates #1548

Change-Id: Ied169f872e93be2857890211f2e018307d4aeadc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 86a902b201 all: adjust some log verbosity
Updates #1548

Change-Id: Ia55f1b5dc7dfea09a08c90324226fb92cd10fa00
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 6eed2811b2 wgengine/netstack: start supporting different SSH users
Updates #3802

Change-Id: I44de6897e36b1362cd74c9b10c9cbfeb9abc3dbc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick bd90781b34 ipn/ipnlocal, wgengine/netstack: use netstack for peerapi server
We're finding a bunch of host operating systems/firewalls interact poorly
with peerapi. We either get ICMP errors from the host or users need to run
commands to allow the peerapi port:

https://github.com/tailscale/tailscale/issues/3842#issuecomment-1025133727

... even though the peerapi should be an internal implementation detail.

Rather than fight the host OS & firewalls, this change handles the
server side of peerapi entirely in netstack (except on iOS), so it
never makes its way to the host OS where it might be messed with. Two
main downsides are:

1) netstack isn't as fast, but we don't really need speed for peerapi.
   And actually, with fewer trips to/from the kernel, we might
   actually make up for some of the netstack performance loss by
   staying in userspace.

2) tcpdump / Wireshark etc packet captures will no longer see the peerapi
   traffic. Oh well. Crawshaw's been wanting to add packet capture server
   support to tailscaled, so we'll probably do that sooner now.

A future change might also then use peerapi for the client-side
(except on iOS).

Updates #3842 (probably fixes, as well as many exit node issues I bet)

Change-Id: Ibc25edbb895dc083d1f07bd3cab614134705aa39
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 730aa1c89c derp/derphttp, wgengine/magicsock: prefer IPv6 to DERPs when IPv6 works
Fixes #3838

Change-Id: Ie47a2a30c7e8e431512824798d2355006d72fb6a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 1af26222b6 go.mod: bump netstack, switch to upstream netstack
Now that Go 1.17 has module graph pruning
(https://go.dev/doc/go1.17#go-command), we should be able to use
upstream netstack without breaking our private repo's build
that then depends on the tailscale.com Go module.

This is that experiment.

Updates #1518 (the original bug to break out netstack to own module)
Updates #2642 (this updates netstack, but doesn't remove workaround)

Change-Id: I27a252c74a517053462e5250db09f379de8ac8ff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson 7a18fe3dca wgengine/magicsock: make debugUseDerpRoute an opt.Bool.
Can still be constant, just needs the extra methods.

Fixes #3812

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Brad Fitzpatrick f3c0023add wgengine/netstack: add an SSH server experiment
Disabled by default.

To use, run tailscaled with:

    TS_SSH_ALLOW_LOGIN=you@bar.com

And enable with:

    $ TAILSCALE_USE_WIP_CODE=true tailscale up --ssh=true

Then ssh [any-user]@[your-tailscale-ip] for a root bash shell.
(both the "root" and "bash" part are temporary)

Updates #3802

Change-Id: I268f8c3c95c8eed5f3231d712a5dc89615a406f0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 41fd4eab5c envknob: add new package for all the strconv.ParseBool(os.Getenv(..))
A new package can also later record/report which knobs are checked and
set. It also makes the code cleaner & easier to grep for env knobs.

Change-Id: Id8a123ab7539f1fadbd27e0cbeac79c2e4f09751
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c64af5e676 wgengine/netstack: clear TCP ECN bits before giving to gvisor
Updates #2642

Change-Id: Ic219442a2656dd9dc99ae1dd91e907fd3d924987
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder de4696da10 wgengine/magicsock: fix deadlock on shutdown
This fixes a deadlock on shutdown.
One goroutine is waiting to send on c.derpRecvCh before unlocking c.mu.
The other goroutine is waiting to lock c.mu before receiving from c.derpRecvCh.

#3736 has a more detailed explanation of the sequence of events.

Fixes #3736

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 185825df11 wgengine/netstack: add a missing refcount decrement after packet injection
Fixes #3762
Updates #3745 (probably fixes?)

Change-Id: I1d3f0590fd5b8adfbc9110bc45ff717bb9e79aae
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 790e41645b wgengine/netstack: add an Impl.Close method for tests
Change-Id: Idbb3fd6d749d3e4effdf96de77a1106584822fef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 166fe3fb12 wgengine/netstack: add missing error logging in a RST case
Updates #2642

Change-Id: I9f2f8fd28fc980208b0739eb9caf9db7b0977c09
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 6be48dfcc6 wgengine/netstack: fix netstack ping timeout on darwin
-W is milliseconds on darwin, not seconds, and empirically it's
milliseconds after a 1 second base.

Change-Id: I2520619e6699d9c505d9645ce4dfee4973555227
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 5404a0557b wgengine/magicsock: remove a per-DERP-packet map lookup in common case
Updates #150

Change-Id: Iffb6eccbe7ca97af97d29be63b7e37d487b3ba28
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 5a317d312d wgengine/magicsock: enable DERP Return Path Optimization (DRPO)
Turning this on at the beginning of the 1.21.x dev cycle, for 1.22.

Updates #150

Change-Id: I1de567cfe0be3df5227087de196ab88e60c9eb56
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c6c39930cc wgengine/magicsock: fix lock ordering deadlock with derphttp
Fixes #3726

Change-Id: I32631a44dcc1da3ae47764728ec11ace1c78190d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick a93937abc3 wgengine/netstack: make userspace ping work when tailscaled has CAP_NET_RAW
Updates #3710

Change-Id: Ief56c7ac20f5f09a2f940a1906b9efbf1b0d6932
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 1a4e8da084 wgengine/netstack: fake pings through netstack on Android too
Every OS ping binary is slightly different. Adjust for Android's.

Updates #1738

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 1b426cc232 wgengine/netstack: add env knob to turn on netstack debug logs
Except for the super verbose packet-level dumps. Keep those disabled
by default with a const.

Updates #2642

Change-Id: Ia9eae1677e8b3fe6f457a59e44896a335d95d547
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick addda5b96f wgengine/magicsock: fix watchdog timeout on Close when IPv6 not available
The blockForeverConn was only using its sync.Cond one side. Looks like it
was just forgotten.

Fixes #3671

Change-Id: I4ed0191982cdd0bfd451f133139428a4fa48238c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 28bf53f502 wgengine/magicsock: reduce disco ping heartbeat aggressiveness a bit
Bigger changes coming later, but this should improve things a bit in
the meantime.

Rationale:

* 2 minutes -> 45 seconds: 2 minutes was overkill and never considered
  phones/battery at the time. It was totally arbitrary. 45 seconds is
  also arbitrary but is less than 2 minutes.

* heartbeat from 2 seconds to 3 seconds: in practice this meant two
  packets per second (2 pings and 2 pongs every 2 seconds) because the
  other side was also pinging us every 2 seconds on their own.
  That's just overkill. (see #540 too)

So in the worst case before: when we sent a single packet (say: a DNS
packet), we ended up sending 61 packets over 2 minutes: the 1 DNS
query and then then 60 disco pings (2 minutes / 2 seconds) & received
the same (1 DNS response + 60 pongs).  Now it's 15. In 1.22 we plan to
remove this whole timer-based heartbeat mechanism entirely.

The 5 seconds to 6.5 seconds change is just stretching out that
interval so you can still miss two heartbeats (other 3 + 3 seconds
would be greater than 5 seconds). This means that if your peer moves
without telling you, you can have a path out for 6.5 seconds
now instead of 5 seconds before disco finds a new one. That will also
improve in 1.22 when we start doing UDP+DERP at the same time
when confidence starts to go down on a UDP path.

Updates #3363

Change-Id: Ic2314bbdaf42edcdd7103014b775db9cf4facb47
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick a201b89e4a wgengine/magicsock: reconnect to DERP when its definition changes
Change-Id: I7c560feb9e4a6e155a35ec764a68354f19f694e4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 506c727e30 ipnlocal, net/{dns,tsaddr,tstun}, wgengine: support MagicDNS on IPv6
Fixes #3660

RELNOTE=MagicDNS now works over IPv6 when CGNAT IPv4 is disabled.

Change-Id: I001e983df5feeb65289abe5012dedd177b841b45
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 7d9b1de3aa netcheck,portmapper,magicsock: ignore some UDP write errors on Linux
Treat UDP send EPERM errors as a lost UDP packet, not something super
fatal. That's just the Linux firewall preventing it from going out.

And add a leaf package net/neterror for that (and future) policy that
all three packages can share, with tests.

Updates #3619

Change-Id: Ibdb838c43ee9efe70f4f25f7fc7fdf4607ba9c1d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 2c94e3c4ad wgengine/magicsock: don't unconditionally close DERP connections on rebind
Only if the source address isn't on the currently active interface or
a ping of the DERP server fails.

Updates #3619

Change-Id: I6bf06503cff4d781f518b437c8744ac29577acc8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick ae319b4636 wgengine/magicsock: add HTML debug handler to see magicsock state
Change-Id: Ibc46f4e9651e1c86ec6f5d139f5e9bdc7a488415
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c7f5bc0f69 wgengine/magicsock: add metrics for sent disco messages
We only tracked the transport type (UDP vs DERP), not what they were.

Change-Id: Ia4430c1c53afd4634e2d9893d96751a885d77955
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 5a9914a92f wgengine/netstack: don't remove 255.255.255.255/32 from netstack
The intent of the updateIPs code is to add & remove IP addresses
to netstack based on what we get from the netmap.

But netstack itself adds 255.255.255.255/32 apparently and we always
fight it (and it adds it back?). So stop fighting it.

Updates #2642 (maybe fixes? maybe.)

Change-Id: I37cb23f8e3f07a42a1a55a585689ca51c2be7c60
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder 93ae11105d ipn/ipnlocal: clear magicsock's netmap on logout
magicsock was hanging onto its netmap on logout,
which caused tailscale status to display partial
information about a bunch of zombie peers.
After logout, there should be no peers.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 6590fc3a94 wgengine/netstack: remove some logging on forwarding connections
Change-Id: Ib1165b918cd5da38583f8e7d4be8cda54af3c81d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 486059589b all: gofmt -w -s (simplify) tests
And it updates the build tag style on a couple files.

Change-Id: I84478d822c8de3f84b56fa1176c99d2ea5083237
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Maisem Ali d24a8f7b5a wgengine/router{windows}: return the output from the firewallTweaker
on error.

While debugging a customer issue where the firewallTweaker was failing
the only message we have is `router: firewall: error adding
Tailscale-Process rule: exit status 1` which is not really helpful.
This will help diagnose firewall tweaking failures.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Brad Fitzpatrick b59e7669c1 wgengine/netstack: in netstack/hybrid mode, fake ICMP using ping command
Change-Id: I42cb4b9b326337f4090d9cea532230e36944b6cb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 7b9c7bc42b ipn/ipnstate: remove old deprecated TailAddr IPv4-only field
It's been a bunch of releases now since the TailscaleIPs slice
replacement was added.

Change-Id: I3bd80e1466b3d9e4a4ac5bedba8b4d3d3e430a03
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Denton Gentry 878a20df29 net/dns: add GetBaseConfig to CallbackRouter.
Allow users of CallbackRouter to supply a GetBaseConfig
implementation. This is expected to be used on Android,
which currently lacks both a) platform support for
Split-DNS and b) a way to retrieve the current DNS
servers.

iOS/macOS also use the CallbackRouter but have platform
support for SplitDNS, so don't need getBaseConfig.

Updates https://github.com/tailscale/tailscale/issues/2116
Updates https://github.com/tailscale/tailscale/issues/988

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
3 years ago
Todd Neal eeccbccd08 support running in a FreeBSD jail
Since devd apparently can't be made to work in a FreeBSD jail
fall back to polling.

Fixes tailscale#2858

Signed-off-by: Todd Neal <todd@tneal.org>
3 years ago
Brad Fitzpatrick 69de3bf7bf wgengine/filter: let unknown IPProto match if IP okay & match allows all ports
RELNOTE=yes

Change-Id: I96eaf3cf550cee7bb6cdb4ad81fc761e280a1b2a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 9c5c9d0a50 ipn/ipnlocal, net/tsdial: make SOCKS/HTTP dials use ExitDNS
And simplify, unexport some tsdial/netstack stuff in the the process.

Fixes #3475

Change-Id: I186a5a5cbd8958e25c075b4676f7f6e70f3ff76e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick adc5997592 net/tsdial: give netstack a Dialer, start refactoring name resolution
This starts to refactor tsdial.Dialer's name resolution to have
different stages: in-memory MagicDNS vs system resolution. A future
change will plug in ExitDNS resolution.

This also plumbs a Dialer into netstack and unexports the dnsMap
internals.

And it removes some of the async AddNetworkMapCallback usage and
replaces it with synchronous updates of the Dialer's netmap
from LocalBackend, since the LocalBackend has the Dialer too.

Updates #3475

Change-Id: Idcb7b1169878c74f0522f5151031ccbc49fe4cb4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick ad3d6e31f0 net/tsdial: move macOS/iOS peerapi sockopt logic from LocalBackend
Change-Id: I812cae027c40c70cdc701427b1a1850cd9bcd60c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c7fb26acdb net/tsdial: also plumb TUN name and monitor into tsdial.Dialer
In prep for moving stuff out of LocalBackend.

Change-Id: I9725aa9c3ebc7275f8c40e040b326483c0340127
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c37af58ea4 net/tsdial: move more weirdo dialing into new tsdial package, plumb
Not done yet, but this move more of the outbound dial special casing
from random packages into tsdial, which aspires to be the one unified
place for all outbound dialing shenanigans.

Then this plumbs it all around, so everybody is ultimately
holding on to the same dialer.

As of this commit, macOS/iOS using an exit node should be able to
reach to the exit node's DoH DNS proxy over peerapi, doing the sockopt
to stay within the Network Extension.

A number of steps remain, including but limited to:

* move a bunch more random dialing stuff

* make netstack-mode tailscaled be able to use exit node's DNS proxy,
  teaching tsdial's resolver to use it when an exit node is in use.

Updates #1713

Change-Id: I1e8ee378f125421c2b816f47bc2c6d913ddcd2f5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick bf1d69f25b wgengine/monitor: fix docs on Mon.InterfaceState
The behavior was changed in March (in 7f174e84e6)
but that change forgot to update these docs.

Change-Id: I79c0301692c1d13a4a26641cc5144baf48ec1360
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick d5405c66b7 net/tsdial: start of new package to unify all outbound dialing complexity
For now this just deletes the net/socks5/tssocks implementation (and
the DNSMap stuff from wgengine/netstack) and moves it into net/tsdial.

Then initialize a Dialer early in tailscaled, currently only use for the
outbound and SOCKS5 proxies. It will be plumbed more later. Notably, it
needs to get down into the DNS forwarder for exit node DNS forwading
in netstack mode. But it will also absorb all the peerapi setsockopt
and netns Dial and tlsdial complexity too.

Updates #1713

Change-Id: Ibc6d56ae21a22655b2fa1002d8fc3f2b2ae8b6df
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick bb91cfeae7 net/socks5/tssocks, wgengine: permit SOCKS through subnet routers/exit nodes
Fixes #1970

Change-Id: Ibef45e8796e1d9625716d72539c96d1dbf7b1f76
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick ff9727c9ff wgengine/filter: fix, test NewAllowAllForTest
I probably broke it when SCTP support was added but nothing apparently
ever used NewAllowAllForTest so it wasn't noticed when it broke.

Change-Id: Ib5a405be233d53cb7fcc61d493ae7aa2d1d590a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson 33c541ae30 ipn/ipnlocal: populate self status from netmap in ipnlocal, not magicsock.
Fixes #1933

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Brad Fitzpatrick 283ae702c1 ipn/ipnlocal: start adding DoH DNS server to peerapi when exit node
Updates #1713

Change-Id: I8d9c488f779e7acc811a9bc18166a2726198a429
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder ad5e04249b wgengine/monitor: ignore adding/removing uninteresting IPs
One of the most common "unexpected" log lines is:

"network state changed, but stringification didn't"

One way that this can occur is if an interesting interface
(non-Tailscale, has interesting IP address)
gains or loses an uninteresting IP address (link local or loopback).

The fact that the interface is interesting is enough for EqualFiltered
to inspect it. The fact that an IP address changed is enough for
EqualFiltered to declare that the interfaces are not equal.

But the State.String method reasonably declines to print any
uninteresting IP addresses. As a result, the network state appears
to have changed, but the stringification did not.

The String method is correct; nothing interesting happened.

This change fixes this by adding an IP address filter to EqualFiltered
in addition to the interface filter. This lets the network monitor
ignore the addition/removal of uninteresting IP addresses.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick e8db43e8fa wgengine/router: demote TestDebugListRules fail to skip
Updates #3360

Change-Id: Ic5c98ea03f3171c13ab9293a0ae74d17fd04d149
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 2ea765e5d8 go.mod: bump inet.af/netstack
Updates #2642 (I'd hoped, but doesn't seem to fix it)

Change-Id: Id54af7c90a1206bc7018215957e20e954782b911
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 946dfec98a wgengine/router: fix checkIPRuleSupportsV6 to actually use IPv6
Updates #3358 (should fix it)
Updates #391

Change-Id: Ia62437dfa81247b0b5994d554cf279c3d540e4e7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 9259377a7f wgengine/router: don't assume Linux was built with IP_MULTIPLE_TABLES
Updates #3351
Updates #391

Change-Id: I7e66b686e05f3c970846513679cc62556ebe322a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 0350cf0438 wgengine{,/router}: annotate some more errors
Updates #3351

Change-Id: I8b4f957d2051b3e29401bb449dbadbdada3a7c46
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder 758c37b83d net/netns: thread logf into control functions
So that darwin can log there without panicking during tests.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 85184a58ed wgengine/wgcfg: recover from mismatched PublicKey/Endpoints
In rare circumstances (tailscale/corp#3016), the PublicKey
and Endpoints can diverge.

This by itself doesn't cause any harm, but our early exit
in response did, because it prevented us from recovering from it.

Remove the early exit.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 8ec44d0d5f wgengine/magicsock: remove some log spam
Fixes tailscale/corp#3070

Change-Id: Ie50031800ec8669e0596ad6d59d1e329a5c88516
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 61d0435ed9 wgengine/monitor: reduce Windows log spam
Fixes #3345

Change-Id: Icde9c92f88f98bb3b030d39b0424a7d389bceb88
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick d24ed3f68e wgengine/router: add debug knob to resort to Linux "ip" command usage
Tailscale 1.18 uses netlink instead of the "ip" command to program the
Linux kernel.

The old way was kept primarily for tests, but this also adds a
TS_DEBUG_USE_IP_COMMAND environment knob to force the old way
temporarily for debugging anybody who might have problems with the
new way in 1.18.

Updates #391

Change-Id: I0236fbfda6c9c05dcb3554fcc27ec0c86456efd9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Josh Bleecher Snyder b3d6704aa3 wgengine/magicsock: fix data race on endpoint.discoKey
endpoint.discoKey is protected by endpoint.mu.
endpoint.sendDiscoMessage was reading it without holding the lock.
This showed up in a CI failure and is readily reproducible locally.

The fix is in two parts.

First, for Conn.enqueueCallMeMaybe, eliminate the one-line helper method endpoint.sendDiscoMessage; call Conn.sendDiscoMessage directly.
This makes it more natural to read endpoint.discoKey in a context
in which endpoint.mu is already held.

Second, for endpoint.sendDiscoPing, explicitly pass the disco key
as an argument. Again, this makes it easier to read endpoint.discoKey
in a context in which endpoint.mu is already held.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick cf06f9df37 net/tstun, wgengine: add packet-level and drop metrics
Primarily tstun work, but some MagicDNS stuff spread into wgengine.

No wireguard reconfig metrics (yet).

Updates #3307

Change-Id: Ide768848d7b7d0591e558f118b553013d1ec94ad
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 7901289578 wgengine/magicsock: add a stress test
And add a peerMap validate method that checks its internal invariants.

Updates tailscale/corp#3016

Change-Id: I23708e68ed44d81986d9e2be82029d4555547592
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 5a60781919 wgengine/magicsock: increase TestDiscokeyChange connection timeout
I believe that this should eliminate the flakiness.
If GitHub CI manages to be even slower that can be believed
(and I can believe a lot at this point),
then we should roll this back and make some more invasive changes.

Updates #654
Fixes #3247 (I hope)

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 773af7292b wgengine/magicsock: simplify peerMap.upsertEndpoint
We can do the "maybe delete" check unilaterally:
In the case of an insert, both oldDiscoKey
and ep.discoKey will be the zero value.

And since we don't use pi again, we can skip
giving it a name, which makes scoping clearer.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 9da22dac3d wgengine/magicsock: fix bug in peerMap.upsertEndpoint
Found by inspection by David Crawshaw while
investigating tailscale/corp#3016.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 16870cb754 wgengine/magicsock: fix typo in comment
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
David Anderson 41da7620af go.mod: update wireguard-go to pick up roaming toggle
wgengine/wgcfg: introduce wgcfg.NewDevice helper to disable roaming
at all call sites (one real plus several tests).

Fixes tailscale/corp#3016.

Signed-off-by: David Anderson <danderson@tailscale.com>
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick 24ea365d48 netcheck, controlclient, magicsock: add more metrics
Updates #3307

Change-Id: Ibb33425764a75bde49230632f1b472f923551126
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 57b039c51d util/clientmetrics: add new package to add metrics to the client
And annotate magicsock as a start.

And add localapi and debug handlers with the Prometheus-format
exporter.

Updates #3307

Change-Id: I47c5d535fe54424741df143d052760387248f8d3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
David Anderson 0532eb30db all: replace tailcfg.DiscoKey with key.DiscoPublic.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Josh Bleecher Snyder c467ed0b62 wgengine/wgcfg: always close io.Pipe
In DeviceConfig, we did not close r after calling FromUAPI.
If FromUAPI returned early due to an error, then it might
not have read all the data that IpcGetOperation wanted to write.
As a result, IpcGetOperation could hang, as in #3220.

We were also closing the wrong end of the pipe after IpcSetOperation
in ReconfigDevice.

To ensure that we get all available information to diagnose
such a situation, include all errors anytime something goes wrong.

This should fix the immediate crashing problem in #3220.
We'll then need to figure out why IpcGetOperation was failing.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 3fd5f4380f util/multierr: new package
github.com/go-multierror/multierror served us well.
But we need a few feature from it (implement Is),
and it's not worth maintaining a fork of such a small module.

Instead, I did a clean room implementation inspired by its API.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
David Anderson 7e6a1ef4f1 tailcfg: use key.NodePublic in wire protocol types.
Updates #3206.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson c17250cee2 ipn/ipnstate: use key.NodePublic instead of tailcfg.NodeKey.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson c3d7115e63 wgengine: use key.NodePublic instead of tailcfg.NodeKey.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 72ace0acba wgengine/magicsock: use key.NodePublic instead of tailcfg.NodeKey.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson d6e7cec6a7 types/netmap: use key.NodePublic instead of tailcfg.NodeKey.
Update #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Brad Fitzpatrick 408b0923a6 wgengine/router: remove last non-test "ip" command usage on Linux
Updates #391

Change-Id: Ic2c3f8460b1e4b8d34b936a1725705fcc1effbae
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick ff1954cfd9 wgengine/router: use netlink for ip rules on Linux
Using temporary netlink fork in github.com/tailscale/netlink until we
get the necessary changes upstream in either vishvananda/netlink
or jsimonetti/rtnetlink.

Updates #391

Change-Id: I6e1de96cf0750ccba53dabff670aca0c56dffb7c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago