Commit Graph

794 Commits (22680a11ae2a3061c57123bc6afac2f11b7b51a7)

Author SHA1 Message Date
Will Norris 22680a11ae net/sockstats: return early if no radio period length
Signed-off-by: Will Norris <will@tailscale.com>
2 years ago
Will Norris 75784e10e2 sockstats: add client metrics for radio power state
power state is very roughly approximated based on observed network
activity and AT&T's state transition timings for a typical 3G radio.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Will Norris <will@tailscale.com>
2 years ago
Tom DNetto 6a627e5a33 net, wgengine/capture: encode NAT addresses in pcap stream
Signed-off-by: Tom DNetto <tom@tailscale.com>
2 years ago
Jordan Whited f475e5550c
net/neterror, wgengine/magicsock: use UDP GSO and GRO on Linux (#7791)
This commit implements UDP offloading for Linux. GSO size is passed to
and from the kernel via socket control messages. Support is probed at
runtime.

UDP GSO is dependent on checksum offload support on the egress netdev.
UDP GSO will be disabled in the event sendmmsg() returns EIO, which is
a strong signal that the egress netdev does not support checksum
offload.

Updates tailscale/corp#8734

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2 years ago
David Anderson 4d1b3bc26f net/art: implement the stride table building block of ART
A stride table is an 8-bit routing table implemented as an array binary
tree, with a special tree updating function (allot) that enables lightning
fast address lookups and reasonably fast insertion and deletion.

Insertion, deletion and lookup are all allocation-free.

Updates #7781

                                        │    sec/op    │
StrideTableInsertion/10/random_order       16.79n ± 2%
StrideTableInsertion/10/largest_first      16.83n ± 1%
StrideTableInsertion/10/smallest_first     16.83n ± 0%
StrideTableInsertion/50/random_order       17.84n ± 1%
StrideTableInsertion/50/largest_first      20.04n ± 1%
StrideTableInsertion/50/smallest_first     16.39n ± 0%
StrideTableInsertion/100/random_order      14.63n ± 0%
StrideTableInsertion/100/largest_first     17.45n ± 4%
StrideTableInsertion/100/smallest_first    12.98n ± 0%
StrideTableInsertion/200/random_order      12.51n ± 4%
StrideTableInsertion/200/largest_first     18.36n ± 3%
StrideTableInsertion/200/smallest_first    9.609n ± 3%
StrideTableDeletion/10/random_order        19.50n ± 1%
StrideTableDeletion/10/largest_first       19.34n ± 0%
StrideTableDeletion/10/smallest_first      19.43n ± 0%
StrideTableDeletion/50/random_order        14.58n ± 1%
StrideTableDeletion/50/largest_first       14.27n ± 2%
StrideTableDeletion/50/smallest_first      15.51n ± 0%
StrideTableDeletion/100/random_order       12.02n ± 3%
StrideTableDeletion/100/largest_first      10.64n ± 0%
StrideTableDeletion/100/smallest_first     13.21n ± 3%
StrideTableDeletion/200/random_order       14.05n ± 4%
StrideTableDeletion/200/largest_first      9.288n ± 5%
StrideTableDeletion/200/smallest_first     18.51n ± 1%
StrideTableGet                            0.5010n ± 0%

                                        │  routes/s   │
StrideTableInsertion/10/random_order      59.55M ± 2%
StrideTableInsertion/10/largest_first     59.42M ± 1%
StrideTableInsertion/10/smallest_first    59.43M ± 0%
StrideTableInsertion/50/random_order      56.04M ± 1%
StrideTableInsertion/50/largest_first     49.91M ± 1%
StrideTableInsertion/50/smallest_first    61.00M ± 0%
StrideTableInsertion/100/random_order     68.35M ± 0%
StrideTableInsertion/100/largest_first    57.32M ± 3%
StrideTableInsertion/100/smallest_first   77.06M ± 0%
StrideTableInsertion/200/random_order     79.93M ± 4%
StrideTableInsertion/200/largest_first    54.47M ± 3%
StrideTableInsertion/200/smallest_first   104.1M ± 3%
StrideTableDeletion/10/random_order       51.28M ± 1%
StrideTableDeletion/10/largest_first      51.70M ± 0%
StrideTableDeletion/10/smallest_first     51.48M ± 0%
StrideTableDeletion/50/random_order       68.60M ± 1%
StrideTableDeletion/50/largest_first      70.09M ± 2%
StrideTableDeletion/50/smallest_first     64.45M ± 0%
StrideTableDeletion/100/random_order      83.21M ± 3%
StrideTableDeletion/100/largest_first     94.03M ± 0%
StrideTableDeletion/100/smallest_first    75.69M ± 3%
StrideTableDeletion/200/random_order      71.20M ± 5%
StrideTableDeletion/200/largest_first     107.7M ± 5%
StrideTableDeletion/200/smallest_first    54.02M ± 1%
StrideTableGet                            1.996G ± 0%

Signed-off-by: David Anderson <danderson@tailscale.com>
2 years ago
James Tucker 40fa2a420c envknob,net/tstun,wgengine: use TS_DEBUG_MTU consistently
Noted on #5915 TS_DEBUG_MTU was not used consistently everywhere.
Extract the default into a function that can apply this centrally and
use it everywhere.

Added envknob.Lookup{Int,Uint}Sized to make it easier to keep CodeQL
happy when using converted values.

Updates #5915

Signed-off-by: James Tucker <james@tailscale.com>
2 years ago
Will Norris e99c7c3ee5 sockstats: add labels for netlog and sockstatlog packages
Signed-off-by: Will Norris <will@tailscale.com>
2 years ago
Andrew Dunham 38e4d303a2 net/tshttpproxy: don't proxy through ourselves
When running a SOCKS or HTTP proxy, configure the tshttpproxy package to
drop those addresses from any HTTP_PROXY or HTTPS_PROXY environment
variables.

Fixes #7407

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I6cd7cad7a609c639780484bad521c7514841764b
2 years ago
Maisem Ali 985535aebc net/tstun,wgengine/*: add support for NAT to routes
This adds support to make exit nodes and subnet routers work
when in scenarios where NAT is required.

It also updates the NATConfig to be generated from a `wgcfg.Config` as
that handles merging prefs with the netmap, so it has the required information
about whether an exit node is already configured and whether routes are accepted.

Updates tailscale/corp#8020

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali d1d5d52b2c net/tstun/table: add initial RoutingTable implementation
It is based on `*tempfork/device.AllowedIPs`.

Updates tailscale/corp#8020

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Jordan Whited 27e37cf9b3
go.mod, net/tstun, wgengine/magicsock: update wireguard-go (#7712)
This commit updates the wireguard-go dependency to pull in fixes for
the tun package, specifically 052af4a and aad7fca.

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2 years ago
Maisem Ali d2fd101eb4 net/tstun: only log natConfig on changes
Updates tailscale/corp#8020

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Andrew Dunham 33b359642e net/dns: don't send on closed channel in resolvedManager
Fixes #7686

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ibffb05539ab876b12407d77dcf2201d467895981
2 years ago
Andrew Dunham 4cb1bfee44 net/netcheck: improve determinism in hairpinning test
If multiple Go channels have a value (or are closed), receiving from
them all in a select will nondeterministically return one of the two
arms. In this case, it's possible that the hairpin check timer will have
expired between when we start checking and before we check at all, but
the hairpin packet has already been received. In such cases, we'd
nondeterministically set report.HairPinning.

Instead, check if we have a value in our results channel first, then
select on the value and timeout channel after. Also, add a test that
catches this particular failure.

Fixes #1795

Change-Id: I842ab0bd38d66fabc6cabf2c2c1bb9bd32febf35
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
2 years ago
Maisem Ali 0e203e414f net/packet: add checksum update tests
Updates tailscale/corp#8020

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali 0bf8c8e710 net/tstun: use p.Buffer() in more places
Updates tailscale/corp#8020

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali bb31fd7d1c net/tstun: add inital support for NAT v4
This adds support in tstun to utitilize the SelfNodeV4MasqAddrForThisPeer and
perform the necessary modifications to the packet as it passes through tstun.

Currently this only handles ICMP, UDP and TCP traffic.
Subnet routers and Exit Nodes are also unsupported.

Updates tailscale/corp#8020

Co-authored-by: Melanie Warrick <warrick@tailscale.com>
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali 535fad16f8 net/tstun: rename filterIn/filterOut methods to be more descriptive
Updates tailscale/corp#8020

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Mihai Parparita d2dec13392 net/sockstats: export cellular-only clientmetrics
Followup to #7518 to also export client metrics when the active interface
is cellular.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Denton Gentry ebc630c6c0 net/interfaces: also allow link-local for AzureAppServices.
In May 2021, Azure App Services used 172.16.x.x addresses:
```
10: eth0@if11: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
    link/ether 02:42:ac:10:01:03 brd ff:ff:ff:ff:ff:ff
    inet 172.16.1.3/24 brd 172.16.1.255 scope global eth0
       valid_lft forever preferred_lft forever
```

Now it uses link-local:
```
2: eth0@if6: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
    link/ether 8a:30:1f:50:1d:23 brd ff:ff:ff:ff:ff:ff
    inet 169.254.129.3/24 brd 169.254.129.255 scope global eth0
       valid_lft forever preferred_lft forever
```

This is reasonable for them to choose to do, it just broke the handling in net/interfaces.

This PR proposes to:
1. Always allow link-local in LocalAddresses() if we have no better
   address available.
2. Continue to make isUsableV4() conditional on an environment we know
   requires it.

I don't love the idea of having to discover these environments one by
one, but I don't understand the consequences of making isUsableV4()
return true unconditionally. It makes isUsableV4() essentially always
return true and perform no function.

Fixes https://github.com/tailscale/tailscale/issues/7603

Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2 years ago
Mihai Parparita 97b6d3e917 sockstats: remove per-interface stats from Get
They're not needed for the sockstats logger, and they're somewhat
expensive to return (since they involve the creation of a map per
label). We now have a separate GetInterfaces() method that returns
them instead (which we can still use in the PeerAPI debug endpoint).

If changing sockstatlog to sample at 10,000 Hz (instead of the default
of 10Hz), the CPU usage would go up to 59% on a iPhone XS. Removing the
per-interface stats drops it to 20% (a no-op implementation of Get that
returns a fixed value is 16%).

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Will Norris a1d9f65354 ipn,log: add logger for sockstat deltas
Signed-off-by: Will Norris <will@tailscale.com>
Co-authored-by: Melanie Warrick <warrick@tailscale.com>
2 years ago
Maisem Ali 5e8a80b845 all: replace /kb/ links with /s/ equivalents
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Andrew Dunham 83fa17d26c various: pass logger.Logf through to more places
Updates #7537

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id89acab70ea678c8c7ff0f44792d54c7223337c6
2 years ago
Mihai Parparita b64d78d58f sockstats: refactor validation to be opt-in
Followup to #7499 to make validation a separate function (
GetWithValidation vs. Get). This way callers that don't need it don't
pay the cost of a syscall per active TCP socket.

Also clears the conn on close, so that we don't double-count the stats.

Also more consistently uses Go doc comments for the exported API of the
sockstats package.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Mihai Parparita ea81bffdeb sockstats: export as client metrics
Though not fine-grained enough to be useful for detailed analysis, we
might as well export that we gather as client metrics too, since we have
an upload/analysis pipeline for them.

clientmetric.Metric.Add is an atomic add, so it's pretty cheap to also
do per-packet.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Mihai Parparita 4c2f67a1d0 net/sockstat: fix per-interface statistics not always being available
withSockStats may be called before setLinkMonitor, in which case we
don't have a populated knownInterfaces map. Since we pre-populate the
per-interface counters at creation time, we would end up with an
empty map. To mitigate this, we do an on-demand request for the list of
interfaces.

This would most often happen with the logtail instrumentation, since we
initialize it very early on.

Updates tailscale/corp#9230

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Mihai Parparita f4f8ed98d9 sockstats: add validation for TCP socket stats
We can use the TCP_CONNECTION_INFO getsockopt() on Darwin to get
OS-collected tx/rx bytes for TCP sockets. Since this API is not available
for UDP sockets (or on Linux/Android), we can't rely on it for actual
stats gathering.

However, we can use it to validate the stats that we collect ourselves
using read/write hooks, so that we can be more confident in them. We
do need additional hooks from the Go standard library (added in
tailscale/go#59) to be able to collect them.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Mihai Parparita 6ac6ddbb47 sockstats: switch label to enum
Makes it cheaper/simpler to persist values, and encourages reuse of
labels as opposed to generating an arbitrary number.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Aaron Klotz 9687f3700d net/dns: deal with Windows wsl.exe hangs
Despite the fact that WSL configuration is still disabled by default, we
continue to log the machine's list of WSL distros as a diagnostic measure.

Unfortunately I have seen the "wsl.exe -l" command hang indefinitely. This patch
adds a (more than reasonable) 10s timeout to ensure that tailscaled does not get
stuck while executing this operation.

I also modified the Windows implementation of NewOSConfigurator to do the
logging asynchronously, since that information is not required in order to
continue starting up.

Fixes #7476

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
2 years ago
David Crawshaw 96a555fc5a net/socks5: add password auth support
Conforms to RFC 1929.

To support Java HTTP clients via libtailscale, who offer no other
reliable hooks into their sockets.

Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2 years ago
Andrew Dunham f6cd24499b net/portmapper: relax source port check for UPnP responses
Per a packet capture provided, some gateways will reply to a UPnP
discovery packet with a UDP packet with a source port that does not come
from the UPnP port. Accept these packets with a log message.

Updates #7377

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I5d4d5b2a0275009ed60f15c20b484fe2025d094b
2 years ago
Andrew Dunham 51eb0b2cb7 net/portmapper: send UPnP protocol in upper-case
We were previously sending a lower-case "udp" protocol, whereas other
implementations like miniupnp send an upper-case "UDP" protocol. For
compatibility, use an upper-case protocol instead.

Updates #7377

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I4aed204f94e4d51b7a256d29917af1536cb1b70f
2 years ago
Andrew Dunham d379a25ae4 net/portmapper: don't pick external ports below 1024
Some devices don't let you UPnP portmap a port below 1024, so let's just
avoid that range of ports entirely.

Updates #7377

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ib7603b1c9a019162cdc4fa21744a2cae48bb1d86
2 years ago
Maisem Ali 1a30b2d73f all: use tstest.Replace more
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Andrew Dunham 2d3ae485e3 net/interfaces: add better test for LikelyHomeRouterIP
Return a mock set of interfaces and a mock gateway during this test and
verify that LikelyHomeRouterIP returns the outcome we expect. Also
verify that we return an error if there are no IPv4 addresses available.

Follow-up to #7447

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8f06989e7f1f0bebd108861cbff17b820ed2e6e4
2 years ago
Maisem Ali b9ebf7cf14 tstest: add method to Replace values for tests
We have many function pointers that we replace for the duration of test and
restore it on test completion, add method to do that.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Andrew Dunham 12100320d2 net/interfaces: always return an IPv4 LikelyHomeRouterIP
We weren't filtering out IPv6 addresses from this function, so we could
be returning an IPv4 gateway IP and an IPv6 self IP. Per the function
comments, only return IPv4 addresses for the self IP.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: If19a4aadc343fbd4383fc5290befa0eff006799e
2 years ago
Andrew Dunham 73fa7dd7af util/slicesx: add package for generic slice functions, use
Now that we're using rand.Shuffle in a few locations, create a generic
shuffle function and use it instead. While we're at it, move the
interleaveSlices function to the same package for use.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I0b00920e5b3eea846b6cedc30bd34d978a049fd3
2 years ago
Andrew Dunham 3f8e8b04fd cmd/tailscale, cmd/tailscaled: move portmapper debugging into tailscale CLI
The debug flag on tailscaled isn't available in the macOS App Store
build, since we don't have a tailscaled binary; move it to the
'tailscale debug' CLI that is available on all platforms instead,
accessed over LocalAPI.

Updates #7377

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I47bffe4461e036fab577c2e51e173f4003592ff7
2 years ago
Mihai Parparita 3e71e0ef68
net/sockstats: remove explicit dependency on wgengine/monitor
Followup to #7177 to avoid adding extra dependencies to the CLI. We
instead declare an interface for the link monitor.

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Andrew Dunham 27575cd52d net/dnsfallback: shuffle returned IPs
This ensures that we're trying multiple returned IPs, since the DERP
servers return the same response to all queries. This should increase
the chances that we eventually reach a working IP.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ie8d4fb93df96da910fae49ae71bf3e402b9fdecc
2 years ago
Mihai Parparita 9cb332f0e2 sockstats: instrument networking code paths
Uses the hooks added by tailscale/go#45 to instrument the reads and
writes on the major code paths that do network I/O in the client. The
convention is to use "<package>.<type>:<label>" as the annotation for
the responsible code path.

Enabled on iOS, macOS and Android only, since mobile platforms are the
ones we're most interested in, and we are less sensitive to any
throughput degradation due to the per-I/O callback overhead (macOS is
also enabled for ease of testing during development).

For now just exposed as counters on a /v0/sockstats PeerAPI endpoint.

We also keep track of the current interface so that we can break out
the stats by interface.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Mihai Parparita 780c56e119 ipn/ipnlocal: add delegated interface information to /interfaces PeerAPI handler
Exposes the delegated interface data added by #7248 in the debug
endpoint. I would have found it useful when working on that PR, and
it may be handy in the future as well.

Also makes the interfaces table slightly easier to parse by adding
borders to it. To make then nicer-looking, the CSP was relaxed to allow
inline styles.

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
David Anderson 8b2ae47c31 version: unexport all vars, turn Short/Long into funcs
The other formerly exported values aren't used outside the package,
so just unexport them.

Signed-off-by: David Anderson <danderson@tailscale.com>
2 years ago
Mihai Parparita fa932fefe7 net/interfaces: redo how we get the default interface on macOS and iOS
With #6566 we added an external mechanism for getting the default
interface, and used it on macOS and iOS (see tailscale/corp#8201).
The goal was to be able to get the default physical interface even when
using an exit node (in which case the routing table would say that the
Tailscale utun* interface is the default).

However, the external mechanism turns out to be unreliable in some
cases, e.g. when multiple cellular interfaces are present/toggled (I
have occasionally gotten my phone into a state where it reports the pdp_ip1
interface as the default, even though it can't actually route traffic).

It was observed that `ifconfig -v` on macOS reports an "effective interface"
for the Tailscale utn* interface, which seems promising. By examining
the ifconfig source code, it turns out that this is done via a
SIOCGIFDELEGATE ioctl syscall. Though this is a private API, it appears
to have been around for a long time (e.g. it's in the 10.13 xnu release
at https://opensource.apple.com/source/xnu/xnu-4570.41.2/bsd/net/if_types.h.auto.html)
and thus is unlikely to go away.

We can thus use this ioctl if the routing table says that a utun*
interface is the default, and go back to the simpler mechanism that
we had before #6566.

Updates #7184
Updates #7188

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Mihai Parparita 21fda7f670 net/routetable: include unknown flags in the routetable doctor output
As part of the work on #7248 I wanted to know all of the flags on the
RouteMessage struct that we get back from macOS. Though it doesn't turn
out to be useful (when using an exit node/Tailscale is the default route,
the flags for the physical interface routes are the same), it still seems
useful from a debugging/comprehensiveness perspective.

Adds additional Darwin flags that were output once I enabled this mode.

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Tom DNetto 2ca6dd1f1d wgengine: start logging DISCO frames to pcap stream
Signed-off-by: Tom DNetto <tom@tailscale.com>
2 years ago
Colin Adler 3c107ff301
net/connstats: fix ticker in NewStatistics (#7225)
`:=` was accidentally used, so `maxPeriod` never worked.

Signed-off-by: Colin Adler <colin1adler@gmail.com>
2 years ago
Mihai Parparita 62f4df3257 net/interfaces, net/netns: add node attributes to control default interface getting and binding
With #6566 we started to more aggressively bind to the default interface
on Darwin. We are seeing some reports of the wrong cellular interface
being chosen on iOS. To help with the investigation, this adds to knobs
to control the behavior changes:
- CapabilityDebugDisableAlternateDefaultRouteInterface disables the
  alternate function that we use to get the default interface on macOS
  and iOS (implemented in tailscale/corp#8201). We still log what it
  would have returned so we can see if it gets things wrong.
- CapabilityDebugDisableBindConnToInterface is a bigger hammer that
  disables binding of connections to the default interface altogether.

Updates #7184
Updates #7188

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago