Commit Graph

465 Commits (94304819263b0553d7a93d2c061e8f824233456d)

Author SHA1 Message Date
Brad Fitzpatrick 947def7688 types/netmap: remove redundant Netmap.Hostinfo
It was in SelfNode.Hostinfo anyway. The redundant copy was just
costing us an allocation per netmap (a Hostinfo.Clone).

Updates #1909

Change-Id: Ifac568aa5f8054d9419828489442a0f4559bc099
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick db017d3b12 control/controlclient: remove quadratic allocs in mapSession
The mapSession code was previously quadratic: N clients in a netmap
send updates proportional to N and then for each, we do N units of
work. This removes most of that "N units of work" per update. There's
still a netmap-sized slice allocation per update (that's #8963), but
that's it.

Bit more efficient now, especially with larger netmaps:

                                 │     before     │                after                │
                                 │     sec/op     │   sec/op     vs base                │
    MapSessionDelta/size_10-8       47.935µ ±  3%   1.232µ ± 2%  -97.43% (p=0.000 n=10)
    MapSessionDelta/size_100-8      79.950µ ±  3%   1.642µ ± 2%  -97.95% (p=0.000 n=10)
    MapSessionDelta/size_1000-8    355.747µ ± 10%   4.400µ ± 1%  -98.76% (p=0.000 n=10)
    MapSessionDelta/size_10000-8   3079.71µ ±  3%   27.89µ ± 3%  -99.09% (p=0.000 n=10)
    geomean                          254.6µ         3.969µ       -98.44%

                                 │     before     │                after                 │
                                 │      B/op      │     B/op      vs base                │
    MapSessionDelta/size_10-8        9.651Ki ± 0%   2.395Ki ± 0%  -75.19% (p=0.000 n=10)
    MapSessionDelta/size_100-8      83.097Ki ± 0%   3.192Ki ± 0%  -96.16% (p=0.000 n=10)
    MapSessionDelta/size_1000-8     800.25Ki ± 0%   10.32Ki ± 0%  -98.71% (p=0.000 n=10)
    MapSessionDelta/size_10000-8   7896.04Ki ± 0%   82.32Ki ± 0%  -98.96% (p=0.000 n=10)
    geomean                          266.8Ki        8.977Ki       -96.64%

                                 │    before     │               after                │
                                 │   allocs/op   │ allocs/op   vs base                │
    MapSessionDelta/size_10-8         72.00 ± 0%   20.00 ± 0%  -72.22% (p=0.000 n=10)
    MapSessionDelta/size_100-8       523.00 ± 0%   20.00 ± 0%  -96.18% (p=0.000 n=10)
    MapSessionDelta/size_1000-8     5024.00 ± 0%   20.00 ± 0%  -99.60% (p=0.000 n=10)
    MapSessionDelta/size_10000-8   50024.00 ± 0%   20.00 ± 0%  -99.96% (p=0.000 n=10)
    geomean                          1.754k        20.00       -98.86%

Updates #1909

Change-Id: I41ee29358a5521ed762216a76d4cc5b0d16e46ac
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick cb4a61f951 control/controlclient: don't clone self node on each NetworkMap
Drop in the bucket, but have to start somewhere.

Real wins will come once this is done for peers.

                                 │     before     │                after                │
                                 │      B/op      │     B/op       vs base              │
    MapSessionDelta/size_10-8      10.213Ki ± ∞ ¹   9.650Ki ± ∞ ¹  -5.51% (p=0.008 n=5)
    MapSessionDelta/size_100-8      83.64Ki ± ∞ ¹   83.08Ki ± ∞ ¹  -0.67% (p=0.008 n=5)
    MapSessionDelta/size_1000-8     800.8Ki ± ∞ ¹   800.3Ki ± ∞ ¹  -0.07% (p=0.008 n=5)
    MapSessionDelta/size_10000-8    7.712Mi ± ∞ ¹   7.711Mi ± ∞ ¹  -0.01% (p=0.008 n=5)
    geomean                         271.1Ki         266.8Ki        -1.59%

                                 │    before    │               after                │
                                 │  allocs/op   │  allocs/op    vs base              │
    MapSessionDelta/size_10-8       73.00 ± ∞ ¹    72.00 ± ∞ ¹  -1.37% (p=0.008 n=5)
    MapSessionDelta/size_100-8      524.0 ± ∞ ¹    523.0 ± ∞ ¹  -0.19% (p=0.008 n=5)
    MapSessionDelta/size_1000-8    5.025k ± ∞ ¹   5.024k ± ∞ ¹  -0.02% (p=0.008 n=5)
    MapSessionDelta/size_10000-8   50.02k ± ∞ ¹   50.02k ± ∞ ¹  -0.00% (p=0.040 n=5)
    geomean                        1.761k         1.754k        -0.40%

Updates #1909

Change-Id: Ie19dea3371de251d64d4373dd00422f53c2675ea
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 84b94b3146 types/netmap, all: make NetworkMap.SelfNode a tailcfg.NodeView
Updates #1909

Change-Id: I8c470cbc147129a652c1d58eac9b790691b87606
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick b5ff68a968 control/controlclient: flesh out mapSession to break up gigantic method
Now mapSession has a bunch more fields and methods, rather than being
just one massive func with a ton of local variables.

So far there are no major new optimizations, though. It should behave
the same as before.

This has been done with an eye towards testability (so tests can set
all the callback funcs as needed, or not, without a huge Direct client
or long-running HTTP requests), but this change doesn't add new tests
yet. That will follow in the changes which flesh out the NetmapUpdater
interface.

Updates #1909

Change-Id: Iad4e7442d5bbbe2614bd4b1dc4b02e27504898df
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 165f0116f1 types/netmap: move some mutations earlier, remove, document some fields
And optimize the Persist setting a bit, allocating later and only mutating
fields when there's been a Node change.

Updates #1909

Change-Id: Iaddfd9e88ef76e1d18e8d0a41926eb44d0955312
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 21170fb175 control/controlclient: scope a variable tighter, de-pointer a *time.Time
Just misc cleanups.

Updates #1909

Change-Id: I9d64cb6c46d634eb5fdf725c13a6c5e514e02e9a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 4511e7d64e ipn/ipnstate: add PeerStatus.AltSharerUserID, stop mangling Node.User
In b987b2ab18 (2021-01-12) when we introduced sharing we mapped
the sharer to the userid at a low layer, mostly to fix the display of
"tailscale status" and the client UIs, but also some tests.

The commit earlier today, 7dec09d169, removed the 2.5yo option
to let clients disable that automatic mapping, as clearly we were never
getting around to it.

This plumbs the Sharer UserID all the way to ipnstatus so the CLI
itself can choose to print out the Sharer's identity over the node's
original owner.

Then we stop mangling Node.User and let clients decide how they want
to render things.

To ease the migration for the Windows GUI (which currently operates on
tailcfg.Node via the NetMap from WatchIPNBus, instead of PeerStatus),
a new method Node.SharerOrUser is added to do the mapping of
Sharer-else-User.

Updates #1909
Updates tailscale/corp#1183

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 7dec09d169 control/controlclient: remove Opts.KeepSharerAndUserSplit
It was added 2.5 years ago in c1dabd9436 but was never used.
Clearly that migration didn't matter.

We can attempt this again later if/when this matters.

Meanwhile this simplifies the code and thus makes working on other
current efforts in these parts of the code easier.

Updates #1909

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 58a4fd43d8 types/netmap, all: use read-only tailcfg.NodeView in NetworkMap
Updates #8948

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick af2e4909b6 all: remove some Debug fields, NetworkMap.Debug, Reconfig Debug arg
Updates #8923

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 121d1d002c tailcfg: add nodeAttrs for forcing OneCGNAT on/off [capver 71]
Updates #8923

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 25663b1307 tailcfg: remove most Debug fields, move bulk to nodeAttrs [capver 70]
Now a nodeAttr: ForceBackgroundSTUN, DERPRoute, TrimWGConfig,
DisableSubnetsIfPAC, DisableUPnP.

Kept support for, but also now a NodeAttr: RandomizeClientPort.

Removed: SetForceBackgroundSTUN, SetRandomizeClientPort (both never
used, sadly... never got around to them. But nodeAttrs are better
anyway), EnableSilentDisco (will be a nodeAttr later when that effort
resumes).

Updates #8923

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 239ad57446 tailcfg: move LogHeapPprof from Debug to c2n [capver 69]
And delete Debug.GoroutineDumpURL, which was already in c2n.

Updates #8923

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 1fcae42055 control/controlclient: move lastUpdateGenInformed to tighter scope
No need to have it on Auto or be behind a mutex; it's only read/written
from a single goroutine. Move it there.

Updates tailscale/corp#5761

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 2398993804 control/controlclient: refactor in prep for optimized delta handling
See issue. This is a baby step towards passing through deltas
end-to-end from node to control back to node and down to the various
engine subsystems, not computing diffs from two full netmaps at
various levels. This will then let us support larger netmaps without
burning CPU.

But this change itself changes no behavior. It just changes a func
type to an interface with one method. That paves the way for future
changes to then add new NetmapUpdater methods that do more
fine-grained work than updating the whole world.

Updates #1909

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
M. J. Fromberger 9e24a6508a
control/controlclient: avert a data race when logging (#8863)
The read of the synced field for logging takes place outside the lock, and
races with other (locked) writes of this field, including for example the one
at current line 556 in mapRoutine.

Updates tailscale/corp#13856

Change-Id: I056b36d7a93025aafdf73528dd7645f10b791af6
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
1 year ago
Maisem Ali d16946854f control/controlclient: add Auto.updateRoutine
Instead of having updates replace the map polls, create
a third goroutine which is solely responsible for making
sure that control is aware of the latest client state.

This also makes it so that the streaming map polls are only
broken when there are auth changes, or the client is paused.

Updates tailscale/corp#5761

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Andrew Lytvynov 90081a25ca
control/controlhttp: remove tstest.Clock from tests (#8830)
These specific tests rely on some timers in the controlhttp code.
Without time moving forward and timers triggering, the tests fail.

Updates #8587

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
1 year ago
Maisem Ali 734928d3cb control/controlclient: make Direct own all changes to Persist
It was being modified in two places in Direct for the auth routine
and then in LocalBackend when a new NetMap was received. This was
confusing, so make Direct also own changes to Persist when a new
NetMap is received.

Updates #7726

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Maisem Ali 6aaf1d48df types/persist: drop duplicated Persist.LoginName
It was duplicated from Persist.UserProfile.LoginName, drop it.

Updates #7726

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Maisem Ali 17ed2da94d control/controlclient: use ptr.To
Updates #cleanup

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
salman aljammaz 25a7204bb4
wgengine,ipn,cmd/tailscale: add size option to ping (#8739)
This adds the capability to pad disco ping message payloads to reach a
specified size. It also plumbs it through to the tailscale ping -size
flag.

Disco pings used for actual endpoint discovery do not use this yet.

Updates #311.

Signed-off-by: salman <salman@tailscale.com>
Co-authored-by: Val <valerie@tailscale.com>
1 year ago
Claire Wang a17c45fd6e
control: use tstime instead of time (#8595)
Updates #8587
Signed-off-by: Claire Wang <claire@tailscale.com>
1 year ago
Maisem Ali 60e5761d60 control/controlclient: reset backoff in mapRoutine on netmap recv
We were never resetting the backoff in streaming mapResponses.
The call to `PollNetMap` always returns with an error. Changing that contract
is harder, so manually reset backoff when a netmap is received.

Updates tailscale/corp#12894

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Andrew Dunham 7aba0b0d78 net/netcheck, tailcfg: add DERPHomeParams and use it
This allows providing additional information to the client about how to
select a home DERP region, such as preferring a given DERP region over
all others.

Updates #8603

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I7c4a270f31d8585112fab5408799ffba5b75266f
1 year ago
Maisem Ali 9d1a3a995c control/controlclient: use ctx passed down to NoiseClient.getConn
Without this, the client would just get stuck dialing even if the
context was canceled.

Updates tailscale/corp#12590

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Maisem Ali c11af12a49 .github: actually run tests in CI
Updates #cleanup

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Brad Fitzpatrick 7c1068b7ac util/goroutines: let ScrubbedGoroutineDump get only current stack
ScrubbedGoroutineDump previously only returned the stacks of all
goroutines. I also want to be able to use this for only the current
goroutine's stack. Add a bool param to support both ways.

Updates tailscale/corp#5149

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Andrew Dunham 42fd964090 control/controlclient: use dnscache.Resolver for Noise client
This passes the *dnscache.Resolver down from the Direct client into the
Noise client and from there into the controlhttp client. This retains
the Resolver so that it can share state across calls instead of creating
a new resolver.

Updates #4845
Updates #6110

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ia5d6af1870f3b5b5d7dd5685d775dcf300aec7af
2 years ago
Mihai Parparita 7330aa593e all: avoid repeated default interface lookups
On some platforms (notably macOS and iOS) we look up the default
interface to bind outgoing connections to. This is both duplicated
work and results in logspam when the default interface is not available
(i.e. when a phone has no connectivity, we log an error and thus cause
more things that we will try to upload and fail).

Fixed by passing around a netmon.Monitor to more places, so that we can
use its cached interface state.

Fixes #7850
Updates #7621

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Mihai Parparita 4722f7e322 all: move network monitoring from wgengine/monitor to net/netmon
We're using it in more and more places, and it's not really specific to
our use of Wireguard (and does more just link/interface monitoring).

Also removes the separate interface we had for it in sockstats -- it's
a small enough package (we already pull in all of its dependencies
via other paths) that it's not worth the extra complexity.

Updates #7621
Updates #7850

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Andrew Dunham 280255acae
various: add golangci-lint, fix issues (#7905)
This adds an initial and intentionally minimal configuration for
golang-ci, fixes the issues reported, and adds a GitHub Action to check
new pull requests against this linter configuration.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8f38fbc315836a19a094d0d3e986758b9313f163
2 years ago
Mihai Parparita 9a655a1d58 net/dnsfallback: more explicitly pass through logf function
Redoes the approach from #5550 and #7539 to explicitly pass in the logf
function, instead of having global state that can be overridden.

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Mihai Parparita edb02b63f8 net/sockstats: pass in logger to sockstats.WithSockStats
Using log.Printf may end up being printed out to the console, which
is not desirable. I noticed this when I was investigating some client
logs with `sockstats: trace "NetcheckClient" was overwritten by another`.
That turns to be harmless/expected (the netcheck client will fall back
to the DERP client in some cases, which does its own sockstats trace).

However, the log output could be visible to users if running the
`tailscale netcheck` CLI command, which would be needlessly confusing.

Updates tailscale/corp#9230

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Kurnia D Win 9526858b1e control/controlclient: fix accidental backoff reset
Signed-off-by: Kurnia D Win <kurnia.d.win@gmail.com>
2 years ago
Andrew Dunham 83fa17d26c various: pass logger.Logf through to more places
Updates #7537

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Id89acab70ea678c8c7ff0f44792d54c7223337c6
2 years ago
Maisem Ali be027a9899 control/controlclient: improve handling of concurrent lite map requests
This reverts commit 6eca47b16c and fixes forward.

Previously the first ever streaming MapRequest that a client sent would also
set ReadOnly to true as it didn't have any endpoints and expected/relied on the
map poll to restart as soon as it got endpoints. However with 48f6c1eba4,
we would no longer restart MapRequests as frequently as we used to, so control
would only ever get the first streaming MapRequest which had ReadOnly=true.

Control would treat this as an uninteresting request and would not send it
any further netmaps, while the client would happily stay in the map poll forever
while litemap updates happened in parallel.

This makes it so that we never set `ReadOnly=true` when we are doing a streaming
MapRequest. This is no longer necessary either as most endpoint discovery happens
over disco anyway.

Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Tom DNetto ce99474317 all: implement preauth-key support with tailnet lock
Signed-off-by: Tom DNetto <tom@tailscale.com>
2 years ago
Tom DNetto 6eca47b16c Revert "control/controlclient: improve handling of concurrent lite map requests"
This reverts commit 48f6c1eba4.

It unfortunately breaks mapresponse wakeups.

Signed-off-by: Tom DNetto <tom@tailscale.com>
2 years ago
Andrew Dunham 48f6c1eba4 control/controlclient: improve handling of concurrent lite map requests
Prior to this change, if we were in the middle of a lite map update we'd
tear down the entire map session and restart it. With this change, we'll
cancel an in-flight lite map request up to 10 times and restart before
we tear down the streaming map request. We tear down everything after 10
retries to ensure that a steady stream of calls to sendNewMapRequest
doesn't fail to make progress by repeatedly canceling and restarting.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
Change-Id: I9392bf8cf674e7a58ccd1e476039300a359ef3b1
2 years ago
Mihai Parparita 6ac6ddbb47 sockstats: switch label to enum
Makes it cheaper/simpler to persist values, and encourages reuse of
labels as opposed to generating an arbitrary number.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Maisem Ali 1a30b2d73f all: use tstest.Replace more
Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Mihai Parparita 9cb332f0e2 sockstats: instrument networking code paths
Uses the hooks added by tailscale/go#45 to instrument the reads and
writes on the major code paths that do network I/O in the client. The
convention is to use "<package>.<type>:<label>" as the annotation for
the responsible code path.

Enabled on iOS, macOS and Android only, since mobile platforms are the
ones we're most interested in, and we are less sensitive to any
throughput degradation due to the per-I/O callback overhead (macOS is
also enabled for ease of testing during development).

For now just exposed as counters on a /v0/sockstats PeerAPI endpoint.

We also keep track of the current interface so that we can break out
the stats by interface.

Updates tailscale/corp#9230
Updates #3363

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Brad Fitzpatrick fb84ccd82d control/controlhttp: don't require valid TLS cert for Noise connection
We don't require any cert at all for Noise-over-plaintext-port-80-HTTP,
so why require a valid cert chain for Noise-over-HTTPS? The reason we use
HTTPS at all is to get through firewalls that allow tcp/443 but not tcp/80,
not because we need the security properties of TLS.

Updates #3198

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Maisem Ali 5bba65e978 net/memnet: rename from net/nettest
This is just #cleanup to resolve a TODO

Also add a package doc.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Will Norris 71029cea2d all: update copyright and license headers
This updates all source files to use a new standard header for copyright
and license declaration.  Notably, copyright no longer includes a date,
and we now use the standard SPDX-License-Identifier header.

This commit was done almost entirely mechanically with perl, and then
some minimal manual fixes.

Updates #6865

Signed-off-by: Will Norris <will@tailscale.com>
2 years ago
Brad Fitzpatrick 6edf357b96 all: start groundwork for using capver for localapi & peerapi
Updates #7015

Change-Id: I3d4c11b42a727a62eaac3262a879f29bb4ce82dd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 3eb986fe05 control/controlhttp: add TS_FORCE_NOISE_443, TS_DEBUG_NOISE_DIAL envknobs
Updates tailscale/docker-extension#49

Change-Id: I99a154c16c92228bfdf4d2cf6c58cda00e22d72f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham 3a018e51bb ipn/ipnlocal: move handling of expired nodes to LocalBackend
In order to be able to synthesize a new NetMap when a node expires, have
LocalBackend start a timer when receiving a new NetMap that fires
slightly after the next node expires. Additionally, move the logic that
updates expired nodes into LocalBackend so it runs on every netmap
(whether received from controlclient or self-triggered).

Updates #6932

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I833390e16ad188983eac29eb34cc7574f555f2f3
2 years ago