Commit Graph

40 Commits (531ccca648d97d2368a6cd167d9c42e1733b196b)

Author SHA1 Message Date
Brad Fitzpatrick 4950fe60bd syncs, all: move to using Go's new atomic types instead of ours
Fixes #5185

Change-Id: I850dd532559af78c3895e2924f8237ccc328449d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 87a4c75fd4 control/controlclient, ipn/ipnlocal: remove Client.SetExpirySooner, fix race
Client.SetExpirySooner isn't part of the state machine. Remove it from
the Client interface.

And fix a use of LocalBackend.cc without acquiring the lock that
guards that field.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick ef0d740270 control/controlclient: remove Client.SetStatusFunc
It can't change at runtime. Make it an option.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 70a2797064 control/controlclient, ipn/ipnlocal: remove some Client methods
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick a1e429f7c3 control/controlclient, types/netmap: remove unused LocalPort field
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Maisem Ali 6fecc16c3b ipn/ipnlocal: do not process old status messages received out of order
When `setWgengineStatus` is invoked concurrently from multiple
goroutines, it is possible that the call invoked with a newer status is
processed before a call with an older status. e.g. a status that has
endpoints might be followed by a status without endpoints. This causes
unnecessary work in the engine and can result in packet loss.

This patch adds an `AsOf time.Time` field to the status to specifiy when the
status was calculated, which later allows `setWgengineStatus` to ignore
any status messages it receives that are older than the one it has
already processed.

Updates tailscale/corp#2579

Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Josh Bleecher Snyder 0868329936 all: use any instead of interface{}
My favorite part of generics.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Brad Fitzpatrick efc48b0578 ssh/tailssh, ipnlocal, controlclient: fetch next SSHAction from network
Updates #3802

Change-Id: I08e98805ab86d6bbabb6c365ed4526f54742fd8e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Nick O'Neill 1625e87526
control/controlclient, localapi: shorten expiry time via localapi (#4112)
Signed-off-by: Nick O'Neill <nick@tailscale.com>
3 years ago
Maisem Ali 497324ddf6 ipn/store: add common package for instantiating ipn.StateStores
Also move KubeStore and MemStore into their own package.

RELNOTE: tsnet now supports providing a custom ipn.StateStore.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Maisem Ali f9a50779e2 cmd/tailscaled: add `-state=mem:` to support creation of an ephemeral node.
RELNOTE=`tailscaled --state=mem:` registers as an ephemeral node and
does not store state to disk.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Brad Fitzpatrick aac974a5e5 ipn/ipnlocal: deflake (mostly) TestStateMachine
I'm sick of this flaking. Even if this isn't the right fix, it
stops the alert fatigue.

Updates #3020

Change-Id: I4001c127d78f1056302f7741adec34210a72ee61
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick c37af58ea4 net/tsdial: move more weirdo dialing into new tsdial package, plumb
Not done yet, but this move more of the outbound dial special casing
from random packages into tsdial, which aspires to be the one unified
place for all outbound dialing shenanigans.

Then this plumbs it all around, so everybody is ultimately
holding on to the same dialer.

As of this commit, macOS/iOS using an exit node should be able to
reach to the exit node's DoH DNS proxy over peerapi, doing the sockopt
to stay within the Network Extension.

A number of steps remain, including but limited to:

* move a bunch more random dialing stuff

* make netstack-mode tailscaled be able to use exit node's DNS proxy,
  teaching tsdial's resolver to use it when an exit node is in use.

Updates #1713

Change-Id: I1e8ee378f125421c2b816f47bc2c6d913ddcd2f5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Maisem Ali d6dde5a1ac ipn/ipnlocal: handle key extensions after key has already expired
Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
David Anderson 0c546a28ba types/persist: use new node key type.
Updates #3206

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Maisem Ali 81cabf48ec control/controlclient,tailcfg: propagate registration errors to the frontend
Signed-off-by: Maisem Ali <maisem@tailscale.com>
3 years ago
Josh Bleecher Snyder 07c09f470d ipn/ipnlocal: do not shut down the backend halfway through TestStateMachine
LocalBackend.Shutdown's docs say:

> The backend can no longer be used after Shutdown returns.

Nevertheless, TestStateMachine blithely calls Shutdown, talks some smack,
and continues on, expecting things to work. Other uses of Shutdown
in the codebase are as intended.

Things mostly kinda work anyway, except that the wgengine.Engine has been
shut down, so calls to Reconfig fail. Those get logged:

> local.go:603: wgengine status error: engine closing; no status

but otherwise ignored.

However, the Reconfig failure caused one fewer call to pause/unpause
than normal. Now the assertCalls lines match the equivalent ones
earlier in the test.

I don't see an obvious correct replacement for Shutdown in the context
of this test; I'm not sure entirely what it is trying to accomplish.
It is possible that many of the tests remaining after the prior call
to Shutdown are now extraneous. They don't harm anything, though,
so err on the side of safety and leave them for now.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 221cc5f8f2 ipn/ipnlocal: reduce line noise in tests
Use helpers and variadic functions to make the call sites
a lot easier to read, since they occur a lot.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 3ea8cf9c62 ipn/ipnlocal: use quicktest.IsNotNil in tests
This didn't exist when the test was written.
Now it does (frankban/quicktest#94); use it.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 7421ba91ec ipn/ipnlocal: only call UpdateEndpoints when the endpoints change
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder b681edc572 ipn/ipnlocal: add failing test
Concurrent calls to LocalBackend.setWgengineStatus
could result in some of the status updates being dropped.
This was exacerbated by 92077ae78c,
which increases the probability of concurrent status updates,
causing test failures (tailscale/corp#2579).

It's going to take a bit of work to fix this test.
The ipnlocal state machine is difficult to reason about,
particularly in the face of concurrency.
We could fix the test trivially by throwing a new mutex around
setWgengineStatus to serialize calls to it,
but I'd like to at least try to do better than cosmetics.

In the meantime, commit the test.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Josh Bleecher Snyder 7693d36aed all: close fake userspace engines when tests complete
We were leaking FDs.
In a few places, switch from defer to t.Cleanup.

Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
3 years ago
Dave Anderson 980acc38ba
types/key: add a special key with custom serialization for control private keys (#2792)
* Revert "Revert "types/key: add MachinePrivate and MachinePublic.""

This reverts commit 61c3b98a24.

Signed-off-by: David Anderson <danderson@tailscale.com>

* types/key: add ControlPrivate, with custom serialization.

ControlPrivate is just a MachinePrivate that serializes differently
in JSON, to be compatible with how the Tailscale control plane
historically serialized its private key.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 61c3b98a24 Revert "types/key: add MachinePrivate and MachinePublic."
Broke the tailscale control plane due to surprise different serialization.

This reverts commit 4fdb88efe1.
3 years ago
David Anderson 4fdb88efe1 types/key: add MachinePrivate and MachinePublic.
Plumb throughout the codebase as a replacement for the mixed use of
tailcfg.MachineKey and wgkey.Private/Public.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Avery Pennarun 14135dd935 ipn/ipnlocal: make state_test catch the bug fixed by #2445
Updates #2434

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
3 years ago
Brad Fitzpatrick 6eecf3c9d1 ipn/ipnlocal: stay out of map poll when down
Fixes #2434

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick 46896a9311 ipn/ipnlocal: start to test whether all state transitions save prefs to disk
Updates #2321

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
3 years ago
Brad Fitzpatrick e29cec759a ipn/{ipnlocal,localapi}, control/controlclient: add SetDNS localapi
Updates #1235

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Avery Pennarun eaa6507cc9 ipnlocal: in Start() fast path, don't forget to send Prefs.
The resulting empty Prefs had AllowSingleHosts=false and
Routeall=false, so that on iOS if you did these steps:
- Login and leave running
- Terminate the frontend
- Restart the frontend (fast path restart, missing prefs)
- Set WantRunning=false
- Set WantRunning=true
...then you would have Tailscale running, but with no routes. You would
also accidentally disable the ExitNodeID/IP prefs (symptom: the current
exit node setting didn't appear in the UI), but since nothing
else worked either, you probably didn't notice.

The fix was easy enough. It turns out we already knew about the
problem, so this also fixes one of the BUG entries in state_test.

Fixes: #1918 (BUG-1) and some as-yet-unreported bugs with exit nodes.
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
4 years ago
Avery Pennarun 8a7d35594d ipnlocal: don't assume NeedsLogin immediately after StartLogout().
Previously, there was no server round trip required to log out, so when
you asked ipnlocal to Logout(), it could clear the netmap immediately
and switch to NeedsLogin state.

In v1.8, we added a true Logout operation. ipn.Logout() would trigger
an async cc.StartLogout() and *also* immediately switch to NeedsLogin.
Unfortunately, some frontends would see NeedsLogin and immediately
trigger a new StartInteractiveLogin() operation, before the
controlclient auth state machine actually acted on the Logout command,
thus accidentally invalidating the entire logout operation, retaining
the netmap, and violating the user's expectations.

Instead, add a new LogoutFinished signal from controlclient
(paralleling LoginFinished) and, upon starting a logout, don't update
the ipn state machine until it's received.

Updates: #1918 (BUG-2)
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
4 years ago
Avery Pennarun 6fd4e8d244 ipnlocal: fix switching users while logged in + Stopped.
This code path is very tricky since it was originally designed for the
"re-authenticate to refresh my keys" use case, which didn't want to
lose the original session even if the refresh cycle failed. This is why
it acts differently from the Logout(); Login(); case.

Maybe that's too fancy, considering that it probably never quite worked
at all, for switching between users without logging out first. But it
works now.

This was more invasive than I hoped, but the necessary fixes actually
removed several other suspicious BUG: lines from state_test.go, so I'm
pretty confident this is a significant net improvement.

Fixes tailscale/corp#1756.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
4 years ago
Josh Bleecher Snyder 96ef8d34ef ipn/ipnlocal: switch from testify to quicktest
Per discussion, we want to have only one test assertion library,
and we want to start by exploring quicktest.

This was a mostly mechanical translation.
I think we could make this nicer by defining a few helper
closures at the beginning of the test. Later.

Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
4 years ago
Brad Fitzpatrick 544d8d0ab8 ipn/ipnlocal: remove NewLocalBackendWithClientGen
This removes the NewLocalBackendWithClientGen constructor added in
b4d04a065f and instead adds
LocalBackend.SetControlClientGetterForTesting, mirroring
LocalBackend.SetHTTPTestClient. NewLocalBackendWithClientGen was
weird in being exported but taking an unexported type. This was noted
during code review:

https://github.com/tailscale/tailscale/pull/1818#discussion_r623155669

which ended in:

"I'll leave it for y'all to clean up if you find some way to do it elegantly."

This is more idiomatic.

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 years ago
Avery Pennarun 0181a4d0ac ipnlocal: don't pause the controlclient until we get at least one netmap.
Without this, macOS would fail to display its menu state correctly if you
started it while !WantRunning. It relies on the netmap in order to show
the logged-in username.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
4 years ago
Avery Pennarun 4ef207833b ipn: !WantRunning + !LoggedOut should not be idle on startup.
There was logic that would make a "down" tailscale backend (ie.
!WantRunning) refuse to do any network activity. Unfortunately, this
makes the macOS and iOS UI unable to render correctly if they start
while !WantRunning.

Now that we have Prefs.LoggedOut, use that instead. So `tailscale down`
will still allow the controlclient to connect its authroutine, but
pause the maproutine. `tailscale logout` will entirely stop all
activity.

This new behaviour is not obviously correct; it's a bit annoying that
`tailsale down` doesn't terminate all activity like you might expect.
Maybe we should redesign the UI code to render differently when
disconnected, and then revert this change.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
4 years ago
Avery Pennarun 4f3315f3da ipnlocal: setting WantRunning with EditPrefs was special.
EditPrefs should be just a wrapper around the action of changing prefs,
but someone had added a side effect of calling Login() sometimes. The
side effect happened *after* running the state machine, which would
sometimes result in us going into NeedsLogin immediately before calling
cc.Login().

This manifested as the macOS app not being able to Connect if you
launched it with LoggedOut=false and WantRunning=false. Trying to
Connect() would sent us to the NeedsLogin state instead.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
4 years ago
Avery Pennarun 2a4d1cf9e2 Add prefs.LoggedOut to fix several state machine bugs.
Fixes: tailscale/corp#1660

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
4 years ago
Avery Pennarun b0382ca167 ipn/ipnlocal: some state_test cleanups.
This doesn't change the actual functionality. Just some additional
comments and fine tuning.

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
4 years ago
Avery Pennarun b7e31ab1a4 ipn: mock controlclient.Client; big ipn.Backend state machine test.
A very long unit test that verifies the way the controlclient and
ipn.Backend interact.

This is a giant sequential test of the state machine. The test passes,
but only because it's asserting all the wrong behaviour. I marked all
the behaviour I think is wrong with BUG comments, and several
additional test opportunities with TODO.

Note: the new test supercedes TestStartsInNeedsLoginState, which was
checking for incorrect behaviour (although the new test still checks
for the same incorrect behaviour) and assumed .Start() would converge
before returning, which it happens to do, but only for this very
specific case, for the current implementation. You're supposed to wait
for the notifications.

Updates: tailscale/corp#1660

Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
4 years ago