Polls IMDS (currently only AWS) for extra IPs to advertise as udprelay.
Updates #17796
Change-Id: Iaaa899ef4575dc23b09a5b713ce6693f6a6a6964
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
* k8s-operator,kube: removing enableSessionRecordings option. It seems
like it is going to create a confusing user experience and it's going to
be a very niche use case, so we have decided to defer this for now.
Updates tailscale/corp#35796
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
* k8s-operator: adding metric for env var deprecation
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
---------
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
net/portmapper: Stop replacing the internal port with the upnp external port
This causes the UPnP mapping to break in the next recreation of the
mapping.
Fixes#18348
Signed-off-by: Eduardo Sorribas <eduardo@sorribas.org>
This change allows tsnet nodes to act as Service hosts by adding a new
function, tsnet.Server.ListenService. Invoking this function will
advertise the node as a host for the Service and create a listener to
receive traffic for the Service.
Fixes#17697Fixestailscale/corp#27200
Signed-off-by: Harry Harpham <harry@tailscale.com>
This change adds API to ipn.LocalBackend to retrieve the ETag when
querying for the current serve config. This allows consumers of
ipn.LocalBackend.SetServeConfig to utilize the concurrency control
offered by ETags. Previous to this change, utilizing serve config ETags
required copying the local backend's internal ETag calcuation.
The local API server was previously copying the local backend's ETag
calculation as described above. With this change, the local API server
now uses the new ETag retrieval function instead. Serve config ETags are
therefore now opaque to clients, in line with best practices.
Fixestailscale/corp#35857
Signed-off-by: Harry Harpham <harry@tailscale.com>
fixestailscale/tailscale#18418
Both Serve and PeerAPI broke when we moved the TailscaleInterfaceName
into State, which is updated asynchronously and may not be
available when we configure the listeners.
This extracts the explicit interface name property from netmon.State
and adds as a static struct with getters that have proper error
handling.
The bug is only found in sandboxed Darwin clients, where we
need to know the Tailscale interface details in order to set up the
listeners correctly (they must bind to our interface explicitly to escape
the network sandboxing that is applied by NECP).
Currently set only sandboxed macOS and Plan9 set this but it will
also be useful on Windows to simplify interface filtering in netns.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Policy editors, such as gpedit.msc and gpme.msc, rely on both the presence and the value of the
registry value to determine whether a policy is enabled. Unless an enabledValue is specified
explicitly, it defaults to REG_DWORD 1.
Therefore, we cannot rely on the same registry value to track the policy configuration state when
it is already used by a policy option, such as a dropdown. Otherwise, while the policy setting
will be written and function correctly, it will appear as Not Configured in the policy editor
due to the value mismatch (for example, REG_SZ "always" vs REG_DWORD 1).
In this PR, we update the DNSRegistration policy setting to use the DNSRegistrationConfigured
registry value for tracking. This change has no effect on the client side and exists solely to
satisfy ADMX and policy editor requirements.
Updates #14917
Signed-off-by: Nick Khyl <nickk@tailscale.com>
gocross-wrapper.ps1 is written to use the version of tar that ships with
Windows; we want to avoid conflicts with any other tar on the PATH, such
ones installed by MSYS and/or Cygwin.
Updates https://github.com/tailscale/corp/issues/29940
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Recently, the golangci-lint workflow has been taking longer and longer
to complete, causing it to timeout after the default of 5 minutes.
Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: context deadline exceeded
Timeout exceeded: try increasing it by passing --timeout option
Although PR #18398 enabled the Go module cache, bootstrapping with a
cold cache still takes too long.
This PR doubles the default 5 minute timeout for golangci-lint to 10
minutes so that golangci-lint can finish downloading all of its
dependencies.
Note that this doesn’t affect the 5 minute timeout configured in
.golangci.yml, since running golangci-lint on your local instance
should still be plenty fast.
Fixes#18366
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Allow for optionally specifying an audience for containerboot. This is
passed to tailscale up to allow for containerboot to use automatic ID
token generation for authentication.
Updates https://github.com/tailscale/corp/issues/34430
Signed-off-by: Mario Minardi <mario@tailscale.com>
Allow for optionally specifiying an audience for tsnet. This is passed
to the underlying identity federation logic to allow for tsnet auth to
use automatic ID token generation for authentication.
Updates https://github.com/tailscale/corp/issues/33316
Signed-off-by: Mario Minardi <mario@tailscale.com>
If local tailscale/tailscale checkout is not available,
pulll cigocacher remotely.
Fall back to ./tool/go if no other Go installation
is present.
Updates tailscale/corp#32493
Signed-off-by: Irbe Krumina <irbekrm@gmail.com>
Adds the ability to detect what provider the client is running on and tries fetch the ID token to use with Workload Identity.
Updates https://github.com/tailscale/corp/issues/33316
Signed-off-by: Danni Popova <danni@tailscale.com>
Recently, the golangci-lint workflow has been taking longer and longer
to complete, causing it to timeout after the default of 5 minutes.
Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: context deadline exceeded
Timeout exceeded: try increasing it by passing --timeout option
This PR upgrades actions/setup-go to version 6, the latest, and
enables caching for Go modules and build outputs. This should speed up
linting because most packages won’t have to be downloaded over and
over again.
Fixes#18366
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Fixes a bug where, for kube HA proxies, TLS certs for the replica
responsible for cert issuance where loaded in memory on startup,
although the in-memory store was not updated after renewal (to
avoid failing re-issuance for re-created Ingresses).
Now the 'write' replica always reads certs from the kube Secret.
Updates tailscale/tailscale#18394
Signed-off-by: Irbe Krumina <irbekrm@gmail.com>
Previously the funnel listener would leave artifacts in the serve
config. This caused weird out-of-sync effects like the admin panel
showing that funnel was enabled for a node, but the node rejecting
packets because the listener was closed.
This change resolves these synchronization issues by ensuring that
funnel listeners clean up the serve config when closed.
See also:
e109cf9fdd
Updates #cleanup
Signed-off-by: Harry Harpham <harry@tailscale.com>
Prior to this change, we were resetting the tsnet's serve config every
time tsnet.Server.Up was run. This is important to do on startup, to
prevent messy interactions with stale configuration when the code has
changed.
However, Up is frequently run as a just-in-case step (for example, by
Server.ListenTLS/ListenFunnel and possibly by consumers of tsnet). When
the serve config is reset on each of these calls to Up, this creates
situations in which the serve config disappears unexpectedly. The
solution is to reset the serve config only on the first call to Up.
Fixes#8800
Updates tailscale/corp#27200
Signed-off-by: Harry Harpham <harry@tailscale.com>
Add support for authenticating the gitops-pusher using workload identity
federation.
Updates https://github.com/tailscale/corp/issues/34172
Signed-off-by: Mario Minardi <mario@tailscale.com>
QR codes are used by `tailscale up --qr` to provide an easy way to
open a web-page without transcribing a difficult URI. However, there’s
no need for this feature if the client will never be called
interactively. So this PR adds the `ts_omit_qrcodes` build tag.
Updates #18182
Signed-off-by: Simon Law <sfllaw@tailscale.com>
It's not worth adding the v2 client just for these e2e tests. Remove
that dependency for now to keep a clear separation, but we should revive
the v2 client version if we ever decide to take that dependency for the
tailscale/tailscale repo as a whole.
Updates tailscale/corp#32085
Change-Id: Ic51ce233d5f14ce2d25f31a6c4bb9cf545057dd0
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
* cmd/k8s-operator/e2e: run self-contained e2e tests with devcontrol
Adds orchestration for more of the e2e testing setup requirements to
make it easier to run them in CI, but also run them locally in a way
that's consistent with CI. Requires running devcontrol, but otherwise
supports creating all the scaffolding required to exercise the operator
and proxies.
Updates tailscale/corp#32085
Change-Id: Ia7bff38af3801fd141ad17452aa5a68b7e724ca6
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
* cmd/k8s-operator/e2e: being more specific on tmp dir cleanup
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
---------
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Co-authored-by: chaosinthecrd <tom@tmlabs.co.uk>
Raw Linux consoles support UTF-8, but we cannot assume that all UTF-8
characters are available. The default Fixed and Terminus fonts don’t
contain half-block characters (`▀` and `▄`), but do contain the
full-block character (`█`).
Sometimes, Linux doesn’t have a framebuffer, so it falls back to VGA.
When this happens, the full-block character could be anywhere in
extended ASCII block, because we don’t know which code page is active.
This PR introduces `--qr-format=auto` which tries to heuristically
detect when Tailscale is printing to a raw Linux console, whether
UTF-8 is enabled, and which block characters have been mapped in the
console font.
If Unicode characters are unavailable, the new `--qr-format=ascii`
formatter uses `#` characters instead of full-block characters.
Fixes#12935
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Moves magicksock.cloudInfo into util/cloudinfo with minimal changes.
Updates #17796
Change-Id: I83f32473b9180074d5cdbf00fa31e5b3f579f189
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
Bump peter-evans/create-pull-request to 8.0.0 to ensure compatibility
with actions/checkout 6.x.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
The funnel command is sort of an alias for the serve command. This means
that the subcommands added to serve to support Services appear as
subcommands for funnel as well, despite having no meaning for funnel.
This change removes all such Services-specific subcommands from funnel.
Fixestailscale/corp#34167
Signed-off-by: Harry Harpham <harry@tailscale.com>
Ensure that hardware attestation keys are not added to tailscaled
state stores that are Kubernetes Secrets or AWS SSM as those Tailscale
devices should be able to be recreated on different nodes, for example,
when moving Pods between nodes.
Updates tailscale/tailscale#18302
Signed-off-by: Irbe Krumina <irbekrm@gmail.com>
TPM-based features have been incredibly painful due to the heterogeneous
devices in the wild, and many situations in which the TPM "changes" (is
reset or replaced). All of this leads to a lot of customer issues.
We hoped to iron out all the kinks and get all users to benefit from
state encryption and hardware attestation without manually opting in,
but the long tail of kinks is just too long.
This change disables TPM-based features on Windows and Linux by default.
Node state should get auto-decrypted on update, and old attestation keys
will be removed.
There's also tailscaled-on-macOS, but it won't have a TPM or Keychain
bindings anyway.
Updates #18302
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Soft-fail on initial unmarshal and try again, ignoring the
AttestationKey. This helps in cases where something about the
attestation key storage (usually a TPM) is messed up. The old key will
be lost, but at least the node can start again.
Updates #18302
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Send LOGIN audit messages to the kernel audit subsystem on Linux
when users successfully authenticate to Tailscale SSH. This provides
administrators with audit trail integration via auditd or journald,
recording details about both the Tailscale user (whois) and the
mapped local user account.
The implementation uses raw netlink sockets to send AUDIT_USER_LOGIN
messages to the kernel audit subsystem. It requires CAP_AUDIT_WRITE
capability, which is checked at runtime. If the capability is not
present, audit logging is silently skipped.
Audit messages are sent to the kernel (pid 0) and consumed by either
auditd (written to /var/log/audit/audit.log) or journald (available
via journalctl _TRANSPORT=audit), depending on system configuration.
Note: This may result in duplicate messages on a system where
auditd/journald audit logs are enabled and the system has and supports
`login -h`. Sadly Linux login code paths are still an inconsistent wild
west so we accept the potential duplication rather than trying to avoid
it.
Fixes#18332
Signed-off-by: James Tucker <james@tailscale.com>
GCP Certificate Manager requires an email contact on ACME accounts.
Add --acme-email flag that is required for --certmode=gcp and
optional for --certmode=letsencrypt.
Fixes#18277
Signed-off-by: Raj Singh <raj@tailscale.com>
An error returned by net.Listener.Accept() causes the owning http.Server to shut down.
With the deprecation of net.Error.Temporary(), there's no way for the http.Server to test
whether the returned error is temporary / retryable or not (see golang/go#66252).
Because of that, errors returned by (*safesocket.winIOPipeListener).Accept() cause the LocalAPI
server (aka ipnserver.Server) to shut down, and tailscaled process to exit.
While this might be acceptable in the case of non-recoverable errors, such as programmer errors,
we shouldn't shut down the entire tailscaled process for client- or connection-specific errors,
such as when we couldn't obtain the client's access token because the client attempts to connect
at the Anonymous impersonation level. Instead, the LocalAPI server should gracefully handle
these errors by denying access and returning a 401 Unauthorized to the client.
In tailscale/tscert#15, we fixed a known bug where Caddy and other apps using tscert would attempt
to connect at the Anonymous impersonation level and fail. However, we should also fix this on the tailscaled
side to prevent a potential DoS, where a local app could deliberately open the Tailscale LocalAPI named pipe
at the Anonymous impersonation level and cause tailscaled to exit.
In this PR, we defer token retrieval until (*WindowsClientConn).Token() is called and propagate the returned token
or error via ipnauth.GetConnIdentity() to ipnserver, which handles it the same way as other ipnauth-related errors.
Fixes#18212Fixestailscale/tscert#13
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This gauge will be reworked to include endpoint state in future.
Updates tailscale/corp#30820
Change-Id: I66f349d89422b46eec4ecbaf1a99ad656c7301f9
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
In dynamically changing environments where ACME account keys and certs
are stored separately, it can happen that the account key would get
deleted (and recreated) between issuances. If that is the case,
we currently fail renewals and the only way to recover is for users
to delete certs.
This adds a config knob to allow opting out of the replaces extension
and utilizes it in the Kubernetes operator where there are known
user workflows that could end up with this edge case.
Updates #18251
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Adding both user and client metrics for peer relay forwarded bytes and
packets, and the total endpoints gauge.
User metrics:
tailscaled_peer_relay_forwarded_packets_total{transport_in, transport_out}
tailscaled_peer_relay_forwarded_bytes_total{transport_in, transport_out}
tailscaled_peer_relay_endpoints_total{}
Where the transport labels can be of "udp4" or "udp6".
Client metrics:
udprelay_forwarded_(packets|bytes)_udp(4|6)_udp(4|6)
udprelay_endpoints
RELNOTE: Expose tailscaled metrics for peer relay.
Updates tailscale/corp#30820
Change-Id: I1a905d15bdc5ee84e28017e0b93210e2d9660259
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
Adds support for targeting FQDNs that are a Tailscale Service. Uses the
same method of searching for Services as the tailscale configure
kubeconfig command. This fixes using the tailscale.com/tailnet-fqdn
annotation for Kubernetes Service when the specified FQDN is a Tailscale
Service.
Fixes#16534
Change-Id: I422795de76dc83ae30e7e757bc4fbd8eec21cc64
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Signed-off-by: Becky Pauley <becky@tailscale.com>
IsZero is required by the interface, so we should use that before trying
to serialize the key.
Updates #35412
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
When the TS_DEBUG_DNS_FORWARD_SEND envknob is turned on, also log the
source IP:port of the query that tailscaled is forwarding.
Updates tailscale/corp#35374
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
updates tailscale/corp#33891
Addresses several older the TODO's in netmon. This removes the
Major flag precomputes the ChangeDelta state, rather than making
consumers of ChangeDeltas sort that out themselves. We're also seeing
a lot of ChangeDelta's being flagged as "Major" when they are
not interesting, triggering rebinds in wgengine that are not needed. This
cleans that up and adds a host of additional tests.
The dependencies are cleaned, notably removing dependency on netmon
itself for calculating what is interesting, and what is not. This includes letting
individual platforms set a bespoke global "IsInterestingInterface"
function. This is only used on Darwin.
RebindRequired now roughly follows how "Major" was historically
calculated but includes some additional checks for various
uninteresting events such as changes in interface addresses that
shouldn't trigger a rebind. This significantly reduces thrashing (by
roughly half on Darwin clients which switching between nics). The individual
values that we roll into RebindRequired are also exposed so that
components consuming netmap.ChangeDelta can ask more
targeted questions.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
The existing client metric methods only support incrementing (or
decrementing) a delta value. This new method allows setting the metric
to a specific value.
Updates tailscale/corp#35327
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
This commit also introduces a sync.Mutex for guarding mutatable fields
on serverEndpoint, now that it is no longer guarded by the sync.Mutex
in Server.
These changes reduce lock contention and by effect increase aggregate
throughput under high flow count load. A benchmark on Linux with AWS
c8gn instances showed a ~30% increase in aggregate throughput (37Gb/s
vs 28Gb/s) for 12 tailscaled flows.
Updates tailscale/corp#35264
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Add flags:
* --cigocached-host to support alternative host resolution in other
environments, like the corp repo.
* --stats to reduce the amount of bash script we need.
* --version to support a caching tool/cigocacher script that will
download from GitHub releases.
Updates tailscale/corp#10808
Change-Id: Ib2447bc5f79058669a70f2c49cef6aedd7afc049
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
tcpHandlerForVIPService was missing ProxyProtocol support that
tcpHandlerForServe already had. Extract the shared logic into
forwardTCPWithProxyProtocol helper and use it in both handlers.
Fixes#18172
Signed-off-by: Raj Singh <raj@tailscale.com>
Add metrics about logtail uploading and underlying buffer.
Add metrics to the in-memory buffer implementation.
Updates tailscale/corp#21363
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
PR #18033 skipped tests for the versions of Linux 6.6 and 6.12 that
had a regression in /proc/net/tcp that causes seek operations to fail
with “illegal seek”.
This PR skips tests for Linux 6.14.0, which is the default Ubuntu
kernel, that also contains this regression.
Updates #16966
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The filch implementation is fairly broken:
* When Filch.cur exceeds MaxFileSize, it calls moveContents
to copy the entirety of cur into alt (while holding the write lock).
By nature, this is the movement of a lot of data in a hot path,
meaning that all log calls will be globally blocked!
It also means that log uploads will be blocked during the move.
* The implementation of moveContents is buggy in that
it copies data from cur into the start of alt,
but fails to truncate alt to the number of bytes copied.
Consequently, there are unrelated lines near the end,
leading to out-of-order lines when being read back.
* Data filched via stderr do not directly respect MaxFileSize,
which is only checked every 100 Filch.Write calls.
This means that it is possible that the file grows far beyond
the specified max file size before moveContents is called.
* If both log files have data when New is called,
it also copies the entirety of cur into alt.
This can block the startup of a process copying lots of data
before the process can do any useful work.
* TryReadLine is implemented using bufio.Scanner.
Unfortunately, it will choke on any lines longer than
bufio.MaxScanTokenSize, rather than gracefully skip over them.
The re-implementation avoids a lot of these problems
by fundamentally eliminating the need for moveContent.
We enforce MaxFileSize by simply rotating the log files
whenever the current file exceeds MaxFileSize/2.
This is a constant-time operation regardless of file size.
To more gracefully handle lines longer than bufio.MaxScanTokenSize,
we skip over these lines (without growing the read buffer)
and report an error. This allows subsequent lines to be read.
In order to improve debugging, we add a lot of metrics.
Note that the the mechanism of dup2 with stderr
is inherently racy with a the two file approach.
The order of operations during a rotation is carefully chosen
to reduce the race window to be as short as possible.
Thus, this is slightly less racy than before.
Updates tailscale/corp#21363
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
When receiving a TSMPDiscoAdvertisement from peer, update the discokey
for said peer.
Some parts taken from: https://github.com/tailscale/tailscale/pull/18073/
Updates #12639
Co-authored-by: James Tucker <james@tailscale.com>
Re-instate the linking of iptables installed in Tailscale container
to the legacy iptables version. In environments where the legacy
iptables is not needed, we should be able to run nftables instead,
but this will ensure that Tailscale keeps working in environments
that don't support nftables, such as some Synology NAS hosts.
Updates #17854
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Add --certmode=gcp for using Google Cloud Certificate Manager's
public CA instead of Let's Encrypt. GCP requires External Account
Binding (EAB) credentials for ACME registration, so this adds
--acme-eab-kid and --acme-eab-key flags.
The EAB key accepts both base64url and standard base64 encoding
to support both ACME spec format and gcloud output.
Fixestailscale/corp#34881
Signed-off-by: Raj Singh <raj@tailscale.com>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When using the resolve.conf file for setting DNS, it is possible that
some other services will trample the file and overwrite our set DNS
server. Experiments has shown this to be a racy error depending on how
quickly processes start.
Make an attempt to trample back the file a limited number of times if
the file is changed.
Updates #16635
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
When peers request an IP address mapping to be stored, the connector
stores it in memory.
Fixestailscale/corp#34251
Signed-off-by: Fran Bull <fran@tailscale.com>
To save rebuilding cigocacher on each CI job, build it on-demand, and
publish a release similar to how we publish releases for tool/go to
consume. Once the first release is done, we can add a new
tool/cigocacher script that pins to a specific release for each branch
to download.
Updates tailscale/corp#10808
Change-Id: I7694b2c2240020ba2335eb467522cdd029469b6c
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
It appears (*controlclient.Auto).Shutdown() can still deadlock when called with b.mu held, and therefore the changes in #18127 are unsafe.
This reverts #18127 until we figure out what causes it.
This reverts commit d199ecac80.
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This improves our test coverage of the Bootstrap() method, especially
around catching AUMs that shouldn't pass validation.
Updates #cleanup
Change-Id: Idc61fcbc6daaa98c36d20ec61e45ce48771b85de
Signed-off-by: Alex Chan <alexc@tailscale.com>
Previously, if users attempted to expose a headless Service to tailnet,
this just silently did not work.
This PR makes the operator throw a warning event + update Service's
status with an error message.
Updates #18139
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The event queue gets deleted events, which means that sometimes
the object that should be reconciled no longer exists.
Don't log user facing errors if that is the case.
Updates #18141
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The service was starting after systemd itself, and while this
surprisingly worked for some situations, it broke for others.
Change it to start after a GUI has been initialized.
Updates #17656
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Previously we only set this when it updated, which was fine for the first
call to Start(), but after that point future updates would be skipped if
nothing had changed. If Start() was called again, it would wipe the peer API
endpoints and they wouldn't get added back again, breaking exit nodes (and
anything else requiring peer API to be advertised).
Updates tailscale/corp#27173
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
Based on PR #16700 by @lox, adapted to current codebase.
Adds support for proxying HTTP requests to Unix domain sockets via
tailscale serve unix:/path/to/socket, enabling exposure of services
like Docker, containerd, PHP-FPM over Tailscale without TCP bridging.
The implementation includes reasonable protections against exposure of
tailscaled's own socket.
Adaptations from original PR:
- Use net.Dialer.DialContext instead of net.Dial for context propagation
- Use http.Transport with Protocols API (current h2c approach, not http2.Transport)
- Resolve conflicts with hasScheme variable in ExpandProxyTargetValue
Updates #9771
Signed-off-by: Peter A. <ink.splatters@pm.me>
Co-authored-by: Lachlan Donald <lachlan@ljd.cc>
If a packet arrives while WireGuard is being reconfigured with b.mu held, such as during a profile switch,
calling back into (*LocalBackend).GetPeerAPIPort from (*Wrapper).filterPacketInboundFromWireGuard
may deadlock when it tries to acquire b.mu.
This occurs because a peer cannot be removed while an inbound packet is being processed.
The reconfig and profile switch wait for (*Peer).RoutineSequentialReceiver to return, but it never finishes
because GetPeerAPIPort needs b.mu, which the waiting goroutine already holds.
In this PR, we make peerAPIPorts a new syncs.AtomicValue field that is written with b.mu held
but can be read by GetPeerAPIPort without holding the mutex, which fixes the deadlock.
There might be other long-term ways to address the issue, such as moving peer API listeners
from LocalBackend to nodeBackend so they can be accessed without holding b.mu,
but these changes are too large and risky at this stage in the v1.92 release cycle.
Updates #18124
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Previously, callers of (*LocalBackend).resetControlClientLocked were supposed
to call Shutdown on the returned controlclient.Client after releasing b.mu.
In #17804, we started calling Shutdown while holding b.mu, which caused
deadlocks during profile switches due to the (*ExecQueue).RunSync implementation.
We first patched this in #18053 by calling Shutdown in a new goroutine,
which avoided the deadlocks but made TestStateMachine flaky because
the shutdown order was no longer guaranteed.
In #18070, we updated (*ExecQueue).RunSync to allow shutting down
the queue without waiting for RunSync to return. With that change,
shutting down the control client while holding b.mu became safe.
Therefore, this PR updates (*LocalBackend).resetControlClientLocked
to shut down the old client synchronously during the reset, instead of
returning it and shifting that responsibility to the callers.
This fixes the flaky tests and simplifies the code.
Fixes#18052
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This commit uses SO_REUSEPORT (when supported) to bind multiple sockets
per address family. Increasing the number of sockets can increase
aggregate throughput when serving many peer relay client flows.
Benchmarks show 3x improvement in max aggregate bitrate in some
environments.
Updates tailscale/corp#34745
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Add support for pinning specific Tailscale versions during installation
via the TAILSCALE_VERSION environment variable.
Example usage:
curl -fsSL https://tailscale.com/install.sh | TAILSCALE_VERSION=1.88.4 sh
Fixes#17776
Signed-off-by: Raj Singh <raj@tailscale.com>
111 is 3 years old, and there have been a lot of speed improvements
since then. We run wasm-opt twice as part of the CI wasm job, and it
currently takes about 3 minutes each time. With 125, it takes ~40
seconds, a 4.5x speed-up.
Updates #cleanup
Change-Id: I671ae6cefa3997a23cdcab6871896b6b03e83a4f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Implements a new disk put function for cigocacher that does not cause
locking issues on Windows when there are multiple processes reading and
writing the same files concurrently. Integrates cigocacher into test.yml
for Windows where we are running on larger runners that support
connecting to private Azure vnet resources where cigocached is hosted.
Updates tailscale/corp#10808
Change-Id: I0d0e9b670e49e0f9abf01ff3d605cd660dd85ebb
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
The cache artifacts from a full run of test.yml are 14GB. Only save
artifacts from the main branch to ensure we don't thrash too much. Most
branches should get decent performance with a hit from recent main.
Fixestailscale/corp#34739
Change-Id: Ia83269d878e4781e3ddf33f1db2f21d06ea2130f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Thanks to seamless key renewal, you can now do a force-reauth without
losing your connection in all circumstances. We softened the interactive
warning (see #17262) so let's soften the help text as well.
Updates https://github.com/tailscale/corp/issues/32429
Signed-off-by: Alex Chan <alexc@tailscale.com>
* cmd/k8s-operator: add support for taiscale.com/http-redirect
The k8s-operator now supports a tailscale.com/http-redirect annotation
on Ingress resources. When enabled, this automatically creates port 80
handlers that automatically redirect to the equivalent HTTPS location.
Fixes#11252
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
* Fix for permanent redirect
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
* lint
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
* warn for redirect+endpoint
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
* tests
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
---------
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
Restrict running the golangci-lint workflow to when the workflow file
itself or a .go file, go.mod, or go.sum have actually been modified.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Skip the "request review" workflows for PRs that are in draft to reduce
noise / skip adding reviewers to PRs that are intentionally marked as
not ready to review.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Adds an observation point that may identify potentially abusive traffic
patterns at outlier values.
Updates tailscale/corp#24681
Signed-off-by: James Tucker <james@tailscale.com>
We don't hold q.mu while running normal ExecQueue.Add funcs, so we
shouldn't in RunSync either. Otherwise code it calls can't shut down
the queue, as seen in #18502.
Updates #18052
Co-authored-by: Nick Khyl <nickk@tailscale.com>
Change-Id: Ic5e53440411eca5e9fabac7f4a68a9f6ef026de1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This patch adds an integration test for Tailnet Lock, checking that a node can't
talk to peers in the tailnet until it becomes signed.
This patch also introduces a new package `tstest/tkatest`, which has some helpers
for constructing a mock control server that responds to TKA requests. This allows
us to reduce boilerplate in the IPN tests.
Updates tailscale/corp#33599
Signed-off-by: Alex Chan <alexc@tailscale.com>
In preparation for exposing its configuration via ipn.ConfigVAlpha,
change {Masked}Prefs.RelayServerPort from *int to *uint16. This takes a
defensive stance against invalid inputs at JSON decode time.
'tailscale set --relay-server-port' is currently the only input to this
pref, and has always sanitized input to fit within a uint16.
Updates tailscale/corp#34591
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Adds a new types of TSMP messages for advertising disco keys keys
to/from a peer, and implements the advertising triggered by a TSMP ping.
Needed as part of the effort to cache the netmap and still let clients
connect without control being reachable.
Updates #12639
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Co-authored-by: James Tucker <james@tailscale.com>
In suggestExitNodeLocked, if no exit node candidates have a home DERP or
valid location info, `bestCandidates` is an empty slice. This slice is
passed to `selectNode` (`randomNode` in prod):
```go func randomNode(nodes views.Slice[tailcfg.NodeView], …) tailcfg.NodeView {
…
return nodes.At(rand.IntN(nodes.Len()))
}
```
An empty slice becomes a call to `rand.IntN(0)`, which panics.
This patch changes the behaviour, so if we've filtered out all the
candidates before calling `selectNode`, reset the list and then pick
from any of the available candidates.
This patch also updates our tests to give us more coverage of `randomNode`,
so we can spot other potential issues.
Updates #17661
Change-Id: I63eb5e4494d45a1df5b1f4b1b5c6d5576322aa72
Signed-off-by: Alex Chan <alexc@tailscale.com>
And fix up the TestAutoUpdateDefaults integration tests as they
weren't testing reality: the DefaultAutoUpdate is supposed to only be
relevant on the first MapResponse in the stream, but the tests weren't
testing that. They were instead injecting a 2nd+ MapResponse.
This changes the test control server to add a hook to modify the first
map response, and then makes the test control when the node goes up
and down to make new map responses.
Also, the test now runs on macOS where the auto-update feature being
disabled would've previously t.Skipped the whole test.
Updates #11502
Change-Id: If2319bd1f71e108b57d79fe500b2acedbc76e1a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In PR tailscale/corp#34401, the `traffic-steering` feature flag does
not automatically enable traffic steering for all nodes. Instead, an
admin must add the `traffic-steering` node attribute to each client
node that they want opted-in.
For backwards compatibility with older clients, tailscale/corp#34401
strips out the `traffic-steering` node attribute if the feature flag
is not enabled, even if it is set in the policy file. This lets us
safely disable the feature flag.
This PR adds a missing test case for suggested exit nodes that have no
priority.
Updates tailscale/corp#34399
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This commit fixes a bug in our HA ingress reconciler where ingress resources would
be stuck in a deleting state should their associated VIP service be deleted within
control.
The reconciliation loop would check for the existence of the VIP service and if not
found perform no additional cleanup steps. The code has been modified to continue
onwards even if the VIP service is not found.
Fixes: https://github.com/tailscale/tailscale/issues/18049
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit replaces crypto/rand challenge generation with a blake2s-256
MAC. This enables the peer relay server to respond to multiple forward
disco.BindUDPRelayEndpoint messages per handshake generation without
sacrificing the proof of IP ownership properties of the handshake.
Responding to multiple forward disco.BindUDPRelayEndpoint messages per
handshake generation improves client address/path selection where
lowest client->server path/addr one-way delay does not necessarily
equate to lowest client<->server round trip delay.
It also improves situations where outbound traffic is filtered
independent of input, and the first reply
disco.BindUDPRelayEndpointChallenge message is dropped on the reply
path, but a later reply using a different source would make it through.
Reduction in serverEndpoint state saves 112 bytes per instance, trading
for slightly more expensive crypto ops: 277ns/op vs 321ns/op on an M1
Macbook Pro.
Updates tailscale/corp#34414
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Adds cmd/cigocacher as the client to cigocached for Go caching over
HTTP. The HTTP cache is best-effort only, and builds will fall back to
disk-only cache if it's not available, much like regular builds.
Not yet used in CI; that will follow in another PR once we have runners
available in this repo with the right network setup for reaching
cigocached.
Updates tailscale/corp#10808
Change-Id: I13ae1a12450eb2a05bd9843f358474243989e967
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
When the underlying transport returns a network error, the RoundTrip
method returns (nil, error). The defer was attempting to access resp
without checking if it was nil first, causing a panic. Fix this by
checking for nil in the defer.
Also changes driveTransport.tr from *http.Transport to http.RoundTripper
and adds a test.
Fixes#17306
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
Change-Id: Icf38a020b45aaa9cfbc1415d55fd8b70b978f54c
SetSubnetRoutes was not sending update notifications to nodes when their
approved routes changed, causing nodes to not fetch updated netmaps with
PrimaryRoutes populated. This resulted in TestUserMetricsRouteGauges
flaking because it waited for PrimaryRoutes to be set, which only happened
if the node happened to poll for other reasons.
Now send updateSelfChanged notification to affected nodes so they fetch
an updated netmap immediately.
Fixes#17962
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
Linux kernel versions 6.6.102-104 and 6.12.42-45 have a regression
in /proc/net/tcp that causes seek operations to fail with "illegal seek".
This breaks portlist tests on these kernels.
Add kernel version detection for Linux systems and a SkipOnKernelVersions
helper to tstest. Use it to skip affected portlist tests on the broken
kernel versions.
Thanks to philiptaron for the list of kernels with the issue and fix.
Updates #16966
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
Bounded DeliveredEvent queues reduce memory usage, but they can deadlock under load.
Two common scenarios trigger deadlocks when the number of events published in a short
period exceeds twice the queue capacity (there's a PublishedEvent queue of the same size):
- a subscriber tries to acquire the same mutex as held by a publisher, or
- a subscriber for A events publishes B events
Avoiding these scenarios is not practical and would limit eventbus usefulness and reduce its adoption,
pushing us back to callbacks and other legacy mechanisms. These deadlocks already occurred in customer
devices, dev machines, and tests. They also make it harder to identify and fix slow subscribers and similar
issues we have been seeing recently.
Choosing an arbitrary large fixed queue capacity would only mask the problem. A client running
on a sufficiently large and complex customer environment can exceed any meaningful constant limit,
since event volume depends on the number of peers and other factors. Behavior also changes
based on scheduling of publishers and subscribers by the Go runtime, OS, and hardware, as the issue
is essentially a race between publishers and subscribers. Additionally, on lower-end devices,
an unreasonably high constant capacity is practically the same as using unbounded queues.
Therefore, this PR changes the event queue implementation to be unbounded by default.
The PublishedEvent queue keeps its existing capacity of 16 items, while subscribers'
DeliveredEvent queues become unbounded.
This change fixes known deadlocks and makes the system stable under load,
at the cost of higher potential memory usage, including cases where a queue grows
during an event burst and does not shrink when load decreases.
Further improvements can be implemented in the future as needed.
Fixes#17973Fixes#18012
Signed-off-by: Nick Khyl <nickk@tailscale.com>
As of 2025-11-20, publishing more events than the eventbus's
internal queues can hold may deadlock if a subscriber tries
to publish events itself.
This commit adds a test that demonstrates this deadlock,
and skips it until the bug is fixed.
Updates #18012
Signed-off-by: Nick Khyl <nickk@tailscale.com>
As of 2025-11-20, publishing more events than the eventbus's
internal queues can hold may deadlock if a subscriber tries
to acquire a mutex that can also be held by a publisher.
This commit adds a test that demonstrates this deadlock,
and skips it until the bug is fixed.
Updates #17973
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This is causing confusing panics in tailscale/corp#34485. We'll keep
using the tka.ChonkMem constructor as much as we can, but don't panic
if you create a tka.Mem directly -- we know what the sensible thing is.
Updates #cleanup
Signed-off-by: Alex Chan <alexc@tailscale.com>
Change-Id: I49309f5f403fc26ce4f9a6cf0edc8eddf6a6f3a4
With the introduction of node sealing, store.New fails in some cases due
to the TPM device being reset or unavailable. Currently it results in
tailscaled crashing at startup, which is not obvious to the user until
they check the logs.
Instead of crashing tailscaled at startup, start with an in-memory store
with a health warning about state initialization and a link to (future)
docs on what to do. When this health message is set, also block any
login attempts to avoid masking the problem with an ephemeral node
registration.
Updates #15830
Updates #17654
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
These validations were previously performed in the CLI frontend. There
are two motivations for moving these to the local backend:
1. The backend controls synchronization around the relevant state, so
only the backend can guarantee many of these validations.
2. Doing these validations in the back-end avoids the need to repeat
them across every frontend (e.g. the CLI and tsnet).
Updates tailscale/corp#27200
Signed-off-by: Harry Harpham <harry@tailscale.com>
This commit adds the `spec.replicas` field to the `Recorder` custom
resource that allows for a highly available deployment of `tsrecorder`
within a kubernetes cluster.
Many changes were required here as the code hard-coded the assumption
of a single replica. This has required a few loops, similar to what we
do for the `Connector` resource to create auth and state secrets. It
was also required to add a check to remove dangling state and auth
secrets should the recorder be scaled down.
Updates: https://github.com/tailscale/tailscale/issues/17965
Signed-off-by: David Bond <davidsbond93@gmail.com>
fixestailscale/tailscale#17990
The logging for the netns caps is spammy. Log only on changes
to the values and don't log Darwin specific stuff on non Darwin
clients.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
This commit modifies the kubernetes operator to use the "stable" version
of `k8s-nameserver` by default.
Updates: https://github.com/tailscale/corp/issues/19028
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit enables user to set service backend to remote destinations, that can be a partial
URL or a full URL. The commit also prevents user to set remote destinations on linux system
when socket mark is not working. For user on any version of mac extension they can't serve a
service either. The socket mark usability is determined by a new local api.
Fixestailscale/corp#24783
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
Now that we support using an in-memory backend for TKA state (#17946),
this function always returns `nil` – we can always support Network Lock.
We don't need it any more.
Plus, clean up a couple of errant TODOs from that PR.
Updates tailscale/corp#33599
Change-Id: Ief93bb9adebb82b9ad1b3e406d1ae9d2fa234877
Signed-off-by: Alex Chan <alexc@tailscale.com>
Our style guide recommends avoiding Latin abbreviations in technical
documentation, which includes the CLI help text. This is causing linter
issues for the docs site, because this help text is copied into the docs.
See http://go/style-guide/kb/language-and-grammar/abbreviations#latin-abbreviations
Updates #cleanup
Change-Id: I980c28d996466f0503aaaa65127685f4af608039
Signed-off-by: Alex Chan <alexc@tailscale.com>
ArgoCD sends boolean values but the template expects strings, causing
"incompatible types for comparison" errors. Wrap values with toString
so both work.
Fixes#17158
Signed-off-by: Raj Singh <raj@tailscale.com>
Previously a TKA compaction would only run when a node starts, which means a long-running node could use unbounded storage as it accumulates ever-increasing amounts of TKA state. This patch changes TKA so it runs a compaction after every sync.
Updates https://github.com/tailscale/corp/issues/33537
Change-Id: I91df887ea0c5a5b00cb6caced85aeffa2a4b24ee
Signed-off-by: Alex Chan <alexc@tailscale.com>
This commit modifies the helm/static manifest configuration for the
k8s-operator to prefer the stable image tag. This avoids making those
using static manifests seeing unstable behaviour by default if they
do not manually make the change.
This is managed for us when using helm but not when generating the
static manifests.
Updates https://github.com/tailscale/tailscale/issues/10655
Signed-off-by: David Bond <davidsbond93@gmail.com>
(trying to get in smaller obvious chunks ahead of later PRs to make
them smaller)
Updates #17925
Change-Id: I184002001055790484e4792af8ffe2a9a2465b2e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We now embed node information into network flow logs.
By default, netlogfmt still prints out using Tailscale IP addresses.
Support a "--resolve-addrs=TYPE" flag that can be used to specify
resolving IP addresses as node IDs, hostnames, users, or tags.
Updates tailscale/corp#33352
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Adds the ability to rotate discovery keys on running clients, needed for
testing upcoming disco key distribution changes.
Introduces key.DiscoKey, an atomic container for a disco private key,
public key, and the public key's ShortString, replacing the prior
separate atomic fields.
magicsock.Conn has a new RotateDiscoKey method, and access to this is
provided via localapi and a CLI debug command.
Note that this implementation is primarily for testing as it stands, and
regular use should likely introduce an additional mechanism that allows
the old key to be used for some time, to provide a seamless key rotation
rather than one that invalidates all sessions.
Updates tailscale/corp#34037
Signed-off-by: James Tucker <james@tailscale.com>
As part of the conn25 work we will want to be able to keep track of a
pool of IP Addresses and know which have been used and which have not.
Fixestailscale/corp#34247
Signed-off-by: Fran Bull <fran@tailscale.com>
We use `tka.AUMHash` in `netmap.NetworkMap`, and we serialise it as JSON
in the `/debug/netmap` C2N endpoint. If the binary omits Tailnet Lock support,
the debug endpoint returns an error because it's unable to marshal the
AUMHash.
This patch adds a sentinel value so this marshalling works, and we can
use the debug endpoint.
Updates https://github.com/tailscale/tailscale/issues/17115
Signed-off-by: Alex Chan <alexc@tailscale.com>
Change-Id: I51ec1491a74e9b9f49d1766abd89681049e09ce4
Existing compaction logic seems to have had an assumption that
markActiveChain would cover a longer part of the chain than
markYoungAUMs. This prevented long, but fresh, chains, from being
compacted correctly.
Updates tailscale/corp#33537
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
6a73c0bdf5 added a feature tag but didn't re-run go generate on ./feature/buildfeatures.
Updates #9192
Change-Id: I7819450453e6b34c60cad29d2273e3e118291643
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I added a RemoveAll() method on tka.Chonk in #17946, but it's only used
in the node to purge local AUMs. We don't need it in the SQLite storage,
which currently implements tka.Chonk, so move it to CompactableChonk
instead.
Also add some automated tests, as a safety net.
Updates tailscale/corp#33599
Change-Id: I54de9ccf1d6a3d29b36a94eccb0ebd235acd4ebc
Signed-off-by: Alex Chan <alexc@tailscale.com>
The REST API does not return a node name
with a trailing dot, while the internal node name
reported in the netmap does have one.
In order to be consistent with the API,
strip the dot when recording node information.
Updates tailscale/corp#33352
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Perform a path check first before attempting exec of `true`.
Try /usr/bin/true first, as that is now and increasingly so, the more
common and more portable path.
Fixes tests on macOS arm64 where exec was returning a different kind of
path error than previously checked.
Updates #16569
Signed-off-by: James Tucker <james@tailscale.com>
DA protection is not super helpful because we don't set an authorization
password on the key. But if authorization fails for other reasons (like
TPM being reset), we will eventually cause DA lockout with tailscaled
trying to load the key. DA lockout then leads to (1) issues for other
processes using the TPM and (2) the underlying authorization error being
masked in logs.
Updates #17654
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
For manual (human) testing, this lets the user disable control plane
map polls with "tailscale set --sync=false" (which survives restarts)
and "tailscale set --sync" to restore.
A high severity health warning is shown while this is active.
Updates #12639
Updates #17945
Change-Id: I83668fa5de3b5e5e25444df0815ec2a859153a6d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Let's fix all the typos, which lets the code be more readable, lest we
confuse our readers.
Updates #cleanup
Change-Id: I4954601b0592b1fda40269009647bb517a4457be
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This requires making the internals of LocalBackend a bit more generic,
and implementing the `tka.CompactableChonk` interface for `tka.Mem`.
Signed-off-by: Alex Chan <alexc@tailscale.com>
Updates https://github.com/tailscale/corp/issues/33599
Pick up a fix for https://pkg.go.dev/vuln/GO-2025-4116 (even though
we're not affected).
Updates #cleanup
Change-Id: I9f2571b17c1f14db58ece8a5a34785805217d9dd
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Includes adding StartPaused, which will be used in a future change to
enable netmap caching testing.
Updates #12639
Change-Id: Iec39915d33b8d75e9b8315b281b1af2f5d13a44a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This patch changes the behaviour of `tailscale lock log --json` to make
it more useful for users. It also introduces versioning of our JSON output.
## Changes to `tailscale lock log --json`
Previously this command would print the hash and base64-encoded bytes of
each AUM, and users would need their own CBOR decoder to interpret it in
a useful way:
```json
[
{
"Hash": [
80,
136,
151,
…
],
"Change": "checkpoint",
"Raw": "pAEFAvYFpQH2AopYIAkPN+8V3cJpkoC5ZY2+RI2Bcg2q5G7tRAQQd67W3YpnWCDPOo4KGeQBd8hdGsjoEQpSXyiPdlm+NXAlJ5dS1qEbFlggylNJDQM5ZQ2ULNsXxg2ZBFkPl/D93I1M56/rowU+UIlYIPZ/SxT9EA2Idy9kaCbsFzjX/s3Ms7584wWGbWd/f/QAWCBHYZzYiAPpQ+NXN+1Wn2fopQYk4yl7kNQcMXUKNAdt1lggcfjcuVACOH0J9pRNvYZQFOkbiBmLOW1hPKJsbC1D1GdYIKrJ38XMgpVMuTuBxM4YwoLmrK/RgXQw1uVEL3cywl3QWCA0FilVVv8uys8BNhS62cfNvCew1Pw5wIgSe3Prv8d8pFggQrwIt6ldYtyFPQcC5V18qrCnt7VpThACaz5RYzpx7RNYIKskOA7UoNiVtMkOrV2QoXv6EvDpbO26a01lVeh8UCeEA4KjAQECAQNYIORIdNHqSOzz1trIygnP5w3JWK2DtlY5NDIBbD7SKcjWowEBAgEDWCD27LpxiZNiA19k0QZhOWmJRvBdK2mz+dHu7rf0iGTPFwQb69Gt42fKNn0FGwRUiav/k6dDF4GiAVgg5Eh00epI7PPW2sjKCc/nDclYrYO2Vjk0MgFsPtIpyNYCWEDzIAooc+m45ay5PB/OB4AA9Fdki4KJq9Ll+PF6IJHYlOVhpTbc3E0KF7ODu1WURd0f7PXnW72dr89CSfGxIHAF"
}
]
```
Now we print the AUM in an expanded form that can be easily read by scripts,
although we include the raw bytes for verification and auditing.
```json
{
"SchemaVersion": "1",
"Messages": [
{
"Hash": "KCEJPRKNSXJG2TPH3EHQRLJNLIIK2DV53FUNPADWA7BZJWBDRXZQ",
"AUM": {
"MessageKind": "checkpoint",
"PrevAUMHash": null,
"Key": null,
"KeyID": null,
"State": {
…
},
"Votes": null,
"Meta": null,
"Signatures": [
{
"KeyID": "tlpub:e44874d1ea48ecf3d6dac8ca09cfe70dc958ad83b656393432016c3ed229c8d6",
"Signature": "8yAKKHPpuOWsuTwfzgeAAPRXZIuCiavS5fjxeiCR2JTlYaU23NxNChezg7tVlEXdH+z151u9na/PQknxsSBwBQ=="
}
]
},
"Raw": "pAEFAvYFpQH2AopYIAkPN-8V3cJpkoC5ZY2-RI2Bcg2q5G7tRAQQd67W3YpnWCDPOo4KGeQBd8hdGsjoEQpSXyiPdlm-NXAlJ5dS1qEbFlggylNJDQM5ZQ2ULNsXxg2ZBFkPl_D93I1M56_rowU-UIlYIPZ_SxT9EA2Idy9kaCbsFzjX_s3Ms7584wWGbWd_f_QAWCBHYZzYiAPpQ-NXN-1Wn2fopQYk4yl7kNQcMXUKNAdt1lggcfjcuVACOH0J9pRNvYZQFOkbiBmLOW1hPKJsbC1D1GdYIKrJ38XMgpVMuTuBxM4YwoLmrK_RgXQw1uVEL3cywl3QWCA0FilVVv8uys8BNhS62cfNvCew1Pw5wIgSe3Prv8d8pFggQrwIt6ldYtyFPQcC5V18qrCnt7VpThACaz5RYzpx7RNYIKskOA7UoNiVtMkOrV2QoXv6EvDpbO26a01lVeh8UCeEA4KjAQECAQNYIORIdNHqSOzz1trIygnP5w3JWK2DtlY5NDIBbD7SKcjWowEBAgEDWCD27LpxiZNiA19k0QZhOWmJRvBdK2mz-dHu7rf0iGTPFwQb69Gt42fKNn0FGwRUiav_k6dDF4GiAVgg5Eh00epI7PPW2sjKCc_nDclYrYO2Vjk0MgFsPtIpyNYCWEDzIAooc-m45ay5PB_OB4AA9Fdki4KJq9Ll-PF6IJHYlOVhpTbc3E0KF7ODu1WURd0f7PXnW72dr89CSfGxIHAF"
}
]
}
```
This output was previously marked as unstable, and it wasn't very useful,
so changing it should be fine.
## Versioning our JSON output
This patch introduces a way to version our JSON output on the CLI, so we
can make backwards-incompatible changes in future without breaking existing
scripts or integrations.
You can run this command in two ways:
```
tailscale lock log --json
tailscale lock log --json=1
```
Passing an explicit version number allows you to pick a specific JSON schema.
If we ever want to change the schema, we increment the version number and
users must opt-in to the new output.
A bare `--json` flag will always return schema version 1, for compatibility
with existing scripts.
Updates https://github.com/tailscale/tailscale/issues/17613
Updates https://github.com/tailscale/corp/issues/23258
Signed-off-by: Alex Chan <alexc@tailscale.com>
Change-Id: I897f78521cc1a81651f5476228c0882d7b723606
This adds the --proxy-protocol flag to 'tailscale serve' and
'tailscale funnel', which tells the Tailscale client to prepend a PROXY
protocol[1] header when making connections to the proxied-to backend.
I've verified that this works with our existing funnel servers without
additional work, since they pass along source address information via
PeerAPI already.
Updates #7747
[1]: https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
Change-Id: I647c24d319375c1b33e995555a541b7615d2d203
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
It's an unnecessary nuisance having it. We go out of our way to redact
it in so many places when we don't even need it there anyway.
Updates #12639
Change-Id: I5fc72e19e9cf36caeb42cf80ba430873f67167c3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Remove the State enum (StateNew, StateNotAuthenticated, etc.) from
controlclient and replace it with two explicit boolean fields:
- LoginFinished: indicates successful authentication
- Synced: indicates we've received at least one netmap
This makes the state more composable and easier to reason about, as
multiple conditions can be true independently rather than being
encoded in a single enum value.
The State enum was originally intended as the state machine for the
whole client, but that abstraction moved to ipn.Backend long ago.
This change continues moving away from the legacy state machine by
representing state as a combination of independent facts.
Also adds test helpers in ipnlocal that check independent, observable
facts (hasValidNetMap, needsLogin, etc.) rather than relying on
derived state enums, making tests more robust.
Updates #12639
Signed-off-by: James Tucker <james@tailscale.com>
The key.NewEmptyHardwareAttestationKey hook returns a non-nil empty
attestationKey, which means that the nil check in Clone doesn't trigger
and proceeds to try and clone an empty key. Check IsZero instead to
reduce log spam from Clone.
As a drive-by, make tpmAvailable check a sync.Once because the result
won't change.
Updates #17882
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Most /etc/os-release files set the VERSION_ID to a `MAJOR.MINOR`
string, but we were trying to compare this numerically against a major
version number. I can only assume that Linux Mint used switched from a
plain integer, since shells only do integer comparisons.
This patch extracts a VERSION_MAJOR from the VERSION_ID using
parameter expansion and unifies all the other ad-hoc comparisons to
use it.
Fixes#15841
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Co-authored-by: Xavier <xhienne@users.noreply.github.com>
LinkChangeLogLimiter keeps a subscription to track rate limits for log
messages. But when its context ended, it would exit the subscription loop,
leaving the subscriber still alive. Ensure the subscriber gets cleaned up
when the context ends, so we don't stall event processing.
Updates tailscale/corp#34311
Change-Id: I82749e482e9a00dfc47f04afbc69dd0237537cb2
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
On the corp tailnet (using Mullvad exit nodes + bunch of expired
devices + subnet routers), these were generating big ~35 KB blobs of
logging regularly.
This logging shouldn't even exist at this level, and should be rate
limited at a higher level, but for now as a bandaid, make it less
spammy.
Updates #cleanup
Change-Id: I0b5e9e6e859f13df5f982cd71cd5af85b73f0c0a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When TS_LOG_TARGET is set to an invalid URL, url.Parse returns an error
and nil pointer, which caused a panic when accessing u.Host.
Now we check the error from url.Parse and log a helpful message while
falling back to the default log host.
Fixes#17792
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
As a baby step towards eventbus-ifying controlclient, make the
Observer optional.
This also means callers that don't care (like this network lock test,
and some tests in other repos) can omit it, rather than passing in a
no-op one.
Updates #12639
Change-Id: Ibd776b45b4425c08db19405bc3172b238e87da4e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit replaces usage of local.Client in net/udprelay with DERPMap
plumbing over the eventbus. This has been a longstanding TODO. This work
was also accelerated by a memory leak in net/http when using
local.Client over long periods of time. So, this commit also addresses
said leak.
Updates #17801
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Instead of trying to call View() on something that's already a View
type (or trying to Clone the view unnecessarily), we can re-use the
existing View values in a map[T]ViewType.
Fixes#17866
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
They distracted me in some refactoring. They're set but never used.
Updates #17858
Change-Id: I6ec7d6841ab684a55bccca7b7cbf7da9c782694f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
updates tailscale/corp#31571
It appears that on the latest macOS, iOS and tVOS versions, the work
that netns is doing to bind outgoing connections to the default interface (and all
of the trimmings and workarounds in netmon et al that make that work) are
not needed. The kernel is extension-aware and doing nothing, is the right
thing. This is, however, not the case for tailscaled (which is not a
special process).
To allow us to test this assertion (and where it might break things), we add a
new node cap that turns this behaviour off only for network-extension equipped clients,
making it possible to turn this off tailnet-wide, without breaking any tailscaled
macos nodes.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
I noticed a deadlock in a test in a in-development PR where during a
shutdown storm of things (from a tsnet.Server.Close), LocalBackend was
trying to call magicsock.Conn.Synchronize but the magicsock and/or
eventbus was already shut down and no longer processing events.
Updates #16369
Change-Id: I58b1f86c8959303c3fb46e2e3b7f38f6385036f1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Unfortunately I closed the tab and lost it in my sea of CI failures
I'm currently fighting.
Updates #cleanup
Change-Id: I4e3a652d57d52b75238f25d104fc1987add64191
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So they're not all run N times on the sharded oss builders
and are only run one time each.
Updates tailscale/corp#28679
Change-Id: Ie21e84b06731fdc8ec3212eceb136c8fc26b0115
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When systemd notification support was omitted from the build, or on
non-Linux systems, we were unnecessarily emitting code and generating
garbage stringifying addresses upon transition to the Running state.
Updates #12614
Change-Id: If713f47351c7922bb70e9da85bf92725b25954b9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This removes one of the O(n=peers) allocs in getStatus, as
Engine.getStatus happens more often than Reconfig.
Updates #17814
Change-Id: I8a87fbebbecca3aedadba38e46cc418fd163c2b0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously if `chains` was empty, it would be passed to `computeActiveAncestor()`,
which would fail with the misleading error "multiple distinct chains".
Updates tailscale/corp#33846
Signed-off-by: Alex Chan <alexc@tailscale.com>
Change-Id: Ib93a755dbdf4127f81cbf69f3eece5a388db31c8
* lock released early just to call `b.send` when it can call
`b.sendToLocked` instead
* `UnlockEarly` called to release the lock before trivially fast
operations, we can wait for a defer there
Updates #11649
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
It was disabled in May 2024 in #12205 (9eb72bb51).
This removes the unused symbols.
Updates #188
Updates tailscale/corp#19106
Updates tailscale/corp#19116
Change-Id: I5208b7b750b18226ed703532ed58c4ea17195a8e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Use GetGlobalAddrs() to discover all STUN endpoints, handling bad NATs
that create multiple mappings. When MappingVariesByDestIP is true, also
add the first STUN IPv4 address with the relay's local port for static
port mapping scenarios.
Updates #17796
Signed-off-by: Raj Singh <raj@tailscale.com>
The feature is currently in private alpha, so requires a tailnet feature
flag. Initially focuses on supporting the operator's own auth, because the
operator is the only device we maintain that uses static long-lived
credentials. All other operator-created devices use single-use auth keys.
Testing steps:
* Create a cluster with an API server accessible over public internet
* kubectl get --raw /.well-known/openid-configuration | jq '.issuer'
* Create a federated OAuth client in the Tailscale admin console with:
* The issuer from the previous step
* Subject claim `system:serviceaccount:tailscale:operator`
* Write scopes services, devices:core, auth_keys
* Tag tag:k8s-operator
* Allow the Tailscale control plane to get the public portion of
the ServiceAccount token signing key without authentication:
* kubectl create clusterrolebinding oidc-discovery \
--clusterrole=system:service-account-issuer-discovery \
--group=system:unauthenticated
* helm install --set oauth.clientId=... --set oauth.audience=...
Updates #17457
Change-Id: Ib29c85ba97b093c70b002f4f41793ffc02e6c6e9
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Now that the feature is in beta, no one should encounter this error.
Updates #cleanup
Change-Id: I69ed3f460b7f28c44da43ce2f552042f980a0420
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This starts running the jsontags vet checker on the module.
All existing findings are adding to an allowlist.
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
The cmd/jsontags is non-idiomatic since it is not a main binary.
Move it to a vet directory, which will eventually contain a vettool binary.
Update tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Include the node's OS with network flow log information.
Refactor the JSON-length computation to be a bit more precise.
Updates tailscale/corp#33352Fixestailscale/corp#34030
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Prior to this change a SubscriberFunc treated the call to the subscriber's
function as the completion of delivery. But that means when we are closing the
subscriber, that callback could continue to execute for some time after the
close returns.
For channel-based subscribers that works OK because the close takes effect
before the subscriber ever sees the event. To make the two subscriber types
symmetric, we should also wait for the callback to finish before returning.
This ensures that a Close of the client means the same thing with both kinds of
subscriber.
Updates #17638
Change-Id: I82fd31bcaa4e92fab07981ac0e57e6e3a7d9d60b
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Add options to the eventbus.Bus to plumb in a logger.
Route that logger in to the subscriber machinery, and trigger a log message to
it when a subscriber fails to respond to its delivered events for 5s or more.
The log message includes the package, filename, and line number of the call
site that created the subscription.
Add tests that verify this works.
Updates #17680
Change-Id: I0546516476b1e13e6a9cf79f19db2fe55e56c698
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
In particular on Windows, the `transport.TPMCloser` we get is not safe
for concurrent use. This is especially noticeable because
`tpm.attestationKey.Clone` uses the same open handle as the original
key. So wrap the operations on ak.tpm with a mutex and make a deep copy
with a new connection in Clone.
Updates #15830
Updates #17662
Updates #17644
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Specify the app apability that failed the test, instead of the
entire comma-separated list.
Fixes #cleanup
Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
In #17639 we moved the subscription into NewLogger to ensure we would not race
subscribing with shutdown of the eventbus client. Doing so fixed that problem,
but exposed another: As we were only servicing events occasionally when waiting
for the network to come up, we could leave the eventbus to stall in cases where
a number of network deltas arrived later and weren't processed.
To address that, let's separate the concerns: As before, we'll Subscribe early
to avoid conflicts with shutdown; but instead of using the subscriber directly
to determine readiness, we'll keep track of the last-known network state in a
selectable condition that the subscriber updates for us. When we want to wait,
we'll wait on that condition (or until our context ends), ensuring all the
events get processed in a timely manner.
Updates #17638
Updates #15160
Change-Id: I28339a372be4ab24be46e2834a218874c33a0d2d
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Adds a new Redirect field to HTTPHandler for serving HTTP redirects
from the Tailscale serve config. The redirect URL supports template
variables ${HOST} and ${REQUEST_URI} that are resolved per request.
By default, it redirects using HTTP Status 302 (Found). For another
redirect status, like 301 - Moved Permanently, pass the HTTP status
code followed by ':' on Redirect, like: "301:https://tailscale.com"
Updates #11252
Updates #11330
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
In 3f5c560fd4 I changed to use std net/http's HTTP/2 support,
instead of pulling in x/net/http2.
But I forgot to update DialTLSContext to DialContext, which meant it
was falling back to using the std net.Dialer for its dials, instead
of the passed-in one.
The tests only passed because they were using localhost addresses, so
the std net.Dialer worked. But in prod, where a tsnet Dialer would be
needed, it didn't work, and would time out for 10 seconds before
resorting to the old protocol.
So this fixes the tests to use an isolated in-memory network to prevent
that class of problem in the future. With the test change, the old code
fails and the new code passes.
Thanks to @jasonodonnell for debugging!
Updates #17304
Updates 3f5c560fd4
Change-Id: I3602bafd07dc6548e2c62985af9ac0afb3a0e967
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Single letter 'l' variables can eventually become confusing when
they're rendered in some fonts that make them similar to 1 or I.
Updates #cleanup
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
A follow-up to #17411. Put AppConnector events into a task queue, as they may
take some time to process. Ensure that the queue is stopped at shutdown so that
cleanup will remain orderly.
Because events are delivered on a separate goroutine, slow processing of an
event does not cause an immediate problem; however, a subscriber that blocks
for a long time will push back on the bus as a whole. See
https://godoc.org/tailscale.com/util/eventbus#hdr-Expected_subscriber_behavior
for more discussion.
Updates #17192
Updates #15160
Change-Id: Ib313cc68aec273daf2b1ad79538266c81ef063e3
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This migrates an internal tool to open source
so that we can run it on the tailscale.com module as well.
This PR does not yet set up a CI to run this analyzer.
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
This rewrites the netlog package to support embedding node information in network flow logs.
Some bit of complexity comes in trying to pre-compute the expected size of the log message
after JSON serialization to ensure that we can respect maximum body limits in log uploading.
We also fix a bug in tstun, where we were recording the IP address after SNAT,
which was resulting in non-sensible connection flows being logged.
Updates tailscale/corp#33352
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
This migrates an internal tool to open source
so that we can run it on the tailscale.com module as well.
We add the "util/safediff" also as a dependency of the tool.
This PR does not yet set up a CI to run this analyzer.
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Found by staticcheck, the test was calling derphttp.NewClient but not checking
its error result before doing other things to it.
Updates #cleanup
Change-Id: I4ade35a7de7c473571f176e747866bc0ab5774db
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This reverts commit 4346615d77.
We averted the shutdown race, but will need to service the subscriber even when
we are not waiting for a change so that we do not delay the bus as a whole.
Updates #17638
Change-Id: I5488466ed83f5ad1141c95267f5ae54878a24657
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Drop usage of the branches filter with a single asterisk as this matches
against zero or more characters but not a forward slash, resulting in
PRs to branch names with forwards slashes in them not having these
workflow run against them as expected.
Updates https://github.com/tailscale/corp/issues/33523
Signed-off-by: Mario Minardi <mario@tailscale.com>
Also consolidates variable and header naming and amends the
CLI behavior
* multiple app-caps have to be specified as comma-separated
list
* simple regex-based validation of app capability names is
carried out during flag parsing
Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
Given that we filter based on the usercaps argument now, truncation
should not be necessary anymore.
Updates tailscale/corp/#28372
Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
Temporarily back out the TPM-based hw attestation code while we debug
Windows exceptions.
Updates tailscale/corp#31269
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
When the eventbus is enabled, set up the subscription for change deltas at the
beginning when the client is created, rather than waiting for the first
awaitInternetUp check.
Otherwise, it is possible for a check to race with the client close in
Shutdown, which triggers a panic.
Updates #17638
Change-Id: I461c07939eca46699072b14b1814ecf28eec750c
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This compares the warnings we actually care about and skips the unstable
warnings and the changes with no warnings.
Fixes#17635
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
If you run tailscaled without passing a `--statedir`, Tailnet Lock is
unavailable -- we don't have a folder to store the AUMs in.
This causes a lot of unnecessary requests to bootstrap TKA, because
every time the node receives a NetMap with some TKA state, it tries to
bootstrap, fetches the bootstrap TKA state from the control plane, then
fails with the error:
TKA sync error: bootstrap: network-lock is not supported in this
configuration, try setting --statedir
We can't prevent the error, but we can skip the control plane request
that immediately gets dropped on the floor.
In local testing, a new node joining a tailnet caused *three* control
plane requests which were unused.
Updates tailscale/corp#19441
Signed-off-by: Alex Chan <alexc@tailscale.com>
This fixes a regression from dd615c8fdd that moved the
newIPTablesRunner constructor from a any-Linux-GOARCH file to one that
was only amd64 and arm64, thus breaking iptables on other platforms
(notably 32-bit "arm", as seen on older Pis running Buster with
iptables)
Tested by hand on a Raspberry Pi 2 w/ Buster + iptables for now, for
lack of automated 32-bit arm tests at the moment. But filed #17629.
Fixes#17623
Updates #17629
Change-Id: Iac1a3d78f35d8428821b46f0fed3f3717891c1bd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
On some platforms e.g. ChromeOS the owner hierarchy might not always be
available to us. To avoid stale sealing exceptions later we probe to
confirm it's working rather than rely solely on family indicator status.
Updates #17622
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Check that the TPM we have opened is advertised as a 2.0 family device
before using it for state sealing / hardware attestation.
Updates #17622
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
This reformats the existing text to have line breaks at sentences. This
commit contains no textual changes to the code of conduct, but is done
to make any subsequent changes easier to review. (sembr.org)
Also apply prettier formatting for consistency.
Updates #cleanup
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
* When we do the TKA sync, log whether TKA is enabled and whether
we want it to be enabled. This would help us see if a node is
making bootstrap errors.
* When we fail to look up an AUM locally, log the ID of the AUM
rather than a generic "file does not exist" error.
These AUM IDs are cryptographic hashes of the TKA state, which
itself just contains public keys and signatures. These IDs aren't
sensitive and logging them is safe.
Signed-off-by: Alex Chan <alexc@tailscale.com>
Updates https://github.com/tailscale/corp/issues/33594
Service hosts must be tagged nodes, meaning it is only valid to
advertise a Service from a machine which has at least one ACL tag.
Fixestailscale/corp#33197
Signed-off-by: Harry Harpham <harry@tailscale.com>
If users start the application with sudo, DBUS is likely not available
or will not have the correct endpoints. We want to warn users when doing
this.
Closes#17593
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
This does not change which subscriptions are made, it only swaps them to use
the SubscribeFunc API instead of Subscribe.
Updates #15160
Updates #17487
Change-Id: Id56027836c96942206200567a118f8bcf9c07f64
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This patch creates a set of tests that should be true for all implementations of Chonk and CompactableChonk, which we can share with the SQLite implementation in corp.
It includes all the existing tests, plus a test for LastActiveAncestor which was in corp but not in oss.
Updates https://github.com/tailscale/corp/issues/33465
Signed-off-by: Alex Chan <alexc@tailscale.com>
Previously, running `tailscale lock log` in a tailnet without Tailnet
Lock enabled would return a potentially confusing error:
$ tailscale lock log
2025/10/20 11:07:09 failed to connect to local Tailscale service; is Tailscale running?
It would return this error even if Tailscale was running.
This patch fixes the error to be:
$ tailscale lock log
Tailnet Lock is not enabled
Fixes#17586
Signed-off-by: Alex Chan <alexc@tailscale.com>
Add new arguments to `tailscale up` so authkeys can be generated dynamically via identity federation.
Updates #9192
Signed-off-by: mcoulombe <max@tailscale.com>
* Remove a couple of single-letter `l` variables
* Use named struct parameters in the test cases for readability
* Delete `wantAfterInactivityForFn` parameter when it returns the
default zero
Updates #cleanup
Signed-off-by: Alex Chan <alexc@tailscale.com>
We soft-delete AUMs when they're purged, but when we call `ChildAUMs()`,
we look up soft-deleted AUMs to find the `Children` field.
This patch changes the behaviour of `ChildAUMs()` so it only looks at
not-deleted AUMs. This means we don't need to record child information
on AUMs any more, which is a minor space saving for any newly-recorded
AUMs.
Updates https://github.com/tailscale/tailscale/issues/17566
Updates https://github.com/tailscale/corp/issues/27166
Signed-off-by: Alex Chan <alexc@tailscale.com>
This method was added in cca25f6 in the initial in-memory implementation
of Chonk, but it's not part of the Chonk interface and isn't implemented
or used anywhere else. Let's get rid of it.
Updates https://github.com/tailscale/corp/issues/33465
Signed-off-by: Alex Chan <alexc@tailscale.com>
This commit modifies the k8s-operator's api proxy implementation to only
enable forwarding of api requests to tsrecorder when an environment
variable is set.
This new environment variable is named `TS_EXPERIMENTAL_KUBE_API_EVENTS`.
Updates https://github.com/tailscale/corp/issues/32448
Signed-off-by: David Bond <davidsbond93@gmail.com>
Merge the connstats package into the netlog package
and unexport all of its declarations.
Remove the buildfeatures.HasConnStats and use HasNetLog instead.
Updates tailscale/corp#33352
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
The connstats package was an unnecessary layer of indirection.
It was seperated out of wgengine/netlog so that net/tstun and
wgengine/magicsock wouldn't need a depenedency on the concrete
implementation of network flow logging.
Instead, we simply register a callback for counting connections.
This PR does the bare minimum work to prepare tstun and magicsock
to only care about that callback.
A future PR will delete connstats and merge it into netlog.
Updates tailscale/corp#33352
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Remove CBOR representation since it was never used.
We should support CBOR in the future, but for remove it
for now so that it is less work to add more fields.
Also, rely on just omitzero for JSON now that it is supported in Go 1.24.
Updates tailscale/corp#33352
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
I was debugging a customer issue and saw in their 1.88.3 logs:
TPM: error opening: stat /dev/tpm0: no such file or directory
That's unnecessary output. The lack of TPM will be reported by
them having a nil Hostinfo.TPM, which is plenty elsewhere in logs.
Let's only write out an "error opening" line if it's an interesting
error. (perhaps permissions, or EIO, etc)
Updates #cleanup
Change-Id: I3f987f6bf1d3ada03473ca3eef555e9cfafc7677
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Before synctest, timers was needed to allow the events to flow into the
test bus. There is still a timer, but this one is not derived from the
test deadline and it is mostly arbitrary as synctest will render it
practically non-existent.
With this approach, tests that do not need to test for the absence of
events do not rely on synctest.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
On Windows arm64 we are going to need to ship two different GUI builds;
one for Win10 (GOARCH=386) and one for Win11 (GOARCH=amd64, tags +=
winui). Due to quirks in MSI packaging, they cannot both share the
same filename. This requires some fixes in places where we have
hardcoded "tailscale-ipn" as the GUI filename.
We also do some cleanup in clientupdate to ensure that autoupdates
will continue to work correctly with the temporary "-winui" package
variant.
Fixes#17480
Updates https://github.com/tailscale/corp/issues/29940
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Extend Persist with AttestationKey to record a hardware-backed
attestation key for the node's identity.
Add a flag to tailscaled to allow users to control the use of
hardware-backed keys to bind node identity to individual machines.
Updates tailscale/corp#31269
Change-Id: Idcf40d730a448d85f07f1bebf387f086d4c58be3
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
The default representation of time.Duration has different
JSON representation between v1 and v2.
Apply an explicit format flag that uses the v1 representation
so that this behavior does not change if serialized with v2.
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
updates tailscale/tailscale#16836
Android's altNetInterfaces implementation now returns net.IPAddr
types which netmon wasn't handling.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
With a channel subscriber, the subscription processing always occurs on another
goroutine. The SubscriberFunc (prior to this commit) runs its callbacks on the
client's own goroutine. This changes the semantics, though: In addition to more
directly pushing back on the publisher, a publisher and subscriber can deadlock
in a SubscriberFunc but succeed on a Subscriber. They should behave
equivalently regardless which interface they use.
Arguably the caller should deal with this by creating its own goroutine if it
needs to. However, that loses much of the benefit of the SubscriberFunc API, as
it will need to manage the lifecycle of that goroutine. So, for practical
ergonomics, let's make the SubscriberFunc do this management on the user's
behalf. (We discussed doing this in #17432, but decided not to do it yet). We
can optimize this approach further, if we need to, without changing the API.
Updates #17487
Change-Id: I19ea9e8f246f7b406711f5a16518ef7ff21a1ac9
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This commit adds the subcommands `get-config` and `set-config` to Serve,
which can be used to read the current Tailscale Services configuration
in a standard syntax and provide a configuration to declaratively apply
with that same syntax.
Both commands must be provided with either `--service=svc:service` for
one service, or `--all` for all services. When writing a config,
`--set-config --all` will overwrite all existing Services configuration,
and `--set-config --service=svc:service` will overwrite all
configuration for that particular Service. Incremental changes are not
supported.
Fixestailscale/corp#30983.
cmd/tailscale/cli: hide serve "get-config"/"set-config" commands for now
tailscale/corp#33152 tracks unhiding them when docs exist.
Signed-off-by: Naman Sood <mail@nsood.in>
when tsrecorder receives events, it populates this field with
information about the node the request was sent to.
Updates #17141
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
The lazy init led to confusion and a belief that was something was
wrong. It's reasonable to expect the daemon to listen on the port at the
time it's configured.
Updates tailscale/corp#33094
Signed-off-by: Jordan Whited <jordan@tailscale.com>
I got sidetracked apparently and never finished writing this Clone
code in 316afe7d02 (#17448). (It really should use views instead.)
And then I missed one of the users of "routerChanged" that was broken up
into "routerChanged" vs "dnsChanged".
This broke integration tests elsewhere.
Fixes#17506
Change-Id: I533bf0fcf3da9ac6eb4a6cdef03b8df2c1fb4c8e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Update Nix flake to use go 1.25.2
Create the hash from the toolchain rev file automatically from
update-flake.sh
Updates tailscale/go#135
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
This patch fixes several issues related to printing login and device
approval URLs, especially when `tailscale up` is interrupted:
1. Only print a login URL that will cause `tailscale up` to complete.
Don't print expired URLs or URLs from previous login attempts.
2. Print the device approval URL if you run `tailscale up` after
previously completing a login, but before approving the device.
3. Use the correct control URL for device approval if you run a bare
`tailscale up` after previously completing a login, but before
approving the device.
4. Don't print the device approval URL more than once (or at least,
not consecutively).
Updates tailscale/corp#31476
Updates #17361
## How these fixes work
This patch went through a lot of trial and error, and there may still
be bugs! These notes capture the different scenarios and considerations
as we wrote it, which are also captured by integration tests.
1. We were getting stale login URLs from the initial IPN state
notification.
When the IPN watcher was moved to before Start() in c011369, we
mistakenly continued to request the initial state. This is only
necessary if you start watching after you call Start(), because
you may have missed some notifications.
By getting the initial state before calling Start(), we'd get
a stale login URL. If you clicked that URL, you could complete
the login in the control server (if it wasn't expired), but your
instance of `tailscale up` would hang, because it's listening for
login updates from a different login URL.
In this patch, we no longer request the initial state, and so we
don't print a stale URL.
2. Once you skip the initial state from IPN, the following sequence:
* Run `tailscale up`
* Log into a tailnet with device approval
* ^C after the device approval URL is printed, but without approving
* Run `tailscale up` again
means that nothing would ever be printed.
`tailscale up` would send tailscaled the pref `WantRunning: true`,
but that was already the case so nothing changes. You never get any
IPN notifications, and in particular you never get a state change to
`NeedsMachineAuth`. This means we'd never print the device approval URL.
In this patch, we add a hard-coded rule that if you're doing a simple up
(which won't trigger any other IPN notifications) and you start in the
`NeedsMachineAuth` state, we print the device approval message without
waiting for an IPN notification.
3. Consider the following sequence:
* Run `tailscale up --login-server=<custom server>`
* Log into a tailnet with device approval
* ^C after the device approval URL is printed, but without approving
* Run `tailscale up` again
We'd print the device approval URL for the default control server,
rather than the real control server, because we were using the `prefs`
from the CLI arguments (which are all the defaults) rather than the
`curPrefs` (which contain the custom login server).
In this patch, we use the `prefs` if the user has specified any settings
(and other code will ensure this is a complete set of settings) or
`curPrefs` if it's a simple `tailscale up`.
4. Consider the following sequence: you've logged in, but not completed
device approval, and you run `down` and `up` in quick succession.
* `up`: sees state=NeedsMachineAuth
* `up`: sends `{wantRunning: true}`, prints out the device approval URL
* `down`: changes state to Stopped
* `up`: changes state to Starting
* tailscaled: changes state to NeedsMachineAuth
* `up`: gets an IPN notification with the state change, and prints
a second device approval URL
Either URL works, but this is annoying for the user.
In this patch, we track whether the last printed URL was the device
approval URL, and if so, we skip printing it a second time.
Signed-off-by: Alex Chan <alexc@tailscale.com>
This patch extends the integration tests for `tailscale up` to include tailnets
where new devices need to be approved. It doesn't change the CLI, because it's
mostly working correctly already -- these tests are just to prevent future
regressions.
I've added support for `MachineAuthorized` to mock control, and I've refactored
`TestOneNodeUpAuth` to be more flexible. It now takes a sequence of steps to
run and asserts whether we got a login URL and/or machine approval URL after
each step.
Updates tailscale/corp#31476
Updates #17361
Signed-off-by: Alex Chan <alexc@tailscale.com>
This commit also shuffles the hasPeerRelayServers atomic load
to happen sooner, reducing the cost for clients with no peer relay
servers.
Updates tailscale/corp#33099
Signed-off-by: Jordan Whited <jordan@tailscale.com>
The hijacker on k8s-proxy's reverse proxy is used to stream recordings
to tsrecorder as they pass through the proxy to the kubernetes api
server. The connection to the recorder was using the client's
(e.g., kubectl) context, rather than a dedicated one. This was causing
the recording stream to get cut off in scenarios where the client
cancelled the context before streaming could be completed.
By using a dedicated context, we can continue streaming even if the
client cancels the context (for example if the client request
completes).
Fixes#17404
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Originally proposed by @bradfitz in #17413.
In practice, a lot of subscribers have only one event type of interest, or a
small number of mostly independent ones. In that case, the overhead of running
and maintaining a goroutine to select on multiple channels winds up being more
noisy than we'd like for the user of the API.
For this common case, add a new SubscriberFunc[T] type that delivers events to
a callback owned by the subscriber, directly on the goroutine belonging to the
client itself. This frees the consumer from the need to maintain their own
goroutine to pull events from the channel, and to watch for closure of the
subscriber.
Before:
s := eventbus.Subscribe[T](eventClient)
go func() {
for {
select {
case <-s.Done():
return
case e := <-s.Events():
doSomethingWith(e)
}
}
}()
// ...
s.Close()
After:
func doSomethingWithT(e T) { ... }
s := eventbus.SubscribeFunc(eventClient, doSomethingWithT)
// ...
s.Close()
Moreover, unless the caller wants to explicitly stop the subscriber separately
from its governing client, it need not capture the SubscriberFunc value at all.
One downside of this approach is that a slow or deadlocked callback could block
client's service routine and thus stall all other subscriptions on that client,
However, this can already happen more broadly if a subscriber fails to service
its delivery channel in a timely manner, it just feeds back more immediately.
Updates #17487
Change-Id: I64592d786005177aa9fd445c263178ed415784d5
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Since #17376, containerboot crashes on startup in k8s because state
encryption is enabled by default without first checking that it's
compatible with the selected state store. Make sure we only default
state encryption to enabled if it's not going to immediately clash with
other bits of tailscaled config.
Updates tailscale/corp#32909
Change-Id: I76c586772750d6da188cc97b647c6e0c1a8734f0
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Saves ~94 KB from the min build.
Updates #12614
Change-Id: I3b0b8a47f80b9fd3b1038c2834b60afa55bf02c2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Part of making all netlink monitoring code optional.
Updates #17311 (how I got started down this path)
Updates #12614
Change-Id: Ic80d8a7a44dc261c4b8678b3c2241c3b3778370d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Also pull out interface method only needed in Linux.
Instead of having userspace do the call into the router, just let the
router pick up the change itself.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Before we introduced seamless, the "blocked" state was used to track:
* Whether a login was required for connectivity, and therefore we should
keep the engine deconfigured until that happened
* Whether authentication was in progress
"blocked" would stop authReconfig from running. We want this when a login is
required: if your key has expired we want to deconfigure the engine and keep
it down, so that you don't keep using exit nodes (which won't work because
your key has expired).
Taking the engine down while auth was in progress was undesirable, so we
don't do that with seamless renewal. However, not entering the "blocked"
state meant that we needed to change the logic for when to send
LoginFinished on the IPN bus after seeing StateAuthenticated from the
controlclient. Initially we changed the "if blocked" check to "if blocked or
seamless is enabled" which was correct in other places.
In this place however, it introduced a bug: we are sending LoginFinished
every time we see StateAuthenticated, which happens even on a down & up, or
a profile switch. This in turn made it harder for UI clients to track when
authentication is complete.
Instead we should only send it out if we were blocked (i.e. seamless is
disabled, or our key expired) or an auth was in progress.
Updates tailscale/corp#31476
Updates tailscale/corp#32645
Fixes#17363
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
Saves 45 KB from the min build, no longer pulling in deephash or
util/hashx, both with unsafe code.
It can actually be more efficient to not use deephash, as you don't
have to walk all bytes of all fields recursively to answer that two
things are not equal. Instead, you can just return false at the first
difference you see. And then with views (as we use ~everywhere
nowadays), the cloning the old value isn't expensive, as it's just a
pointer under the hood.
Updates #12614
Change-Id: I7b08616b8a09b3ade454bb5e0ac5672086fe8aec
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Historically, and until recently, --extra-small produced a usable build.
When I recently made osrouter be modular in 39e35379d4 (which is
useful in, say, tsnet builds) after also making netstack modular, that
meant --min now lacked both netstack support for routing and system
support for routing, making no way to get packets into
wireguard. That's not a nice default to users. (we've documented
build_dist.sh in our KB)
Restore --extra-small to making a usable build, and add --min for
benchmarking purposes.
Updates #12614
Change-Id: I649e41e324a36a0ca94953229c9914046b5dc497
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Some of the test cases access fields of the backend that are supposed to be
locked while the test is running, which can trigger the race detector. I fixed
a few of these in #17411, but I missed these two cases.
Updates #15160
Updates #17192
Change-Id: I45664d5e34320ecdccd2844e0f8b228145aaf603
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Saves ~53 KB from the min build.
Updates #12614
Change-Id: I73f9544a9feea06027c6ebdd222d712ada851299
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add subscribers for AppConnector events
Make the RouteAdvertiser interface optional We cannot yet remove it because
the tests still depend on it to verify correctness. We will need to separately
update the test fixtures to remove that dependency.
Publish RouteInfo via the event bus, so we do not need a callback to do that.
Replace it with a flag that indicates whether to treat the route info the connector
has as "definitive" for filtering purposes.
Update the tests to simplify the construction of AppConnector values now that a
store callback is no longer required. Also fix a couple of pre-existing racy tests that
were hidden by not being concurrent in the same way production is.
Updates #15160
Updates #17192
Change-Id: Id39525c0f02184e88feaf0d8a3c05504850e47ee
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
If we received a wg engine status while processing an auth URL, there was a
race condition where the authURL could be reset to "" immediately after we
set it.
To fix this we need to check that we are moving from a non-Running state to
a Running state rather than always resetting the URL when we "move" into a
Running state even if that is the current state.
We also need to make sure that we do not return from stopEngineAndWait until
the engine is stopped: before, we would return as soon as we received any
engine status update, but that might have been an update already in-flight
before we asked the engine to stop. Now we wait until we see an update that
is indicative of a stopped engine, or we see that the engine is unblocked
again, which indicates that the engine stopped and then started again while
we were waiting before we checked the state.
Updates #17388
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
Co-authored-by: Nick Khyl <nickk@tailscale.com>
Saves ~102 KB from the min build.
Updates #12614
Change-Id: Ie1d4f439321267b9f98046593cb289ee3c4d6249
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Due to iOS memory limitations in 2020 (see
https://tailscale.com/blog/go-linker, etc) and wireguard-go using
multiple goroutines per peer, commit 16a9cfe2f4 introduced some
convoluted pathsways through Tailscale to look at packets before
they're delivered to wireguard-go and lazily reconfigure wireguard on
the fly before delivering a packet, only telling wireguard about peers
that are active.
We eventually want to remove that code and integrate wireguard-go's
configuration with Tailscale's existing netmap tracking.
To make it easier to find that code later, this makes it modular. It
saves 12 KB (of disk) to turn it off (at the expense of lots of RAM),
but that's not really the point. The point is rather making it obvious
(via the new constants) where this code even is.
Updates #12614
Change-Id: I113b040f3e35f7d861c457eaa710d35f47cee1cb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Explain that this file stays forked from coder/websocket until we can
depend on an upstream release for the helper.
Updates #cleanup
Signed-off-by: kscooo <kscowork@gmail.com>
Switching to a Geneve-encapsulated (peer relay) path in
endpoint.handlePongConnLocked is expected around port rebinds, which end
up clearing endpoint.bestAddr.
Fixestailscale/corp#33036
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Saves only 12 KB, but notably removes some deps on packages that future
changes can then eliminate entirely.
Updates #12614
Change-Id: Ibf830d3ee08f621d0a2011b1d4cd175427ef50df
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
c2n was already a conditional feature, but it didn't have a
feature/c2n directory before (rather, it was using consts + DCE). This
adds it, and moves some code, which removes the httprec dependency.
Also, remove some unnecessary code from our httprec fork.
Updates #12614
Change-Id: I2fbe538e09794c517038e35a694a363312c426a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
As found by @cmol in #17423.
Updates #17423
Change-Id: I1492501f74ca7b57a8c5278ea6cb87a56a4086b9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Saves 86 KB.
And stop depending on expvar and usermetrics when disabled,
in prep to removing all the expvar/metrics/tsweb stuff.
Updates #12614
Change-Id: I35d2479ddd1d39b615bab32b1fa940ae8cbf9b11
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This patch removes some code that didn’t get removed before merging
the changes in #16580.
Updates #cleanup
Updates #16551
Signed-off-by: Simon Law <sfllaw@tailscale.com>
kubestore init function has now been moved to a more explicit path of
ipn/store/kubestore meaning we can now avoid the generic import of
feature/condregister.
Updates #12614
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
When running integration tests on macOS, we get a panic from a nil
pointer dereference when calling `ci.creds.PID()`.
This panic occurs because the `ci.creds != nil` check is insufficient
after a recent refactoring (c45f881) that changed `ci.creds` from a
pointer to the `PeerCreds` interface. Now `ci.creds` always compares as
non-nil, so we enter this block even when the underlying value is nil.
The integration tests fail on macOS when `peercred.Get()` returns the
error `unix.GetsockoptInt: socket is not connected`. This error isn't
new, and the previous code was ignoring it correctly.
Since we trust that `peercred` returns either a usable value or an error,
checking for a nil error is a sufficient and correct gate to prevent the
method call and avoid the panic.
Fixes#17421
Signed-off-by: Alex Chan <alexc@tailscale.com>
In the earlier http2 package migration (1d93bdce20, #17394) I had
removed Direct.Close's tracking of the connPool, thinking it wasn't
necessary.
Some tests (in another repo) are strict and like it to tear down the
world and wait, to check for leaked goroutines. And they caught this
letting some goroutines idle past Close, even if they'd eventually
close down on their own.
This restores the connPool accounting and the aggressife close.
Updates #17305
Updates #17394
Change-Id: I5fed283a179ff7c3e2be104836bbe58b05130cc7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The control plane will sometimes determine that a node is not online,
while the node is still able to connect to its peers. This patch
doesn’t solve this problem, but it does mitigate it.
This PR introduces the `client-side-reachability` node attribute that
switches the node to completely ignore the online signal from control.
In the future, the client itself should collect reachability data from
active Wireguard flows and Tailscale pings.
Updates #17366
Updates tailscale/corp#30379
Updates tailscale/corp#32686
Signed-off-by: Simon Law <sfllaw@tailscale.com>
A recent change (009d702adf) introduced a deadlock where the
/machine/update-health network request to report the client's health
status update to the control plane was moved to being synchronous
within the eventbus's pump machinery.
I started to instead make the health reporting be async, but then we
realized in the three years since we added that, it's barely been used
and doesn't pay for itself, for how many HTTP requests it makes.
Instead, delete it all and replace it with a c2n handler, which
provides much more helpful information.
Fixestailscale/corp#32952
Change-Id: I9e8a5458269ebfdda1c752d7bbb8af2780d71b04
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Saves 262 KB so far. I'm sure I missed some places, but shotizam says
these were the low hanging fruit.
Updates #12614
Change-Id: Ia31c01b454f627e6d0470229aae4e19d615e45e3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Maybe it matters? At least globally across all nodes?
Fixes#17343
Change-Id: I3f61758ea37de527e16602ec1a6e453d913b3195
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add and wire up event publishers for these two event types in the AppConnector.
Nothing currently subscribes to them, so this is harmless. Subscribers for
these events will be added in a near-future commit.
As part of this, move the appc.RouteInfo type to the types/appctype package.
It does not contain any package-specific details from appc. Beside it, add
appctype.RouteUpdate to carry route update event state, likewise not specific
to appc. Update all usage of the appc.* types throughout to use appctype.*
instead, and update depaware files to reflect these changes.
Add a Close method to the AppConnector to make sure the client gets cleaned up
when the connector is dropped (we re-create connectors).
Update the unit tests in the appc package to also check the events published
alongside calls to the RouteAdvertiser.
For now the tests still rely on the RouteAdvertiser for correctness; this is OK
for now as the two methods are always performed together. In the near future,
we need to rework the tests so not require that, but that will require building
some more test fixtures that we can handle separately.
Updates #15160
Updates #17192
Change-Id: I184670ba2fb920e0d2cb2be7c6816259bca77afe
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Instead of using separate channels to manage the lifecycle of the eventbus
client, use the recently-added eventbus.Monitor, which handles signaling the
processing loop to stop and waiting for it to complete. This allows us to
simplify some of the setup and cleanup code in the relay server.
Updates #15160
Change-Id: Ia1a47ce2e5a31bc8f546dca4c56c3141a40d67af
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Saves 352 KB, removing one of our two HTTP/2 implementations linked
into the binary.
Fixes#17305
Updates #15015
Change-Id: I53a04b1f2687dca73c8541949465038b69aa6ade
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a .gitignore for the chart version of the CRDs that we never commit,
because the static manifest CRD files are the canonical version. This
makes it easier to deploy the CRDs via the helm chart in a way that
reflects the production workflow without making the git checkout
"dirty".
Given that the chart CRDs are ignored, we can also now safely generate
them for the kube-generate-all Makefile target without being a nuisance
to the state of the git checkout. Added a slightly more robust repo root
detection to the generation logic to make sure the command works from
the context of both the Makefile and the image builder command we run
for releases in corp.
Updates tailscale/corp#32085
Change-Id: Id44a4707c183bfaf95a160911ec7a42ffb1a1287
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
mkctr already has support for including extra files in the built
container image. Wire up a new optional environment variable to thread
that through to mkctr. The operator e2e tests will use this to bake
additional trusted CAs into the test image without significantly
departing from the normal build or deployment process for our
containers.
Updates tailscale/corp#32085
Change-Id: Ica94ed270da13782c4f5524fdc949f9218f79477
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Using memnet and synctest removes flakiness caused by real networking
and subtle timing differences.
Additionally, remove the `t.Logf` call inside the server's shutdown
goroutine that was causing a false positive data race detection.
The race detector is flagging a double write during this `t.Logf` call.
This is a common pattern, noted in golang/go#40343 and elsehwere in
this file, where using `t.Logf` after a test has finished can interact
poorly with the test runner.
This is a long-standing issue which became more common after rewriting
this test to use memnet and synctest.
Fixed#17355
Signed-off-by: Alex Chan <alexc@tailscale.com>
Whenever running on a platform that has a TPM (and tailscaled can access
it), default to encrypting the state. The user can still explicitly set
this flag to disable encryption.
Updates https://github.com/tailscale/corp/issues/32909
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
A following change will split out the controlclient.NoiseClient type
out, away from the rest of the controlclient package which is
relatively dependency heavy.
A question was where to move it, and whether to make a new (a fifth!)
package in the ts2021 dependency chain.
@creachadair and I brainstormed and decided to merge
internal/noiseconn and controlclient.NoiseClient into one package,
with names ts2021.Conn and ts2021.Client.
For ease of reviewing the subsequent PR, this is the first step that
just renames the internal/noiseconn package to control/ts2021.
Updates #17305
Change-Id: Ib5ea162dc1d336c1d805bdd9548d1702dd6e1468
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
depaware was merging golang.org/x/foo and std's
vendor/golang.org/x/foo packages (which could both be in the binary!),
leading to confusing output, especially when I was working on
eliminating duplicate packages imported under different names.
This makes the depaware output longer and grosser, but doesn't hide
reality from us.
Updates #17305
Change-Id: I21cc3418014e127f6c1a81caf4e84213ce84ab57
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Require the presence of the bus, but do not use it yet. Check for required
fields and update tests and production use to plumb the necessary arguments.
Updates #15160
Updates #17192
Change-Id: I8cefd2fdb314ca9945317d3320bd5ea6a92e8dcb
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
The callback itself is not removed as it is used in other repos, making
it simpler for those to slowly transition to the eventbus.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Replace the positional arguments to NewAppConnector with a Config struct.
Update the existing uses. Other than the API change, there are no functional
changes in this commit.
Updates #15160
Updates #17192
Change-Id: Ibf37f021372155a4db8aaf738f4b4f2c746bf623
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
It never launched and I've lost hope of it launching and it's in my
way now, so I guess it's time to say goodbye.
Updates tailscale/corp#4383
Updates #17305
Change-Id: I2eb551d49f2fb062979cc307f284df4b3dfa5956
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This permits other programs (in other repos) to conditionally
import ipn/store/awsstore and/or ipn/store/kubestore and have them
register themselves, rather than feature/condregister doing it.
Updates tailscale/corp#32922
Change-Id: I2936229ce37fd2acf9be5bf5254d4a262d090ec1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The `put` callback runs on a different goroutine to the test, so calling
t.Fatalf in put had no effect. `drain` is always called when checking what
was put and is called from the test goroutine, so that's a good place to
fail the test if the channel was too full.
Updates #17363
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
https://github.com/tailscale/tailscale/pull/17346 moved the kube and aws
arn store initializations to feature/condregister, under the assumption
that anything using it would use kubestore.New. Unfortunately,
cmd/k8s-proxy makes use of store.New, which compares the `<prefix>:`
supplied in the provided `path string` argument against known stores. If
it doesn't find it, it fallsback to using a FileStore.
Since cmd/k8s-proxy uses store.New to try and initialize a kube store in
some cases (without importing feature/condregister), it silently creates
a FileStore and that leads to misleading errors further along in
execution.
This fixes this issue by importing condregister, and successfully
initializes a kube store.
Updates #12614
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Saves 442 KB. Lock it with a new min test.
Updates #12614
Change-Id: Ia7bf6f797b6cbf08ea65419ade2f359d390f8e91
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We will need this for unmarshaling node prefs: use the zero
HardwareAttestationKey implementation when parsing and later check
`IsZero` to see if anything was loaded.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
I'm trying to remove the "regexp" and "regexp/syntax" packages from
our minimal builds. But tsweb pulls in regexp (via net/http/pprof etc)
and util/eventbus was importing the tsweb for no reason.
Updates #12614
Change-Id: Ifa8c371ece348f1dbf80d6b251381f3ed39d5fbd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Even with ts_omit_drive, the drive package is currently still imported
for some types. So it should be light. But it was depending on the
"regexp" packge, which I'd like to remove from our minimal builds.
Updates #12614
Change-Id: I5bf85d8eb15a739793723b1da11c370d3fcd2f32
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The Tailscale CLI is the primary configuration interface and as such it
is used in scripts, container setups, and many other places that do not
have a terminal available and should not be made to respond to prompts.
The default is set to false where the "risky" API is being used by the
CLI and true otherwise, this means that the `--yes` flags are only
required under interactive runs and scripts do not need to be concerned
with prompts or extra flags.
Updates #19445
Signed-off-by: James Tucker <james@tailscale.com>
Saves 139 KB.
Also Synology support, which I saw had its own large-ish proxy parsing
support on Linux, but support for proxies without Synology proxy
support is reasonable, so I pulled that out as its own thing.
Updates #12614
Change-Id: I22de285a3def7be77fdcf23e2bec7c83c9655593
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In Dec 2021 in d3d503d997 I had grand plans to make exit node DNS
cheaper by using HTTP/2 over PeerAPI, at least on some platforms. I
only did server-side support though and never made it to the client.
In the ~4 years since, some things have happened:
* Go 1.24 got support for http.Protocols (https://pkg.go.dev/net/http#Protocols)
and doing UnencryptedHTTP2 ("HTTP2 with prior knowledge")
* The old h2c upgrade mechanism was deprecated; see https://github.com/golang/go/issues/63565
and https://github.com/golang/go/issues/67816
* Go plans to deprecate x/net/http2 and move everything to the standard library.
So this drops our use of the x/net/http2/h2c package and instead
enables h2c (on all platforms now) using the standard library.
This does mean we lose the deprecated h2c Upgrade support, but that's
fine.
If/when we do the h2c client support for ExitDNS, we'll have to probe
the peer to see whether it supports it. Or have it reply with a header
saying that future requests can us h2c. (It's tempting to use capver,
but maybe people will disable that support anyway, so we should
discover it at runtime instead.)
Also do the same in the sessionrecording package.
Updates #17305
Change-Id: If323f5ef32486effb18ed836888aa05c0efb701e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Saves 328 KB (2.5%) off the minimal binary.
For IoT devices that don't need MagicDNS (e.g. they don't make
outbound connections), this provides a knob to disable all the DNS
functionality.
Rather than a massive refactor today, this uses constant false values
as a deadcode sledgehammer, guided by shotizam to find the largest DNS
functions which survived deadcode.
A future refactor could make it so that the net/dns/resolver and
publicdns packages don't even show up in the import graph (along with
their imports) but really it's already pretty good looking with just
these consts, so it's not at the top of my list to refactor it more
soon.
Also do the same in a few places with the ACME (cert) functionality,
as I saw those while searching for DNS stuff.
Updates #12614
Change-Id: I8e459f595c2fde68ca16503ff61c8ab339871f97
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
DNS configuration support to ProxyClass, allowing users to customize DNS resolution for Tailscale proxy pods.
Fixes#16886
Signed-off-by: Raj Singh <raj@tailscale.com>
When I added dependency support to featuretag, I broke the handling of
the non-omit build tags (as used by the "box" support for bundling the
CLI into tailscaled). That then affected depaware. The
depaware-minbox.txt this whole time recently has not included the CLI.
So fix that, and also add a new depaware variant that's only the
daemon, without the CLI.
Updates #12614
Updates #17139
Change-Id: I4a4591942aa8c66ad8e3242052e3d9baa42902ca
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Otherwise they're uselessly imported by tsnet applications, even
though they do nothing. tsnet applications wanting to use these
already had to explicitly import them and use kubestore.New or
awsstore.New and assign those to their tsnet.Server.Store fields.
Updates #12614
Change-Id: I358e3923686ddf43a85e6923c3828ba2198991d4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Listen address reuse is allowed as soon as the previous listener is
closed. There is no attempt made to emulate more complex address reuse
logic.
Updates tailscale/corp#28078
Change-Id: I56be1c4848e7b3f9fc97fd4ef13a2de9dcfab0f2
Signed-off-by: Brian Palmer <brianp@tailscale.com>
So wgengine/router is just the docs + entrypoint + types, and then
underscore importing wgengine/router/osrouter registers the constructors
with the wgengine/router package.
Then tsnet can not pull those in.
Updates #17313
Change-Id: If313226f6987d709ea9193c8f16a909326ceefe7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Allow the user to access information about routes an app connector has
learned, such as how many routes for each domain.
Fixestailscale/corp#32624
Signed-off-by: Fran Bull <fran@tailscale.com>
Removes 434 KB from the minimal Linux binary, or ~3%.
Primarily this comes from not linking in the zstd encoding code.
Fixes#17323
Change-Id: I0a90de307dfa1ad7422db7aa8b1b46c782bfaaf7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit modifies the `DNSConfig` custom resource to allow specifying
a replica count when deploying a nameserver. This allows deploying
nameservers in a HA configuration.
Updates https://github.com/tailscale/corp/issues/32589
Signed-off-by: David Bond <davidsbond93@gmail.com>
As of the earlier 85febda86d, our new preferred zstd API of choice
is zstdframe.
Updates #cleanup
Updates tailscale/corp#18514
Change-Id: I5a6164d3162bf2513c3673b6d1e34cfae84cb104
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It has nothing to do with logtail and is confusing named like that.
Updates #cleanup
Updates #17323
Change-Id: Idd34587ba186a2416725f72ffc4c5778b0b9db4a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Now cmd/derper doesn't depend on iptables, nftables, and netlink code :)
But this is really just a cleanup step I noticed on the way to making
tsnet applications able to not link all the OS router code which they
don't use.
Updates #17313
Change-Id: Ic7b4e04e3a9639fd198e9dbeb0f7bae22a4a47a9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This PR cleans up a bunch of things in ./tstest/integration/vms:
- Bumps version of Ubuntu that's actually run from CI 20.04 -> 24.04
- Removes Ubuntu 18.04 test
- Bumps NixOS 21.05 -> 25.05
Updates#cleanup
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The dnstype package is used by tailcfg, which tries to be light and
leafy. But it brings in dnstype. So dnstype shouldn't bring in
x/net/dns/dnsmessage.
Updates #12614
Change-Id: I043637a7ce7fed097e648001f13ca1927a781def
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I noticed this while modularizing clientupdate. With this in first,
moving clientupdate to be modular removes a bunch more stuff from
the minimal build + tsnet.
Updates #17115
Change-Id: I44bd055fca65808633fd3a848b0bbc09b00ad4fa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
As part of making Tailscale's gvisor dependency optional for small builds,
this was one of the last places left that depended on gvisor. Just copy
the couple functions were were using.
Updates #17283
Change-Id: Id2bc07ba12039afe4c8a3f0b68f4d76d1863bbfe
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Baby steps. This permits building without much of gvisor, but not all of it.
Updates #17283
Change-Id: I8433146e259918cc901fe86b4ea29be22075b32c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This only saves ~32KB in the minimal linux/amd64 binary, but it's a
step towards permitting not depending on gvisor for small builds.
Updates #17283
Change-Id: Iae8da5e9465127de354dbcaf25e794a6832d891b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We can only register one key implementation per process. When running on
macOS or Android, trying to register a separate key implementation from
feature/tpm causes a panic.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
On platforms that are causing EPIPE at a high frequency this is
resulting in non-working connections, for example when Apple decides to
forcefully close UDP sockets due to an unsoliced packet rejection in the
firewall.
Too frequent rebinds cause a failure to solicit the endpoints triggering
the rebinds, that would normally happen via CallMeMaybe.
Updates #14551
Updates tailscale/corp#25648
Signed-off-by: James Tucker <james@tailscale.com>
This commit fixes a race condition where `tailscale up --force-reauth` would
exit prematurely on an already-logged in device.
Previously, the CLI would wait for IPN to report the "Running" state and then
exit. However, this could happen before the new auth URL was printed, leading
to two distinct issues:
* **Without seamless key renewal:** The CLI could exit immediately after
the `StartLoginInteractive` call, before IPN has time to switch into
the "Starting" state or send a new auth URL back to the CLI.
* **With seamless key renewal:** IPN stays in the "Running" state
throughout the process, so the CLI exits immediately without performing
any reauthentication.
The fix is to change the CLI's exit condition.
Instead of waiting for the "Running" state, if we're doing a `--force-reauth`
we now wait to see the node key change, which is a more reliable indicator
that a successful authentication has occurred.
Updates tailscale/corp#31476
Updates tailscale/tailscale#17108
Signed-off-by: Alex Chan <alexc@tailscale.com>
This partially reverts f3d2fd2.
When that patch was written, the goroutine that responds to IPN notifications
could call `StartLoginInteractive`, creating a race condition that led to
flaky integration tests. We no longer call `StartLoginInteractive` in that
goroutine, so the race is now impossible.
Moving the `WatchIPNBus` call earlier ensures the CLI gets all necessary
IPN notifications, preventing a reauth from hanging.
Updates tailscale/corp#31476
Signed-off-by: Alex Chan <alexc@tailscale.com>
A customer wants to allow their employees to restart tailscaled at will, when access rights and MDM policy allow it,
as a way to fully reset client state and re-create the tunnel in case of connectivity issues.
On Windows, the main tailscaled process runs as a child of a service process. The service restarts the child
when it exits (or crashes) until the service itself is stopped. Regular (non-admin) users can't stop the service,
and allowing them to do so isn't ideal, especially in managed or multi-user environments.
In this PR, we add a LocalAPI endpoint that instructs ipnserver.Server, and by extension the tailscaled process,
to shut down. The service then restarts the child tailscaled. Shutting down tailscaled requires LocalAPI write access
and an enabled policy setting.
Updates tailscale/corp#32674
Updates tailscale/corp#32675
Signed-off-by: Nick Khyl <nickk@tailscale.com>
And yay: tsnet (and thus k8s-operator etc) no longer depends on
portlist! And LocalBackend is smaller.
Removes 50 KB from the minimal binary.
Updates #12614
Change-Id: Iee04057053dc39305303e8bd1d9599db8368d926
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We made changes to ipnext callback registration/unregistration/invocation in #15780
that made resetting b.exthost to a nil, no-op host in (*LocalBackend).Shutdown() unnecessary.
But resetting it is also racy: b.exthost must be safe for concurrent use with or without b.mu held,
so it shouldn't be written after NewLocalBackend returns. This PR removes it.
Fixes#17279
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This change adds full IPv6 support to the Kubernetes operator's DNS functionality,
enabling dual-stack and IPv6-only cluster support.
Fixes#16633
Signed-off-by: Raj Singh <raj@tailscale.com>
Expand the integration tests to cover a wider range of scenarios, including:
* Before and after a successful initial login
* Auth URLs and auth keys
* With and without the `--force-reauth` flag
* With and without seamless key renewal
These tests expose a race condition when using `--force-reauth` on an
already-logged in device. The command completes too quickly, preventing
the auth URL from being displayed. This issue is identified and will be
fixed in a separate commit.
Updates #17108
Signed-off-by: Alex Chan <alexc@tailscale.com>
Ideally we would remove this warning entirely, as it is now possible to
reauthenticate without losing connectivty. However, it is still possible to
lose SSH connectivity if the user changes the ownership of the machine when
they do a force-reauth, and we have no way of knowing if they are going to
do that before they do it.
For now, let's just reduce the strength of the warning to warn them that
they "may" lose their connection, rather than they "will".
Updates tailscale/corp#32429
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
I think this was originally a brain-o in 9380e2dfc6. It's
disabling the port _poller_, listing what open ports (i.e. services)
are open, not PMP/PCP/UPnP port mapping.
While there, drop in some more testenv.AssertInTest() in a few places.
Updates #cleanup
Change-Id: Ia6f755ad3544f855883b8a7bdcfc066e8649547b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
PR #17258 extracted `derp.Server` into `derp/derpserver.Server`.
This followup patch adds the following cleanups:
1. Rename `derp_server*.go` files to `derpserver*.go` to match
the package name.
2. Rename the `derpserver.NewServer` constructor to `derpserver.New`
to reduce stuttering.
3. Remove the unnecessary `derpserver.Conn` type alias.
Updates #17257
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Sidestep cmd/viewer incompatibility hiccups with
HardwareAttestationPublic type due to its *ecdsa.PublicKey inner member
by serializing the key to a byte slice instead.
Updates tailscale/corp#31269
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
This exports a number of things from the derp (generic + client) package
to be used by the new derpserver package, as now used by cmd/derper.
And then enough other misc changes to lock in that cmd/tailscaled can
be configured to not bring in tailscale.com/client/local. (The webclient
in particular, even when disabled, was bringing it in, so that's now fixed)
Fixes#17257
Change-Id: I88b6c7958643fb54f386dd900bddf73d2d4d96d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Some systems need to tell whether the monitored goroutine has finished
alongside other channel operations (notably in this case the relay server, but
there seem likely to be others similarly situated).
Updates #15160
Change-Id: I5f0f3fae827b07f9b7102a3b08f60cda9737fe28
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Help out the linker's dead code elimination.
Updates #12614
Change-Id: I6c13cb44d3250bf1e3a01ad393c637da4613affb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This doesn't yet fully pull it out into a feature/captiveportal package.
This is the usual first step, moving the code to its own files within
the same packages.
Updates #17254
Change-Id: Idfaec839debf7c96f51ca6520ce36ccf2f8eec92
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
updates tailscale/corp#32600
A localAPI/cli call to reload-config can end up leaving magicsock's mutex
locked. We were missing an unlock for the early exit where there's no change in
the static endpoints when the disk-based config is loaded. This is not likely
the root cause of the linked issue - just noted during investigation.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
In MacOS GUI apps, users have to select folders to share via the GUI. This is both because
the GUI app keeps its own record of shares, and because the sandboxed version of the GUI
app needs to gain access to the shared folders by having the user pick them in a file
selector.
The new build tag `ts_mac_gui` allows the MacOS GUI app build to signal that this
is a MacOS GUI app, which causes the `drive` subcommand to be omitted so that people
do not mistakenly attempt to use it.
Updates tailscale/tailscale#17210
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Also update to use the new DisplayNameOrDefault.
Updates tailscale/corp#30456
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
This commit does not change the order or meaning of any eventbus activity, it
only updates the way the plumbing is set up.
Updates #15160
Change-Id: I61b863f9c05459d530a4c34063a8bad9046c0e27
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Add a last seen time on the cli's status command, similar to the web
portal.
Before:
```
100.xxx.xxx.xxx tailscale-operator tagged-devices linux offline
```
After:
```
100.xxx.xxx.xxx tailscale-operator tagged-devices linux offline, last seen 20d ago
```
Fixes#16584
Signed-off-by: Mahyar Mirrashed <mah.mirr@gmail.com>
This commit does not change the order or meaning of any eventbus activity, it
only updates the way the plumbing is set up.
Updates #15160
Change-Id: I06860ac4e43952a9bb4d85366138c9d9a17fd9cd
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
It is a programming error to Publish or Subscribe on a closed Client, but now
the way you discover that is by getting a panic from down in the machinery of
the bus after the client state has been cleaned up.
To provide a more helpful error, let's panic explicitly when that happens and
say what went wrong ("the client is closed"), by preventing subscriptions from
interleaving with closure of the client. With this change, either an attachment
fails outright (because the client is already closed) or completes and then
shuts down in good order in the normal course.
This does not change the semantics of the client, publishers, or subscribers,
it's just making the failure more eager so we can attach explanatory text.
Updates #15160
Change-Id: Ia492f4c1dea7535aec2cdcc2e5ea5410ed5218d2
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Only changes how the go routine consuming the events starts and stops,
not what it does.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
This commit modifies the k8s operator to wrap its logger using the logtail
logger provided via the tsnet server. This causes any logs written by
the operator to make their way to Tailscale in the same fashion as
wireguard logs to be used by support.
This functionality can also be opted-out of entirely using the
"TS_NO_LOGS_NO_SUPPORT" environment variable.
Updates https://github.com/tailscale/corp/issues/32037
Signed-off-by: David Bond <davidsbond93@gmail.com>
We never implemented the peercred package on OpenBSD (and I just tried
again and failed), but we've always documented that the creds pointer
can be nil for operating systems where we can't map the unix socket
back to its UID. On those platforms, we set the default unix socket
permissions such that only the admin can open it anyway and we don't
have a read-only vs read-write distinction. OpenBSD was always in that
camp, where any access to Tailscale's unix socket meant full access.
But during some refactoring, we broke OpenBSD in that we started
assuming during one logging path (during login) that Creds was non-nil
when looking up an ipnauth.Actor's username, which wasn't relevant (it
was called from a function "maybeUsernameOf" anyway, which threw away
errors).
Verified on an OpenBSD VM. We don't have any OpenBSD integration tests yet.
Fixes#17209
Updates #17221
Change-Id: I473c5903dfaa645694bcc75e7f5d484f3dd6044d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
controlhttp has the responsibility of dialing a set of candidate control
endpoints in a way that minimizes user facing latency. If one control
endpoint is unavailable we promptly dial another, racing across the
dimensions of: IPv6, IPv4, port 80, and port 443, over multiple server
endpoints.
In the case that the top priority endpoint was not available, the prior
implementation would hang waiting for other results, so as to try to
return the highest priority successful connection to the rest of the
client code. This hang would take too long with a large dialplan and
sufficient client to endpoint latency as to cause the server to timeout
the connection due to inactivity in the intermediate state.
Instead of trying to prioritize non-ideal candidate connections, the
first successful connection is now used unconditionally, improving user
facing latency and avoiding any delays that would encroach on the
server-side timeout.
The tests are converted to memnet and synctest, running on all
platforms.
Fixes#8442Fixestailscale/corp#32534
Co-authored-by: James Tucker <james@tailscale.com>
Change-Id: I4eb57f046d8b40403220e40eb67a31c41adb3a38
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
The controlhttp dialer with a ControlDialPlan IPv6 entry was hitting a
case where the dnscache Resolver was returning an netip.Addr zero
value, where it should've been returning the IPv6 address.
We then tried to dial "invalid IP:80", which would immediately fail,
at least locally.
Mostly this was causing spammy logs when debugging other stuff.
Updates tailscale/corp#32534
Change-Id: If8b9a20f10c1a6aa8a662c324151d987fe9bd2f8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
tsnet apps in particular never use the Linux DNS OSManagers, so they don't need
DBus, etc. I started to pull that all out into separate features so tsnet doesn't
need to bring in DBus, but hit this first.
Here you can see that tsnet (and the k8s-operator) no longer pulls in inotify.
Updates #17206
Change-Id: I7af0f391f60c5e7dbeed7a080346f83262346591
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit does not change the order or meaning of any eventbus activity, it
only updates the way the plumbing is set up.
Updates #15160
Change-Id: I0a175e67e867459daaedba0731bf68bd331e5ebc
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This commit does not change the order or meaning of any eventbus activity, it
only updates the way the plumbing is set up.
Updates #15160
Change-Id: I40c23b183c2a6a6ea3feec7767c8e5417019fc07
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
A common pattern in event bus usage is to run a goroutine to service a
collection of subscribers on a single bus client. To have an orderly shutdown,
however, we need a way to wait for such a goroutine to be finished.
This commit adds a Monitor type that makes this pattern easier to wire up:
rather than having to track all the subscribers and an extra channel, the
component need only track the client and the monitor. For example:
cli := bus.Client("example")
m := cli.Monitor(func(c *eventbus.Client) {
s1 := eventbus.Subscribe[T](cli)
s2 := eventbus.Subscribe[U](cli)
for {
select {
case <-c.Done():
return
case t := <-s1.Events():
processT(t)
case u := <-s2.Events():
processU(u)
}
}
})
To shut down the client and wait for the goroutine, the caller can write:
m.Close()
which closes cli and waits for the goroutine to finish. Or, separately:
cli.Close()
// do other stuff
m.Wait()
While the goroutine management is not explicitly tied to subscriptions, it is a
common enough pattern that this seems like a useful simplification in use.
Updates #15160
Change-Id: I657afda1cfaf03465a9dce1336e9fd518a968bca
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Pulls out the last callback logic and ensures timers are still running.
The eventbustest package is updated support the absence of events.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
And another case of the same typo in a comment elsewhere.
Updates #cleanup
Change-Id: Iaa9d865a1cf83318d4a30263c691451b5d708c9c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* tsnet,internal/client/tailscale: resolve OAuth into authkeys in tsnet
Updates #8403.
* internal/client/tailscale: omit OAuth library via build tag
Updates #12614.
Signed-off-by: Naman Sood <mail@nsood.in>
Expand TestRedactNetmapPrivateKeys to cover all sub-structs of
NetworkMap and confirm that a) all fields are annotated as private or
public, and b) all private fields are getting redacted.
Updates tailscale/corp#32095
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
For debugging purposes, add a new C2N endpoint returning the current
netmap. Optionally, coordination server can send a new "candidate" map
response, which the client will generate a separate netmap for.
Coordination server can later compare two netmaps, detecting unexpected
changes to the client state.
Updates tailscale/corp#32095
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Instead of a single hard-coded C2N handler, add support for calling
arbitrary C2N endpoints via a node roundtripper.
Updates tailscale/corp#32095
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
When tests run in parallel, events from multiple tests on the same bus can
intercede with each other. This is working as intended, but for the test cases
we want to control exactly what goes through the bus.
To fix that, allocate a fresh bus for each subtest.
Fixes#17197
Change-Id: I53f285ebed8da82e72a2ed136a61884667ef9a5e
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
When developing (and debugging) tests, it is useful to be able to see all the
traffic that transits the event bus during the execution of a test.
Updates #15160
Change-Id: I929aee62ccf13bdd4bd07d786924ce9a74acd17a
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Previously, seamless key renewal was an opt-in feature. Customers had
to set a `seamless-key-renewal` node attribute in their policy file.
This patch enables seamless key renewal by default for all clients.
It includes a `disable-seamless-key-renewal` node attribute we can set
in Control, so we can manage the rollout and disable the feature for
clients with known bugs. This new attribute makes the feature opt-out.
Updates tailscale/corp#31479
Signed-off-by: Alex Chan <alexc@tailscale.com>
This makes the `switch` command use the helper `matchProfile` function
that was introduced in the `remove` sub command.
Signed-off-by: Esteban-Bermudez <esteban@bermudezaguirre.com>
Fixes#12255
Add a new subcommand to `switch` for removing a profile from the local
client. This does not delete the profile from the Tailscale account, but
removes it from the local machine. This functionality is available on
the GUI's, but not yet on the CLI.
Signed-off-by: Esteban-Bermudez <esteban@bermudezaguirre.com>
It doesn't really pull its weight: it adds 577 KB to the binary and
is rarely useful.
Also, we now have static IPs and other connectivity paths coming
soon enough.
Updates #5853
Updates #1278
Updates tailscale/corp#32168
Change-Id: If336fed00a9c9ae9745419e6d81f7de6da6f7275
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For a common case of events being simple struct types with some exported
fields, add a helper to check (reflectively) for equal values using cmp.Diff so
that a failed comparison gives a useful diff in the test output.
More complex uses will still want to provide their own comparisons; this
(intentionally) does not export diff options or other hooks from the cmp
package.
Updates #15160
Change-Id: I86bee1771cad7debd9e3491aa6713afe6fd577a6
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This makes things work slightly better over the eventbus.
Also switches ipnlocal to use the event over the eventbus instead of the
direct callback.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Extend the Expect method of a Watcher to allow filter functions that report
only an error value, and which "pass" when the reported error is nil.
Updates #15160
Change-Id: I582d804554bd1066a9e499c1f3992d068c9e8148
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This fixes a flaky test which has been occasionally timing out in CI.
In particular, this test times out if `watchFile` receives multiple
notifications from inotify before we cancel the test context. We block
processing the second notification, because we've stopped listening to
the `callbackDone` channel.
This patch changes the test so we only send on the first notification.
Testing this locally with `stress` confirms that the test is no longer
flaky.
Fixes#17172
Updates #14699
Signed-off-by: Alex Chan <alexc@tailscale.com>
I'd started to do this in the earlier ts_omit_server PR but
decided to split it into this separate PR.
Updates #17128
Change-Id: Ief8823a78d1f7bbb79e64a5cab30a7d0a5d6ff4b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Instead of waiting for a designated subscription to close as a canary for the
bus being stopped, use the bus Client's own signal for closure added in #17118.
Updates #cleanup
Change-Id: I384ea39f3f1f6a030a6282356f7b5bdcdf8d7102
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
The Tracker was using direct callbacks to ipnlocal. This PR moves those
to be triggered via the eventbus.
Additionally, the eventbus is now closed on exit from tailscaled
explicitly, and health is now a SubSystem in tsd.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Subscribers already have a Done channel that the caller can use to detect when
the subscriber has been closed. Typically this happens when the governing
Client closes, which in turn is typically because the Bus closed.
But clients and subscribers can stop at other times too, and a caller has no
good way to tell the difference between "this subscriber closed but the rest
are OK" and "the client closed and all these subscribers are finished".
We've worked around this in practice by knowing the closure of one subscriber
implies the fate of the rest, but we can do better: Add a Done method to the
Client that allows us to tell when that has been closed explicitly, after all
the publishers and subscribers associated with that client have been closed.
This allows the caller to be sure that, by the time that occurs, no further
pending events are forthcoming on that client.
Updates #15160
Change-Id: Id601a79ba043365ecdb47dd035f1fdadd984f303
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Remove the need for the caller to hold on to and call an unregister
function. Both two callers (one real, one test) already have a context
they can use. Use context.AfterFunc instead. There are no observable
side effects from scheduling too late if the goroutine doesn't run sync.
Updates #17148
Change-Id: Ie697dae0e797494fa8ef27fbafa193bfe5ceb307
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This test ostensibly checks whether we record an error metric if a packet
is dropped because the network is down, but the network connectivity is
irrelevant -- the send error is actually because the arguments to Send()
are invalid:
RebindingUDPConn.WriteWireGuardBatchTo:
[unexpected] offset (0) != Geneve header length (8)
This patch changes the test so we try to send a valid packet, and we
verify this by sending it once before taking the network down. The new
error is:
magicsock: network down
which is what we're trying to test.
We then test sending an invalid payload as a separate test case.
Updates tailscale/corp#22075
Signed-off-by: Alex Chan <alexc@tailscale.com>
endpointState is used for tracking UDP direct connection candidate
addresses. If it contains a DERP addr, then direct connection path
discovery will always send a wasteful disco ping over it. Additionally,
CLI "tailscale ping" via peer relay will race over DERP, leading to a
misleading result if pong arrives via DERP first.
Disco pongs arriving via DERP never influence path selection. Disco
ping/pong via DERP only serves "tailscale ping" reporting.
Updates #17121
Signed-off-by: Jordan Whited <jordan@tailscale.com>
When you say --features=foo,bar, that was supposed to mean
to only show features "foo" and "bar" in the table.
But it was also being used as the set of all features that are
omittable, which was wrong, leading to misleading numbers
when --features was non-empty.
Updates #12614
Change-Id: Idad2fa67fb49c39454032e84a3dede967890fdf5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The Unix implementation of doExec propagates error codes by virtue of
the fact that it does an execve; the replacement binary will return the
exit code.
On non-Unix, we need to simulate these semantics by checking for an
ExitError and, when present, passing that value on to os.Exit.
We also add error handling to the doExec call for the benefit of
handling any errors where doExec fails before being able to execute
the desired binary.
Updates https://github.com/tailscale/corp/issues/29940
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
This renames the package+symbols in the earlier 17ffa80138 to be
in their own package ("buildfeatures") and start with the word "Has"
like "if buildfeatures.HasFoo {".
Updates #12614
Change-Id: I510e5f65993e5b76a0e163e3aa4543755213cbf6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Extend the client state management to generate a hardware attestation
key if none exists.
Extend MapRequest with HardwareAttestationKey{,Signature} fields that
optionally contain the public component of the hardware attestation key
and a signature of the node's node key using it. This will be used by
control to associate hardware attesation keys with node identities on a
TOFU basis.
Updates tailscale/corp#31269
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
So code (in upcoming PRs) can test for the build tags with consts and
get dead code elimination from the compiler+linker.
Updates #12614
Change-Id: If6160453ffd01b798f09894141e7631a93385941
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is a small introduction of the eventbus into controlclient that
communicates with mainly ipnlocal. While ipnlocal is a complicated part
of the codebase, the subscribers here are from the perspective of
ipnlocal already called async.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
This commit fixes an issue within the service reconciler where we end
up in a constant reconciliation loop. When reconciling, the loadbalancer
status is appended to but not reset between each reconciliation, leading
to an ever growing slice of duplicate statuses.
Fixes https://github.com/tailscale/tailscale/issues/17105
Fixes https://github.com/tailscale/tailscale/issues/17107
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit adds a new method to the tsnet.Server type named `Logger`
that returns the underlying logtail instance's Logf method.
This is intended to be used within the Kubernetes operator to wrap its
existing logger in a way such that operator specific logs can also be
sent to control for support & debugging purposes.
Updates https://github.com/tailscale/corp/issues/32037
Signed-off-by: David Bond <davidsbond93@gmail.com>
As of this commit (per the issue), the Taildrive code remains where it
was, but in new files that are protected by the new ts_omit_drive
build tag. Future commits will move it.
Updates #17058
Change-Id: Idf0a51db59e41ae8da6ea2b11d238aefc48b219e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Its doc said its signature matched a std signature, but it used
Tailscale-specific types.
Nowadays it's the caller (func control) that curries the logf/netmon
and returns the std-matching signature.
Updates #cleanup (while answering a question on Slack)
Change-Id: Ic99de41fc6a1c720575a7f33c564d0bcfd9a2c30
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To support integration testing of client features that rely on it, e.g.
peer relay.
Updates tailscale/corp#30903
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Removes ACL edits from e2e tests in favour of trying to simplify the
tests and separate the actual test logic from the environment setup
logic as much as possible. Also aims to fit in with the requirements
that will generally be filled anyway for most devs working on the
operator; in particular using tags that fit in with our documentation.
Updates tailscale/corp#32085
Change-Id: I7659246e39ec0b7bcc4ec0a00c6310f25fe6fac2
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This adds a file that's not compiled by default that exists just to
make it easier to do binary size checks, probing what a binary would
be like if it included reflect methods (as used by html/template, etc).
As an example, once tailscaled uses reflect.Type.MethodByName(non-const-string) anywhere,
the build jumps up by 14.5 MB:
$ GOOS=linux GOARCH=amd64 ./tool/go build -tags=ts_include_cli,ts_omit_webclient,ts_omit_systray,ts_omit_debugeventbus -o before ./cmd/tailscaled
$ GOOS=linux GOARCH=amd64 ./tool/go build -tags=ts_include_cli,ts_omit_webclient,ts_omit_systray,ts_omit_debugeventbus,ts_debug_forcereflect -o after ./cmd/tailscaled
$ ls -l before after
-rwxr-xr-x@ 1 bradfitz staff 41011861 Sep 9 07:28 before
-rwxr-xr-x@ 1 bradfitz staff 55610948 Sep 9 07:29 after
This is particularly pronounced with large deps like the AWS SDK. If you compare using ts_omit_aws:
-rwxr-xr-x@ 1 bradfitz staff 38284771 Sep 9 07:40 no-aws-no-reflect
-rwxr-xr-x@ 1 bradfitz staff 45546491 Sep 9 07:41 no-aws-with-reflect
That means adding AWS to a non-reflect binary adds 2.7 MB but adding
AWS to a reflect binary adds 10 MB.
Updates #17063
Updates #12614
Change-Id: I18e9b77c9cf33565ce5bba65ac5584fa9433f7fb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* cmd/tailscale/cli: use client/local instead of deprecated client/tailscale
Updates tailscale/corp#22748
Signed-off-by: Alex Chan <alexc@tailscale.com>
* derp: use client/local instead of deprecated client/tailscale
Updates tailscale/corp#22748
Signed-off-by: Alex Chan <alexc@tailscale.com>
---------
Signed-off-by: Alex Chan <alexc@tailscale.com>
I probably could've deflaked this without synctest, but might as well use
it now that Go 1.25 has it.
Fixes#15348
Change-Id: I81c9253fcb7eada079f3e943ab5f1e29ba8e8e31
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* utils/expvarx: mark TestSafeFuncHappyPath as known flaky
Updates #15348
Signed-off-by: Alex Chan <alexc@tailscale.com>
* tstest/integration: mark TestCollectPanic as known flaky
Updates #15865
Signed-off-by: Alex Chan <alexc@tailscale.com>
---------
Signed-off-by: Alex Chan <alexc@tailscale.com>
It was a bit confusing that provided history did not include the
current probe results.
Updates tailscale/corp#20583
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
We should never use the real syspolicy implementation in tests by
default. (the machine's configuration shouldn't affect tests)
You either specify a test policy, or you get a no-op one.
Updates #16998
Change-Id: I3350d392aad11573a5ad7caab919bb3bbaecb225
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit modifies containerboot's state reset process to handle the
state secret not existing. During other parts of the boot process we
gracefully handle the state secret not being created yet, but missed
that check within `resetContainerbootState`
Fixes https://github.com/tailscale/tailscale/issues/16804
Signed-off-by: David Bond <davidsbond93@gmail.com>
Fix "file not found" errors when WebDAV clients access files/dirs inside
directories with spaces.
The issue occurred because StatCache was mixing URL-escaped and
unescaped paths, causing cache key mismatches.
Specifically, StatCache.set() parsed WebDAV responses containing
URL-escaped paths (ex. "Dir%20Space/file1.txt") and stored them
alongside unescaped cache keys (ex. "Dir Space/file1.txt").
This mismatch prevented StatCache.get() from correctly determining whether
a child file existed.
See https://github.com/tailscale/tailscale/issues/13632#issuecomment-3243522449
for the full explanation of the issue.
The decision to keep all paths references unescaped inside the StatCache
is consistent with net/http.Request.URL.Path and rewrite.go (sole consumer)
Update unit test to detect this directory space mishandling.
Fixes tailscale#13632
Signed-off-by: Craig Hesling <craig@hesling.com>
There's a TODO to delete all of handler.go, but part of it's
still used in another repo.
But this deletes some.
Updates #17022
Change-Id: Ic5a8a5a694ca258440307436731cd92b45ee2d21
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Before:
$ tailscale ip -4
1.2.3.4
$ tailscale set --exit-node=1.2.3.4
no node found in netmap with IP 1.2.3.4
After:
$ tailscale set --exit-node=1.2.3.4
cannot use 1.2.3.4 as an exit node as it is a local IP address to this machine; did you mean --advertise-exit-node?
The new error message already existed in the code, but would only be
triggered if the backend wasn't running -- which means, in practice,
it would almost never be triggered.
The old error message is technically true, but could be confusing if you
don't know the distinction between "netmap" and "tailnet" -- it could
sound like the exit node isn't part of your tailnet. A node is never in
its own netmap, but it is part of your tailnet.
This error confused me when I was doing some local dev work, and it's
confused customers before (e.g. #7513). Using the more specific error
message should reduce confusion.
Updates #7513
Updates https://github.com/tailscale/corp/issues/23596
Signed-off-by: Alex Chan <alexc@tailscale.com>
Now that we have policytest and the policyclient.Client interface, we
can de-global-ify many of the tests, letting them run concurrently
with each other, and just removing global variable complexity.
This does ~half of the LocalBackend ones.
Updates #16998
Change-Id: Iece754e1ef4e49744ccd967fa83629d0dca6f66a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is step 4 of making syspolicy a build-time feature.
This adds a policyclient.Get() accessor to return the correct
implementation to use: either the real one, or the no-op one. (A third
type, a static one for testing, also exists, so in general a
policyclient.Client should be plumbed around and not always fetched
via policyclient.Get whenever possible, especially if tests need to use
alternate syspolicy)
Updates #16998
Updates #12614
Change-Id: Iaf19670744a596d5918acfa744f5db4564272978
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Step 4 of N. See earlier commits in the series (via the issue) for the
plan.
This adds the missing methods to policyclient.Client and then uses it
everywhere in ipn/ipnlocal and locks it in with a new dep test.
Still plenty of users of the global syspolicy elsewhere in the tree,
but this is a lot of them.
Updates #16998
Updates #12614
Change-Id: I25b136539ae1eedbcba80124de842970db0ca314
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Step 3 in the series. See earlier cc532efc20 and d05e6dc09e.
This step moves some types into a new leaf "ptype" package out of the
big "settings" package. The policyclient.Client will later get new
methods to return those things (as well as Duration and Uint64, which
weren't done at the time of the earlier prototype).
Updates #16998
Updates #12614
Change-Id: I4d72d8079de3b5351ed602eaa72863372bd474a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously, when attempting a risky action, the CLI printed a 5 second countdown saying
"Continuing in 5 seconds...". When the countdown finished, the CLI aborted rather than
continuing.
To avoid confusion, but also avoid accidentally continuing if someone (or an automated
process) fails to manually abort within the countdown, we now explicitly prompt for a
y/n response on whether or not to continue.
Updates #15445
Co-authored-by: Kot C <kot@kot.pink>
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This commit adds a `replicas` field to the `Connector` custom resource that
allows users to specify the number of desired replicas deployed for their
connectors.
This allows users to deploy exit nodes, subnet routers and app connectors
in a highly available fashion.
Fixes#14020
Signed-off-by: David Bond <davidsbond93@gmail.com>
This is step 2 of ~4, breaking up #14720 into reviewable chunks, with
the aim to make syspolicy be a build-time configurable feature.
Step 1 was #16984.
In this second step, the util/syspolicy/policyclient package is added
with the policyclient.Client interface. This is the interface that's
always present (regardless of build tags), and is what code around the
tree uses to ask syspolicy/MDM questions.
There are two implementations of policyclient.Client for now:
1) NoPolicyClient, which only returns default values.
2) the unexported, temporary 'globalSyspolicy', which is implemented
in terms of the global functions we wish to later eliminate.
This then starts to plumb around the policyclient.Client to most callers.
Future changes will plumb it more. When the last of the global func
callers are gone, then we can unexport the global functions and make a
proper policyclient.Client type and constructor in the syspolicy
package, removing the globalSyspolicy impl out of tsd.
The final change will sprinkle build tags in a few more places and
lock it in with dependency tests to make sure the dependencies don't
later creep back in.
Updates #16998
Updates #12614
Change-Id: Ib2c93d15c15c1f2b981464099177cd492d50391c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is step 1 of ~3, breaking up #14720 into reviewable chunks, with
the aim to make syspolicy be a build-time configurable feature.
In this first (very noisy) step, all the syspolicy string key
constants move to a new constant-only (code-free) package. This will
make future steps more reviewable, without this movement noise.
There are no code or behavior changes here.
The future steps of this series can be seen in #14720: removing global
funcs from syspolicy resolution and using an interface that's plumbed
around instead. Then adding build tags.
Updates #12614
Change-Id: If73bf2c28b9c9b1a408fe868b0b6a25b03eeabd1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Apparently, #16989 introduced a bug in request-dataplane-review.yml:
> you may only define one of `paths` and `paths-ignore` for a single event
Related #16372
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
@tailscale/dataplane almost never needs to review depaware.txt, when
it is the only change to the DERP implementation.
Related #16372
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
If the DERP queue is full, drop the oldest item first, rather than the
youngest, on the assumption that older data is more likely to be
unanswerable.
Updates tailscale/corp#31762
Signed-off-by: James Tucker <james@tailscale.com>
Add a ternary flag that unless set explicitly to false keeps the
insecure behavior of TSIDP.
If the flag is false, add functionality on startup to migrate
oidc-funnel-clients.json to oauth-clients.json if it doesn’t exist.
If the flag is false, modify endpoints to behave similarly regardless
of funnel, tailnet, or localhost. They will all verify client ID & secret
when appropriate per RFC 6749. The authorize endpoint will no longer change
based on funnel status or nodeID.
Add extra tests verifying TSIDP endpoints behave as expected
with the new flag.
Safely create the redirect URL from what's passed into the
authorize endpoint.
Fixes #16880
Signed-off-by: Remy Guercio <remy@tailscale.com>
Doesn't look to affect us, but pacifies security scanners.
See 88ddf1d0d9
It's for decoding. We only use this package for encoding (via
github.com/google/rpmpack / github.com/goreleaser/nfpm/v2).
Updates #8043
Change-Id: I87631aa5048f9514bb83baf1424f6abb34329c46
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Our own WaitGroup wrapper type was a prototype implementation
for the Go method on the standard sync.WaitGroup type.
Now that there is first-class support for Go,
we should migrate over to using it and delete syncs.WaitGroup.
Updates #cleanup
Updates tailscale/tailscale#16330
Change-Id: Ib52b10f9847341ce29b4ca0da927dc9321691235
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
DERP writes go via TCP and the host OS will have plenty of buffer space.
We've observed in the wild with a backed up TCP socket kernel side
buffers of >2.4MB. The DERP internal queue being larger causes an
increase in the probability that the contents of the backbuffer are
"dead letters" - packets that were assumed to be lost.
A first step to improvement is to size this queue only large enough to
avoid some of the initial connect stall problem, but not large enough
that it is contributing in a substantial way to buffer bloat /
dead-letter retention.
Updates tailscale/corp#31762
Signed-off-by: James Tucker <james@tailscale.com>
I need a ringbuffer in the more traditional sense, one that has a notion
of item removal as well as tail loss on overrun. This implementation is
really a clearable log window, and is used as such where it is used.
Updates #cleanup
Updates tailscale/corp#31762
Signed-off-by: James Tucker <james@tailscale.com>
Bump Go 1.25 release to include a go/types patch and resolve govulncheck
CI exceptions.
Updates tailscale/corp#31755
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Extract field comments from AST and include them in generated view
methods. Comments are preserved from the original struct fields to
provide documentation for the view accessors.
Fixes#16958
Signed-off-by: Maisem Ali <3953239+maisem@users.noreply.github.com>
fixestailscale/corp#26369
The suggested exit node is currently only calculated during a localAPI request.
For older UIs, this wasn't a bad choice - we could just fetch it on-demand when a menu
presented itself. For newer incarnations however, this is an always-visible field
that needs to react to changes in the suggested exit node's value.
This change recalculates the suggested exit node ID on netmap updates and
broadcasts it on the IPN bus. The localAPI version of this remains intact for the
time being.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
updates tailscale/corp#29841
Adds a node cap macOS UIs can query to determine
whether then should enable the new windowed UI.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Pull the lock-bearing code into a closure, and use a clone rather than a
shallow copy of the hostinfo record.
Updates #11649
Change-Id: I4f1d42c42ce45e493b204baae0d50b1cbf82b102
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
The early unlock on this branch was required because the "send" method goes on
to acquire the mutex itself. Rather than release the lock just to acquire it
again, call the underlying locked helper directly.
Updates #11649
Change-Id: I50d81864a00150fc41460b7486a9c65655f282f5
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
In places where we are locking the LocakBackend and immediately deferring an
unlock, and where there is no shortcut path in the control flow below the
deferral, we do not need the unlockOnce helper. Replace all these with use of
the lock directly.
Updates #11649
Change-Id: I3e6a7110dfc9ec6c1d38d2585c5367a0d4e76514
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Instead of referring to groups, which is a term of art for a different entity,
update the doc comments to more accurately describe what tags are in reference
to the policy document.
Updates #cleanup
Change-Id: Iefff6f84981985f834bae7c6a6c34044f53f2ea2
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
There are several methods within the LocalBackend that used an unusual and
error-prone lock discipline whereby they require the caller to hold the backend
mutex on entry, but release it on the way out.
In #11650 we added some support code to make this pattern more visible.
Now it is time to eliminate the pattern (at least within this package).
This is intended to produce no semantic changes, though I am relying on
integration tests and careful inspection to achieve that.
To the extent possible I preserved the existing control flow. In a few places,
however, I replaced this with an unlock/lock closure. This means we will
sometimes reacquire a lock only to release it again one frame up the stack, but
these operations are not performance sensitive and the legibility gain seems
worthwhile.
We can probably also pull some of these out into separate methods, but I did
not do that here so as to avoid other variable scope changes that might be hard
to see. I would like to do some more cleanup separately.
As a follow-up, we could also remove the unlockOnce helper, but I did not do
that here either.
Updates #11649
Change-Id: I4c92d4536eca629cfcd6187528381c33f4d64e20
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
The serve code leaves it up to the system's DNS resolver and netstack to
figure out how to reach the proxy destination. Combined with k8s-proxy
running in userspace mode, this means we can't rely on MagicDNS being
available or tailnet IPs being routable. I'd like to implement that as a
feature for serve in userspace mode, but for now the safer fix to get
kube-apiserver ProxyGroups consistently working in all environments is to
switch to using localhost as the proxy target instead.
This has a small knock-on in the code that does WhoIs lookups, which now
needs to check the X-Forwarded-For header that serve populates to get
the correct tailnet IP to look up, because the request's remote address
will be loopback.
Fixes#16920
Change-Id: I869ddcaf93102da50e66071bb00114cc1acc1288
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This increases throughput over long fat networks, and in the presence
of crypto/syscall-induced delay.
Updates tailscale/corp#31164
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Update odic-funnel-clients.json to take a path, this
allows setting the location of the file and prevents
it from landing in the root directory or users home directory.
Move setting of rootPath until after tsnet has started.
Previously this was added for the lazy creation of the
oidc-key.json. It's now needed earlier in the flow.
Updates #16734Fixes#16844
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Add the ability for operators of natc in consensus mode to remove
servers from the raft cluster config, without losing other state.
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
Currently consensus has a bootstrap routine where a tsnet node tries to
join each other node with the cluster tag, and if it is not able to join
any other node it starts its own cluster.
That algorithm is racy, and can result in split brain (more than one
leader/cluster) if all the nodes for a cluster are started at the same
time.
Add a FollowOnly argument to the bootstrap function. If provided this
tsnet node will never lead, it will try (and retry with exponential back
off) to follow any node it can contact.
Add a --follow-only flag to cmd/natc that uses this new tsconsensus
functionality.
Also slightly reorganize some arguments into opts structs.
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
This significantly improves throughput of a peer relay server on Linux.
Server.packetReadLoop no longer passes sockets down the stack. Instead,
packet handling methods return a netip.AddrPort and []byte, which
packetReadLoop gathers together for eventual batched writes on the
appropriate socket(s).
Updates tailscale/corp#31164
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We have been unintentionally ignoring errors from calling bootstrap.
bootstrap sometimes calls raft.BootstrapCluster which sometimes returns
a safe to ignore error, handle that case appropriately.
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
This has come up in a few situations recently and adding these helpers
is much better than copying the slice (calling AsSlice()) in order to
use slices.Max and friends.
Updates #cleanup
Change-Id: Ib289a07d23c3687220c72c4ce341b9695cd875bf
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
Update the runall handler to be more generic with an
exclude param to exclude multiple probes as the requesters
definition.
Updates tailscale/corp#27370
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Cleanup nix support, make flake easier to read with nix-systems.
This also harmonizes with golinks flake setup and reduces an input
dependency by 1.
Update deps test to ensure the vendor hash stays harmonized
with go.mod.
Update make tidy to ensure vendor hash stays current.
Overlay the current version of golang, tailscale runs
recent releases faster than nixpkgs can update them into
the unstable branch.
Updates #16637
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
The -Environment argument to Start-Process is essentially being treated
as a delta; removing a particular variable from the argument's hash
table does not indicate to delete. Instead we must set the value of each
unwanted variable to $null.
Updates https://github.com/tailscale/corp/issues/29940
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Some of the operations of the local API need an event bus to correctly
instantiate other components (notably including the portmapper).
This commit adds that, and as the parameter list is starting to get a bit long
and hard to read, I took the opportunity to move the arguments to a config
type. Only a few call sites needed to be updated and this API is not intended
for general use, so I did not bother to stage the change.
Updates #15160
Updates #16842
Change-Id: I7b057d71161bd859f5acb96e2f878a34c85be0ef
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
gocross-wrapper.ps1 is a PowerShell core script that is essentially a
straight port of gocross-wrapper.sh. It requires PowerShell 7.4, which
is the latest LTS release of PSCore.
Why use PowerShell Core instead of Windows PowerShell? Essentially
because the former is much better to script with and is the edition
that is currently maintained.
Because we're using PowerShell Core, but many people will be running
scripts from a machine that only has Windows PowerShell, go.cmd has
been updated to prompt the user for PowerShell core installation if
necessary.
gocross-wrapper.sh has also been updated to utilize the PSCore script
when running under cygwin or msys.
gocross itself required a couple of updates:
We update gocross to output the PowerShell Core wrapper alongside the
bash wrapper, which will propagate the revised scripts to other repos
as necessary.
We also fix a couple of things in gocross that didn't work on Windows:
we change the toolchain resolution code to use os.UserHomeDir instead
of directly referencing the HOME environment variable, and we fix a
bug in the way arguments were being passed into exec.Command on
non-Unix systems.
Updates https://github.com/tailscale/corp/issues/29940
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Add a Run all probes handler that executes all
probes except those that are continuous or the derpmap
probe.
This is leveraged by other tooling to confirm DERP
stability after a deploy.
Updates tailscale/corp#27370
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
fixestailscale/corp#31299
Fixes two issues:
getInterfaceIndex would occasionally race with netmon's state, returning
the cached default interface index after it had be changed by NWNetworkMonitor.
This had the potential to cause connections to bind to the prior default. The fix
here is to preferentially use the interface index provided by NWNetworkMonitor
preferentially.
When no interfaces are available, macOS will set the tunnel as the default
interface when an exit node is enabled, potentially causing getInterfaceIndex
to return utun's index. We now guard against this when taking the
defaultIdx path.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
This pulls in a change from github.com/tailscale/QDK to verify code signing
when using QNAP_SIGNING_SCRIPT.
It also upgrades to the latest Google Cloud PKCS#11 library, and reorders
the Dockerfile to allow for more efficient future upgrades to the included QDK.
Updates tailscale/corp#23528
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Define the HardwareAttestionKey interface describing a platform-specific
hardware backed node identity attestation key. Clients will register the
key type implementations for their platform.
Updates tailscale/corp#31269
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
dnstype.Resolver adds a boolean UseWithExitNode that controls
whether the resolver should be used in tailscale exit node contexts
(not wireguard exit nodes). If UseWithExitNode resolvers are found,
they are installed as the global resolvers. If no UseWithExitNode resolvers
are found, the exit node resolver continues to be installed as the global
resolver. Split DNS Routes referencing UseWithExitNode resolvers are also
installed.
Updates #8237Fixestailscale/corp#30906Fixestailscale/corp#30907
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
We already show a message in the menu itself, this just adds it to the
CLI output as well.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
This adds support for having every viewer type implement
jsonv2.MarshalerTo and jsonv2.UnmarshalerFrom.
This provides a significant boost in performance
as the json package no longer needs to validate
the entirety of the JSON value outputted by MarshalJSON,
nor does it need to identify the boundaries of a JSON value
in order to call UnmarshalJSON.
For deeply nested and recursive MarshalJSON or UnmarshalJSON calls,
this can improve runtime from O(N²) to O(N).
This still references "github.com/go-json-experiment/json"
instead of the experimental "encoding/json/v2" package
now available in Go 1.25 under goexperiment.jsonv2
so that code still builds without the experiment tag.
Of note, the "github.com/go-json-experiment/json" package
aliases the standard library under the right build conditions.
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Adds a setter for proxyFunc to allow macOS to pull defined
system proxies. Disallows overriding if proxyFunc is set via config.
Updates tailscale/corp#30668
Signed-off-by: Will Hannah <willh@tailscale.com>
This affects the 1.87.33 unstable release.
Updates #16842
Updates #15160
Change-Id: Ie6d1b2c094d1a6059fbd1023760567900f06e0ad
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Expected when Peer Relay'ing via self. These disco messages never get
sealed, and never leave the process.
Updates tailscale/corp#30527
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Update some logging to help future failures.
Improve test shutdown concurrency issues.
Fixes#16722
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Peer Relay is dependent on crypto routing, therefore crypto routing is
now mandatory.
Updates tailscale/corp#20732
Updates tailscale/corp#31083
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This commit also extends the updateRelayServersSet unit tests to cover
onNodeViewsUpdate.
Fixestailscale/corp#31080
Signed-off-by: Jordan Whited <jordan@tailscale.com>
One of these tests highlighted a Geneve encap bug, which is also fixed
in this commit.
looksLikeInitMsg was passed a packet post Geneve header stripping with
slice offsets that had not been updated to account for the stripping.
Updates tailscale/corp#30903
Signed-off-by: Jordan Whited <jordan@tailscale.com>
* Update installer.sh add FreeBSD ver 15
this should fix the issue on https://github.com/tailscale/tailscale/issues/16740
Signed-off-by: TheBigBear <471105+TheBigBear@users.noreply.github.com>
* scripts/installer.sh: small indentation change
Signed-off-by: Erisa A <erisa@tailscale.com>
Fixes#16740
---------
Signed-off-by: TheBigBear <471105+TheBigBear@users.noreply.github.com>
Signed-off-by: Erisa A <erisa@tailscale.com>
Co-authored-by: Erisa A <erisa@tailscale.com>
Pass a local.Client to systray.Run, so we can use the existing global
localClient in the cmd/tailscale CLI. Add socket flag to cmd/systray.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Adds the eventbus to the router subsystem.
The event is currently only used on linux.
Also includes facilities to inject events into the bus.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
This will start including the sytray app in unstable builds for Linux,
unless the `ts_omit_systray` build flag is specified.
If we decide not to include it in the v1.88 release, we can pull it
back out or restrict it to unstable builds.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
In Android, we are prompting the user to select a Taildrop directory when they first receive a Taildrop: we block writes on Taildrop dir selection. This means that we cannot use Dir inside managerOptions, since the http request would not get the new Taildrop extension. This PR removes, in the Android case, the reliance on m.opts.Dir, and instead has FileOps hold the correct directory.
This expands FileOps to be the Taildrop interface for all file system operations.
Updates tailscale/corp#29211
Signed-off-by: kari-ts <kari@tailscale.com>
restore tstest
* cmd/k8s-operator,k8s-operator: allow setting a `priorityClassName`
Fixes#16682
Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
* Update k8s-operator/apis/v1alpha1/types_proxyclass.go
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
Signed-off-by: Lee Briggs <jaxxstorm@users.noreply.github.com>
* run make kube-generate-all
Change-Id: I5f8f16694fdc181b048217b9f05ec2ee2aa04def
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
---------
Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
Signed-off-by: Lee Briggs <jaxxstorm@users.noreply.github.com>
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
The tsidp oidc-key.json ended up in the root directory
or home dir of the user process running it.
Update this to store it in a known location respecting
the TS_STATE_DIR and flagDir options.
Fixes#16734
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Also adds a test to kube/kubeclient to defend against the error type
returned by the client changing in future.
Fixestailscale/corp#30855
Change-Id: Id11d4295003e66ad5c29a687f1239333c21226a4
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Some systems have `sudo`, some have `su`. This tries both, increasing
the chance that we can run the file server as an unprivileged user.
Updates #14629
Signed-off-by: Percy Wegmann <percy@tailscale.com>
If a conn.Close call raced conn.ReadFromUDPAddrPort before it could
"register" itself as an active read, the conn.ReadFromUDPAddrPort would
never return.
This commit replaces all the activeRead and breakActiveReads machinery
with a channel. These constructs were only depended upon by
SetReadDeadline, and SetReadDeadline was unused.
Updates #16707
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This commit update the message for recommanding clear command after running serve for service.
Instead of a flag, we pass the service name as a parameter.
Fixestailscale/corp#30846
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
In the components where an event bus is already plumbed through, remove the
exceptions that allow it to be omitted, and update all the tests that relied on
those workarounds execute properly.
This change applies only to the places where we're already using the bus; it
does not enforce the existence of a bus in other components (yet),
Updates #15160
Change-Id: Iebb92243caba82b5eb420c49fc3e089a77454f65
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
jsonv2 now returns an error when you marshal or unmarshal a time.Duration
without an explicit format flag. This is an intentional, temporary choice until
the default [time.Duration] representation is decided (see golang/go#71631).
setting.Snapshot can hold time.Duration values inside a map[string]any,
so the jsonv2 update breaks marshaling. In this PR, we start using
a custom marshaler until that decision is made or golang/go#71664
lets us specify the format explicitly.
This fixes `tailscale syspolicy list` failing when KeyExpirationNotice
or any other time.Duration policy setting is configured.
Fixes#16683
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Ideally when we attempt to create a new port mapping, we should not return
without error when no mapping is available. We already log these cases as
unexpected, so this change is just to avoiding panicking dispatch on the
invalid result in those cases. We still separately need to fix the underlying
control flow.
Updates #16662
Change-Id: I51e8a116b922b49eda45e31cd27f6b89dd51abc8
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This occasionally panics waiting on a nil ctx, but was missed in the
previous PR because it's quite a rare flake as it needs to progress to a
specific point in the parser.
Updates #16678
Change-Id: Ifd36dfc915b153aede36b8ee39eff83750031f95
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
When kubectl starts an interactive attach session, it sends 2 resize
messages in quick succession. It seems that particularly in HTTP mode,
we often receive both of these WebSocket frames from the underlying
connection in a single read. However, our parser currently assumes 0-1
frames per read, and leaves the second frame in the read buffer until
the next read from the underlying connection. It doesn't take long after
that before we end up failing to skip a control message as we normally
should, and then we parse a control message as though it will have a
stream ID (part of the Kubernetes protocol) and error out.
Instead, we should keep parsing frames from the read buffer for as long
as we're able to parse complete frames, so this commit refactors the
messages parsing logic into a loop based on the contents of the read
buffer being non-empty.
k/k staging/src/k8s.io/kubectl/pkg/cmd/attach/attach.go for full
details of the resize messages.
There are at least a couple more multiple-frame read edge cases we
should handle, but this commit is very conservatively fixing a single
observed issue to make it a low-risk candidate for cherry picking.
Updates #13358
Change-Id: Iafb91ad1cbeed9c5231a1525d4563164fc1f002f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This update introduces support for DNS records associated with ProxyGroup egress services, ensuring that the ClusterIP Service IP is used instead of Pod IPs.
Fixes#15945
Signed-off-by: Raj Singh <raj@tailscale.com>
Previously, we used a non-nil Location as an indicator that a peer is a Mullvad exit node.
However, this is not, or no longer, reliable, since regular exit nodes may also have a non-nil Location,
such as when traffic steering is enabled for a tailnet.
In this PR, we update the plaintext `tailscale status` output to omit only Mullvad exit nodes, rather than all
exit nodes with a non-nil Location. The JSON output remains unchanged and continues to include all peers.
Updates tailscale/corp#30614
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In #16625, I introduced a mechanism for sending the selected exit node
to Control via tailcfg.Hostinfo.ExitNodeID as part of the MapRequest.
@nickkhyl pointed out that LocalBackend.doSetHostinfoFilterServices
needs to be triggered in order to actually send this update. This
patch adds that command. It also prevents the client from sending
"auto:any" in that field, because that’s not a real exit node ID.
This patch also fills in some missing checks in TestConfigureExitNode.
Updates tailscale/corp#30536
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Update nixpkgs-unstable to include newer golang
to satisfy go.mod requirement of 1.24.4
Update vendor hash to current.
Updates #15015
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
This commit adds a advertise subcommand for tailscale serve, that would declare the node
as a service proxy for a service. This command only adds the service to node's list of
advertised service, but doesn't modify the list of services currently advertised.
Fixestailscale/corp#28016
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
When a client selects a particular exit node, Control may use that as
a signal for deciding other routes.
This patch causes the client to report whenever the current exit node
changes, through tailcfg.Hostinfo.ExitNodeID. It relies on a properly
set ipn.Prefs.ExitNodeID, which should already be resolved by
`tailscale set`.
Updates tailscale/corp#30536
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This commit reverts the key of Web field in ipn.ServiceConfig to use FQDN instead of service
name for the host part of HostPort. This change is because k8s operator already build base on
the assumption of the part being FQDN. We don't want to break the code with dependency.
Fixestailscale/corp#30695
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* Modifies the k8s-proxy to expose health check and metrics
endpoints on the Pod's IP.
* Moves cmd/containerboot/healthz.go and cmd/containerboot/metrics.go to
/kube to be shared with /k8s-proxy.
Updates #13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
Updates k8s-proxy's config so its auth mode config matches that we set
in kube-apiserver ProxyGroups for consistency.
Updates #13358
Change-Id: I95e29cec6ded2dc7c6d2d03f968a25c822bc0e01
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
The Kubernetes API server proxy is getting the ability to serve on a
Tailscale Service instead of individual node names. Update the configure
kubeconfig sub-command to accept arguments that look like a Tailscale
Service. Note, we can't know for sure whether a peer is advertising a
Tailscale Service, we can only guess based on the ExtraRecords in the
netmap and that IP showing up in a peer's AllowedIPs.
Also adds an --http flag to allow targeting individual proxies that can
be adverting on http for their node name, and makes the command a bit
more forgiving on the range of inputs it accepts and how eager it is to
print the help text when the input is obviously wrong.
Updates #13358
Change-Id: Ica0509c6b2c707252a43d7c18b530ec1acf7508f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This commit modifies the kubernetes operator's `DNSConfig` resource
with the addition of a new field at `nameserver.service.clusterIP`.
This field allows users to specify a static in-cluster IP address of
the nameserver when deployed.
Fixes#14305
Signed-off-by: David Bond <davidsbond93@gmail.com>
This function is behind a sync.Once so we should only see errors at
startup. In particular the error from `open` is useful to diagnose why
TPM might not be accessible.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Adds a new reconciler for ProxyGroups of type kube-apiserver that will
provision a Tailscale Service for each replica to advertise. Adds two
new condition types to the ProxyGroup, TailscaleServiceValid and
TailscaleServiceConfigured, to post updates on the state of that
reconciler in a way that's consistent with the service-pg reconciler.
The created Tailscale Service name is configurable via a new ProxyGroup
field spec.kubeAPISserver.ServiceName, which expects a string of the
form "svc:<dns-label>".
Lots of supporting changes were needed to implement this in a way that's
consistent with other operator workflows, including:
* Pulled containerboot's ensureServicesUnadvertised and certManager into
kube/ libraries to be shared with k8s-proxy. Use those in k8s-proxy to
aid Service cert sharing between replicas and graceful Service shutdown.
* For certManager, add an initial wait to the cert loop to wait until
the domain appears in the devices's netmap to avoid a guaranteed error
on the first issue attempt when it's quick to start.
* Made several methods in ingress-for-pg.go and svc-for-pg.go into
functions to share with the new reconciler
* Added a Resource struct to the owner refs stored in Tailscale Service
annotations to be able to distinguish between Ingress- and ProxyGroup-
based Services that need cleaning up in the Tailscale API.
* Added a ListVIPServices method to the internal tailscale client to aid
cleaning up orphaned Services
* Support for reading config from a kube Secret, and partial support for
config reloading, to prevent us having to force Pod restarts when
config changes.
* Fixed up the zap logger so it's possible to set debug log level.
Updates #13358
Change-Id: Ia9607441157dd91fb9b6ecbc318eecbef446e116
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This commit removes the advertise command for service. The advertising is now embedded into
serve command and unadvertising is moved to drain subcommand
Fixestailscale/corp#22954
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: add clear subcommand for serve services
This commit adds a clear subcommand for serve command, to remove all config for a passed service.
This is a short cut for user to remove services after they drain a service. As an indipendent command
it would avoid accidently remove a service on typo.
Updates tailscale/corp#22954
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* update regarding comments
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* log when clearing a non-existing service but not error
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
---------
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
The tpmrm0 is a kernel-managed version of tpm0 that multiplexes multiple
concurrent connections. The basic tpm0 can only be accessed by one
application at a time, which can be pretty unreliable.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
* cmd/tailscale/cli: add drain subCommand for serve
This commit adds the drain subcommand for serving services. After we merge advertise and serve service as one step,
we now need a way to unadvertise service and this is it.
Updates tailscale/corp#22954
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* move runServeDrain and some update regarding pr comments
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* some code structure change
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
---------
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
Make it possible to dump the eventbus graph as JSON or DOT to both debug
and document what is communicated via the bus.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Package geo provides functionality to represent and process
geographical locations on a sphere. The main type, geo.Point,
represents a pair of latitude and longitude coordinates.
Updates tailscale/corp#29968
Signed-off-by: Simon Law <sfllaw@tailscale.com>
* cmd/tailscale/cli: Add service flag to serve command
This commit adds the service flag to serve command which allows serving a service and add the service
to the advertisedServices field in prefs (What advertise command does that will be removed later).
When adding proxies, TCP proxies and WEB proxies work the same way as normal serve, just under a
different DNSname. There is a services specific L3 serving mode called Tun, can be set via --tun flag.
Serving a service is always in --bg mode. If --bg is explicitly set t o false, an error message will
be sent out. The restriction on proxy target being localhost or 127.0.0.1 also applies to services.
When removing proxies, TCP proxies can be removed with type and port flag and off argument. Web proxies
can be removed with type, port, setPath flag and off argument. To align with normal serve, when setPath
is not set, all handler under the hostport will be removed. When flags are not set but off argument was
passed by user, it will be a noop. Removing all config for a service will be available later with a new
subcommand clear.
Updates tailscale/corp#22954
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: fix ai comments and fix a test
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: Add a test for addServiceToPrefs
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: fix comment
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* add dnsName in error message
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* change the cli input flag variable type
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* replace FindServiceConfig with map lookup
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* some code simplification and add asServiceName
This commit cotains code simplification for IsServingHTTPS, SetWebHandler, SetTCPForwarding
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* replace IsServiceName with tailcfg.AsServiceName
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* replace all assemble of host name for service with strings.Join
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: adjust parameter order and update output message
This commit updates the parameter order for IsTCPForwardingOnPort and SetWebHandler.
Also updated the message msgServiceIPNotAssigned to msgServiceWaitingApproval to adapt to
latest terminologies around services.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: flip bool condition
This commit fixes a previous bug added that throws error when serve funnel without service.
It should've been the opposite, which throws error when serve funnel with service.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: change parameter of IsTCPForwardingOnPort
This commit changes the dnsName string parameter for IsTCPForwardingOnPort to
svcName tailcfg.ServiceName. This change is made to reduce ambiguity when
a single service might have different dnsNames
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* ipn/ipnlocal: replace the key to webHandler for services
This commit changes the way we get the webhandler for vipServices. It used to use the host name
from request to find the webHandler, now everything targeting the vipService IP have the same
set of handlers. This commit also stores service:port instead of FQDN:port as the key in serviceConfig
for Web map.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: Updated use of service name.
This commit removes serviceName.IsEmpty and use direct comparison to instead. In legacy code, when an empty service
name needs to be passed, a new constant noService is passed. Removed redundant code for checking service name validity
and string method for serviceNameFlag.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: Update bgBoolFlag
This commit update field name, set and string method of bgBoolFlag to make code cleaner.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: remove isDefaultService output from srvTypeAndPortFromFlags
This commit removes the isDefaultService out put as it's no longer needed. Also deleted redundant code.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: remove unnessesary variable declare in messageForPort
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* replace bool output for AsServiceName with err
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: Replace DNSName with NoService if DNSname only used to identify service
This commit moves noService constant to tailcfg, updates AsServiceName to return tailcfg.NoService if the input
is not a valid service name. This commit also removes using the local DNSName as scvName parameter. When a function
is only using DNSName to identify if it's working with a service, the input in replaced with svcName and expect
caller to pass tailcfg.NoService if it's a local serve. This commit also replaces some use of Sprintf with
net.JoinHostPort for ipn.HostPort creation.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: Remove the returned error for AsServiceName
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* apply suggested code and comment
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* replace local dnsName in test with tailcfg.NoService
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: move noService back and use else where
The constant serves the purpose of provide readability for passing as a function parameter. It's
more meaningful comparing to a . It can just be an empty string in other places.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* ipn: Make WebHandlerExists and RemoveTCPForwarding accept svcName
This commit replaces two functions' string input with svcName input since they only use the dnsName to
identify service. Also did some minor cleanups
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
---------
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
With auto exit nodes enabled, the client picks exit nodes from the
ones advertised in the network map. Usually, it picks the one with the
highest priority score, but when the top spot is tied, it used to pick
randomly. Then, once it made a selection, it would strongly prefer to
stick with that exit node. It wouldn’t even consider another exit node
unless the client was shutdown or the exit node went offline. This is
to prevent flapping, where a client constantly chooses a different
random exit node.
The major problem with this algorithm is that new exit nodes don’t get
selected as often as they should. In fact, they wouldn’t even move
over if a higher scoring exit node appeared.
Let’s say that you have an exit node and it’s overloaded. So you spin
up a new exit node, right beside your existing one, in the hopes that
the traffic will be split across them. But since the client had this
strong affinity, they stick with the exit node they know and love.
Using rendezvous hashing, we can have different clients spread
their selections equally across their top scoring exit nodes. When an
exit node shuts down, its clients will spread themselves evenly to
their other equal options. When an exit node starts, a proportional
number of clients will migrate to their new best option.
Read more: https://en.wikipedia.org/wiki/Rendezvous_hashing
The trade-off is that starting up a new exit node may cause some
clients to move over, interrupting their existing network connections.
So this change is only enabled for tailnets with `traffic-steering`
enabled.
Updates tailscale/corp#29966
Fixes#16551
Signed-off-by: Simon Law <sfllaw@tailscale.com>
So that conn.PeerAwareEndpoint is always evaluated per-packet, rather
than at least once per packet batch.
Updates tailscale/corp#30042
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This is a follow-up to #15351, which fixed the test for Linux but not for
Darwin, which stores its "true" executable in /usr/bin instead of /bin.
Try both paths when not running on Windows.
In addition, disable CGo in the integration test build, which was causing the
linker to fail. These tests do not need CGo, and it appears we had some version
skew with the base image on the runners.
In addition, in error cases the recover step of the permissions check was
spuriously panicking and masking the "real" failure reason. Don't do that check
when a command was not produced.
Updates #15350
Change-Id: Icd91517f45c90f7554310ebf1c888cdfd109f43a
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Socket read errors currently close the server, so we need to understand
when and why they occur.
Updates tailscale/corp#27502
Updates tailscale/corp#30118
Signed-off-by: Jordan Whited <jordan@tailscale.com>
@nickkyl added an peer.Online check to suggestExitNodeUsingDERP, so it
should also check when running suggestExitNodeUsingTrafficSteering.
Updates tailscale/corp#29966
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Thanks to @nickkhyl for pointing out that NetMap.Peers doesn’t get
incremental updates since the last full NetMap update. Instead, he
recommends using ipn/ipnlocal.nodeBackend.AppendMatchingPeers.
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
A trusted peer relay path is always better than an untrusted direct or
peer relay path.
Updates tailscale/corp#30412
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This package promises more performance, but was never used.
The intent of the package is somewhat moot as "encoding/json"
in Go 1.25 (while under GOEXPERIMENT=jsonv2) has been
completely re-implemented using "encoding/json/v2"
such that unmarshal is dramatically faster.
Updates #cleanup
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
udpRelayEndpointReady used to write into the peerMap, which required
holding Conn.mu, but this changed in f9e7131.
Updates #cleanup
Signed-off-by: Jordan Whited <jordan@tailscale.com>
To write the init script.
And fix the JetKVM detection to work during early boot while the filesystem
and modules are still being loaded; it wasn't being detected on early boot
and then tailscaled was failing to start because it didn't know it was on JetKVM
and didn't modprobe tun.
Updates #16524
Change-Id: I0524ca3abd7ace68a69af96aab4175d32c07e116
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When `tailscale exit-node suggest` contacts the LocalAPI for a
suggested exit node, the client consults its netmap for peers that
contain the `suggest-exit-node` peercap. It currently uses a series of
heuristics to determine the exit node to suggest.
When the `traffic-steering` feature flag is enabled on its tailnet,
the client will defer to Control’s priority scores for a particular
peer. These scores, in `tailcfg.Hostinfo.Location.Priority`, were
historically only used for Mullvad exit nodes, but they have now been
extended to score any peer that could host a redundant resource.
Client capability version 119 is the earliest client that understands
these traffic steering scores. Control tells the client to switch to
rely on these scores by adding `tailcfg.NodeAttrTrafficSteering` to
its `AllCaps`.
Updates tailscale/corp#29966
Signed-off-by: Simon Law <sfllaw@tailscale.com>
In this PR, we make ExitNode.AllowOverride configurable as part of the Exit Node ADMX policy setting,
similarly to Always On w/ "Disconnect with reason" option.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates the output for "tailscale ping" to indicate if a peer relay was traversed, just like the output for DERP or direct connections.
Fixestailscale/corp#30034
Signed-off-by: Dylan Bargatze <dylan@tailscale.com>
To signal when a tailnet has the `traffic-steering` feature flag,
Control will send a `traffic-steering` NodeCapability in netmap’s
AllCaps.
This patch adds `tailcfg.NodeAttrTrafficSteering` so that it can be
used in the control plane. Future patches will implement the actual
steering mechanisms.
Updates tailscale/corp#29966
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This commit modifies the k8s-operator and k8s-proxy to support passing down
the accept-routes configuration from the proxy class as a configuration value
read and used by the k8s-proxy when ran as a distinct container managed by
the operator.
Updates #13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit modifies the k8s proxy application configuration to include a
new field named `ServerURL` which, when set, modifies
the tailscale coordination server used by the proxy. This works in the same
way as the operator and the proxies it deploys.
If unset, the default coordination server is used.
Updates https://github.com/tailscale/tailscale/issues/13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit modifies the operator to detect the usage of k8s-apiserver
type proxy groups that wish to use the letsencrypt staging directory and
apply the appropriate environment variable to the statefulset it
produces.
Updates #13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
Errors were mashalled without the correct newlines. Also, they could
generally be mashalled with more data, so an intermediate was introduced
to make them slightly nicer to look at.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
If the NodeAttrDisableRelayClient node attribute is set, ensures that a node cannot allocate endpoints on a UDP relay server itself, and cannot use newly-discovered paths (via disco/CallMeMaybeVia) that traverse a UDP relay server.
Fixestailscale/corp#30180
Signed-off-by: Dylan Bargatze <dylan@tailscale.com>
If the GUI receives a new exit node ID before the new netmap, it may treat the node as offline or invalid
if the previous netmap didn't include the peer at all, or if the peer was offline or not advertised as an exit node.
This may result in briefly issuing and dismissing a warning, or a similar issue, which isn't ideal.
In this PR, we change the operation order to send the new netmap to clients first before selecting the new exit node
and notifying them of the Exit Node change.
Updates tailscale/corp#30252 (an old issue discovered during testing this)
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We already check this for cases where ipn.Prefs.AutoExitNode is configured via syspolicy.
Configuring it directly through EditPrefs should behave the same, so we add a test for that as well.
Additionally, we clarify the implementation and future extensibility in (*LocalBackend).resolveAutoExitNodeLocked,
where the AutoExitNode is actually enforced.
Updates tailscale/corp#29969
Updates #cleanup
Signed-off-by: Nick Khyl <nickk@tailscale.com>
If the specified exit node string starts with "auto:" (i.e., can be parsed as an ipn.ExitNodeExpression),
we update ipn.Prefs.AutoExitNode instead of ipn.Prefs.ExitNodeID.
Fixes#16459
Signed-off-by: Nick Khyl <nickk@tailscale.com>
So it can be used from the CLI without importing ipnlocal.
While there, also remove isAutoExitNodeID, a wrapper around parseAutoExitNodeID
that's no longer used.
Updates tailscale/corp#29969
Updates #cleanup
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The observed generation was set to always 0 in #16429, but this had the
knock-on effect of other controllers considering ProxyGroups never ready
because the observed generation is never up to date in
proxyGroupCondition. Make sure the ProxyGroupAvailable function does not
requires the observed generation to be up to date, and add testing
coverage to catch regressions.
Updates #16327
Change-Id: I42f50ad47dd81cc2d3c3ce2cd7b252160bb58e40
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Adds a new k8s-proxy command to convert operator's in-process proxy to
a separately deployable type of ProxyGroup: kube-apiserver. k8s-proxy
reads in a new config file written by the operator, modelled on tailscaled's
conffile but with some modifications to ensure multiple versions of the
config can co-exist within a file. This should make it much easier to
support reading that config file from a Kube Secret with a stable file name.
To avoid needing to give the operator ClusterRole{,Binding} permissions,
the helm chart now optionally deploys a new static ServiceAccount for
the API Server proxy to use if in auth mode.
Proxies deployed by kube-apiserver ProxyGroups currently work the same as
the operator's in-process proxy. They do not yet leverage Tailscale Services
for presenting a single HA DNS name.
Updates #13358
Change-Id: Ib6ead69b2173c5e1929f3c13fb48a9a5362195d8
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Based on feedback that it wasn't clear what the user is meant to do with
the output of the last command, clarify that it's an optional command to
explore what got created.
Updates #13427
Change-Id: Iff64ec6d02dc04bf4bbebf415d7ed1a44e7dd658
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
When running `tailscale exit-node list`, an empty city or country name
should be displayed as a hyphen "-". However, this only happened when
there was no location at all. If a node provides a Hostinfo.Location,
then the list would display exactly what was provided.
This patch changes the listing so that empty cities and countries will
either render the provided name or "-".
Fixes#16500
Signed-off-by: Simon Law <sfllaw@tailscale.com>
When the policy setting is enabled, it allows users to override the exit node enforced by the ExitNodeID
or ExitNodeIP policy. It's primarily intended for use when ExitNodeID is set to auto:any, but it can also
be used with specific exit nodes. It does not allow disabling exit node usage entirely.
Once the exit node policy is overridden, it will not be enforced again until the policy changes,
the user connects or disconnects Tailscale, switches profiles, or disables the override.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Now that applySysPolicy is only called by (*LocalBackend).reconcilePrefsLocked,
we can make it a method to avoid passing state via parameters and to support
future extensibility.
Also factor out exit node-specific logic into applyExitNodeSysPolicyLocked.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Now that resolveExitNodeInPrefsLocked is the only caller of setExitNodeID,
and setExitNodeID is the only caller of resolveExitNodeIP, we can restructure
the code with resolveExitNodeInPrefsLocked now calling both
resolveAutoExitNodeLocked and resolveExitNodeIPLocked directly.
This prepares for factoring out resolveAutoExitNodeLocked and related
auto-exit-node logic into an ipnext extension in a future commit.
While there, we also update exit node by IP lookup to use (*nodeBackend).NodeByAddr
and (*nodeBackend).NodeByID instead of iterating over all peers in the most recent netmap.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we start passing a LocalAPI actor to (*LocalBackend).Logout to make it subject
to the same access check as disconnects made via tailscale down or the GUI.
We then update the CLI to allow `tailscale logout` to accept a reason, similar to `tailscale down`.
Updates tailscale/corp#26249
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Since a [*lazyEndpoint] makes wireguard-go responsible for peer ID, but
wireguard-go may not yet be configured for said peer, we need a JIT hook
around initiation message reception to call what is usually called from
an [*endpoint].
Updates tailscale/corp#30042
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We can't relay a packet received over the IPv4 socket back out the same
socket if destined to an IPv6 address, and vice versa.
Updates tailscale/corp#30206
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We extract checkEditPrefsAccessLocked, adjustEditPrefsLocked, and onEditPrefsLocked from the EditPrefs
execution path, defining when each step is performed and what behavior is allowed at each stage.
Currently, this is primarily used to support Always On mode, to handle the Exit Node enablement toggle,
and to report prefs edit metrics.
We then use it to enforce Exit Node policy settings by preventing users from setting an exit node
and making EditPrefs return an error when an exit node is restricted by policy. This enforcement is also
extended to the Exit Node toggle.
These changes prepare for supporting Exit Node overrides when permitted by policy and preventing logout
while Always On mode is enabled.
In the future, implementation of these methods can be delegated to ipnext extensions via the feature hooks.
Updates tailscale/corp#29969
Updates tailscale/corp#26249
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We have several places where we call applySysPolicy, suggestExitNodeLocked, and setExitNodeID.
While there are cases where we want to resolve the exit node specifically, such as when network
conditions change or a new netmap is received, we typically need to perform all three steps.
For example, enforcing policy settings may enable auto exit nodes or set an ExitNodeIP,
which in turn requires picking a suggested exit node or resolving the IP to an ID, respectively.
In this PR, we introduce (*LocalBackend).resolveExitNodeInPrefsLocked and (*LocalBackend).reconcilePrefsLocked,
with the latter calling both applySysPolicy and resolveExitNodeInPrefsLocked.
Consolidating these steps into a single extensibility point would also make it easier to support
future hooks registered by ipnext extensions.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we update setExitNodeID to retain the existing exit node if auto exit node is enabled,
the current exit node is allowed by policy, and no suggested exit node is available yet.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Now that (*LocalBackend).suggestExitNodeLocked is never called with a non-current netmap
(the netMap parameter is always nil, indicating that the current netmap should be used),
we can remove the unused parameter.
Additionally, instead of suggestExitNodeLocked passing the most recent full netmap to suggestExitNode,
we now pass the current nodeBackend so it can access peers with delta updates applied.
Finally, with that fixed, we no longer need to skip TestUpdateNetmapDeltaAutoExitNode.
Updates tailscale/corp#29969
Fixes#16455
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we add (*LocalBackend).RefreshExitNode which determines which exit node
to use based on the current prefs and netmap and switches to it if needed. It supports
both scenarios when an exit node is specified by IP (rather than ID) and needs to be resolved
once the netmap is ready as well as auto exit nodes.
We then use it in (*LocalBackend).SetControlClientStatus when the netmap changes,
and wherever (*LocalBackend).pickNewAutoExitNode was previously used.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
TCP connections are two unidirectional data streams, and if one of these
streams closes, we should not assume the other half is closed as well.
For example, if an HTTP client closes its write half of the connection
early, it may still be expecting to receive data on its read half, so we
should keep the server -> client half of the connection open, while
terminating the client -> server half.
Fixestailscale/corp#29837.
Signed-off-by: Naman Sood <mail@nsood.in>
These were flipped. DstIP() and DstIPBytes() are used internally by
wireguard-go as part of a handshake DoS mitigation strategy.
Updates tailscale/corp#20732
Updates tailscale/corp#30042
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Just make [relayManager] always handle it, there's no benefit to
checking bestAddr's.
Also, remove passing of disco.Pong to [relayManager] in
endpoint.handlePongConnLocked(), which is redundant with the callsite in
Conn.handleDiscoMessage(). Conn.handleDiscoMessage() already passes to
[relayManager] if the txID us not known to any [*endpoint].
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
A lazyEndpoint may end up on this TX codepath when wireguard-go is
deemed "under load" and ends up transmitting a cookie reply using the
received conn.Endpoint.
Updates tailscale/corp#20732
Updates tailscale/corp#30042
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This commit modifies the k8s operator to allow for customisation of the ingress class name
via a new `OPERATOR_INGRESS_CLASS_NAME` environment variable. For backwards compatibility,
this defaults to `tailscale`.
When using helm, a new `ingress.name` value is provided that will set this environment variable
and modify the name of the deployed `IngressClass` resource.
Fixes https://github.com/tailscale/tailscale/issues/16248
Signed-off-by: David Bond <davidsbond93@gmail.com>
Refactors setting status into its own top-level function to make it
easier to ensure we _always_ set the status if it's changed on every
reconcile. Previously, it was possible to have stale status if some
earlier part of the provision logic failed.
Updates #16327
Change-Id: Idab0cfc15ae426cf6914a82f0d37a5cc7845236b
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Inverts the nodeAttrs related to UDP relay client/server enablement to disablement, and fixes up the corresponding logic that uses them. Also updates the doc comments on both nodeAttrs.
Fixestailscale/corp#30024
Signed-off-by: Dylan Bargatze <dylan@tailscale.com>
This commit modifies the operator helm chart values to bring the newly
added `loginServer` field to the top level. We felt as though it was a bit
confusing to be at the `operatorConfig` level as this value modifies the
behaviour or the operator, api server & all resources that the operator
manages.
Updates https://github.com/tailscale/corp/issues/29847
Signed-off-by: David Bond <davidsbond93@gmail.com>
With this change, policy enforcement and exit node resolution can happen in separate steps,
since enforcement no longer depends on resolving the suggested exit node. This keeps policy
enforcement synchronous (e.g., when switching profiles), while allowing exit node resolution
to be asynchronous on netmap updates, link changes, etc.
Additionally, the new preference will be used to let GUIs and CLIs switch back to "auto" mode
after a manual exit node override, which is necessary for tailscale/corp#29969.
Updates tailscale/corp#29969
Updates #16459
Signed-off-by: Nick Khyl <nickk@tailscale.com>
TestSetControlClientStatusAutoExitNode is broken similarly to TestUpdateNetmapDeltaAutoExitNode
as suggestExitNode didn't previously check the online status of exit nodes, and similarly to the other test
it succeeded because the test itself is also broken.
However, it is easier to fix as it sends out a full netmap update rather than a delta peer update,
so it doesn't depend on the same refactoring as TestSetControlClientStatusAutoExitNode.
Updates #16455
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
suggestExitNode never checks whether an exit node candidate is online.
It also accepts a full netmap, which doesn't include changes from delta updates.
The test can't work correctly until both issues are fixed.
Previously, it passed only because the test itself is flawed.
It doesn't succeed because the currently selected node goes offline and a new one is chosen.
Instead, it succeeds because lastSuggestedExitNode is incorrect, and suggestExitNode picks
the correct node the first time it runs, based on the DERP map and the netcheck report.
The node in exitNodeIDWant just happens to be the optimal choice.
Fixing SuggestExitNode requires refactoring its callers first, which in turn reveals the flawed test,
as suggestExitNode ends up being called slightly earlier.
In this PR, we update the test to correctly fail due to existing bugs in SuggestExitNode,
and temporarily skip it until those issues are addressed in a future commit.
Updates #16455
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
(*profileManager).CurrentPrefs() is always valid. Additionally, there's no value in cloning
and passing the full ipn.Prefs when editing preferences. Instead, ipn.MaskedPrefs should
only have ExitNodeID set.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Currently, (*LocalBackend).pickNewAutoExitNode() is just a wrapper around
setAutoExitNodeIDLockedOnEntry that sends a prefs-change notification at the end.
It doesn't need to do that, since setPrefsLockedOnEntry already sends the notification
(setAutoExitNodeIDLockedOnEntry calls it via editPrefsLockedOnEntry).
This PR removes the old pickNewAutoExitNode function and renames
setAutoExitNodeIDLockedOnEntry to pickNewAutoExitNode for clarity.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
When dialed with just an URL and no node, the recent proxy fixes caused
a regression where there was no TLS server name being included.
Updates #16222
Updates #16223
Signed-off-by: James Tucker <james@tailscale.com>
Co-Authored-by: Jordan Whited <jwhited@tailscale.com>
This commit modifies the kubernetes operator to allow for customisation of the tailscale
login url. This provides some data locality for people that want to configure it.
This value is set in the `loginServer` helm value and is propagated down to all resources
managed by the operator. The only exception to this is recorder nodes, where additional
changes are required to support modifying the url.
Updates https://github.com/tailscale/corp/issues/29847
Signed-off-by: David Bond <davidsbond93@gmail.com>
Cryptokey Routing identification is now required to set an [epAddr] into
the peerMap for Geneve-encapsulated [epAddr]s.
Updates tailscale/corp#27502
Updates tailscale/corp#29422
Updates tailscale/corp#30042
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Report whether the client is configured with state encryption (which
varies by platform and can be optional on some). Wire it up to
`--encrypt-state` in tailscaled, which is set for Linux/Windows, and set
defaults for other platforms. Macsys will also report this if full
Keychain migration is done.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
We would like to start sending whether a node is a Tailnet owner in netmap responses so that clients can determine what information to display to a user who wants to request account deletion.
Updates tailscale/corp#30016
Signed-off-by: kari-ts <kari@tailscale.com>
Instead of calculating the PeerAPI URL at the time that we add the peer,
we now calculate it on every access to the peer. This way, if we
initially did not have a shared address family with the peer, but
later do, this allows us to access the peer at that point. This
follows the pattern from other places where we access the peer API,
which also calculate the URL on an as-needed basis.
Additionally, we now show peers as not Available when we can't get
a peer API URL.
Lastly, this moves some of the more frequent verbose Taildrive logging
from [v1] to [v2] level.
Updates #29702
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This allows logging the following Taildrive behavior from the client's perspective
when --verbose=1:
- Initialization of Taildrive remotes for every peer
- Peer availability checks
- All HTTP requests to peers (not just GET and PUT)
Updates tailscale/corp#29702
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Changes to our src/address family can trigger blackholes.
This commit also adds a missing set of trustBestAddrUntil when setting
a UDP relay path as bestAddr.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
* cmd/k8s-operator: ProxyClass annotation for Services and Ingresses
Previously, the ProxyClass could only be configured for Services and
Ingresses via a Label. This adds the ability to set it via an
Annotation, but prioritizes the Label if both a Label and Annotation are
set.
Updates #14323
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
* Update cmd/k8s-operator/operator.go
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
* Update cmd/k8s-operator/operator.go
Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
* cmd/k8s-operator: ProxyClass annotation for Services and Ingresses
Previously, the ProxyClass could only be configured for Services and
Ingresses via a Label. This adds the ability to set it via an
Annotation, but prioritizes the Label if both a Label and Annotation are
set.
Updates #14323
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
---------
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
Replace the existing systray_start counter metrics with a
systray_running gauge metrics.
This also adds an IncrementGauge method to local client to parallel
IncrementCounter. The LocalAPI handler supports both, we've just never
added a client method for gauges.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
The server-side code already does e.g. "nodeid:%d" instead of "%x"
and as a result we have to second guess a lot of identifiers that could
be hex or decimal.
This stops the bleeding and means in a year and change we'll stop
seeing the hex forms.
Updates tailscale/corp#29827
Change-Id: Ie5785a07fc32631f7c949348d3453538ab170e6d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Otherwise we can end up mirroring packets to them forever. We may
eventually want to relax this to direct paths as well, but start with
UDP relay paths, which have a higher chance of becoming untrusted and
never working again, to be conservative.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We dropped the idea of the Experimental release stage in
tailscale/tailscale-www#7697, in favour of Community Projects.
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This method is only needed to migrate between store.FileStore and
tpm.tpmStore. We can make a runtime type assertion instead of
implementing an unused method for every platform.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This was previously hooked around direct UDP path discovery /
CallMeMaybe transmission, and related conditions. Now it is subject to
relay-specific considerations.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Previously, the operator checked the ProxyGroup status fields for
information on how many of the proxies had successfully authed. Use
their state Secrets instead as a more reliable source of truth.
containerboot has written device_fqdn and device_ips keys to the
state Secret since inception, and pod_uid since 1.78.0, so there's
no need to use the API for that data. Read it from the state Secret
for consistency. However, to ensure we don't read data from a
previous run of containerboot, make sure we reset containerboot's
state keys on startup.
One other knock-on effect of that is ProxyGroups can briefly be
marked not Ready while a Pod is restarting. Introduce a new
ProxyGroupAvailable condition to more accurately reflect
when downstream controllers can implement flows that rely on a
ProxyGroup having at least 1 proxy Pod running.
Fixes#16327
Change-Id: I026c18e9d23e87109a471a87b8e4fb6271716a66
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Relay handshakes may now occur multiple times over the lifetime of a
relay server endpoint. Handshake messages now include a handshake
generation, which is client specified, as a means to trigger safe
challenge reset server-side.
Relay servers continue to enforce challenge values as single use. They
will only send a given value once, in reply to the first arriving bind
message for a handshake generation.
VNI has been added to the handshake messages, and we expect the outer
Geneve header value to match the sealed value upon reception.
Remote peer disco pub key is now also included in handshake messages,
and it must match the receiver's expectation for the remote,
participating party.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Add a new `--encrypt-state` flag to `cmd/tailscaled`. Based on that
flag, migrate the existing state file to/from encrypted format if
needed.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
GitHub used to recommend the tibdex/github-app-token GitHub Action
until they wrote their own actions/create-github-app-token.
This patch replaces the use of the third-party action with the
official one.
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
go(1) repsects GOROOT if set, but tool/go / gocross-wrapper.sh are explicitly intending to use our toolchain. We don't need to set GOROOT, just unset it, and then go(1) handles the rest.
Updates tailscale/corp#26717
Signed-off-by: James Tucker <james@tailscale.com>
For any changes that involve DERP, automatically add the
@tailscale/dataplane team as a reviewer.
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Premature cancellation was preventing the work from ever being cleaned
up in runLoop().
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
SSH was disabled in #10538
Exit node was disabled in #13726
This enables ssh and exit-node options in case of Home Assistant.
Fixes#15552
Signed-off-by: Laszlo Magyar <lmagyar1973@gmail.com>
This commit adds a NOTES.txt to the operator helm chart that will be written to the
terminal upon successful installation of the operator.
It includes a small list of knowledgebase articles with possible next steps for
the actor that installed the operator to the cluster. It also provides possible
commands to use for explaining the custom resources.
Fixes#13427
Signed-off-by: David Bond <davidsbond93@gmail.com>
Instead of every module having to come up with a set of test methods for
the event bus, this handful of test helpers hides a lot of the needed
setup for the testing of the event bus.
The tests in portmapper is also ported over to the new helpers.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
20.04 is no longer supported.
This pulls in changes to the QDK package that were required to make build succeed on 24.04.
Updates https://github.com/tailscale/corp/issues/29849
Signed-off-by: Percy Wegmann <percy@tailscale.com>
After the switch to 24.04, unsigned packages did not build correctly (came out as only a few KBs).
Fixestailscale/tailscale-qpkg#148
Signed-off-by: Percy Wegmann <percy@tailscale.com>
If we acted as the allocator we are responsible for signaling it to the
remote peer in a CallMeMaybeVia message over DERP.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
udprelay.Server is lazily initialized when the first request is received
over peerAPI. These early requests have a high chance of failure until
the first address discovery cycle has completed.
Return an ErrServerNotReady error until the first address discovery
cycle has completed, and plumb retry handling for this error all the
way back to the client in relayManager.
relayManager can now retry after a few seconds instead of waiting for
the next path discovery cycle, which could take another minute or
longer.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Any return underneath this select case must belong to a type switch case.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This tweaks the just-added ./tool/go.{cmd,ps1} port of ./tool/go for
Windows.
Otherwise in Windows Terminal in Powershell, running just ".\tool\go"
picks up go.ps1 before go.cmd, which means execution gets denied
without the cmd script's -ExecutionPolicy Bypass part letting it work.
This makes it work in both cmd.exe and in Powershell.
Updates tailscale/corp#28679
Change-Id: Iaf628a9fd6cb95670633b2dbdb635dfb8afaa006
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
go.cmd lets you run just "./tool/go" on Windows the same as Linux/Darwin.
The batch script (go.md) then just invokes PowerShell which is more
powerful than batch.
I wanted this while debugging Windows CI performance by reproducing slow
tests on my local Windows laptop.
Updates tailscale/corp#28679
Change-Id: I6e520968da3cef3032091c1c4f4237f663cefcab
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It's one of the slower ones, so split it up into chunks.
Updates tailscale/corp#28679
Change-Id: I16a5ba667678bf238c84417a51dda61baefbecf7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Proxies know how to reload configfile on changes since 1.80, which
is going to be the earliest supported proxy version with 1.84 operator,
so remove the mechanism that was updating configfile hash to force
proxy Pod restarts on config changes.
Updates #13032
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
I earlier thought this saved a second of CPU even on a fast machine,
but I think when I was previously measuring, I still had a 4096 bit
RSA key being generated in the code I was measuring.
Measuring again for this, it's plenty fast.
Prep for using this package more, for derp, etc.
Updates #16315
Change-Id: I4c9008efa9aa88a3d65409d6ffd7b3807f4d75e9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This reverts commit 6a93b17c8c.
The reverted commit added more complexity than it was worth at the
current stage. Handling delta CapVer changes requires extensive changes
to relayManager datastructures in order to also support delta updates of
relay servers.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
If you had HTTPS_PROXY=https://some-valid-cert.example.com running a
CONNECT proxy, we should've been able to do a TLS CONNECT request to
e.g. controlplane.tailscale.com:443 through that, and I'm pretty sure
it used to work, but refactorings and lack of integration tests made
it regress.
It probably regressed when we added the baked-in LetsEncrypt root cert
validation fallback code, which was testing against the wrong hostname
(the ultimate one, not the one which we were being asked to validate)
Fixes#16222
Change-Id: If014e395f830e2f87f056f588edacad5c15e91bc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Resolver.PreferGo didn't used to work on Windows.
It was fixed in 2022, though. (https://github.com/golang/go/issues/33097)
Updates #5161
Change-Id: I4e1aeff220ebd6adc8a14f781664fa6a2068b48c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This patch contains the following cleanups:
1. Simplify `ffcli.Command` definitions;
2. Word-wrap help text, consistent with other commands;
3. `tailscale dns --help` usage makes subcommand usage more obvious;
4. `tailscale dns query --help` describes DNS record types.
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
We aim to make the tsgo directories be read-only mounts on builders.
But gocross was previously writing within the ~/.cache/tsgo/$HASH/
directories to make the synthetic GOROOT directories.
This moves them to ~/.cache/tsgoroot/$HASH/ instead.
Updates tailscale/corp#28679
Updates tailscale/corp#26717
Change-Id: I0d17730bbdce3d6374e79d49486826575d4690af
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The caller of client.RunWatchConnectionLoop may need to be
aware of errors that occur within loop. Add a channel
that notifies of errors to the caller to allow for
decisions to be make as to the state of the client.
Updates tailscale/corp#25756
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Make the OS-specific staticcheck jobs only test stuff that's specialized
for that OS. Do that using a new ./tool/listpkgs program that's a fancy
'go list' with more filtering flags.
Updates tailscale/corp#28679
Change-Id: I790be2e3a0b42b105bd39f68c4b20e217a26de60
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Tests that go mod version matches ./tool/go version.
Mismatched versions result in incosistent Go versions being used i.e.
in CI jobs as the version in go.mod is used to determine what Go version
Github actions pull in.
Updates #16283
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
gocross is not needed like it used to be, now that Go does
version stamping itself.
We keep it for the xcode and Windows builds for now.
This simplifies things in the build, especially with upcoming build
system updates.
Updates tailscale/corp#28679
Updates tailscale/corp#26717
Change-Id: Ib4bebe6f50f3b9c3d6cd27323fca603e3dfb43cc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If natc is running on a host with tailscale using `--accept-dns=true`
then a DNS loop can occur. Provide a flag for some specific DNS
upstreams for natc to use instead, to overcome such situations.
Updates #14667
Signed-off-by: James Tucker <james@tailscale.com>
eventbus.Publish() calls newPublisher(), which in turn invokes (*Client).addPublisher().
That method adds the new publisher to c.pub, so we don’t need to add it again in eventbus.Publish.
Updates #cleanup
Signed-off-by: Nick Khyl <nickk@tailscale.com>
nodeBackend now publishes filter and node changes to eventbus topics
that are consumed by magicsock.Conn
Updates tailscale/corp#27502
Updates tailscale/corp#29543
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Ensure that if the ProxyGroup for HA Ingress changes, the TLS Secret
and Role and RoleBinding that allow proxies to read/write to it are
updated.
Fixes#16259
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We update LocalBackend to shut down the current nodeBackend
when switching to a different node, and to mark the new node's
nodeBackend as ready when the switch completes.
Updates tailscale/corp#28014
Updates tailscale/corp#29543
Updates #12614
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This means the caller does not have to remember to close the reader, and avoids
having to duplicate the logic to decode JSON into events.
Updates #15160
Change-Id: I20186fabb02f72522f61d5908c4cc80b86b8936b
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Record dropped packets as soon as they time out, rather than after tx
record queues spill over, this will more accurately capture small
amounts of packet loss in a timely fashion.
Updates tailscale/corp#24522
Signed-off-by: James Tucker <james@tailscale.com>
The first packet fragment guard had an additional guard clause that was
incorrectly comparing a length in bytes to a length in octets, and was
also comparing what should have been an entire IPv4 through transport
header length to a subprotocol payload length. The subprotocol header
size guards were otherwise protecting against short transport headers,
as is the conservative non-first fragment minimum offset size. Add an
explicit disallowing of fragmentation for TSMP for the avoidance of
doubt.
Updates #cleanup
Updates #5727
Signed-off-by: James Tucker <james@tailscale.com>
During a short period of packet loss, a TCP connection to the home DERP
may be maintained. If no other regions emerge as winners, such as when
all regions but one are avoided/disallowed as candidates, ensure that
the current home region, if still active, is not dropped as the
preferred region until it has failed two keepalives.
Relatedly apply avoid and no measure no home to ICMP and HTTP checks as
intended.
Updates tailscale/corp#12894
Updates tailscale/corp#29491
Signed-off-by: James Tucker <james@tailscale.com>
The relay server now fetches IPs from local interfaces and external
perspective IP:port's via netcheck (STUN).
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Which can make operating the service more convenient.
It makes sense to put the cluster state with this if specified, so
rearrange the logic to handle that.
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
This enables us to mark nodes as relay capable or not. We don't actually
do that yet, as we haven't established a relay CapVer.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Add mesh key support to derpprobe for
probing derpers with verify set to true.
Move MeshKey checking to central point for code reuse.
Fix a bad error fmt msg.
Fixestailscale/corp#27294Fixestailscale/corp#25756
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
We already present a health warning about this, but it is easy to miss
on a server when blackholing traffic makes it unreachable.
In addition to a health warning, present a risk message when exit node
is enabled.
Example:
```
$ tailscale up --exit-node=lizard
The following issues on your machine will likely make usage of exit nodes impossible:
- interface "ens4" has strict reverse-path filtering enabled
- interface "tailscale0" has strict reverse-path filtering enabled
Please set rp_filter=2 instead of rp_filter=1; see https://github.com/tailscale/tailscale/issues/3310
To skip this warning, use --accept-risk=linux-strict-rp-filter
$
```
Updates #3310
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
It might complete, interrupting it reduces the chances of establishing a
relay path.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This is simply for consistency with relayManagerInputEvent(), which
should be the sole launcher of runLoop().
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
relayManager can now hand endpoint a relay epAddr for it to consider
as bestAddr.
endpoint and Conn disco ping/pong handling are now VNI-aware.
Updates tailscale/corp#27502
Updates tailscale/corp#29422
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This commit fixes the bug that c2n requests are skiped when updating vipServices in serveConfig. This then resulted
netmap update being skipped which caused inaccuracy of Capmap info on client side. After this fix, client always
inform control about it's vipServices config changes.
Fixestailscale/corp#29219
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
This commit adds a new type to magicsock, epAddr, which largely ends up
replacing netip.AddrPort in packet I/O paths throughout, enabling
Geneve encapsulation over UDP awareness.
The conn.ReceiveFunc for UDP has been revamped to fix and more clearly
distinguish the different classes of packets we expect to receive: naked
STUN binding messages, naked disco, naked WireGuard, Geneve-encapsulated
disco, and Geneve-encapsulated WireGuard.
Prior to this commit, STUN matching logic in the RX path could swallow
a naked WireGuard packet if the keypair index, which is randomly
generated, happened to overlap with a subset of the STUN magic cookie.
Updates tailscale/corp#27502
Updates tailscale/corp#29326
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Fragmented datagrams would be processed instead of being dumped right
away. In reality, thse datagrams would be dropped anyway later so there
should functionally not be any change. Additionally, the feature is off
by default.
Closes#16203
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Also add a trailing newline to error banners so that SSH client messages don't print on the same line.
Updates tailscale/corp#29138
Signed-off-by: Percy Wegmann <percy@tailscale.com>
- Add tsidp target to build_docker.sh for standard Tailscale image builds
- Add publishdevtsidp Makefile target for development image publishing
- Remove Dockerfile, using standard build process
- Include tsidp in depaware dependency tracking
- Update README with comprehensive Docker usage examples
This enables tsidp to be built and published like other Tailscale components
(tailscale/tailscale, tailscale/k8s-operator, tailscale/k8s-nameserver).
Fixes#16077
Signed-off-by: Raj Singh <raj@tailscale.com>
Our conn.Bind implementation is updated to make Send() offset-aware for
future VXLAN/Geneve encapsulation support.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Fix CompareAndSwap in the edge-case where
the underlying sync.AtomicValue is uninitialized
(i.e., Store was never called) and
the oldV is the zero value,
then perform CompareAndSwap with any(nil).
Also, document that T must be comparable.
This is a pre-existing restriction.
Fixes#16135
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
In 1.84 we made 'tailscale set'/'tailscale up' error out if duplicate
command line flags are passed.
This broke some container configurations as we have two env vars that
can be used to set --accept-dns flag:
- TS_ACCEPT_DNS- specifically for --accept-dns
- TS_EXTRA_ARGS- accepts any arbitrary 'tailscale up'/'tailscale set'
flag.
We default TS_ACCEPT_DNS to false (to make the container behaviour more
declarative), which with the new restrictive CLI behaviour resulted in
failure for users who had set --accept-dns via TS_EXTRA_ARGS as the flag would be
provided twice.
This PR re-instates the previous behaviour by checking if TS_EXTRA_ARGS
contains --accept-dns flag and if so using its value to override TS_ACCEPT_DNS.
Updates tailscale/tailscale#16108
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This adds SmallSet.SoleElement, which I need in another repo for
efficiency. I added tests, but those tests failed because Add(1) +
Add(1) was promoting the first Add's sole element to a map of one
item. So fix that, and add more tests.
Updates tailscale/corp#29093
Change-Id: Iadd5ad08afe39721ee5449343095e389214d8389
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Using WINHTTP_AUTOPROXY_ALLOW_AUTOCONFIG on Windows versions older than Windows 10 1703 (build 15063)
is not supported and causes WinHttpGetProxyForUrl to fail with ERROR_INVALID_PARAMETER. This results in failures
reaching the control on environments where a proxy is required.
We use wingoes version detection to conditionally set the WINHTTP_AUTOPROXY_ALLOW_AUTOCONFIG flag
on Windows builds greater than 15063.
While there, we also update proxy detection to use WINHTTP_AUTO_DETECT_TYPE_DNS_A, as DNS-based proxy discovery
might be required with Active Directory and in certain other environments.
Updates tailscale/corp#29168
Fixes#879
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The field must only be accessed while holding LocalBackend's mutex,
but there are two places where it's accessed without the mutex:
- (LocalBackend).MaybeClearAppConnector()
- handleC2NAppConnectorDomainRoutesGet()
Fixes#16123
Signed-off-by: Nick Khyl <nickk@tailscale.com>
As note in the comment, it now being more than six months since this was
deprecated and there being no (further) uses of the old pattern in our internal
services, let's drop the migrator.
Updates #cleanup
Change-Id: Ie4fb9518b2ca04a9b361e09c51cbbacf1e2633a8
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
fixestailscale/corp#25612
We now keep track of any dns configurations which we could not
compile. This gives RecompileDNSConfig a configuration to
attempt to recompile and apply when the OS pokes us to indicate
that the interface dns servers have changed/updated. The manager config
will remain unset until we have the required information to compile
it correctly which should eliminate the problematic SERVFAIL
responses (especially on macOS 15).
This also removes the missingUpstreamRecovery func in the forwarder
which is no longer required now that we have proper error handling
and recovery manager and the client.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
relayManager is responsible for disco ping/pong probing of relay
endpoints once a handshake is complete.
Future work will enable relayManager to set a relay endpoint as the best
UDP path on an endpoint if appropriate.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Fix the wireshark lua dissector to support 0 bit position
and not throw modulo div by 0 errors.
Add new disco frame types to the decoder.
Updates tailscale/corp#29036
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Add comprehensive web interface at ui for managing OIDC clients, similar to tsrecorder's design. Features include list view, create/edit forms with validation, client secret management, delete functionality with confirmation dialogs, responsive design, and restricted tailnet access only.
Fixes#16067
Signed-off-by: Raj Singh <raj@tailscale.com>
In accordance with the OIDC/OAuth 2.0 protocol, do not send an empty
refresh_token and instead omit the field when empty.
Fixes https://github.com/tailscale/tailscale/issues/16073
Signed-off-by: Tim Klocke <taaem@mailbox.org>
Previously, a missing or invalid `dns` parameter on GET `/dns-query`
returned only “missing ‘dns’ parameter”. Now the error message guides
users to use `?dns=` or `?q=`.
Updates: #16055
Signed-off-by: Zach Buchheit <zachb@tailscale.com>
Validate that any tags that users have specified via tailscale.com/tags
annotation are valid Tailscale ACL tags.
Validate that no more than one HA Tailscale Kubernetes Services in a single cluster refer
to the same Tailscale Service.
Updates tailscale/tailscale#16054
Updates tailscale/tailscale#16035
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
As noted in #16048, the ./ssh/tailssh package failed to build on
Android, because GOOS=android also matches the "linux" build
tag. Exclude Android like iOS is excluded from macOS (darwin).
This now works:
$ GOOS=android go install ./ipn/ipnlocal ./ssh/tailssh
The original PR at #16048 is also fine, but this stops the problem
earlier.
Updates #16048
Change-Id: Ie4a6f6966a012e510c9cb11dd0d1fa88c48fac37
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
RELNOTE=Fix CSRF errors in the client Web UI
Replace gorilla/csrf with a Sec-Fetch-Site based CSRF protection
middleware that falls back to comparing the Host & Origin headers if no
SFS value is passed by the client.
Add an -origin override to the web CLI that allows callers to specify
the origin at which the web UI will be available if it is hosted behind
a reverse proxy or within another application via CGI.
Updates #14872
Updates #15065
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
To authenticate mesh keys, the DERP servers used a simple == comparison,
which is susceptible to a side channel timing attack.
By extracting the mesh key for a DERP server, an attacker could DoS it
by forcing disconnects using derp.Client.ClosePeer. They could also
enumerate the public Wireguard keys, IP addresses and ports for nodes
connected to that DERP server.
DERP servers configured without mesh keys deny all such requests.
This patch also extracts the mesh key logic into key.DERPMesh, to
prevent this from happening again.
Security bulletin: https://tailscale.com/security-bulletins#ts-2025-003Fixestailscale/corp#28720
Signed-off-by: Simon Law <sfllaw@tailscale.com>
* control/controlclient,health,tailcfg: refactor control health messages
Updates tailscale/corp#27759
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
Signed-off-by: Paul Scott <408401+icio@users.noreply.github.com>
Co-authored-by: Paul Scott <408401+icio@users.noreply.github.com>
Registering a new store is cheap, it just adds a map entry. No need to
lazy-init it with sync.Once and an intermediate slice holding init
functions.
Updates #cleanup
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Create FileOps for calling platform-specific file operations such as SAF APIs in Taildrop
Update taildrop.PutFile to support both traditional and SAF modes
Updates tailscale/tailscale#15263
Signed-off-by: kari-ts <kari@tailscale.com>
Use of the httptest client doesn't render header ordering
as expected.
Use http.DefaultClient for the test to ensure that
the header ordering test is valid.
Updates tailscale/corp#27370
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
This type improves code clarity and reduces the chance of heap alloc as
we pass it as a non-pointer. VNI being a 3-byte value enables us to
track set vs unset via the reserved/unused byte.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Taildrop wasn't working on iOS since #15971 because GetExt didn't work
until after init, but that PR moved Init until after Start.
This makes GetExt work before LocalBackend.Start (ExtensionHost.Init).
Updates #15812
Change-Id: I6e87257cd97a20f86083a746d39df223e5b6791b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The same message was used for "up" and "down" permission failures, but
"set" works better for both. Suggesting "up --operator" for a "down"
permission failure was confusing.
It's not like the latter command works in one shot anyway.
Fixes#16008
Change-Id: I6e4225ef06ce2d8e19c40bece8104e254c2aa525
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This fixes the implementation and test from #15208 which apparently
never worked.
Ignore the metacert when counting the number of expected certs
presented.
And fix the test, pulling out the TLSConfig setup code into something
shared between the real cmd/derper and the test.
Fixes#15579
Change-Id: I90526e38e59f89b480629b415f00587b107de10a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This reconciler allows users to make applications highly available at L3 by
leveraging Tailscale Virtual Services. Many Kubernetes Service's
(irrespective of the cluster they reside in) can be mapped to a
Tailscale Virtual Service, allowing access to these Services at L3.
Updates #15895
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Adds Recorder fields to configure the name and annotations of the ServiceAccount
created for and used by its associated StatefulSet. This allows the created Pod
to authenticate with AWS without requiring a Secret with static credentials,
using AWS' IAM Roles for Service Accounts feature, documented here:
https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.htmlFixes#15875
Change-Id: Ib0e15c0dbc357efa4be260e9ae5077bacdcb264f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
cmd/containerboot,kube/ingressservices: proxy VIPService TCP/UDP traffic to cluster Services
This PR is part of the work to implement HA for Kubernetes Operator's
network layer proxy.
Adds logic to containerboot to monitor mounted ingress firewall configuration rules
and update iptables/nftables rules as the config changes.
Also adds new shared types for the ingress configuration.
The implementation is intentionally similar to that for HA for egress proxy.
Updates tailscale/tailscale#15895
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
CallMeMaybeVia reception and endpoint allocation have been collapsed to
a single event channel. discoInfo caching for active relay handshakes
is now implemented.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Content-type was responding as test/plain for probes
accepting application/json. Set content type header
before setting the response code to correct this.
Updates tailscale/corp#27370
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Update proxy-to-grafana to strip any X-Webauth prefixed headers passed
by the client in *every* request, not just those to /login.
/api/ routes will also accept these headers to authenticate users,
necessitating their removal to prevent forgery.
Updates tailscale/corp#28687
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Currently, LocalBackend/ExtensionHost doesn't invoke the profile change callback for the initial profile.
Since the initial profile may vary depending on loaded extensions and applied policy settings,
it can't be reliably determined until all extensions are initialized. Additionally, some extensions
may asynchronously trigger a switch to the "best" profile (based on system state and policy settings) during
initialization.
We intended to address these issues as part of the ongoing profileManager/LocalBackend refactoring,
but the changes didn't land in time for the v1.84 release and the Taildrop refactoring.
In this PR, we update the Taildrop extension to retrieve the current profile at initialization time
and handle it as a profile change.
We also defer extension initialization until LocalBackend has started, since the Taildrop extension
already relies on this behavior (e.g., it requires clients to call SetDirectFileRoot before Init).
Fixes#15970
Updates #15812
Updates tailscale/corp#28449
Signed-off-by: Nick Khyl <nickk@tailscale.com>
`cmd/derpprobe --once` didn’t respect the convention of non-zero exit
status for a failed run. It would always exit zero (i.e. success),
even. This patch fixes that, but only for `--once` mode.
Fixes: #15925
Signed-off-by: Simon Law <sfllaw@tailscale.com>
In this PR, we make DNS registration behavior configurable via the EnableDNSRegistration policy setting.
We keep the default behavior unchanged, but allow admins to either enforce DNS registration and dynamic
DNS updates for the Tailscale interface, or prevent Tailscale from modifying the settings configured in
the network adapter's properties or by other means.
Updates #14917
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Add new rules to update DNAT rules for Kubernetes operator's
HA ingress where it's expected that rules will be added/removed
frequently (so we don't want to keep old rules around or rewrite
existing rules unnecessarily):
- allow deleting DNAT rules using metadata lookup
- allow inserting DNAT rules if they don't already
exist (using metadata lookup)
Updates tailscale/tailscale#15895
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: chaosinthecrd <tom@tmlabs.co.uk>
Commit 0841477 moved ServerEndpoint to an independent package. Move its
tests over as well.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
OCSP has been removed from the LE certs.
Use CRL verification instead.
If a cert provides a CRL, check its revocation
status, if no CRL is provided and otherwise
is valid, pass the check.
Fixes#15912
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Co-authored-by: Simon Law <sfllaw@tailscale.com>
In tailssh.go:1284, (*sshSession).startNewRecording starts a fire-and-forget goroutine that can
outlive the test that triggered its creation. Among other things, it uses ss.logf, and may call it
after the test has already returned. Since we typically use (*testing.T).Logf as the logger,
this results in a data race and causes flaky tests.
Ideally, we should fix the root cause and/or use a goroutines.Tracker to wait for the goroutine
to complete. But with the release approaching, it's too risky to make such changes now.
As a workaround, we update the tests to use tstest.WhileTestRunningLogger, which logs to t.Logf
while the test is running and disables logging once the test finishes, avoiding the race.
While there, we also fix TestSSHAuthFlow not to use log.Printf.
Updates #15568
Updates #7707 (probably related)
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We previously kept these methods in local.go when we started moving node-specific state
from LocalBackend to nodeBackend, to make those changes easier to review. But it's time
to move them to node_backend.go.
Updates #cleanup
Updates #12614
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we extract the in-process LocalAPI client/server implementation from ipn/ipnserver/server_test.go
into a new ipntest package to be used in high‑level black‑box tests, such as those for the tailscale CLI.
Updates #15575
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The event loop removes the need for growing locking complexities and
synchronization. Now we simply use channels. The event loop only runs
while there is active work to do.
relayManager remains no-op inside magicsock for the time being.
endpoints are never 'relayCapable' and therefore endpoint & Conn will
not feed CallMeMaybeVia or allocation events into it.
A number of relayManager events remain unimplemented, e.g.
CallMeMaybeVia reception and relay handshaking.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
In this PR, we make the "user-dial-routes" behavior default on all platforms except for iOS and Android.
It can be disabled by setting the TS_DNS_FORWARD_USE_ROUTES envknob to 0 or false.
Updates #12027
Updates #13837
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Set Cross-Origin-Opener-Policy: same-origin for all browser requests to
prevent window.location manipulation by malicious origins.
Updates tailscale/corp#28480
Thank you to Triet H.M. Pham for the report.
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Commit 4b525fdda (ssh/tailssh: only chdir incubator process to user's
homedir when necessary and possible, 2024-08-16) defers changing the
working directory until the incubator process drops its privileges.
However, it didn't account for the case where there is no incubator
process, because no tailscaled was found on the PATH. In that case, it
only intended to run `tailscaled be-child` in the root directory but
accidentally ran everything there.
Fixes: #15350
Signed-off-by: Simon Law <sfllaw@sfllaw.ca>
ServerEndpoint will be used within magicsock and potentially elsewhere,
which should be possible without needing to import the server
implementation itself.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
I'd moved the osshare calls to feature/taildrop hooks, but forgot to
remove them from ipnlocal, or lost them during a rebase.
But then I noticed cmd/tailscaled also had some, so turn those into a
hook.
Updates #12614
Change-Id: I024fb1d27fbcc49c013158882ee5982c2737037d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is an integration test that covers all the code in Direct, Auto, and
LocalBackend that processes NetMaps and creates a Filter. The test uses
tsnet as a convenient proxy for setting up all the client pieces correctly,
but is not actually a test specific to tsnet.
Updates tailscale/corp#20514
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
Android is Linux, but doesn't use Linux DNS managers (or D-Bus).
Updates #12614
Change-Id: I487802ac74a259cd5d2480ac26f7faa17ca8d1c3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This handle can be used in tests and debugging to identify the specific
client connection.
Updates tailscale/corp#28368
Change-Id: I48cc573fc0bcf018c66a18e67ad6c4f248fb760c
Signed-off-by: Brian Palmer <brianp@tailscale.com>
Android is Linux, but that not much Linux.
Updates #12614
Change-Id: Ice80bd3e3d173511c30d05a43d25a31e18928db7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
None of them are applicable to the common tsnet use cases.
If somebody wants one of them, they can empty import it.
Updates #12614
Change-Id: I3d7f74b555eed22e05a09ad667e4572a5bc452d8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For consistency with other flags, per Slack chat.
Updates #5902
Change-Id: I7ae1e4c97b37185573926f5fafda82cf8b46f071
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The defaultEnv and defaultBool functions are copied over temporarily
to minimise diff. This lays the ground work for having both the operator
and the new k8s-proxy binary implement the API proxy
Updates #13358
Change-Id: Ieacc79af64df2f13b27a18135517bb31c80a5a02
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Until we turn on AAAA by default (which might make some people rely on
Happy Eyeballs for targets without IPv6), this lets people turn it on
explicitly if they want.
We still should add a peer cap as well in the future to let a peer
explicitly say that it's cool with IPv6.
Related: #9574
Updates #1813
Updates #1152
Change-Id: Iec6ec9b4b5db7a4dc700ecdf4a11146cc5303989
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is a hack, but should suffice and be fast enough.
I really want to figure out what's keeping that writable fd open.
Fixes#15868
Change-Id: I285d836029355b11b7467841d31432cc5890a67e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Cleanup after #15866. It was using a mix of "b" and "c" before. But "b"
is ambiguous with LocalBackend's usual "b".
Updates #12614
Change-Id: I8c2e84597555ec3db0d783a00ac1c12549ce6706
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
As just discussed on Slack with @nickkhyl.
Updates #12614
Change-Id: I138dd7eaffb274494297567375d969b4122f3f50
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
relayManager will eventually be responsible for handling the allocation
and handshaking of UDP relay server endpoints.
relay servers are endpoint-independent, and Conn must already maintain
handshake state for all endpoints. This justifies a new data structure
to fill these roles.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Previously all tests shared their tailscale+tailscaled binaries in
system /tmp directories, which often leaked, and required TestMain to
clean up (which feature/taildrop didn't use).
This makes it use testing.T.TempDir for the binaries, but still only
builds them once and efficiently as possible depending on the OS
copies them around between each test's temp dir.
Updates #15812
Change-Id: I0e2585613f272c3d798a423b8ad1737f8916f527
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Conn.handleDiscoMessage() now makes a distinction between relay
handshake disco messages and peer disco messages.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Taildrop has never had an end-to-end test since it was introduced.
This adds a basic one.
It caught two recent refactoring bugs & one from 2022 (0f7da5c7dc).
This is prep for moving the rest of Taildrop out of LocalBackend, so
we can do more refactorings with some confidence.
Updates #15812
Change-Id: I6182e49c5641238af0bfdd9fea1ef0420c112738
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This fixes a refactoring bug introduced in 8b72dd7873
Tests (that failed on this) are coming in a separate change.
Updates #15812
Change-Id: Ibbf461b4eaefe22ad3005fc243d0a918e8af8981
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* util/linuxfw: fix delete snat rule
This pr is fixing the bug that in nftables mode setting snat-subnet-routes=false doesn't
delete the masq rule in nat table.
Updates #15661
Signed-off-by: Kevin Liang <kevinliang@tailscale.com>
* change index arithmetic in test to chunk
Signed-off-by: Kevin Liang <kevinliang@tailscale.com>
* reuse rule creation function in rule delete
Signed-off-by: Kevin Liang <kevinliang@tailscale.com>
* add test for deleting the masq rule
Signed-off-by: Kevin Liang <kevinliang@tailscale.com>
---------
Signed-off-by: Kevin Liang <kevinliang@tailscale.com>
This fixes the Taildrop deadlock from 8b72dd7873.
Fixes#15824
Change-Id: I5ca583de20dd0d0b513ce546439dc632408ca1f1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Conn.sendDiscoMessage() now verifies if the destination disco key is
associated with any known peer(s) in a thread-safe manner.
Updates #15844
Signed-off-by: Jordan Whited <jordan@tailscale.com>
And also validate opts for unknown types, before other side effects.
Fixes#15833
Change-Id: I4cabe16c49c5b7566dcafbec59f2cd1e0c8b4b3c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Instead of using the version package (which depends on
tailcfg.CurrentCapabilityVersion) to get the git commit hash, do it
directly using debug.BuildInfo. This way, when changing struct fields in
tailcfg, we can successfully `go generate` it without compiler errors.
Updates #9634
Updates https://github.com/tailscale/corp/issues/26717
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
TS_CONTROL_IS_PLAINTEXT_HTTP no longer does anything as of
8fd471ce57
Updates #13597
Change-Id: I32ae7f8c5f2a2632e80323b1302a36295ee00736
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In prep for Taildrop integration tests using them from another package.
Updates #15812
Change-Id: I6a995de4e7400658229d99c90349ad5bd1f503ae
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So it can be exported & used by other packages in future changes.
Updates #15812
Change-Id: I319000989ebc294e29c92be7f44a0e11ae6f7761
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We were missing this metric, but it can be important for some workloads.
Varz memstats output allocation cost reduced from 30 allocs per
invocation to 1 alloc per invocation.
Updates tailscale/corp#28033
Signed-off-by: James Tucker <james@tailscale.com>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We update profileManager to allow registering a single state (profile+prefs) change hook.
This is to invert the dependency between the profileManager and the LocalBackend, so that
instead of LocalBackend asking profileManager for the state, we can have profileManager
call LocalBackend when the state changes.
We also update feature.Hook with a new (*feature.Hook).GetOk method to avoid calling both
IsSet and Get.
Updates tailscale/corp#28014
Updates #12614
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This further minimizes the number of places where the profile manager updates the current profile and prefs.
We also document a scenario where an implicit profile switch can occur.
We should be able to address it after (partially?) inverting the dependency between
LocalBackend and profileManager, so that profileManager notifies LocalBackend
of profile changes instead of the other way around.
Updates tailscale/corp#28014
Updates #12614
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The previous strategy assumed clients maintained adequate state to
understand the relationship between endpoint allocation and the server
it was allocated on.
magicsock will not have awareness of the server's disco key
pre-allocation, it only understands peerAPI address at this point. The
second client to allocate on the same server could trigger
re-allocation, breaking a functional relay server endpoint.
If magicsock needs to force reallocation we can add opt-in behaviors
for this later.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We had an ordered set type (set.Slice) already but we occasionally want
to do the same thing with a map, preserving the order things were added,
so add that too, as mapsx.OrderedMap[K, V], and then use in ipnext.
Updates #12614
Change-Id: I85e6f5e11035571a28316441075e952aef9a0863
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Now that 25c4dc5fd7 removed unregistering hooks and made them into
slices, just expose the slices and remove the setter funcs.
This removes boilerplate ceremony around adding new hooks.
This does export the hooks and make them mutable at runtime in theory,
but that'd be a data race. If we really wanted to lock it down in the
future we could make the feature.Hooks slice type be an opaque struct
with an All() iterator and a "frozen" bool and we could freeze all the
hooks after init. But that doesn't seem worth it.
This means that hook registration is also now all in one place, rather
than being mixed into ProfilesService vs ipnext.Host vs FooService vs
BarService. I view that as a feature. When we have a ton of hooks and
the list is long, then we can rearrange the fields in the Hooks struct
as needed, or make sub-structs, or big comments.
Updates #12614
Change-Id: I05ce5baa45a61e79c04591c2043c05f3288d8587
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The typical way to implement union types in Go
is to use an interface where the set of types is limited.
However, there historically has been poor support
in v1 "encoding/json" with interface types where
you can marshal such values, but fail to unmarshal them
since type information about the concrete type is lost.
The MakeInterfaceCoders function constructs custom
marshal/unmarshal functions such that the type name
is encoded in the JSON representation.
The set of valid concrete types for an interface
must be statically specified for this to function.
Updates tailscale/corp#22024
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
These were likely added after everything else was updated to use tsd.NewSystem,
in a feature branch, and before it was merged back into main.
Updates #15160
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Both are populated from the current netmap's MagicDNSSuffix.
But building a full ipnstate.Status (with peers!) is expensive and unnecessary.
Updates #cleanup
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Replace all instances of interface{} with any to resolve the
golangci-lint errors that appeared in the previous tsidp PR.
Updates #cleanup
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
tstime.GoDuration JSON serializes with time.Duration.String(), which is
more human-friendly than nanoseconds.
ServerEndpoint is currently experimental, therefore breaking changes
are tolerable.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
The encoding/json/v2 effort may end up changing
the default represention of time.Duration in JSON.
See https://go.dev/issue/71631
The GoDuration type allows us to explicitly use
the time.Duration.String representation regardless of
whether we serialize with v1 or v2 of encoding/json.
Updates tailscale/corp#27502
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
We added this helper in 1e2e319e7d. Remove this copy.
Updates #cleanup
Change-Id: I5b0681acc23692beed35951c9902ac9ceca0a8b9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The relay server is still permanently disabled until node attribute
changes are wired up in a future commit.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
QNAP now requires builds to be signed with an HSM.
This removes support for signing with a local keypair.
This adds support for signing with a Google Cloud hosted key.
The key should be an RSA key with protection level `HSM` and that uses PSS padding and a SHA256 digest.
The GCloud project, keyring and key name are passed in as command-line arguments.
The GCloud credentials and the PEM signing certificate are passed in as Base64-encoded command-line arguments.
Updates tailscale/corp#23528
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This adds a feature/taildrop package, a ts_omit_taildrop build tag,
and starts moving code to feature/taildrop. In some cases, code
remains where it was but is now behind a build tag. Future changes
will move code to an extension and out of LocalBackend, etc.
Updates #12614
Change-Id: Idf96c61144d1a5f707039ceb2ff59c99f5c1642f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When an event bus is plumbed in, use it to subscribe and react to port mapping
updates instead of using the client's callback mechanism. For now, the callback
remains available as a fallback when an event bus is not provided.
Updates #15160
Change-Id: I026adca44bf6187692ee87ae8ec02641c12f7774
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
When an event bus is configured publish an event each time a new port mapping
is updated. Publication is unconditional and occurs prior to calling any
callback that is registered. For now, the callback is still fired in a separate
goroutine as before -- later, those callbacks should become subscriptions to
the published event.
For now, the event type is defined as a new type here in the package. We will
want to move it to a more central package when there are subscribers. The event
wrapper is effectively a subset of the data exported by the internal mapping
interface, but on a concrete struct so the bus plumbing can inspect it.
Updates #15160
Change-Id: I951f212429ac791223af8d75b6eb39a0d2a0053a
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
In preparation for adding more parameters (and later, moving some away), rework
the portmapper constructor to accept its arguments on a Config struct rather
than positionally.
This is a breaking change to the function signature, but one that is very easy
to update, and a search of GitHub reveals only six instances of usage outside
clones and forks of Tailscale itself, that are not direct copies of the code
fixed up here.
While we could stub in another constructor, I think it is safe to let those
folks do the update in-place, since their usage is already affected by other
changes we can't test for anyway.
Updates #15160
Change-Id: I9f8a5e12b38885074c98894b7376039261b43f43
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Although, at the moment, we do not yet require an event bus to be present, as
we start to add more pieces we will want to ensure it is always available. Add
a new constructor and replace existing uses of new(tsd.System) throughout.
Update generated files for import changes.
Updates #15160
Change-Id: Ie5460985571ade87b8eac8b416948c7f49f0f64b
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This feature is "registered" as an ipnlocal.Extension, and
conditionally linked depending on GOOS and ts_omit_relayserver build
tag.
The feature is not linked on iOS in attempt to limit the impact to
binary size and resulting effect of pushing up against NetworkExtension
limits. Eventually we will want to support the relay server on iOS,
specifically on the Apple TV. Apple TVs are well-fitted to act as
underlay relay servers as they are effectively always-on servers.
This skeleton begins to tie a PeerAPI endpoint to a net/udprelay.Server.
The PeerAPI endpoint is currently no-op as
extension.shouldRunRelayServer() always returns false. Follow-up commits
will implement extension.shouldRunRelayServer().
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Bump to latest 22.x LTS release for node as the 18.x line is going EOL this month.
Updates https://github.com/tailscale/corp/issues/27737
Signed-off-by: Mario Minardi <mario@tailscale.com>
https://github.com/tailscale/tailscale/pull/15395 changed the logic to
skip `EditPrefs` when the platform doesn't support auto-updates. But the
old logic would only fail `EditPrefs` if the auto-update value was
`true`. If it was `false`, `EditPrefs` would succeed and store `false`
in prefs. The new logic will keep the value `unset` even if the tailnet
default is `false`.
Fixes#15691
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
In this PR, we enable extensions to track changes in the current prefs. These changes can result from a profile switch
or from the user or system modifying the current profile’s prefs. Since some extensions may want to distinguish between
the two events, while others may treat them similarly, we rename the existing profile-change callback to become
a profile-state-change callback and invoke it whenever the current profile or its preferences change. Extensions can still
use the sameNode parameter to distinguish between situations where the profile information, including its preferences,
has been updated but still represents the same tailnet node, and situations where a switch to a different profile has been made.
Having dedicated prefs-change callbacks is being considered, but currently seems redundant. A single profile-state-change callback
is easier to maintain. We’ll revisit the idea of adding a separate callback as we progress on extracting existing features from LocalBackend,
but the conversion to a profile-state-change callback is intended to be permanent.
Finally, we let extensions retrieve the current prefs or profile state (profile info + prefs) at any time using the new
CurrentProfileState and CurrentPrefs methods. We also simplify the NewControlClientCallback signature to exclude
profile prefs. It’s optional, and extensions can retrieve the current prefs themselves if needed.
Updates #12614
Updates tailscale/corp#27645
Updates tailscale/corp#26435
Updates tailscale/corp#27502
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This change introduces an Age column in the output for all custom
resources to enhance visibility into their lifecycle status.
Fixes#15499
Signed-off-by: satyampsoni <satyampsoni@gmail.com>
[G,S]etWindowLongPtrW are not available on 32-bit Windows, where [G,S]etWindowLongW should be used instead.
The initial revision of #14945 imported the win package for calling and other Win32 API functions, which exported
the correct API depending on the platform. However, the same logic wasn't implemented when we removed
the win package dependency in a later revision, resulting in panics on Windows 10 x86 (there's no 32-bit Windows 11).
In this PR, we update the ipn/desktop package to use either [G,S]etWindowLongPtrW or [G,S]etWindowLongW
depending on the platform.
Fixes#15684
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Query for the const quad-100 reverse DNS name, for which a forward
record will also be served. This test was previously dependent on
search domain behavior, and now it is not.
Updates #15607
Signed-off-by: Jordan Whited <jordan@tailscale.com>
updates tailscale/tailscale#13476
On darwin, os.Hostname is no longer reliable when called
from a sandboxed process. To fix this, we will allow clients
to set an optional callback to query the hostname via an
alternative native API.
We will leave the default implementation as os.Hostname since
this works perfectly well for almost everything besides sandboxed
darwin clients.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Allow builds to be outputted to a specific directory.
By default, or if unset, artifacts are written to PWD/dist.
Updates tailscale/corp#27638
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
In this PR, we add two methods to facilitate extension lookup by both extensions,
and non-extensions (e.g., PeerAPI or LocalAPI handlers):
- FindExtensionByName returns an extension with the specified name.
It can then be type asserted to a given type.
- FindMatchingExtension is like errors.As, but for extensions.
It returns the first extension that matches the target type (either a specific extension
or an interface).
Updates tailscale/corp#27645
Updates tailscale/corp#27502
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Because we derive v6 addresses from v4 addresses we only need to store
the v4 address, not both.
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
In this PR, we refactor the LocalBackend extension system, moving from direct callbacks to a more organized extension host model.
Specifically, we:
- Extract interface and callback types used by packages extending LocalBackend functionality into a new ipn/ipnext package.
- Define ipnext.Host as a new interface that bridges extensions with LocalBackend.
It enables extensions to register callbacks and interact with LocalBackend in a concurrency-safe, well-defined, and controlled way.
- Move existing callback registration and invocation code from ipnlocal.LocalBackend into a new type called ipnlocal.ExtensionHost,
implementing ipnext.Host.
- Improve docs for existing types and methods while adding docs for the new interfaces.
- Add test coverage for both the extracted and the new code.
- Remove ipn/desktop.SessionManager from tsd.System since ipn/desktop is now self-contained.
- Update existing extensions (e.g., ipn/auditlog and ipn/desktop) to use the new interfaces where appropriate.
We're not introducing new callback and hook types (e.g., for ipn.Prefs changes) just yet, nor are we enhancing current callbacks,
such as by improving conflict resolution when more than one extension tries to influence profile selection via a background profile resolver.
These further improvements will be submitted separately.
Updates #12614
Updates tailscale/corp#27645
Updates tailscale/corp#26435
Updates tailscale/corp#18342
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Adds a new diagram for ProxyGroups running in Ingress mode.
Documentation is currently not publicly available, but a link needs
adding once it is.
Updates tailscale/corp#24795
Change-Id: I0d5dd6bf6f0e1b8b0becae848dc97d8b4bfb9ccb
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Ultimately, we'd like to get rid of the concept of the "current user". It is only used on Windows,
but even then it doesn't work well in multi-user and enterprise/managed Windows environments.
In this PR, we update LocalBackend and profileManager to decouple them a bit more from this obsolete concept.
This is done in a preparation for extracting ipnlocal.Extension-related interfaces and types, and using them
to implement optional features like tailscale/corp#27645, instead of continuing growing the core ipnlocal logic.
Notably, we rename (*profileManager).SetCurrentUserAndProfile() to SwitchToProfile() and change its signature
to accept an ipn.LoginProfileView instead of an ipn.ProfileID and ipn.WindowsUserID. Since we're not removing
the "current user" completely just yet, the method sets the current user to the owner of the target profile.
We also update the profileResolver callback type, which is typically implemented by LocalBackend extensions,
to return an ipn.LoginProfileView instead of ipn.ProfileID and ipn.WindowsUserID.
Updates tailscale/corp#27645
Updates tailscale/corp#18342
Signed-off-by: Nick Khyl <nickk@tailscale.com>
ResourceCheck was previously using cmp.Diff on multiline goroutine stacks
The produced output was difficult to read for a number of reasons:
- the goroutines were sorted by count, and a changing count caused them to
jump around
- diffs would be in the middle of stacks
Instead, we now parse the pprof/goroutines?debug=1 format goroutines and
only diff whole stacks.
Updates #1253
Signed-off-by: Paul Scott <paul@tailscale.com>
Default tags to `$TAGS` if set, so that people can choose arbitrary
subsets of features.
Updates #12614
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Fix the index out of bound panic when a request is made to the local
fileserver mux with a valid secret-token, but missing share name.
Example error:
http: panic serving 127.0.0.1:40974: runtime error: slice bounds out of range [2:1]
Additionally, we document the edge case behavior of utilities that
this fileserver mux depends on.
Signed-off-by: Craig Hesling <craig@hesling.com>
This makes sure that the log target override is respected even if a
custom HTTP client is passed to logpolicy.
Updates tailscale/maple#29
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
The http.StatusMethodNotAllowed status code was being erroneously
set instead of http.StatusBadRequest in multiple places.
Updates #cleanup
Signed-off-by: Jordan Whited <jordan@tailscale.com>
In this PR, we update the Windows client updater to:
- Run msiexec with logging enabled and preserve the log file in %ProgramData%\Tailscale\Logs;
- Preserve the updater's own log file in the same location;
- Properly handle ERROR_SUCCESS_REBOOT_REQUIRED, ERROR_SUCCESS_REBOOT_INITIATED, and ERROR_INSTALL_ALREADY_RUNNING exit codes. The first two values indicate that installation
completed successfully and no further retries are needed. The last one means the Windows Installer
service is busy. Retrying immediately is likely to fail and may be risky; it could uninstall the current version
without successfully installing the new one, potentially leaving the user without Tailscale.
Updates tailscale/corp#27496
Updates tailscale#15554
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Installer tests only run when changes are made to pkgserve. This PR schedules
these tests to be run daily and report any failures to Slack.
Fixestailscale/corp#19103
Signed-off-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com>
This flag is currently no-op and hidden. The flag does round trip
through the related pref. Subsequent commits will tie them to
net/udprelay.Server. There is no corresponding "tailscale up" flag,
enabling/disabling of the relay server will only be supported via
"tailscale set".
This is a string flag in order to support disablement via empty string
as a port value of 0 means "enable the server and listen on a random
unused port". Disablement via empty string also follows existing flag
convention, e.g. advertise-routes.
Early internal discussions settled on "tailscale set --relay="<port>",
but the author felt this was too ambiguous around client vs server, and
may cause confusion in the future if we add related flags.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
`tailscale set` was created to set preferences, which used to be
overloaded into `tailscale up`. To move people over to the new
command, `up` was supposed to be frozen and no new preference flags
would be added. But people forgot, there was no test to warn them, and
so new flags were added anyway.
TestUpFlagSetIsFrozen complains when new flags are added to
`tailscale up`. It doesn’t try all combinations of GOOS, but since
the CI builds in every OS, the pull-request tests should cover this.
Updates #15460
Signed-off-by: Simon Law <sfllaw@sfllaw.ca>
Ensure no services are advertised as part of shutting down tailscaled.
Prefs are only edited if services are currently advertised, and they're
edited we wait for control's ~15s (+ buffer) delay to failover.
Note that editing prefs will trigger a synchronous write to the state
Secret, so it may fail to persist state if the ProxyGroup is getting
scaled down and therefore has its RBAC deleted at the same time, but that
failure doesn't stop prefs being updated within the local backend,
doesn't affect connectivity to control, and the state Secret is
about to get deleted anyway, so the only negative side effect is a harmless
error log during shutdown. Control still learns that the node is no
longer advertising the service and triggers the failover.
Note that the first version of this used a PreStop lifecycle hook, but
that only supports GET methods and we need the shutdown to trigger side
effects (updating prefs) so it didn't seem appropriate to expose that
functionality on a GET endpoint that's accessible on the k8s network.
Updates tailscale/corp#24795
Change-Id: I0a9a4fe7a5395ca76135ceead05cbc3ee32b3d3c
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
As IPv4 and IPv6 end up with different MSS and different congestion
control strategies, proxying between them can really amplify TCP
meltdown style conditions in many real world network conditions, such as
with higher latency, some loss, etc.
Attempt to match up the protocols, otherwise pick a destination address
arbitrarily. Also shuffle the target address to spread load across
upstream load balancers.
Updates #15367
Signed-off-by: James Tucker <james@tailscale.com>
When we have an old cert that is being rotated, include it in the order.
If we're in the ARI-recommended rotation window, LE should exclude us
from rate limits. If we're not within that window, the order still
succeeds, so there's no risk in including the old cert.
Fixes#15542
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
The test suite had grown to about 20s on my machine, but it doesn't
do much taxing work so was a good candidate to parallelise. Now runs
in under 2s on my machine.
Updates #cleanup
Change-Id: I2fcc6be9ca226c74c0cb6c906778846e959492e4
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
The earlier #15534 prevent some dup string flags. This does it for all
flag types.
Updates #6813
Change-Id: Iec2871448394ea9a5b604310bdbf7b499434bf01
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
tsconsensus enables tsnet.Server instances to form a consensus.
tsconsensus wraps hashicorp/raft with
* the ability to do discovery via tailscale tags
* inter node communication over tailscale
* routing of commands to the leader
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
So we can link open source contributors to it.
Updates #cleanup
Change-Id: I02f612b38db9594f19b3be5d982f58c136120e9a
Co-authored-by: James Sanderson <jsanderson@tailscale.com>
Co-authored-by: Will Norris <will@tailscale.com>
Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Some CLI flags support multiple values separated by commas. These flags
are intended to be declared only once and will silently ignore subsequent
instances. This will now throw an error if multiple instances of advertise-tags
and advertise-routes are detected.
Fixes#6813
Signed-off-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com>
Ensure that the upstream is always queried, so that if upstream is going
to NXDOMAIN natc will also return NXDOMAIN rather than returning address
allocations.
At this time both IPv4 and IPv6 are still returned if upstream has a
result, regardless of upstream support - this is ~ok as we're proxying.
Rewrite the tests to be once again slightly closer to integration tests,
but they're still very rough and in need of a refactor.
Further refactors are probably needed implementation side too, as this
removed rather than added units.
Updates #15367
Signed-off-by: James Tucker <james@tailscale.com>
This adds netx.DialFunc, unifying a type we have a bazillion other
places, giving it now a nice short name that's clickable in
editors, etc.
That highlighted that my earlier move (03b47a55c7) of stuff from
nettest into netx moved too much: it also dragged along the memnet
impl, meaning all users of netx.DialFunc who just wanted netx for the
type definition were instead also pulling in all of memnet.
So move the memnet implementation netx.Network into memnet, a package
we already had.
Then use netx.DialFunc in a bunch of places. I'm sure I missed some.
And plenty remain in other repos, to be updated later.
Updates tailscale/corp#27636
Change-Id: I7296cd4591218e8624e214f8c70dab05fb884e95
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I added yet another one in 6d117d64a2 but that new one is at the
best place int he dependency graph and has the best name, so let's use
that one for everything possible.
types/lazy can't use it for circular dependency reasons, so unexport
that copy at least.
Updates #cleanup
Change-Id: I25db6b6a0d81dbb8e89a0a9080c7f15cbf7aa770
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We want to be able to use the netx.Network (and RealNetwork
implemementation) outside of tests, without linking "testing".
So split out the non-test stuff of nettest into its own package.
We tend to use "foox" as the convention for things we wish were in the
standard library's foo package, so "netx" seems consistent.
Updates tailscale/corp#27636
Change-Id: I1911d361f4fbdf189837bf629a20f2ebfa863c44
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This fixes a bug in the local client where the DELETE request was
not being sent correctly. The route was missing a slash before the url
and this now matches the switch profile function.
Signed-off-by: Esteban-Bermudez <esteban@bermudezaguirre.com>
To avoid ephemeral port / TIME_WAIT exhaustion with high --count
values, and to eventually detect leaked connections in tests. (Later
the memory network will register a Cleanup on the TB to verify that
everything's been shut down)
Updates tailscale/corp#27636
Change-Id: Id06f1ae750d8719c5a75d871654574a8226d2733
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For future in-memory network changes (#15558) to be able to be
stricter and do automatic leak detection when it's safe to do so, in
non-parallel tests.
Updates tailscale/corp#27636
Change-Id: I50f03b16a3f92ce61a7ed88264b49d8c6628f638
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Make the perPeerState objects able to function independently without a
shared reference to the connector.
We don't currently change the values from connector that perPeerState
uses at runtime. Explicitly copying them at perPeerState creation allows
us to, for example, put the perPeerState into a consensus algorithm in
the future.
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
This shouldn't be necessary, but while we're continuing to figure out
the root cause, this is better than having to restart the app or switch
profiles on the command line.
Updates #15528
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Android >=14 forbids the use of netlink sockets, and in some configurations
can kill apps that try.
Fixes#9836
Signed-off-by: David Anderson <dave@tailscale.com>
The regular android app constructs its own wgengine with
additional FFI shims, so this default codepath only affects
other handcrafted buids like tsnet, which do not let the
caller customize the innards of wgengine.
Android >=14 forbids the use of netlink sockets, which makes
the standard linux router fail to initialize.
Fixes#9836
Signed-off-by: David Anderson <dave@tailscale.com>
So we can link tailscale and tailscaled together into one.
Updates #5794
Change-Id: I9a8b793c64033827e4188931546cbd64db55982e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To ease local debugging and have fewer moving pieces while bringing up
Plan 9 support.
Updates #5794
Change-Id: I2dc98e73bbb0d4d4730dc47203efc0550a0ac0a0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Otherwise this was repeated closing control/derp connections all the time
on netmon changes. Arguably we should do this on all platforms?
Updates #5794
Change-Id: If6bbeff554235f188bab2a40ab75e08dd14746b2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This wasn't right; it was spinning up new goroutines non-stop.
Revert to a boring localhost TCP implementation for now.
Updates #5794
Change-Id: If93caa20a12ee4e741c0c72b0d91cc0cc5870152
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Not currently used in the OSS tree, a View for tailcfg.VIPService will
make implementing some server side changes easier.
Updates tailscale/corp#26272
Change-Id: If1ed0bea4eff8c4425d3845b433a1c562d99eb9e
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
Avoid the unbounded runtime during random allocation, if random
allocation fails after a first pass at random through the provided
ranges, pick the next free address by walking through the allocated set.
The new ipx utilities provide a bitset based allocation pool, good for
small to moderate ranges of IPv4 addresses as used in natc.
Updates #15367
Signed-off-by: James Tucker <james@tailscale.com>
fixestailscale/corp#27506
The source address link selection on sandboxed macOS doesn't deal
with loopback addresses correctly. This adds an explicit check to ensure
we return the loopback interface for loopback addresses instead of the
default empty interface.
Specifically, this allows the dns resolver to route queries to a loopback
IP which is a common tactic for local DNS proxies.
Tested on both macos, macsys and tailscaled. Forwarded requests to
127/8 all bound to lo0.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
This commit implements an experimental UDP relay server. The UDP relay
server leverages the Disco protocol for a 3-way handshake between
client and server, along with 3 new Disco message types for said
handshake. These new Disco message types are also considered
experimental, and are not yet tied to a capver.
The server expects, and imposes, a Geneve (Generic Network
Virtualization Encapsulation) header immediately following the underlay
UDP header. Geneve protocol field values have been defined for Disco
and WireGuard. The Geneve control bit must be set for the handshake
between client and server, and unset for messages relayed between
clients through the server.
Updates tailscale/corp#27101
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Add the golang-image-ico package, which is an incredibly small package
to handle the ICO container format with PNG inside. Some profile photos
look quite pixelated when displayed at this size, but it's better than
nothing, and any Windows support is just a bonus anyway.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Otherwise you can get stuck finding minor ones nonstop.
Fixes#15484
Change-Id: I7f98ac338c0b32ec1b9fdc47d053207b5fc1bf23
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It only affected js/wasm and tamago.
Updates tailscale/corp#24697
Change-Id: I8fd29323ed9b663fe3fd8d4a86f26ff584a3e134
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
initPeerAPIListener may be returning early unexpectedly. Add debug logging to
see what causes it to return early when it does.
Updates #14393
Signed-off-by: Percy Wegmann <percy@tailscale.com>
If we previously knew of macaddresses of a node, and they
suddenly goes to zero, ignore them and return the previous
hardware addresses.
Updates tailscale/corp#25168
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
For hooking up websocket VM clients to natlab.
Updates #13038
Change-Id: Iaf728b9146042f3d0c2d3a5e25f178646dd10951
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Re-enable HA Ingress again that was disabled for 1.82 release.
This reverts commit fea74a60d5.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Not all platforms have hardlinks, or not easily.
This lets a "tailscale" wrapper script set an environment variable
before calling tailscaled.
Updates #2233
Change-Id: I9eccc18651e56c106f336fcbbd0fd97a661d312e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In this PR, we update ipnlocal.LocalBackend to allow registering callbacks for control client creation
and profile changes. We also allow to register ipnauth.AuditLogFunc to be called when an auditable
action is attempted.
We then use all this to invert the dependency between the auditlog and ipnlocal packages and make
the auditlog functionality optional, where it only registers its callbacks via ipnlocal-provided hooks
when the auditlog package is imported.
We then underscore-import it when building tailscaled for Windows, and we'll explicitly
import it when building xcode/ipn-go-bridge for macOS. Since there's no default log-store
location for macOS, we'll also need to call auditlog.SetStoreFilePath to specify where
pending audit logs should be persisted.
Fixes#15394
Updates tailscale/corp#26435
Updates tailscale/corp#27012
Signed-off-by: Nick Khyl <nickk@tailscale.com>
LocalBackend transitions to ipn.NoState when switching to a different (or new) profile.
When this happens, we should unconfigure wgengine to clear routes, DNS configuration,
firewall rules that block all traffic except to the exit node, etc.
In this PR, we update (*LocalBackend).enterStateLockedOnEntry to do just that.
Fixes#15316
Updates tailscale/corp#23967
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The default values for `tailscale up` and `tailscale set` are supposed
to agree for all common flags. But they don’t for `--accept-routes`
on Windows and from the Mac OS App Store, because `tailscale up`
computes this value based on the operating system:
user@host:~$ tailscale up --help 2>&1 | grep -A1 accept-routes
--accept-dns, --accept-dns=false
accept DNS configuration from the admin panel (default true)
user@host:~$ tailscale set --help 2>&1 | grep -A1 accept-routes
--accept-dns, --accept-dns=false
accept DNS configuration from the admin panel
Luckily, `tailscale set` uses `ipn.MaskedPrefs`, so the default values
don’t logically matter. But someone will get the wrong idea if they
trust the `tailscale set --help` documentation.
In addition, `ipn.Prefs.RouteAll` defaults to true so it disagrees
with both of the flags above.
This patch makes `--accept-routes` use the same logic for in both
commands by hoisting the logic that was buried in `cmd/tailscale/cli`
to `ipn.Prefs.DefaultRouteAll`. Then, all three of defaults can agree.
Fixes: #15319
Signed-off-by: Simon Law <sfllaw@sfllaw.ca>
The default values for `tailscale up` and `tailscale set` are supposed
to agree on all common flags. But they don’t for `--accept-dns`:
user@host:~$ tailscale up --help 2>&1 | grep -A1 accept-dns
--accept-dns, --accept-dns=false
accept DNS configuration from the admin panel (default true)
user@host:~$ tailscale set --help 2>&1 | grep -A1 accept-dns
--accept-dns, --accept-dns=false
accept DNS configuration from the admin panel
Luckily, `tailscale set` uses `ipn.MaskedPrefs`, so the default values
don’t logically matter. But someone will get the wrong idea if they
trust the `tailscale set --help` documentation.
This patch makes `--accept-dns` default to true in both commands and
also introduces `TestSetDefaultsMatchUpDefaults` to prevent any future
drift.
Fixes: #15319
Signed-off-by: Simon Law <sfllaw@sfllaw.ca>
Temporarily make sure that the HA Ingress reconciler does not run,
as we do not want to release this to stable just yet.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We now have a tailscale/alpine-base:3.19 use that as the default base image.
Updates tailscale/tailscale#15328
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This is a very dumb fix as it has an unbounded worst case runtime. IP
allocation needs to be done in a more sane way in a follow-up.
Updates #15367
Signed-off-by: James Tucker <james@tailscale.com>
Bumps Alpine 3.18 -> 3.19.
Alpine 3.19 links iptables to nftables-based
implementation that can break hosts that don't
support nftables.
Link iptables back to the legacy implementation
till we have some certainty that changing to
nftables based implementation will not break existing
setups.
Updates tailscale/tailscale#15328
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
cmd/{k8s-operator,containerboot}: check TLS cert before advertising VIPService
- Ensures that Ingress status does not advertise port 443 before
TLS cert has been issued
- Ensure that Ingress backends do not advertise a VIPService
before TLS cert has been issued, unless the service also
exposes port 80
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
ipn/store/kubestore: skip cache for the write replica in cert share mode
This is to avoid issues where stale cache after Ingress recreation
causes the certs not to be re-issued.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
When compiled into TailscaleKit.framework (via the libtailscale
repository), os.Executable() returns an error instead of the name of the
executable. This commit adds another branch to the switch statement that
enumerates platforms which behave in this manner, and defaults to
"tsnet" in the same manner as those other platforms.
Fixes#15410.
Signed-off-by: James Nugent <james@jen20.com>
Minimal mitigation that doesn't do the full refactor that's probably
warranted.
Updates #15402
Change-Id: I79fd91de0e0661d25398f7d95563982ed1d11561
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
fixestailscale/tailscale#15394
In the current iteration, usage of the memstore for the audit
logger is expected on some platforms.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Add a label which differentiates the address family
for STUN checks.
Also initialize the derpprobe_attempts_total and
derpprobe_seconds_total metrics by adding 0 for
the alternate fail/ok case.
Updates tailscale/corp#27249
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
On Windows and Android, peerAPIListeners may be initialized after a link change.
This commit adds log statements to make it easier to trace this flow.
Updates #14393
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Only send a stored raw map message in reply to a streaming map response.
Otherwise a non-streaming map response might pick it up first, and
potentially drop it. This guarantees that a map response sent via
AddRawMapResponse will be picked up by the main map response loop in the
client.
Fixes#15362
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
These tests aren't perfect, nor is this complete coverage, but this is a
set of coverage that is at least stable.
Updates #15367
Signed-off-by: James Tucker <james@tailscale.com>
Currently nobody calls SetTailscaleInterfaceName yet, so this is a
no-op. I checked oss, android, and the macOS/iOS client. Nobody calls
this, or ever did.
But I want to in the future.
Updates #15408
Updates #9040
Change-Id: I05dfabe505174f9067b929e91c6e0d8bc42628d7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To let you easily run multiple tailscaled instances for development
and let you route CLI commands to the right one.
Updates #15145
Change-Id: I06b6a7bf024f341c204f30705b4c3068ac89b1a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I noticed logs on one of my machines where it can't auto-update with
scary log spam about "failed to apply tailnet-wide default for
auto-updates".
This avoids trying to do the EditPrefs if we know it's just going to
fail anyway.
Updates #282
Change-Id: Ib7db3b122185faa70efe08b60ebd05a6094eed8c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Noticed while working on a dev tool that uses local.Client.
Updates #cleanup
Change-Id: I981efff74a5cac5f515755913668bd0508a4aa14
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Switch from using the Comment field to a ts-scoped annotation for
tracking which operators are cooperating over ownership of a
VIPService.
Updates tailscale/corp#24795
Change-Id: I72d4a48685f85c0329aa068dc01a1a3c749017bf
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
cmd/k8s-operator,k8s-operator: allow using LE staging endpoint for Ingress
Allow to optionally use LetsEncrypt staging endpoint to issue
certs for Ingress/HA Ingress, so that it is easier to
experiment with initial Ingress setup without hiting rate limits.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
(*LocalBackend).setControlClientLocked() is called to both set and reset b.cc.
We shouldn't attempt to start the audit logger when b.cc is being reset (i.e., cc is nil).
However, it's fine to start the audit logger if b.cc implements auditlog.Transport, even if it's not a controlclient.Auto but a mock control client.
In this PR, we fix both issues and add an assertion that controlclient.Auto is an auditlog.Transport. This ensures a compile-time failure if controlclient.Auto ever stops being a valid transport due to future interface or implementation changes.
Updates tailscale/corp#26435
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Resetting LocalBackend's netmap without also unconfiguring wgengine to reset routes, DNS, and the killswitch
firewall rules may cause connectivity issues until a new netmap is received.
In some cases, such as when bootstrap DNS servers are inaccessible due to network restrictions or other reasons,
or if the control plane is experiencing issues, this can result in a complete loss of connectivity until the user disconnects
and reconnects to Tailscale.
As LocalBackend handles state resets in (*LocalBackend).resetForProfileChangeLockedOnEntry(), and this includes
resetting the netmap, resetting the current netmap in (*LocalBackend).Start() is not necessary.
Moreover, it's harmful if (*LocalBackend).Start() is called more than once for the same profile.
In this PR, we update resetForProfileChangeLockedOnEntry() to reset the packet filter and remove
the redundant resetting of the netmap and packet filter from Start(). We also update the state machine
tests and revise comments that became inaccurate due to previous test updates.
Updates tailscale/corp#27173
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This adds a portable way to do a raw LocalAPI request without worrying
about the Unix-vs-macOS-vs-Windows ways of hitting the LocalAPI server.
(It was already possible but tedious with 'tailscale debug local-creds')
Updates tailscale/corp#24690
Change-Id: I0828ca55edaedf0565c8db192c10f24bebb95f1b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If conffile is used to configure tailscaled, always update
currently advertised services from conffile, even if they
are empty in the conffile, to ensure that it is possible
to transition to a state where no services are advertised.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This makes the web server running inside tailscaled on 100.100.100.100:80 support requests with `Host: 100.100.100.100:80` and its IPv6 equivalent.
Prior to this commit, the web server replied to such requests with a redirect to the node's Tailscale IP:5252.
Fixes https://github.com/tailscale/tailscale/issues/14415
Signed-off-by: Alex Klyubin <klyubin@gmail.com>
There was a flaky failure case where renaming a TLS hostname for an
ingress might leave the old hostname dangling in tailscaled config. This
happened when the proxygroup reconciler loop had an outdated resource
version of the config Secret in its cache after the
ingress-pg-reconciler loop had very recently written it to delete the
old hostname. As the proxygroup reconciler then did a patch, there was
no conflict and it reinstated the old hostname.
This commit updates the patch to an update operation so that if the
resource version is out of date it will fail with an optimistic lock
error. It also checks for equality to reduce the likelihood that we make
the update API call in the first place, because most of the time the
proxygroup reconciler is not even making an update to the Secret in the
case that the hostname has changed.
Updates tailscale/corp#24795
Change-Id: Ie23a97440063976c9a8475d24ab18253e1f89050
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
updates tailscale/corp#27145
We require a means to trigger a recompilation of the DNS configuration
to pick up new nameservers for platforms where we blend the interface
nameservers from the OS into our DNS config.
Notably, on Darwin, the only API we have at our disposal will, in rare instances,
return a transient error when querying the interface nameservers on a link change if
they have not been set when we get the AF_ROUTE messages for the link
update.
There's a corresponding change in corp for Darwin clients, to track
the interface namservers during NEPathMonitor events, and call this
when the nameservers change.
This will also fix the slightly more obscure bug of changing nameservers
while tailscaled is running. That change can now be reflected in
magicDNS without having to stop the client.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
cmd/k8s-operator: configure HA Ingress replicas to share certs
Creates TLS certs Secret and RBAC that allows HA Ingress replicas
to read/write to the Secret.
Configures HA Ingress replicas to run in read-only mode.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Update the HA Ingress controller to wait until it sees AdvertisedServices
config propagated into at least 1 Pod's prefs before it updates the status
on the Ingress, to ensure the ProxyGroup Pods are ready to serve traffic
before indicating that the Ingress is ready
Updates tailscale/corp#24795
Change-Id: I1b8ce23c9e312d08f9d02e48d70bdebd9e1a4757
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
The use of html/template causes reflect-based linker bloat. Longer
term we have options to bring the UI back to iOS, but for now, cut
it out.
Updates #15297
Signed-off-by: David Anderson <dave@tailscale.com>
Allows the use of tsweb without pulling in all of the heavy prometheus
client libraries, protobuf and so on.
Updates #15160
Signed-off-by: David Anderson <dave@tailscale.com>
This PR adds some custom logic for reading and writing
kube store values that are TLS certs and keys:
1) when store is initialized, lookup additional
TLS Secrets for this node and if found, load TLS certs
from there
2) if the node runs in certs 'read only' mode and
TLS cert and key are not found in the in-memory store,
look those up in a Secret
3) if the node runs in certs 'read only' mode, run
a daily TLS certs reload to memory to get any
renewed certs
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
When the Ingress is updated to a new hostname, the controller does not
currently clean up the old VIPService from control. Fix this up to parse
the ownership comment correctly and write a test to enforce the improved
behaviour
Updates tailscale/corp#24795
Change-Id: I792ae7684807d254bf2d3cc7aa54aa04a582d1f5
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This adds support for using ACL Grants to configure a role for the
auto-provisioned user.
Fixestailscale/corp#14567
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
cmd/containerboot: manage HA Ingress TLS certs from containerboot
When ran as HA Ingress node, containerboot now can determine
whether it should manage TLS certs for the HA Ingress replicas
and call the LocalAPI cert endpoint to ensure initial issuance
and renewal of the shared TLS certs.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Shovel small events through the pipeine as fast as possible in a few basic
configurations, to establish some baseline performance numbers.
Updates #15160
Change-Id: I1dcbbd1109abb7b93aa4dcb70da57f183eb0e60e
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
* ipn/ipnlocal,envknob: add some primitives for HA replica cert share.
Add an envknob for configuring
an instance's cert store as read-only, so that it
does not attempt to issue or renew TLS credentials,
only reads them from its cert store.
This will be used by the Kubernetes Operator's HA Ingress
to enable multiple replicas serving the same HTTPS endpoint
to be able to share the same cert.
Also some minor refactor to allow adding more tests
for cert retrieval logic.
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Allow customizing the title on the debug index page. Also add methods
for registering http.HandlerFunc to make it a little easier on callers.
Updates tailscale/corp#27058
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
The demo program generates a stream of made up bus events between
a number of bus actors, as a way to generate some interesting activity
to show on the bus debug page.
Signed-off-by: David Anderson <dave@tailscale.com>
This adds a new helper to the netmon package that allows us to
rate-limit log messages, so that they only print once per (major)
LinkChange event. We then use this when constructing the portmapper, so
that we don't keep spamming logs forever on the same network.
Updates #13145
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I6e7162509148abea674f96efd76be9dffb373ae4
updates tailscale/corp#26435
Adds client support for sending audit logs to control via /machine/audit-log.
Specifically implements audit logging for user initiated disconnections.
This will require further work to optimize the peristant storage and exclusion
via build tags for mobile:
tailscale/corp#27011tailscale/corp#27012
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Ensure that the src address for a connection is one of the primary
addresses assigned by Tailscale. Not, for example, a virtual IP address.
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
Add support for Cross-Origin XHR requests to the openid-configuration
endpoint to enable clients like Grafana's auto-population of OIDC setup
data from its contents.
Updates https://github.com/tailscale/tailscale/issues/10263
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
fixestailscale/tailscale#15269
Fixes the various CLIs for all of the various flavors of tailscaled on
darwin. The logic in version is updated so that we have methods that
return true only for the actual GUI app (which can beCLI) and the
order of the checks in localTCPPortAndTokenDarwin are corrected so
that the logic works with all 5 combinations of CLI and tailscaled.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
PR #14771 added support for getting certs from alternate ACME servers, but the
certStore caching mechanism breaks unless you install the CA in system roots,
because we check the validity of the cert before allowing a cache hit, which
includes checking for a valid chain back to a trusted CA. For ease of testing,
allow cert cache hits when the chain is unknown to avoid re-issuing the cert
on every TLS request served. We will still get a cache miss when the cert has
expired, as enforced by a test, and this makes it much easier to test against
non-prod ACME servers compared to having to manage the installation of non-prod
CAs on clients.
Updates #14771
Change-Id: I74fe6593fe399bd135cc822195155e99985ec08a
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
And don't return a comma-separated string. That's kinda weird
signature-wise, and not needed by half the callers anyway. The callers
that care can do the join themselves.
Updates #cleanup
Change-Id: Ib5ad51a3c6b663d868eba14fe9dc54b2609cfb0d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
natc itself can't immediately fix the problem, but it can more correctly
error that return bad addresses.
Updates tailscale/corp#26968
Signed-off-by: James Tucker <james@tailscale.com>
This lets debug tools list the types that clients are wielding, so
that they can build a dataflow graph and other debugging views.
Updates #15160
Signed-off-by: David Anderson <dave@tailscale.com>
If any debugging hook might see an event, Publisher.ShouldPublish should
tell its caller to publish even if there are no ordinary subscribers.
Updates #15160
Signed-off-by: David Anderson <dave@tailscale.com>
Enables monitoring events as they flow, listing bus clients, and
snapshotting internal queues to troubleshoot stalls.
Updates #15160
Signed-off-by: David Anderson <dave@tailscale.com>
Delete files from `$(go env GOCACHE)` and `$(go env GOMODCACHE)/cache`
that have not been modified in >= 90 minutes as these files are not
resulting in cache hits on the current branch.
These deltions have resulted in the uploaded / downloaded compressed
cache size to go down to ~1/3 of the original size in some instances
with the extracted size being ~1/4 of the original extraced size.
Updates https://github.com/tailscale/tailscale/issues/15238
Signed-off-by: Mario Minardi <mario@tailscale.com>
We previously retried getting a UPnP mapping when the device returned
error code 725, "OnlyPermanentLeasesSupported". However, we've seen
devices in the wild also return 402, "Invalid Args", when given a lease
duration. Fall back to the no-duration mapping method in these cases.
Updates #15223
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I6a25007c9eeac0dac83750dd3ae9bfcc287c8fcf
We're computing the list of services to hash by iterating over the
values of a map, the ordering of which is not guaranteed. This can cause
the hash to fluctuate depending on the ordering if there's more than one
service hosted by the same host.
Updates tailscale/corp#25733.
Signed-off-by: Naman Sood <mail@nsood.in>
If we get a packet in over some DERP and don't otherwise know how to
reply (no known DERP home or UDP endpoint), this makes us use the
DERP connection on which we received the packet to reply. This will
almost always be our own home DERP region.
This is particularly useful for large one-way nodes (such as
hello.ts.net) that don't actively reach out to other nodes, so don't
need to be told the DERP home of peers. They can instead learn the
DERP home upon getting the first connection.
This can also help nodes from a slow or misbehaving control plane.
Updates tailscale/corp#26438
Change-Id: I6241ec92828bf45982e0eb83ad5c7404df5968bc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For people who can't use LetsEncrypt because it's banned.
Per https://github.com/tailscale/tailscale/issues/11776#issuecomment-2520955317
This does two things:
1) if you run derper with --certmode=manual and --hostname=$IP_ADDRESS
we previously permitted, but now we also:
* auto-generate the self-signed cert for you if it doesn't yet exist on disk
* print out the derpmap configuration you need to use that
self-signed cert
2) teaches derp/derphttp's derp dialer to verify the signature of
self-signed TLS certs, if so declared in the existing
DERPNode.CertName field, which previously existed for domain fronting,
separating out the dial hostname from how certs are validates,
so it's not overloaded much; that's what it was meant for.
Fixes#11776
Change-Id: Ie72d12f209416bb7e8325fe0838cd2c66342c5cf
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Publicly exposed debugging functions will use these hooks to
observe dataflow in the bus.
Updates #15160
Signed-off-by: David Anderson <dave@tailscale.com>
cmd/k8s-operator: ensure HA Ingress can operate in multicluster mode.
Update the owner reference mechanism so that:
- if during HA Ingress resource creation, a VIPService
with some other operator's owner reference is already found,
just update the owner references to add one for this operator
- if during HA Ingress deletion, the VIPService is found to have owner
reference(s) from another operator, don't delete the VIPService, just
remove this operator's owner reference
- requeue after HA Ingress reconciles that resulted in VIPService updates,
to guard against overwrites due to concurrent operations from different
clusters.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Use secure constant time comparisons for the client ID and secret values
during the allowRelyingParty authorization check.
Updates #cleanup
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Now that packets flow for VIPServices, the last piece needed to start
serving them from a ProxyGroup is config to tell the proxy Pods which
services they should advertise.
Updates tailscale/corp#24795
Change-Id: Ic7bbeac8e93c9503558107bc5f6123be02a84c77
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
The natlab VM tests are flaking on GitHub Actions.
To not distract people, disable them for now (unless they're touched
directly) until they're made more reliable, which will be some painful
debugging probably.
Updates #13038
Change-Id: I6570f1cd43f8f4d628a54af8481b67455ebe83dc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This makes the helpers closer in behavior to cancelable contexts
and taskgroup.Single, and makes the worker code use a more normal
and easier to reason about context.Context for shutdown.
Updates #15160
Signed-off-by: David Anderson <dave@tailscale.com>
The Client carries both publishers and subscribers for a single
actor. This makes the APIs for publish and subscribe look more
similar, and this structure is a better fit for upcoming debug
facilities.
Updates #15160
Signed-off-by: David Anderson <dave@tailscale.com>
We are soon going to start assigning shared-in nodes a CGNAT IPv4 in the Hello tailnet when necessary, the same way that normal node shares assign a new IPv4 on conflict.
But Hello wants to display the node's native IPv4, the one it uses in its own tailnet. That IPv4 isn't available anywhere in the netmap today, because it's not normally needed for anything.
We are going to start sending that native IPv4 in the peer node CapMap, only for Hello's netmap responses. This change enables Hello to display that native IPv4 instead, when available.
Updates tailscale/corp#25393
Change-Id: I87480b6d318ab028b41ef149eb3ba618bd7f1e08
Signed-off-by: Brian Palmer <brianp@tailscale.com>
fixestailscale/corp#26806
IsMacSysApp is not returning the correct answer... It looks like the
rest of the code base uses isMacSysExt (when what they really want
to know is isMacSysApp). To fix the immediate issue (localAPI is broken
entirely in corp), we'll add this check to safesocket which lines up with
the other usages, despite the confusing naming.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
fixestailscale/corp#26806
This was still slightly incorrect. We care only if the caller is the macSys
or macOs app. isSandBoxedMacOS doesn't give us the correct answer
for macSys because technically, macsys isn't sandboxed.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Previously, it initialized when the backend was created. This caused two problems:
1. It would not properly switch when changing profiles.
2. If the backend was created before the profile had been selected, Taildrive's shares were uninitialized.
Updates #14825
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Reads use the sanitized form, so unsanitized keys being stored
in memory resulted lookup failures, for example for serve config.
Updates tailscale/tailscale#15134
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We previously were not merging in the TaildropTarget into the PeerStatus because we did not update AddPeer.
Updates tailscale/tailscale#14393
Signed-off-by: kari-ts <kari@tailscale.com>
Implements a KMS input for AWS parameter to support encrypting Tailscale
state
Fixes#14765
Change-Id: I39c0fae4bfd60a9aec17c5ea6a61d0b57143d4ba
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
This commit updates the logic of vipServicesFromPrefsLocked, so that it would return the vipServices list
even when service host is only advertising the service but not yet serving anything. This makes control
always get accurate state of service host in terms of serving a service.
Fixestailscale/corp#26843
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
Add description of the license reports in this directory and brief
instructions for reviewers. I recently needed to convert these to CSV,
so I also wanted to place to stash that regex so I didn't lose it.
Updates tailscale/corp#5780
Signed-off-by: Will Norris <will@tailscale.com>
fixestailscale/corp#26806
Fixes a regression where LocalTCPPortAndToken needs to error out early
if we're not running as sandboxed macos so that we attempt to connect
using the normal unix machinery.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
To avoid duplicate issuances/slowness while the state Secret
contains a mismatched cert and key.
Updates tailscale/tailscale#15134
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The json/v2 prototype is still in flux and the API can/will change.
Statically enforce that types implementing the v2 methods
satisfy the correct interface so that changes to the signature
can be statically detected by the compiler.
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Fix the order of the CSRF handlers (HTTP plaintext context setting,
_then_ enforcement) in the construction of the web UI server. This
resolves false-positive "invalid Origin" 403 exceptions when attempting
to update settings in the web UI.
Add unit test to exercise the CSRF protection failure and success cases
for our web UI configuration.
Updates #14822
Updates #14872
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
The upstream module has seen significant work making
the v1 emulation layer a high fidelity re-implementation
of v1 "encoding/json".
This addresses several upstream breaking changes:
* MarshalJSONV2 renamed as MarshalJSONTo
* UnmarshalJSONV2 renamed as UnmarshalJSONFrom
* Options argument removed from MarshalJSONV2
* Options argument removed from UnmarshalJSONV2
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Ensures default Linux umask 022 for the installer script to
make sure that files created by the installer can be accessed
by other tools, such as apt.
Updates tailscale/tailscale#15133
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
In order to improve latency tracking, we will use an exponentially
weighted moving average that will smooth change over time and suppress
large outlier values.
Updates tailscale/corp#26649
Signed-off-by: James Tucker <james@tailscale.com>
In this PR, we update the LocalBackend so that when the ReconnectAfter policy setting is configured
and a user disconnects Tailscale by setting WantRunning to false in the profile prefs, the LocalBackend
will now start a timer to set WantRunning back to true once the ReconnectAfter timer expires.
We also update the ADMX/ADML policy definitions to allow configuring this policy setting for Windows
via Group Policy and Intune.
Updates #14824
Signed-off-by: Nick Khyl <nickk@tailscale.com>
For tests (in another repo) that use cgo, we'd like to set CGO_ENABLED=1
explicitly when evaluating cross-compiled deps with "go list".
Updates tailscale/corp#26717
Updates tailscale/corp#26737
Change-Id: Ic21a54379ae91688d2456985068a47e73d04a645
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Diff:
7c08383913
This reverts our previous CGO_ENABLED change: c1d3e9e814
It was causing depaware problems and is no longer necessary it seems? Upstream cmd/go is static nowadays.
And pulls in:
[release-branch.go1.24] doc/godebug: mention GODEBUG=fips140
[release-branch.go1.24] cmd/compile: avoid infinite recursion when inlining closures
[release-branch.go1.24] syscall: don't truncate newly created files on Windows
[release-branch.go1.24] runtime: fix usleep on s390x/linux
[release-branch.go1.24] runtime: add some linknames back for `github.com/bytedance/sonic`
Of those, really the only the 2nd and 3rd might affect us.
Updates #15015
Updates tailscale/go#52
Change-Id: I0fa479f8b2d39f43f2dcdff6c28289dbe50b0773
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When LocalAPI returns an AccessDeniedError, display a message in the
menu and hide or disable most other menu items. This currently includes
a placeholder KB link which I'll update if we end up using something
different.
I debated whether to change the app icon to indicate an error, but opted
not to since there is actually nothing wrong with the client itself and
Tailscale will continue to function normally. It's just that the systray
app itself is in a read-only state.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
This method uses `path.Join` to build the URL. Turns out with 1.24 this
started stripping consecutive "/" characters, so "http://..." in baseURL
becomes "http:/...".
Also, `c.Tailnet` is a function that returns `c.tailnet`. Using it as a
path element would encode as a pointer instead of the tailnet name.
Finally, provide a way to prevent escaping of path elements e.g. for `?`
in `acl?details=1`.
Updates #15015
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
* go.toolchain.branch: update to Go 1.24
Updates #15015
Change-Id: I29c934ec17e60c3ac3264f30fbbe68fc21422f4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* cmd/testwrapper: fix for go1.24
Updates #15015
Signed-off-by: Paul Scott <paul@tailscale.com>
* go.mod,Dockerfile: bump to Go 1.24
Also bump golangci-lint to a version that was built with 1.24
Updates #15015
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
---------
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Paul Scott <paul@tailscale.com>
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Co-authored-by: Paul Scott <paul@tailscale.com>
Co-authored-by: Andrew Lytvynov <awly@tailscale.com>
There's nothing about it on
https://github.com/multipath-tcp/mptcp_net-next/issues/ but empirically
MPTCP doesn't support this option on awly's kernel 6.13.2 and in GitHub
actions.
Updates #15015
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
We already reset the always-on override flag when switching profiles and in a few other cases.
In this PR, we update (*LocalBackend).Start() to reset it as well. This is necessary to support
scenarios where Start() is called explicitly, such as when the GUI starts or when tailscale up is used
with additional flags and passes prefs via ipn.Options in a call to Start() rather than via EditPrefs.
Additionally, we update it to apply policy settings to the current prefs, which is necessary
for properly overriding prefs specified in ipn.Options.
Updates #14823
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This will help debug unexpected issues encountered by consumers of the gitops-pusher.
Updates tailscale/corp#26664
Signed-off-by: Percy Wegmann <percy@tailscale.com>
`routeAdvertiser` is the `iplocal.LocalBackend`. Calls to
`Advertise/UnadvertiseRoute` end up calling `EditPrefs` which in turn
calls `authReconfig` which finally calls `readvertiseAppConnectorRoutes`
which calls `AppConnector.DomainRoutes` and gets stuck on a mutex that
was already held when `routeAdvertiser` was called.
Make all calls to `routeAdvertiser` in `app.AppConnector` go through the
execqueue instead as a short-term fix.
Updates tailscale/corp#25965
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Co-authored-by: Irbe Krumina <irbe@tailscale.com>
It's not entirely clear whether this capability will be maintained, or in what form,
so this serves as a warning to that effect.
Updates tailscale/corp#22748
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This will allow Client to be extended with additional functions for internal use.
Updates tailscale/corp#22748
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This allows use of the officially supported control server API,
authenticated with the tsnet node's nodekey.
Updates tailscale/corp#22748
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Even after we remove the deprecated API, we will want to maintain a minimal
API for internal use, in order to avoid importing the external
tailscale.com/client/tailscale/v2 package. This shim exposes only the necessary
parts of the deprecated API for internal use, which gains us the following:
1. It removes deprecation warnings for internal use of the API.
2. It gives us an inventory of which parts we will want to keep for internal use.
Updates tailscale/corp#22748
Signed-off-by: Percy Wegmann <percy@tailscale.com>
testwrapper doesn't work with Go 1.24 and the coverage support is
making it harder to debug.
Updates #15015
Updates tailscale/corp#26659
Change-Id: I0125e881d08c92f1ecef88b57344f6bbb571b569
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Hostinfo.WireIngress is used as a hint that the node intends to use
funnel. We now send another field, IngressEnabled, in cases where
funnel is explicitly enabled, and the logic control-side has
been changed to look at IngressEnabled as well as WireIngress in all
cases where previously the hint was used - so we can now stop sending
WireIngress when IngressEnabled is true to save some bandwidth.
Updates tailscale/tailscale#11572
Updates tailscale/corp#25931
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
In this PR, we enable the registration of LocalBackend extensions to exclude code specific to certain
platforms or environments. We then introduce desktopSessionsExt, which is included only in Windows builds
and only if the ts_omit_desktop_sessions tag is disabled for the build. This extension tracks desktop sessions
and switches to (or remains on) the appropriate profile when a user signs in or out, locks their screen,
or disconnects a remote session.
As desktopSessionsExt requires an ipn/desktop.SessionManager, we register it with tsd.System
for the tailscaled subprocess on Windows.
We also fix a bug in the sessionWatcher implementation where it attempts to close a nil channel on stop.
Updates #14823
Updates tailscale/corp#26247
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This reverts most of 124dc10261 (#10401).
Removing in favour of adding this in CapMaps instead (#14829).
Updates tailscale/corp#16016
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
This will be used by clients to make better decisions on when to warn users
about impending key expiry.
Updates tailscale/corp#16016
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
Introduce new TaildropTargetStatus in PeerStatus
Refactor getTargetStableID to solely rely on Status() instead of calling FileTargets(). This removes a possible race condition between the two calls and provides more detailed failure information if a peer can't receive files.
Updates tailscale/tailscale#14393
Signed-off-by: kari-ts <kari@tailscale.com>
Some clients don't request 'none' authentication. Instead, they immediately supply
a password or public key. This change allows them to do so, but ignores the supplied
credentials and authenticates using Tailscale instead.
Updates #14922
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Bart has had some substantial improvements in internal representation,
update functions, and other optimizations to reduce memory usage and
improve runtime performance.
Updates tailscale/corp#26353
Signed-off-by: James Tucker <james@tailscale.com>
In this PR, we further refactor LocalBackend and Unattended Mode to extract the logic that determines
which profile should be used at the time of the check, such as when a LocalAPI client connects or disconnects.
We then update (*LocalBackend).switchProfileLockedOnEntry to to switch to the profile returned by
(*LocalBackend).resolveBestProfileLocked() rather than to the caller-specified specified profile, and rename it
to switchToBestProfileLockedOnEntry.
This is done in preparation for updating (*LocalBackend).getBackgroundProfileIDLocked to support Always-On
mode by determining which profile to use based on which users, if any, are currently logged in and have an active
foreground desktop session.
Updates #14823
Updates tailscale/corp#26247
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This reverts commit 413fb5b933.
See long story in #14992
Updates #14992
Updates tailscale/corp#26058
Change-Id: I3de7d080443efe47cbf281ea20887a3caf202488
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This adds a new policy definition for the AlwaysOn.Enabled policy setting
as well as the AlwaysOn.OverrideWithReason sub-option.
Updates #14823
Updates tailscale/corp#26247
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Currently, we disconnect Tailscale and reset LocalBackend on Windows when the last LocalAPI client
disconnects, unless Unattended Mode is enabled for the current profile. And the implementation
is somewhat racy since the current profile could theoretically change after
(*ipnserver.Server).addActiveHTTPRequest checks (*LocalBackend).InServerMode() and before it calls
(*LocalBackend).SetCurrentUser(nil) (or, previously, (*LocalBackend).ResetForClientDisconnect).
Additionally, we might want to keep Tailscale running and connected while a user is logged in
rather than tying it to whether a LocalAPI client is connected (i.e., while the GUI is running),
even when Unattended Mode is disabled for a profile. This includes scenarios where the new
AlwaysOn mode is enabled, as well as when Tailscale is used on headless Windows editions,
such as Windows Server Core, where the GUI is not supported. It may also be desirable to switch
to the "background" profile when a user logs off from their device or implement other similar
features.
To facilitate these improvements, we move the logic from ipnserver.Server to ipnlocal.LocalBackend,
where it determines whether to keep Tailscale running when the current user disconnects.
We also update the logic that determines whether a connection should be allowed to better reflect
the fact that, currently, LocalAPI connections are not allowed unless:
- the current UID is "", meaning that either we are not on a multi-user system or Tailscale is idle;
- the LocalAPI client belongs to the current user (their UIDs are the same);
- the LocalAPI client is Local System (special case; Local System is always allowed).
Whether Unattended Mode is enabled only affects the error message returned to the Local API client
when the connection is denied.
Updates #14823
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This PR adds a new package, ipn/desktop, which provides a platform-agnostic
interface for enumerating desktop sessions and registering session callbacks.
Currently, it is implemented only for Windows.
Updates #14823
Signed-off-by: Nick Khyl <nickk@tailscale.com>
WindowsActor is an ipnauth.Actor implementation that represents a logged-in
Windows user by wrapping their Windows user token.
Updates #14823
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The context carries additional information about the actor, such as the
request reason, and is canceled when the actor is done.
Additionally, we implement three new ipn.Actor types that wrap other actors
to modify their behavior:
- WithRequestReason, which adds a request reason to the actor;
- WithoutClose, which narrows the actor's interface to prevent it from being
closed;
- WithPolicyChecks, which adds policy checks to the actor's CheckProfileAccess
method.
Updates #14823
Signed-off-by: Nick Khyl <nickk@tailscale.com>
And add omitempty to the ProfilePicURL too while here. Plenty
of users (and tagged devices) don't have profile pics.
Updates #14988
Change-Id: I6534bc14edb58fe1034d2d35ae2395f09fd7dd0d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is deprecated anyway, and we don't need to be sending
`"Bits":null` on the wire for the majority of clients.
Updates tailscale/corp#20965
Updates tailscale/corp#26353
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I95a3e3d72619389ae34a6547ebf47043445374e1
The machine API docs were still often referring to the nacl boxes
which are no longer present in the client. Fix that up, fix the paths,
add the HTTP methods.
And then delete some unused code I found in the process.
Updates #cleanup
Change-Id: I1591274acbb00a08b7ca4879dfebd5e6b8a9fbcd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This fork golang.org/x/crypto/ssh (at upstream x/crypto git rev e47973b1c1)
into tailscale.com/tempfork/sshtest/ssh so we can hack up the client in weird
ways to simulate other SSH clients seen in the wild.
Two changes were made to the files when they were copied from x/crypto:
* internal/poly1305 imports were replaced by the non-internal version;
no code changes otherwise. It didn't need the internal one.
* all decode-with-passphrase funcs were deleted, to avoid
using the internal package x/crypto/ssh/internal/bcrypt_pbkdf
Then the tests passed.
Updates #14969
Change-Id: Ibf1abebfe608c75fef4da0255314f65e54ce5077
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There’s (*LocalBackend).ResetForClientDisconnect, and there’s also (*LocalBackend).resetForProfileChangeLockedOnEntry.
Both methods essentially did the same thing but in slightly different ways. For example, resetForProfileChangeLockedOnEntry didn’t reset the control client until (*LocalBackend).Start() was called at the very end and didn’t reset the keyExpired flag, while ResetForClientDisconnect didn’t reinitialize TKA.
Since SetCurrentUser can be called with a nil argument to reset the currently connected user and internally calls resetForProfileChangeLockedOnEntry, we can remove ResetForClientDisconnect and let SetCurrentUser and resetForProfileChangeLockedOnEntry handle it.
Updates #14823
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Currently, profileManager filters profiles based on their creator/owner and the "current user"'s UID.
This causes DefaultUserProfileID(uid) to work incorrectly when the UID doesn't match the current user.
While we plan to remove the concept of the "current user" completely, we're not there yet.
In this PR, we fix DefaultUserProfileID by updating profileManager to allow checking profile access
for a given UID and modifying helper methods to accept UID as a parameter when returning
matching profiles.
Updates #14823
Signed-off-by: Nick Khyl <nickk@tailscale.com>
When in tun mode on Linux, AllowedIPs are not automatically added to
netstack because the kernel is responsible for handling subnet routes.
This ensures that virtual IPs are always added to netstack.
When in tun mode, pings were also not being handled, so this adds
explicit support for ping as well.
Fixestailscale/corp#26387
Change-Id: I6af02848bf2572701288125f247d1eaa6f661107
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
These tunings reduced memory usage while the implementation was
struggling with earlier bugs, but will no longer be necessary after
those bugs are addressed.
Depends #14933
Depends #14934
Updates #9707
Updates #10408
Updates tailscale/corp#24483
Updates tailscale/corp#25169
Signed-off-by: James Tucker <james@tailscale.com>
Cubic performs better than Reno in higher BDP scenarios, and enables the
use of the hystart++ implementation contributed by Coder. This improves
throughput on higher BDP links with a much faster ramp.
gVisor is bumped as well for some fixes related to send queue processing
and RTT tracking.
Updates #9707
Updates #10408
Updates #12393
Updates tailscale/corp#24483
Updates tailscale/corp#25169
Signed-off-by: James Tucker <james@tailscale.com>
Originally identified by Coder and documented in their blog post, this
implementation differs slightly as our link endpoint was introduced for
a different purpose, but the behavior is the same: apply backpressure
rather than dropping packets. This reduces the negative impact of large
packet count bursts substantially. An alternative would be to swell the
size of the channel buffer substantially, however that's largely just
moving where buffering occurs and may lead to reduced signalling back to
lower layer or upstream congestion controls.
Updates #9707
Updates #10408
Updates #12393
Updates tailscale/corp#24483
Updates tailscale/corp#25169
Signed-off-by: James Tucker <james@tailscale.com>
The gVisor RACK implementation appears to perfom badly, particularly in
scenarios with higher BDP. This may have gone poorly noticed as a result
of it being gated on SACK, which is not enabled by default in upstream
gVisor, but itself has a higher positive impact on performance. Both the
RACK and DACK implementations (which are now one) have overlapping
non-completion of tasks in their work streams on the public tracker.
Updates #9707
Signed-off-by: James Tucker <james@tailscale.com>
Incorrect disabled support for not having a mesh key in
d5316a4fbb
Allow for no mesh key to be set.
Fixes#14928
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Since dynamic reload of setec is not supported
in derper at this time, close the server after
the secret is loaded.
Updates tailscale/corp#25756
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
updates tailscale/corp#25687
The darwin appstore and standalone clients now support XPC and the keychain for passing user credentials securely between the gui process and an NEVPNExtension hosted tailscaled. Clients that can communicate directly with the network extension, via XPC or the keychain, are now expected to call SetCredentials and supply credentials explicitly, fixing issues with the cli breaking if the current user cannot read the contents of /Library/Tailscale due to group membership restrictions. This matches how those clients source and supply credentials to the localAPI http client.
Non-platform-specific code that has traditionally been in the client is moved to safesocket.
/Libraray/Tailscaled/sameuserproof has its permissions changed to that it's readably only by users in the admin group. This restricts standalone CLI access for and direct use of localAPI to admins.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
It was moved in f57fa3cbc3.
Updates tailscale/corp#22748
Change-Id: I19f965e6bded1d4c919310aa5b864f2de0cd6220
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
localclient_aliases.go was missing some package level functions from client/local.
This adds them.
Updates tailscale/corp#22748
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Shells on OpenBSD don't support the -l option. This means that when
handling SSH in-process, we can't give the user a login shell, but this
change at least allows connecting at all.
Updates #13338
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Something I accidentally added in #14217.
It doesn't seem to impact Intune or the Administrative Templates MMC extension,
but it should still be fixed.
Updates #cleanup
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Shells on FreeBSD don't support the -l option. This means that when
handling SSH in-process, we can't give the user a login shell, but this
change at least allows connecting at all.
Updates #13338
Signed-off-by: Percy Wegmann <percy@tailscale.com>
A previous PR accidentally logged the key as part
of an error. Remove logging of the key.
Add log print for Setec store steup.
Updates tailscale/corp#25756
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Add setec secret support for derper.
Support dev mode via env var, and setec via secrets URL.
For backwards compatibility use setec load from file also.
Updates tailscale/corp#25756
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
When running via tsnet, c2n will be hooked up so requests to update can
reach the node. But it will then apply whatever OS-specific update
function, upgrading the local tailscaled instead.
We can't update tsnet automatically, so refuse it.
Fixes#14892
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
With #14843 merged, (*localapi.Handler).servePrefs() now requires a non-nil actor,
and other places may soon require it as well.
In this PR, we update localapi.NewHandler with a new required parameter for the actor.
We then update tsnet to use ipnauth.Self.
We also rearrange the code in (*ipnserver.Server).serveHTTP() to pass the actor via Handler's
constructor instead of the field.
Updates #14823
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we move the code that checks the AlwaysOn policy from ipnserver.actor to ipnauth.
It is intended to be used by ipnauth.Actor implementations, and we temporarily make it exported
while these implementations reside in ipnserver and in corp. We'll unexport it later.
We also update [ipnauth.Actor.CheckProfileAccess] to accept an auditLogger, which is called
to write details about the action to the audit log when required by the policy, and update
LocalBackend.EditPrefsAs to use an auditLogger that writes to the regular backend log.
Updates tailscale/corp#26146
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This change:
- reinstates the HA Ingress controller that was disabled for 1.80 release
- fixes the API calls to manage VIPServices as the API was changed
- triggers the HA Ingress reconciler on ProxyGroup changes
Updates tailscale/tailscale#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We once again have a report of a panic from ParseRIB. This panic guard
should probably remain permanent.
Updates #14201
This reverts commit de9d4b2f88.
Signed-off-by: James Tucker <james@tailscale.com>
Without adding this, the packet filter rejects traffic to VIP service
addresses before checking the filters sent in the netmap.
Fixestailscale/corp#26241
Change-Id: Idd54448048e9b786cf4873fd33b3b21e03d3ad4c
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
Many places that need to work with node/peer capabilities end up with a
something-View and need to either reimplement the helper code or make an
expensive copy. We have the machinery to easily handle this now.
Updates #cleanup
Change-Id: Ic3f55be329f0fc6c178de26b34359d0e8c6ca5fc
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
Observed on some airlines (British Airways, WestJet), Squid is
configured to cache and transform these results, which is disruptive.
The server and client should both actively request that this is not done
by setting Cache-Control headers.
Send a timestamp parameter to further work against caches that do not
respect the cache-control headers.
Updates #14856
Signed-off-by: James Tucker <james@tailscale.com>
In this PR, we update client/tailscale.LocalClient to allow sending requests with an optional X-Tailscale-Reason
header. We then update ipn/ipnserver.{actor,Server} to retrieve this reason, if specified, and use it to determine
whether ipnauth.Disconnect is allowed when the AlwaysOn.OverrideWithReason policy setting is enabled.
For now, we log the reason, along with the profile and OS username, to the backend log.
Finally, we update LocalBackend to remember when a disconnect was permitted and do not reconnect automatically
unless the policy changes.
Updates tailscale/corp#26146
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Dots are not allowed in metric names and cause panics. Since we use dots in names like
AlwaysOn.OverrideWithReason, let's replace them with underscores. We don’t want to use
setting.KeyPathSeparator here just yet to make it fully hierarchical, but we will decide as
we progress on the (experimental) AlwaysOn.* policy settings.
tailscale/corp#26146
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The AlwaysOn policy can be applied by (*LocalBackend).applySysPolicy, flipping WantRunning from false to true
before (*LocalBackend).Start() has been called for the first time and set a control client in b.cc. This results in a nil
pointer dereference and a panic when setPrefsLockedOnEntry applies the change and calls controlclient.Client.Login().
In this PR, we fix it by only doing a login if b.cc has been set.
Updates #14823
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The upstream crypto package now supports sending banners at any time during
authentication, so the Tailscale fork of crypto/ssh is no longer necessary.
github.com/tailscale/golang-x-crypto is still needed for some custom ACME
autocert functionality.
tempfork/gliderlabs is still necessary because of a few other customizations,
mostly related to TTY handling.
Originally implemented in 46fd4e58a2,
which was reverted in b60f6b849a to
keep the change out of v1.80.
Updates #8593
Signed-off-by: Percy Wegmann <percy@tailscale.com>
In this PR, we update LocalBackend to set WantRunning=true when applying policy settings
to the current profile's prefs, if the "always-on" mode is enabled.
We also implement a new (*LocalBackend).EditPrefsAs() method, which is like EditPrefs
but accepts an actor (e.g., a LocalAPI client's identity) that initiated the change.
If WantRunning is being set to false, the new EditPrefsAs method checks whether the actor
has ipnauth.Disconnect access to the profile and propagates an error if they do not.
Finally, we update (*ipnserver.actor).CheckProfileAccess to allow a disconnect
only if the "always-on" mode is not enabled by the AlwaysOn policy setting.
This is not a comprehensive solution to the "always-on" mode across platforms,
as instead of disconnecting a user could achieve the same effect by creating
a new empty profile, initiating a reauth, or by deleting the profile.
These are the things we should address in future PRs.
Updates #14823
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The implementations define it to verify whether the actor has the requested access to a login profile.
Updates #14823
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Conventionally, we use views (e.g., ipn.PrefsView, tailcfg.NodeView, etc.) when
dealing with structs that shouldn't be mutated. However, ipn.LoginProfile has been
an exception so far, with a mix of passing and returning LoginProfile by reference
(allowing accidental mutations) and by value (which is wasteful, given its
current size of 192 bytes).
In this PR, we generate an ipn.LoginProfileView and use it instead of passing/returning
LoginProfiles by mutable reference or copying them when passing/returning by value.
Now, LoginProfiles can only be mutated by (*profileManager).setProfilePrefs.
Updates #14823
Signed-off-by: Nick Khyl <nickk@tailscale.com>
SliceEqualAnyOrderFunc had an optimization missing from SliceEqualAnyOrder.
Now they share the same code and both have the optimization.
Updates #14593
Change-Id: I550726e0964fc4006e77bb44addc67be989c131c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
tailscaled's ipn package writes a collection of keys to state after
authenticating to control, but one at a time. If containerboot happens
to send a SIGTERM signal to tailscaled in the middle of writing those
keys, it may shut down with an inconsistent state Secret and never
recover. While we can't durably fix this with our current single-use
auth keys (no atomic operation to auth + write state), we can reduce
the window for this race condition by checking for partial state
before sending SIGTERM to tailscaled. Best effort only.
Updates #14080
Change-Id: I0532d51b6f0b7d391e538468bd6a0a80dbe1d9f7
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Some probes might need to run for longer than their scheduling interval,
so this change relaxes the 1-at-a-time restriction, allowing us to
configure probe concurrency and timeout separately. The default values
remain the same (concurrency of 1; timeout of 80% of interval).
Updates tailscale/corp#25479
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
The HA Ingress functionality is not actually doing anything
valuable yet, so don't run the controller in 1.80 release yet.
Updates tailscale/tailscale#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This change builds on top of #14436 to ensure minimum downtime during egress ProxyGroup update rollouts:
- adds a readiness gate for ProxyGroup replicas that prevents kubelet from marking
the replica Pod as ready before a corresponding readiness condition has been added
to the Pod
- adds a reconciler that reconciles egress ProxyGroup Pods and, for each that is not ready,
if cluster traffic for relevant egress endpoints is routed via this Pod- if so add the
readiness condition to allow kubelet to mark the Pod as ready.
During the sequenced StatefulSet update rollouts kubelet does not restart
a Pod before the previous replica has been updated and marked as ready, so
ensuring that a replica is not marked as ready allows to avoid a temporary
post-update situation where all replicas have been restarted, but none of the
new ones are yet set up as an endpoint for the egress service, so cluster traffic is dropped.
Updates tailscale/tailscale#14326
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
For 9dd6af1f6d
Update client/web and safeweb to correctly signal to the csrf middleware
whether the request is being served over TLS. This determines whether
Origin and Referer header checks are strictly enforced. The gorilla
library previously did not enforce these checks due to a logic bug based
on erroneous use of the net/http.Request API. The patch to fix this also
inverts the library behavior to presume that every request is being
served over TLS, necessitating these changes.
Updates tailscale/corp#25340
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Co-authored-by: Patrick O'Doherty <patrick@tailscale.com>
This reverts commit 46fd4e58a2.
We don't want to include this in 1.80 yet, but can add it back post 1.80.
Updates #8593
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Fixes the configfile reload logic- if the tailscale capver can not
yet be determined because the device info is not yet written to the
state Secret, don't assume that the proxy is pre-110.
Updates tailscale/tailscale#13032
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
cmd/{containerboot,k8s-operator},kube: add preshutdown hook for egress PG proxies
This change is part of work towards minimizing downtime during update
rollouts of egress ProxyGroup replicas.
This change:
- updates the containerboot health check logic to return Pod IP in headers,
if set
- always runs the health check for egress PG proxies
- updates ClusterIP Services created for PG egress endpoints to include
the health check endpoint
- implements preshutdown endpoint in proxies. The preshutdown endpoint
logic waits till, for all currently configured egress services, the ClusterIP
Service health check endpoint is no longer returned by the shutting-down Pod
(by looking at the new Pod IP header).
- ensures that kubelet is configured to call the preshutdown endpoint
This reduces the possibility that, as replicas are terminated during an update,
a replica gets terminated to which cluster traffic is still being routed via
the ClusterIP Service because kube proxy has not yet updated routig rules.
This is not a perfect check as in practice, it only checks that the kube
proxy on the node on which the proxy runs has updated rules. However, overall
this might be good enough.
The preshutdown logic is disabled if users have configured a custom health check
port via TS_LOCAL_ADDR_PORT env var. This change throws a warnign if so and in
future setting of that env var for operator proxies might be disallowed (as users
shouldn't need to configure this for a Pod directly).
This is backwards compatible with earlier proxy versions.
Updates tailscale/tailscale#14326
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This was flagged by @tkhattra on the merge commit; thanks!
Updates tailscale/corp#25479
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ia8045640f02bd4dcc0fe7433249fd72ac6b9cf52
The upstream crypto package now supports sending banners at any time during
authentication, so the Tailscale fork of crypto/ssh is no longer necessary.
github.com/tailscale/golang-x-crypto is still needed for some custom ACME
autocert functionality.
tempfork/gliderlabs is still necessary because of a few other customizations,
mostly related to TTY handling.
Updates #8593
Signed-off-by: Percy Wegmann <percy@tailscale.com>
It was a temporary migration over four years ago. It's no longer
relevant.
Updates #610
Change-Id: I1f00c9485fab13ede6f77603f7d4235222c2a481
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We've been maintaining temporary dev forks of golang.org/x/crypto/{acme,ssh}
in https://github.com/tailscale/golang-x-crypto instead of using
this repo's tempfork directory as we do with other packages. The reason we were
doing that was because x/crypto/ssh depended on x/crypto/ssh/internal/poly1305
and I hadn't noticed there are forwarding wrappers already available
in x/crypto/poly1305. It also depended internal/bcrypt_pbkdf but we don't use that
so it's easy to just delete that calling code in our tempfork/ssh.
Now that our SSH changes have been upstreamed, we can soon unfork from SSH.
That leaves ACME remaining.
This change copies our tailscale/golang-x-crypto/acme code to
tempfork/acme but adds a test that our vendored copied still matches
our tailscale/golang-x-crypto repo, where we can continue to do
development work and rebases with upstream. A comment on the new test
describes the expected workflow.
While we could continue to just import & use
tailscale/golang-x-crypto/acme, it seems a bit nicer to not have that
entire-fork-of-x-crypto visible at all in our transitive deps and the
questions that invites. Showing just a fork of an ACME client is much
less scary. It does add a step to the process of hacking on the ACME
client code, but we do that approximately never anyway, and the extra
step is very incremental compared to the existing tedious steps.
Updates #8593
Updates #10238
Change-Id: I8af4378c04c1f82e63d31bf4d16dba9f510f9199
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously we were depending on the GUI(s) to do it.
By doing it in tailscaled, GUIs can be simplified and be
guaranteed to render consistent results.
If warnable A depends on warnable B, if both A & B are unhealhy, only
B will be shown to the GUI as unhealthy. Once B clears up, only then
will A be presented as unhealthy.
Updates #14687
Change-Id: Id8566f2672d8d2d699740fa053d4e2a2c8009e83
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The c2n handling code was using the Go httptest package's
ResponseRecorder code but that's in a test package which brings in
Go's test certs, etc.
This forks the httptest recorder type into its own package that only
has the recorder and adds a test that we don't re-introduce a
dependency on httptest.
Updates #12614
Change-Id: I3546f49972981e21813ece9064cc2be0b74f4b16
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The hiding of internal packages has hidden things I wanted to see a
few times now. Stop hiding them. This makes depaware.txt output a bit
longer, but not too much. Plus we only really look at it with diffs &
greps anyway; it's not like anybody reads the whole thing.
Updates #12614
Change-Id: I868c89eeeddcaaab63e82371651003629bc9bda8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This protects against rearranging packages and not catching that a BadDeps
package got moved. That would then effectively remove a test.
Updates #12614
Change-Id: I257f1eeda9e3569c867b7628d5bfb252d3354ba6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The AsDebugJSON method (used only for a LocalAPI debug call) always
needed to be updated whenever a new controlknob was added. We had a
test for it, which was nice, but it was a tedious step we don't need
to do. Use reflect instead.
Updates #14788
Change-Id: If59cd776920f3ce7c748f86ed2eddd9323039a0b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We had the debug packet capture code + Lua dissector in the CLI + the
iOS app. Now we don't, with tests to lock it in.
As a bonus, tailscale.com/net/packet and tailscale.com/net/flowtrack
no longer appear in the CLI's binary either.
A new build tag ts_omit_capture disables the packet capture code and
was added to build_dist.sh's --extra-small mode.
Updates #12614
Change-Id: I79b0628c0d59911bd4d510c732284d97b0160f10
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Adds an envknob setting for changing the client's ACME directory URL.
This allows testing cert issuing against LE's staging environment, as
well as enabling local-only test environments, which is useful for
avoiding the production rate limits in test and development scenarios.
Fixes#14761
Change-Id: I191c840c0ca143a20e4fa54ea3b2f9b7cbfc889f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Some natc instances have been observed with excessive memory growth,
dominant in gvisor buffers. It is likely that the connection buffers are
sticking around for too long due to the default long segment time, and
uptuned buffer size applied by default in wgengine/netstack. Apply
configurations in natc specifically which are a better match for the
natc use case, most notably a 5s maximum segment lifetime.
Updates tailscale/corp#25169
Signed-off-by: James Tucker <james@tailscale.com>
Manually update the `web-client-prebuilt` package as the GitHub action
is failing for some reason.
Updates https://github.com/tailscale/tailscale/issues/14568
Signed-off-by: Mario Minardi <mario@tailscale.com>
Removing the advanced options collapsible from the web client login for
now ahead of our next client release.
Updates https://github.com/tailscale/tailscale/issues/14568
Signed-off-by: Mario Minardi <mario@tailscale.com>
The CN field is technically deprecated; set the requested name in a DNS SAN
extension in addition to maximise compatibility with RFC 8555.
Fixes#14762
Change-Id: If5d27f1e7abc519ec86489bf034ac98b2e613043
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
The timeout still defaults to 2 seconds, but can now be changed via command-line flag.
Updates tailscale/corp#26045
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This interface is used both by the DERP client as well as the server.
Defining the interface in derp.go makes it clear that it is shared.
Updates tailscale/corp#26045
Signed-off-by: Percy Wegmann <percy@tailscale.com>
3dabea0fc2 added some docs with inconsistent usage docs.
This fixes them, and adds a test.
It also adds some other tests and fixes other verb tense
inconsistencies.
Updates tailscale/corp#25278
Change-Id: I94c2a8940791bddd7c35c1c3d5fb791a317370c2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In v1.78, we started acquiring the GP lock when reading policy settings. This led to a deadlock during
Tailscale installation via Group Policy Software Installation because the GP engine holds the write lock
for the duration of policy processing, which in turn waits for the installation to complete, which in turn
waits for the service to enter the running state.
In this PR, we prevent the acquisition of GP locks (aka EnterCriticalPolicySection) during service startup
and update the Windows Registry-based util/syspolicy/source.PlatformPolicyStore to handle this failure
gracefully. The GP lock is somewhat optional; it’s safe to read policy settings without it, but acquiring
the lock is recommended when reading multiple values to prevent the Group Policy engine from modifying
settings mid-read and to avoid inconsistent results.
Fixes#14416
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This was a slow memory leak on busy tailnets with lots of tagged
ephemeral nodes.
Updates tailscale/corp#26058
Change-Id: I298e7d438e3ffbb3cde795640e344671d244c632
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Still behind the same ts_omit_tap build tag.
See #14738 for background on the pattern.
Updates #12614
Change-Id: I03fb3d2bf137111e727415bd8e713d8568156ecc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If we fail to parse the upstream DNS response in an app connector, we
might miss new IPs for the target domain. Log parsing errors to be able
to diagnose that.
Updates #14606
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Remove "unexpected" labelling of PeerGoneReasonNotHere.
A peer being no longer connected to a DERP server
is not an unexpected case and causes confusion in looking at logs.
Fixestailscale/corp#25609
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
The new ProxyGroup-based Ingress reconciler is causing a fatal log at
startup because it has the same name as the existing Ingress reconciler.
Explicitly name both to ensure they have unique names that are consistent
with other explicitly named reconcilers.
Updates #14583
Change-Id: Ie76e3eaf3a96b1cec3d3615ea254a847447372ea
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This pulls out the Wake-on-LAN (WoL) code out into its own package
(feature/wakeonlan) that registers itself with various new hooks
around tailscaled.
Then a new build tag (ts_omit_wakeonlan) causes the package to not
even be linked in the binary.
Ohter new packages include:
* feature: to just record which features are loaded. Future:
dependencies between features.
* feature/condregister: the package with all the build tags
that tailscaled, tsnet, and the Tailscale Xcode project
extension can empty (underscore) import to load features
as a function of the defined build tags.
Future commits will move of our "ts_omit_foo" build tags into this
style.
Updates #12614
Change-Id: I9c5378dafb1113b62b816aabef02714db3fc9c4a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* Reapply "ipn/ipnlocal: re-advertise appc routes on startup (#14609)"
This reverts commit 51adaec35a.
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
* ipn/ipnlocal: fix a deadlock in readvertiseAppConnectorRoutes
Don't hold LocalBackend.mu while calling the methods of
appc.AppConnector. Those methods could call back into LocalBackend and
try to acquire it's mutex.
Fixes https://github.com/tailscale/corp/issues/25965Fixes#14606
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
---------
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Updates tailscale/corp#25278
Adds definitions for new CLI commands getting added in v1.80. Refactors some pre-existing CLI commands within the `configure` tree to clean up code.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Rather than using a string everywhere and needing to clarify that the
string should have the svc: prefix, create a separate type for Service
names.
Updates tailscale/corp#24607
Change-Id: I720e022f61a7221644bb60955b72cacf42f59960
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
This commit intend to provide support for TCP and Web VIP services and also allow user to use Tun
for VIP services if they want to.
The commit includes:
1.Setting TCP intercept function for VIP Services.
2.Update netstack to send packet written from WG to netStack handler for VIP service.
3.Return correct TCP hander for VIP services when netstack acceptTCP.
This commit also includes unit tests for if the local backend setServeConfig would set correct TCP intercept
function and test if a hander gets returned when getting TCPHandlerForDst. The shouldProcessInbound
check is not unit tested since the test result just depends on mocked functions. There should be an integration
test to cover shouldProcessInbound and if the returned TCP handler actually does what the serveConfig says.
Updates tailscale/corp#24604
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
We previously baked in the LetsEncrypt x509 root CA for our tlsdial
package.
This moves that out into a new "bakedroots" package and is now also
shared by ipn/ipnlocal's cert validation code (validCertPEM) that
decides whether it's time to fetch a new cert.
Otherwise, a machine without LetsEncrypt roots locally in its system
roots is unable to use tailscale cert/serve and fetch certs.
Fixes#14690
Change-Id: Ic88b3bdaabe25d56b9ff07ada56a27e3f11d7159
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Both @agottardo and I tripped over this today.
Updates #cleanup
Change-Id: I64380a03bfc952b9887b1512dbcadf26499ff1cd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If unable to accept a connection from the bandwidth probe listener,
return from the goroutine immediately since the accepted connection
will be nil.
Updates tailscale/corp#25958
Signed-off-by: Percy Wegmann <percy@tailscale.com>
I saw this panic while writing a new test for #14715:
panic: send on closed channel
goroutine 826 [running]:
tailscale.com/tsnet.(*listener).handle(0x1400031a500, {0x1035fbb00, 0x14000b82300})
/Users/bradfitz/src/tailscale.com/tsnet/tsnet.go:1317 +0xac
tailscale.com/wgengine/netstack.(*Impl).acceptTCP(0x14000204700, 0x14000882100)
/Users/bradfitz/src/tailscale.com/wgengine/netstack/netstack.go:1320 +0x6dc
created by gvisor.dev/gvisor/pkg/tcpip/transport/tcp.(*Forwarder).HandlePacket in goroutine 807
/Users/bradfitz/go/pkg/mod/gvisor.dev/gvisor@v0.0.0-20240722211153-64c016c92987/pkg/tcpip/transport/tcp/forwarder.go:98 +0x32c
FAIL tailscale.com/tsnet 0.927s
Updates #14715
Change-Id: I9924e0a6c2b801d46ee44eb8eeea0da2f9ea17c4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
cmd/k8s-operator: add logic to parse L7 Ingresses in HA mode
- Wrap the Tailscale API client used by the Kubernetes Operator
into a client that knows how to manage VIPServices.
- Create/Delete VIPServices and update serve config for L7 Ingresses
for ProxyGroup.
- Ensure that ingress ProxyGroup proxies mount serve config from a shared ConfigMap.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Adds a new Hostinfo.IngressEnabled bool field that holds whether
funnel is currently enabled for the node. Triggers control update
when this value changes.
Bumps capver so that control can distinguish the new field being false
vs non-existant in previous clients.
This is part of a fix for an issue where nodes with any AllowFunnel
block set in their serve config are being displayed as if actively
routing funnel traffic in the admin panel.
Updates tailscale/tailscale#11572
Updates tailscale/corp#25931
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We throw error early with a warning if users attempt to enable background funnel
for a node that does not allow incoming connections
(shields up), but if it done in foreground mode, we just silently fail
(the funnel command succeeds, but the connections are not allowed).
This change makes sure that we also error early in foreground mode.
Updates tailscale/tailscale#11049
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Updates tailscale/corp#25936
This defines a new syspolicy 'Hostname' and allows an IT administrator to override the value we normally read from os.Hostname(). This is particularly useful on Android and iOS devices, where the hostname we get from the OS is really just the device model (a platform restriction to prevent fingerprinting).
If we don't implement this, all devices on the customer's side will look like `google-pixel-7a-1`, `google-pixel-7a-2`, `google-pixel-7a-3`, etc. and it is not feasible for the customer to use the API or worse the admin console to manually fix these names.
Apply code review comment by @nickkhyl
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Co-authored-by: Nick Khyl <1761190+nickkhyl@users.noreply.github.com>
Since 5297bd2cff, tstun.Wrapper has required its Start
method to be called for it to function. Failure to do so just
results in weird hangs and I've wasted too much time multiple
times now debugging. Hopefully this prevents more lost time.
Updates tailscale/corp#24454
Change-Id: I87f4539f7be7dc154627f8835a37a8db88c31be0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Metrics currently exist for dropped packets by reason, and total
received packets by kind (e.g., `disco` or `other`), but relating these
two together to gleam information about the drop rate for specific
reasons on a per-kind basis is not currently possible.
Change `derp_packets_dropped` to use a `metrics.MultiLabelMap` to
track both the `reason` and `kind` in the same metric to allow for this
desired level of granularity.
Drop metrics that this makes unnecessary (namely `packetsDroppedReason`
and `packetsDroppedType`).
Updates https://github.com/tailscale/corp/issues/25489
Signed-off-by: Mario Minardi <mario@tailscale.com>
Most users should not run into this because it's set in the helm chart
and the deploy manifest, but if namespace is not set we get confusing
authz errors because the kube client tries to fetch some namespaced resources
as though they're cluster-scoped and reports permission denied. Try to
detect namespace from the default projected volume, and otherwise fatal.
Fixes #cleanup
Change-Id: I64b34191e440b61204b9ad30bbfa117abbbe09c3
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
If the server was in use at the time of the initial check, but disconnected and was removed
from the activeReqs map by the time we registered a waiter, the ready channel will never
be closed, resulting in a deadlock. To avoid this, we check whether the server is still busy
after registering the wait.
Fixes#14655
Signed-off-by: Nick Khyl <nickk@tailscale.com>
I made a last-minute change in #14626 to split a single loop that created 1_000 concurrent
connections into an inner and outer loop that create 100 concurrent connections 10 times.
This introduced a race because the last user's connection may still be active (from the server's
perspective) when a new outer iteration begins. Since every new client gets a unique ClientID,
but we reuse usernames and UIDs, the server may let a user in (as the UID matches, which is fine),
but the test might then fail due to a ClientID mismatch:
server_test.go:232: CurrentUser(Initial): got &{S-1-5-21-1-0-0-1001 User-4 <nil> Client-2 false false};
want &{S-1-5-21-1-0-0-1001 User-4 <nil> Client-114 false false}
In this PR, we update (*testIPNServer).blockWhileInUse to check whether the server is currently busy
and wait until it frees up. We then call blockWhileInUse at the end of each outer iteration so that the server
is always in a known idle state at the beginning of the inner loop. We also check that the current user
is not set when the server is idle.
Updates tailscale/corp#25804
Updates #14655 (found when working on it)
Signed-off-by: Nick Khyl <nickk@tailscale.com>
As we look to add github.com/prometheus/client_golang/prometheus to
more parts of the codebase, lock in that we don't use it in tailscaled,
primarily for binary size reasons.
Updates #12614
Change-Id: I03c100d12a05019a22bdc23ce5c4df63d5a03ec6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This test verifies, among other things, that init functions cannot be deferred after (*DeferredFuncs).Do
has already been called and that all subsequent calls to (*DeferredFuncs).Defer return false.
However, the initial implementation of this check was racy: by the time (*DeferredFuncs).Do returned,
not all goroutines that successfully deferred an init function may have incremented the atomic variable
tracking the number of deferred functions. As a result, the variable's value could differ immediately
after (*DeferredFuncs).Do returned and after all goroutines had completed execution (i.e., after wg.Wait()).
In this PR, we replace the original racy check with a different one. Although this new check is also racy,
it can only produce false negatives. This means that if the test fails, it indicates an actual bug rather than
a flaky test.
Fixes#14039
Signed-off-by: Nick Khyl <nickk@tailscale.com>
There's at least one example of stored routes and advertised routes
getting out of sync. I don't know how they got there yet, but this would
backfill missing advertised routes on startup from stored routes.
Also add logging in LocalBackend.AdvertiseRoute to record when new
routes actually get put into prefs.
Updates #14606
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
I moved the actual rename into separate, GOOS-specific files. On
non-Windows, we do a simple os.Rename. On Windows, we first try
ReplaceFile with a fallback to os.Rename if the target file does
not exist.
ReplaceFile is the recommended way to rename the file in this use case,
as it preserves attributes and ACLs set on the target file.
Updates #14428
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
The --mesh-with flag now supports the specification of hostname tuples like
derp1a.tailscale.com/derp1a-vpc.tailscale.com, which instructs derp to mesh
with host 'derp1a.tailscale.com' but dial TCP connections to 'derp1a-vpc.tailscale.com'.
For backwards compatibility, --mesh-with still supports individual hostnames.
The logic which attempts to auto-discover '[host]-vpc.tailscale.com' dial hosts
has been removed.
Updates tailscale/corp#25653
Signed-off-by: Percy Wegmann <percy@tailscale.com>
We have observed some clients with extremely large lists of IPv6
endpoints, in some cases from subnets where the machine also has the
zero address for a whole /48 with then arbitrary addresses additionally
assigned within that /48. It is in general unnecessary for reachability
to report all of these addresses, typically only one will be necessary
for reachability. We report two, to cover some other common cases such
as some styles of IPv6 private address rotations.
Updates tailscale/corp#25850
Signed-off-by: James Tucker <james@tailscale.com>
In this commit, we add a failing test to verify that ipn/ipnserver.Server correctly
sets and unsets the current user when two different clients send requests concurrently
(A sends request, B sends request, A's request completes, B's request completes).
The expectation is that the user who wins the race becomes the current user
from the LocalBackend's perspective, remaining in this state until they disconnect,
after which a different user should be able to connect and use the LocalBackend.
We then fix the second of two bugs in (*Server).addActiveHTTPRequest, where a race
condition causes the LocalBackend's state to be reset after a new client connects,
instead of after the last active request of the previous client completes and the server
becomes idle.
Fixestailscale/corp#25804
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this commit, we add a failing test to verify that ipn/ipnserver.Server correctly
sets and unsets the current user when two different users connect sequentially
(A connects, A disconnects, B connects, B disconnects).
We then fix the test by updating (*ipn/ipnserver.Server).addActiveHTTPRequest
to avoid calling (*LocalBackend).ResetForClientDisconnect again after a new user
has connected and been set as the current user with (*LocalBackend).SetCurrentUser().
Since ipn/ipnserver.Server does not allow simultaneous connections from different
Windows users and relies on the LocalBackend's current user, and since we already
reset the LocalBackend's state by calling ResetForClientDisconnect when the last
active request completes (indicating the server is idle and can accept connections
from any Windows user), it is unnecessary to track the last connected user on the
ipnserver.Server side or call ResetForClientDisconnect again when the user changes.
Additionally, the second call to ResetForClientDisconnect occurs after the new user
has been set as the current user, resetting the correct state for the new user
instead of the old state of the now-disconnected user, causing issues.
Updates tailscale/corp#25804
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We update client/tailscale.LocalClient to allow specifying an optional Transport
(http.RoundTripper) for LocalAPI HTTP requests, and implement one that injects
an ipnauth.TestActor via request headers. We also add several functions and types
to make testing an ipn/ipnserver.Server possible (or at least easier).
We then use these updates to write basic tests for ipnserver.Server,
ensuring it works on non-Windows platforms and correctly sets and unsets
the LocalBackend's current user when a Windows user connects and disconnects.
We intentionally omit tests for switching between different OS users
and will add them in follow-up commits.
Updates tailscale/corp#25804
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In preparation for adding test coverage for ipn/ipnserver.Server, we update it
to use ipnauth.Actor instead of its concrete implementation where possible.
Updates tailscale/corp#25804
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We build up maps of both the existing MagicDNS configuration in hosts
and the desired MagicDNS configuration, compare the two, and only
write out a new one if there are changes. The comparison doesn't need
to be perfect, as the occasional false-positive is fine, but this
should greatly reduce rewrites of the hosts file.
I also changed the hosts updating code to remove the CRLF/LF conversion
stuff, and use Fprintf instead of Frintln to let us write those inline.
Updates #14428
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
This deprecates the old "DERP string" packing a DERP region ID into an
IP:port of 127.3.3.40:$REGION_ID and just uses an integer, like
PeerChange.DERPRegion does.
We still support servers sending the old form; they're converted to
the new form internally right when they're read off the network.
Updates #14636
Change-Id: I9427ec071f02a2c6d75ccb0fcbf0ecff9f19f26f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This doesn't seem to have any immediate impact, but not allowing access via the IPv6 masquerade
address when an IPv4 masquerade address is also set seems like a bug.
Updates #cleanup
Updates #14570 (found when working on it)
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This finishes the work started in #14616.
Updates #8632
Change-Id: I4dc07d45b1e00c3db32217c03b21b8b1ec19e782
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In this PR, we add a generic views.ValuePointer type that can be used as a view for pointers
to basic types and struct types that do not require deep cloning and do not have corresponding
view types. Its Get/GetOk methods return stack-allocated shallow copies of the underlying value.
We then update the cmd/viewer codegen to produce getters that return either concrete views
when available or ValuePointer views when not, for pointer fields in generated view types.
This allows us to avoid unnecessary allocations compared to returning pointers to newly
allocated shallow copies.
Updates #14570
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This amends commit b7e48058c8.
That commit broke all documented ways of starting Tailscale on gokrazy:
https://gokrazy.org/packages/tailscale/ — both Option A (tailscale up)
and Option B (tailscale up --auth-key) rely on the tailscale CLI working.
I verified that the tailscale CLI just prints it help when started
without arguments, i.e. it does not stay running and is not restarted.
I verified that the tailscale CLI successfully exits when started with
tailscale up --auth-key, regardless of whether the node has joined
the tailnet yet or not.
I verified that the tailscale CLI successfully waits and exits when
started with tailscale up, as expected.
fixes https://github.com/gokrazy/gokrazy/issues/286
Signed-off-by: Michael Stapelberg <michael@stapelberg.de>
This will enable Prometheus queries to look at the bandwidth over time windows,
for example 'increase(derp_bw_bytes_total)[1h] / increase(derp_bw_transfer_time_seconds_total)[1h]'.
Fixes commit a51672cafd.
Updates tailscale/corp#25503
Signed-off-by: Percy Wegmann <percy@tailscale.com>
We still use josharian/native (hi @josharian!) via
netlink, but I also sent https://github.com/mdlayher/netlink/pull/220
Updates #8632
Change-Id: I2eedcb7facb36ec894aee7f152c8a1f56d7fc8ba
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
sync.OnceValue and slices.Compact were both added in Go 1.21.
cmp.Or was added in Go 1.22.
Updates #8632
Updates #11058
Change-Id: I89ba4c404f40188e1f8a9566c8aaa049be377754
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Bump the versions to pick up some CVE patches. They don't affect us, but
customer scanners will complain.
Updates #cleanup
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This commit updates the return body of c2n endpoint /vip-services to keep hash generation logic on client side.
Updates tailscale/corp#24510
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
Most of these are effectively no-ops, but appease security scanners.
At least one (x/net for x/net/html) only affect builds from the open source repo,
since we already had it updated in our "corp" repo:
golang.org/x/net v0.33.1-0.20241230221519-e9d95ba163f7
... and that's where we do the official releases from. e.g.
tailscale.io % go install tailscale.com/cmd/tailscaled
tailscale.io % go version -m ~/go/bin/tailscaled | grep x/net
dep golang.org/x/net v0.33.1-0.20241230221519-e9d95ba163f7 h1:raAbYgZplPuXQ6s7jPklBFBmmLh6LjnFaJdp3xR2ljY=
tailscale.io % cd ../tailscale.com
tailscale.com % go install tailscale.com/cmd/tailscaled
tailscale.com % go version -m ~/go/bin/tailscaled | grep x/net
dep golang.org/x/net v0.33.0 h1:74SYHlV8BIgHIFC/LrYkOGIwL19eTYXQ5wc6TBuO36I=
Updates #8043
Updates #14599
Change-Id: I6e238cef62ca22444145a5313554aab8709b33c9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
cmd/containerboot: load containerboot serve config that does not contain HTTPS endpoint in tailnets with HTTPS disabled
Fixes an issue where, if a tailnet has HTTPS disabled, no serve config
set via TS_SERVE_CONFIG was loaded, even if it does not contain an HTTPS endpoint.
Now for tailnets with HTTPS disabled serve config provided to containerboot is considered invalid
(and therefore not loaded) only if there is an HTTPS endpoint defined in the config.
Fixestailscale/tailscale#14495
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
cmd/{k8s-operator,containerboot}: reload tailscaled configfile when its contents have changed
Instead of restarting the Kubernetes Operator proxies each time
tailscaled config has changed, this dynamically reloads the configfile
using the new reload endpoint.
Older annotation based mechanism will be supported till 1.84
to ensure that proxy versions prior to 1.80 keep working with
operator 1.80 and newer.
Updates tailscale/tailscale#13032
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
If the total number of differences is less than a small amount, just do
the dumb quadratic thing and compare every single object instead of
allocating a map.
Updates tailscale/corp#25479
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8931b4355a2da4ec0f19739927311cf88711a840
Extracted from some code written in the other repo.
Updates tailscale/corp#25479
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I6df062fdffa1705524caa44ac3b6f2788cf64595
This will enable Prometheus queries to look at the bandwidth over time windows,
for example 'increase(derp_bw_bytes_total)[1h] / increase(derp_bw_transfer_time_seconds_total)[1h]'.
Updates tailscale/corp#25503
Signed-off-by: Percy Wegmann <percy@tailscale.com>
* cmd/k8s-operator,k8s-operator: allow users to set custom labels for the optional ServiceMonitor
Updates tailscale/tailscale#14381
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
govulncheck flagged a couple fresh vulns in that package:
* https://pkg.go.dev/vuln/GO-2025-3367
* https://pkg.go.dev/vuln/GO-2025-3368
I don't believe these affect us, as we only do any git stuff from
release tooling which is all internal and with hardcoded repo URLs.
Updates #cleanup
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Change the type of the `IPv4` and `IPv6` members in the `nodeData`
struct to be `netip.Addr` instead of `string`.
We were previously calling `String()` on this struct, which returns
"invalid IP" when the `netip.Addr` is its zero value, and passing this
value into the aforementioned attributes.
This caused rendering issues on the frontend
as we were assuming that the value for `IPv4` and `IPv6` would be falsy
in this case.
The zero value for a `netip.Addr` marshalls to an empty string instead
which is the behaviour we want downstream.
Updates https://github.com/tailscale/tailscale/issues/14568
Signed-off-by: Mario Minardi <mario@tailscale.com>
Extracted from some code written in the other repo.
Updates tailscale/corp#25479
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I92c97a63a8f35cace6e89a730938ea587dcefd9b
Currently this does not yet do anything apart from creating
the ProxyGroup resources like StatefulSet.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This commit updates the VIPService c2n endpoint on client to response with actual VIPService configuration stored
in the serve config.
Fixestailscale/corp#24510
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
These erroneously blocked a recent PR, which I fixed by simply
re-running CI. But we might as well fix them anyway.
These are mostly `printf` to `print` and a couple of `!=` to `!Equal()`
Updates #cleanup
Signed-off-by: Will Norris <will@tailscale.com>
Remove the platform specificity, it is unnecessary complexity.
Deduplicate repeated code as a result of reduced complexity.
Split out error identification code.
Update call-sites and tests.
Updates #14551
Updates tailscale/corp#25648
Signed-off-by: James Tucker <james@tailscale.com>
Fixestailscale/tailscale#14563
When creating a NoiseClient, ensure that if any private IP address is provided, with both an `http` scheme and an explicit port number, we do not ever attempt to use HTTPS. We were only handling the case of `127.0.0.1` and `localhost`, but `192.168.x.y` is a private IP as well. This uses the `netip` package to check and adds some logging in case we ever need to troubleshoot this.
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
Observed in the wild some macOS machines gain broken sockets coming out
of sleep (we observe "time jumped", followed by EPIPE on sendto). The
cause of this in the platform is unclear, but the fix is clear: always
rebind if the socket is broken. This can also be created artificially on
Linux via `ss -K`, and other conditions or software on a system could
also lead to the same outcomes.
Updates tailscale/corp#25648
Signed-off-by: James Tucker <james@tailscale.com>
In the process, because I needed it for testing, make all
LocalBackend-managed goroutines be accounted for. And then in tests,
verify they're no longer running during LocalBackend.Shutdown.
Updates tailscale/corp#19681
Change-Id: Iad873d4df7d30103a4a7863dfacf9e078c77e6a3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates #14520
Updates #14517 (in that I pulled this out of there)
Change-Id: Ibc28162816e083fcadf550586c06805c76e378fc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It was failing about an unaccepted risk ("mac-app-connector") because
it was checking runtime.GOOS ("darwin") instead of the test's env.goos
string value ("linux", which doesn't have the warning).
Fixes#14544
Change-Id: I470d86a6ad4bb18e1dd99d334538e56556147835
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Updates #cleanup
Updates #1909 (noticed while working on that)
Change-Id: I505001e5294287ad2a937b4db61d9e67de70fa14
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Fixes#14492
-----
Developer Certificate of Origin
Version 1.1
Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
Developer's Certificate of Origin 1.1
By making a contribution to this project, I certify that:
(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
Change-Id: I6dc1068d34bbfa7477e7b7a56a4325b3868c92e1
Signed-off-by: Marc Paquette <marcphilippaquette@gmail.com>
These were the last two Range funcs in this repo.
Updates #12912
Change-Id: I6ba0a911933cb5fc4e43697a9aac58a8035f9622
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The remaining range funcs in the tree are RangeOverTCPs and
RangeOverWebs in ServeConfig; those will be cleaned up separately.
Updates #12912
Change-Id: Ieeae4864ab088877263c36b805f77aa8e6be938d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And misc cleanup along the way.
Updates #12912
Change-Id: I0cab148b49efc668c6f5cdf09c740b84a713e388
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
While working on #13390, I ran across this non-idiomatic
pointer-to-view and parallel-sorted-map accounting code that was all
just to avoid a sort later.
But the sort later when building a new netmap.NetworkMap is already a
drop in the bucket of CPU compared to how much work & allocs
mapSession.netmap and LocalBackend's spamming of the full netmap
(potentially tens of thousands of peers, MBs of JSON) out to IPNBus
clients for any tiny little change (node changing online status, etc).
Removing the parallel sorted slice let everything be simpler to reason
about, so this does that. The sort might take a bit more CPU time now
in theory, but in practice for any netmap size for which it'd matter,
the quadratic netmap IPN bus spam (which we need to fix soon) will
overshadow that little sort.
Updates #13390
Updates #1909
Change-Id: I3092d7c67dc10b2a0f141496fe0e7e98ccc07712
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Importing the ~deprecated golang.org/x/exp/maps as "xmaps" to not
shadow the std "maps" was getting ugly.
And using slices.Collect on an iterator is verbose & allocates more.
So copy (x)maps.Keys+Values into our slicesx package instead.
Updates #cleanup
Updates #12912
Updates #14514 (pulled out of that change)
Change-Id: I5e68d12729934de93cf4a9cd87c367645f86123a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Using context.CancelFunc as the type (instead of func()) answers
questions like whether it's okay to call it multiple times, whether
it blocks, etc. And that's the type it actually is in this case.
Updates #cleanup
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
On Linux, systray.SetTitle actually seems to set the tooltip on all
desktops I've tested on. But on macOS, it actually does set a title
that is always displayed in the systray area next to the icon. This
change should properly set the tooltip across platforms.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Move a number of global state vars into the Menu struct, keeping things
better encapsulated. The systray package still relies on its own global
state, so only a single Menu instance can run at a time.
Move a lot of the initialization logic out of onReady, in particular
fetching the latest tailscale state. Instead, populate the state before
calling systray.Run, which fixes a timing issue in GNOME (#14477).
This change also creates a separate bgContext for actions not tied menu
item clicks. Because we have to rebuild the entire menu regularly, we
cancel that context as needed, which can cancel subsequent updateState
calls.
Also exit cleanly on SIGINT and SIGTERM.
Updates #1708Fixes#14477
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Refactor code to set app icon and title as part of rebuild, rather than
separately in eventLoop. This fixes several cases where they weren't
getting updated properly. This change also makes use of the new exit
node icons.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
restructure tsLogo to allow setting a mask to be used when drawing the
logo dots, as well as add an overlay icon, such as the arrow when
connected to an exit node.
The icon is still renders as white on black, but this change also
prepare for doing a black on white version, as well a fully transparent
icon. I don't know if we can consistently determine which to use, so
this just keeps the single icon for now.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
metrics.LabelMap grows slightly more heavy, needing a lock to ensure
proper ordering for newly initialized ShardedInt values. An Add method
enables callers to use .Add for both expvar.Int and syncs.ShardedInt
values, but retains the original behavior of defaulting to initializing
expvar.Int values.
Updates tailscale/corp#25450
Co-Authored-By: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: James Tucker <james@tailscale.com>
- rebuild menu when prefs change outside of systray, such as setting an
exit node
- refactor onClick handler code
- compare lowercase country name, the same as macOS and Windows (now
sorts Ukraine before USA)
- fix "connected / disconnected" menu items on stopped status
- prevent nil pointer on "This Device" menu item
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Noted as useful during review of #14448.
Updates #14457
Change-Id: I0f16f08d5b05a8e9044b19ef6c02d3dab497f131
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Remove EOL Ubuntu versions.
Add new Ubuntu LTS.
Update Alpine to test latest version.
Also, make the test run when its workflow is updated and installer.sh isn't.
Updates #cleanup
Signed-off-by: Erisa A <erisa@tailscale.com>
This commit builds the exit node menu including the recommended exit
node, if available, as well as tailnet and mullvad exit nodes.
This does not yet update the menu based on changes in exit node outside
of the systray app, which will come later. This also does not include
the ability to run as an exit node.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
* tailcfg: rename and retype ServiceHost capability, add value type
Updates tailscale/corp#22743.
In #14046, this was accidentally made a PeerCapability when it
should have been NodeCapability. Also, renaming it to use the
nomenclature that we decided on after #14046 went up, and adding
the type of the value that will be passed down in the RawMessage
for this capability.
This shouldn't break anything, since no one was using this string or
variable yet.
Signed-off-by: Naman Sood <mail@nsood.in>
The new menu delay added to fix libdbusmenu systrays causes problems
with KDE. Given the state of wildly varying systray implementations, I
suspect we may need more desktop-specific hacks, so I'm setting this up
to accommodate that.
Updates #1708
Updates #14431
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Bring UI closer to macOS and windows:
- split login and tailnet name over separate lines
- render profile picture (with very simple caching)
- use checkbox to indicate active profile. I've not found any desktops
that can't render checkboxes, so I'd like to explore other options
if needed.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
ShardedInt provides an int type expvar.Var that supports more efficient
writes at high frequencies (one order of magnigude on an M1 Max, much
more on NUMA systems).
There are two implementations of ShardValue, one that abuses sync.Pool
that will work on current public Go versions, and one that takes a
dependency on a runtime.TailscaleP function exposed in Tailscale's Go
fork. The sync.Pool variant has about 10x the throughput of a single
atomic integer on an M1 Max, and the runtime.TailscaleP variant is about
10x faster than the sync.Pool variant.
Neither variant have perfect distribution, or perfectly always avoid
cross-CPU sharing, as there is no locking or affinity to ensure that the
time of yield is on the same core as the time of core biasing, but in
the average case the distributions are enough to provide substantially
better performance.
See golang/go#18802 for a related upstream proposal.
Updates tailscale/go#109
Updates tailscale/corp#25450
Signed-off-by: James Tucker <james@tailscale.com>
Some notification managers crop the application icon to a circle, so
ensure we have enough padding to account for that.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
This new type of probe sends DERP packets sized similarly to CallMeMaybe packets
at a rate of 10 packets per second. It records the round-trip times in a Prometheus
histogram. It also keeps track of how many packets are dropped. Packets that fail to
arrive within 5 seconds are considered dropped.
Updates tailscale/corp#24522
Signed-off-by: Percy Wegmann <percy@tailscale.com>
MutexValue is simply a value guarded by a mutex.
For any type that is not pointer-sized,
MutexValue will perform much better than AtomicValue
since it will not incur an allocation boxing the value
into an interface value (which is how Go's atomic.Value
is implemented under-the-hood).
Updates #cleanup
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
This is an experiment to see how useful we will find it to have some
text-based diagrams to document how various components of the operator
work. There are no plans to link to this from elsewhere yet, but
hopefully it will be a useful reference internally.
Updates #cleanup
Change-Id: If5911ed39b09378fec0492e87738ec0cc3d8731e
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
1ed9bd76d6 meant to make tunAddress be optional.
Updates tailscale/corp#24635
Change-Id: Idc4a8540b294e480df5bd291967024c04df751c0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For https://github.com/tailscale/go/pull/108 so we can depend on it in
other repos. (This repo can't yet use it; we permit building
tailscale/tailscale with the latest stock Go release) But that will be
in Go 1.24. We're just impatient elsewhere and would like it in the
control plane code earlier.
Updates tailscale/corp#25406
Change-Id: I53ff367318365c465cbd02cea387c8ff1eb49fab
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The omitzero tag option has been backported to v1 "encoding/json"
from the "encoding/json/v2" prototype and will land in Go1.24.
Until we fully upgrade to Go1.24, adjust the test to be agnostic
to which version of Go someone is using.
Updates tailscale/corp#25406
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
The go-httpstat package has a data race when used with connections that
are performing happy-eyeballs connection setups as we are in the DERP
client. There is a long-stale PR upstream to address this, however
revisiting the purpose of this code suggests we don't really need
httpstat here.
The code populates a latency table that may be used to compare to STUN
latency, which is a lightweight RTT check. Switching out the reported
timing here to simply the request HTTP request RTT avoids the
problematic package.
Fixestailscale/corp#25095
Signed-off-by: James Tucker <james@tailscale.com>
When we first made Tailscale SSH, we assumed people would want public
key support soon after. Turns out that hasn't been the case; people
love the Tailscale identity authentication and check mode.
In light of CVE-2024-45337, just remove all our public key code to not
distract people, and to make the code smaller. We can always get it
back from git if needed.
Updates tailscale/corp#25131
Updates golang/go#70779
Co-authored-by: Percy Wegmann <percy@tailscale.com>
Change-Id: I87a6e79c2215158766a81942227a18b247333c22
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The errors emitted by util/dnsname are all written at least moderately
friendly and none of them emit sensitive information. They should be
safe to display to end users.
Updates tailscale/corp#9025
Change-Id: Ic58705075bacf42f56378127532c5f28ff6bfc89
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
The IfElse function is equivalent to the ternary (c ? a : b) operator
in many other languages like C. Unfortunately, this function
cannot perform short-circuit evaluation like in many other languages,
but this is a restriction that's not much different
than the pre-existing cmp.Or function.
The argument against ternary operators in Go is that
nested ternary operators become unreadable
(e.g., (c1 ? (c2 ? a : b) : (c2 ? x : y))).
But a single layer of ternary expressions can sometimes
make code much more readable.
Having the bools.IfElse function gives code authors the
ability to decide whether use of this is more readable or not.
Obviously, code authors will need to be judicious about
their use of this helper function.
Readability is more of an art than a science.
Updates #cleanup
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Throughout our codebase we have types that only exist only
to implement an io.Reader or io.Writer, when it would have been
simpler, cleaner, and more readable to use an inlined function literal
that closes over the relevant types.
This is arguably more readable since it keeps the semantic logic
in place rather than have it be isolated elsewhere.
Note that a function literal that closes over some variables
is semantic equivalent to declaring a struct with fields and
having the Read or Write method mutate those fields.
Updates #cleanup
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
This is the start of an integration/e2e test suite for the tailscale operator.
It currently only tests two major features, ingress proxy and API server proxy,
but we intend to expand it to cover more features over time. It also only
supports manual runs for now. We intend to integrate it into CI checks in a
separate update when we have planned how to securely provide CI with the secrets
required for connecting to a test tailnet.
Updates #12622
Change-Id: I31e464bb49719348b62a563790f2bc2ba165a11b
Co-authored-by: Irbe Krumina <irbe@tailscale.com>
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
A method on kc was called unconditionally, even if was not initialized,
leading to a nil pointer dereference when TS_SERVE_CONFIG was set
outside Kubernetes.
Add a guard symmetric with other uses of the kubeClient.
Fixes#14354.
Signed-off-by: Bjorn Neergaard <bjorn@neersighted.com>
Make dev-mode DERP probes work without TLS. Properly dial port `3340`
when not using HTTPS when dialing nodes in `derphttp_client`. Skip
verifying TLS state in `newConn` if we are not running a prober.
Updates tailscale/corp#24635
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Co-authored-by: Percy Wegmann <percy@tailscale.com>
Use envknob to configure the per client send
queue depth for the derp server.
Fixestailscale/corp#24978
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Previously this unit test failed if it was run in a container. Update the assert
to focus on exactly the condition we are trying to assert: the package type
should only be 'container' if we use the build tag.
Updates #14317
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Make argparsing use flag for adding a new
parameter that requires parsing.
Enforce a read timeout deadline waiting for response
from the stun server provided in the args. Otherwise
the program will never exit.
Fixes#14267
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
OAuth clients that were used to generate an auth_key previously
specified the scope 'device'. 'device' is not an actual scope,
the real scope is 'devices'. The resulting OAuth token ended up
including all scopes from the specified OAuth client, so the code
was able to successfully create auth_keys.
It's better not to hardcode a scope here anyway, so that we have
the flexibility of changing which scope(s) are used in the future
without having to update old clients.
Since the qualifier never actually did anything, this commit simply
removes it.
Updates tailscale/corp#24934
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This package grew organically over time and
is an awful mix of explicitly declared options and
globally set parameters via environment variables and
other subtle effects.
Add a new Options and TransportOptions type to
allow for the creation of a Policy or http.RoundTripper
with some set of options.
The options struct avoids the need to add yet more
NewXXX functions for every possible combination of
ordered arguments.
The goal of this refactor is to allow specifying the http.Client
to use with the Policy.
Updates tailscale/corp#18177
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
If previousEtag is empty, then we assume control ACLs were not modified
manually and push the local ACLs. Instead, we defaulted to localEtag
which would be different if local ACLs were different from control.
AFAIK this was always buggy, but never reported?
Fixes#14295
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
We as members, contributors, and leaders pledge to make participation
We are committed to creating an open, welcoming, diverse, inclusive, healthy and respectful community.
in our community a harassment-free experience for everyone, regardless
Unacceptable, harmful and inappropriate behavior will not be tolerated.
of age, body size, visible or invisible disability, ethnicity, sex
characteristics, gender identity and expression, level of experience,
education, socio-economic status, nationality, personal appearance,
race, religion, or sexual identity and orientation.
We pledge to act and interact in ways that contribute to an open,
welcoming, diverse, inclusive, and healthy community.
## Our Standards
## Our Standards
Examples of behavior that contributes to a positive environment for
Examples of behavior that contributes to a positive environment for our community include:
our community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our
mistakes, and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
overall community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or
- Demonstrating empathy and kindness toward other people.
advances of any kind
- Being respectful of differing opinions, viewpoints, and experiences.
* Trolling, insulting or derogatory comments, and personal or
- Giving and gracefully accepting constructive feedback.
political attacks
- Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience.
* Public or private harassment
- Focusing on what is best not just for us as individuals, but for the overall community.
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in
a professional setting
## Enforcement Responsibilities
Examples of unacceptable behavior include without limitation:
Community leaders are responsible for clarifying and enforcing our
- The use of language, imagery or emojis (collectively "content") that is racist, sexist, homophobic, transphobic, or otherwise harassing or discriminatory based on any protected characteristic.
standards of acceptable behavior and will take appropriate and fair
- The use of sexualized content and sexual attention or advances of any kind.
corrective action in response to any behavior that they deem
- The use of violent, intimidating or bullying content.
inappropriate, threatening, offensive, or harmful.
- Trolling, concern trolling, insulting or derogatory comments, and personal or political attacks.
- Public or private harassment.
- Publishing others' personal information, such as a photo, physical address, email address, online profile information, or other personal information, without their explicit permission or with the intent to bully or harass the other person.
- Posting deep fake or other AI generated content about or involving another person without the explicit permission.
- Spamming community channels and members, such as sending repeat messages, low-effort content, or automated messages.
- Phishing or any similar activity.
- Distributing or promoting malware.
- The use of any coded or suggestive content to hide or provoke otherwise unacceptable behavior.
- Other conduct which could reasonably be considered harmful, illegal, or inappropriate in a professional setting.
Community leaders have the right and responsibility to remove, edit,
Please also see the Tailscale Acceptable Use Policy, available at [tailscale.com/tailscale-aup](https://tailscale.com/tailscale-aup).
or reject comments, commits, code, wiki edits, issues, and other
contributions that are not aligned to this Code of Conduct, and will
communicate reasons for moderation decisions when appropriate.
## Scope
## Reporting Incidents
This Code of Conduct applies within all community spaces, and also
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to Tailscale directly via <info@tailscale.com>, or to the community leaders or moderators via DM or similar.
applies when an individual is officially representing the community in
All complaints will be reviewed and investigated promptly and fairly.
public spaces. Examples of representing our community include using an
We will respect the privacy and safety of the reporter of any issues.
official e-mail address, posting via an official social media account,
or acting as an appointed representative at an online or offline
event.
## Enforcement
Please note that this community is not moderated by staff 24/7, and we do not have, and do not undertake, any obligation to prescreen, monitor, edit, or remove any content or data, or to actively seek facts or circumstances indicating illegal activity.
While we strive to keep the community safe and welcoming, moderation may not be immediate at all hours.
If you encounter any issues, report them using the appropriate channels.
Instances of abusive, harassing, or otherwise unacceptable behavior
## Enforcement Guidelines
may be reported to the community leaders responsible for enforcement
at [info@tailscale.com](mailto:info@tailscale.com). All complaints
will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and
Community leaders and moderators are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
security of the reporter of any incident.
## Enforcement Guidelines
Community leaders and moderators have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Community Code of Conduct.
Tailscale retains full discretion to take action (or not) in response to a violation of these guidelines with or without notice or liability to you.
We will interpret our policies and resolve disputes in favor of protecting users, customers, the public, our community and our company, as a whole.
Community leaders will follow these Community Impact Guidelines in
Community leaders will follow these community enforcement guidelines in determining the consequences for any action they deem in violation of this Code of Conduct,
determining the consequences for any action they deem in violation of
and retain full discretion to apply the enforcement guidelines as necessary depending on the circumstances:
this Code of Conduct:
### 1. Correction
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior
Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
deemed unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders,
Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate.
providing clarity around the nature of the violation and an
A public apology may be requested.
explanation of why the behavior was inappropriate. A public apology
may be requested.
### 2. Warning
### 2. Warning
**Community Impact**: A violation through a single incident or series
Community Impact: A violation through a single incident or series of actions.
of actions.
**Consequence**: A warning with consequences for continued
Consequence: A warning with consequences for continued behavior.
behavior. No interaction with the people involved, including
No interaction with the people involved, including unsolicited interaction with those enforcing this Community Code of Conduct, for a specified period of time.
unsolicited interaction with those enforcing the Code of Conduct, for
This includes avoiding interactions in community spaces as well as external channels like social media.
a specified period of time. This includes avoiding interactions in
Violating these terms may lead to a temporary or permanent ban.
community spaces as well as external channels like social
media. Violating these terms may lead to a temporary or permanent ban.
### 3. Temporary Ban
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards,
Community Impact: A serious violation of community standards, including sustained inappropriate behavior.
including sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or
Consequence: A temporary ban from any sort of interaction or public communication with the community for a specified period of time.
public communication with the community for a specified period of
No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.
time. No public or private interaction with the people involved,
including unsolicited interaction with those enforcing the Code of
Conduct, is allowed during this period. Violating these terms may lead
to a permanent ban.
### 4. Permanent Ban
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of
Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
community standards, including sustained inappropriate behavior,
harassment of an individual, or aggression toward or disparagement of
classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction
Consequence: A permanent ban from any sort of public interaction within the community.
within the community.
## Acceptable Use Policy
Violation of this Community Code of Conduct may also violate the Tailscale Acceptable Use Policy, which may result in suspension or termination of your Tailscale account.
For more information, please see the Tailscale Acceptable Use Policy, available at [tailscale.com/tailscale-aup](https://tailscale.com/tailscale-aup).
## Privacy
Please see the Tailscale [Privacy Policy](https://tailscale.com/privacy-policy) for more information about how Tailscale collects, uses, discloses and protects information.
## Attribution
## Attribution
This Code of Conduct is adapted from the [Contributor
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 2.0, available at <https://www.contributor-covenant.org/version/2/0/code_of_conduct.html>.
// GNOME expands submenus downward in the main menu, rather than flyouts to the side.
// Either as a result of that or another limitation, there seems to be a maximum depth of submenus.
// Mullvad countries that have a city submenu are not being rendered, and so can't be selected.
// Handle this by simply treating all mullvad countries as single-city and select the best peer.
hideMullvadCities=true
case"kde":
// KDE doesn't need a delay, and actually won't render submenus
// if we delay for more than about 400µs.
newMenuDelay=0
default:
// Add a slight delay to ensure the menu is created before adding items.
//
// Systray implementations that use libdbusmenu sometimes process messages out of order,
// resulting in errors such as:
// (waybar:153009): LIBDBUSMENU-GTK-WARNING **: 18:07:11.551: Children but no menu, someone's been naughty with their 'children-display' property: 'submenu'
//
// See also: https://github.com/fyne-io/systray/issues/12
newMenuDelay=10*time.Millisecond
}
}
// onReady is called by the systray package when the menu is ready to be built.
// cigocacher is an opinionated-to-Tailscale client for gocached. It connects
// at a URL like "https://ci-gocached-azure-1.corp.ts.net:31364", but that is
// stored in a GitHub actions variable so that its hostname can be updated for
// all branches at the same time in sync with the actual infrastructure.
//
// It authenticates using GitHub OIDC tokens, and all HTTP errors are ignored
// so that its failure mode is just that builds get slower and fall back to
// disk-only cache.
packagemain
import(
"bytes"
"context"
jsonv1"encoding/json"
"errors"
"flag"
"fmt"
"io"
"log"
"net"
"net/http"
"net/url"
"os"
"path/filepath"
"runtime/debug"
"strconv"
"strings"
"sync/atomic"
"time"
"github.com/bradfitz/go-tool-cache/cacheproc"
"github.com/bradfitz/go-tool-cache/cachers"
)
funcmain(){
var(
version=flag.Bool("version",false,"print version and exit")
auth=flag.Bool("auth",false,"auth with cigocached and exit, printing the access token as output")
stats=flag.Bool("stats",false,"fetch and print cigocached stats and exit")
token=flag.String("token","","the cigocached access token to use, as created using --auth")
srvURL=flag.String("cigocached-url","","optional cigocached URL (scheme, host, and port). Empty means to not use one.")
srvHostDial=flag.String("cigocached-host","","optional cigocached host to dial instead of the host in the provided --cigocached-url. Useful for public TLS certs on private addresses.")
dir=flag.String("cache-dir","","cache directory; empty means automatic")
log.Printf("backend addresses for egress target %q have changed old IPs %v, new IPs %v trigger egress config resync",nn.Name(),ips,nn.Addresses().AsSlice())
log.Printf("backend addresses for egress target %q have changed old IPs %v, new IPs %v trigger egress config resync",nn.Name(),ips,nn.Addresses().AsSlice())
http.Error(w,fmt.Sprintf("error determining the number of times health check endpoint should be pinged: %v",err),http.StatusInternalServerError)
return
}
ep.waitTillSafeToShutdown(r.Context(),cfgs,hp)
}
// waitTillSafeToShutdown looks up all egress targets configured to be proxied via this instance and, for each target
// whose configuration includes a healthcheck endpoint, pings the endpoint till none of the responses
// are returned by this instance or till the HTTP request times out. In practice, the endpoint will be a Kubernetes Service for whom one of the backends
// would normally be this Pod. When this Pod is being deleted, the operator should have removed it from the Service
// backends and eventually kube proxy routing rules should be updated to no longer route traffic for the Service to this
ifcfgs==nil||len(*cfgs)==0{// avoid sleeping if no services are configured
return
}
log.Printf("Ensuring that cluster traffic for egress targets is no longer routed via this Pod...")
varwgsync.WaitGroup
fors,cfg:=range*cfgs{
hep:=cfg.HealthCheckEndpoint
ifhep==""{
log.Printf("Tailnet target %q does not have a cluster healthcheck specified, unable to verify if cluster traffic for the target is still routed via this Pod",s)
continue
}
svc:=s
wg.Go(func(){
log.Printf("Ensuring that cluster traffic is no longer routed to %q via this Pod...",svc)