This is for a future test scheduler, so it can run potentially flaky
tests separately, doing all the non-flaky ones together in one batch.
Updates tailscale/corp#28679
Change-Id: Ic4a11f9bf394528ef75792fd622f17bc01a4ec8a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When TS_GO_NEXT=1 is set, update/use the
go.toolchain.next.{branch,rev} files instead.
This lets us do test deploys of Go release candidates on some
backends, without affecting all backends.
Updates tailscale/corp#36382
Change-Id: I00dbde87b219b720be5ea142325c4711f101a364
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The Tailscale CLI has some methods to watch the IPN bus for
messages, say, the current netmap (`tailscale debug netmap`).
The Tailscale daemon supports this using a streaming HTTP
response. Sometimes, the client can close its connection
abruptly -- due to an interruption, or in the case of `debug netmap`,
intentionally after consuming one message.
If the server daemon is writing a response as the client closes
its end of the socket, the daemon typically encounters a "broken pipe"
error. The "Watch IPN Bus" handler currently logs such errors after
they're propagated by a JSON encoding/writer helper.
Since the Tailscale CLI nominally closes its socket with the daemon
in this slightly ungraceful way (viz. `debug netmap`), stop logging
these broken pipe errors as far as possible. This will help avoid
confounding users when they scan backend logs.
Updates #18477
Signed-off-by: Amal Bansode <amal@tailscale.com>
This commit is based on part of #17925, reworked as a separate package.
Add a package that can store and load netmap.NetworkMap values in persistent
storage, using a basic columnar representation. This commit includes a default
storage interface based on plain files, but the interface can be implemented
with more structured storage if we want to later.
The tests are set up to require that all the fields of the NetworkMap are
handled, except those explicitly designated as not-cached, and check that a
fully-populated value can round-trip correctly through the cache. Adding or
removing fields, either in the NetworkMap or in the cached representation, will
trigger either build failures (e.g., for type mismatch) or test failures (e.g.,
for representation changes or missing fields). This isn't quite as nice as
automatically updating the representation, which I also prototyped, but is much
simpler to maintain and less code.
This commit does not yet hook up the cache to the backend, that will be a
subsequent change.
Updates #12639
Change-Id: Icb48639e1d61f2aec59904ecd172c73e05ba7bf9
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Someone asked me if we use DNS-over-HTTPS if the system's resolver is an
IP address that supports DoH and there's no global nameserver set (i.e.
no "Override DNS servers" set). I didn't know the answer offhand, and it
took a while for me to figure it out. The answer is yes, in cases where
we take over the system's DNS configuration and read the base config, we
do upgrade any DoH-capable resolver to use DoH. Here's a test that
verifies this behaviour (and hopefully helps as documentation the next
time someone has this question).
Updates #cleanup
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
If conn25 config is sent in the netmap: add split DNS entries to use
appropriately tagged peers' PeerAPI to resolve DNS requests for those
domains.
This will enable future work where we use the peers as connectors for
the configured domains.
Updates tailscale/corp#34252
Signed-off-by: Fran Bull <fran@tailscale.com>
Similarly to allowing link-local multicast in #13661, we should also allow broadcast traffic
on permitted interfaces when the killswitch is enabled due to exit node usage on Windows.
This always includes internal interfaces, such as Hyper-V/WSL2, and also the LAN when
"Allow local network access" is enabled in the client.
Updates #18504
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This file was never truly necessary and has never actually been used in
the history of Tailscale's open source releases.
A Brief History of AUTHORS files
---
The AUTHORS file was a pattern developed at Google, originally for
Chromium, then adopted by Go and a bunch of other projects. The problem
was that Chromium originally had a copyright line only recognizing
Google as the copyright holder. Because Google (and most open source
projects) do not require copyright assignemnt for contributions, each
contributor maintains their copyright. Some large corporate contributors
then tried to add their own name to the copyright line in the LICENSE
file or in file headers. This quickly becomes unwieldy, and puts a
tremendous burden on anyone building on top of Chromium, since the
license requires that they keep all copyright lines intact.
The compromise was to create an AUTHORS file that would list all of the
copyright holders. The LICENSE file and source file headers would then
include that list by reference, listing the copyright holder as "The
Chromium Authors".
This also become cumbersome to simply keep the file up to date with a
high rate of new contributors. Plus it's not always obvious who the
copyright holder is. Sometimes it is the individual making the
contribution, but many times it may be their employer. There is no way
for the proejct maintainer to know.
Eventually, Google changed their policy to no longer recommend trying to
keep the AUTHORS file up to date proactively, and instead to only add to
it when requested: https://opensource.google/docs/releasing/authors.
They are also clear that:
> Adding contributors to the AUTHORS file is entirely within the
> project's discretion and has no implications for copyright ownership.
It was primarily added to appease a small number of large contributors
that insisted that they be recognized as copyright holders (which was
entirely their right to do). But it's not truly necessary, and not even
the most accurate way of identifying contributors and/or copyright
holders.
In practice, we've never added anyone to our AUTHORS file. It only lists
Tailscale, so it's not really serving any purpose. It also causes
confusion because Tailscalars put the "Tailscale Inc & AUTHORS" header
in other open source repos which don't actually have an AUTHORS file, so
it's ambiguous what that means.
Instead, we just acknowledge that the contributors to Tailscale (whoever
they are) are copyright holders for their individual contributions. We
also have the benefit of using the DCO (developercertificate.org) which
provides some additional certification of their right to make the
contribution.
The source file changes were purely mechanical with:
git ls-files | xargs sed -i -e 's/\(Tailscale Inc &\) AUTHORS/\1 contributors/g'
Updates #cleanup
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
In order to better manage per-profile data resources on the client, add methods
to the LocalBackend to support creation of per-profile directory structures in
local storage. These methods build on the existing TailscaleVarRoot config, and
have the same limitation (i.e., if no local storage is available, it will
report an error when used).
The immediate motivation is to support netmap caching, but we can also use this
mechanism for other per-profile resources including pending taildrop files and
Tailnet Lock authority caches.
This commit only adds the directory-management plumbing; later commits will
handle migrating taildrop, TKA, etc. to this mechanism, as well as caching
network maps.
Updates #12639
Change-Id: Ia75741955c7bf885e49c1ad99f856f669a754169
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
`dnf config-manager addrepo` will fail if the Tailscale repo is already
installed. Without the --overwrite flag, the installer will error out
instead of succeeding like with dnf3.
Fixes#18491
Signed-off-by: Francois Marier <francois@fmarier.org>
tsnet users can now provide a tun.Device, including any custom
implementation that conforms to the interface.
netstack has a new option CheckLocalTransportEndpoints that when used
alongside a TUN enables netstack listens and dials to correctly capture
traffic associated with those sockets. tsnet with a TUN sets this
option, while all other builds leave this at false to preserve existing
performance.
Updates #18423
Signed-off-by: James Tucker <james@tailscale.com>
Every other listen method on tsnet.Server makes this clarification, so
should ListenService.
Fixestailscale/corp#36207
Signed-off-by: Harry Harpham <harry@tailscale.com>
When we have not yet communicated with a peer, send a
TSMPDiscoAdvertisement to let the peer know of our disco key. This is in
most cases redundant, but will allow us to set up direct connections
when the client cannot access control.
Some parts taken from: #18073
Updates #12639
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
New gauge reflects endpoints state via labels:
- open, when both peers are connected and ready to talk, and
- connecting. when at least one peer hasn't connected yet.
Corresponding client metrics are logged as
- udprelay_endpoints_connecting
- udprelay_endpoints_open
Updates tailscale/corp#30820
Change-Id: Idb1baa90a38c97847e14f9b2390093262ad0ea23
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
This commit contains the implementation of multi-tailnet support within the Kubernetes Operator
Each of our custom resources now expose the `spec.tailnet` field. This field is a string that must match the name of an existing `Tailnet` resource. A `Tailnet` resource looks like this:
```yaml
apiVersion: tailscale.com/v1alpha1
kind: Tailnet
metadata:
name: example # This is the name that must be referenced by other resources
spec:
credentials:
secretName: example-oauth
```
Each `Tailnet` references a `Secret` resource that contains a set of oauth credentials. This secret must be created in the same namespace as the operator:
```yaml
apiVersion: v1
kind: Secret
metadata:
name: example-oauth # This is the name that's referenced by the Tailnet resource.
namespace: tailscale
stringData:
client_id: "client-id"
client_secret: "client-secret"
```
When created, the operator performs a basic check that the oauth client has access to all required scopes. This is done using read actions on devices, keys & services. While this doesn't capture a missing "write" permission, it catches completely missing permissions. Once this check passes, the `Tailnet` moves into a ready state and can be referenced. Attempting to use a `Tailnet` in a non-ready state will stall the deployment of `Connector`s, `ProxyGroup`s and `Recorder`s until the `Tailnet` becomes ready.
The `spec.tailnet` field informs the operator that a `Connector`, `ProxyGroup`, or `Recorder` must be given an auth key generated using the specified oauth client. For backwards compatibility, the set of credentials the operator is configured with are considered the default. That is, where `spec.tailnet` is not set, the resource will be deployed in the same tailnet as the operator.
Updates https://github.com/tailscale/corp/issues/34561
fixestailscale/corp#27182
tailscale version --json now includes an osVariant field that will report
one of macsys, appstore or darwin. We can extend this to other
platforms where tailscaled can have multiple personalities.
This also adds the concept of a platform-specific callback for querying
an explicit application identifier. On Apple, we can use
CFBundleGetIdentifier(mainBundle) to get the bundle identifier via cgo.
This removes all the ambiguity and lets us remove other less direct
methods (like env vars, locations, etc).
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Polls IMDS (currently only AWS) for extra IPs to advertise as udprelay.
Updates #17796
Change-Id: Iaaa899ef4575dc23b09a5b713ce6693f6a6a6964
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
* k8s-operator,kube: removing enableSessionRecordings option. It seems
like it is going to create a confusing user experience and it's going to
be a very niche use case, so we have decided to defer this for now.
Updates tailscale/corp#35796
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
* k8s-operator: adding metric for env var deprecation
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
---------
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
net/portmapper: Stop replacing the internal port with the upnp external port
This causes the UPnP mapping to break in the next recreation of the
mapping.
Fixes#18348
Signed-off-by: Eduardo Sorribas <eduardo@sorribas.org>
This change allows tsnet nodes to act as Service hosts by adding a new
function, tsnet.Server.ListenService. Invoking this function will
advertise the node as a host for the Service and create a listener to
receive traffic for the Service.
Fixes#17697Fixestailscale/corp#27200
Signed-off-by: Harry Harpham <harry@tailscale.com>
This change adds API to ipn.LocalBackend to retrieve the ETag when
querying for the current serve config. This allows consumers of
ipn.LocalBackend.SetServeConfig to utilize the concurrency control
offered by ETags. Previous to this change, utilizing serve config ETags
required copying the local backend's internal ETag calcuation.
The local API server was previously copying the local backend's ETag
calculation as described above. With this change, the local API server
now uses the new ETag retrieval function instead. Serve config ETags are
therefore now opaque to clients, in line with best practices.
Fixestailscale/corp#35857
Signed-off-by: Harry Harpham <harry@tailscale.com>
fixestailscale/tailscale#18418
Both Serve and PeerAPI broke when we moved the TailscaleInterfaceName
into State, which is updated asynchronously and may not be
available when we configure the listeners.
This extracts the explicit interface name property from netmon.State
and adds as a static struct with getters that have proper error
handling.
The bug is only found in sandboxed Darwin clients, where we
need to know the Tailscale interface details in order to set up the
listeners correctly (they must bind to our interface explicitly to escape
the network sandboxing that is applied by NECP).
Currently set only sandboxed macOS and Plan9 set this but it will
also be useful on Windows to simplify interface filtering in netns.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Policy editors, such as gpedit.msc and gpme.msc, rely on both the presence and the value of the
registry value to determine whether a policy is enabled. Unless an enabledValue is specified
explicitly, it defaults to REG_DWORD 1.
Therefore, we cannot rely on the same registry value to track the policy configuration state when
it is already used by a policy option, such as a dropdown. Otherwise, while the policy setting
will be written and function correctly, it will appear as Not Configured in the policy editor
due to the value mismatch (for example, REG_SZ "always" vs REG_DWORD 1).
In this PR, we update the DNSRegistration policy setting to use the DNSRegistrationConfigured
registry value for tracking. This change has no effect on the client side and exists solely to
satisfy ADMX and policy editor requirements.
Updates #14917
Signed-off-by: Nick Khyl <nickk@tailscale.com>
gocross-wrapper.ps1 is written to use the version of tar that ships with
Windows; we want to avoid conflicts with any other tar on the PATH, such
ones installed by MSYS and/or Cygwin.
Updates https://github.com/tailscale/corp/issues/29940
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Recently, the golangci-lint workflow has been taking longer and longer
to complete, causing it to timeout after the default of 5 minutes.
Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: context deadline exceeded
Timeout exceeded: try increasing it by passing --timeout option
Although PR #18398 enabled the Go module cache, bootstrapping with a
cold cache still takes too long.
This PR doubles the default 5 minute timeout for golangci-lint to 10
minutes so that golangci-lint can finish downloading all of its
dependencies.
Note that this doesn’t affect the 5 minute timeout configured in
.golangci.yml, since running golangci-lint on your local instance
should still be plenty fast.
Fixes#18366
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Allow for optionally specifying an audience for containerboot. This is
passed to tailscale up to allow for containerboot to use automatic ID
token generation for authentication.
Updates https://github.com/tailscale/corp/issues/34430
Signed-off-by: Mario Minardi <mario@tailscale.com>
Allow for optionally specifiying an audience for tsnet. This is passed
to the underlying identity federation logic to allow for tsnet auth to
use automatic ID token generation for authentication.
Updates https://github.com/tailscale/corp/issues/33316
Signed-off-by: Mario Minardi <mario@tailscale.com>
If local tailscale/tailscale checkout is not available,
pulll cigocacher remotely.
Fall back to ./tool/go if no other Go installation
is present.
Updates tailscale/corp#32493
Signed-off-by: Irbe Krumina <irbekrm@gmail.com>
Adds the ability to detect what provider the client is running on and tries fetch the ID token to use with Workload Identity.
Updates https://github.com/tailscale/corp/issues/33316
Signed-off-by: Danni Popova <danni@tailscale.com>
Recently, the golangci-lint workflow has been taking longer and longer
to complete, causing it to timeout after the default of 5 minutes.
Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: context deadline exceeded
Timeout exceeded: try increasing it by passing --timeout option
This PR upgrades actions/setup-go to version 6, the latest, and
enables caching for Go modules and build outputs. This should speed up
linting because most packages won’t have to be downloaded over and
over again.
Fixes#18366
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Fixes a bug where, for kube HA proxies, TLS certs for the replica
responsible for cert issuance where loaded in memory on startup,
although the in-memory store was not updated after renewal (to
avoid failing re-issuance for re-created Ingresses).
Now the 'write' replica always reads certs from the kube Secret.
Updates tailscale/tailscale#18394
Signed-off-by: Irbe Krumina <irbekrm@gmail.com>
Previously the funnel listener would leave artifacts in the serve
config. This caused weird out-of-sync effects like the admin panel
showing that funnel was enabled for a node, but the node rejecting
packets because the listener was closed.
This change resolves these synchronization issues by ensuring that
funnel listeners clean up the serve config when closed.
See also:
e109cf9fdd
Updates #cleanup
Signed-off-by: Harry Harpham <harry@tailscale.com>
Prior to this change, we were resetting the tsnet's serve config every
time tsnet.Server.Up was run. This is important to do on startup, to
prevent messy interactions with stale configuration when the code has
changed.
However, Up is frequently run as a just-in-case step (for example, by
Server.ListenTLS/ListenFunnel and possibly by consumers of tsnet). When
the serve config is reset on each of these calls to Up, this creates
situations in which the serve config disappears unexpectedly. The
solution is to reset the serve config only on the first call to Up.
Fixes#8800
Updates tailscale/corp#27200
Signed-off-by: Harry Harpham <harry@tailscale.com>
Add support for authenticating the gitops-pusher using workload identity
federation.
Updates https://github.com/tailscale/corp/issues/34172
Signed-off-by: Mario Minardi <mario@tailscale.com>
QR codes are used by `tailscale up --qr` to provide an easy way to
open a web-page without transcribing a difficult URI. However, there’s
no need for this feature if the client will never be called
interactively. So this PR adds the `ts_omit_qrcodes` build tag.
Updates #18182
Signed-off-by: Simon Law <sfllaw@tailscale.com>
It's not worth adding the v2 client just for these e2e tests. Remove
that dependency for now to keep a clear separation, but we should revive
the v2 client version if we ever decide to take that dependency for the
tailscale/tailscale repo as a whole.
Updates tailscale/corp#32085
Change-Id: Ic51ce233d5f14ce2d25f31a6c4bb9cf545057dd0
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
* cmd/k8s-operator/e2e: run self-contained e2e tests with devcontrol
Adds orchestration for more of the e2e testing setup requirements to
make it easier to run them in CI, but also run them locally in a way
that's consistent with CI. Requires running devcontrol, but otherwise
supports creating all the scaffolding required to exercise the operator
and proxies.
Updates tailscale/corp#32085
Change-Id: Ia7bff38af3801fd141ad17452aa5a68b7e724ca6
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
* cmd/k8s-operator/e2e: being more specific on tmp dir cleanup
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
---------
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Co-authored-by: chaosinthecrd <tom@tmlabs.co.uk>
Raw Linux consoles support UTF-8, but we cannot assume that all UTF-8
characters are available. The default Fixed and Terminus fonts don’t
contain half-block characters (`▀` and `▄`), but do contain the
full-block character (`█`).
Sometimes, Linux doesn’t have a framebuffer, so it falls back to VGA.
When this happens, the full-block character could be anywhere in
extended ASCII block, because we don’t know which code page is active.
This PR introduces `--qr-format=auto` which tries to heuristically
detect when Tailscale is printing to a raw Linux console, whether
UTF-8 is enabled, and which block characters have been mapped in the
console font.
If Unicode characters are unavailable, the new `--qr-format=ascii`
formatter uses `#` characters instead of full-block characters.
Fixes#12935
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Moves magicksock.cloudInfo into util/cloudinfo with minimal changes.
Updates #17796
Change-Id: I83f32473b9180074d5cdbf00fa31e5b3f579f189
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
Bump peter-evans/create-pull-request to 8.0.0 to ensure compatibility
with actions/checkout 6.x.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
The funnel command is sort of an alias for the serve command. This means
that the subcommands added to serve to support Services appear as
subcommands for funnel as well, despite having no meaning for funnel.
This change removes all such Services-specific subcommands from funnel.
Fixestailscale/corp#34167
Signed-off-by: Harry Harpham <harry@tailscale.com>
Ensure that hardware attestation keys are not added to tailscaled
state stores that are Kubernetes Secrets or AWS SSM as those Tailscale
devices should be able to be recreated on different nodes, for example,
when moving Pods between nodes.
Updates tailscale/tailscale#18302
Signed-off-by: Irbe Krumina <irbekrm@gmail.com>
TPM-based features have been incredibly painful due to the heterogeneous
devices in the wild, and many situations in which the TPM "changes" (is
reset or replaced). All of this leads to a lot of customer issues.
We hoped to iron out all the kinks and get all users to benefit from
state encryption and hardware attestation without manually opting in,
but the long tail of kinks is just too long.
This change disables TPM-based features on Windows and Linux by default.
Node state should get auto-decrypted on update, and old attestation keys
will be removed.
There's also tailscaled-on-macOS, but it won't have a TPM or Keychain
bindings anyway.
Updates #18302
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Soft-fail on initial unmarshal and try again, ignoring the
AttestationKey. This helps in cases where something about the
attestation key storage (usually a TPM) is messed up. The old key will
be lost, but at least the node can start again.
Updates #18302
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Send LOGIN audit messages to the kernel audit subsystem on Linux
when users successfully authenticate to Tailscale SSH. This provides
administrators with audit trail integration via auditd or journald,
recording details about both the Tailscale user (whois) and the
mapped local user account.
The implementation uses raw netlink sockets to send AUDIT_USER_LOGIN
messages to the kernel audit subsystem. It requires CAP_AUDIT_WRITE
capability, which is checked at runtime. If the capability is not
present, audit logging is silently skipped.
Audit messages are sent to the kernel (pid 0) and consumed by either
auditd (written to /var/log/audit/audit.log) or journald (available
via journalctl _TRANSPORT=audit), depending on system configuration.
Note: This may result in duplicate messages on a system where
auditd/journald audit logs are enabled and the system has and supports
`login -h`. Sadly Linux login code paths are still an inconsistent wild
west so we accept the potential duplication rather than trying to avoid
it.
Fixes#18332
Signed-off-by: James Tucker <james@tailscale.com>
GCP Certificate Manager requires an email contact on ACME accounts.
Add --acme-email flag that is required for --certmode=gcp and
optional for --certmode=letsencrypt.
Fixes#18277
Signed-off-by: Raj Singh <raj@tailscale.com>
An error returned by net.Listener.Accept() causes the owning http.Server to shut down.
With the deprecation of net.Error.Temporary(), there's no way for the http.Server to test
whether the returned error is temporary / retryable or not (see golang/go#66252).
Because of that, errors returned by (*safesocket.winIOPipeListener).Accept() cause the LocalAPI
server (aka ipnserver.Server) to shut down, and tailscaled process to exit.
While this might be acceptable in the case of non-recoverable errors, such as programmer errors,
we shouldn't shut down the entire tailscaled process for client- or connection-specific errors,
such as when we couldn't obtain the client's access token because the client attempts to connect
at the Anonymous impersonation level. Instead, the LocalAPI server should gracefully handle
these errors by denying access and returning a 401 Unauthorized to the client.
In tailscale/tscert#15, we fixed a known bug where Caddy and other apps using tscert would attempt
to connect at the Anonymous impersonation level and fail. However, we should also fix this on the tailscaled
side to prevent a potential DoS, where a local app could deliberately open the Tailscale LocalAPI named pipe
at the Anonymous impersonation level and cause tailscaled to exit.
In this PR, we defer token retrieval until (*WindowsClientConn).Token() is called and propagate the returned token
or error via ipnauth.GetConnIdentity() to ipnserver, which handles it the same way as other ipnauth-related errors.
Fixes#18212Fixestailscale/tscert#13
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This gauge will be reworked to include endpoint state in future.
Updates tailscale/corp#30820
Change-Id: I66f349d89422b46eec4ecbaf1a99ad656c7301f9
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
In dynamically changing environments where ACME account keys and certs
are stored separately, it can happen that the account key would get
deleted (and recreated) between issuances. If that is the case,
we currently fail renewals and the only way to recover is for users
to delete certs.
This adds a config knob to allow opting out of the replaces extension
and utilizes it in the Kubernetes operator where there are known
user workflows that could end up with this edge case.
Updates #18251
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Adding both user and client metrics for peer relay forwarded bytes and
packets, and the total endpoints gauge.
User metrics:
tailscaled_peer_relay_forwarded_packets_total{transport_in, transport_out}
tailscaled_peer_relay_forwarded_bytes_total{transport_in, transport_out}
tailscaled_peer_relay_endpoints_total{}
Where the transport labels can be of "udp4" or "udp6".
Client metrics:
udprelay_forwarded_(packets|bytes)_udp(4|6)_udp(4|6)
udprelay_endpoints
RELNOTE: Expose tailscaled metrics for peer relay.
Updates tailscale/corp#30820
Change-Id: I1a905d15bdc5ee84e28017e0b93210e2d9660259
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
Adds support for targeting FQDNs that are a Tailscale Service. Uses the
same method of searching for Services as the tailscale configure
kubeconfig command. This fixes using the tailscale.com/tailnet-fqdn
annotation for Kubernetes Service when the specified FQDN is a Tailscale
Service.
Fixes#16534
Change-Id: I422795de76dc83ae30e7e757bc4fbd8eec21cc64
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Signed-off-by: Becky Pauley <becky@tailscale.com>
IsZero is required by the interface, so we should use that before trying
to serialize the key.
Updates #35412
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
When the TS_DEBUG_DNS_FORWARD_SEND envknob is turned on, also log the
source IP:port of the query that tailscaled is forwarding.
Updates tailscale/corp#35374
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
updates tailscale/corp#33891
Addresses several older the TODO's in netmon. This removes the
Major flag precomputes the ChangeDelta state, rather than making
consumers of ChangeDeltas sort that out themselves. We're also seeing
a lot of ChangeDelta's being flagged as "Major" when they are
not interesting, triggering rebinds in wgengine that are not needed. This
cleans that up and adds a host of additional tests.
The dependencies are cleaned, notably removing dependency on netmon
itself for calculating what is interesting, and what is not. This includes letting
individual platforms set a bespoke global "IsInterestingInterface"
function. This is only used on Darwin.
RebindRequired now roughly follows how "Major" was historically
calculated but includes some additional checks for various
uninteresting events such as changes in interface addresses that
shouldn't trigger a rebind. This significantly reduces thrashing (by
roughly half on Darwin clients which switching between nics). The individual
values that we roll into RebindRequired are also exposed so that
components consuming netmap.ChangeDelta can ask more
targeted questions.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
The existing client metric methods only support incrementing (or
decrementing) a delta value. This new method allows setting the metric
to a specific value.
Updates tailscale/corp#35327
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
This commit also introduces a sync.Mutex for guarding mutatable fields
on serverEndpoint, now that it is no longer guarded by the sync.Mutex
in Server.
These changes reduce lock contention and by effect increase aggregate
throughput under high flow count load. A benchmark on Linux with AWS
c8gn instances showed a ~30% increase in aggregate throughput (37Gb/s
vs 28Gb/s) for 12 tailscaled flows.
Updates tailscale/corp#35264
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Add flags:
* --cigocached-host to support alternative host resolution in other
environments, like the corp repo.
* --stats to reduce the amount of bash script we need.
* --version to support a caching tool/cigocacher script that will
download from GitHub releases.
Updates tailscale/corp#10808
Change-Id: Ib2447bc5f79058669a70f2c49cef6aedd7afc049
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
tcpHandlerForVIPService was missing ProxyProtocol support that
tcpHandlerForServe already had. Extract the shared logic into
forwardTCPWithProxyProtocol helper and use it in both handlers.
Fixes#18172
Signed-off-by: Raj Singh <raj@tailscale.com>
Add metrics about logtail uploading and underlying buffer.
Add metrics to the in-memory buffer implementation.
Updates tailscale/corp#21363
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
PR #18033 skipped tests for the versions of Linux 6.6 and 6.12 that
had a regression in /proc/net/tcp that causes seek operations to fail
with “illegal seek”.
This PR skips tests for Linux 6.14.0, which is the default Ubuntu
kernel, that also contains this regression.
Updates #16966
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The filch implementation is fairly broken:
* When Filch.cur exceeds MaxFileSize, it calls moveContents
to copy the entirety of cur into alt (while holding the write lock).
By nature, this is the movement of a lot of data in a hot path,
meaning that all log calls will be globally blocked!
It also means that log uploads will be blocked during the move.
* The implementation of moveContents is buggy in that
it copies data from cur into the start of alt,
but fails to truncate alt to the number of bytes copied.
Consequently, there are unrelated lines near the end,
leading to out-of-order lines when being read back.
* Data filched via stderr do not directly respect MaxFileSize,
which is only checked every 100 Filch.Write calls.
This means that it is possible that the file grows far beyond
the specified max file size before moveContents is called.
* If both log files have data when New is called,
it also copies the entirety of cur into alt.
This can block the startup of a process copying lots of data
before the process can do any useful work.
* TryReadLine is implemented using bufio.Scanner.
Unfortunately, it will choke on any lines longer than
bufio.MaxScanTokenSize, rather than gracefully skip over them.
The re-implementation avoids a lot of these problems
by fundamentally eliminating the need for moveContent.
We enforce MaxFileSize by simply rotating the log files
whenever the current file exceeds MaxFileSize/2.
This is a constant-time operation regardless of file size.
To more gracefully handle lines longer than bufio.MaxScanTokenSize,
we skip over these lines (without growing the read buffer)
and report an error. This allows subsequent lines to be read.
In order to improve debugging, we add a lot of metrics.
Note that the the mechanism of dup2 with stderr
is inherently racy with a the two file approach.
The order of operations during a rotation is carefully chosen
to reduce the race window to be as short as possible.
Thus, this is slightly less racy than before.
Updates tailscale/corp#21363
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
When receiving a TSMPDiscoAdvertisement from peer, update the discokey
for said peer.
Some parts taken from: https://github.com/tailscale/tailscale/pull/18073/
Updates #12639
Co-authored-by: James Tucker <james@tailscale.com>
Re-instate the linking of iptables installed in Tailscale container
to the legacy iptables version. In environments where the legacy
iptables is not needed, we should be able to run nftables instead,
but this will ensure that Tailscale keeps working in environments
that don't support nftables, such as some Synology NAS hosts.
Updates #17854
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Add --certmode=gcp for using Google Cloud Certificate Manager's
public CA instead of Let's Encrypt. GCP requires External Account
Binding (EAB) credentials for ACME registration, so this adds
--acme-eab-kid and --acme-eab-key flags.
The EAB key accepts both base64url and standard base64 encoding
to support both ACME spec format and gcloud output.
Fixestailscale/corp#34881
Signed-off-by: Raj Singh <raj@tailscale.com>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When using the resolve.conf file for setting DNS, it is possible that
some other services will trample the file and overwrite our set DNS
server. Experiments has shown this to be a racy error depending on how
quickly processes start.
Make an attempt to trample back the file a limited number of times if
the file is changed.
Updates #16635
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
When peers request an IP address mapping to be stored, the connector
stores it in memory.
Fixestailscale/corp#34251
Signed-off-by: Fran Bull <fran@tailscale.com>
To save rebuilding cigocacher on each CI job, build it on-demand, and
publish a release similar to how we publish releases for tool/go to
consume. Once the first release is done, we can add a new
tool/cigocacher script that pins to a specific release for each branch
to download.
Updates tailscale/corp#10808
Change-Id: I7694b2c2240020ba2335eb467522cdd029469b6c
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
It appears (*controlclient.Auto).Shutdown() can still deadlock when called with b.mu held, and therefore the changes in #18127 are unsafe.
This reverts #18127 until we figure out what causes it.
This reverts commit d199ecac80.
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This improves our test coverage of the Bootstrap() method, especially
around catching AUMs that shouldn't pass validation.
Updates #cleanup
Change-Id: Idc61fcbc6daaa98c36d20ec61e45ce48771b85de
Signed-off-by: Alex Chan <alexc@tailscale.com>
Previously, if users attempted to expose a headless Service to tailnet,
this just silently did not work.
This PR makes the operator throw a warning event + update Service's
status with an error message.
Updates #18139
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The event queue gets deleted events, which means that sometimes
the object that should be reconciled no longer exists.
Don't log user facing errors if that is the case.
Updates #18141
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The service was starting after systemd itself, and while this
surprisingly worked for some situations, it broke for others.
Change it to start after a GUI has been initialized.
Updates #17656
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Previously we only set this when it updated, which was fine for the first
call to Start(), but after that point future updates would be skipped if
nothing had changed. If Start() was called again, it would wipe the peer API
endpoints and they wouldn't get added back again, breaking exit nodes (and
anything else requiring peer API to be advertised).
Updates tailscale/corp#27173
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
Based on PR #16700 by @lox, adapted to current codebase.
Adds support for proxying HTTP requests to Unix domain sockets via
tailscale serve unix:/path/to/socket, enabling exposure of services
like Docker, containerd, PHP-FPM over Tailscale without TCP bridging.
The implementation includes reasonable protections against exposure of
tailscaled's own socket.
Adaptations from original PR:
- Use net.Dialer.DialContext instead of net.Dial for context propagation
- Use http.Transport with Protocols API (current h2c approach, not http2.Transport)
- Resolve conflicts with hasScheme variable in ExpandProxyTargetValue
Updates #9771
Signed-off-by: Peter A. <ink.splatters@pm.me>
Co-authored-by: Lachlan Donald <lachlan@ljd.cc>
If a packet arrives while WireGuard is being reconfigured with b.mu held, such as during a profile switch,
calling back into (*LocalBackend).GetPeerAPIPort from (*Wrapper).filterPacketInboundFromWireGuard
may deadlock when it tries to acquire b.mu.
This occurs because a peer cannot be removed while an inbound packet is being processed.
The reconfig and profile switch wait for (*Peer).RoutineSequentialReceiver to return, but it never finishes
because GetPeerAPIPort needs b.mu, which the waiting goroutine already holds.
In this PR, we make peerAPIPorts a new syncs.AtomicValue field that is written with b.mu held
but can be read by GetPeerAPIPort without holding the mutex, which fixes the deadlock.
There might be other long-term ways to address the issue, such as moving peer API listeners
from LocalBackend to nodeBackend so they can be accessed without holding b.mu,
but these changes are too large and risky at this stage in the v1.92 release cycle.
Updates #18124
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Previously, callers of (*LocalBackend).resetControlClientLocked were supposed
to call Shutdown on the returned controlclient.Client after releasing b.mu.
In #17804, we started calling Shutdown while holding b.mu, which caused
deadlocks during profile switches due to the (*ExecQueue).RunSync implementation.
We first patched this in #18053 by calling Shutdown in a new goroutine,
which avoided the deadlocks but made TestStateMachine flaky because
the shutdown order was no longer guaranteed.
In #18070, we updated (*ExecQueue).RunSync to allow shutting down
the queue without waiting for RunSync to return. With that change,
shutting down the control client while holding b.mu became safe.
Therefore, this PR updates (*LocalBackend).resetControlClientLocked
to shut down the old client synchronously during the reset, instead of
returning it and shifting that responsibility to the callers.
This fixes the flaky tests and simplifies the code.
Fixes#18052
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This commit uses SO_REUSEPORT (when supported) to bind multiple sockets
per address family. Increasing the number of sockets can increase
aggregate throughput when serving many peer relay client flows.
Benchmarks show 3x improvement in max aggregate bitrate in some
environments.
Updates tailscale/corp#34745
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Add support for pinning specific Tailscale versions during installation
via the TAILSCALE_VERSION environment variable.
Example usage:
curl -fsSL https://tailscale.com/install.sh | TAILSCALE_VERSION=1.88.4 sh
Fixes#17776
Signed-off-by: Raj Singh <raj@tailscale.com>
111 is 3 years old, and there have been a lot of speed improvements
since then. We run wasm-opt twice as part of the CI wasm job, and it
currently takes about 3 minutes each time. With 125, it takes ~40
seconds, a 4.5x speed-up.
Updates #cleanup
Change-Id: I671ae6cefa3997a23cdcab6871896b6b03e83a4f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Implements a new disk put function for cigocacher that does not cause
locking issues on Windows when there are multiple processes reading and
writing the same files concurrently. Integrates cigocacher into test.yml
for Windows where we are running on larger runners that support
connecting to private Azure vnet resources where cigocached is hosted.
Updates tailscale/corp#10808
Change-Id: I0d0e9b670e49e0f9abf01ff3d605cd660dd85ebb
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
The cache artifacts from a full run of test.yml are 14GB. Only save
artifacts from the main branch to ensure we don't thrash too much. Most
branches should get decent performance with a hit from recent main.
Fixestailscale/corp#34739
Change-Id: Ia83269d878e4781e3ddf33f1db2f21d06ea2130f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Thanks to seamless key renewal, you can now do a force-reauth without
losing your connection in all circumstances. We softened the interactive
warning (see #17262) so let's soften the help text as well.
Updates https://github.com/tailscale/corp/issues/32429
Signed-off-by: Alex Chan <alexc@tailscale.com>
* cmd/k8s-operator: add support for taiscale.com/http-redirect
The k8s-operator now supports a tailscale.com/http-redirect annotation
on Ingress resources. When enabled, this automatically creates port 80
handlers that automatically redirect to the equivalent HTTPS location.
Fixes#11252
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
* Fix for permanent redirect
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
* lint
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
* warn for redirect+endpoint
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
* tests
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
---------
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
Restrict running the golangci-lint workflow to when the workflow file
itself or a .go file, go.mod, or go.sum have actually been modified.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Skip the "request review" workflows for PRs that are in draft to reduce
noise / skip adding reviewers to PRs that are intentionally marked as
not ready to review.
Updates #cleanup
Signed-off-by: Mario Minardi <mario@tailscale.com>
Adds an observation point that may identify potentially abusive traffic
patterns at outlier values.
Updates tailscale/corp#24681
Signed-off-by: James Tucker <james@tailscale.com>
We don't hold q.mu while running normal ExecQueue.Add funcs, so we
shouldn't in RunSync either. Otherwise code it calls can't shut down
the queue, as seen in #18502.
Updates #18052
Co-authored-by: Nick Khyl <nickk@tailscale.com>
Change-Id: Ic5e53440411eca5e9fabac7f4a68a9f6ef026de1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This patch adds an integration test for Tailnet Lock, checking that a node can't
talk to peers in the tailnet until it becomes signed.
This patch also introduces a new package `tstest/tkatest`, which has some helpers
for constructing a mock control server that responds to TKA requests. This allows
us to reduce boilerplate in the IPN tests.
Updates tailscale/corp#33599
Signed-off-by: Alex Chan <alexc@tailscale.com>
In preparation for exposing its configuration via ipn.ConfigVAlpha,
change {Masked}Prefs.RelayServerPort from *int to *uint16. This takes a
defensive stance against invalid inputs at JSON decode time.
'tailscale set --relay-server-port' is currently the only input to this
pref, and has always sanitized input to fit within a uint16.
Updates tailscale/corp#34591
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Adds a new types of TSMP messages for advertising disco keys keys
to/from a peer, and implements the advertising triggered by a TSMP ping.
Needed as part of the effort to cache the netmap and still let clients
connect without control being reachable.
Updates #12639
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Co-authored-by: James Tucker <james@tailscale.com>
In suggestExitNodeLocked, if no exit node candidates have a home DERP or
valid location info, `bestCandidates` is an empty slice. This slice is
passed to `selectNode` (`randomNode` in prod):
```go func randomNode(nodes views.Slice[tailcfg.NodeView], …) tailcfg.NodeView {
…
return nodes.At(rand.IntN(nodes.Len()))
}
```
An empty slice becomes a call to `rand.IntN(0)`, which panics.
This patch changes the behaviour, so if we've filtered out all the
candidates before calling `selectNode`, reset the list and then pick
from any of the available candidates.
This patch also updates our tests to give us more coverage of `randomNode`,
so we can spot other potential issues.
Updates #17661
Change-Id: I63eb5e4494d45a1df5b1f4b1b5c6d5576322aa72
Signed-off-by: Alex Chan <alexc@tailscale.com>
And fix up the TestAutoUpdateDefaults integration tests as they
weren't testing reality: the DefaultAutoUpdate is supposed to only be
relevant on the first MapResponse in the stream, but the tests weren't
testing that. They were instead injecting a 2nd+ MapResponse.
This changes the test control server to add a hook to modify the first
map response, and then makes the test control when the node goes up
and down to make new map responses.
Also, the test now runs on macOS where the auto-update feature being
disabled would've previously t.Skipped the whole test.
Updates #11502
Change-Id: If2319bd1f71e108b57d79fe500b2acedbc76e1a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In PR tailscale/corp#34401, the `traffic-steering` feature flag does
not automatically enable traffic steering for all nodes. Instead, an
admin must add the `traffic-steering` node attribute to each client
node that they want opted-in.
For backwards compatibility with older clients, tailscale/corp#34401
strips out the `traffic-steering` node attribute if the feature flag
is not enabled, even if it is set in the policy file. This lets us
safely disable the feature flag.
This PR adds a missing test case for suggested exit nodes that have no
priority.
Updates tailscale/corp#34399
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This commit fixes a bug in our HA ingress reconciler where ingress resources would
be stuck in a deleting state should their associated VIP service be deleted within
control.
The reconciliation loop would check for the existence of the VIP service and if not
found perform no additional cleanup steps. The code has been modified to continue
onwards even if the VIP service is not found.
Fixes: https://github.com/tailscale/tailscale/issues/18049
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit replaces crypto/rand challenge generation with a blake2s-256
MAC. This enables the peer relay server to respond to multiple forward
disco.BindUDPRelayEndpoint messages per handshake generation without
sacrificing the proof of IP ownership properties of the handshake.
Responding to multiple forward disco.BindUDPRelayEndpoint messages per
handshake generation improves client address/path selection where
lowest client->server path/addr one-way delay does not necessarily
equate to lowest client<->server round trip delay.
It also improves situations where outbound traffic is filtered
independent of input, and the first reply
disco.BindUDPRelayEndpointChallenge message is dropped on the reply
path, but a later reply using a different source would make it through.
Reduction in serverEndpoint state saves 112 bytes per instance, trading
for slightly more expensive crypto ops: 277ns/op vs 321ns/op on an M1
Macbook Pro.
Updates tailscale/corp#34414
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Adds cmd/cigocacher as the client to cigocached for Go caching over
HTTP. The HTTP cache is best-effort only, and builds will fall back to
disk-only cache if it's not available, much like regular builds.
Not yet used in CI; that will follow in another PR once we have runners
available in this repo with the right network setup for reaching
cigocached.
Updates tailscale/corp#10808
Change-Id: I13ae1a12450eb2a05bd9843f358474243989e967
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
When the underlying transport returns a network error, the RoundTrip
method returns (nil, error). The defer was attempting to access resp
without checking if it was nil first, causing a panic. Fix this by
checking for nil in the defer.
Also changes driveTransport.tr from *http.Transport to http.RoundTripper
and adds a test.
Fixes#17306
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
Change-Id: Icf38a020b45aaa9cfbc1415d55fd8b70b978f54c
SetSubnetRoutes was not sending update notifications to nodes when their
approved routes changed, causing nodes to not fetch updated netmaps with
PrimaryRoutes populated. This resulted in TestUserMetricsRouteGauges
flaking because it waited for PrimaryRoutes to be set, which only happened
if the node happened to poll for other reasons.
Now send updateSelfChanged notification to affected nodes so they fetch
an updated netmap immediately.
Fixes#17962
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
Linux kernel versions 6.6.102-104 and 6.12.42-45 have a regression
in /proc/net/tcp that causes seek operations to fail with "illegal seek".
This breaks portlist tests on these kernels.
Add kernel version detection for Linux systems and a SkipOnKernelVersions
helper to tstest. Use it to skip affected portlist tests on the broken
kernel versions.
Thanks to philiptaron for the list of kernels with the issue and fix.
Updates #16966
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
Bounded DeliveredEvent queues reduce memory usage, but they can deadlock under load.
Two common scenarios trigger deadlocks when the number of events published in a short
period exceeds twice the queue capacity (there's a PublishedEvent queue of the same size):
- a subscriber tries to acquire the same mutex as held by a publisher, or
- a subscriber for A events publishes B events
Avoiding these scenarios is not practical and would limit eventbus usefulness and reduce its adoption,
pushing us back to callbacks and other legacy mechanisms. These deadlocks already occurred in customer
devices, dev machines, and tests. They also make it harder to identify and fix slow subscribers and similar
issues we have been seeing recently.
Choosing an arbitrary large fixed queue capacity would only mask the problem. A client running
on a sufficiently large and complex customer environment can exceed any meaningful constant limit,
since event volume depends on the number of peers and other factors. Behavior also changes
based on scheduling of publishers and subscribers by the Go runtime, OS, and hardware, as the issue
is essentially a race between publishers and subscribers. Additionally, on lower-end devices,
an unreasonably high constant capacity is practically the same as using unbounded queues.
Therefore, this PR changes the event queue implementation to be unbounded by default.
The PublishedEvent queue keeps its existing capacity of 16 items, while subscribers'
DeliveredEvent queues become unbounded.
This change fixes known deadlocks and makes the system stable under load,
at the cost of higher potential memory usage, including cases where a queue grows
during an event burst and does not shrink when load decreases.
Further improvements can be implemented in the future as needed.
Fixes#17973Fixes#18012
Signed-off-by: Nick Khyl <nickk@tailscale.com>
As of 2025-11-20, publishing more events than the eventbus's
internal queues can hold may deadlock if a subscriber tries
to publish events itself.
This commit adds a test that demonstrates this deadlock,
and skips it until the bug is fixed.
Updates #18012
Signed-off-by: Nick Khyl <nickk@tailscale.com>
As of 2025-11-20, publishing more events than the eventbus's
internal queues can hold may deadlock if a subscriber tries
to acquire a mutex that can also be held by a publisher.
This commit adds a test that demonstrates this deadlock,
and skips it until the bug is fixed.
Updates #17973
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This is causing confusing panics in tailscale/corp#34485. We'll keep
using the tka.ChonkMem constructor as much as we can, but don't panic
if you create a tka.Mem directly -- we know what the sensible thing is.
Updates #cleanup
Signed-off-by: Alex Chan <alexc@tailscale.com>
Change-Id: I49309f5f403fc26ce4f9a6cf0edc8eddf6a6f3a4
With the introduction of node sealing, store.New fails in some cases due
to the TPM device being reset or unavailable. Currently it results in
tailscaled crashing at startup, which is not obvious to the user until
they check the logs.
Instead of crashing tailscaled at startup, start with an in-memory store
with a health warning about state initialization and a link to (future)
docs on what to do. When this health message is set, also block any
login attempts to avoid masking the problem with an ephemeral node
registration.
Updates #15830
Updates #17654
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
These validations were previously performed in the CLI frontend. There
are two motivations for moving these to the local backend:
1. The backend controls synchronization around the relevant state, so
only the backend can guarantee many of these validations.
2. Doing these validations in the back-end avoids the need to repeat
them across every frontend (e.g. the CLI and tsnet).
Updates tailscale/corp#27200
Signed-off-by: Harry Harpham <harry@tailscale.com>
This commit adds the `spec.replicas` field to the `Recorder` custom
resource that allows for a highly available deployment of `tsrecorder`
within a kubernetes cluster.
Many changes were required here as the code hard-coded the assumption
of a single replica. This has required a few loops, similar to what we
do for the `Connector` resource to create auth and state secrets. It
was also required to add a check to remove dangling state and auth
secrets should the recorder be scaled down.
Updates: https://github.com/tailscale/tailscale/issues/17965
Signed-off-by: David Bond <davidsbond93@gmail.com>
fixestailscale/tailscale#17990
The logging for the netns caps is spammy. Log only on changes
to the values and don't log Darwin specific stuff on non Darwin
clients.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
This commit modifies the kubernetes operator to use the "stable" version
of `k8s-nameserver` by default.
Updates: https://github.com/tailscale/corp/issues/19028
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit enables user to set service backend to remote destinations, that can be a partial
URL or a full URL. The commit also prevents user to set remote destinations on linux system
when socket mark is not working. For user on any version of mac extension they can't serve a
service either. The socket mark usability is determined by a new local api.
Fixestailscale/corp#24783
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
Now that we support using an in-memory backend for TKA state (#17946),
this function always returns `nil` – we can always support Network Lock.
We don't need it any more.
Plus, clean up a couple of errant TODOs from that PR.
Updates tailscale/corp#33599
Change-Id: Ief93bb9adebb82b9ad1b3e406d1ae9d2fa234877
Signed-off-by: Alex Chan <alexc@tailscale.com>
Our style guide recommends avoiding Latin abbreviations in technical
documentation, which includes the CLI help text. This is causing linter
issues for the docs site, because this help text is copied into the docs.
See http://go/style-guide/kb/language-and-grammar/abbreviations#latin-abbreviations
Updates #cleanup
Change-Id: I980c28d996466f0503aaaa65127685f4af608039
Signed-off-by: Alex Chan <alexc@tailscale.com>
ArgoCD sends boolean values but the template expects strings, causing
"incompatible types for comparison" errors. Wrap values with toString
so both work.
Fixes#17158
Signed-off-by: Raj Singh <raj@tailscale.com>
Previously a TKA compaction would only run when a node starts, which means a long-running node could use unbounded storage as it accumulates ever-increasing amounts of TKA state. This patch changes TKA so it runs a compaction after every sync.
Updates https://github.com/tailscale/corp/issues/33537
Change-Id: I91df887ea0c5a5b00cb6caced85aeffa2a4b24ee
Signed-off-by: Alex Chan <alexc@tailscale.com>
This commit modifies the helm/static manifest configuration for the
k8s-operator to prefer the stable image tag. This avoids making those
using static manifests seeing unstable behaviour by default if they
do not manually make the change.
This is managed for us when using helm but not when generating the
static manifests.
Updates https://github.com/tailscale/tailscale/issues/10655
Signed-off-by: David Bond <davidsbond93@gmail.com>
(trying to get in smaller obvious chunks ahead of later PRs to make
them smaller)
Updates #17925
Change-Id: I184002001055790484e4792af8ffe2a9a2465b2e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We now embed node information into network flow logs.
By default, netlogfmt still prints out using Tailscale IP addresses.
Support a "--resolve-addrs=TYPE" flag that can be used to specify
resolving IP addresses as node IDs, hostnames, users, or tags.
Updates tailscale/corp#33352
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Adds the ability to rotate discovery keys on running clients, needed for
testing upcoming disco key distribution changes.
Introduces key.DiscoKey, an atomic container for a disco private key,
public key, and the public key's ShortString, replacing the prior
separate atomic fields.
magicsock.Conn has a new RotateDiscoKey method, and access to this is
provided via localapi and a CLI debug command.
Note that this implementation is primarily for testing as it stands, and
regular use should likely introduce an additional mechanism that allows
the old key to be used for some time, to provide a seamless key rotation
rather than one that invalidates all sessions.
Updates tailscale/corp#34037
Signed-off-by: James Tucker <james@tailscale.com>
As part of the conn25 work we will want to be able to keep track of a
pool of IP Addresses and know which have been used and which have not.
Fixestailscale/corp#34247
Signed-off-by: Fran Bull <fran@tailscale.com>
We use `tka.AUMHash` in `netmap.NetworkMap`, and we serialise it as JSON
in the `/debug/netmap` C2N endpoint. If the binary omits Tailnet Lock support,
the debug endpoint returns an error because it's unable to marshal the
AUMHash.
This patch adds a sentinel value so this marshalling works, and we can
use the debug endpoint.
Updates https://github.com/tailscale/tailscale/issues/17115
Signed-off-by: Alex Chan <alexc@tailscale.com>
Change-Id: I51ec1491a74e9b9f49d1766abd89681049e09ce4
Existing compaction logic seems to have had an assumption that
markActiveChain would cover a longer part of the chain than
markYoungAUMs. This prevented long, but fresh, chains, from being
compacted correctly.
Updates tailscale/corp#33537
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
6a73c0bdf5 added a feature tag but didn't re-run go generate on ./feature/buildfeatures.
Updates #9192
Change-Id: I7819450453e6b34c60cad29d2273e3e118291643
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I added a RemoveAll() method on tka.Chonk in #17946, but it's only used
in the node to purge local AUMs. We don't need it in the SQLite storage,
which currently implements tka.Chonk, so move it to CompactableChonk
instead.
Also add some automated tests, as a safety net.
Updates tailscale/corp#33599
Change-Id: I54de9ccf1d6a3d29b36a94eccb0ebd235acd4ebc
Signed-off-by: Alex Chan <alexc@tailscale.com>
The REST API does not return a node name
with a trailing dot, while the internal node name
reported in the netmap does have one.
In order to be consistent with the API,
strip the dot when recording node information.
Updates tailscale/corp#33352
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Perform a path check first before attempting exec of `true`.
Try /usr/bin/true first, as that is now and increasingly so, the more
common and more portable path.
Fixes tests on macOS arm64 where exec was returning a different kind of
path error than previously checked.
Updates #16569
Signed-off-by: James Tucker <james@tailscale.com>
DA protection is not super helpful because we don't set an authorization
password on the key. But if authorization fails for other reasons (like
TPM being reset), we will eventually cause DA lockout with tailscaled
trying to load the key. DA lockout then leads to (1) issues for other
processes using the TPM and (2) the underlying authorization error being
masked in logs.
Updates #17654
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
For manual (human) testing, this lets the user disable control plane
map polls with "tailscale set --sync=false" (which survives restarts)
and "tailscale set --sync" to restore.
A high severity health warning is shown while this is active.
Updates #12639
Updates #17945
Change-Id: I83668fa5de3b5e5e25444df0815ec2a859153a6d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Let's fix all the typos, which lets the code be more readable, lest we
confuse our readers.
Updates #cleanup
Change-Id: I4954601b0592b1fda40269009647bb517a4457be
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This requires making the internals of LocalBackend a bit more generic,
and implementing the `tka.CompactableChonk` interface for `tka.Mem`.
Signed-off-by: Alex Chan <alexc@tailscale.com>
Updates https://github.com/tailscale/corp/issues/33599
Pick up a fix for https://pkg.go.dev/vuln/GO-2025-4116 (even though
we're not affected).
Updates #cleanup
Change-Id: I9f2571b17c1f14db58ece8a5a34785805217d9dd
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Includes adding StartPaused, which will be used in a future change to
enable netmap caching testing.
Updates #12639
Change-Id: Iec39915d33b8d75e9b8315b281b1af2f5d13a44a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This patch changes the behaviour of `tailscale lock log --json` to make
it more useful for users. It also introduces versioning of our JSON output.
## Changes to `tailscale lock log --json`
Previously this command would print the hash and base64-encoded bytes of
each AUM, and users would need their own CBOR decoder to interpret it in
a useful way:
```json
[
{
"Hash": [
80,
136,
151,
…
],
"Change": "checkpoint",
"Raw": "pAEFAvYFpQH2AopYIAkPN+8V3cJpkoC5ZY2+RI2Bcg2q5G7tRAQQd67W3YpnWCDPOo4KGeQBd8hdGsjoEQpSXyiPdlm+NXAlJ5dS1qEbFlggylNJDQM5ZQ2ULNsXxg2ZBFkPl/D93I1M56/rowU+UIlYIPZ/SxT9EA2Idy9kaCbsFzjX/s3Ms7584wWGbWd/f/QAWCBHYZzYiAPpQ+NXN+1Wn2fopQYk4yl7kNQcMXUKNAdt1lggcfjcuVACOH0J9pRNvYZQFOkbiBmLOW1hPKJsbC1D1GdYIKrJ38XMgpVMuTuBxM4YwoLmrK/RgXQw1uVEL3cywl3QWCA0FilVVv8uys8BNhS62cfNvCew1Pw5wIgSe3Prv8d8pFggQrwIt6ldYtyFPQcC5V18qrCnt7VpThACaz5RYzpx7RNYIKskOA7UoNiVtMkOrV2QoXv6EvDpbO26a01lVeh8UCeEA4KjAQECAQNYIORIdNHqSOzz1trIygnP5w3JWK2DtlY5NDIBbD7SKcjWowEBAgEDWCD27LpxiZNiA19k0QZhOWmJRvBdK2mz+dHu7rf0iGTPFwQb69Gt42fKNn0FGwRUiav/k6dDF4GiAVgg5Eh00epI7PPW2sjKCc/nDclYrYO2Vjk0MgFsPtIpyNYCWEDzIAooc+m45ay5PB/OB4AA9Fdki4KJq9Ll+PF6IJHYlOVhpTbc3E0KF7ODu1WURd0f7PXnW72dr89CSfGxIHAF"
}
]
```
Now we print the AUM in an expanded form that can be easily read by scripts,
although we include the raw bytes for verification and auditing.
```json
{
"SchemaVersion": "1",
"Messages": [
{
"Hash": "KCEJPRKNSXJG2TPH3EHQRLJNLIIK2DV53FUNPADWA7BZJWBDRXZQ",
"AUM": {
"MessageKind": "checkpoint",
"PrevAUMHash": null,
"Key": null,
"KeyID": null,
"State": {
…
},
"Votes": null,
"Meta": null,
"Signatures": [
{
"KeyID": "tlpub:e44874d1ea48ecf3d6dac8ca09cfe70dc958ad83b656393432016c3ed229c8d6",
"Signature": "8yAKKHPpuOWsuTwfzgeAAPRXZIuCiavS5fjxeiCR2JTlYaU23NxNChezg7tVlEXdH+z151u9na/PQknxsSBwBQ=="
}
]
},
"Raw": "pAEFAvYFpQH2AopYIAkPN-8V3cJpkoC5ZY2-RI2Bcg2q5G7tRAQQd67W3YpnWCDPOo4KGeQBd8hdGsjoEQpSXyiPdlm-NXAlJ5dS1qEbFlggylNJDQM5ZQ2ULNsXxg2ZBFkPl_D93I1M56_rowU-UIlYIPZ_SxT9EA2Idy9kaCbsFzjX_s3Ms7584wWGbWd_f_QAWCBHYZzYiAPpQ-NXN-1Wn2fopQYk4yl7kNQcMXUKNAdt1lggcfjcuVACOH0J9pRNvYZQFOkbiBmLOW1hPKJsbC1D1GdYIKrJ38XMgpVMuTuBxM4YwoLmrK_RgXQw1uVEL3cywl3QWCA0FilVVv8uys8BNhS62cfNvCew1Pw5wIgSe3Prv8d8pFggQrwIt6ldYtyFPQcC5V18qrCnt7VpThACaz5RYzpx7RNYIKskOA7UoNiVtMkOrV2QoXv6EvDpbO26a01lVeh8UCeEA4KjAQECAQNYIORIdNHqSOzz1trIygnP5w3JWK2DtlY5NDIBbD7SKcjWowEBAgEDWCD27LpxiZNiA19k0QZhOWmJRvBdK2mz-dHu7rf0iGTPFwQb69Gt42fKNn0FGwRUiav_k6dDF4GiAVgg5Eh00epI7PPW2sjKCc_nDclYrYO2Vjk0MgFsPtIpyNYCWEDzIAooc-m45ay5PB_OB4AA9Fdki4KJq9Ll-PF6IJHYlOVhpTbc3E0KF7ODu1WURd0f7PXnW72dr89CSfGxIHAF"
}
]
}
```
This output was previously marked as unstable, and it wasn't very useful,
so changing it should be fine.
## Versioning our JSON output
This patch introduces a way to version our JSON output on the CLI, so we
can make backwards-incompatible changes in future without breaking existing
scripts or integrations.
You can run this command in two ways:
```
tailscale lock log --json
tailscale lock log --json=1
```
Passing an explicit version number allows you to pick a specific JSON schema.
If we ever want to change the schema, we increment the version number and
users must opt-in to the new output.
A bare `--json` flag will always return schema version 1, for compatibility
with existing scripts.
Updates https://github.com/tailscale/tailscale/issues/17613
Updates https://github.com/tailscale/corp/issues/23258
Signed-off-by: Alex Chan <alexc@tailscale.com>
Change-Id: I897f78521cc1a81651f5476228c0882d7b723606
This adds the --proxy-protocol flag to 'tailscale serve' and
'tailscale funnel', which tells the Tailscale client to prepend a PROXY
protocol[1] header when making connections to the proxied-to backend.
I've verified that this works with our existing funnel servers without
additional work, since they pass along source address information via
PeerAPI already.
Updates #7747
[1]: https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
Change-Id: I647c24d319375c1b33e995555a541b7615d2d203
Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
It's an unnecessary nuisance having it. We go out of our way to redact
it in so many places when we don't even need it there anyway.
Updates #12639
Change-Id: I5fc72e19e9cf36caeb42cf80ba430873f67167c3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Remove the State enum (StateNew, StateNotAuthenticated, etc.) from
controlclient and replace it with two explicit boolean fields:
- LoginFinished: indicates successful authentication
- Synced: indicates we've received at least one netmap
This makes the state more composable and easier to reason about, as
multiple conditions can be true independently rather than being
encoded in a single enum value.
The State enum was originally intended as the state machine for the
whole client, but that abstraction moved to ipn.Backend long ago.
This change continues moving away from the legacy state machine by
representing state as a combination of independent facts.
Also adds test helpers in ipnlocal that check independent, observable
facts (hasValidNetMap, needsLogin, etc.) rather than relying on
derived state enums, making tests more robust.
Updates #12639
Signed-off-by: James Tucker <james@tailscale.com>
The key.NewEmptyHardwareAttestationKey hook returns a non-nil empty
attestationKey, which means that the nil check in Clone doesn't trigger
and proceeds to try and clone an empty key. Check IsZero instead to
reduce log spam from Clone.
As a drive-by, make tpmAvailable check a sync.Once because the result
won't change.
Updates #17882
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Most /etc/os-release files set the VERSION_ID to a `MAJOR.MINOR`
string, but we were trying to compare this numerically against a major
version number. I can only assume that Linux Mint used switched from a
plain integer, since shells only do integer comparisons.
This patch extracts a VERSION_MAJOR from the VERSION_ID using
parameter expansion and unifies all the other ad-hoc comparisons to
use it.
Fixes#15841
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Co-authored-by: Xavier <xhienne@users.noreply.github.com>
LinkChangeLogLimiter keeps a subscription to track rate limits for log
messages. But when its context ended, it would exit the subscription loop,
leaving the subscriber still alive. Ensure the subscriber gets cleaned up
when the context ends, so we don't stall event processing.
Updates tailscale/corp#34311
Change-Id: I82749e482e9a00dfc47f04afbc69dd0237537cb2
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
On the corp tailnet (using Mullvad exit nodes + bunch of expired
devices + subnet routers), these were generating big ~35 KB blobs of
logging regularly.
This logging shouldn't even exist at this level, and should be rate
limited at a higher level, but for now as a bandaid, make it less
spammy.
Updates #cleanup
Change-Id: I0b5e9e6e859f13df5f982cd71cd5af85b73f0c0a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When TS_LOG_TARGET is set to an invalid URL, url.Parse returns an error
and nil pointer, which caused a panic when accessing u.Host.
Now we check the error from url.Parse and log a helpful message while
falling back to the default log host.
Fixes#17792
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
As a baby step towards eventbus-ifying controlclient, make the
Observer optional.
This also means callers that don't care (like this network lock test,
and some tests in other repos) can omit it, rather than passing in a
no-op one.
Updates #12639
Change-Id: Ibd776b45b4425c08db19405bc3172b238e87da4e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit replaces usage of local.Client in net/udprelay with DERPMap
plumbing over the eventbus. This has been a longstanding TODO. This work
was also accelerated by a memory leak in net/http when using
local.Client over long periods of time. So, this commit also addresses
said leak.
Updates #17801
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Instead of trying to call View() on something that's already a View
type (or trying to Clone the view unnecessarily), we can re-use the
existing View values in a map[T]ViewType.
Fixes#17866
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
They distracted me in some refactoring. They're set but never used.
Updates #17858
Change-Id: I6ec7d6841ab684a55bccca7b7cbf7da9c782694f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
updates tailscale/corp#31571
It appears that on the latest macOS, iOS and tVOS versions, the work
that netns is doing to bind outgoing connections to the default interface (and all
of the trimmings and workarounds in netmon et al that make that work) are
not needed. The kernel is extension-aware and doing nothing, is the right
thing. This is, however, not the case for tailscaled (which is not a
special process).
To allow us to test this assertion (and where it might break things), we add a
new node cap that turns this behaviour off only for network-extension equipped clients,
making it possible to turn this off tailnet-wide, without breaking any tailscaled
macos nodes.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
I noticed a deadlock in a test in a in-development PR where during a
shutdown storm of things (from a tsnet.Server.Close), LocalBackend was
trying to call magicsock.Conn.Synchronize but the magicsock and/or
eventbus was already shut down and no longer processing events.
Updates #16369
Change-Id: I58b1f86c8959303c3fb46e2e3b7f38f6385036f1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Unfortunately I closed the tab and lost it in my sea of CI failures
I'm currently fighting.
Updates #cleanup
Change-Id: I4e3a652d57d52b75238f25d104fc1987add64191
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So they're not all run N times on the sharded oss builders
and are only run one time each.
Updates tailscale/corp#28679
Change-Id: Ie21e84b06731fdc8ec3212eceb136c8fc26b0115
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When systemd notification support was omitted from the build, or on
non-Linux systems, we were unnecessarily emitting code and generating
garbage stringifying addresses upon transition to the Running state.
Updates #12614
Change-Id: If713f47351c7922bb70e9da85bf92725b25954b9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This removes one of the O(n=peers) allocs in getStatus, as
Engine.getStatus happens more often than Reconfig.
Updates #17814
Change-Id: I8a87fbebbecca3aedadba38e46cc418fd163c2b0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously if `chains` was empty, it would be passed to `computeActiveAncestor()`,
which would fail with the misleading error "multiple distinct chains".
Updates tailscale/corp#33846
Signed-off-by: Alex Chan <alexc@tailscale.com>
Change-Id: Ib93a755dbdf4127f81cbf69f3eece5a388db31c8
* lock released early just to call `b.send` when it can call
`b.sendToLocked` instead
* `UnlockEarly` called to release the lock before trivially fast
operations, we can wait for a defer there
Updates #11649
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
It was disabled in May 2024 in #12205 (9eb72bb51).
This removes the unused symbols.
Updates #188
Updates tailscale/corp#19106
Updates tailscale/corp#19116
Change-Id: I5208b7b750b18226ed703532ed58c4ea17195a8e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Use GetGlobalAddrs() to discover all STUN endpoints, handling bad NATs
that create multiple mappings. When MappingVariesByDestIP is true, also
add the first STUN IPv4 address with the relay's local port for static
port mapping scenarios.
Updates #17796
Signed-off-by: Raj Singh <raj@tailscale.com>
The feature is currently in private alpha, so requires a tailnet feature
flag. Initially focuses on supporting the operator's own auth, because the
operator is the only device we maintain that uses static long-lived
credentials. All other operator-created devices use single-use auth keys.
Testing steps:
* Create a cluster with an API server accessible over public internet
* kubectl get --raw /.well-known/openid-configuration | jq '.issuer'
* Create a federated OAuth client in the Tailscale admin console with:
* The issuer from the previous step
* Subject claim `system:serviceaccount:tailscale:operator`
* Write scopes services, devices:core, auth_keys
* Tag tag:k8s-operator
* Allow the Tailscale control plane to get the public portion of
the ServiceAccount token signing key without authentication:
* kubectl create clusterrolebinding oidc-discovery \
--clusterrole=system:service-account-issuer-discovery \
--group=system:unauthenticated
* helm install --set oauth.clientId=... --set oauth.audience=...
Updates #17457
Change-Id: Ib29c85ba97b093c70b002f4f41793ffc02e6c6e9
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Now that the feature is in beta, no one should encounter this error.
Updates #cleanup
Change-Id: I69ed3f460b7f28c44da43ce2f552042f980a0420
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This starts running the jsontags vet checker on the module.
All existing findings are adding to an allowlist.
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
The cmd/jsontags is non-idiomatic since it is not a main binary.
Move it to a vet directory, which will eventually contain a vettool binary.
Update tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Include the node's OS with network flow log information.
Refactor the JSON-length computation to be a bit more precise.
Updates tailscale/corp#33352Fixestailscale/corp#34030
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Prior to this change a SubscriberFunc treated the call to the subscriber's
function as the completion of delivery. But that means when we are closing the
subscriber, that callback could continue to execute for some time after the
close returns.
For channel-based subscribers that works OK because the close takes effect
before the subscriber ever sees the event. To make the two subscriber types
symmetric, we should also wait for the callback to finish before returning.
This ensures that a Close of the client means the same thing with both kinds of
subscriber.
Updates #17638
Change-Id: I82fd31bcaa4e92fab07981ac0e57e6e3a7d9d60b
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Add options to the eventbus.Bus to plumb in a logger.
Route that logger in to the subscriber machinery, and trigger a log message to
it when a subscriber fails to respond to its delivered events for 5s or more.
The log message includes the package, filename, and line number of the call
site that created the subscription.
Add tests that verify this works.
Updates #17680
Change-Id: I0546516476b1e13e6a9cf79f19db2fe55e56c698
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
In particular on Windows, the `transport.TPMCloser` we get is not safe
for concurrent use. This is especially noticeable because
`tpm.attestationKey.Clone` uses the same open handle as the original
key. So wrap the operations on ak.tpm with a mutex and make a deep copy
with a new connection in Clone.
Updates #15830
Updates #17662
Updates #17644
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Specify the app apability that failed the test, instead of the
entire comma-separated list.
Fixes #cleanup
Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
In #17639 we moved the subscription into NewLogger to ensure we would not race
subscribing with shutdown of the eventbus client. Doing so fixed that problem,
but exposed another: As we were only servicing events occasionally when waiting
for the network to come up, we could leave the eventbus to stall in cases where
a number of network deltas arrived later and weren't processed.
To address that, let's separate the concerns: As before, we'll Subscribe early
to avoid conflicts with shutdown; but instead of using the subscriber directly
to determine readiness, we'll keep track of the last-known network state in a
selectable condition that the subscriber updates for us. When we want to wait,
we'll wait on that condition (or until our context ends), ensuring all the
events get processed in a timely manner.
Updates #17638
Updates #15160
Change-Id: I28339a372be4ab24be46e2834a218874c33a0d2d
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Adds a new Redirect field to HTTPHandler for serving HTTP redirects
from the Tailscale serve config. The redirect URL supports template
variables ${HOST} and ${REQUEST_URI} that are resolved per request.
By default, it redirects using HTTP Status 302 (Found). For another
redirect status, like 301 - Moved Permanently, pass the HTTP status
code followed by ':' on Redirect, like: "301:https://tailscale.com"
Updates #11252
Updates #11330
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
In 3f5c560fd4 I changed to use std net/http's HTTP/2 support,
instead of pulling in x/net/http2.
But I forgot to update DialTLSContext to DialContext, which meant it
was falling back to using the std net.Dialer for its dials, instead
of the passed-in one.
The tests only passed because they were using localhost addresses, so
the std net.Dialer worked. But in prod, where a tsnet Dialer would be
needed, it didn't work, and would time out for 10 seconds before
resorting to the old protocol.
So this fixes the tests to use an isolated in-memory network to prevent
that class of problem in the future. With the test change, the old code
fails and the new code passes.
Thanks to @jasonodonnell for debugging!
Updates #17304
Updates 3f5c560fd4
Change-Id: I3602bafd07dc6548e2c62985af9ac0afb3a0e967
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Single letter 'l' variables can eventually become confusing when
they're rendered in some fonts that make them similar to 1 or I.
Updates #cleanup
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
A follow-up to #17411. Put AppConnector events into a task queue, as they may
take some time to process. Ensure that the queue is stopped at shutdown so that
cleanup will remain orderly.
Because events are delivered on a separate goroutine, slow processing of an
event does not cause an immediate problem; however, a subscriber that blocks
for a long time will push back on the bus as a whole. See
https://godoc.org/tailscale.com/util/eventbus#hdr-Expected_subscriber_behavior
for more discussion.
Updates #17192
Updates #15160
Change-Id: Ib313cc68aec273daf2b1ad79538266c81ef063e3
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This migrates an internal tool to open source
so that we can run it on the tailscale.com module as well.
This PR does not yet set up a CI to run this analyzer.
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
This rewrites the netlog package to support embedding node information in network flow logs.
Some bit of complexity comes in trying to pre-compute the expected size of the log message
after JSON serialization to ensure that we can respect maximum body limits in log uploading.
We also fix a bug in tstun, where we were recording the IP address after SNAT,
which was resulting in non-sensible connection flows being logged.
Updates tailscale/corp#33352
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
This migrates an internal tool to open source
so that we can run it on the tailscale.com module as well.
We add the "util/safediff" also as a dependency of the tool.
This PR does not yet set up a CI to run this analyzer.
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Found by staticcheck, the test was calling derphttp.NewClient but not checking
its error result before doing other things to it.
Updates #cleanup
Change-Id: I4ade35a7de7c473571f176e747866bc0ab5774db
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This reverts commit 4346615d77.
We averted the shutdown race, but will need to service the subscriber even when
we are not waiting for a change so that we do not delay the bus as a whole.
Updates #17638
Change-Id: I5488466ed83f5ad1141c95267f5ae54878a24657
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Drop usage of the branches filter with a single asterisk as this matches
against zero or more characters but not a forward slash, resulting in
PRs to branch names with forwards slashes in them not having these
workflow run against them as expected.
Updates https://github.com/tailscale/corp/issues/33523
Signed-off-by: Mario Minardi <mario@tailscale.com>
Also consolidates variable and header naming and amends the
CLI behavior
* multiple app-caps have to be specified as comma-separated
list
* simple regex-based validation of app capability names is
carried out during flag parsing
Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
Given that we filter based on the usercaps argument now, truncation
should not be necessary anymore.
Updates tailscale/corp/#28372
Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
Temporarily back out the TPM-based hw attestation code while we debug
Windows exceptions.
Updates tailscale/corp#31269
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
When the eventbus is enabled, set up the subscription for change deltas at the
beginning when the client is created, rather than waiting for the first
awaitInternetUp check.
Otherwise, it is possible for a check to race with the client close in
Shutdown, which triggers a panic.
Updates #17638
Change-Id: I461c07939eca46699072b14b1814ecf28eec750c
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This compares the warnings we actually care about and skips the unstable
warnings and the changes with no warnings.
Fixes#17635
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
If you run tailscaled without passing a `--statedir`, Tailnet Lock is
unavailable -- we don't have a folder to store the AUMs in.
This causes a lot of unnecessary requests to bootstrap TKA, because
every time the node receives a NetMap with some TKA state, it tries to
bootstrap, fetches the bootstrap TKA state from the control plane, then
fails with the error:
TKA sync error: bootstrap: network-lock is not supported in this
configuration, try setting --statedir
We can't prevent the error, but we can skip the control plane request
that immediately gets dropped on the floor.
In local testing, a new node joining a tailnet caused *three* control
plane requests which were unused.
Updates tailscale/corp#19441
Signed-off-by: Alex Chan <alexc@tailscale.com>
This fixes a regression from dd615c8fdd that moved the
newIPTablesRunner constructor from a any-Linux-GOARCH file to one that
was only amd64 and arm64, thus breaking iptables on other platforms
(notably 32-bit "arm", as seen on older Pis running Buster with
iptables)
Tested by hand on a Raspberry Pi 2 w/ Buster + iptables for now, for
lack of automated 32-bit arm tests at the moment. But filed #17629.
Fixes#17623
Updates #17629
Change-Id: Iac1a3d78f35d8428821b46f0fed3f3717891c1bd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
On some platforms e.g. ChromeOS the owner hierarchy might not always be
available to us. To avoid stale sealing exceptions later we probe to
confirm it's working rather than rely solely on family indicator status.
Updates #17622
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Check that the TPM we have opened is advertised as a 2.0 family device
before using it for state sealing / hardware attestation.
Updates #17622
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
This reformats the existing text to have line breaks at sentences. This
commit contains no textual changes to the code of conduct, but is done
to make any subsequent changes easier to review. (sembr.org)
Also apply prettier formatting for consistency.
Updates #cleanup
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
* When we do the TKA sync, log whether TKA is enabled and whether
we want it to be enabled. This would help us see if a node is
making bootstrap errors.
* When we fail to look up an AUM locally, log the ID of the AUM
rather than a generic "file does not exist" error.
These AUM IDs are cryptographic hashes of the TKA state, which
itself just contains public keys and signatures. These IDs aren't
sensitive and logging them is safe.
Signed-off-by: Alex Chan <alexc@tailscale.com>
Updates https://github.com/tailscale/corp/issues/33594
Service hosts must be tagged nodes, meaning it is only valid to
advertise a Service from a machine which has at least one ACL tag.
Fixestailscale/corp#33197
Signed-off-by: Harry Harpham <harry@tailscale.com>
If users start the application with sudo, DBUS is likely not available
or will not have the correct endpoints. We want to warn users when doing
this.
Closes#17593
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
This does not change which subscriptions are made, it only swaps them to use
the SubscribeFunc API instead of Subscribe.
Updates #15160
Updates #17487
Change-Id: Id56027836c96942206200567a118f8bcf9c07f64
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This patch creates a set of tests that should be true for all implementations of Chonk and CompactableChonk, which we can share with the SQLite implementation in corp.
It includes all the existing tests, plus a test for LastActiveAncestor which was in corp but not in oss.
Updates https://github.com/tailscale/corp/issues/33465
Signed-off-by: Alex Chan <alexc@tailscale.com>
Previously, running `tailscale lock log` in a tailnet without Tailnet
Lock enabled would return a potentially confusing error:
$ tailscale lock log
2025/10/20 11:07:09 failed to connect to local Tailscale service; is Tailscale running?
It would return this error even if Tailscale was running.
This patch fixes the error to be:
$ tailscale lock log
Tailnet Lock is not enabled
Fixes#17586
Signed-off-by: Alex Chan <alexc@tailscale.com>
Add new arguments to `tailscale up` so authkeys can be generated dynamically via identity federation.
Updates #9192
Signed-off-by: mcoulombe <max@tailscale.com>
* Remove a couple of single-letter `l` variables
* Use named struct parameters in the test cases for readability
* Delete `wantAfterInactivityForFn` parameter when it returns the
default zero
Updates #cleanup
Signed-off-by: Alex Chan <alexc@tailscale.com>
We soft-delete AUMs when they're purged, but when we call `ChildAUMs()`,
we look up soft-deleted AUMs to find the `Children` field.
This patch changes the behaviour of `ChildAUMs()` so it only looks at
not-deleted AUMs. This means we don't need to record child information
on AUMs any more, which is a minor space saving for any newly-recorded
AUMs.
Updates https://github.com/tailscale/tailscale/issues/17566
Updates https://github.com/tailscale/corp/issues/27166
Signed-off-by: Alex Chan <alexc@tailscale.com>
This method was added in cca25f6 in the initial in-memory implementation
of Chonk, but it's not part of the Chonk interface and isn't implemented
or used anywhere else. Let's get rid of it.
Updates https://github.com/tailscale/corp/issues/33465
Signed-off-by: Alex Chan <alexc@tailscale.com>
This commit modifies the k8s-operator's api proxy implementation to only
enable forwarding of api requests to tsrecorder when an environment
variable is set.
This new environment variable is named `TS_EXPERIMENTAL_KUBE_API_EVENTS`.
Updates https://github.com/tailscale/corp/issues/32448
Signed-off-by: David Bond <davidsbond93@gmail.com>
Merge the connstats package into the netlog package
and unexport all of its declarations.
Remove the buildfeatures.HasConnStats and use HasNetLog instead.
Updates tailscale/corp#33352
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
The connstats package was an unnecessary layer of indirection.
It was seperated out of wgengine/netlog so that net/tstun and
wgengine/magicsock wouldn't need a depenedency on the concrete
implementation of network flow logging.
Instead, we simply register a callback for counting connections.
This PR does the bare minimum work to prepare tstun and magicsock
to only care about that callback.
A future PR will delete connstats and merge it into netlog.
Updates tailscale/corp#33352
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Remove CBOR representation since it was never used.
We should support CBOR in the future, but for remove it
for now so that it is less work to add more fields.
Also, rely on just omitzero for JSON now that it is supported in Go 1.24.
Updates tailscale/corp#33352
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
I was debugging a customer issue and saw in their 1.88.3 logs:
TPM: error opening: stat /dev/tpm0: no such file or directory
That's unnecessary output. The lack of TPM will be reported by
them having a nil Hostinfo.TPM, which is plenty elsewhere in logs.
Let's only write out an "error opening" line if it's an interesting
error. (perhaps permissions, or EIO, etc)
Updates #cleanup
Change-Id: I3f987f6bf1d3ada03473ca3eef555e9cfafc7677
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Before synctest, timers was needed to allow the events to flow into the
test bus. There is still a timer, but this one is not derived from the
test deadline and it is mostly arbitrary as synctest will render it
practically non-existent.
With this approach, tests that do not need to test for the absence of
events do not rely on synctest.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
On Windows arm64 we are going to need to ship two different GUI builds;
one for Win10 (GOARCH=386) and one for Win11 (GOARCH=amd64, tags +=
winui). Due to quirks in MSI packaging, they cannot both share the
same filename. This requires some fixes in places where we have
hardcoded "tailscale-ipn" as the GUI filename.
We also do some cleanup in clientupdate to ensure that autoupdates
will continue to work correctly with the temporary "-winui" package
variant.
Fixes#17480
Updates https://github.com/tailscale/corp/issues/29940
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Extend Persist with AttestationKey to record a hardware-backed
attestation key for the node's identity.
Add a flag to tailscaled to allow users to control the use of
hardware-backed keys to bind node identity to individual machines.
Updates tailscale/corp#31269
Change-Id: Idcf40d730a448d85f07f1bebf387f086d4c58be3
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
The default representation of time.Duration has different
JSON representation between v1 and v2.
Apply an explicit format flag that uses the v1 representation
so that this behavior does not change if serialized with v2.
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
updates tailscale/tailscale#16836
Android's altNetInterfaces implementation now returns net.IPAddr
types which netmon wasn't handling.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
With a channel subscriber, the subscription processing always occurs on another
goroutine. The SubscriberFunc (prior to this commit) runs its callbacks on the
client's own goroutine. This changes the semantics, though: In addition to more
directly pushing back on the publisher, a publisher and subscriber can deadlock
in a SubscriberFunc but succeed on a Subscriber. They should behave
equivalently regardless which interface they use.
Arguably the caller should deal with this by creating its own goroutine if it
needs to. However, that loses much of the benefit of the SubscriberFunc API, as
it will need to manage the lifecycle of that goroutine. So, for practical
ergonomics, let's make the SubscriberFunc do this management on the user's
behalf. (We discussed doing this in #17432, but decided not to do it yet). We
can optimize this approach further, if we need to, without changing the API.
Updates #17487
Change-Id: I19ea9e8f246f7b406711f5a16518ef7ff21a1ac9
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This commit adds the subcommands `get-config` and `set-config` to Serve,
which can be used to read the current Tailscale Services configuration
in a standard syntax and provide a configuration to declaratively apply
with that same syntax.
Both commands must be provided with either `--service=svc:service` for
one service, or `--all` for all services. When writing a config,
`--set-config --all` will overwrite all existing Services configuration,
and `--set-config --service=svc:service` will overwrite all
configuration for that particular Service. Incremental changes are not
supported.
Fixestailscale/corp#30983.
cmd/tailscale/cli: hide serve "get-config"/"set-config" commands for now
tailscale/corp#33152 tracks unhiding them when docs exist.
Signed-off-by: Naman Sood <mail@nsood.in>
when tsrecorder receives events, it populates this field with
information about the node the request was sent to.
Updates #17141
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
The lazy init led to confusion and a belief that was something was
wrong. It's reasonable to expect the daemon to listen on the port at the
time it's configured.
Updates tailscale/corp#33094
Signed-off-by: Jordan Whited <jordan@tailscale.com>
I got sidetracked apparently and never finished writing this Clone
code in 316afe7d02 (#17448). (It really should use views instead.)
And then I missed one of the users of "routerChanged" that was broken up
into "routerChanged" vs "dnsChanged".
This broke integration tests elsewhere.
Fixes#17506
Change-Id: I533bf0fcf3da9ac6eb4a6cdef03b8df2c1fb4c8e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Update Nix flake to use go 1.25.2
Create the hash from the toolchain rev file automatically from
update-flake.sh
Updates tailscale/go#135
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
This patch fixes several issues related to printing login and device
approval URLs, especially when `tailscale up` is interrupted:
1. Only print a login URL that will cause `tailscale up` to complete.
Don't print expired URLs or URLs from previous login attempts.
2. Print the device approval URL if you run `tailscale up` after
previously completing a login, but before approving the device.
3. Use the correct control URL for device approval if you run a bare
`tailscale up` after previously completing a login, but before
approving the device.
4. Don't print the device approval URL more than once (or at least,
not consecutively).
Updates tailscale/corp#31476
Updates #17361
## How these fixes work
This patch went through a lot of trial and error, and there may still
be bugs! These notes capture the different scenarios and considerations
as we wrote it, which are also captured by integration tests.
1. We were getting stale login URLs from the initial IPN state
notification.
When the IPN watcher was moved to before Start() in c011369, we
mistakenly continued to request the initial state. This is only
necessary if you start watching after you call Start(), because
you may have missed some notifications.
By getting the initial state before calling Start(), we'd get
a stale login URL. If you clicked that URL, you could complete
the login in the control server (if it wasn't expired), but your
instance of `tailscale up` would hang, because it's listening for
login updates from a different login URL.
In this patch, we no longer request the initial state, and so we
don't print a stale URL.
2. Once you skip the initial state from IPN, the following sequence:
* Run `tailscale up`
* Log into a tailnet with device approval
* ^C after the device approval URL is printed, but without approving
* Run `tailscale up` again
means that nothing would ever be printed.
`tailscale up` would send tailscaled the pref `WantRunning: true`,
but that was already the case so nothing changes. You never get any
IPN notifications, and in particular you never get a state change to
`NeedsMachineAuth`. This means we'd never print the device approval URL.
In this patch, we add a hard-coded rule that if you're doing a simple up
(which won't trigger any other IPN notifications) and you start in the
`NeedsMachineAuth` state, we print the device approval message without
waiting for an IPN notification.
3. Consider the following sequence:
* Run `tailscale up --login-server=<custom server>`
* Log into a tailnet with device approval
* ^C after the device approval URL is printed, but without approving
* Run `tailscale up` again
We'd print the device approval URL for the default control server,
rather than the real control server, because we were using the `prefs`
from the CLI arguments (which are all the defaults) rather than the
`curPrefs` (which contain the custom login server).
In this patch, we use the `prefs` if the user has specified any settings
(and other code will ensure this is a complete set of settings) or
`curPrefs` if it's a simple `tailscale up`.
4. Consider the following sequence: you've logged in, but not completed
device approval, and you run `down` and `up` in quick succession.
* `up`: sees state=NeedsMachineAuth
* `up`: sends `{wantRunning: true}`, prints out the device approval URL
* `down`: changes state to Stopped
* `up`: changes state to Starting
* tailscaled: changes state to NeedsMachineAuth
* `up`: gets an IPN notification with the state change, and prints
a second device approval URL
Either URL works, but this is annoying for the user.
In this patch, we track whether the last printed URL was the device
approval URL, and if so, we skip printing it a second time.
Signed-off-by: Alex Chan <alexc@tailscale.com>
This patch extends the integration tests for `tailscale up` to include tailnets
where new devices need to be approved. It doesn't change the CLI, because it's
mostly working correctly already -- these tests are just to prevent future
regressions.
I've added support for `MachineAuthorized` to mock control, and I've refactored
`TestOneNodeUpAuth` to be more flexible. It now takes a sequence of steps to
run and asserts whether we got a login URL and/or machine approval URL after
each step.
Updates tailscale/corp#31476
Updates #17361
Signed-off-by: Alex Chan <alexc@tailscale.com>
This commit also shuffles the hasPeerRelayServers atomic load
to happen sooner, reducing the cost for clients with no peer relay
servers.
Updates tailscale/corp#33099
Signed-off-by: Jordan Whited <jordan@tailscale.com>
The hijacker on k8s-proxy's reverse proxy is used to stream recordings
to tsrecorder as they pass through the proxy to the kubernetes api
server. The connection to the recorder was using the client's
(e.g., kubectl) context, rather than a dedicated one. This was causing
the recording stream to get cut off in scenarios where the client
cancelled the context before streaming could be completed.
By using a dedicated context, we can continue streaming even if the
client cancels the context (for example if the client request
completes).
Fixes#17404
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Originally proposed by @bradfitz in #17413.
In practice, a lot of subscribers have only one event type of interest, or a
small number of mostly independent ones. In that case, the overhead of running
and maintaining a goroutine to select on multiple channels winds up being more
noisy than we'd like for the user of the API.
For this common case, add a new SubscriberFunc[T] type that delivers events to
a callback owned by the subscriber, directly on the goroutine belonging to the
client itself. This frees the consumer from the need to maintain their own
goroutine to pull events from the channel, and to watch for closure of the
subscriber.
Before:
s := eventbus.Subscribe[T](eventClient)
go func() {
for {
select {
case <-s.Done():
return
case e := <-s.Events():
doSomethingWith(e)
}
}
}()
// ...
s.Close()
After:
func doSomethingWithT(e T) { ... }
s := eventbus.SubscribeFunc(eventClient, doSomethingWithT)
// ...
s.Close()
Moreover, unless the caller wants to explicitly stop the subscriber separately
from its governing client, it need not capture the SubscriberFunc value at all.
One downside of this approach is that a slow or deadlocked callback could block
client's service routine and thus stall all other subscriptions on that client,
However, this can already happen more broadly if a subscriber fails to service
its delivery channel in a timely manner, it just feeds back more immediately.
Updates #17487
Change-Id: I64592d786005177aa9fd445c263178ed415784d5
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Since #17376, containerboot crashes on startup in k8s because state
encryption is enabled by default without first checking that it's
compatible with the selected state store. Make sure we only default
state encryption to enabled if it's not going to immediately clash with
other bits of tailscaled config.
Updates tailscale/corp#32909
Change-Id: I76c586772750d6da188cc97b647c6e0c1a8734f0
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Saves ~94 KB from the min build.
Updates #12614
Change-Id: I3b0b8a47f80b9fd3b1038c2834b60afa55bf02c2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Part of making all netlink monitoring code optional.
Updates #17311 (how I got started down this path)
Updates #12614
Change-Id: Ic80d8a7a44dc261c4b8678b3c2241c3b3778370d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Also pull out interface method only needed in Linux.
Instead of having userspace do the call into the router, just let the
router pick up the change itself.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Before we introduced seamless, the "blocked" state was used to track:
* Whether a login was required for connectivity, and therefore we should
keep the engine deconfigured until that happened
* Whether authentication was in progress
"blocked" would stop authReconfig from running. We want this when a login is
required: if your key has expired we want to deconfigure the engine and keep
it down, so that you don't keep using exit nodes (which won't work because
your key has expired).
Taking the engine down while auth was in progress was undesirable, so we
don't do that with seamless renewal. However, not entering the "blocked"
state meant that we needed to change the logic for when to send
LoginFinished on the IPN bus after seeing StateAuthenticated from the
controlclient. Initially we changed the "if blocked" check to "if blocked or
seamless is enabled" which was correct in other places.
In this place however, it introduced a bug: we are sending LoginFinished
every time we see StateAuthenticated, which happens even on a down & up, or
a profile switch. This in turn made it harder for UI clients to track when
authentication is complete.
Instead we should only send it out if we were blocked (i.e. seamless is
disabled, or our key expired) or an auth was in progress.
Updates tailscale/corp#31476
Updates tailscale/corp#32645
Fixes#17363
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
Saves 45 KB from the min build, no longer pulling in deephash or
util/hashx, both with unsafe code.
It can actually be more efficient to not use deephash, as you don't
have to walk all bytes of all fields recursively to answer that two
things are not equal. Instead, you can just return false at the first
difference you see. And then with views (as we use ~everywhere
nowadays), the cloning the old value isn't expensive, as it's just a
pointer under the hood.
Updates #12614
Change-Id: I7b08616b8a09b3ade454bb5e0ac5672086fe8aec
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Historically, and until recently, --extra-small produced a usable build.
When I recently made osrouter be modular in 39e35379d4 (which is
useful in, say, tsnet builds) after also making netstack modular, that
meant --min now lacked both netstack support for routing and system
support for routing, making no way to get packets into
wireguard. That's not a nice default to users. (we've documented
build_dist.sh in our KB)
Restore --extra-small to making a usable build, and add --min for
benchmarking purposes.
Updates #12614
Change-Id: I649e41e324a36a0ca94953229c9914046b5dc497
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Some of the test cases access fields of the backend that are supposed to be
locked while the test is running, which can trigger the race detector. I fixed
a few of these in #17411, but I missed these two cases.
Updates #15160
Updates #17192
Change-Id: I45664d5e34320ecdccd2844e0f8b228145aaf603
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Saves ~53 KB from the min build.
Updates #12614
Change-Id: I73f9544a9feea06027c6ebdd222d712ada851299
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add subscribers for AppConnector events
Make the RouteAdvertiser interface optional We cannot yet remove it because
the tests still depend on it to verify correctness. We will need to separately
update the test fixtures to remove that dependency.
Publish RouteInfo via the event bus, so we do not need a callback to do that.
Replace it with a flag that indicates whether to treat the route info the connector
has as "definitive" for filtering purposes.
Update the tests to simplify the construction of AppConnector values now that a
store callback is no longer required. Also fix a couple of pre-existing racy tests that
were hidden by not being concurrent in the same way production is.
Updates #15160
Updates #17192
Change-Id: Id39525c0f02184e88feaf0d8a3c05504850e47ee
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
If we received a wg engine status while processing an auth URL, there was a
race condition where the authURL could be reset to "" immediately after we
set it.
To fix this we need to check that we are moving from a non-Running state to
a Running state rather than always resetting the URL when we "move" into a
Running state even if that is the current state.
We also need to make sure that we do not return from stopEngineAndWait until
the engine is stopped: before, we would return as soon as we received any
engine status update, but that might have been an update already in-flight
before we asked the engine to stop. Now we wait until we see an update that
is indicative of a stopped engine, or we see that the engine is unblocked
again, which indicates that the engine stopped and then started again while
we were waiting before we checked the state.
Updates #17388
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
Co-authored-by: Nick Khyl <nickk@tailscale.com>
Saves ~102 KB from the min build.
Updates #12614
Change-Id: Ie1d4f439321267b9f98046593cb289ee3c4d6249
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Due to iOS memory limitations in 2020 (see
https://tailscale.com/blog/go-linker, etc) and wireguard-go using
multiple goroutines per peer, commit 16a9cfe2f4 introduced some
convoluted pathsways through Tailscale to look at packets before
they're delivered to wireguard-go and lazily reconfigure wireguard on
the fly before delivering a packet, only telling wireguard about peers
that are active.
We eventually want to remove that code and integrate wireguard-go's
configuration with Tailscale's existing netmap tracking.
To make it easier to find that code later, this makes it modular. It
saves 12 KB (of disk) to turn it off (at the expense of lots of RAM),
but that's not really the point. The point is rather making it obvious
(via the new constants) where this code even is.
Updates #12614
Change-Id: I113b040f3e35f7d861c457eaa710d35f47cee1cb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Explain that this file stays forked from coder/websocket until we can
depend on an upstream release for the helper.
Updates #cleanup
Signed-off-by: kscooo <kscowork@gmail.com>
Switching to a Geneve-encapsulated (peer relay) path in
endpoint.handlePongConnLocked is expected around port rebinds, which end
up clearing endpoint.bestAddr.
Fixestailscale/corp#33036
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Saves only 12 KB, but notably removes some deps on packages that future
changes can then eliminate entirely.
Updates #12614
Change-Id: Ibf830d3ee08f621d0a2011b1d4cd175427ef50df
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
c2n was already a conditional feature, but it didn't have a
feature/c2n directory before (rather, it was using consts + DCE). This
adds it, and moves some code, which removes the httprec dependency.
Also, remove some unnecessary code from our httprec fork.
Updates #12614
Change-Id: I2fbe538e09794c517038e35a694a363312c426a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
As found by @cmol in #17423.
Updates #17423
Change-Id: I1492501f74ca7b57a8c5278ea6cb87a56a4086b9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Saves 86 KB.
And stop depending on expvar and usermetrics when disabled,
in prep to removing all the expvar/metrics/tsweb stuff.
Updates #12614
Change-Id: I35d2479ddd1d39b615bab32b1fa940ae8cbf9b11
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This patch removes some code that didn’t get removed before merging
the changes in #16580.
Updates #cleanup
Updates #16551
Signed-off-by: Simon Law <sfllaw@tailscale.com>
kubestore init function has now been moved to a more explicit path of
ipn/store/kubestore meaning we can now avoid the generic import of
feature/condregister.
Updates #12614
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
When running integration tests on macOS, we get a panic from a nil
pointer dereference when calling `ci.creds.PID()`.
This panic occurs because the `ci.creds != nil` check is insufficient
after a recent refactoring (c45f881) that changed `ci.creds` from a
pointer to the `PeerCreds` interface. Now `ci.creds` always compares as
non-nil, so we enter this block even when the underlying value is nil.
The integration tests fail on macOS when `peercred.Get()` returns the
error `unix.GetsockoptInt: socket is not connected`. This error isn't
new, and the previous code was ignoring it correctly.
Since we trust that `peercred` returns either a usable value or an error,
checking for a nil error is a sufficient and correct gate to prevent the
method call and avoid the panic.
Fixes#17421
Signed-off-by: Alex Chan <alexc@tailscale.com>
In the earlier http2 package migration (1d93bdce20, #17394) I had
removed Direct.Close's tracking of the connPool, thinking it wasn't
necessary.
Some tests (in another repo) are strict and like it to tear down the
world and wait, to check for leaked goroutines. And they caught this
letting some goroutines idle past Close, even if they'd eventually
close down on their own.
This restores the connPool accounting and the aggressife close.
Updates #17305
Updates #17394
Change-Id: I5fed283a179ff7c3e2be104836bbe58b05130cc7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The control plane will sometimes determine that a node is not online,
while the node is still able to connect to its peers. This patch
doesn’t solve this problem, but it does mitigate it.
This PR introduces the `client-side-reachability` node attribute that
switches the node to completely ignore the online signal from control.
In the future, the client itself should collect reachability data from
active Wireguard flows and Tailscale pings.
Updates #17366
Updates tailscale/corp#30379
Updates tailscale/corp#32686
Signed-off-by: Simon Law <sfllaw@tailscale.com>
A recent change (009d702adf) introduced a deadlock where the
/machine/update-health network request to report the client's health
status update to the control plane was moved to being synchronous
within the eventbus's pump machinery.
I started to instead make the health reporting be async, but then we
realized in the three years since we added that, it's barely been used
and doesn't pay for itself, for how many HTTP requests it makes.
Instead, delete it all and replace it with a c2n handler, which
provides much more helpful information.
Fixestailscale/corp#32952
Change-Id: I9e8a5458269ebfdda1c752d7bbb8af2780d71b04
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Saves 262 KB so far. I'm sure I missed some places, but shotizam says
these were the low hanging fruit.
Updates #12614
Change-Id: Ia31c01b454f627e6d0470229aae4e19d615e45e3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Maybe it matters? At least globally across all nodes?
Fixes#17343
Change-Id: I3f61758ea37de527e16602ec1a6e453d913b3195
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add and wire up event publishers for these two event types in the AppConnector.
Nothing currently subscribes to them, so this is harmless. Subscribers for
these events will be added in a near-future commit.
As part of this, move the appc.RouteInfo type to the types/appctype package.
It does not contain any package-specific details from appc. Beside it, add
appctype.RouteUpdate to carry route update event state, likewise not specific
to appc. Update all usage of the appc.* types throughout to use appctype.*
instead, and update depaware files to reflect these changes.
Add a Close method to the AppConnector to make sure the client gets cleaned up
when the connector is dropped (we re-create connectors).
Update the unit tests in the appc package to also check the events published
alongside calls to the RouteAdvertiser.
For now the tests still rely on the RouteAdvertiser for correctness; this is OK
for now as the two methods are always performed together. In the near future,
we need to rework the tests so not require that, but that will require building
some more test fixtures that we can handle separately.
Updates #15160
Updates #17192
Change-Id: I184670ba2fb920e0d2cb2be7c6816259bca77afe
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Instead of using separate channels to manage the lifecycle of the eventbus
client, use the recently-added eventbus.Monitor, which handles signaling the
processing loop to stop and waiting for it to complete. This allows us to
simplify some of the setup and cleanup code in the relay server.
Updates #15160
Change-Id: Ia1a47ce2e5a31bc8f546dca4c56c3141a40d67af
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Saves 352 KB, removing one of our two HTTP/2 implementations linked
into the binary.
Fixes#17305
Updates #15015
Change-Id: I53a04b1f2687dca73c8541949465038b69aa6ade
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a .gitignore for the chart version of the CRDs that we never commit,
because the static manifest CRD files are the canonical version. This
makes it easier to deploy the CRDs via the helm chart in a way that
reflects the production workflow without making the git checkout
"dirty".
Given that the chart CRDs are ignored, we can also now safely generate
them for the kube-generate-all Makefile target without being a nuisance
to the state of the git checkout. Added a slightly more robust repo root
detection to the generation logic to make sure the command works from
the context of both the Makefile and the image builder command we run
for releases in corp.
Updates tailscale/corp#32085
Change-Id: Id44a4707c183bfaf95a160911ec7a42ffb1a1287
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
mkctr already has support for including extra files in the built
container image. Wire up a new optional environment variable to thread
that through to mkctr. The operator e2e tests will use this to bake
additional trusted CAs into the test image without significantly
departing from the normal build or deployment process for our
containers.
Updates tailscale/corp#32085
Change-Id: Ica94ed270da13782c4f5524fdc949f9218f79477
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Using memnet and synctest removes flakiness caused by real networking
and subtle timing differences.
Additionally, remove the `t.Logf` call inside the server's shutdown
goroutine that was causing a false positive data race detection.
The race detector is flagging a double write during this `t.Logf` call.
This is a common pattern, noted in golang/go#40343 and elsehwere in
this file, where using `t.Logf` after a test has finished can interact
poorly with the test runner.
This is a long-standing issue which became more common after rewriting
this test to use memnet and synctest.
Fixed#17355
Signed-off-by: Alex Chan <alexc@tailscale.com>
Whenever running on a platform that has a TPM (and tailscaled can access
it), default to encrypting the state. The user can still explicitly set
this flag to disable encryption.
Updates https://github.com/tailscale/corp/issues/32909
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
A following change will split out the controlclient.NoiseClient type
out, away from the rest of the controlclient package which is
relatively dependency heavy.
A question was where to move it, and whether to make a new (a fifth!)
package in the ts2021 dependency chain.
@creachadair and I brainstormed and decided to merge
internal/noiseconn and controlclient.NoiseClient into one package,
with names ts2021.Conn and ts2021.Client.
For ease of reviewing the subsequent PR, this is the first step that
just renames the internal/noiseconn package to control/ts2021.
Updates #17305
Change-Id: Ib5ea162dc1d336c1d805bdd9548d1702dd6e1468
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
depaware was merging golang.org/x/foo and std's
vendor/golang.org/x/foo packages (which could both be in the binary!),
leading to confusing output, especially when I was working on
eliminating duplicate packages imported under different names.
This makes the depaware output longer and grosser, but doesn't hide
reality from us.
Updates #17305
Change-Id: I21cc3418014e127f6c1a81caf4e84213ce84ab57
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Require the presence of the bus, but do not use it yet. Check for required
fields and update tests and production use to plumb the necessary arguments.
Updates #15160
Updates #17192
Change-Id: I8cefd2fdb314ca9945317d3320bd5ea6a92e8dcb
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
The callback itself is not removed as it is used in other repos, making
it simpler for those to slowly transition to the eventbus.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Replace the positional arguments to NewAppConnector with a Config struct.
Update the existing uses. Other than the API change, there are no functional
changes in this commit.
Updates #15160
Updates #17192
Change-Id: Ibf37f021372155a4db8aaf738f4b4f2c746bf623
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
It never launched and I've lost hope of it launching and it's in my
way now, so I guess it's time to say goodbye.
Updates tailscale/corp#4383
Updates #17305
Change-Id: I2eb551d49f2fb062979cc307f284df4b3dfa5956
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This permits other programs (in other repos) to conditionally
import ipn/store/awsstore and/or ipn/store/kubestore and have them
register themselves, rather than feature/condregister doing it.
Updates tailscale/corp#32922
Change-Id: I2936229ce37fd2acf9be5bf5254d4a262d090ec1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The `put` callback runs on a different goroutine to the test, so calling
t.Fatalf in put had no effect. `drain` is always called when checking what
was put and is called from the test goroutine, so that's a good place to
fail the test if the channel was too full.
Updates #17363
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
https://github.com/tailscale/tailscale/pull/17346 moved the kube and aws
arn store initializations to feature/condregister, under the assumption
that anything using it would use kubestore.New. Unfortunately,
cmd/k8s-proxy makes use of store.New, which compares the `<prefix>:`
supplied in the provided `path string` argument against known stores. If
it doesn't find it, it fallsback to using a FileStore.
Since cmd/k8s-proxy uses store.New to try and initialize a kube store in
some cases (without importing feature/condregister), it silently creates
a FileStore and that leads to misleading errors further along in
execution.
This fixes this issue by importing condregister, and successfully
initializes a kube store.
Updates #12614
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Saves 442 KB. Lock it with a new min test.
Updates #12614
Change-Id: Ia7bf6f797b6cbf08ea65419ade2f359d390f8e91
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We will need this for unmarshaling node prefs: use the zero
HardwareAttestationKey implementation when parsing and later check
`IsZero` to see if anything was loaded.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
I'm trying to remove the "regexp" and "regexp/syntax" packages from
our minimal builds. But tsweb pulls in regexp (via net/http/pprof etc)
and util/eventbus was importing the tsweb for no reason.
Updates #12614
Change-Id: Ifa8c371ece348f1dbf80d6b251381f3ed39d5fbd
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Even with ts_omit_drive, the drive package is currently still imported
for some types. So it should be light. But it was depending on the
"regexp" packge, which I'd like to remove from our minimal builds.
Updates #12614
Change-Id: I5bf85d8eb15a739793723b1da11c370d3fcd2f32
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The Tailscale CLI is the primary configuration interface and as such it
is used in scripts, container setups, and many other places that do not
have a terminal available and should not be made to respond to prompts.
The default is set to false where the "risky" API is being used by the
CLI and true otherwise, this means that the `--yes` flags are only
required under interactive runs and scripts do not need to be concerned
with prompts or extra flags.
Updates #19445
Signed-off-by: James Tucker <james@tailscale.com>
Saves 139 KB.
Also Synology support, which I saw had its own large-ish proxy parsing
support on Linux, but support for proxies without Synology proxy
support is reasonable, so I pulled that out as its own thing.
Updates #12614
Change-Id: I22de285a3def7be77fdcf23e2bec7c83c9655593
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In Dec 2021 in d3d503d997 I had grand plans to make exit node DNS
cheaper by using HTTP/2 over PeerAPI, at least on some platforms. I
only did server-side support though and never made it to the client.
In the ~4 years since, some things have happened:
* Go 1.24 got support for http.Protocols (https://pkg.go.dev/net/http#Protocols)
and doing UnencryptedHTTP2 ("HTTP2 with prior knowledge")
* The old h2c upgrade mechanism was deprecated; see https://github.com/golang/go/issues/63565
and https://github.com/golang/go/issues/67816
* Go plans to deprecate x/net/http2 and move everything to the standard library.
So this drops our use of the x/net/http2/h2c package and instead
enables h2c (on all platforms now) using the standard library.
This does mean we lose the deprecated h2c Upgrade support, but that's
fine.
If/when we do the h2c client support for ExitDNS, we'll have to probe
the peer to see whether it supports it. Or have it reply with a header
saying that future requests can us h2c. (It's tempting to use capver,
but maybe people will disable that support anyway, so we should
discover it at runtime instead.)
Also do the same in the sessionrecording package.
Updates #17305
Change-Id: If323f5ef32486effb18ed836888aa05c0efb701e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Saves 328 KB (2.5%) off the minimal binary.
For IoT devices that don't need MagicDNS (e.g. they don't make
outbound connections), this provides a knob to disable all the DNS
functionality.
Rather than a massive refactor today, this uses constant false values
as a deadcode sledgehammer, guided by shotizam to find the largest DNS
functions which survived deadcode.
A future refactor could make it so that the net/dns/resolver and
publicdns packages don't even show up in the import graph (along with
their imports) but really it's already pretty good looking with just
these consts, so it's not at the top of my list to refactor it more
soon.
Also do the same in a few places with the ACME (cert) functionality,
as I saw those while searching for DNS stuff.
Updates #12614
Change-Id: I8e459f595c2fde68ca16503ff61c8ab339871f97
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
DNS configuration support to ProxyClass, allowing users to customize DNS resolution for Tailscale proxy pods.
Fixes#16886
Signed-off-by: Raj Singh <raj@tailscale.com>
When I added dependency support to featuretag, I broke the handling of
the non-omit build tags (as used by the "box" support for bundling the
CLI into tailscaled). That then affected depaware. The
depaware-minbox.txt this whole time recently has not included the CLI.
So fix that, and also add a new depaware variant that's only the
daemon, without the CLI.
Updates #12614
Updates #17139
Change-Id: I4a4591942aa8c66ad8e3242052e3d9baa42902ca
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Otherwise they're uselessly imported by tsnet applications, even
though they do nothing. tsnet applications wanting to use these
already had to explicitly import them and use kubestore.New or
awsstore.New and assign those to their tsnet.Server.Store fields.
Updates #12614
Change-Id: I358e3923686ddf43a85e6923c3828ba2198991d4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Listen address reuse is allowed as soon as the previous listener is
closed. There is no attempt made to emulate more complex address reuse
logic.
Updates tailscale/corp#28078
Change-Id: I56be1c4848e7b3f9fc97fd4ef13a2de9dcfab0f2
Signed-off-by: Brian Palmer <brianp@tailscale.com>
So wgengine/router is just the docs + entrypoint + types, and then
underscore importing wgengine/router/osrouter registers the constructors
with the wgengine/router package.
Then tsnet can not pull those in.
Updates #17313
Change-Id: If313226f6987d709ea9193c8f16a909326ceefe7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Allow the user to access information about routes an app connector has
learned, such as how many routes for each domain.
Fixestailscale/corp#32624
Signed-off-by: Fran Bull <fran@tailscale.com>
Removes 434 KB from the minimal Linux binary, or ~3%.
Primarily this comes from not linking in the zstd encoding code.
Fixes#17323
Change-Id: I0a90de307dfa1ad7422db7aa8b1b46c782bfaaf7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit modifies the `DNSConfig` custom resource to allow specifying
a replica count when deploying a nameserver. This allows deploying
nameservers in a HA configuration.
Updates https://github.com/tailscale/corp/issues/32589
Signed-off-by: David Bond <davidsbond93@gmail.com>
As of the earlier 85febda86d, our new preferred zstd API of choice
is zstdframe.
Updates #cleanup
Updates tailscale/corp#18514
Change-Id: I5a6164d3162bf2513c3673b6d1e34cfae84cb104
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It has nothing to do with logtail and is confusing named like that.
Updates #cleanup
Updates #17323
Change-Id: Idd34587ba186a2416725f72ffc4c5778b0b9db4a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Now cmd/derper doesn't depend on iptables, nftables, and netlink code :)
But this is really just a cleanup step I noticed on the way to making
tsnet applications able to not link all the OS router code which they
don't use.
Updates #17313
Change-Id: Ic7b4e04e3a9639fd198e9dbeb0f7bae22a4a47a9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This PR cleans up a bunch of things in ./tstest/integration/vms:
- Bumps version of Ubuntu that's actually run from CI 20.04 -> 24.04
- Removes Ubuntu 18.04 test
- Bumps NixOS 21.05 -> 25.05
Updates#cleanup
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
The dnstype package is used by tailcfg, which tries to be light and
leafy. But it brings in dnstype. So dnstype shouldn't bring in
x/net/dns/dnsmessage.
Updates #12614
Change-Id: I043637a7ce7fed097e648001f13ca1927a781def
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I noticed this while modularizing clientupdate. With this in first,
moving clientupdate to be modular removes a bunch more stuff from
the minimal build + tsnet.
Updates #17115
Change-Id: I44bd055fca65808633fd3a848b0bbc09b00ad4fa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
As part of making Tailscale's gvisor dependency optional for small builds,
this was one of the last places left that depended on gvisor. Just copy
the couple functions were were using.
Updates #17283
Change-Id: Id2bc07ba12039afe4c8a3f0b68f4d76d1863bbfe
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Baby steps. This permits building without much of gvisor, but not all of it.
Updates #17283
Change-Id: I8433146e259918cc901fe86b4ea29be22075b32c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This only saves ~32KB in the minimal linux/amd64 binary, but it's a
step towards permitting not depending on gvisor for small builds.
Updates #17283
Change-Id: Iae8da5e9465127de354dbcaf25e794a6832d891b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We can only register one key implementation per process. When running on
macOS or Android, trying to register a separate key implementation from
feature/tpm causes a panic.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
On platforms that are causing EPIPE at a high frequency this is
resulting in non-working connections, for example when Apple decides to
forcefully close UDP sockets due to an unsoliced packet rejection in the
firewall.
Too frequent rebinds cause a failure to solicit the endpoints triggering
the rebinds, that would normally happen via CallMeMaybe.
Updates #14551
Updates tailscale/corp#25648
Signed-off-by: James Tucker <james@tailscale.com>
This commit fixes a race condition where `tailscale up --force-reauth` would
exit prematurely on an already-logged in device.
Previously, the CLI would wait for IPN to report the "Running" state and then
exit. However, this could happen before the new auth URL was printed, leading
to two distinct issues:
* **Without seamless key renewal:** The CLI could exit immediately after
the `StartLoginInteractive` call, before IPN has time to switch into
the "Starting" state or send a new auth URL back to the CLI.
* **With seamless key renewal:** IPN stays in the "Running" state
throughout the process, so the CLI exits immediately without performing
any reauthentication.
The fix is to change the CLI's exit condition.
Instead of waiting for the "Running" state, if we're doing a `--force-reauth`
we now wait to see the node key change, which is a more reliable indicator
that a successful authentication has occurred.
Updates tailscale/corp#31476
Updates tailscale/tailscale#17108
Signed-off-by: Alex Chan <alexc@tailscale.com>
This partially reverts f3d2fd2.
When that patch was written, the goroutine that responds to IPN notifications
could call `StartLoginInteractive`, creating a race condition that led to
flaky integration tests. We no longer call `StartLoginInteractive` in that
goroutine, so the race is now impossible.
Moving the `WatchIPNBus` call earlier ensures the CLI gets all necessary
IPN notifications, preventing a reauth from hanging.
Updates tailscale/corp#31476
Signed-off-by: Alex Chan <alexc@tailscale.com>
A customer wants to allow their employees to restart tailscaled at will, when access rights and MDM policy allow it,
as a way to fully reset client state and re-create the tunnel in case of connectivity issues.
On Windows, the main tailscaled process runs as a child of a service process. The service restarts the child
when it exits (or crashes) until the service itself is stopped. Regular (non-admin) users can't stop the service,
and allowing them to do so isn't ideal, especially in managed or multi-user environments.
In this PR, we add a LocalAPI endpoint that instructs ipnserver.Server, and by extension the tailscaled process,
to shut down. The service then restarts the child tailscaled. Shutting down tailscaled requires LocalAPI write access
and an enabled policy setting.
Updates tailscale/corp#32674
Updates tailscale/corp#32675
Signed-off-by: Nick Khyl <nickk@tailscale.com>
And yay: tsnet (and thus k8s-operator etc) no longer depends on
portlist! And LocalBackend is smaller.
Removes 50 KB from the minimal binary.
Updates #12614
Change-Id: Iee04057053dc39305303e8bd1d9599db8368d926
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We made changes to ipnext callback registration/unregistration/invocation in #15780
that made resetting b.exthost to a nil, no-op host in (*LocalBackend).Shutdown() unnecessary.
But resetting it is also racy: b.exthost must be safe for concurrent use with or without b.mu held,
so it shouldn't be written after NewLocalBackend returns. This PR removes it.
Fixes#17279
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This change adds full IPv6 support to the Kubernetes operator's DNS functionality,
enabling dual-stack and IPv6-only cluster support.
Fixes#16633
Signed-off-by: Raj Singh <raj@tailscale.com>
Expand the integration tests to cover a wider range of scenarios, including:
* Before and after a successful initial login
* Auth URLs and auth keys
* With and without the `--force-reauth` flag
* With and without seamless key renewal
These tests expose a race condition when using `--force-reauth` on an
already-logged in device. The command completes too quickly, preventing
the auth URL from being displayed. This issue is identified and will be
fixed in a separate commit.
Updates #17108
Signed-off-by: Alex Chan <alexc@tailscale.com>
Ideally we would remove this warning entirely, as it is now possible to
reauthenticate without losing connectivty. However, it is still possible to
lose SSH connectivity if the user changes the ownership of the machine when
they do a force-reauth, and we have no way of knowing if they are going to
do that before they do it.
For now, let's just reduce the strength of the warning to warn them that
they "may" lose their connection, rather than they "will".
Updates tailscale/corp#32429
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
I think this was originally a brain-o in 9380e2dfc6. It's
disabling the port _poller_, listing what open ports (i.e. services)
are open, not PMP/PCP/UPnP port mapping.
While there, drop in some more testenv.AssertInTest() in a few places.
Updates #cleanup
Change-Id: Ia6f755ad3544f855883b8a7bdcfc066e8649547b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
PR #17258 extracted `derp.Server` into `derp/derpserver.Server`.
This followup patch adds the following cleanups:
1. Rename `derp_server*.go` files to `derpserver*.go` to match
the package name.
2. Rename the `derpserver.NewServer` constructor to `derpserver.New`
to reduce stuttering.
3. Remove the unnecessary `derpserver.Conn` type alias.
Updates #17257
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Sidestep cmd/viewer incompatibility hiccups with
HardwareAttestationPublic type due to its *ecdsa.PublicKey inner member
by serializing the key to a byte slice instead.
Updates tailscale/corp#31269
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
This exports a number of things from the derp (generic + client) package
to be used by the new derpserver package, as now used by cmd/derper.
And then enough other misc changes to lock in that cmd/tailscaled can
be configured to not bring in tailscale.com/client/local. (The webclient
in particular, even when disabled, was bringing it in, so that's now fixed)
Fixes#17257
Change-Id: I88b6c7958643fb54f386dd900bddf73d2d4d96d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Some systems need to tell whether the monitored goroutine has finished
alongside other channel operations (notably in this case the relay server, but
there seem likely to be others similarly situated).
Updates #15160
Change-Id: I5f0f3fae827b07f9b7102a3b08f60cda9737fe28
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Help out the linker's dead code elimination.
Updates #12614
Change-Id: I6c13cb44d3250bf1e3a01ad393c637da4613affb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This doesn't yet fully pull it out into a feature/captiveportal package.
This is the usual first step, moving the code to its own files within
the same packages.
Updates #17254
Change-Id: Idfaec839debf7c96f51ca6520ce36ccf2f8eec92
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
updates tailscale/corp#32600
A localAPI/cli call to reload-config can end up leaving magicsock's mutex
locked. We were missing an unlock for the early exit where there's no change in
the static endpoints when the disk-based config is loaded. This is not likely
the root cause of the linked issue - just noted during investigation.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
In MacOS GUI apps, users have to select folders to share via the GUI. This is both because
the GUI app keeps its own record of shares, and because the sandboxed version of the GUI
app needs to gain access to the shared folders by having the user pick them in a file
selector.
The new build tag `ts_mac_gui` allows the MacOS GUI app build to signal that this
is a MacOS GUI app, which causes the `drive` subcommand to be omitted so that people
do not mistakenly attempt to use it.
Updates tailscale/tailscale#17210
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Also update to use the new DisplayNameOrDefault.
Updates tailscale/corp#30456
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
This commit does not change the order or meaning of any eventbus activity, it
only updates the way the plumbing is set up.
Updates #15160
Change-Id: I61b863f9c05459d530a4c34063a8bad9046c0e27
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Add a last seen time on the cli's status command, similar to the web
portal.
Before:
```
100.xxx.xxx.xxx tailscale-operator tagged-devices linux offline
```
After:
```
100.xxx.xxx.xxx tailscale-operator tagged-devices linux offline, last seen 20d ago
```
Fixes#16584
Signed-off-by: Mahyar Mirrashed <mah.mirr@gmail.com>
This commit does not change the order or meaning of any eventbus activity, it
only updates the way the plumbing is set up.
Updates #15160
Change-Id: I06860ac4e43952a9bb4d85366138c9d9a17fd9cd
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
It is a programming error to Publish or Subscribe on a closed Client, but now
the way you discover that is by getting a panic from down in the machinery of
the bus after the client state has been cleaned up.
To provide a more helpful error, let's panic explicitly when that happens and
say what went wrong ("the client is closed"), by preventing subscriptions from
interleaving with closure of the client. With this change, either an attachment
fails outright (because the client is already closed) or completes and then
shuts down in good order in the normal course.
This does not change the semantics of the client, publishers, or subscribers,
it's just making the failure more eager so we can attach explanatory text.
Updates #15160
Change-Id: Ia492f4c1dea7535aec2cdcc2e5ea5410ed5218d2
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Only changes how the go routine consuming the events starts and stops,
not what it does.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
This commit modifies the k8s operator to wrap its logger using the logtail
logger provided via the tsnet server. This causes any logs written by
the operator to make their way to Tailscale in the same fashion as
wireguard logs to be used by support.
This functionality can also be opted-out of entirely using the
"TS_NO_LOGS_NO_SUPPORT" environment variable.
Updates https://github.com/tailscale/corp/issues/32037
Signed-off-by: David Bond <davidsbond93@gmail.com>
We never implemented the peercred package on OpenBSD (and I just tried
again and failed), but we've always documented that the creds pointer
can be nil for operating systems where we can't map the unix socket
back to its UID. On those platforms, we set the default unix socket
permissions such that only the admin can open it anyway and we don't
have a read-only vs read-write distinction. OpenBSD was always in that
camp, where any access to Tailscale's unix socket meant full access.
But during some refactoring, we broke OpenBSD in that we started
assuming during one logging path (during login) that Creds was non-nil
when looking up an ipnauth.Actor's username, which wasn't relevant (it
was called from a function "maybeUsernameOf" anyway, which threw away
errors).
Verified on an OpenBSD VM. We don't have any OpenBSD integration tests yet.
Fixes#17209
Updates #17221
Change-Id: I473c5903dfaa645694bcc75e7f5d484f3dd6044d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
controlhttp has the responsibility of dialing a set of candidate control
endpoints in a way that minimizes user facing latency. If one control
endpoint is unavailable we promptly dial another, racing across the
dimensions of: IPv6, IPv4, port 80, and port 443, over multiple server
endpoints.
In the case that the top priority endpoint was not available, the prior
implementation would hang waiting for other results, so as to try to
return the highest priority successful connection to the rest of the
client code. This hang would take too long with a large dialplan and
sufficient client to endpoint latency as to cause the server to timeout
the connection due to inactivity in the intermediate state.
Instead of trying to prioritize non-ideal candidate connections, the
first successful connection is now used unconditionally, improving user
facing latency and avoiding any delays that would encroach on the
server-side timeout.
The tests are converted to memnet and synctest, running on all
platforms.
Fixes#8442Fixestailscale/corp#32534
Co-authored-by: James Tucker <james@tailscale.com>
Change-Id: I4eb57f046d8b40403220e40eb67a31c41adb3a38
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
The controlhttp dialer with a ControlDialPlan IPv6 entry was hitting a
case where the dnscache Resolver was returning an netip.Addr zero
value, where it should've been returning the IPv6 address.
We then tried to dial "invalid IP:80", which would immediately fail,
at least locally.
Mostly this was causing spammy logs when debugging other stuff.
Updates tailscale/corp#32534
Change-Id: If8b9a20f10c1a6aa8a662c324151d987fe9bd2f8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
tsnet apps in particular never use the Linux DNS OSManagers, so they don't need
DBus, etc. I started to pull that all out into separate features so tsnet doesn't
need to bring in DBus, but hit this first.
Here you can see that tsnet (and the k8s-operator) no longer pulls in inotify.
Updates #17206
Change-Id: I7af0f391f60c5e7dbeed7a080346f83262346591
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit does not change the order or meaning of any eventbus activity, it
only updates the way the plumbing is set up.
Updates #15160
Change-Id: I0a175e67e867459daaedba0731bf68bd331e5ebc
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This commit does not change the order or meaning of any eventbus activity, it
only updates the way the plumbing is set up.
Updates #15160
Change-Id: I40c23b183c2a6a6ea3feec7767c8e5417019fc07
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
A common pattern in event bus usage is to run a goroutine to service a
collection of subscribers on a single bus client. To have an orderly shutdown,
however, we need a way to wait for such a goroutine to be finished.
This commit adds a Monitor type that makes this pattern easier to wire up:
rather than having to track all the subscribers and an extra channel, the
component need only track the client and the monitor. For example:
cli := bus.Client("example")
m := cli.Monitor(func(c *eventbus.Client) {
s1 := eventbus.Subscribe[T](cli)
s2 := eventbus.Subscribe[U](cli)
for {
select {
case <-c.Done():
return
case t := <-s1.Events():
processT(t)
case u := <-s2.Events():
processU(u)
}
}
})
To shut down the client and wait for the goroutine, the caller can write:
m.Close()
which closes cli and waits for the goroutine to finish. Or, separately:
cli.Close()
// do other stuff
m.Wait()
While the goroutine management is not explicitly tied to subscriptions, it is a
common enough pattern that this seems like a useful simplification in use.
Updates #15160
Change-Id: I657afda1cfaf03465a9dce1336e9fd518a968bca
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Pulls out the last callback logic and ensures timers are still running.
The eventbustest package is updated support the absence of events.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
And another case of the same typo in a comment elsewhere.
Updates #cleanup
Change-Id: Iaa9d865a1cf83318d4a30263c691451b5d708c9c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* tsnet,internal/client/tailscale: resolve OAuth into authkeys in tsnet
Updates #8403.
* internal/client/tailscale: omit OAuth library via build tag
Updates #12614.
Signed-off-by: Naman Sood <mail@nsood.in>
Expand TestRedactNetmapPrivateKeys to cover all sub-structs of
NetworkMap and confirm that a) all fields are annotated as private or
public, and b) all private fields are getting redacted.
Updates tailscale/corp#32095
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
For debugging purposes, add a new C2N endpoint returning the current
netmap. Optionally, coordination server can send a new "candidate" map
response, which the client will generate a separate netmap for.
Coordination server can later compare two netmaps, detecting unexpected
changes to the client state.
Updates tailscale/corp#32095
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Instead of a single hard-coded C2N handler, add support for calling
arbitrary C2N endpoints via a node roundtripper.
Updates tailscale/corp#32095
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
When tests run in parallel, events from multiple tests on the same bus can
intercede with each other. This is working as intended, but for the test cases
we want to control exactly what goes through the bus.
To fix that, allocate a fresh bus for each subtest.
Fixes#17197
Change-Id: I53f285ebed8da82e72a2ed136a61884667ef9a5e
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
When developing (and debugging) tests, it is useful to be able to see all the
traffic that transits the event bus during the execution of a test.
Updates #15160
Change-Id: I929aee62ccf13bdd4bd07d786924ce9a74acd17a
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Previously, seamless key renewal was an opt-in feature. Customers had
to set a `seamless-key-renewal` node attribute in their policy file.
This patch enables seamless key renewal by default for all clients.
It includes a `disable-seamless-key-renewal` node attribute we can set
in Control, so we can manage the rollout and disable the feature for
clients with known bugs. This new attribute makes the feature opt-out.
Updates tailscale/corp#31479
Signed-off-by: Alex Chan <alexc@tailscale.com>
This makes the `switch` command use the helper `matchProfile` function
that was introduced in the `remove` sub command.
Signed-off-by: Esteban-Bermudez <esteban@bermudezaguirre.com>
Fixes#12255
Add a new subcommand to `switch` for removing a profile from the local
client. This does not delete the profile from the Tailscale account, but
removes it from the local machine. This functionality is available on
the GUI's, but not yet on the CLI.
Signed-off-by: Esteban-Bermudez <esteban@bermudezaguirre.com>
It doesn't really pull its weight: it adds 577 KB to the binary and
is rarely useful.
Also, we now have static IPs and other connectivity paths coming
soon enough.
Updates #5853
Updates #1278
Updates tailscale/corp#32168
Change-Id: If336fed00a9c9ae9745419e6d81f7de6da6f7275
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For a common case of events being simple struct types with some exported
fields, add a helper to check (reflectively) for equal values using cmp.Diff so
that a failed comparison gives a useful diff in the test output.
More complex uses will still want to provide their own comparisons; this
(intentionally) does not export diff options or other hooks from the cmp
package.
Updates #15160
Change-Id: I86bee1771cad7debd9e3491aa6713afe6fd577a6
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This makes things work slightly better over the eventbus.
Also switches ipnlocal to use the event over the eventbus instead of the
direct callback.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Extend the Expect method of a Watcher to allow filter functions that report
only an error value, and which "pass" when the reported error is nil.
Updates #15160
Change-Id: I582d804554bd1066a9e499c1f3992d068c9e8148
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This fixes a flaky test which has been occasionally timing out in CI.
In particular, this test times out if `watchFile` receives multiple
notifications from inotify before we cancel the test context. We block
processing the second notification, because we've stopped listening to
the `callbackDone` channel.
This patch changes the test so we only send on the first notification.
Testing this locally with `stress` confirms that the test is no longer
flaky.
Fixes#17172
Updates #14699
Signed-off-by: Alex Chan <alexc@tailscale.com>
I'd started to do this in the earlier ts_omit_server PR but
decided to split it into this separate PR.
Updates #17128
Change-Id: Ief8823a78d1f7bbb79e64a5cab30a7d0a5d6ff4b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Instead of waiting for a designated subscription to close as a canary for the
bus being stopped, use the bus Client's own signal for closure added in #17118.
Updates #cleanup
Change-Id: I384ea39f3f1f6a030a6282356f7b5bdcdf8d7102
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
The Tracker was using direct callbacks to ipnlocal. This PR moves those
to be triggered via the eventbus.
Additionally, the eventbus is now closed on exit from tailscaled
explicitly, and health is now a SubSystem in tsd.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Subscribers already have a Done channel that the caller can use to detect when
the subscriber has been closed. Typically this happens when the governing
Client closes, which in turn is typically because the Bus closed.
But clients and subscribers can stop at other times too, and a caller has no
good way to tell the difference between "this subscriber closed but the rest
are OK" and "the client closed and all these subscribers are finished".
We've worked around this in practice by knowing the closure of one subscriber
implies the fate of the rest, but we can do better: Add a Done method to the
Client that allows us to tell when that has been closed explicitly, after all
the publishers and subscribers associated with that client have been closed.
This allows the caller to be sure that, by the time that occurs, no further
pending events are forthcoming on that client.
Updates #15160
Change-Id: Id601a79ba043365ecdb47dd035f1fdadd984f303
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Remove the need for the caller to hold on to and call an unregister
function. Both two callers (one real, one test) already have a context
they can use. Use context.AfterFunc instead. There are no observable
side effects from scheduling too late if the goroutine doesn't run sync.
Updates #17148
Change-Id: Ie697dae0e797494fa8ef27fbafa193bfe5ceb307
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This test ostensibly checks whether we record an error metric if a packet
is dropped because the network is down, but the network connectivity is
irrelevant -- the send error is actually because the arguments to Send()
are invalid:
RebindingUDPConn.WriteWireGuardBatchTo:
[unexpected] offset (0) != Geneve header length (8)
This patch changes the test so we try to send a valid packet, and we
verify this by sending it once before taking the network down. The new
error is:
magicsock: network down
which is what we're trying to test.
We then test sending an invalid payload as a separate test case.
Updates tailscale/corp#22075
Signed-off-by: Alex Chan <alexc@tailscale.com>
endpointState is used for tracking UDP direct connection candidate
addresses. If it contains a DERP addr, then direct connection path
discovery will always send a wasteful disco ping over it. Additionally,
CLI "tailscale ping" via peer relay will race over DERP, leading to a
misleading result if pong arrives via DERP first.
Disco pongs arriving via DERP never influence path selection. Disco
ping/pong via DERP only serves "tailscale ping" reporting.
Updates #17121
Signed-off-by: Jordan Whited <jordan@tailscale.com>
When you say --features=foo,bar, that was supposed to mean
to only show features "foo" and "bar" in the table.
But it was also being used as the set of all features that are
omittable, which was wrong, leading to misleading numbers
when --features was non-empty.
Updates #12614
Change-Id: Idad2fa67fb49c39454032e84a3dede967890fdf5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The Unix implementation of doExec propagates error codes by virtue of
the fact that it does an execve; the replacement binary will return the
exit code.
On non-Unix, we need to simulate these semantics by checking for an
ExitError and, when present, passing that value on to os.Exit.
We also add error handling to the doExec call for the benefit of
handling any errors where doExec fails before being able to execute
the desired binary.
Updates https://github.com/tailscale/corp/issues/29940
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
This renames the package+symbols in the earlier 17ffa80138 to be
in their own package ("buildfeatures") and start with the word "Has"
like "if buildfeatures.HasFoo {".
Updates #12614
Change-Id: I510e5f65993e5b76a0e163e3aa4543755213cbf6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Extend the client state management to generate a hardware attestation
key if none exists.
Extend MapRequest with HardwareAttestationKey{,Signature} fields that
optionally contain the public component of the hardware attestation key
and a signature of the node's node key using it. This will be used by
control to associate hardware attesation keys with node identities on a
TOFU basis.
Updates tailscale/corp#31269
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
So code (in upcoming PRs) can test for the build tags with consts and
get dead code elimination from the compiler+linker.
Updates #12614
Change-Id: If6160453ffd01b798f09894141e7631a93385941
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is a small introduction of the eventbus into controlclient that
communicates with mainly ipnlocal. While ipnlocal is a complicated part
of the codebase, the subscribers here are from the perspective of
ipnlocal already called async.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
This commit fixes an issue within the service reconciler where we end
up in a constant reconciliation loop. When reconciling, the loadbalancer
status is appended to but not reset between each reconciliation, leading
to an ever growing slice of duplicate statuses.
Fixes https://github.com/tailscale/tailscale/issues/17105
Fixes https://github.com/tailscale/tailscale/issues/17107
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit adds a new method to the tsnet.Server type named `Logger`
that returns the underlying logtail instance's Logf method.
This is intended to be used within the Kubernetes operator to wrap its
existing logger in a way such that operator specific logs can also be
sent to control for support & debugging purposes.
Updates https://github.com/tailscale/corp/issues/32037
Signed-off-by: David Bond <davidsbond93@gmail.com>
As of this commit (per the issue), the Taildrive code remains where it
was, but in new files that are protected by the new ts_omit_drive
build tag. Future commits will move it.
Updates #17058
Change-Id: Idf0a51db59e41ae8da6ea2b11d238aefc48b219e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Its doc said its signature matched a std signature, but it used
Tailscale-specific types.
Nowadays it's the caller (func control) that curries the logf/netmon
and returns the std-matching signature.
Updates #cleanup (while answering a question on Slack)
Change-Id: Ic99de41fc6a1c720575a7f33c564d0bcfd9a2c30
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To support integration testing of client features that rely on it, e.g.
peer relay.
Updates tailscale/corp#30903
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Removes ACL edits from e2e tests in favour of trying to simplify the
tests and separate the actual test logic from the environment setup
logic as much as possible. Also aims to fit in with the requirements
that will generally be filled anyway for most devs working on the
operator; in particular using tags that fit in with our documentation.
Updates tailscale/corp#32085
Change-Id: I7659246e39ec0b7bcc4ec0a00c6310f25fe6fac2
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This adds a file that's not compiled by default that exists just to
make it easier to do binary size checks, probing what a binary would
be like if it included reflect methods (as used by html/template, etc).
As an example, once tailscaled uses reflect.Type.MethodByName(non-const-string) anywhere,
the build jumps up by 14.5 MB:
$ GOOS=linux GOARCH=amd64 ./tool/go build -tags=ts_include_cli,ts_omit_webclient,ts_omit_systray,ts_omit_debugeventbus -o before ./cmd/tailscaled
$ GOOS=linux GOARCH=amd64 ./tool/go build -tags=ts_include_cli,ts_omit_webclient,ts_omit_systray,ts_omit_debugeventbus,ts_debug_forcereflect -o after ./cmd/tailscaled
$ ls -l before after
-rwxr-xr-x@ 1 bradfitz staff 41011861 Sep 9 07:28 before
-rwxr-xr-x@ 1 bradfitz staff 55610948 Sep 9 07:29 after
This is particularly pronounced with large deps like the AWS SDK. If you compare using ts_omit_aws:
-rwxr-xr-x@ 1 bradfitz staff 38284771 Sep 9 07:40 no-aws-no-reflect
-rwxr-xr-x@ 1 bradfitz staff 45546491 Sep 9 07:41 no-aws-with-reflect
That means adding AWS to a non-reflect binary adds 2.7 MB but adding
AWS to a reflect binary adds 10 MB.
Updates #17063
Updates #12614
Change-Id: I18e9b77c9cf33565ce5bba65ac5584fa9433f7fb
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* cmd/tailscale/cli: use client/local instead of deprecated client/tailscale
Updates tailscale/corp#22748
Signed-off-by: Alex Chan <alexc@tailscale.com>
* derp: use client/local instead of deprecated client/tailscale
Updates tailscale/corp#22748
Signed-off-by: Alex Chan <alexc@tailscale.com>
---------
Signed-off-by: Alex Chan <alexc@tailscale.com>
I probably could've deflaked this without synctest, but might as well use
it now that Go 1.25 has it.
Fixes#15348
Change-Id: I81c9253fcb7eada079f3e943ab5f1e29ba8e8e31
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* utils/expvarx: mark TestSafeFuncHappyPath as known flaky
Updates #15348
Signed-off-by: Alex Chan <alexc@tailscale.com>
* tstest/integration: mark TestCollectPanic as known flaky
Updates #15865
Signed-off-by: Alex Chan <alexc@tailscale.com>
---------
Signed-off-by: Alex Chan <alexc@tailscale.com>
It was a bit confusing that provided history did not include the
current probe results.
Updates tailscale/corp#20583
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
We should never use the real syspolicy implementation in tests by
default. (the machine's configuration shouldn't affect tests)
You either specify a test policy, or you get a no-op one.
Updates #16998
Change-Id: I3350d392aad11573a5ad7caab919bb3bbaecb225
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This commit modifies containerboot's state reset process to handle the
state secret not existing. During other parts of the boot process we
gracefully handle the state secret not being created yet, but missed
that check within `resetContainerbootState`
Fixes https://github.com/tailscale/tailscale/issues/16804
Signed-off-by: David Bond <davidsbond93@gmail.com>
Fix "file not found" errors when WebDAV clients access files/dirs inside
directories with spaces.
The issue occurred because StatCache was mixing URL-escaped and
unescaped paths, causing cache key mismatches.
Specifically, StatCache.set() parsed WebDAV responses containing
URL-escaped paths (ex. "Dir%20Space/file1.txt") and stored them
alongside unescaped cache keys (ex. "Dir Space/file1.txt").
This mismatch prevented StatCache.get() from correctly determining whether
a child file existed.
See https://github.com/tailscale/tailscale/issues/13632#issuecomment-3243522449
for the full explanation of the issue.
The decision to keep all paths references unescaped inside the StatCache
is consistent with net/http.Request.URL.Path and rewrite.go (sole consumer)
Update unit test to detect this directory space mishandling.
Fixes tailscale#13632
Signed-off-by: Craig Hesling <craig@hesling.com>
There's a TODO to delete all of handler.go, but part of it's
still used in another repo.
But this deletes some.
Updates #17022
Change-Id: Ic5a8a5a694ca258440307436731cd92b45ee2d21
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Before:
$ tailscale ip -4
1.2.3.4
$ tailscale set --exit-node=1.2.3.4
no node found in netmap with IP 1.2.3.4
After:
$ tailscale set --exit-node=1.2.3.4
cannot use 1.2.3.4 as an exit node as it is a local IP address to this machine; did you mean --advertise-exit-node?
The new error message already existed in the code, but would only be
triggered if the backend wasn't running -- which means, in practice,
it would almost never be triggered.
The old error message is technically true, but could be confusing if you
don't know the distinction between "netmap" and "tailnet" -- it could
sound like the exit node isn't part of your tailnet. A node is never in
its own netmap, but it is part of your tailnet.
This error confused me when I was doing some local dev work, and it's
confused customers before (e.g. #7513). Using the more specific error
message should reduce confusion.
Updates #7513
Updates https://github.com/tailscale/corp/issues/23596
Signed-off-by: Alex Chan <alexc@tailscale.com>
Now that we have policytest and the policyclient.Client interface, we
can de-global-ify many of the tests, letting them run concurrently
with each other, and just removing global variable complexity.
This does ~half of the LocalBackend ones.
Updates #16998
Change-Id: Iece754e1ef4e49744ccd967fa83629d0dca6f66a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is step 4 of making syspolicy a build-time feature.
This adds a policyclient.Get() accessor to return the correct
implementation to use: either the real one, or the no-op one. (A third
type, a static one for testing, also exists, so in general a
policyclient.Client should be plumbed around and not always fetched
via policyclient.Get whenever possible, especially if tests need to use
alternate syspolicy)
Updates #16998
Updates #12614
Change-Id: Iaf19670744a596d5918acfa744f5db4564272978
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Step 4 of N. See earlier commits in the series (via the issue) for the
plan.
This adds the missing methods to policyclient.Client and then uses it
everywhere in ipn/ipnlocal and locks it in with a new dep test.
Still plenty of users of the global syspolicy elsewhere in the tree,
but this is a lot of them.
Updates #16998
Updates #12614
Change-Id: I25b136539ae1eedbcba80124de842970db0ca314
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Step 3 in the series. See earlier cc532efc20 and d05e6dc09e.
This step moves some types into a new leaf "ptype" package out of the
big "settings" package. The policyclient.Client will later get new
methods to return those things (as well as Duration and Uint64, which
weren't done at the time of the earlier prototype).
Updates #16998
Updates #12614
Change-Id: I4d72d8079de3b5351ed602eaa72863372bd474a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously, when attempting a risky action, the CLI printed a 5 second countdown saying
"Continuing in 5 seconds...". When the countdown finished, the CLI aborted rather than
continuing.
To avoid confusion, but also avoid accidentally continuing if someone (or an automated
process) fails to manually abort within the countdown, we now explicitly prompt for a
y/n response on whether or not to continue.
Updates #15445
Co-authored-by: Kot C <kot@kot.pink>
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This commit adds a `replicas` field to the `Connector` custom resource that
allows users to specify the number of desired replicas deployed for their
connectors.
This allows users to deploy exit nodes, subnet routers and app connectors
in a highly available fashion.
Fixes#14020
Signed-off-by: David Bond <davidsbond93@gmail.com>
This is step 2 of ~4, breaking up #14720 into reviewable chunks, with
the aim to make syspolicy be a build-time configurable feature.
Step 1 was #16984.
In this second step, the util/syspolicy/policyclient package is added
with the policyclient.Client interface. This is the interface that's
always present (regardless of build tags), and is what code around the
tree uses to ask syspolicy/MDM questions.
There are two implementations of policyclient.Client for now:
1) NoPolicyClient, which only returns default values.
2) the unexported, temporary 'globalSyspolicy', which is implemented
in terms of the global functions we wish to later eliminate.
This then starts to plumb around the policyclient.Client to most callers.
Future changes will plumb it more. When the last of the global func
callers are gone, then we can unexport the global functions and make a
proper policyclient.Client type and constructor in the syspolicy
package, removing the globalSyspolicy impl out of tsd.
The final change will sprinkle build tags in a few more places and
lock it in with dependency tests to make sure the dependencies don't
later creep back in.
Updates #16998
Updates #12614
Change-Id: Ib2c93d15c15c1f2b981464099177cd492d50391c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is step 1 of ~3, breaking up #14720 into reviewable chunks, with
the aim to make syspolicy be a build-time configurable feature.
In this first (very noisy) step, all the syspolicy string key
constants move to a new constant-only (code-free) package. This will
make future steps more reviewable, without this movement noise.
There are no code or behavior changes here.
The future steps of this series can be seen in #14720: removing global
funcs from syspolicy resolution and using an interface that's plumbed
around instead. Then adding build tags.
Updates #12614
Change-Id: If73bf2c28b9c9b1a408fe868b0b6a25b03eeabd1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Apparently, #16989 introduced a bug in request-dataplane-review.yml:
> you may only define one of `paths` and `paths-ignore` for a single event
Related #16372
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
@tailscale/dataplane almost never needs to review depaware.txt, when
it is the only change to the DERP implementation.
Related #16372
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
If the DERP queue is full, drop the oldest item first, rather than the
youngest, on the assumption that older data is more likely to be
unanswerable.
Updates tailscale/corp#31762
Signed-off-by: James Tucker <james@tailscale.com>
Add a ternary flag that unless set explicitly to false keeps the
insecure behavior of TSIDP.
If the flag is false, add functionality on startup to migrate
oidc-funnel-clients.json to oauth-clients.json if it doesn’t exist.
If the flag is false, modify endpoints to behave similarly regardless
of funnel, tailnet, or localhost. They will all verify client ID & secret
when appropriate per RFC 6749. The authorize endpoint will no longer change
based on funnel status or nodeID.
Add extra tests verifying TSIDP endpoints behave as expected
with the new flag.
Safely create the redirect URL from what's passed into the
authorize endpoint.
Fixes #16880
Signed-off-by: Remy Guercio <remy@tailscale.com>
Doesn't look to affect us, but pacifies security scanners.
See 88ddf1d0d9
It's for decoding. We only use this package for encoding (via
github.com/google/rpmpack / github.com/goreleaser/nfpm/v2).
Updates #8043
Change-Id: I87631aa5048f9514bb83baf1424f6abb34329c46
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Our own WaitGroup wrapper type was a prototype implementation
for the Go method on the standard sync.WaitGroup type.
Now that there is first-class support for Go,
we should migrate over to using it and delete syncs.WaitGroup.
Updates #cleanup
Updates tailscale/tailscale#16330
Change-Id: Ib52b10f9847341ce29b4ca0da927dc9321691235
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
DERP writes go via TCP and the host OS will have plenty of buffer space.
We've observed in the wild with a backed up TCP socket kernel side
buffers of >2.4MB. The DERP internal queue being larger causes an
increase in the probability that the contents of the backbuffer are
"dead letters" - packets that were assumed to be lost.
A first step to improvement is to size this queue only large enough to
avoid some of the initial connect stall problem, but not large enough
that it is contributing in a substantial way to buffer bloat /
dead-letter retention.
Updates tailscale/corp#31762
Signed-off-by: James Tucker <james@tailscale.com>
I need a ringbuffer in the more traditional sense, one that has a notion
of item removal as well as tail loss on overrun. This implementation is
really a clearable log window, and is used as such where it is used.
Updates #cleanup
Updates tailscale/corp#31762
Signed-off-by: James Tucker <james@tailscale.com>
Bump Go 1.25 release to include a go/types patch and resolve govulncheck
CI exceptions.
Updates tailscale/corp#31755
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Extract field comments from AST and include them in generated view
methods. Comments are preserved from the original struct fields to
provide documentation for the view accessors.
Fixes#16958
Signed-off-by: Maisem Ali <3953239+maisem@users.noreply.github.com>
fixestailscale/corp#26369
The suggested exit node is currently only calculated during a localAPI request.
For older UIs, this wasn't a bad choice - we could just fetch it on-demand when a menu
presented itself. For newer incarnations however, this is an always-visible field
that needs to react to changes in the suggested exit node's value.
This change recalculates the suggested exit node ID on netmap updates and
broadcasts it on the IPN bus. The localAPI version of this remains intact for the
time being.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
updates tailscale/corp#29841
Adds a node cap macOS UIs can query to determine
whether then should enable the new windowed UI.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Pull the lock-bearing code into a closure, and use a clone rather than a
shallow copy of the hostinfo record.
Updates #11649
Change-Id: I4f1d42c42ce45e493b204baae0d50b1cbf82b102
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
The early unlock on this branch was required because the "send" method goes on
to acquire the mutex itself. Rather than release the lock just to acquire it
again, call the underlying locked helper directly.
Updates #11649
Change-Id: I50d81864a00150fc41460b7486a9c65655f282f5
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
In places where we are locking the LocakBackend and immediately deferring an
unlock, and where there is no shortcut path in the control flow below the
deferral, we do not need the unlockOnce helper. Replace all these with use of
the lock directly.
Updates #11649
Change-Id: I3e6a7110dfc9ec6c1d38d2585c5367a0d4e76514
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Instead of referring to groups, which is a term of art for a different entity,
update the doc comments to more accurately describe what tags are in reference
to the policy document.
Updates #cleanup
Change-Id: Iefff6f84981985f834bae7c6a6c34044f53f2ea2
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
There are several methods within the LocalBackend that used an unusual and
error-prone lock discipline whereby they require the caller to hold the backend
mutex on entry, but release it on the way out.
In #11650 we added some support code to make this pattern more visible.
Now it is time to eliminate the pattern (at least within this package).
This is intended to produce no semantic changes, though I am relying on
integration tests and careful inspection to achieve that.
To the extent possible I preserved the existing control flow. In a few places,
however, I replaced this with an unlock/lock closure. This means we will
sometimes reacquire a lock only to release it again one frame up the stack, but
these operations are not performance sensitive and the legibility gain seems
worthwhile.
We can probably also pull some of these out into separate methods, but I did
not do that here so as to avoid other variable scope changes that might be hard
to see. I would like to do some more cleanup separately.
As a follow-up, we could also remove the unlockOnce helper, but I did not do
that here either.
Updates #11649
Change-Id: I4c92d4536eca629cfcd6187528381c33f4d64e20
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
The serve code leaves it up to the system's DNS resolver and netstack to
figure out how to reach the proxy destination. Combined with k8s-proxy
running in userspace mode, this means we can't rely on MagicDNS being
available or tailnet IPs being routable. I'd like to implement that as a
feature for serve in userspace mode, but for now the safer fix to get
kube-apiserver ProxyGroups consistently working in all environments is to
switch to using localhost as the proxy target instead.
This has a small knock-on in the code that does WhoIs lookups, which now
needs to check the X-Forwarded-For header that serve populates to get
the correct tailnet IP to look up, because the request's remote address
will be loopback.
Fixes#16920
Change-Id: I869ddcaf93102da50e66071bb00114cc1acc1288
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This increases throughput over long fat networks, and in the presence
of crypto/syscall-induced delay.
Updates tailscale/corp#31164
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Update odic-funnel-clients.json to take a path, this
allows setting the location of the file and prevents
it from landing in the root directory or users home directory.
Move setting of rootPath until after tsnet has started.
Previously this was added for the lazy creation of the
oidc-key.json. It's now needed earlier in the flow.
Updates #16734Fixes#16844
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Add the ability for operators of natc in consensus mode to remove
servers from the raft cluster config, without losing other state.
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
Currently consensus has a bootstrap routine where a tsnet node tries to
join each other node with the cluster tag, and if it is not able to join
any other node it starts its own cluster.
That algorithm is racy, and can result in split brain (more than one
leader/cluster) if all the nodes for a cluster are started at the same
time.
Add a FollowOnly argument to the bootstrap function. If provided this
tsnet node will never lead, it will try (and retry with exponential back
off) to follow any node it can contact.
Add a --follow-only flag to cmd/natc that uses this new tsconsensus
functionality.
Also slightly reorganize some arguments into opts structs.
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
This significantly improves throughput of a peer relay server on Linux.
Server.packetReadLoop no longer passes sockets down the stack. Instead,
packet handling methods return a netip.AddrPort and []byte, which
packetReadLoop gathers together for eventual batched writes on the
appropriate socket(s).
Updates tailscale/corp#31164
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We have been unintentionally ignoring errors from calling bootstrap.
bootstrap sometimes calls raft.BootstrapCluster which sometimes returns
a safe to ignore error, handle that case appropriately.
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
This has come up in a few situations recently and adding these helpers
is much better than copying the slice (calling AsSlice()) in order to
use slices.Max and friends.
Updates #cleanup
Change-Id: Ib289a07d23c3687220c72c4ce341b9695cd875bf
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
Update the runall handler to be more generic with an
exclude param to exclude multiple probes as the requesters
definition.
Updates tailscale/corp#27370
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Cleanup nix support, make flake easier to read with nix-systems.
This also harmonizes with golinks flake setup and reduces an input
dependency by 1.
Update deps test to ensure the vendor hash stays harmonized
with go.mod.
Update make tidy to ensure vendor hash stays current.
Overlay the current version of golang, tailscale runs
recent releases faster than nixpkgs can update them into
the unstable branch.
Updates #16637
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
The -Environment argument to Start-Process is essentially being treated
as a delta; removing a particular variable from the argument's hash
table does not indicate to delete. Instead we must set the value of each
unwanted variable to $null.
Updates https://github.com/tailscale/corp/issues/29940
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Some of the operations of the local API need an event bus to correctly
instantiate other components (notably including the portmapper).
This commit adds that, and as the parameter list is starting to get a bit long
and hard to read, I took the opportunity to move the arguments to a config
type. Only a few call sites needed to be updated and this API is not intended
for general use, so I did not bother to stage the change.
Updates #15160
Updates #16842
Change-Id: I7b057d71161bd859f5acb96e2f878a34c85be0ef
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
gocross-wrapper.ps1 is a PowerShell core script that is essentially a
straight port of gocross-wrapper.sh. It requires PowerShell 7.4, which
is the latest LTS release of PSCore.
Why use PowerShell Core instead of Windows PowerShell? Essentially
because the former is much better to script with and is the edition
that is currently maintained.
Because we're using PowerShell Core, but many people will be running
scripts from a machine that only has Windows PowerShell, go.cmd has
been updated to prompt the user for PowerShell core installation if
necessary.
gocross-wrapper.sh has also been updated to utilize the PSCore script
when running under cygwin or msys.
gocross itself required a couple of updates:
We update gocross to output the PowerShell Core wrapper alongside the
bash wrapper, which will propagate the revised scripts to other repos
as necessary.
We also fix a couple of things in gocross that didn't work on Windows:
we change the toolchain resolution code to use os.UserHomeDir instead
of directly referencing the HOME environment variable, and we fix a
bug in the way arguments were being passed into exec.Command on
non-Unix systems.
Updates https://github.com/tailscale/corp/issues/29940
Signed-off-by: Aaron Klotz <aaron@tailscale.com>
Add a Run all probes handler that executes all
probes except those that are continuous or the derpmap
probe.
This is leveraged by other tooling to confirm DERP
stability after a deploy.
Updates tailscale/corp#27370
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
fixestailscale/corp#31299
Fixes two issues:
getInterfaceIndex would occasionally race with netmon's state, returning
the cached default interface index after it had be changed by NWNetworkMonitor.
This had the potential to cause connections to bind to the prior default. The fix
here is to preferentially use the interface index provided by NWNetworkMonitor
preferentially.
When no interfaces are available, macOS will set the tunnel as the default
interface when an exit node is enabled, potentially causing getInterfaceIndex
to return utun's index. We now guard against this when taking the
defaultIdx path.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
This pulls in a change from github.com/tailscale/QDK to verify code signing
when using QNAP_SIGNING_SCRIPT.
It also upgrades to the latest Google Cloud PKCS#11 library, and reorders
the Dockerfile to allow for more efficient future upgrades to the included QDK.
Updates tailscale/corp#23528
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Define the HardwareAttestionKey interface describing a platform-specific
hardware backed node identity attestation key. Clients will register the
key type implementations for their platform.
Updates tailscale/corp#31269
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
dnstype.Resolver adds a boolean UseWithExitNode that controls
whether the resolver should be used in tailscale exit node contexts
(not wireguard exit nodes). If UseWithExitNode resolvers are found,
they are installed as the global resolvers. If no UseWithExitNode resolvers
are found, the exit node resolver continues to be installed as the global
resolver. Split DNS Routes referencing UseWithExitNode resolvers are also
installed.
Updates #8237Fixestailscale/corp#30906Fixestailscale/corp#30907
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
We already show a message in the menu itself, this just adds it to the
CLI output as well.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
This adds support for having every viewer type implement
jsonv2.MarshalerTo and jsonv2.UnmarshalerFrom.
This provides a significant boost in performance
as the json package no longer needs to validate
the entirety of the JSON value outputted by MarshalJSON,
nor does it need to identify the boundaries of a JSON value
in order to call UnmarshalJSON.
For deeply nested and recursive MarshalJSON or UnmarshalJSON calls,
this can improve runtime from O(N²) to O(N).
This still references "github.com/go-json-experiment/json"
instead of the experimental "encoding/json/v2" package
now available in Go 1.25 under goexperiment.jsonv2
so that code still builds without the experiment tag.
Of note, the "github.com/go-json-experiment/json" package
aliases the standard library under the right build conditions.
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
Adds a setter for proxyFunc to allow macOS to pull defined
system proxies. Disallows overriding if proxyFunc is set via config.
Updates tailscale/corp#30668
Signed-off-by: Will Hannah <willh@tailscale.com>
This affects the 1.87.33 unstable release.
Updates #16842
Updates #15160
Change-Id: Ie6d1b2c094d1a6059fbd1023760567900f06e0ad
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Expected when Peer Relay'ing via self. These disco messages never get
sealed, and never leave the process.
Updates tailscale/corp#30527
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Update some logging to help future failures.
Improve test shutdown concurrency issues.
Fixes#16722
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Peer Relay is dependent on crypto routing, therefore crypto routing is
now mandatory.
Updates tailscale/corp#20732
Updates tailscale/corp#31083
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This commit also extends the updateRelayServersSet unit tests to cover
onNodeViewsUpdate.
Fixestailscale/corp#31080
Signed-off-by: Jordan Whited <jordan@tailscale.com>
One of these tests highlighted a Geneve encap bug, which is also fixed
in this commit.
looksLikeInitMsg was passed a packet post Geneve header stripping with
slice offsets that had not been updated to account for the stripping.
Updates tailscale/corp#30903
Signed-off-by: Jordan Whited <jordan@tailscale.com>
* Update installer.sh add FreeBSD ver 15
this should fix the issue on https://github.com/tailscale/tailscale/issues/16740
Signed-off-by: TheBigBear <471105+TheBigBear@users.noreply.github.com>
* scripts/installer.sh: small indentation change
Signed-off-by: Erisa A <erisa@tailscale.com>
Fixes#16740
---------
Signed-off-by: TheBigBear <471105+TheBigBear@users.noreply.github.com>
Signed-off-by: Erisa A <erisa@tailscale.com>
Co-authored-by: Erisa A <erisa@tailscale.com>
Pass a local.Client to systray.Run, so we can use the existing global
localClient in the cmd/tailscale CLI. Add socket flag to cmd/systray.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Adds the eventbus to the router subsystem.
The event is currently only used on linux.
Also includes facilities to inject events into the bus.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
This will start including the sytray app in unstable builds for Linux,
unless the `ts_omit_systray` build flag is specified.
If we decide not to include it in the v1.88 release, we can pull it
back out or restrict it to unstable builds.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
In Android, we are prompting the user to select a Taildrop directory when they first receive a Taildrop: we block writes on Taildrop dir selection. This means that we cannot use Dir inside managerOptions, since the http request would not get the new Taildrop extension. This PR removes, in the Android case, the reliance on m.opts.Dir, and instead has FileOps hold the correct directory.
This expands FileOps to be the Taildrop interface for all file system operations.
Updates tailscale/corp#29211
Signed-off-by: kari-ts <kari@tailscale.com>
restore tstest
* cmd/k8s-operator,k8s-operator: allow setting a `priorityClassName`
Fixes#16682
Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
* Update k8s-operator/apis/v1alpha1/types_proxyclass.go
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
Signed-off-by: Lee Briggs <jaxxstorm@users.noreply.github.com>
* run make kube-generate-all
Change-Id: I5f8f16694fdc181b048217b9f05ec2ee2aa04def
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
---------
Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
Signed-off-by: Lee Briggs <jaxxstorm@users.noreply.github.com>
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
The tsidp oidc-key.json ended up in the root directory
or home dir of the user process running it.
Update this to store it in a known location respecting
the TS_STATE_DIR and flagDir options.
Fixes#16734
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Also adds a test to kube/kubeclient to defend against the error type
returned by the client changing in future.
Fixestailscale/corp#30855
Change-Id: Id11d4295003e66ad5c29a687f1239333c21226a4
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Some systems have `sudo`, some have `su`. This tries both, increasing
the chance that we can run the file server as an unprivileged user.
Updates #14629
Signed-off-by: Percy Wegmann <percy@tailscale.com>
If a conn.Close call raced conn.ReadFromUDPAddrPort before it could
"register" itself as an active read, the conn.ReadFromUDPAddrPort would
never return.
This commit replaces all the activeRead and breakActiveReads machinery
with a channel. These constructs were only depended upon by
SetReadDeadline, and SetReadDeadline was unused.
Updates #16707
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This commit update the message for recommanding clear command after running serve for service.
Instead of a flag, we pass the service name as a parameter.
Fixestailscale/corp#30846
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
In the components where an event bus is already plumbed through, remove the
exceptions that allow it to be omitted, and update all the tests that relied on
those workarounds execute properly.
This change applies only to the places where we're already using the bus; it
does not enforce the existence of a bus in other components (yet),
Updates #15160
Change-Id: Iebb92243caba82b5eb420c49fc3e089a77454f65
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
jsonv2 now returns an error when you marshal or unmarshal a time.Duration
without an explicit format flag. This is an intentional, temporary choice until
the default [time.Duration] representation is decided (see golang/go#71631).
setting.Snapshot can hold time.Duration values inside a map[string]any,
so the jsonv2 update breaks marshaling. In this PR, we start using
a custom marshaler until that decision is made or golang/go#71664
lets us specify the format explicitly.
This fixes `tailscale syspolicy list` failing when KeyExpirationNotice
or any other time.Duration policy setting is configured.
Fixes#16683
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Ideally when we attempt to create a new port mapping, we should not return
without error when no mapping is available. We already log these cases as
unexpected, so this change is just to avoiding panicking dispatch on the
invalid result in those cases. We still separately need to fix the underlying
control flow.
Updates #16662
Change-Id: I51e8a116b922b49eda45e31cd27f6b89dd51abc8
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This occasionally panics waiting on a nil ctx, but was missed in the
previous PR because it's quite a rare flake as it needs to progress to a
specific point in the parser.
Updates #16678
Change-Id: Ifd36dfc915b153aede36b8ee39eff83750031f95
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
When kubectl starts an interactive attach session, it sends 2 resize
messages in quick succession. It seems that particularly in HTTP mode,
we often receive both of these WebSocket frames from the underlying
connection in a single read. However, our parser currently assumes 0-1
frames per read, and leaves the second frame in the read buffer until
the next read from the underlying connection. It doesn't take long after
that before we end up failing to skip a control message as we normally
should, and then we parse a control message as though it will have a
stream ID (part of the Kubernetes protocol) and error out.
Instead, we should keep parsing frames from the read buffer for as long
as we're able to parse complete frames, so this commit refactors the
messages parsing logic into a loop based on the contents of the read
buffer being non-empty.
k/k staging/src/k8s.io/kubectl/pkg/cmd/attach/attach.go for full
details of the resize messages.
There are at least a couple more multiple-frame read edge cases we
should handle, but this commit is very conservatively fixing a single
observed issue to make it a low-risk candidate for cherry picking.
Updates #13358
Change-Id: Iafb91ad1cbeed9c5231a1525d4563164fc1f002f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This update introduces support for DNS records associated with ProxyGroup egress services, ensuring that the ClusterIP Service IP is used instead of Pod IPs.
Fixes#15945
Signed-off-by: Raj Singh <raj@tailscale.com>
Previously, we used a non-nil Location as an indicator that a peer is a Mullvad exit node.
However, this is not, or no longer, reliable, since regular exit nodes may also have a non-nil Location,
such as when traffic steering is enabled for a tailnet.
In this PR, we update the plaintext `tailscale status` output to omit only Mullvad exit nodes, rather than all
exit nodes with a non-nil Location. The JSON output remains unchanged and continues to include all peers.
Updates tailscale/corp#30614
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In #16625, I introduced a mechanism for sending the selected exit node
to Control via tailcfg.Hostinfo.ExitNodeID as part of the MapRequest.
@nickkhyl pointed out that LocalBackend.doSetHostinfoFilterServices
needs to be triggered in order to actually send this update. This
patch adds that command. It also prevents the client from sending
"auto:any" in that field, because that’s not a real exit node ID.
This patch also fills in some missing checks in TestConfigureExitNode.
Updates tailscale/corp#30536
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Update nixpkgs-unstable to include newer golang
to satisfy go.mod requirement of 1.24.4
Update vendor hash to current.
Updates #15015
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
This commit adds a advertise subcommand for tailscale serve, that would declare the node
as a service proxy for a service. This command only adds the service to node's list of
advertised service, but doesn't modify the list of services currently advertised.
Fixestailscale/corp#28016
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
When a client selects a particular exit node, Control may use that as
a signal for deciding other routes.
This patch causes the client to report whenever the current exit node
changes, through tailcfg.Hostinfo.ExitNodeID. It relies on a properly
set ipn.Prefs.ExitNodeID, which should already be resolved by
`tailscale set`.
Updates tailscale/corp#30536
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This commit reverts the key of Web field in ipn.ServiceConfig to use FQDN instead of service
name for the host part of HostPort. This change is because k8s operator already build base on
the assumption of the part being FQDN. We don't want to break the code with dependency.
Fixestailscale/corp#30695
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* Modifies the k8s-proxy to expose health check and metrics
endpoints on the Pod's IP.
* Moves cmd/containerboot/healthz.go and cmd/containerboot/metrics.go to
/kube to be shared with /k8s-proxy.
Updates #13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
Updates k8s-proxy's config so its auth mode config matches that we set
in kube-apiserver ProxyGroups for consistency.
Updates #13358
Change-Id: I95e29cec6ded2dc7c6d2d03f968a25c822bc0e01
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
The Kubernetes API server proxy is getting the ability to serve on a
Tailscale Service instead of individual node names. Update the configure
kubeconfig sub-command to accept arguments that look like a Tailscale
Service. Note, we can't know for sure whether a peer is advertising a
Tailscale Service, we can only guess based on the ExtraRecords in the
netmap and that IP showing up in a peer's AllowedIPs.
Also adds an --http flag to allow targeting individual proxies that can
be adverting on http for their node name, and makes the command a bit
more forgiving on the range of inputs it accepts and how eager it is to
print the help text when the input is obviously wrong.
Updates #13358
Change-Id: Ica0509c6b2c707252a43d7c18b530ec1acf7508f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This commit modifies the kubernetes operator's `DNSConfig` resource
with the addition of a new field at `nameserver.service.clusterIP`.
This field allows users to specify a static in-cluster IP address of
the nameserver when deployed.
Fixes#14305
Signed-off-by: David Bond <davidsbond93@gmail.com>
This function is behind a sync.Once so we should only see errors at
startup. In particular the error from `open` is useful to diagnose why
TPM might not be accessible.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Adds a new reconciler for ProxyGroups of type kube-apiserver that will
provision a Tailscale Service for each replica to advertise. Adds two
new condition types to the ProxyGroup, TailscaleServiceValid and
TailscaleServiceConfigured, to post updates on the state of that
reconciler in a way that's consistent with the service-pg reconciler.
The created Tailscale Service name is configurable via a new ProxyGroup
field spec.kubeAPISserver.ServiceName, which expects a string of the
form "svc:<dns-label>".
Lots of supporting changes were needed to implement this in a way that's
consistent with other operator workflows, including:
* Pulled containerboot's ensureServicesUnadvertised and certManager into
kube/ libraries to be shared with k8s-proxy. Use those in k8s-proxy to
aid Service cert sharing between replicas and graceful Service shutdown.
* For certManager, add an initial wait to the cert loop to wait until
the domain appears in the devices's netmap to avoid a guaranteed error
on the first issue attempt when it's quick to start.
* Made several methods in ingress-for-pg.go and svc-for-pg.go into
functions to share with the new reconciler
* Added a Resource struct to the owner refs stored in Tailscale Service
annotations to be able to distinguish between Ingress- and ProxyGroup-
based Services that need cleaning up in the Tailscale API.
* Added a ListVIPServices method to the internal tailscale client to aid
cleaning up orphaned Services
* Support for reading config from a kube Secret, and partial support for
config reloading, to prevent us having to force Pod restarts when
config changes.
* Fixed up the zap logger so it's possible to set debug log level.
Updates #13358
Change-Id: Ia9607441157dd91fb9b6ecbc318eecbef446e116
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
This commit removes the advertise command for service. The advertising is now embedded into
serve command and unadvertising is moved to drain subcommand
Fixestailscale/corp#22954
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: add clear subcommand for serve services
This commit adds a clear subcommand for serve command, to remove all config for a passed service.
This is a short cut for user to remove services after they drain a service. As an indipendent command
it would avoid accidently remove a service on typo.
Updates tailscale/corp#22954
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* update regarding comments
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* log when clearing a non-existing service but not error
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
---------
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
The tpmrm0 is a kernel-managed version of tpm0 that multiplexes multiple
concurrent connections. The basic tpm0 can only be accessed by one
application at a time, which can be pretty unreliable.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
* cmd/tailscale/cli: add drain subCommand for serve
This commit adds the drain subcommand for serving services. After we merge advertise and serve service as one step,
we now need a way to unadvertise service and this is it.
Updates tailscale/corp#22954
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* move runServeDrain and some update regarding pr comments
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* some code structure change
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
---------
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
Make it possible to dump the eventbus graph as JSON or DOT to both debug
and document what is communicated via the bus.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Package geo provides functionality to represent and process
geographical locations on a sphere. The main type, geo.Point,
represents a pair of latitude and longitude coordinates.
Updates tailscale/corp#29968
Signed-off-by: Simon Law <sfllaw@tailscale.com>
* cmd/tailscale/cli: Add service flag to serve command
This commit adds the service flag to serve command which allows serving a service and add the service
to the advertisedServices field in prefs (What advertise command does that will be removed later).
When adding proxies, TCP proxies and WEB proxies work the same way as normal serve, just under a
different DNSname. There is a services specific L3 serving mode called Tun, can be set via --tun flag.
Serving a service is always in --bg mode. If --bg is explicitly set t o false, an error message will
be sent out. The restriction on proxy target being localhost or 127.0.0.1 also applies to services.
When removing proxies, TCP proxies can be removed with type and port flag and off argument. Web proxies
can be removed with type, port, setPath flag and off argument. To align with normal serve, when setPath
is not set, all handler under the hostport will be removed. When flags are not set but off argument was
passed by user, it will be a noop. Removing all config for a service will be available later with a new
subcommand clear.
Updates tailscale/corp#22954
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: fix ai comments and fix a test
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: Add a test for addServiceToPrefs
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: fix comment
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* add dnsName in error message
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* change the cli input flag variable type
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* replace FindServiceConfig with map lookup
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* some code simplification and add asServiceName
This commit cotains code simplification for IsServingHTTPS, SetWebHandler, SetTCPForwarding
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* replace IsServiceName with tailcfg.AsServiceName
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* replace all assemble of host name for service with strings.Join
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: adjust parameter order and update output message
This commit updates the parameter order for IsTCPForwardingOnPort and SetWebHandler.
Also updated the message msgServiceIPNotAssigned to msgServiceWaitingApproval to adapt to
latest terminologies around services.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: flip bool condition
This commit fixes a previous bug added that throws error when serve funnel without service.
It should've been the opposite, which throws error when serve funnel with service.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: change parameter of IsTCPForwardingOnPort
This commit changes the dnsName string parameter for IsTCPForwardingOnPort to
svcName tailcfg.ServiceName. This change is made to reduce ambiguity when
a single service might have different dnsNames
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* ipn/ipnlocal: replace the key to webHandler for services
This commit changes the way we get the webhandler for vipServices. It used to use the host name
from request to find the webHandler, now everything targeting the vipService IP have the same
set of handlers. This commit also stores service:port instead of FQDN:port as the key in serviceConfig
for Web map.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: Updated use of service name.
This commit removes serviceName.IsEmpty and use direct comparison to instead. In legacy code, when an empty service
name needs to be passed, a new constant noService is passed. Removed redundant code for checking service name validity
and string method for serviceNameFlag.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: Update bgBoolFlag
This commit update field name, set and string method of bgBoolFlag to make code cleaner.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: remove isDefaultService output from srvTypeAndPortFromFlags
This commit removes the isDefaultService out put as it's no longer needed. Also deleted redundant code.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: remove unnessesary variable declare in messageForPort
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* replace bool output for AsServiceName with err
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: Replace DNSName with NoService if DNSname only used to identify service
This commit moves noService constant to tailcfg, updates AsServiceName to return tailcfg.NoService if the input
is not a valid service name. This commit also removes using the local DNSName as scvName parameter. When a function
is only using DNSName to identify if it's working with a service, the input in replaced with svcName and expect
caller to pass tailcfg.NoService if it's a local serve. This commit also replaces some use of Sprintf with
net.JoinHostPort for ipn.HostPort creation.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: Remove the returned error for AsServiceName
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* apply suggested code and comment
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* replace local dnsName in test with tailcfg.NoService
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* cmd/tailscale/cli: move noService back and use else where
The constant serves the purpose of provide readability for passing as a function parameter. It's
more meaningful comparing to a . It can just be an empty string in other places.
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
* ipn: Make WebHandlerExists and RemoveTCPForwarding accept svcName
This commit replaces two functions' string input with svcName input since they only use the dnsName to
identify service. Also did some minor cleanups
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
---------
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
With auto exit nodes enabled, the client picks exit nodes from the
ones advertised in the network map. Usually, it picks the one with the
highest priority score, but when the top spot is tied, it used to pick
randomly. Then, once it made a selection, it would strongly prefer to
stick with that exit node. It wouldn’t even consider another exit node
unless the client was shutdown or the exit node went offline. This is
to prevent flapping, where a client constantly chooses a different
random exit node.
The major problem with this algorithm is that new exit nodes don’t get
selected as often as they should. In fact, they wouldn’t even move
over if a higher scoring exit node appeared.
Let’s say that you have an exit node and it’s overloaded. So you spin
up a new exit node, right beside your existing one, in the hopes that
the traffic will be split across them. But since the client had this
strong affinity, they stick with the exit node they know and love.
Using rendezvous hashing, we can have different clients spread
their selections equally across their top scoring exit nodes. When an
exit node shuts down, its clients will spread themselves evenly to
their other equal options. When an exit node starts, a proportional
number of clients will migrate to their new best option.
Read more: https://en.wikipedia.org/wiki/Rendezvous_hashing
The trade-off is that starting up a new exit node may cause some
clients to move over, interrupting their existing network connections.
So this change is only enabled for tailnets with `traffic-steering`
enabled.
Updates tailscale/corp#29966
Fixes#16551
Signed-off-by: Simon Law <sfllaw@tailscale.com>
So that conn.PeerAwareEndpoint is always evaluated per-packet, rather
than at least once per packet batch.
Updates tailscale/corp#30042
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This is a follow-up to #15351, which fixed the test for Linux but not for
Darwin, which stores its "true" executable in /usr/bin instead of /bin.
Try both paths when not running on Windows.
In addition, disable CGo in the integration test build, which was causing the
linker to fail. These tests do not need CGo, and it appears we had some version
skew with the base image on the runners.
In addition, in error cases the recover step of the permissions check was
spuriously panicking and masking the "real" failure reason. Don't do that check
when a command was not produced.
Updates #15350
Change-Id: Icd91517f45c90f7554310ebf1c888cdfd109f43a
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Socket read errors currently close the server, so we need to understand
when and why they occur.
Updates tailscale/corp#27502
Updates tailscale/corp#30118
Signed-off-by: Jordan Whited <jordan@tailscale.com>
@nickkyl added an peer.Online check to suggestExitNodeUsingDERP, so it
should also check when running suggestExitNodeUsingTrafficSteering.
Updates tailscale/corp#29966
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Thanks to @nickkhyl for pointing out that NetMap.Peers doesn’t get
incremental updates since the last full NetMap update. Instead, he
recommends using ipn/ipnlocal.nodeBackend.AppendMatchingPeers.
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
A trusted peer relay path is always better than an untrusted direct or
peer relay path.
Updates tailscale/corp#30412
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This package promises more performance, but was never used.
The intent of the package is somewhat moot as "encoding/json"
in Go 1.25 (while under GOEXPERIMENT=jsonv2) has been
completely re-implemented using "encoding/json/v2"
such that unmarshal is dramatically faster.
Updates #cleanup
Updates tailscale/corp#791
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
udpRelayEndpointReady used to write into the peerMap, which required
holding Conn.mu, but this changed in f9e7131.
Updates #cleanup
Signed-off-by: Jordan Whited <jordan@tailscale.com>
To write the init script.
And fix the JetKVM detection to work during early boot while the filesystem
and modules are still being loaded; it wasn't being detected on early boot
and then tailscaled was failing to start because it didn't know it was on JetKVM
and didn't modprobe tun.
Updates #16524
Change-Id: I0524ca3abd7ace68a69af96aab4175d32c07e116
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When `tailscale exit-node suggest` contacts the LocalAPI for a
suggested exit node, the client consults its netmap for peers that
contain the `suggest-exit-node` peercap. It currently uses a series of
heuristics to determine the exit node to suggest.
When the `traffic-steering` feature flag is enabled on its tailnet,
the client will defer to Control’s priority scores for a particular
peer. These scores, in `tailcfg.Hostinfo.Location.Priority`, were
historically only used for Mullvad exit nodes, but they have now been
extended to score any peer that could host a redundant resource.
Client capability version 119 is the earliest client that understands
these traffic steering scores. Control tells the client to switch to
rely on these scores by adding `tailcfg.NodeAttrTrafficSteering` to
its `AllCaps`.
Updates tailscale/corp#29966
Signed-off-by: Simon Law <sfllaw@tailscale.com>
In this PR, we make ExitNode.AllowOverride configurable as part of the Exit Node ADMX policy setting,
similarly to Always On w/ "Disconnect with reason" option.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Updates the output for "tailscale ping" to indicate if a peer relay was traversed, just like the output for DERP or direct connections.
Fixestailscale/corp#30034
Signed-off-by: Dylan Bargatze <dylan@tailscale.com>
To signal when a tailnet has the `traffic-steering` feature flag,
Control will send a `traffic-steering` NodeCapability in netmap’s
AllCaps.
This patch adds `tailcfg.NodeAttrTrafficSteering` so that it can be
used in the control plane. Future patches will implement the actual
steering mechanisms.
Updates tailscale/corp#29966
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This commit modifies the k8s-operator and k8s-proxy to support passing down
the accept-routes configuration from the proxy class as a configuration value
read and used by the k8s-proxy when ran as a distinct container managed by
the operator.
Updates #13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit modifies the k8s proxy application configuration to include a
new field named `ServerURL` which, when set, modifies
the tailscale coordination server used by the proxy. This works in the same
way as the operator and the proxies it deploys.
If unset, the default coordination server is used.
Updates https://github.com/tailscale/tailscale/issues/13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
This commit modifies the operator to detect the usage of k8s-apiserver
type proxy groups that wish to use the letsencrypt staging directory and
apply the appropriate environment variable to the statefulset it
produces.
Updates #13358
Signed-off-by: David Bond <davidsbond93@gmail.com>
Errors were mashalled without the correct newlines. Also, they could
generally be mashalled with more data, so an intermediate was introduced
to make them slightly nicer to look at.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
If the NodeAttrDisableRelayClient node attribute is set, ensures that a node cannot allocate endpoints on a UDP relay server itself, and cannot use newly-discovered paths (via disco/CallMeMaybeVia) that traverse a UDP relay server.
Fixestailscale/corp#30180
Signed-off-by: Dylan Bargatze <dylan@tailscale.com>
If the GUI receives a new exit node ID before the new netmap, it may treat the node as offline or invalid
if the previous netmap didn't include the peer at all, or if the peer was offline or not advertised as an exit node.
This may result in briefly issuing and dismissing a warning, or a similar issue, which isn't ideal.
In this PR, we change the operation order to send the new netmap to clients first before selecting the new exit node
and notifying them of the Exit Node change.
Updates tailscale/corp#30252 (an old issue discovered during testing this)
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We already check this for cases where ipn.Prefs.AutoExitNode is configured via syspolicy.
Configuring it directly through EditPrefs should behave the same, so we add a test for that as well.
Additionally, we clarify the implementation and future extensibility in (*LocalBackend).resolveAutoExitNodeLocked,
where the AutoExitNode is actually enforced.
Updates tailscale/corp#29969
Updates #cleanup
Signed-off-by: Nick Khyl <nickk@tailscale.com>
If the specified exit node string starts with "auto:" (i.e., can be parsed as an ipn.ExitNodeExpression),
we update ipn.Prefs.AutoExitNode instead of ipn.Prefs.ExitNodeID.
Fixes#16459
Signed-off-by: Nick Khyl <nickk@tailscale.com>
So it can be used from the CLI without importing ipnlocal.
While there, also remove isAutoExitNodeID, a wrapper around parseAutoExitNodeID
that's no longer used.
Updates tailscale/corp#29969
Updates #cleanup
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The observed generation was set to always 0 in #16429, but this had the
knock-on effect of other controllers considering ProxyGroups never ready
because the observed generation is never up to date in
proxyGroupCondition. Make sure the ProxyGroupAvailable function does not
requires the observed generation to be up to date, and add testing
coverage to catch regressions.
Updates #16327
Change-Id: I42f50ad47dd81cc2d3c3ce2cd7b252160bb58e40
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Adds a new k8s-proxy command to convert operator's in-process proxy to
a separately deployable type of ProxyGroup: kube-apiserver. k8s-proxy
reads in a new config file written by the operator, modelled on tailscaled's
conffile but with some modifications to ensure multiple versions of the
config can co-exist within a file. This should make it much easier to
support reading that config file from a Kube Secret with a stable file name.
To avoid needing to give the operator ClusterRole{,Binding} permissions,
the helm chart now optionally deploys a new static ServiceAccount for
the API Server proxy to use if in auth mode.
Proxies deployed by kube-apiserver ProxyGroups currently work the same as
the operator's in-process proxy. They do not yet leverage Tailscale Services
for presenting a single HA DNS name.
Updates #13358
Change-Id: Ib6ead69b2173c5e1929f3c13fb48a9a5362195d8
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Based on feedback that it wasn't clear what the user is meant to do with
the output of the last command, clarify that it's an optional command to
explore what got created.
Updates #13427
Change-Id: Iff64ec6d02dc04bf4bbebf415d7ed1a44e7dd658
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
When running `tailscale exit-node list`, an empty city or country name
should be displayed as a hyphen "-". However, this only happened when
there was no location at all. If a node provides a Hostinfo.Location,
then the list would display exactly what was provided.
This patch changes the listing so that empty cities and countries will
either render the provided name or "-".
Fixes#16500
Signed-off-by: Simon Law <sfllaw@tailscale.com>
When the policy setting is enabled, it allows users to override the exit node enforced by the ExitNodeID
or ExitNodeIP policy. It's primarily intended for use when ExitNodeID is set to auto:any, but it can also
be used with specific exit nodes. It does not allow disabling exit node usage entirely.
Once the exit node policy is overridden, it will not be enforced again until the policy changes,
the user connects or disconnects Tailscale, switches profiles, or disables the override.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Now that applySysPolicy is only called by (*LocalBackend).reconcilePrefsLocked,
we can make it a method to avoid passing state via parameters and to support
future extensibility.
Also factor out exit node-specific logic into applyExitNodeSysPolicyLocked.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Now that resolveExitNodeInPrefsLocked is the only caller of setExitNodeID,
and setExitNodeID is the only caller of resolveExitNodeIP, we can restructure
the code with resolveExitNodeInPrefsLocked now calling both
resolveAutoExitNodeLocked and resolveExitNodeIPLocked directly.
This prepares for factoring out resolveAutoExitNodeLocked and related
auto-exit-node logic into an ipnext extension in a future commit.
While there, we also update exit node by IP lookup to use (*nodeBackend).NodeByAddr
and (*nodeBackend).NodeByID instead of iterating over all peers in the most recent netmap.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we start passing a LocalAPI actor to (*LocalBackend).Logout to make it subject
to the same access check as disconnects made via tailscale down or the GUI.
We then update the CLI to allow `tailscale logout` to accept a reason, similar to `tailscale down`.
Updates tailscale/corp#26249
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Since a [*lazyEndpoint] makes wireguard-go responsible for peer ID, but
wireguard-go may not yet be configured for said peer, we need a JIT hook
around initiation message reception to call what is usually called from
an [*endpoint].
Updates tailscale/corp#30042
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We can't relay a packet received over the IPv4 socket back out the same
socket if destined to an IPv6 address, and vice versa.
Updates tailscale/corp#30206
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We extract checkEditPrefsAccessLocked, adjustEditPrefsLocked, and onEditPrefsLocked from the EditPrefs
execution path, defining when each step is performed and what behavior is allowed at each stage.
Currently, this is primarily used to support Always On mode, to handle the Exit Node enablement toggle,
and to report prefs edit metrics.
We then use it to enforce Exit Node policy settings by preventing users from setting an exit node
and making EditPrefs return an error when an exit node is restricted by policy. This enforcement is also
extended to the Exit Node toggle.
These changes prepare for supporting Exit Node overrides when permitted by policy and preventing logout
while Always On mode is enabled.
In the future, implementation of these methods can be delegated to ipnext extensions via the feature hooks.
Updates tailscale/corp#29969
Updates tailscale/corp#26249
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We have several places where we call applySysPolicy, suggestExitNodeLocked, and setExitNodeID.
While there are cases where we want to resolve the exit node specifically, such as when network
conditions change or a new netmap is received, we typically need to perform all three steps.
For example, enforcing policy settings may enable auto exit nodes or set an ExitNodeIP,
which in turn requires picking a suggested exit node or resolving the IP to an ID, respectively.
In this PR, we introduce (*LocalBackend).resolveExitNodeInPrefsLocked and (*LocalBackend).reconcilePrefsLocked,
with the latter calling both applySysPolicy and resolveExitNodeInPrefsLocked.
Consolidating these steps into a single extensibility point would also make it easier to support
future hooks registered by ipnext extensions.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we update setExitNodeID to retain the existing exit node if auto exit node is enabled,
the current exit node is allowed by policy, and no suggested exit node is available yet.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Now that (*LocalBackend).suggestExitNodeLocked is never called with a non-current netmap
(the netMap parameter is always nil, indicating that the current netmap should be used),
we can remove the unused parameter.
Additionally, instead of suggestExitNodeLocked passing the most recent full netmap to suggestExitNode,
we now pass the current nodeBackend so it can access peers with delta updates applied.
Finally, with that fixed, we no longer need to skip TestUpdateNetmapDeltaAutoExitNode.
Updates tailscale/corp#29969
Fixes#16455
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we add (*LocalBackend).RefreshExitNode which determines which exit node
to use based on the current prefs and netmap and switches to it if needed. It supports
both scenarios when an exit node is specified by IP (rather than ID) and needs to be resolved
once the netmap is ready as well as auto exit nodes.
We then use it in (*LocalBackend).SetControlClientStatus when the netmap changes,
and wherever (*LocalBackend).pickNewAutoExitNode was previously used.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
TCP connections are two unidirectional data streams, and if one of these
streams closes, we should not assume the other half is closed as well.
For example, if an HTTP client closes its write half of the connection
early, it may still be expecting to receive data on its read half, so we
should keep the server -> client half of the connection open, while
terminating the client -> server half.
Fixestailscale/corp#29837.
Signed-off-by: Naman Sood <mail@nsood.in>
These were flipped. DstIP() and DstIPBytes() are used internally by
wireguard-go as part of a handshake DoS mitigation strategy.
Updates tailscale/corp#20732
Updates tailscale/corp#30042
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Just make [relayManager] always handle it, there's no benefit to
checking bestAddr's.
Also, remove passing of disco.Pong to [relayManager] in
endpoint.handlePongConnLocked(), which is redundant with the callsite in
Conn.handleDiscoMessage(). Conn.handleDiscoMessage() already passes to
[relayManager] if the txID us not known to any [*endpoint].
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
A lazyEndpoint may end up on this TX codepath when wireguard-go is
deemed "under load" and ends up transmitting a cookie reply using the
received conn.Endpoint.
Updates tailscale/corp#20732
Updates tailscale/corp#30042
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This commit modifies the k8s operator to allow for customisation of the ingress class name
via a new `OPERATOR_INGRESS_CLASS_NAME` environment variable. For backwards compatibility,
this defaults to `tailscale`.
When using helm, a new `ingress.name` value is provided that will set this environment variable
and modify the name of the deployed `IngressClass` resource.
Fixes https://github.com/tailscale/tailscale/issues/16248
Signed-off-by: David Bond <davidsbond93@gmail.com>
Refactors setting status into its own top-level function to make it
easier to ensure we _always_ set the status if it's changed on every
reconcile. Previously, it was possible to have stale status if some
earlier part of the provision logic failed.
Updates #16327
Change-Id: Idab0cfc15ae426cf6914a82f0d37a5cc7845236b
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Inverts the nodeAttrs related to UDP relay client/server enablement to disablement, and fixes up the corresponding logic that uses them. Also updates the doc comments on both nodeAttrs.
Fixestailscale/corp#30024
Signed-off-by: Dylan Bargatze <dylan@tailscale.com>
This commit modifies the operator helm chart values to bring the newly
added `loginServer` field to the top level. We felt as though it was a bit
confusing to be at the `operatorConfig` level as this value modifies the
behaviour or the operator, api server & all resources that the operator
manages.
Updates https://github.com/tailscale/corp/issues/29847
Signed-off-by: David Bond <davidsbond93@gmail.com>
With this change, policy enforcement and exit node resolution can happen in separate steps,
since enforcement no longer depends on resolving the suggested exit node. This keeps policy
enforcement synchronous (e.g., when switching profiles), while allowing exit node resolution
to be asynchronous on netmap updates, link changes, etc.
Additionally, the new preference will be used to let GUIs and CLIs switch back to "auto" mode
after a manual exit node override, which is necessary for tailscale/corp#29969.
Updates tailscale/corp#29969
Updates #16459
Signed-off-by: Nick Khyl <nickk@tailscale.com>
TestSetControlClientStatusAutoExitNode is broken similarly to TestUpdateNetmapDeltaAutoExitNode
as suggestExitNode didn't previously check the online status of exit nodes, and similarly to the other test
it succeeded because the test itself is also broken.
However, it is easier to fix as it sends out a full netmap update rather than a delta peer update,
so it doesn't depend on the same refactoring as TestSetControlClientStatusAutoExitNode.
Updates #16455
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
suggestExitNode never checks whether an exit node candidate is online.
It also accepts a full netmap, which doesn't include changes from delta updates.
The test can't work correctly until both issues are fixed.
Previously, it passed only because the test itself is flawed.
It doesn't succeed because the currently selected node goes offline and a new one is chosen.
Instead, it succeeds because lastSuggestedExitNode is incorrect, and suggestExitNode picks
the correct node the first time it runs, based on the DERP map and the netcheck report.
The node in exitNodeIDWant just happens to be the optimal choice.
Fixing SuggestExitNode requires refactoring its callers first, which in turn reveals the flawed test,
as suggestExitNode ends up being called slightly earlier.
In this PR, we update the test to correctly fail due to existing bugs in SuggestExitNode,
and temporarily skip it until those issues are addressed in a future commit.
Updates #16455
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
(*profileManager).CurrentPrefs() is always valid. Additionally, there's no value in cloning
and passing the full ipn.Prefs when editing preferences. Instead, ipn.MaskedPrefs should
only have ExitNodeID set.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Currently, (*LocalBackend).pickNewAutoExitNode() is just a wrapper around
setAutoExitNodeIDLockedOnEntry that sends a prefs-change notification at the end.
It doesn't need to do that, since setPrefsLockedOnEntry already sends the notification
(setAutoExitNodeIDLockedOnEntry calls it via editPrefsLockedOnEntry).
This PR removes the old pickNewAutoExitNode function and renames
setAutoExitNodeIDLockedOnEntry to pickNewAutoExitNode for clarity.
Updates tailscale/corp#29969
Signed-off-by: Nick Khyl <nickk@tailscale.com>
When dialed with just an URL and no node, the recent proxy fixes caused
a regression where there was no TLS server name being included.
Updates #16222
Updates #16223
Signed-off-by: James Tucker <james@tailscale.com>
Co-Authored-by: Jordan Whited <jwhited@tailscale.com>
This commit modifies the kubernetes operator to allow for customisation of the tailscale
login url. This provides some data locality for people that want to configure it.
This value is set in the `loginServer` helm value and is propagated down to all resources
managed by the operator. The only exception to this is recorder nodes, where additional
changes are required to support modifying the url.
Updates https://github.com/tailscale/corp/issues/29847
Signed-off-by: David Bond <davidsbond93@gmail.com>
Cryptokey Routing identification is now required to set an [epAddr] into
the peerMap for Geneve-encapsulated [epAddr]s.
Updates tailscale/corp#27502
Updates tailscale/corp#29422
Updates tailscale/corp#30042
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Report whether the client is configured with state encryption (which
varies by platform and can be optional on some). Wire it up to
`--encrypt-state` in tailscaled, which is set for Linux/Windows, and set
defaults for other platforms. Macsys will also report this if full
Keychain migration is done.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
We would like to start sending whether a node is a Tailnet owner in netmap responses so that clients can determine what information to display to a user who wants to request account deletion.
Updates tailscale/corp#30016
Signed-off-by: kari-ts <kari@tailscale.com>
Instead of calculating the PeerAPI URL at the time that we add the peer,
we now calculate it on every access to the peer. This way, if we
initially did not have a shared address family with the peer, but
later do, this allows us to access the peer at that point. This
follows the pattern from other places where we access the peer API,
which also calculate the URL on an as-needed basis.
Additionally, we now show peers as not Available when we can't get
a peer API URL.
Lastly, this moves some of the more frequent verbose Taildrive logging
from [v1] to [v2] level.
Updates #29702
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This allows logging the following Taildrive behavior from the client's perspective
when --verbose=1:
- Initialization of Taildrive remotes for every peer
- Peer availability checks
- All HTTP requests to peers (not just GET and PUT)
Updates tailscale/corp#29702
Signed-off-by: Percy Wegmann <percy@tailscale.com>
Changes to our src/address family can trigger blackholes.
This commit also adds a missing set of trustBestAddrUntil when setting
a UDP relay path as bestAddr.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
* cmd/k8s-operator: ProxyClass annotation for Services and Ingresses
Previously, the ProxyClass could only be configured for Services and
Ingresses via a Label. This adds the ability to set it via an
Annotation, but prioritizes the Label if both a Label and Annotation are
set.
Updates #14323
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
* Update cmd/k8s-operator/operator.go
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
* Update cmd/k8s-operator/operator.go
Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
* cmd/k8s-operator: ProxyClass annotation for Services and Ingresses
Previously, the ProxyClass could only be configured for Services and
Ingresses via a Label. This adds the ability to set it via an
Annotation, but prioritizes the Label if both a Label and Annotation are
set.
Updates #14323
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
---------
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Signed-off-by: Tom Meadows <tom@tmlabs.co.uk>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
Replace the existing systray_start counter metrics with a
systray_running gauge metrics.
This also adds an IncrementGauge method to local client to parallel
IncrementCounter. The LocalAPI handler supports both, we've just never
added a client method for gauges.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
The server-side code already does e.g. "nodeid:%d" instead of "%x"
and as a result we have to second guess a lot of identifiers that could
be hex or decimal.
This stops the bleeding and means in a year and change we'll stop
seeing the hex forms.
Updates tailscale/corp#29827
Change-Id: Ie5785a07fc32631f7c949348d3453538ab170e6d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Otherwise we can end up mirroring packets to them forever. We may
eventually want to relax this to direct paths as well, but start with
UDP relay paths, which have a higher chance of becoming untrusted and
never working again, to be conservative.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We dropped the idea of the Experimental release stage in
tailscale/tailscale-www#7697, in favour of Community Projects.
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This method is only needed to migrate between store.FileStore and
tpm.tpmStore. We can make a runtime type assertion instead of
implementing an unused method for every platform.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This was previously hooked around direct UDP path discovery /
CallMeMaybe transmission, and related conditions. Now it is subject to
relay-specific considerations.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Previously, the operator checked the ProxyGroup status fields for
information on how many of the proxies had successfully authed. Use
their state Secrets instead as a more reliable source of truth.
containerboot has written device_fqdn and device_ips keys to the
state Secret since inception, and pod_uid since 1.78.0, so there's
no need to use the API for that data. Read it from the state Secret
for consistency. However, to ensure we don't read data from a
previous run of containerboot, make sure we reset containerboot's
state keys on startup.
One other knock-on effect of that is ProxyGroups can briefly be
marked not Ready while a Pod is restarting. Introduce a new
ProxyGroupAvailable condition to more accurately reflect
when downstream controllers can implement flows that rely on a
ProxyGroup having at least 1 proxy Pod running.
Fixes#16327
Change-Id: I026c18e9d23e87109a471a87b8e4fb6271716a66
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Relay handshakes may now occur multiple times over the lifetime of a
relay server endpoint. Handshake messages now include a handshake
generation, which is client specified, as a means to trigger safe
challenge reset server-side.
Relay servers continue to enforce challenge values as single use. They
will only send a given value once, in reply to the first arriving bind
message for a handshake generation.
VNI has been added to the handshake messages, and we expect the outer
Geneve header value to match the sealed value upon reception.
Remote peer disco pub key is now also included in handshake messages,
and it must match the receiver's expectation for the remote,
participating party.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Add a new `--encrypt-state` flag to `cmd/tailscaled`. Based on that
flag, migrate the existing state file to/from encrypted format if
needed.
Updates #15830
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
GitHub used to recommend the tibdex/github-app-token GitHub Action
until they wrote their own actions/create-github-app-token.
This patch replaces the use of the third-party action with the
official one.
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
go(1) repsects GOROOT if set, but tool/go / gocross-wrapper.sh are explicitly intending to use our toolchain. We don't need to set GOROOT, just unset it, and then go(1) handles the rest.
Updates tailscale/corp#26717
Signed-off-by: James Tucker <james@tailscale.com>
For any changes that involve DERP, automatically add the
@tailscale/dataplane team as a reviewer.
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Premature cancellation was preventing the work from ever being cleaned
up in runLoop().
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
SSH was disabled in #10538
Exit node was disabled in #13726
This enables ssh and exit-node options in case of Home Assistant.
Fixes#15552
Signed-off-by: Laszlo Magyar <lmagyar1973@gmail.com>
This commit adds a NOTES.txt to the operator helm chart that will be written to the
terminal upon successful installation of the operator.
It includes a small list of knowledgebase articles with possible next steps for
the actor that installed the operator to the cluster. It also provides possible
commands to use for explaining the custom resources.
Fixes#13427
Signed-off-by: David Bond <davidsbond93@gmail.com>
Instead of every module having to come up with a set of test methods for
the event bus, this handful of test helpers hides a lot of the needed
setup for the testing of the event bus.
The tests in portmapper is also ported over to the new helpers.
Updates #15160
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
20.04 is no longer supported.
This pulls in changes to the QDK package that were required to make build succeed on 24.04.
Updates https://github.com/tailscale/corp/issues/29849
Signed-off-by: Percy Wegmann <percy@tailscale.com>
After the switch to 24.04, unsigned packages did not build correctly (came out as only a few KBs).
Fixestailscale/tailscale-qpkg#148
Signed-off-by: Percy Wegmann <percy@tailscale.com>
If we acted as the allocator we are responsible for signaling it to the
remote peer in a CallMeMaybeVia message over DERP.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
udprelay.Server is lazily initialized when the first request is received
over peerAPI. These early requests have a high chance of failure until
the first address discovery cycle has completed.
Return an ErrServerNotReady error until the first address discovery
cycle has completed, and plumb retry handling for this error all the
way back to the client in relayManager.
relayManager can now retry after a few seconds instead of waiting for
the next path discovery cycle, which could take another minute or
longer.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Any return underneath this select case must belong to a type switch case.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This tweaks the just-added ./tool/go.{cmd,ps1} port of ./tool/go for
Windows.
Otherwise in Windows Terminal in Powershell, running just ".\tool\go"
picks up go.ps1 before go.cmd, which means execution gets denied
without the cmd script's -ExecutionPolicy Bypass part letting it work.
This makes it work in both cmd.exe and in Powershell.
Updates tailscale/corp#28679
Change-Id: Iaf628a9fd6cb95670633b2dbdb635dfb8afaa006
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
go.cmd lets you run just "./tool/go" on Windows the same as Linux/Darwin.
The batch script (go.md) then just invokes PowerShell which is more
powerful than batch.
I wanted this while debugging Windows CI performance by reproducing slow
tests on my local Windows laptop.
Updates tailscale/corp#28679
Change-Id: I6e520968da3cef3032091c1c4f4237f663cefcab
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It's one of the slower ones, so split it up into chunks.
Updates tailscale/corp#28679
Change-Id: I16a5ba667678bf238c84417a51dda61baefbecf7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Proxies know how to reload configfile on changes since 1.80, which
is going to be the earliest supported proxy version with 1.84 operator,
so remove the mechanism that was updating configfile hash to force
proxy Pod restarts on config changes.
Updates #13032
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
I earlier thought this saved a second of CPU even on a fast machine,
but I think when I was previously measuring, I still had a 4096 bit
RSA key being generated in the code I was measuring.
Measuring again for this, it's plenty fast.
Prep for using this package more, for derp, etc.
Updates #16315
Change-Id: I4c9008efa9aa88a3d65409d6ffd7b3807f4d75e9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This reverts commit 6a93b17c8c.
The reverted commit added more complexity than it was worth at the
current stage. Handling delta CapVer changes requires extensive changes
to relayManager datastructures in order to also support delta updates of
relay servers.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
If you had HTTPS_PROXY=https://some-valid-cert.example.com running a
CONNECT proxy, we should've been able to do a TLS CONNECT request to
e.g. controlplane.tailscale.com:443 through that, and I'm pretty sure
it used to work, but refactorings and lack of integration tests made
it regress.
It probably regressed when we added the baked-in LetsEncrypt root cert
validation fallback code, which was testing against the wrong hostname
(the ultimate one, not the one which we were being asked to validate)
Fixes#16222
Change-Id: If014e395f830e2f87f056f588edacad5c15e91bc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Resolver.PreferGo didn't used to work on Windows.
It was fixed in 2022, though. (https://github.com/golang/go/issues/33097)
Updates #5161
Change-Id: I4e1aeff220ebd6adc8a14f781664fa6a2068b48c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This patch contains the following cleanups:
1. Simplify `ffcli.Command` definitions;
2. Word-wrap help text, consistent with other commands;
3. `tailscale dns --help` usage makes subcommand usage more obvious;
4. `tailscale dns query --help` describes DNS record types.
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
We aim to make the tsgo directories be read-only mounts on builders.
But gocross was previously writing within the ~/.cache/tsgo/$HASH/
directories to make the synthetic GOROOT directories.
This moves them to ~/.cache/tsgoroot/$HASH/ instead.
Updates tailscale/corp#28679
Updates tailscale/corp#26717
Change-Id: I0d17730bbdce3d6374e79d49486826575d4690af
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The caller of client.RunWatchConnectionLoop may need to be
aware of errors that occur within loop. Add a channel
that notifies of errors to the caller to allow for
decisions to be make as to the state of the client.
Updates tailscale/corp#25756
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Make the OS-specific staticcheck jobs only test stuff that's specialized
for that OS. Do that using a new ./tool/listpkgs program that's a fancy
'go list' with more filtering flags.
Updates tailscale/corp#28679
Change-Id: I790be2e3a0b42b105bd39f68c4b20e217a26de60
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Tests that go mod version matches ./tool/go version.
Mismatched versions result in incosistent Go versions being used i.e.
in CI jobs as the version in go.mod is used to determine what Go version
Github actions pull in.
Updates #16283
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
gocross is not needed like it used to be, now that Go does
version stamping itself.
We keep it for the xcode and Windows builds for now.
This simplifies things in the build, especially with upcoming build
system updates.
Updates tailscale/corp#28679
Updates tailscale/corp#26717
Change-Id: Ib4bebe6f50f3b9c3d6cd27323fca603e3dfb43cc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If natc is running on a host with tailscale using `--accept-dns=true`
then a DNS loop can occur. Provide a flag for some specific DNS
upstreams for natc to use instead, to overcome such situations.
Updates #14667
Signed-off-by: James Tucker <james@tailscale.com>
eventbus.Publish() calls newPublisher(), which in turn invokes (*Client).addPublisher().
That method adds the new publisher to c.pub, so we don’t need to add it again in eventbus.Publish.
Updates #cleanup
Signed-off-by: Nick Khyl <nickk@tailscale.com>
nodeBackend now publishes filter and node changes to eventbus topics
that are consumed by magicsock.Conn
Updates tailscale/corp#27502
Updates tailscale/corp#29543
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Ensure that if the ProxyGroup for HA Ingress changes, the TLS Secret
and Role and RoleBinding that allow proxies to read/write to it are
updated.
Fixes#16259
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
We update LocalBackend to shut down the current nodeBackend
when switching to a different node, and to mark the new node's
nodeBackend as ready when the switch completes.
Updates tailscale/corp#28014
Updates tailscale/corp#29543
Updates #12614
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This means the caller does not have to remember to close the reader, and avoids
having to duplicate the logic to decode JSON into events.
Updates #15160
Change-Id: I20186fabb02f72522f61d5908c4cc80b86b8936b
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Record dropped packets as soon as they time out, rather than after tx
record queues spill over, this will more accurately capture small
amounts of packet loss in a timely fashion.
Updates tailscale/corp#24522
Signed-off-by: James Tucker <james@tailscale.com>
The first packet fragment guard had an additional guard clause that was
incorrectly comparing a length in bytes to a length in octets, and was
also comparing what should have been an entire IPv4 through transport
header length to a subprotocol payload length. The subprotocol header
size guards were otherwise protecting against short transport headers,
as is the conservative non-first fragment minimum offset size. Add an
explicit disallowing of fragmentation for TSMP for the avoidance of
doubt.
Updates #cleanup
Updates #5727
Signed-off-by: James Tucker <james@tailscale.com>
During a short period of packet loss, a TCP connection to the home DERP
may be maintained. If no other regions emerge as winners, such as when
all regions but one are avoided/disallowed as candidates, ensure that
the current home region, if still active, is not dropped as the
preferred region until it has failed two keepalives.
Relatedly apply avoid and no measure no home to ICMP and HTTP checks as
intended.
Updates tailscale/corp#12894
Updates tailscale/corp#29491
Signed-off-by: James Tucker <james@tailscale.com>
The relay server now fetches IPs from local interfaces and external
perspective IP:port's via netcheck (STUN).
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Which can make operating the service more convenient.
It makes sense to put the cluster state with this if specified, so
rearrange the logic to handle that.
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
This enables us to mark nodes as relay capable or not. We don't actually
do that yet, as we haven't established a relay CapVer.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Add mesh key support to derpprobe for
probing derpers with verify set to true.
Move MeshKey checking to central point for code reuse.
Fix a bad error fmt msg.
Fixestailscale/corp#27294Fixestailscale/corp#25756
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
We already present a health warning about this, but it is easy to miss
on a server when blackholing traffic makes it unreachable.
In addition to a health warning, present a risk message when exit node
is enabled.
Example:
```
$ tailscale up --exit-node=lizard
The following issues on your machine will likely make usage of exit nodes impossible:
- interface "ens4" has strict reverse-path filtering enabled
- interface "tailscale0" has strict reverse-path filtering enabled
Please set rp_filter=2 instead of rp_filter=1; see https://github.com/tailscale/tailscale/issues/3310
To skip this warning, use --accept-risk=linux-strict-rp-filter
$
```
Updates #3310
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
It might complete, interrupting it reduces the chances of establishing a
relay path.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This is simply for consistency with relayManagerInputEvent(), which
should be the sole launcher of runLoop().
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
relayManager can now hand endpoint a relay epAddr for it to consider
as bestAddr.
endpoint and Conn disco ping/pong handling are now VNI-aware.
Updates tailscale/corp#27502
Updates tailscale/corp#29422
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This commit fixes the bug that c2n requests are skiped when updating vipServices in serveConfig. This then resulted
netmap update being skipped which caused inaccuracy of Capmap info on client side. After this fix, client always
inform control about it's vipServices config changes.
Fixestailscale/corp#29219
Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
This commit adds a new type to magicsock, epAddr, which largely ends up
replacing netip.AddrPort in packet I/O paths throughout, enabling
Geneve encapsulation over UDP awareness.
The conn.ReceiveFunc for UDP has been revamped to fix and more clearly
distinguish the different classes of packets we expect to receive: naked
STUN binding messages, naked disco, naked WireGuard, Geneve-encapsulated
disco, and Geneve-encapsulated WireGuard.
Prior to this commit, STUN matching logic in the RX path could swallow
a naked WireGuard packet if the keypair index, which is randomly
generated, happened to overlap with a subset of the STUN magic cookie.
Updates tailscale/corp#27502
Updates tailscale/corp#29326
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Fragmented datagrams would be processed instead of being dumped right
away. In reality, thse datagrams would be dropped anyway later so there
should functionally not be any change. Additionally, the feature is off
by default.
Closes#16203
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Also add a trailing newline to error banners so that SSH client messages don't print on the same line.
Updates tailscale/corp#29138
Signed-off-by: Percy Wegmann <percy@tailscale.com>
- Add tsidp target to build_docker.sh for standard Tailscale image builds
- Add publishdevtsidp Makefile target for development image publishing
- Remove Dockerfile, using standard build process
- Include tsidp in depaware dependency tracking
- Update README with comprehensive Docker usage examples
This enables tsidp to be built and published like other Tailscale components
(tailscale/tailscale, tailscale/k8s-operator, tailscale/k8s-nameserver).
Fixes#16077
Signed-off-by: Raj Singh <raj@tailscale.com>
Our conn.Bind implementation is updated to make Send() offset-aware for
future VXLAN/Geneve encapsulation support.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Fix CompareAndSwap in the edge-case where
the underlying sync.AtomicValue is uninitialized
(i.e., Store was never called) and
the oldV is the zero value,
then perform CompareAndSwap with any(nil).
Also, document that T must be comparable.
This is a pre-existing restriction.
Fixes#16135
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
In 1.84 we made 'tailscale set'/'tailscale up' error out if duplicate
command line flags are passed.
This broke some container configurations as we have two env vars that
can be used to set --accept-dns flag:
- TS_ACCEPT_DNS- specifically for --accept-dns
- TS_EXTRA_ARGS- accepts any arbitrary 'tailscale up'/'tailscale set'
flag.
We default TS_ACCEPT_DNS to false (to make the container behaviour more
declarative), which with the new restrictive CLI behaviour resulted in
failure for users who had set --accept-dns via TS_EXTRA_ARGS as the flag would be
provided twice.
This PR re-instates the previous behaviour by checking if TS_EXTRA_ARGS
contains --accept-dns flag and if so using its value to override TS_ACCEPT_DNS.
Updates tailscale/tailscale#16108
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
This adds SmallSet.SoleElement, which I need in another repo for
efficiency. I added tests, but those tests failed because Add(1) +
Add(1) was promoting the first Add's sole element to a map of one
item. So fix that, and add more tests.
Updates tailscale/corp#29093
Change-Id: Iadd5ad08afe39721ee5449343095e389214d8389
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Using WINHTTP_AUTOPROXY_ALLOW_AUTOCONFIG on Windows versions older than Windows 10 1703 (build 15063)
is not supported and causes WinHttpGetProxyForUrl to fail with ERROR_INVALID_PARAMETER. This results in failures
reaching the control on environments where a proxy is required.
We use wingoes version detection to conditionally set the WINHTTP_AUTOPROXY_ALLOW_AUTOCONFIG flag
on Windows builds greater than 15063.
While there, we also update proxy detection to use WINHTTP_AUTO_DETECT_TYPE_DNS_A, as DNS-based proxy discovery
might be required with Active Directory and in certain other environments.
Updates tailscale/corp#29168
Fixes#879
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The field must only be accessed while holding LocalBackend's mutex,
but there are two places where it's accessed without the mutex:
- (LocalBackend).MaybeClearAppConnector()
- handleC2NAppConnectorDomainRoutesGet()
Fixes#16123
Signed-off-by: Nick Khyl <nickk@tailscale.com>
As note in the comment, it now being more than six months since this was
deprecated and there being no (further) uses of the old pattern in our internal
services, let's drop the migrator.
Updates #cleanup
Change-Id: Ie4fb9518b2ca04a9b361e09c51cbbacf1e2633a8
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
fixestailscale/corp#25612
We now keep track of any dns configurations which we could not
compile. This gives RecompileDNSConfig a configuration to
attempt to recompile and apply when the OS pokes us to indicate
that the interface dns servers have changed/updated. The manager config
will remain unset until we have the required information to compile
it correctly which should eliminate the problematic SERVFAIL
responses (especially on macOS 15).
This also removes the missingUpstreamRecovery func in the forwarder
which is no longer required now that we have proper error handling
and recovery manager and the client.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
relayManager is responsible for disco ping/pong probing of relay
endpoints once a handshake is complete.
Future work will enable relayManager to set a relay endpoint as the best
UDP path on an endpoint if appropriate.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Fix the wireshark lua dissector to support 0 bit position
and not throw modulo div by 0 errors.
Add new disco frame types to the decoder.
Updates tailscale/corp#29036
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Add comprehensive web interface at ui for managing OIDC clients, similar to tsrecorder's design. Features include list view, create/edit forms with validation, client secret management, delete functionality with confirmation dialogs, responsive design, and restricted tailnet access only.
Fixes#16067
Signed-off-by: Raj Singh <raj@tailscale.com>
In accordance with the OIDC/OAuth 2.0 protocol, do not send an empty
refresh_token and instead omit the field when empty.
Fixes https://github.com/tailscale/tailscale/issues/16073
Signed-off-by: Tim Klocke <taaem@mailbox.org>
Previously, a missing or invalid `dns` parameter on GET `/dns-query`
returned only “missing ‘dns’ parameter”. Now the error message guides
users to use `?dns=` or `?q=`.
Updates: #16055
Signed-off-by: Zach Buchheit <zachb@tailscale.com>
Validate that any tags that users have specified via tailscale.com/tags
annotation are valid Tailscale ACL tags.
Validate that no more than one HA Tailscale Kubernetes Services in a single cluster refer
to the same Tailscale Service.
Updates tailscale/tailscale#16054
Updates tailscale/tailscale#16035
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
As noted in #16048, the ./ssh/tailssh package failed to build on
Android, because GOOS=android also matches the "linux" build
tag. Exclude Android like iOS is excluded from macOS (darwin).
This now works:
$ GOOS=android go install ./ipn/ipnlocal ./ssh/tailssh
The original PR at #16048 is also fine, but this stops the problem
earlier.
Updates #16048
Change-Id: Ie4a6f6966a012e510c9cb11dd0d1fa88c48fac37
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
RELNOTE=Fix CSRF errors in the client Web UI
Replace gorilla/csrf with a Sec-Fetch-Site based CSRF protection
middleware that falls back to comparing the Host & Origin headers if no
SFS value is passed by the client.
Add an -origin override to the web CLI that allows callers to specify
the origin at which the web UI will be available if it is hosted behind
a reverse proxy or within another application via CGI.
Updates #14872
Updates #15065
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
To authenticate mesh keys, the DERP servers used a simple == comparison,
which is susceptible to a side channel timing attack.
By extracting the mesh key for a DERP server, an attacker could DoS it
by forcing disconnects using derp.Client.ClosePeer. They could also
enumerate the public Wireguard keys, IP addresses and ports for nodes
connected to that DERP server.
DERP servers configured without mesh keys deny all such requests.
This patch also extracts the mesh key logic into key.DERPMesh, to
prevent this from happening again.
Security bulletin: https://tailscale.com/security-bulletins#ts-2025-003Fixestailscale/corp#28720
Signed-off-by: Simon Law <sfllaw@tailscale.com>
* control/controlclient,health,tailcfg: refactor control health messages
Updates tailscale/corp#27759
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
Signed-off-by: Paul Scott <408401+icio@users.noreply.github.com>
Co-authored-by: Paul Scott <408401+icio@users.noreply.github.com>
Registering a new store is cheap, it just adds a map entry. No need to
lazy-init it with sync.Once and an intermediate slice holding init
functions.
Updates #cleanup
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Create FileOps for calling platform-specific file operations such as SAF APIs in Taildrop
Update taildrop.PutFile to support both traditional and SAF modes
Updates tailscale/tailscale#15263
Signed-off-by: kari-ts <kari@tailscale.com>
Use of the httptest client doesn't render header ordering
as expected.
Use http.DefaultClient for the test to ensure that
the header ordering test is valid.
Updates tailscale/corp#27370
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
This type improves code clarity and reduces the chance of heap alloc as
we pass it as a non-pointer. VNI being a 3-byte value enables us to
track set vs unset via the reserved/unused byte.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Taildrop wasn't working on iOS since #15971 because GetExt didn't work
until after init, but that PR moved Init until after Start.
This makes GetExt work before LocalBackend.Start (ExtensionHost.Init).
Updates #15812
Change-Id: I6e87257cd97a20f86083a746d39df223e5b6791b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The same message was used for "up" and "down" permission failures, but
"set" works better for both. Suggesting "up --operator" for a "down"
permission failure was confusing.
It's not like the latter command works in one shot anyway.
Fixes#16008
Change-Id: I6e4225ef06ce2d8e19c40bece8104e254c2aa525
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This fixes the implementation and test from #15208 which apparently
never worked.
Ignore the metacert when counting the number of expected certs
presented.
And fix the test, pulling out the TLSConfig setup code into something
shared between the real cmd/derper and the test.
Fixes#15579
Change-Id: I90526e38e59f89b480629b415f00587b107de10a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This reconciler allows users to make applications highly available at L3 by
leveraging Tailscale Virtual Services. Many Kubernetes Service's
(irrespective of the cluster they reside in) can be mapped to a
Tailscale Virtual Service, allowing access to these Services at L3.
Updates #15895
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Adds Recorder fields to configure the name and annotations of the ServiceAccount
created for and used by its associated StatefulSet. This allows the created Pod
to authenticate with AWS without requiring a Secret with static credentials,
using AWS' IAM Roles for Service Accounts feature, documented here:
https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.htmlFixes#15875
Change-Id: Ib0e15c0dbc357efa4be260e9ae5077bacdcb264f
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
cmd/containerboot,kube/ingressservices: proxy VIPService TCP/UDP traffic to cluster Services
This PR is part of the work to implement HA for Kubernetes Operator's
network layer proxy.
Adds logic to containerboot to monitor mounted ingress firewall configuration rules
and update iptables/nftables rules as the config changes.
Also adds new shared types for the ingress configuration.
The implementation is intentionally similar to that for HA for egress proxy.
Updates tailscale/tailscale#15895
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
CallMeMaybeVia reception and endpoint allocation have been collapsed to
a single event channel. discoInfo caching for active relay handshakes
is now implemented.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Content-type was responding as test/plain for probes
accepting application/json. Set content type header
before setting the response code to correct this.
Updates tailscale/corp#27370
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Update proxy-to-grafana to strip any X-Webauth prefixed headers passed
by the client in *every* request, not just those to /login.
/api/ routes will also accept these headers to authenticate users,
necessitating their removal to prevent forgery.
Updates tailscale/corp#28687
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Currently, LocalBackend/ExtensionHost doesn't invoke the profile change callback for the initial profile.
Since the initial profile may vary depending on loaded extensions and applied policy settings,
it can't be reliably determined until all extensions are initialized. Additionally, some extensions
may asynchronously trigger a switch to the "best" profile (based on system state and policy settings) during
initialization.
We intended to address these issues as part of the ongoing profileManager/LocalBackend refactoring,
but the changes didn't land in time for the v1.84 release and the Taildrop refactoring.
In this PR, we update the Taildrop extension to retrieve the current profile at initialization time
and handle it as a profile change.
We also defer extension initialization until LocalBackend has started, since the Taildrop extension
already relies on this behavior (e.g., it requires clients to call SetDirectFileRoot before Init).
Fixes#15970
Updates #15812
Updates tailscale/corp#28449
Signed-off-by: Nick Khyl <nickk@tailscale.com>
`cmd/derpprobe --once` didn’t respect the convention of non-zero exit
status for a failed run. It would always exit zero (i.e. success),
even. This patch fixes that, but only for `--once` mode.
Fixes: #15925
Signed-off-by: Simon Law <sfllaw@tailscale.com>
In this PR, we make DNS registration behavior configurable via the EnableDNSRegistration policy setting.
We keep the default behavior unchanged, but allow admins to either enforce DNS registration and dynamic
DNS updates for the Tailscale interface, or prevent Tailscale from modifying the settings configured in
the network adapter's properties or by other means.
Updates #14917
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Add new rules to update DNAT rules for Kubernetes operator's
HA ingress where it's expected that rules will be added/removed
frequently (so we don't want to keep old rules around or rewrite
existing rules unnecessarily):
- allow deleting DNAT rules using metadata lookup
- allow inserting DNAT rules if they don't already
exist (using metadata lookup)
Updates tailscale/tailscale#15895
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: chaosinthecrd <tom@tmlabs.co.uk>
Commit 0841477 moved ServerEndpoint to an independent package. Move its
tests over as well.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
OCSP has been removed from the LE certs.
Use CRL verification instead.
If a cert provides a CRL, check its revocation
status, if no CRL is provided and otherwise
is valid, pass the check.
Fixes#15912
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Co-authored-by: Simon Law <sfllaw@tailscale.com>
In tailssh.go:1284, (*sshSession).startNewRecording starts a fire-and-forget goroutine that can
outlive the test that triggered its creation. Among other things, it uses ss.logf, and may call it
after the test has already returned. Since we typically use (*testing.T).Logf as the logger,
this results in a data race and causes flaky tests.
Ideally, we should fix the root cause and/or use a goroutines.Tracker to wait for the goroutine
to complete. But with the release approaching, it's too risky to make such changes now.
As a workaround, we update the tests to use tstest.WhileTestRunningLogger, which logs to t.Logf
while the test is running and disables logging once the test finishes, avoiding the race.
While there, we also fix TestSSHAuthFlow not to use log.Printf.
Updates #15568
Updates #7707 (probably related)
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We previously kept these methods in local.go when we started moving node-specific state
from LocalBackend to nodeBackend, to make those changes easier to review. But it's time
to move them to node_backend.go.
Updates #cleanup
Updates #12614
Signed-off-by: Nick Khyl <nickk@tailscale.com>
In this PR, we extract the in-process LocalAPI client/server implementation from ipn/ipnserver/server_test.go
into a new ipntest package to be used in high‑level black‑box tests, such as those for the tailscale CLI.
Updates #15575
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The event loop removes the need for growing locking complexities and
synchronization. Now we simply use channels. The event loop only runs
while there is active work to do.
relayManager remains no-op inside magicsock for the time being.
endpoints are never 'relayCapable' and therefore endpoint & Conn will
not feed CallMeMaybeVia or allocation events into it.
A number of relayManager events remain unimplemented, e.g.
CallMeMaybeVia reception and relay handshaking.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
In this PR, we make the "user-dial-routes" behavior default on all platforms except for iOS and Android.
It can be disabled by setting the TS_DNS_FORWARD_USE_ROUTES envknob to 0 or false.
Updates #12027
Updates #13837
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Set Cross-Origin-Opener-Policy: same-origin for all browser requests to
prevent window.location manipulation by malicious origins.
Updates tailscale/corp#28480
Thank you to Triet H.M. Pham for the report.
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Commit 4b525fdda (ssh/tailssh: only chdir incubator process to user's
homedir when necessary and possible, 2024-08-16) defers changing the
working directory until the incubator process drops its privileges.
However, it didn't account for the case where there is no incubator
process, because no tailscaled was found on the PATH. In that case, it
only intended to run `tailscaled be-child` in the root directory but
accidentally ran everything there.
Fixes: #15350
Signed-off-by: Simon Law <sfllaw@sfllaw.ca>
ServerEndpoint will be used within magicsock and potentially elsewhere,
which should be possible without needing to import the server
implementation itself.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
I'd moved the osshare calls to feature/taildrop hooks, but forgot to
remove them from ipnlocal, or lost them during a rebase.
But then I noticed cmd/tailscaled also had some, so turn those into a
hook.
Updates #12614
Change-Id: I024fb1d27fbcc49c013158882ee5982c2737037d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is an integration test that covers all the code in Direct, Auto, and
LocalBackend that processes NetMaps and creates a Filter. The test uses
tsnet as a convenient proxy for setting up all the client pieces correctly,
but is not actually a test specific to tsnet.
Updates tailscale/corp#20514
Signed-off-by: James Sanderson <jsanderson@tailscale.com>
Android is Linux, but doesn't use Linux DNS managers (or D-Bus).
Updates #12614
Change-Id: I487802ac74a259cd5d2480ac26f7faa17ca8d1c3
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This handle can be used in tests and debugging to identify the specific
client connection.
Updates tailscale/corp#28368
Change-Id: I48cc573fc0bcf018c66a18e67ad6c4f248fb760c
Signed-off-by: Brian Palmer <brianp@tailscale.com>
Android is Linux, but that not much Linux.
Updates #12614
Change-Id: Ice80bd3e3d173511c30d05a43d25a31e18928db7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
None of them are applicable to the common tsnet use cases.
If somebody wants one of them, they can empty import it.
Updates #12614
Change-Id: I3d7f74b555eed22e05a09ad667e4572a5bc452d8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For consistency with other flags, per Slack chat.
Updates #5902
Change-Id: I7ae1e4c97b37185573926f5fafda82cf8b46f071
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The defaultEnv and defaultBool functions are copied over temporarily
to minimise diff. This lays the ground work for having both the operator
and the new k8s-proxy binary implement the API proxy
Updates #13358
Change-Id: Ieacc79af64df2f13b27a18135517bb31c80a5a02
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Until we turn on AAAA by default (which might make some people rely on
Happy Eyeballs for targets without IPv6), this lets people turn it on
explicitly if they want.
We still should add a peer cap as well in the future to let a peer
explicitly say that it's cool with IPv6.
Related: #9574
Updates #1813
Updates #1152
Change-Id: Iec6ec9b4b5db7a4dc700ecdf4a11146cc5303989
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This is a hack, but should suffice and be fast enough.
I really want to figure out what's keeping that writable fd open.
Fixes#15868
Change-Id: I285d836029355b11b7467841d31432cc5890a67e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Cleanup after #15866. It was using a mix of "b" and "c" before. But "b"
is ambiguous with LocalBackend's usual "b".
Updates #12614
Change-Id: I8c2e84597555ec3db0d783a00ac1c12549ce6706
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
As just discussed on Slack with @nickkhyl.
Updates #12614
Change-Id: I138dd7eaffb274494297567375d969b4122f3f50
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
relayManager will eventually be responsible for handling the allocation
and handshaking of UDP relay server endpoints.
relay servers are endpoint-independent, and Conn must already maintain
handshake state for all endpoints. This justifies a new data structure
to fill these roles.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Previously all tests shared their tailscale+tailscaled binaries in
system /tmp directories, which often leaked, and required TestMain to
clean up (which feature/taildrop didn't use).
This makes it use testing.T.TempDir for the binaries, but still only
builds them once and efficiently as possible depending on the OS
copies them around between each test's temp dir.
Updates #15812
Change-Id: I0e2585613f272c3d798a423b8ad1737f8916f527
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Conn.handleDiscoMessage() now makes a distinction between relay
handshake disco messages and peer disco messages.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Taildrop has never had an end-to-end test since it was introduced.
This adds a basic one.
It caught two recent refactoring bugs & one from 2022 (0f7da5c7dc).
This is prep for moving the rest of Taildrop out of LocalBackend, so
we can do more refactorings with some confidence.
Updates #15812
Change-Id: I6182e49c5641238af0bfdd9fea1ef0420c112738
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This fixes a refactoring bug introduced in 8b72dd7873
Tests (that failed on this) are coming in a separate change.
Updates #15812
Change-Id: Ibbf461b4eaefe22ad3005fc243d0a918e8af8981
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
* util/linuxfw: fix delete snat rule
This pr is fixing the bug that in nftables mode setting snat-subnet-routes=false doesn't
delete the masq rule in nat table.
Updates #15661
Signed-off-by: Kevin Liang <kevinliang@tailscale.com>
* change index arithmetic in test to chunk
Signed-off-by: Kevin Liang <kevinliang@tailscale.com>
* reuse rule creation function in rule delete
Signed-off-by: Kevin Liang <kevinliang@tailscale.com>
* add test for deleting the masq rule
Signed-off-by: Kevin Liang <kevinliang@tailscale.com>
---------
Signed-off-by: Kevin Liang <kevinliang@tailscale.com>
This fixes the Taildrop deadlock from 8b72dd7873.
Fixes#15824
Change-Id: I5ca583de20dd0d0b513ce546439dc632408ca1f1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Conn.sendDiscoMessage() now verifies if the destination disco key is
associated with any known peer(s) in a thread-safe manner.
Updates #15844
Signed-off-by: Jordan Whited <jordan@tailscale.com>
And also validate opts for unknown types, before other side effects.
Fixes#15833
Change-Id: I4cabe16c49c5b7566dcafbec59f2cd1e0c8b4b3c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Instead of using the version package (which depends on
tailcfg.CurrentCapabilityVersion) to get the git commit hash, do it
directly using debug.BuildInfo. This way, when changing struct fields in
tailcfg, we can successfully `go generate` it without compiler errors.
Updates #9634
Updates https://github.com/tailscale/corp/issues/26717
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
TS_CONTROL_IS_PLAINTEXT_HTTP no longer does anything as of
8fd471ce57
Updates #13597
Change-Id: I32ae7f8c5f2a2632e80323b1302a36295ee00736
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In prep for Taildrop integration tests using them from another package.
Updates #15812
Change-Id: I6a995de4e7400658229d99c90349ad5bd1f503ae
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
So it can be exported & used by other packages in future changes.
Updates #15812
Change-Id: I319000989ebc294e29c92be7f44a0e11ae6f7761
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We were missing this metric, but it can be important for some workloads.
Varz memstats output allocation cost reduced from 30 allocs per
invocation to 1 alloc per invocation.
Updates tailscale/corp#28033
Signed-off-by: James Tucker <james@tailscale.com>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We update profileManager to allow registering a single state (profile+prefs) change hook.
This is to invert the dependency between the profileManager and the LocalBackend, so that
instead of LocalBackend asking profileManager for the state, we can have profileManager
call LocalBackend when the state changes.
We also update feature.Hook with a new (*feature.Hook).GetOk method to avoid calling both
IsSet and Get.
Updates tailscale/corp#28014
Updates #12614
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This further minimizes the number of places where the profile manager updates the current profile and prefs.
We also document a scenario where an implicit profile switch can occur.
We should be able to address it after (partially?) inverting the dependency between
LocalBackend and profileManager, so that profileManager notifies LocalBackend
of profile changes instead of the other way around.
Updates tailscale/corp#28014
Updates #12614
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The previous strategy assumed clients maintained adequate state to
understand the relationship between endpoint allocation and the server
it was allocated on.
magicsock will not have awareness of the server's disco key
pre-allocation, it only understands peerAPI address at this point. The
second client to allocate on the same server could trigger
re-allocation, breaking a functional relay server endpoint.
If magicsock needs to force reallocation we can add opt-in behaviors
for this later.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
We had an ordered set type (set.Slice) already but we occasionally want
to do the same thing with a map, preserving the order things were added,
so add that too, as mapsx.OrderedMap[K, V], and then use in ipnext.
Updates #12614
Change-Id: I85e6f5e11035571a28316441075e952aef9a0863
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Now that 25c4dc5fd7 removed unregistering hooks and made them into
slices, just expose the slices and remove the setter funcs.
This removes boilerplate ceremony around adding new hooks.
This does export the hooks and make them mutable at runtime in theory,
but that'd be a data race. If we really wanted to lock it down in the
future we could make the feature.Hooks slice type be an opaque struct
with an All() iterator and a "frozen" bool and we could freeze all the
hooks after init. But that doesn't seem worth it.
This means that hook registration is also now all in one place, rather
than being mixed into ProfilesService vs ipnext.Host vs FooService vs
BarService. I view that as a feature. When we have a ton of hooks and
the list is long, then we can rearrange the fields in the Hooks struct
as needed, or make sub-structs, or big comments.
Updates #12614
Change-Id: I05ce5baa45a61e79c04591c2043c05f3288d8587
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The typical way to implement union types in Go
is to use an interface where the set of types is limited.
However, there historically has been poor support
in v1 "encoding/json" with interface types where
you can marshal such values, but fail to unmarshal them
since type information about the concrete type is lost.
The MakeInterfaceCoders function constructs custom
marshal/unmarshal functions such that the type name
is encoded in the JSON representation.
The set of valid concrete types for an interface
must be statically specified for this to function.
Updates tailscale/corp#22024
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
These were likely added after everything else was updated to use tsd.NewSystem,
in a feature branch, and before it was merged back into main.
Updates #15160
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Both are populated from the current netmap's MagicDNSSuffix.
But building a full ipnstate.Status (with peers!) is expensive and unnecessary.
Updates #cleanup
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Replace all instances of interface{} with any to resolve the
golangci-lint errors that appeared in the previous tsidp PR.
Updates #cleanup
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
tstime.GoDuration JSON serializes with time.Duration.String(), which is
more human-friendly than nanoseconds.
ServerEndpoint is currently experimental, therefore breaking changes
are tolerable.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
The encoding/json/v2 effort may end up changing
the default represention of time.Duration in JSON.
See https://go.dev/issue/71631
The GoDuration type allows us to explicitly use
the time.Duration.String representation regardless of
whether we serialize with v1 or v2 of encoding/json.
Updates tailscale/corp#27502
Signed-off-by: Joe Tsai <joetsai@digital-static.net>
We added this helper in 1e2e319e7d. Remove this copy.
Updates #cleanup
Change-Id: I5b0681acc23692beed35951c9902ac9ceca0a8b9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The relay server is still permanently disabled until node attribute
changes are wired up in a future commit.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
QNAP now requires builds to be signed with an HSM.
This removes support for signing with a local keypair.
This adds support for signing with a Google Cloud hosted key.
The key should be an RSA key with protection level `HSM` and that uses PSS padding and a SHA256 digest.
The GCloud project, keyring and key name are passed in as command-line arguments.
The GCloud credentials and the PEM signing certificate are passed in as Base64-encoded command-line arguments.
Updates tailscale/corp#23528
Signed-off-by: Percy Wegmann <percy@tailscale.com>
This adds a feature/taildrop package, a ts_omit_taildrop build tag,
and starts moving code to feature/taildrop. In some cases, code
remains where it was but is now behind a build tag. Future changes
will move code to an extension and out of LocalBackend, etc.
Updates #12614
Change-Id: Idf96c61144d1a5f707039ceb2ff59c99f5c1642f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
When an event bus is plumbed in, use it to subscribe and react to port mapping
updates instead of using the client's callback mechanism. For now, the callback
remains available as a fallback when an event bus is not provided.
Updates #15160
Change-Id: I026adca44bf6187692ee87ae8ec02641c12f7774
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
When an event bus is configured publish an event each time a new port mapping
is updated. Publication is unconditional and occurs prior to calling any
callback that is registered. For now, the callback is still fired in a separate
goroutine as before -- later, those callbacks should become subscriptions to
the published event.
For now, the event type is defined as a new type here in the package. We will
want to move it to a more central package when there are subscribers. The event
wrapper is effectively a subset of the data exported by the internal mapping
interface, but on a concrete struct so the bus plumbing can inspect it.
Updates #15160
Change-Id: I951f212429ac791223af8d75b6eb39a0d2a0053a
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
In preparation for adding more parameters (and later, moving some away), rework
the portmapper constructor to accept its arguments on a Config struct rather
than positionally.
This is a breaking change to the function signature, but one that is very easy
to update, and a search of GitHub reveals only six instances of usage outside
clones and forks of Tailscale itself, that are not direct copies of the code
fixed up here.
While we could stub in another constructor, I think it is safe to let those
folks do the update in-place, since their usage is already affected by other
changes we can't test for anyway.
Updates #15160
Change-Id: I9f8a5e12b38885074c98894b7376039261b43f43
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Although, at the moment, we do not yet require an event bus to be present, as
we start to add more pieces we will want to ensure it is always available. Add
a new constructor and replace existing uses of new(tsd.System) throughout.
Update generated files for import changes.
Updates #15160
Change-Id: Ie5460985571ade87b8eac8b416948c7f49f0f64b
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This feature is "registered" as an ipnlocal.Extension, and
conditionally linked depending on GOOS and ts_omit_relayserver build
tag.
The feature is not linked on iOS in attempt to limit the impact to
binary size and resulting effect of pushing up against NetworkExtension
limits. Eventually we will want to support the relay server on iOS,
specifically on the Apple TV. Apple TVs are well-fitted to act as
underlay relay servers as they are effectively always-on servers.
This skeleton begins to tie a PeerAPI endpoint to a net/udprelay.Server.
The PeerAPI endpoint is currently no-op as
extension.shouldRunRelayServer() always returns false. Follow-up commits
will implement extension.shouldRunRelayServer().
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Bump to latest 22.x LTS release for node as the 18.x line is going EOL this month.
Updates https://github.com/tailscale/corp/issues/27737
Signed-off-by: Mario Minardi <mario@tailscale.com>
https://github.com/tailscale/tailscale/pull/15395 changed the logic to
skip `EditPrefs` when the platform doesn't support auto-updates. But the
old logic would only fail `EditPrefs` if the auto-update value was
`true`. If it was `false`, `EditPrefs` would succeed and store `false`
in prefs. The new logic will keep the value `unset` even if the tailnet
default is `false`.
Fixes#15691
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
In this PR, we enable extensions to track changes in the current prefs. These changes can result from a profile switch
or from the user or system modifying the current profile’s prefs. Since some extensions may want to distinguish between
the two events, while others may treat them similarly, we rename the existing profile-change callback to become
a profile-state-change callback and invoke it whenever the current profile or its preferences change. Extensions can still
use the sameNode parameter to distinguish between situations where the profile information, including its preferences,
has been updated but still represents the same tailnet node, and situations where a switch to a different profile has been made.
Having dedicated prefs-change callbacks is being considered, but currently seems redundant. A single profile-state-change callback
is easier to maintain. We’ll revisit the idea of adding a separate callback as we progress on extracting existing features from LocalBackend,
but the conversion to a profile-state-change callback is intended to be permanent.
Finally, we let extensions retrieve the current prefs or profile state (profile info + prefs) at any time using the new
CurrentProfileState and CurrentPrefs methods. We also simplify the NewControlClientCallback signature to exclude
profile prefs. It’s optional, and extensions can retrieve the current prefs themselves if needed.
Updates #12614
Updates tailscale/corp#27645
Updates tailscale/corp#26435
Updates tailscale/corp#27502
Signed-off-by: Nick Khyl <nickk@tailscale.com>
This change introduces an Age column in the output for all custom
resources to enhance visibility into their lifecycle status.
Fixes#15499
Signed-off-by: satyampsoni <satyampsoni@gmail.com>
[G,S]etWindowLongPtrW are not available on 32-bit Windows, where [G,S]etWindowLongW should be used instead.
The initial revision of #14945 imported the win package for calling and other Win32 API functions, which exported
the correct API depending on the platform. However, the same logic wasn't implemented when we removed
the win package dependency in a later revision, resulting in panics on Windows 10 x86 (there's no 32-bit Windows 11).
In this PR, we update the ipn/desktop package to use either [G,S]etWindowLongPtrW or [G,S]etWindowLongW
depending on the platform.
Fixes#15684
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Query for the const quad-100 reverse DNS name, for which a forward
record will also be served. This test was previously dependent on
search domain behavior, and now it is not.
Updates #15607
Signed-off-by: Jordan Whited <jordan@tailscale.com>
updates tailscale/tailscale#13476
On darwin, os.Hostname is no longer reliable when called
from a sandboxed process. To fix this, we will allow clients
to set an optional callback to query the hostname via an
alternative native API.
We will leave the default implementation as os.Hostname since
this works perfectly well for almost everything besides sandboxed
darwin clients.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Allow builds to be outputted to a specific directory.
By default, or if unset, artifacts are written to PWD/dist.
Updates tailscale/corp#27638
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
In this PR, we add two methods to facilitate extension lookup by both extensions,
and non-extensions (e.g., PeerAPI or LocalAPI handlers):
- FindExtensionByName returns an extension with the specified name.
It can then be type asserted to a given type.
- FindMatchingExtension is like errors.As, but for extensions.
It returns the first extension that matches the target type (either a specific extension
or an interface).
Updates tailscale/corp#27645
Updates tailscale/corp#27502
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Because we derive v6 addresses from v4 addresses we only need to store
the v4 address, not both.
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
In this PR, we refactor the LocalBackend extension system, moving from direct callbacks to a more organized extension host model.
Specifically, we:
- Extract interface and callback types used by packages extending LocalBackend functionality into a new ipn/ipnext package.
- Define ipnext.Host as a new interface that bridges extensions with LocalBackend.
It enables extensions to register callbacks and interact with LocalBackend in a concurrency-safe, well-defined, and controlled way.
- Move existing callback registration and invocation code from ipnlocal.LocalBackend into a new type called ipnlocal.ExtensionHost,
implementing ipnext.Host.
- Improve docs for existing types and methods while adding docs for the new interfaces.
- Add test coverage for both the extracted and the new code.
- Remove ipn/desktop.SessionManager from tsd.System since ipn/desktop is now self-contained.
- Update existing extensions (e.g., ipn/auditlog and ipn/desktop) to use the new interfaces where appropriate.
We're not introducing new callback and hook types (e.g., for ipn.Prefs changes) just yet, nor are we enhancing current callbacks,
such as by improving conflict resolution when more than one extension tries to influence profile selection via a background profile resolver.
These further improvements will be submitted separately.
Updates #12614
Updates tailscale/corp#27645
Updates tailscale/corp#26435
Updates tailscale/corp#18342
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Adds a new diagram for ProxyGroups running in Ingress mode.
Documentation is currently not publicly available, but a link needs
adding once it is.
Updates tailscale/corp#24795
Change-Id: I0d5dd6bf6f0e1b8b0becae848dc97d8b4bfb9ccb
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Ultimately, we'd like to get rid of the concept of the "current user". It is only used on Windows,
but even then it doesn't work well in multi-user and enterprise/managed Windows environments.
In this PR, we update LocalBackend and profileManager to decouple them a bit more from this obsolete concept.
This is done in a preparation for extracting ipnlocal.Extension-related interfaces and types, and using them
to implement optional features like tailscale/corp#27645, instead of continuing growing the core ipnlocal logic.
Notably, we rename (*profileManager).SetCurrentUserAndProfile() to SwitchToProfile() and change its signature
to accept an ipn.LoginProfileView instead of an ipn.ProfileID and ipn.WindowsUserID. Since we're not removing
the "current user" completely just yet, the method sets the current user to the owner of the target profile.
We also update the profileResolver callback type, which is typically implemented by LocalBackend extensions,
to return an ipn.LoginProfileView instead of ipn.ProfileID and ipn.WindowsUserID.
Updates tailscale/corp#27645
Updates tailscale/corp#18342
Signed-off-by: Nick Khyl <nickk@tailscale.com>
ResourceCheck was previously using cmp.Diff on multiline goroutine stacks
The produced output was difficult to read for a number of reasons:
- the goroutines were sorted by count, and a changing count caused them to
jump around
- diffs would be in the middle of stacks
Instead, we now parse the pprof/goroutines?debug=1 format goroutines and
only diff whole stacks.
Updates #1253
Signed-off-by: Paul Scott <paul@tailscale.com>
Default tags to `$TAGS` if set, so that people can choose arbitrary
subsets of features.
Updates #12614
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Fix the index out of bound panic when a request is made to the local
fileserver mux with a valid secret-token, but missing share name.
Example error:
http: panic serving 127.0.0.1:40974: runtime error: slice bounds out of range [2:1]
Additionally, we document the edge case behavior of utilities that
this fileserver mux depends on.
Signed-off-by: Craig Hesling <craig@hesling.com>
This makes sure that the log target override is respected even if a
custom HTTP client is passed to logpolicy.
Updates tailscale/maple#29
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
The http.StatusMethodNotAllowed status code was being erroneously
set instead of http.StatusBadRequest in multiple places.
Updates #cleanup
Signed-off-by: Jordan Whited <jordan@tailscale.com>
In this PR, we update the Windows client updater to:
- Run msiexec with logging enabled and preserve the log file in %ProgramData%\Tailscale\Logs;
- Preserve the updater's own log file in the same location;
- Properly handle ERROR_SUCCESS_REBOOT_REQUIRED, ERROR_SUCCESS_REBOOT_INITIATED, and ERROR_INSTALL_ALREADY_RUNNING exit codes. The first two values indicate that installation
completed successfully and no further retries are needed. The last one means the Windows Installer
service is busy. Retrying immediately is likely to fail and may be risky; it could uninstall the current version
without successfully installing the new one, potentially leaving the user without Tailscale.
Updates tailscale/corp#27496
Updates tailscale#15554
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Installer tests only run when changes are made to pkgserve. This PR schedules
these tests to be run daily and report any failures to Slack.
Fixestailscale/corp#19103
Signed-off-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com>
This flag is currently no-op and hidden. The flag does round trip
through the related pref. Subsequent commits will tie them to
net/udprelay.Server. There is no corresponding "tailscale up" flag,
enabling/disabling of the relay server will only be supported via
"tailscale set".
This is a string flag in order to support disablement via empty string
as a port value of 0 means "enable the server and listen on a random
unused port". Disablement via empty string also follows existing flag
convention, e.g. advertise-routes.
Early internal discussions settled on "tailscale set --relay="<port>",
but the author felt this was too ambiguous around client vs server, and
may cause confusion in the future if we add related flags.
Updates tailscale/corp#27502
Signed-off-by: Jordan Whited <jordan@tailscale.com>
`tailscale set` was created to set preferences, which used to be
overloaded into `tailscale up`. To move people over to the new
command, `up` was supposed to be frozen and no new preference flags
would be added. But people forgot, there was no test to warn them, and
so new flags were added anyway.
TestUpFlagSetIsFrozen complains when new flags are added to
`tailscale up`. It doesn’t try all combinations of GOOS, but since
the CI builds in every OS, the pull-request tests should cover this.
Updates #15460
Signed-off-by: Simon Law <sfllaw@sfllaw.ca>
Ensure no services are advertised as part of shutting down tailscaled.
Prefs are only edited if services are currently advertised, and they're
edited we wait for control's ~15s (+ buffer) delay to failover.
Note that editing prefs will trigger a synchronous write to the state
Secret, so it may fail to persist state if the ProxyGroup is getting
scaled down and therefore has its RBAC deleted at the same time, but that
failure doesn't stop prefs being updated within the local backend,
doesn't affect connectivity to control, and the state Secret is
about to get deleted anyway, so the only negative side effect is a harmless
error log during shutdown. Control still learns that the node is no
longer advertising the service and triggers the failover.
Note that the first version of this used a PreStop lifecycle hook, but
that only supports GET methods and we need the shutdown to trigger side
effects (updating prefs) so it didn't seem appropriate to expose that
functionality on a GET endpoint that's accessible on the k8s network.
Updates tailscale/corp#24795
Change-Id: I0a9a4fe7a5395ca76135ceead05cbc3ee32b3d3c
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
As IPv4 and IPv6 end up with different MSS and different congestion
control strategies, proxying between them can really amplify TCP
meltdown style conditions in many real world network conditions, such as
with higher latency, some loss, etc.
Attempt to match up the protocols, otherwise pick a destination address
arbitrarily. Also shuffle the target address to spread load across
upstream load balancers.
Updates #15367
Signed-off-by: James Tucker <james@tailscale.com>
When we have an old cert that is being rotated, include it in the order.
If we're in the ARI-recommended rotation window, LE should exclude us
from rate limits. If we're not within that window, the order still
succeeds, so there's no risk in including the old cert.
Fixes#15542
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
The test suite had grown to about 20s on my machine, but it doesn't
do much taxing work so was a good candidate to parallelise. Now runs
in under 2s on my machine.
Updates #cleanup
Change-Id: I2fcc6be9ca226c74c0cb6c906778846e959492e4
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
The earlier #15534 prevent some dup string flags. This does it for all
flag types.
Updates #6813
Change-Id: Iec2871448394ea9a5b604310bdbf7b499434bf01
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
tsconsensus enables tsnet.Server instances to form a consensus.
tsconsensus wraps hashicorp/raft with
* the ability to do discovery via tailscale tags
* inter node communication over tailscale
* routing of commands to the leader
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
So we can link open source contributors to it.
Updates #cleanup
Change-Id: I02f612b38db9594f19b3be5d982f58c136120e9a
Co-authored-by: James Sanderson <jsanderson@tailscale.com>
Co-authored-by: Will Norris <will@tailscale.com>
Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Some CLI flags support multiple values separated by commas. These flags
are intended to be declared only once and will silently ignore subsequent
instances. This will now throw an error if multiple instances of advertise-tags
and advertise-routes are detected.
Fixes#6813
Signed-off-by: Jason O'Donnell <2160810+jasonodonnell@users.noreply.github.com>
Ensure that the upstream is always queried, so that if upstream is going
to NXDOMAIN natc will also return NXDOMAIN rather than returning address
allocations.
At this time both IPv4 and IPv6 are still returned if upstream has a
result, regardless of upstream support - this is ~ok as we're proxying.
Rewrite the tests to be once again slightly closer to integration tests,
but they're still very rough and in need of a refactor.
Further refactors are probably needed implementation side too, as this
removed rather than added units.
Updates #15367
Signed-off-by: James Tucker <james@tailscale.com>
This adds netx.DialFunc, unifying a type we have a bazillion other
places, giving it now a nice short name that's clickable in
editors, etc.
That highlighted that my earlier move (03b47a55c7) of stuff from
nettest into netx moved too much: it also dragged along the memnet
impl, meaning all users of netx.DialFunc who just wanted netx for the
type definition were instead also pulling in all of memnet.
So move the memnet implementation netx.Network into memnet, a package
we already had.
Then use netx.DialFunc in a bunch of places. I'm sure I missed some.
And plenty remain in other repos, to be updated later.
Updates tailscale/corp#27636
Change-Id: I7296cd4591218e8624e214f8c70dab05fb884e95
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
I added yet another one in 6d117d64a2 but that new one is at the
best place int he dependency graph and has the best name, so let's use
that one for everything possible.
types/lazy can't use it for circular dependency reasons, so unexport
that copy at least.
Updates #cleanup
Change-Id: I25db6b6a0d81dbb8e89a0a9080c7f15cbf7aa770
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
We want to be able to use the netx.Network (and RealNetwork
implemementation) outside of tests, without linking "testing".
So split out the non-test stuff of nettest into its own package.
We tend to use "foox" as the convention for things we wish were in the
standard library's foo package, so "netx" seems consistent.
Updates tailscale/corp#27636
Change-Id: I1911d361f4fbdf189837bf629a20f2ebfa863c44
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This fixes a bug in the local client where the DELETE request was
not being sent correctly. The route was missing a slash before the url
and this now matches the switch profile function.
Signed-off-by: Esteban-Bermudez <esteban@bermudezaguirre.com>
To avoid ephemeral port / TIME_WAIT exhaustion with high --count
values, and to eventually detect leaked connections in tests. (Later
the memory network will register a Cleanup on the TB to verify that
everything's been shut down)
Updates tailscale/corp#27636
Change-Id: Id06f1ae750d8719c5a75d871654574a8226d2733
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
For future in-memory network changes (#15558) to be able to be
stricter and do automatic leak detection when it's safe to do so, in
non-parallel tests.
Updates tailscale/corp#27636
Change-Id: I50f03b16a3f92ce61a7ed88264b49d8c6628f638
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Make the perPeerState objects able to function independently without a
shared reference to the connector.
We don't currently change the values from connector that perPeerState
uses at runtime. Explicitly copying them at perPeerState creation allows
us to, for example, put the perPeerState into a consensus algorithm in
the future.
Updates #14667
Signed-off-by: Fran Bull <fran@tailscale.com>
This shouldn't be necessary, but while we're continuing to figure out
the root cause, this is better than having to restart the app or switch
profiles on the command line.
Updates #15528
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Android >=14 forbids the use of netlink sockets, and in some configurations
can kill apps that try.
Fixes#9836
Signed-off-by: David Anderson <dave@tailscale.com>
The regular android app constructs its own wgengine with
additional FFI shims, so this default codepath only affects
other handcrafted buids like tsnet, which do not let the
caller customize the innards of wgengine.
Android >=14 forbids the use of netlink sockets, which makes
the standard linux router fail to initialize.
Fixes#9836
Signed-off-by: David Anderson <dave@tailscale.com>
So we can link tailscale and tailscaled together into one.
Updates #5794
Change-Id: I9a8b793c64033827e4188931546cbd64db55982e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
To ease local debugging and have fewer moving pieces while bringing up
Plan 9 support.
Updates #5794
Change-Id: I2dc98e73bbb0d4d4730dc47203efc0550a0ac0a0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Otherwise this was repeated closing control/derp connections all the time
on netmon changes. Arguably we should do this on all platforms?
Updates #5794
Change-Id: If6bbeff554235f188bab2a40ab75e08dd14746b2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This wasn't right; it was spinning up new goroutines non-stop.
Revert to a boring localhost TCP implementation for now.
Updates #5794
Change-Id: If93caa20a12ee4e741c0c72b0d91cc0cc5870152
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Not currently used in the OSS tree, a View for tailcfg.VIPService will
make implementing some server side changes easier.
Updates tailscale/corp#26272
Change-Id: If1ed0bea4eff8c4425d3845b433a1c562d99eb9e
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
Avoid the unbounded runtime during random allocation, if random
allocation fails after a first pass at random through the provided
ranges, pick the next free address by walking through the allocated set.
The new ipx utilities provide a bitset based allocation pool, good for
small to moderate ranges of IPv4 addresses as used in natc.
Updates #15367
Signed-off-by: James Tucker <james@tailscale.com>
fixestailscale/corp#27506
The source address link selection on sandboxed macOS doesn't deal
with loopback addresses correctly. This adds an explicit check to ensure
we return the loopback interface for loopback addresses instead of the
default empty interface.
Specifically, this allows the dns resolver to route queries to a loopback
IP which is a common tactic for local DNS proxies.
Tested on both macos, macsys and tailscaled. Forwarded requests to
127/8 all bound to lo0.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
This commit implements an experimental UDP relay server. The UDP relay
server leverages the Disco protocol for a 3-way handshake between
client and server, along with 3 new Disco message types for said
handshake. These new Disco message types are also considered
experimental, and are not yet tied to a capver.
The server expects, and imposes, a Geneve (Generic Network
Virtualization Encapsulation) header immediately following the underlay
UDP header. Geneve protocol field values have been defined for Disco
and WireGuard. The Geneve control bit must be set for the handshake
between client and server, and unset for messages relayed between
clients through the server.
Updates tailscale/corp#27101
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Add the golang-image-ico package, which is an incredibly small package
to handle the ICO container format with PNG inside. Some profile photos
look quite pixelated when displayed at this size, but it's better than
nothing, and any Windows support is just a bonus anyway.
Updates #1708
Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
Otherwise you can get stuck finding minor ones nonstop.
Fixes#15484
Change-Id: I7f98ac338c0b32ec1b9fdc47d053207b5fc1bf23
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
It only affected js/wasm and tamago.
Updates tailscale/corp#24697
Change-Id: I8fd29323ed9b663fe3fd8d4a86f26ff584a3e134
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
initPeerAPIListener may be returning early unexpectedly. Add debug logging to
see what causes it to return early when it does.
Updates #14393
Signed-off-by: Percy Wegmann <percy@tailscale.com>
If we previously knew of macaddresses of a node, and they
suddenly goes to zero, ignore them and return the previous
hardware addresses.
Updates tailscale/corp#25168
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
For hooking up websocket VM clients to natlab.
Updates #13038
Change-Id: Iaf728b9146042f3d0c2d3a5e25f178646dd10951
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Re-enable HA Ingress again that was disabled for 1.82 release.
This reverts commit fea74a60d5.
Updates tailscale/corp#24795
Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Not all platforms have hardlinks, or not easily.
This lets a "tailscale" wrapper script set an environment variable
before calling tailscaled.
Updates #2233
Change-Id: I9eccc18651e56c106f336fcbbd0fd97a661d312e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In this PR, we update ipnlocal.LocalBackend to allow registering callbacks for control client creation
and profile changes. We also allow to register ipnauth.AuditLogFunc to be called when an auditable
action is attempted.
We then use all this to invert the dependency between the auditlog and ipnlocal packages and make
the auditlog functionality optional, where it only registers its callbacks via ipnlocal-provided hooks
when the auditlog package is imported.
We then underscore-import it when building tailscaled for Windows, and we'll explicitly
import it when building xcode/ipn-go-bridge for macOS. Since there's no default log-store
location for macOS, we'll also need to call auditlog.SetStoreFilePath to specify where
pending audit logs should be persisted.
Fixes#15394
Updates tailscale/corp#26435
Updates tailscale/corp#27012
Signed-off-by: Nick Khyl <nickk@tailscale.com>
LocalBackend transitions to ipn.NoState when switching to a different (or new) profile.
When this happens, we should unconfigure wgengine to clear routes, DNS configuration,
firewall rules that block all traffic except to the exit node, etc.
In this PR, we update (*LocalBackend).enterStateLockedOnEntry to do just that.
Fixes#15316
Updates tailscale/corp#23967
Signed-off-by: Nick Khyl <nickk@tailscale.com>
The default values for `tailscale up` and `tailscale set` are supposed
to agree for all common flags. But they don’t for `--accept-routes`
on Windows and from the Mac OS App Store, because `tailscale up`
computes this value based on the operating system:
user@host:~$ tailscale up --help 2>&1 | grep -A1 accept-routes
--accept-dns, --accept-dns=false
accept DNS configuration from the admin panel (default true)
user@host:~$ tailscale set --help 2>&1 | grep -A1 accept-routes
--accept-dns, --accept-dns=false
accept DNS configuration from the admin panel
Luckily, `tailscale set` uses `ipn.MaskedPrefs`, so the default values
don’t logically matter. But someone will get the wrong idea if they
trust the `tailscale set --help` documentation.
In addition, `ipn.Prefs.RouteAll` defaults to true so it disagrees
with both of the flags above.
This patch makes `--accept-routes` use the same logic for in both
commands by hoisting the logic that was buried in `cmd/tailscale/cli`
to `ipn.Prefs.DefaultRouteAll`. Then, all three of defaults can agree.
Fixes: #15319
Signed-off-by: Simon Law <sfllaw@sfllaw.ca>
The default values for `tailscale up` and `tailscale set` are supposed
to agree on all common flags. But they don’t for `--accept-dns`:
user@host:~$ tailscale up --help 2>&1 | grep -A1 accept-dns
--accept-dns, --accept-dns=false
accept DNS configuration from the admin panel (default true)
user@host:~$ tailscale set --help 2>&1 | grep -A1 accept-dns
--accept-dns, --accept-dns=false
accept DNS configuration from the admin panel
Luckily, `tailscale set` uses `ipn.MaskedPrefs`, so the default values
don’t logically matter. But someone will get the wrong idea if they
trust the `tailscale set --help` documentation.
This patch makes `--accept-dns` default to true in both commands and
also introduces `TestSetDefaultsMatchUpDefaults` to prevent any future
drift.
Fixes: #15319
Signed-off-by: Simon Law <sfllaw@sfllaw.ca>
We as members, contributors, and leaders pledge to make participation
in our community a harassment-free experience for everyone, regardless
of age, body size, visible or invisible disability, ethnicity, sex
characteristics, gender identity and expression, level of experience,
education, socio-economic status, nationality, personal appearance,
race, religion, or sexual identity and orientation.
We pledge to act and interact in ways that contribute to an open,
welcoming, diverse, inclusive, and healthy community.
We are committed to creating an open, welcoming, diverse, inclusive, healthy and respectful community.
Unacceptable, harmful and inappropriate behavior will not be tolerated.
## Our Standards
Examples of behavior that contributes to a positive environment for
our community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our
mistakes, and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
overall community
Examples of unacceptable behavior include:
Examples of behavior that contributes to a positive environment for our community include:
* The use of sexualized language or imagery, and sexual attention or
advances of any kind
* Trolling, insulting or derogatory comments, and personal or
political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in
a professional setting
- Demonstrating empathy and kindness toward other people.
- Being respectful of differing opinions, viewpoints, and experiences.
- Giving and gracefully accepting constructive feedback.
- Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience.
- Focusing on what is best not just for us as individuals, but for the overall community.
## Enforcement Responsibilities
Examples of unacceptable behavior include without limitation:
Community leaders are responsible for clarifying and enforcing our
standards of acceptable behavior and will take appropriate and fair
corrective action in response to any behavior that they deem
inappropriate, threatening, offensive, or harmful.
- The use of language, imagery or emojis (collectively "content") that is racist, sexist, homophobic, transphobic, or otherwise harassing or discriminatory based on any protected characteristic.
- The use of sexualized content and sexual attention or advances of any kind.
- The use of violent, intimidating or bullying content.
- Trolling, concern trolling, insulting or derogatory comments, and personal or political attacks.
- Public or private harassment.
- Publishing others' personal information, such as a photo, physical address, email address, online profile information, or other personal information, without their explicit permission or with the intent to bully or harass the other person.
- Posting deep fake or other AI generated content about or involving another person without the explicit permission.
- Spamming community channels and members, such as sending repeat messages, low-effort content, or automated messages.
- Phishing or any similar activity.
- Distributing or promoting malware.
- The use of any coded or suggestive content to hide or provoke otherwise unacceptable behavior.
- Other conduct which could reasonably be considered harmful, illegal, or inappropriate in a professional setting.
Community leaders have the right and responsibility to remove, edit,
or reject comments, commits, code, wiki edits, issues, and other
contributions that are not aligned to this Code of Conduct, and will
communicate reasons for moderation decisions when appropriate.
Please also see the Tailscale Acceptable Use Policy, available at [tailscale.com/tailscale-aup](https://tailscale.com/tailscale-aup).
## Scope
## Reporting Incidents
This Code of Conduct applies within all community spaces, and also
applies when an individual is officially representing the community in
public spaces. Examples of representing our community include using an
official e-mail address, posting via an official social media account,
or acting as an appointed representative at an online or offline
event.
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to Tailscale directly via <info@tailscale.com>, or to the community leaders or moderators via DM or similar.
All complaints will be reviewed and investigated promptly and fairly.
We will respect the privacy and safety of the reporter of any issues.
## Enforcement
Please note that this community is not moderated by staff 24/7, and we do not have, and do not undertake, any obligation to prescreen, monitor, edit, or remove any content or data, or to actively seek facts or circumstances indicating illegal activity.
While we strive to keep the community safe and welcoming, moderation may not be immediate at all hours.
If you encounter any issues, report them using the appropriate channels.
Instances of abusive, harassing, or otherwise unacceptable behavior
may be reported to the community leaders responsible for enforcement
at [info@tailscale.com](mailto:info@tailscale.com). All complaints
will be reviewed and investigated promptly and fairly.
## Enforcement Guidelines
All community leaders are obligated to respect the privacy and
security of the reporter of any incident.
Community leaders and moderators are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
## Enforcement Guidelines
Community leaders and moderators have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Community Code of Conduct.
Tailscale retains full discretion to take action (or not) in response to a violation of these guidelines with or without notice or liability to you.
We will interpret our policies and resolve disputes in favor of protecting users, customers, the public, our community and our company, as a whole.
Community leaders will follow these Community Impact Guidelines in
determining the consequences for any action they deem in violation of
this Code of Conduct:
Community leaders will follow these community enforcement guidelines in determining the consequences for any action they deem in violation of this Code of Conduct,
and retain full discretion to apply the enforcement guidelines as necessary depending on the circumstances:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior
deemed unprofessional or unwelcome in the community.
Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders,
providing clarity around the nature of the violation and an
explanation of why the behavior was inappropriate. A public apology
may be requested.
Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate.
A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series
of actions.
Community Impact: A violation through a single incident or series of actions.
**Consequence**: A warning with consequences for continued
behavior. No interaction with the people involved, including
unsolicited interaction with those enforcing the Code of Conduct, for
a specified period of time. This includes avoiding interactions in
community spaces as well as external channels like social
media. Violating these terms may lead to a temporary or permanent ban.
Consequence: A warning with consequences for continued behavior.
No interaction with the people involved, including unsolicited interaction with those enforcing this Community Code of Conduct, for a specified period of time.
This includes avoiding interactions in community spaces as well as external channels like social media.
Violating these terms may lead to a temporary or permanent ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards,
including sustained inappropriate behavior.
Community Impact: A serious violation of community standards, including sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or
public communication with the community for a specified period of
time. No public or private interaction with the people involved,
including unsolicited interaction with those enforcing the Code of
Conduct, is allowed during this period. Violating these terms may lead
to a permanent ban.
Consequence: A temporary ban from any sort of interaction or public communication with the community for a specified period of time.
No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of
community standards, including sustained inappropriate behavior,
harassment of an individual, or aggression toward or disparagement of
classes of individuals.
Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction
within the community.
Consequence: A permanent ban from any sort of public interaction within the community.
## Acceptable Use Policy
Violation of this Community Code of Conduct may also violate the Tailscale Acceptable Use Policy, which may result in suspension or termination of your Tailscale account.
For more information, please see the Tailscale Acceptable Use Policy, available at [tailscale.com/tailscale-aup](https://tailscale.com/tailscale-aup).
## Privacy
Please see the Tailscale [Privacy Policy](https://tailscale.com/privacy-policy) for more information about how Tailscale collects, uses, discloses and protects information.
## Attribution
This Code of Conduct is adapted from the [Contributor
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 2.0, available at <https://www.contributor-covenant.org/version/2/0/code_of_conduct.html>.
Community Impact Guidelines were inspired by [Mozilla's code of