Commit Graph

2001 Commits (2c8859c2e725af2de59203c0b2d39b96f135cb60)

Author SHA1 Message Date
Jordan Whited 218110963d
cmd/stunstamp: implement HTTPS & TCP latency measurements (#13082)
HTTPS mirrors current netcheck behavior and TCP uses tcp_info->rtt.

Updates tailscale/corp#22114

Signed-off-by: Jordan Whited <jordan@tailscale.com>
3 months ago
Brad Fitzpatrick 2e32abc3e2 cmd/tailscaled: allow setting env via linux cmdline for integration tests
Updates #13038

Change-Id: I51e016d0eb7c14647159706c08f017fdedd68e2a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 months ago
Maisem Ali d0e8375b53 cmd/{tta,vnet}: proxy to gokrazy UI
Updates #13038

Change-Id: I1cacb1b0f8c3d0e4c36b7890155f7b1ad0d23575
Signed-off-by: Maisem Ali <maisem@tailscale.com>
4 months ago
Brad Fitzpatrick f47a5fe52b vnet: reduce some log spam
Updates #13038

Change-Id: I76038a90dfde10a82063988a5b54190074d4b5c5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 months ago
Maisem Ali f8d23b3582 tstest/integration/nat: stream daemon logs directly
Updates #13038

Signed-off-by: Maisem Ali <maisem@tailscale.com>
Change-Id: I5da5706149c082c27d74c8b894bf53dd9b259e84
4 months ago
Brad Fitzpatrick 6798f8ea88 tstest/natlab/vnet: add port mapping
Updates #13038

Change-Id: Iaf274d250398973790873534b236d5cbb34fbe0e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 months ago
Maisem Ali 12764e9db4 natlab: add NodeAgentClient
This adds a new NodeAgentClient type that can be used to
invoke the LocalAPI using the LocalClient instead of
handcrafted URLs. However, there are certain cases where
it does make sense for the node agent to provide more
functionality than whats possible with just the LocalClient,
as such it also exposes a http.Client to make requests directly.

Signed-off-by: Maisem Ali <maisem@tailscale.com>
4 months ago
Brad Fitzpatrick 1016aa045f hostinfo: add hostinfo.IsNATLabGuestVM
And don't make guests under vnet/natlab upload to logcatcher,
as there won't be a valid cert anyway.

Updates #13038

Change-Id: Ie1ce0139788036b8ecc1804549a9b5d326c5fef5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 months ago
Brad Fitzpatrick 8594292aa4 vnet: add control/derps to test, stateful firewall
Updates #13038

Change-Id: Icd65b34c5f03498b5a7109785bb44692bce8911a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 months ago
Jordan Whited 20691894f5
cmd/stunstamp: refactor to support multiple protocols (#13063)
'stun' has been removed from metric names and replaced with a protocol
label. This refactor is preparation work for HTTPS & ICMP support.

Updates tailscale/corp#22114

Signed-off-by: Jordan Whited <jordan@tailscale.com>
4 months ago
Andrew Lytvynov c0c4791ce7
cmd/gitops-pusher: ignore previous etag if local acls match control (#13068)
In a situation when manual edits are made on the admin panel, around the
GitOps process, the pusher will be stuck if `--fail-on-manual-edits` is
set, as expected.

To recover from this, there are 2 options:
1. revert the admin panel changes to get back in sync with the code
2. check in the manual edits to code

The former will work well, since previous and local ETags will match
control ETag again. The latter will still fail, since local and control
ETags match, but previous does not.

For this situation, check the local ETag against control first and
ignore previous when things are already in sync.

Updates https://github.com/tailscale/corp/issues/22177

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
4 months ago
Andrew Lytvynov ad038f4046
cmd/gitops-pusher: add --fail-on-manual-edits flag (#13066)
For cases where users want to be extra careful about not overwriting
manual changes, add a flag to hard-fail. This is only useful if the etag
cache is persistent or otherwise reliable. This flag should not be used
in ephemeral CI workers that won't persist the cache.

Updates https://github.com/tailscale/corp/issues/22177

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
4 months ago
Naman Sood f79183dac7
cmd/tsidp: add funnel support (#12591)
* cmd/tsidp: add funnel support

Updates #10263.

Signed-off-by: Naman Sood <mail@nsood.in>

* look past funnel-ingress-node to see who we're authenticating

Signed-off-by: Naman Sood <mail@nsood.in>

* fix comment typo

Signed-off-by: Naman Sood <mail@nsood.in>

* address review feedback, support Basic auth for /token

Turns out you need to support Basic auth if you do client ID/secret
according to OAuth.

Signed-off-by: Naman Sood <mail@nsood.in>

* fix typos

Signed-off-by: Naman Sood <mail@nsood.in>

* review fixes

Signed-off-by: Naman Sood <mail@nsood.in>

* remove debugging log

Signed-off-by: Naman Sood <mail@nsood.in>

* add comments, fix header

Signed-off-by: Naman Sood <mail@nsood.in>

---------

Signed-off-by: Naman Sood <mail@nsood.in>
4 months ago
Brad Fitzpatrick 1ed958fe23 tstest/natlab/vnet: add start of virtual network-based NAT Lab
Updates #13038

Change-Id: I3c74120d73149c1329288621f6474bbbcaa7e1a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 months ago
Brad Fitzpatrick 6ca078c46e cmd/derper: move 204 handler from package main to derphttp
Updates #13038

Change-Id: I28a8284dbe49371cae0e9098205c7c5f17225b40
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
4 months ago
Anton Tolchanov b3fc345aba cmd/derpprobe: use a status page from the prober library
Updates tailscale/corp#20583

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
4 months ago
Anton Tolchanov 0fd73746dd cmd/tailscale/cli: fix `revoke-keys` command name in CLI output
During review of #8644 the `recover-compromised-key` command was renamed
to `revoke-key`, but the old name remained in some messages printed by
the command.

Fixes tailscale/corp#19446

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
4 months ago
Jordan Whited f0230ce0b5
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GRO for Linux (#12921)
This commit implements TCP GRO for packets being written to gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported IP checksum functions.
gVisor is updated in order to make use of newly exported
stack.PacketBuffer GRO logic.

TCP throughput towards gVisor, i.e. TUN write direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement, sometimes as high as 2x. High bandwidth-delay product
paths remain receive window limited, bottlenecked by gVisor's default
TCP receive socket buffer size. This will be addressed in a  follow-on
commit.

The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.

The first result is from commit 57856fc without TCP GRO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  4.77 GBytes  4.10 Gbits/sec   20 sender
[  5]   0.00-10.00  sec  4.77 GBytes  4.10 Gbits/sec      receiver

The second result is from this commit with TCP GRO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.6 GBytes  9.14 Gbits/sec   20 sender
[  5]   0.00-10.00  sec  10.6 GBytes  9.14 Gbits/sec      receiver

Updates #6816

Signed-off-by: Jordan Whited <jordan@tailscale.com>
4 months ago
Jordan Whited 7bc2ddaedc
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869)
This commit implements TCP GSO for packets being read from gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported GSO logic from its tun
package.

A new gVisor stack.LinkEndpoint implementation has been established
(linkEndpoint) that is loosely modeled after its predecessor
(channel.Endpoint). This new implementation supports GSO of monster TCP
segments up to 64K in size, whereas channel.Endpoint only supports up to
32K. linkEndpoint will also be required for GRO, which will be
implemented in a follow-on commit.

TCP throughput from gVisor, i.e. TUN read direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement through a wide range of RTT and loss conditions, sometimes
as high as 5x.

The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.

The first result is from commit 57856fc without TCP GSO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.51 GBytes  2.15 Gbits/sec  154 sender
[  5]   0.00-10.00  sec  2.49 GBytes  2.14 Gbits/sec      receiver

The second result is from this commit with TCP GSO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  12.6 GBytes  10.8 Gbits/sec    6 sender
[  5]   0.00-10.00  sec  12.6 GBytes  10.8 Gbits/sec      receiver

Updates #6816

Signed-off-by: Jordan Whited <jordan@tailscale.com>
4 months ago
Jonathan Nobels 8a8ecac6a7
net/dns, cmd/tailscaled: plumb system health tracker into dns cleanup (#12969)
fixes tailscale#12968

The dns manager cleanup func was getting passed a nil
health tracker, which will panic.  Fixed to pass it
the system health tracker.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
4 months ago
Andrew Dunham 35a8fca379 cmd/tailscale/cli: release portmap after netcheck
Updates #12954

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Ic14f037b48a79b1263b140c6699579b466d89310
4 months ago
Irbe Krumina a21bf100f3
cmd/k8s-operator,k8s-operator/sessionrecording,sessionrecording,ssh/tailssh: refactor session recording functionality (#12945)
cmd/k8s-operator,k8s-operator/sessionrecording,sessionrecording,ssh/tailssh: refactor session recording functionality

Refactor SSH session recording functionality (mostly the bits related to
Kubernetes API server proxy 'kubectl exec' session recording):

- move the session recording bits used by both Tailscale SSH
and the Kubernetes API server proxy into a shared sessionrecording package,
to avoid having the operator to import ssh/tailssh

- move the Kubernetes API server proxy session recording functionality
into a k8s-operator/sessionrecording package, add some abstractions
in preparation for adding support for a second streaming protocol (WebSockets)

Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
4 months ago
Irbe Krumina c5623e0471
go.{mod,sum},tstest/tools,k8s-operator,cmd/k8s-operator: autogenerate CRD API docs (#12884)
Re-instates the functionality that generates CRD API docs, but using
a different library as the one we were using earlier seemed to have
some issues with its Git history.
Also regenerates the docs (make kube-generate-all).

Updates tailscale/tailscale#12859

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
4 months ago
Andrea Gottardo 90be06bd5b
health: introduce captive-portal-detected Warnable (#12707)
Updates tailscale/tailscale#1634

This PR introduces a new `captive-portal-detected` Warnable which is set to an unhealthy state whenever a captive portal is detected on the local network, preventing Tailscale from connecting.



ipn/ipnlocal: fix captive portal loop shutdown


Change-Id: I7cafdbce68463a16260091bcec1741501a070c95

net/captivedetection: fix mutex misuse

ipn/ipnlocal: ensure that we don't fail to start the timer


Change-Id: I3e43fb19264d793e8707c5031c0898e48e3e7465

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
4 months ago
Nick Khyl 20562a4fb9 cmd/viewer, types/views, util/codegen: add viewer support for custom container types
This adds support for container-like types such as Container[T] that
don't explicitly specify a view type for T. Instead, a package implementing
a container type should also implement and export a ContainerView[T, V] type
and a ContainerViewOf(*Container[T]) ContainerView[T, V] function, which
returns a view for the specified container, inferring the element view type V
from the element type T.

Updates #12736

Signed-off-by: Nick Khyl <nickk@tailscale.com>
4 months ago
Andrew Lytvynov e7bf6e716b
cmd/tailscale: add --min-validity flag to the cert command (#12822)
Some users run "tailscale cert" in a cron job to renew their
certificates on disk. The time until the next cron job run may be long
enough for the old cert to expire with our default heristics.

Add a `--min-validity` flag which ensures that the returned cert is
valid for at least the provided duration (unless it's longer than the
cert lifetime set by Let's Encrypt).

Updates #8725

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
4 months ago
Lee Briggs 32ce18716b
Add extra environment variables in deployment template (#12858)
Fixes #12857

Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
4 months ago
Irbe Krumina 0f57b9340b
cmd/k8s-operator,tstest,go.{mod,sum}: remove fybrik.io/crdoc dependency (#12862)
Remove fybrik.io/crdoc dependency as it is causing issues for folks attempting
to vendor tailscale using GOPROXY=direct.
This means that the CRD API docs in ./k8s-operator/api.md will no longer
be generated- I am going to look at replacing it with another tool
in a follow-up.

Updates tailscale/tailscale#12859

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
4 months ago
Irbe Krumina 2742153f84
cmd/k8s-operator: add a metric to track the amount of ProxyClass resources (#12833)
Updates tailscale/tailscale#10709

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
4 months ago
Adrian Dewhurst 0834712c91 ipn: allow FQDN in exit node selection
To match the format of exit node suggestions and ensure that the result
is not ambiguous, relax exit node CLI selection to permit using a FQDN
including the trailing dot.

Updates #12618

Change-Id: I04b9b36d2743154aa42f2789149b2733f8555d3f
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
4 months ago
Nick Khyl fd0acc4faf cmd/cloner, cmd/viewer: add _test prefix for files generated with the test build tag
Updates #12736

Signed-off-by: Nick Khyl <nickk@tailscale.com>
5 months ago
Linus Brogan 9609b26541 cmd/tailscale: resolve taildrive share paths
Fixes #12258.

Signed-off-by: Linus Brogan <git@linusbrogan.com>
5 months ago
Nick Khyl fc28c8e7f3 cmd/cloner, cmd/viewer, util/codegen: add support for generic types and interfaces
This adds support for generic types and interfaces to our cloner and viewer codegens.
It updates these packages to determine whether to make shallow or deep copies based
on the type parameter constraints. Additionally, if a template parameter or an interface
type has View() and Clone() methods, we'll use them for getters and the cloner of the
owning structure.

Updates #12736

Signed-off-by: Nick Khyl <nickk@tailscale.com>
5 months ago
Brad Fitzpatrick c6af5bbfe8 all: add test for package comments, fix, add comments as needed
Updates #cleanup

Change-Id: Ic4304e909d2131a95a38b26911f49e7b1729aaef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
Irbe Krumina 986d60a094
cmd/k8s-operator: add metrics for attempted/uploaded session recordings (#12765)
Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
5 months ago
Irbe Krumina 6a982faa7d
cmd/k8s-operator: send container name to session recorder (#12763)
Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
5 months ago
Nick Khyl 726d5d507d cmd/k8s-operator: update depaware.txt
This fixes an issue caused by the merge order of 2b638f550d and 8bd442ba8c.

Updates #Cleanup

Signed-off-by: Nick Khyl <nickk@tailscale.com>
5 months ago
Maisem Ali 2238ca8a05 go.mod: bump bart
Updates #bart

Signed-off-by: Maisem Ali <maisem@tailscale.com>
5 months ago
Nick Khyl 8bd442ba8c util/winutil/gp, net/dns: add package for Group Policy API
This adds a package with GP-related functions and types to be used in the future PRs.
It also updates nrptRuleDatabase to use the new package instead of its own gpNotificationWatcher implementation.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
5 months ago
Andrew Lytvynov b8af91403d
clientupdate: return true for CanAutoUpdate for macsys (#12746)
While `clientupdate.Updater` won't be able to apply updates on macsys,
we use `clientupdate.CanAutoUpdate` to gate the EditPrefs endpoint in
localAPI. We should allow the GUI client to set AutoUpdate.Apply on
macsys for it to properly get reported to the control plane. This also
allows the tailnet-wide default for auto-updates to propagate to macsys
clients.

Updates https://github.com/tailscale/corp/issues/21339

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
5 months ago
Nick Khyl e21d8768f9 types/opt: add generic Value[T any] for optional values of any types
Updates #12736

Signed-off-by: Nick Khyl <nickk@tailscale.com>
5 months ago
Irbe Krumina ba517ab388
cmd/k8s-operator,ssh/tailssh,tsnet: optionally record 'kubectl exec' sessions via Kubernetes operator's API server proxy (#12274)
cmd/k8s-operator,ssh/tailssh,tsnet: optionally record kubectl exec sessions

The Kubernetes operator's API server proxy, when it receives a request
for 'kubectl exec' session now reads 'RecorderAddrs', 'EnforceRecorder'
fields from tailcfg.KubernetesCapRule.
If 'RecorderAddrs' is set to one or more addresses (of a tsrecorder instance(s)),
it attempts to connect to those and sends the session contents
to the recorder before forwarding the request to the kube API
server. If connection cannot be established or fails midway,
it is only allowed if 'EnforceRecorder' is not true (fail open).

Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
5 months ago
Maisem Ali 2b638f550d cmd/k8s-operator: add depaware.txt
Updates #12742

Signed-off-by: Maisem Ali <maisem@tailscale.com>
5 months ago
Tom Proctor 01a7726cf7
cmd/containerboot,cmd/k8s-operator: enable IPv6 for fqdn egress proxies (#12577)
cmd/containerboot,cmd/k8s-operator: enable IPv6 for fqdn egress proxies

Don't skip installing egress forwarding rules for IPv6 (as long as the host
supports IPv6), and set headless services `ipFamilyPolicy` to
`PreferDualStack` to optionally enable both IP families when possible. Note
that even with `PreferDualStack` set, testing a dual-stack GKE cluster with
the default DNS setup of kube-dns did not correctly set both A and
AAAA records for the headless service, and instead only did so when
switching the cluster DNS to Cloud DNS. For both IPv4 and IPv6 to work
simultaneously in a dual-stack cluster, we require headless services to
return both A and AAAA records.

If the host doesn't support IPv6 but the FQDN specified only has IPv6
addresses available, containerboot will exit with error code 1 and an
error message because there is no viable egress route.

Fixes #12215

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
5 months ago
Charlotte Brandhorst-Satzkorn 42f01afe26
cmd/tailscale/cli: exit node filter should display all exit node options (#12699)
This change expands the `exit-node list -filter` command to display all
location based exit nodes for the filtered country. This allows users
to switch to alternative servers when our recommended exit node is not
working as intended.

This change also makes the country filter matching case insensitive,
e.g. both USA and usa will work.

Updates #12698

Signed-off-by: Charlotte Brandhorst-Satzkorn <charlotte@tailscale.com>
5 months ago
Jordan Whited ddf94a7b39
cmd/stunstamp: fix handling of invalid DERP map resp (#12679)
Updates tailscale/corp#20344

Signed-off-by: Jordan Whited <jordan@tailscale.com>
5 months ago
James Tucker b565a9faa7 cmd/xdpderper: add autodetection for default interface name
This makes deployment easier in hetrogenous environments.

Updates ENG-4274

Signed-off-by: James Tucker <james@tailscale.com>
5 months ago
Josh McKinney 18939df0a7 fix: broken tests for localhost
Signed-off-by: Josh McKinney <joshka@users.noreply.github.com>
5 months ago
Brad Fitzpatrick 210264f942 cmd/derper: clarify that derper and tailscaled need to be in sync
Fixes #12617

Change-Id: Ifc87b7d9cf699635087afb57febd01fb9a6d11b7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
Brad Fitzpatrick 6b801a8e9e cmd/derper: link to various derper docs in more places
In hopes it'll be found more.

Updates tailscale/corp#20844

Change-Id: Ic92ee9908f45b88f8770de285f838333f9467465
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
James Tucker 46fda6bf4c cmd/derper: add some DERP diagnostics pointers
A few other minor language updates.

Updates tailscale/corp#20844

Change-Id: Idba85941baa0e2714688cc8a4ec3e242e7d1a362
Signed-off-by: James Tucker <james@tailscale.com>
5 months ago
Adrian Dewhurst a6b13e6972 cmd/tailscale/cli: correct command emitted by exit node suggestion
The exit node suggestion CLI command was written with the assumption
that it's possible to provide a stableid on the command line, but this
is incorrect. Instead, it will now emit the name of the exit node.

Fixes #12618

Change-Id: Id7277f395b5fca090a99b0d13bfee7b215bc9802
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
5 months ago
Jordan Whited 94415e8029
cmd/stunstamp: remove sqlite DB and API (#12604)
stunstamp now sends data to Prometheus via remote write, and Prometheus
can serve the same data. Retaining and cleaning up old data in sqlite
leads to long probing pauses, and it's not worth investing more effort
to optimize the schema and/or concurrency model.

Updates tailscale/corp#20344

Signed-off-by: Jordan Whited <jordan@tailscale.com>
5 months ago
Brad Fitzpatrick 3485e4bf5a derp: make RunConnectionLoop funcs take Messages, support PeerPresentFlags
PeerPresentFlags was added in 5ffb2668ef but wasn't plumbed through to
the RunConnectionLoop. Rather than add yet another parameter (as
IP:port was added earlier), pass in the raw PeerPresentMessage and
PeerGoneMessage struct values, which are the same things, plus two
fields: PeerGoneReasonType for gone and the PeerPresentFlags from
5ffb2668ef.

Updates tailscale/corp#17816

Change-Id: Ib19d9f95353651ada90656071fc3656cf58b7987
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
Andrew Dunham 200d92121f types/lazy: add Peek method to SyncValue
This adds the ability to "peek" at the value of a SyncValue, so that
it's possible to observe a value without computing this.

Updates tailscale/corp#17122

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I06f88c22a1f7ffcbc7ff82946335356bb0ef4622
5 months ago
Aaron Klotz 7dd76c3411 net/netns: add Windows support for bind-to-interface-by-route
This is implemented via GetBestInterfaceEx. Should we encounter errors
or fail to resolve a valid, non-Tailscale interface, we fall back to
returning the index for the default interface instead.

Fixes #12551

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
5 months ago
Brad Fitzpatrick 91786ff958 cmd/derper: add debug endpoint to adjust mutex profiling rate
Updates #3560

Change-Id: I474421ce75c79fb66e1c306ed47daebc5a0e069e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
Jordan Whited 0d6e71df70
cmd/stunstamp: add explicit metric to track timeout events (#12564)
Timeouts could already be identified as NaN values on
stunstamp_derp_stun_rtt_ns, but we can't use NaN effectively with
promql to visualize them. So, this commit adds a timeouts metric that
we can use with rate/delta/etc promql functions.

Updates tailscale/corp#20689

Signed-off-by: Jordan Whited <jordan@tailscale.com>
5 months ago
Kristoffer Dalby dcb0f189cc cmd/proxy-to-grafana: add flag for alternative control server
Fixes #12571

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
5 months ago
Andrew Dunham 24976b5bfd cmd/tailscale/cli: actually perform Noise request in 'debug ts2021'
This actually performs a Noise request in the 'debug ts2021' command,
instead of just exiting once we've dialed a connection. This can help
debug certain forms of captive portals and deep packet inspection that
will allow a connection, but will RST the connection when trying to send
data on the post-upgraded TCP connection.

Updates #1634

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1e46ca9c9a0751c55f16373a6a76cdc24fec1f18
5 months ago
Andrew Dunham 732605f961 control/controlclient: move noiseConn to internal package
So that it can be later used in the 'tailscale debug ts2021' function in
the CLI, to aid in debugging captive portals/WAFs/etc.

Updates #1634

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: Iec9423f5e7570f2c2c8218d27fc0902137e73909
5 months ago
Andrea Gottardo 8eb15d3d2d
cli/netcheck: fail with output if we time out fetching a derpmap (#12528)
Updates tailscale/corp#20969

Right now, when netcheck starts, it asks tailscaled for a copy of the DERPMap. If it doesn't have one, it makes a HTTPS request to controlplane.tailscale.com to fetch one.

This will always fail if you're on a network with a captive portal actively blocking HTTPS traffic. The code appears to hang entirely because the http.Client doesn't have a Timeout set. It just sits there waiting until the request succeeds or fails.

This adds a timeout of 10 seconds, and logs more details about the status of the HTTPS request.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
5 months ago
Jordan Whited a93173b56a
cmd/xdpderper,derp/xdp: implement mode that drops STUN packets (#12527)
This is useful during maintenance as a method for shedding home client
load.

Updates tailscale/corp#20689

Signed-off-by: Jordan Whited <jordan@tailscale.com>
5 months ago
Tom Proctor 3099323976
cmd/k8s-operator,k8s-operator,go.{mod,sum}: publish proxy status condition for annotated services (#12463)
Adds a new TailscaleProxyReady condition type for use in corev1.Service
conditions.

Also switch our CRDs to use metav1.Condition instead of
ConnectorCondition. The Go structs are seralized identically, but it
updates some descriptions and validation rules. Update k8s
controller-tools and controller-runtime deps to fix the documentation
generation for metav1.Condition so that it excludes comments and
TODOs.

Stop expecting the fake client to populate TypeMeta in tests. See
kubernetes-sigs/controller-runtime#2633 for details of the change.

Finally, make some minor improvements to validation for service hostnames.

Fixes #12216

Co-authored-by: Irbe Krumina <irbe@tailscale.com>
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
5 months ago
Andrew Dunham 45d2f4301f proxymap, various: distinguish between different protocols
Previously, we were registering TCP and UDP connections in the same map,
which could result in erroneously removing a mapping if one of the two
connections completes while the other one is still active.

Add a "proto string" argument to these functions to avoid this.
Additionally, take the "proto" argument in LocalAPI, and plumb that
through from the CLI and add a new LocalClient method.

Updates tailscale/corp#20600

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I35d5efaefdfbf4721e315b8ca123f0c8af9125fb
5 months ago
Irbe Krumina 8cc2738609
cmd/{containerboot,k8s-operator}: store proxy device ID early to help with cleanup for broken proxies (#12425)
* cmd/containerboot: store device ID before setting up proxy routes.

For containerboot instances whose state needs to be stored
in a Kubernetes Secret, we additonally store the device's
ID, FQDN and IPs.
This is used, between other, by the Kubernetes operator,
who uses the ID to delete the device when resources need
cleaning up and writes the FQDN and IPs on various kube
resource statuses for visibility.

This change shifts storing device ID earlier in the proxy setup flow,
to ensure that if proxy routing setup fails,
the device can still be deleted.

Updates tailscale/tailscale#12146

Signed-off-by: Irbe Krumina <irbe@tailscale.com>

* code review feedback

Signed-off-by: Irbe Krumina <irbe@tailscale.com>

---------

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
5 months ago
Andrew Lytvynov 674c998e93
cmd/tailscale/cli: do not allow update --version on macOS (#12508)
We do not support specific version updates or track switching on macOS.
Do not populate the flag to avoid confusion.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
5 months ago
Brad Fitzpatrick 86e0f9b912 net/ipset, wgengine/filter/filtertype: add split-out packages
This moves NewContainsIPFunc from tsaddr to new ipset package.

And wgengine/filter types gets split into wgengine/filter/filtertype,
so netmap (and thus the CLI, etc) doesn't need to bring in ipset,
bart, etc.

Then add a test making sure the CLI deps don't regress.

Updates #1278

Change-Id: Ia246d6d9502bbefbdeacc4aef1bed9c8b24f54d5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
Brad Fitzpatrick 64ac64fb66 net/tsaddr: use bart in NewContainsIPFunc, add tests, benchmarks
NewContainsIPFunc was previously documented as performing poorly if
there were many netip.Prefixes to search over. As such, we never it used it
in such cases.

This updates it to use bart at a certain threshold (over 6 prefixes,
currently), at which point the bart lookup overhead pays off.

This is currently kinda useless because we're not using it. But now we
can and get wins elsewhere. And we can remove the caveat in the docs.

    goos: darwin
    goarch: arm64
    pkg: tailscale.com/net/tsaddr
                                     │    before    │                after                 │
                                     │    sec/op    │    sec/op     vs base                │
    NewContainsIPFunc/empty-8          2.215n ± 11%   2.239n ±  1%   +1.08% (p=0.022 n=10)
    NewContainsIPFunc/cidr-list-1-8    17.44n ±  0%   17.59n ±  6%   +0.89% (p=0.000 n=10)
    NewContainsIPFunc/cidr-list-2-8    27.85n ±  0%   28.13n ±  1%   +1.01% (p=0.000 n=10)
    NewContainsIPFunc/cidr-list-3-8    36.05n ±  0%   36.56n ± 13%   +1.41% (p=0.000 n=10)
    NewContainsIPFunc/cidr-list-4-8    43.73n ±  0%   44.38n ±  1%   +1.50% (p=0.000 n=10)
    NewContainsIPFunc/cidr-list-5-8    51.61n ±  2%   51.75n ±  0%        ~ (p=0.101 n=10)
    NewContainsIPFunc/cidr-list-10-8   95.65n ±  0%   68.92n ±  0%  -27.94% (p=0.000 n=10)
    NewContainsIPFunc/one-ip-8         4.466n ±  0%   4.469n ±  1%        ~ (p=0.491 n=10)
    NewContainsIPFunc/two-ip-8         8.002n ±  1%   7.997n ±  4%        ~ (p=0.697 n=10)
    NewContainsIPFunc/three-ip-8       27.98n ±  1%   27.75n ±  0%   -0.82% (p=0.012 n=10)
    geomean                            19.60n         19.07n         -2.71%

Updates #12486

Change-Id: I2e2320cc4384f875f41721374da536bab995c1ce
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
Maisem Ali 491483d599 cmd/viewer,type/views: add MapSlice for maps of slices
This abstraction provides a nicer way to work with
maps of slices without having to write out three long type
params.

This also allows it to provide an AsMap implementation which
copies the map and the slices at least.

Updates tailscale/corp#20910

Signed-off-by: Maisem Ali <maisem@tailscale.com>
5 months ago
Nick Khyl c32efd9118 various: create a catch-all NRPT rule when "Override local DNS" is enabled on Windows
Without this rule, Windows 8.1 and newer devices issue parallel DNS requests to DNS servers
associated with all network adapters, even when "Override local DNS" is enabled and/or
a Mullvad exit node is being used, resulting in DNS leaks.

This also adds "disable-local-dns-override-via-nrpt" nodeAttr that can be used to disable
the new behavior if needed.

Fixes tailscale/corp#20718

Signed-off-by: Nick Khyl <nickk@tailscale.com>
5 months ago
Aaron Klotz bd2a6d5386 util/winutil: add UserProfile type for (un)loading user profiles
S4U logons do not automatically load the associated user profile. In this
PR we add UserProfile to handle that part. Windows docs indicate that
we should try to resolve a remote profile path when present, so we attempt
to do so when the local computer is joined to a domain.

Updates #12383

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
5 months ago
Jordan Whited 9189fe007b
cmd/stunc: support user-specified port (#12469)
Updates tailscale/corp#20689

Signed-off-by: Jordan Whited <jordan@tailscale.com>
5 months ago
Jordan Whited 65888d95c9
derp/xdp,cmd/xdpderper: initial skeleton (#12390)
This commit introduces a userspace program for managing an experimental
eBPF XDP STUN server program. derp/xdp contains the eBPF pseudo-C along
with a Go pkg for loading it and exporting its metrics.
cmd/xdpderper is a package main user of derp/xdp.

Updates tailscale/corp#20689

Signed-off-by: Jordan Whited <jordan@tailscale.com>
5 months ago
Brad Fitzpatrick ccdd2e6650 cmd/derper: add a README
Updates tailscale/corp#20844

Change-Id: Ie3ca5dd7f582f4f298339dd3cd2039243c204ef8
Co-authored-by: James Tucker <james@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
Co-authored-by: Andrew Dunham <andrew@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
JunYanBJSS 4c01ce9f43 tsnet: fix error formatting bug
Fixes #12411

Signed-off-by: JunYanBJSS <johnnycocoyan@hotmail.com>
6 months ago
Aaron Klotz 3511d1f8a2 cmd/tailscaled, net/dns, wgengine/router: start Windows child processes with DETACHED_PROCESS when I/O is being piped
When we're starting child processes on Windows that are CLI programs that
don't need to output to a console, we should pass in DETACHED_PROCESS as a
CreationFlag on SysProcAttr. This prevents the OS from even creating a console
for the child (and paying the associated time/space penalty for new conhost
processes). This is more efficient than letting the OS create the console
window and then subsequently trying to hide it, which we were doing at a few
callsites.

Fixes #12270

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
6 months ago
Irbe Krumina bc53ebd4a0
ipn/{ipnlocal,localapi},net/netkernelconf,client/tailscale,cmd/containerboot: optionally enable UDP GRO forwarding for containers (#12410)
Add a new TS_EXPERIMENTAL_ENABLE_FORWARDING_OPTIMIZATIONS env var
that can be set for tailscale/tailscale container running as
a subnet router or exit node to enable UDP GRO forwarding
for improved performance.
See https://tailscale.com/kb/1320/performance-best-practices#linux-optimizations-for-subnet-routers-and-exit-nodes
This is currently considered an experimental approach;
the configuration support is partially to allow further experimentation
with containerized environments to evaluate the performance
improvements.

Updates tailscale/tailscale#12295

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
6 months ago
Irbe Krumina 6f2bae019f
cmd/k8s-nameserver: fix AAAA record query response (#12412)
Return empty response and NOERROR for AAAA record queries
for DNS names for which we have an A record.
This is to allow for callers that might be first sending an AAAA query and then,
if that does not return a response, follow with an A record query.
Previously we were returning NOTIMPL that caused some callers
to potentially not follow with an A record query or misbehave in different ways.

Also return NXDOMAIN for AAAA record queries for names
that we DO NOT have an A record for to ensure that the callers
do not follow up with an A record query.

Returning an empty response and NOERROR is the behaviour
that RFC 4074 recommends:
https://datatracker.ietf.org/doc/html/rfc4074

Updates tailscale/tailscale#12321

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
6 months ago
Aaron Klotz df86576989 util/winutil: add AllocateContiguousBuffer and SetNTString helper funcs
AllocateContiguousBuffer is for allocating structs with trailing buffers
containing additional data. It is to be used for various Windows structures
containing pointers to data located immediately after the struct.

SetNTString performs in-place setting of windows.NTString and
windows.NTUnicodeString.

Updates #12383

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
6 months ago
Irbe Krumina c3e2b7347b
tailcfg,cmd/k8s-operator,kube: move Kubernetes cap to a location that can be shared with control (#12236)
This PR is in prep of adding logic to control to be able to parse
tailscale.com/cap/kubernetes grants in control:
- moves the type definition of PeerCapabilityKubernetes cap to a location
shared with control.
- update the Kubernetes cap rule definition with fields for granting
kubectl exec session recording capabilities.
- adds a convenience function to produce tailcfg.RawMessage from an
arbitrary cap rule and a test for it.

An example grant defined via ACLs:
"grants": [{
      "src": ["tag:eng"],
      "dst": ["tag:k8s-operator"],
      "app": {
        "tailscale.com/cap/kubernetes": [{
            "recorder": ["tag:my-recorder"]
	    “enforceRecorder”: true
        }],
      },
    }
]
This grant enforces `kubectl exec` sessions from tailnet clients,
matching `tag:eng` via API server proxy matching `tag:k8s-operator`
to be recorded and recording to be sent to a tsrecorder instance,
matching `tag:my-recorder`.

The type needs to be shared with control because we want
control to parse this cap and resolve tags to peer IPs.

Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
6 months ago
Irbe Krumina 807934f00c
cmd/k8s-operator,k8s-operator: allow proxies accept advertized routes. (#12388)
Add a new .spec.tailscale.acceptRoutes field to ProxyClass,
that can be optionally set to true for the proxies to
accept routes advertized by other nodes on tailnet (equivalent of
setting --accept-routes to true).

Updates tailscale/tailscale#12322,tailscale/tailscale#10684

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
6 months ago
Irbe Krumina 53d9cac196
k8s-operator/apis/v1alpha1,cmd/k8s-operator/deploy/examples: update DNSConfig description (#11971)
Also removes hardcoded image repo/tag from example DNSConfig resource
as the operator now knows how to default those.

Updates tailscale/tailscale#11019

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
6 months ago
Tom Proctor 23e26e589f
cmd/k8s-operator,k8s-opeerator: include Connector's MagicDNS name and tailnet IPs in status (#12359)
Add new fields TailnetIPs and Hostname to Connector Status. These
contain the addresses of the Tailscale node that the operator created
for the Connector to aid debugging.

Fixes #12214

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
6 months ago
Irbe Krumina 3a6d3f1a5b
cmd/k8s-operator,k8s-operator,go.{mod,sum}: make individual proxy images/image pull policies configurable (#11928)
cmd/k8s-operator,k8s-operator,go.{mod,sum}: make individual proxy images/image pull policies configurable

Allow to configure images and image pull policies for individual proxies
via ProxyClass.Spec.StatefulSet.Pod.{TailscaleContainer,TailscaleInitContainer}.Image,
and ProxyClass.Spec.StatefulSet.Pod.{TailscaleContainer,TailscaleInitContainer}.ImagePullPolicy
fields.
Document that we have images in ghcr.io on the relevant Helm chart fields.

Updates tailscale/tailscale#11675

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
6 months ago
Andrew Dunham e88a5dbc92 various: fix lint warnings
Some lint warnings caught by running 'make lint' locally.

Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1534ed6f2f5e1eb029658906f9d62607dad98ca3
6 months ago
Brad Fitzpatrick 8a11a43c28 cmd/derpprobe: support 'local' derpmap to get derp map via LocalAPI
To make it easier for people to monitor their custom DERP fleet.

Updates tailscale/corp#20654

Change-Id: Id8af22936a6d893cc7b6186d298ab794a2672524
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Jordan Whited 6e106712f6
cmd/stunstamp: support probing multiple ports (#12356)
Updates tailscale/corp#20344

Signed-off-by: Jordan Whited <jordan@tailscale.com>
6 months ago
Maisem Ali 36e8e8cd64 wgengine/magicsock: use math/rands/v2
Updates #11058

Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: Maisem Ali <maisem@tailscale.com>
6 months ago
Fran Bull 573c8bd8c7 cmd/natc: add --wg-port flag
Updates tailscale/corp#20503

Signed-off-by: Fran Bull <fran@tailscale.com>
6 months ago
Maisem Ali 4a8cb1d9f3 all: use math/rand/v2 more
Updates #11058

Signed-off-by: Maisem Ali <maisem@tailscale.com>
6 months ago
Fran Bull d2d459d442 cmd/natc: add --ignore-destinations flag
Updates tailscale/corp#20503

Signed-off-by: Fran Bull <fran@tailscale.com>
6 months ago
Jordan Whited cf1e6c6e55
cmd/stunstamp: fix remote write retry (#12348)
Evaluation of remote write errors was using errors.Is() where it should
have been using errors.As().

Updates tailscale/corp#20344

Signed-off-by: Jordan Whited <jordan@tailscale.com>
6 months ago
Irbe Krumina 82576190a7
tailcfg,cmd/k8s-operator: moves tailscale.com/cap/kubernetes peer cap to tailcfg (#12235)
This is done in preparation for adding kubectl
session recording rules to this capability grant that will need to
be unmarshalled by control, so will also need to be
in a shared location.

Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
6 months ago
signed-long 1dc3136a24
cmd/k8s-operator: Support image 'repo' or 'repository' keys in helm values file (#12285)
cmd/k8s-operator/deploy/chart: Support image 'repo' or 'repository' keys in helm values

Fixes #12100

Signed-off-by: Michael Long <michaelongdev@gmail.com>
6 months ago
Jordan Whited ba0dd493c8
cmd/stunstamp: validate STUN tx ID in responses (#12339)
Extremely late arriving responses may leak across probing intervals.

Updates tailscale/corp#20344

Signed-off-by: Jordan Whited <jordan@tailscale.com>
6 months ago
Maisem Ali 2f2f588c80 cmd/natc: use ListenPacket
Now that tsnet supports it, use it.

Updates tailscale/corp#20503

Signed-off-by: Maisem Ali <maisem@tailscale.com>
6 months ago
Maisem Ali 0b1a8586eb cmd/natc: initial implementation of a NAT based connector
This adds a new prototype `cmd/natc` which can be used
to expose a services/domains to the tailnet.

It requires the user to specify a set of IPv4 prefixes
from the CGNAT range. It advertises these as normal subnet
routes. It listens for DNS on the first IP of the first range
provided to it.

When it gets a DNS query it allocates an IP for that domain
from the v4 range. Subsequent connections to the assigned IP
are then tcp proxied to the domain.

It is marked as a WIP prototype and requires the use of the
`TAILSCALE_USE_WIP_CODE` env var.

Updates tailscale/corp#20503

Signed-off-by: Maisem Ali <maisem@tailscale.com>
6 months ago
Jordan Whited d21c00205d
cmd/stunstamp: implement service to measure DERP STUN RTT (#12241)
stunstamp timestamping includes userspace and SO_TIMESTAMPING kernel
timestamping where available. Measurements are written locally to a
sqlite DB, exposed over an HTTP API, and written to prometheus
via remote-write protocol.

Updates tailscale/corp#20344

Signed-off-by: Jordan Whited <jordan@tailscale.com>
6 months ago
Maisem Ali 42cfbf427c tsnet,wgengine/netstack: add ListenPacket and tests
This adds a new ListenPacket function on tsnet.Server
which acts mostly like `net.ListenPacket`.

Unlike `Server.Listen`, this requires listening on a
specific IP and does not automatically listen on both
V4 and V6 addresses of the Server when the IP is unspecified.

To test this, it also adds UDP support to tsdial.Dialer.UserDial
and plumbs it through the localapi. Then an associated test
to make sure the UDP functionality works from both sides.

Updates #12182

Signed-off-by: Maisem Ali <maisem@tailscale.com>
6 months ago
Irbe Krumina c2a4719e9e
cmd/tailscale/cli: allow 'tailscale up' to succeed if --stateful-filtering is not explicitly set on linux (#12312)
This fixes an issue where, on containerized environments an upgrade
1.66.3 -> 1.66.4 failed with default containerboot configuration.
This was because containerboot by default runs 'tailscale up'
that requires all previously set flags to be explicitly provided
on subsequent runs and we explicitly set --stateful-filtering
to true on 1.66.3, removed that settingon 1.66.4.

Updates tailscale/tailscale#12307

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Andrew Lytvynov <awly@tailscale.com>
6 months ago
ChandonPierre 0a5bd63d32
ipn/store/kubestore, cmd/containerboot: allow overriding client api server URL via ENV (#12115)
Updates tailscale/tailscale#11397

Signed-off-by: Chandon Pierre <cpierre@coreweave.com>
6 months ago
Anton Tolchanov 32120932a5 cmd/tailscale/cli: print node signature in `tailscale lock status`
- Add current node signature to `ipnstate.NetworkLockStatus`;
- Print current node signature in a human-friendly format as part
  of `tailscale lock status`.

Examples:

```
$ tailscale lock status
Tailnet lock is ENABLED.

This node is accessible under tailnet lock. Node signature:
SigKind: direct
Pubkey: [OTB3a]
KeyID: tlpub:44a0e23cd53a4b8acc02f6732813d8f5ba8b35d02d48bf94c9f1724ebe31c943
WrappingPubkey: tlpub:44a0e23cd53a4b8acc02f6732813d8f5ba8b35d02d48bf94c9f1724ebe31c943

This node's tailnet-lock key: tlpub:44a0e23cd53a4b8acc02f6732813d8f5ba8b35d02d48bf94c9f1724ebe31c943

Trusted signing keys:
	tlpub:44a0e23cd53a4b8acc02f6732813d8f5ba8b35d02d48bf94c9f1724ebe31c943	1	(self)
	tlpub:6fa21d242a202b290de85926ba3893a6861888679a73bc3a43f49539d67c9764	1	(pre-auth key kq3NzejWoS11KTM59)
```

For a node created via a signed auth key:

```
This node is accessible under tailnet lock. Node signature:
SigKind: rotation
Pubkey: [e3nAO]
Nested:
  SigKind: credential
  KeyID: tlpub:6fa21d242a202b290de85926ba3893a6861888679a73bc3a43f49539d67c9764
  WrappingPubkey: tlpub:3623b0412cab0029cb1918806435709b5947ae03554050f20caf66629f21220a
```

For a node that rotated its key a few times:

```
This node is accessible under tailnet lock. Node signature:
SigKind: rotation
Pubkey: [DOzL4]
Nested:
  SigKind: rotation
  Pubkey: [S/9yU]
  Nested:
    SigKind: rotation
    Pubkey: [9E9v4]
    Nested:
      SigKind: direct
      Pubkey: [3QHTJ]
      KeyID: tlpub:44a0e23cd53a4b8acc02f6732813d8f5ba8b35d02d48bf94c9f1724ebe31c943
      WrappingPubkey: tlpub:2faa280025d3aba0884615f710d8c50590b052c01a004c2b4c2c9434702ae9d0
```

Updates tailscale/corp#19764

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
6 months ago
Brad Fitzpatrick 1ea100e2e5 cmd/tailscaled, ipn/conffile: support ec2 user-data config file
Updates #1412
Updates #1866

Change-Id: I4d08fb233b80c2078b3b28ffc18559baabb4a081
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
signed-long 5ad0dad15e
go generate directives reorder for 'make kube-generate-all' (#12210)
Fixes #11980

Signed-off-by: Michael Long <michaelongdev@gmail.com>
6 months ago
Irbe Krumina d0d33f257f
cmd/k8s-operator: add a note pointing at ProxyClass (#12246)
Updates tailscale/tailscale#12242

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
6 months ago
Maisem Ali 9a64c06a20 all: do not depend on the testing package
Discovered while looking for something else.

Updates tailscale/corp#18935

Signed-off-by: Maisem Ali <maisem@tailscale.com>
6 months ago
Brad Fitzpatrick 3c9be07214 cmd/derper: support TXT-mediated unpublished bootstrap DNS rollouts
Updates tailscale/coral#127

Change-Id: I2712c50630d0d1272c30305fa5a1899a19ffacef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Irbe Krumina 72f0f53ed0
cmd/k8s-operator: fix typo (#12217)
Fixes#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
6 months ago
James Tucker 9351eec3e1 net/netcheck: remove hairpin probes
Palo Alto reported interpreting hairpin probes as LAND attacks, and the
firewalls may be responding to this by shutting down otherwise in use NAT sessions
prematurely. We don't currently make use of the outcome of the hairpin
probes, and they contribute to other user confusion with e.g. the
AirPort Extreme hairpin session workaround. We decided in response to
remove the whole probe feature as a result.

Updates #188
Updates tailscale/corp#19106
Updates tailscale/corp#19116

Signed-off-by: James Tucker <james@tailscale.com>
6 months ago
Andrew Lytvynov c9179bc261
various: disable stateful filtering by default (#12197)
After some analysis, stateful filtering is only necessary in tailnets
that use `autogroup:danger-all` in `src` in ACLs. And in those cases
users explicitly specify that hosts outside of the tailnet should be
able to reach their nodes. To fix local DNS breakage in containers, we
disable stateful filtering by default.

Updates #12108

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
6 months ago
Brad Fitzpatrick 964282d34f ipn,wgengine: remove vestigial Prefs.AllowSingleHosts
It was requested by the first customer 4-5 years ago and only used
for a brief moment of time. We later added netmap visibility trimming
which removes the need for this.

It's been hidden by the CLI for quite some time and never documented
anywhere else.

This keeps the CLI flag, though, out of caution. It just returns an
error if it's set to anything but true (its default).

Fixes #12058

Change-Id: I7514ba572e7b82519b04ed603ff9f3bdbaecfda7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Jordan Whited adb7a86559
cmd/stunc: support ipv6 address targets (#12166)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
6 months ago
James Tucker 8d1249550a net/netcheck,wgengine/magicsock: add potential workaround for Palo Alto DIPP misbehavior
Palo Alto firewalls have a typically hard NAT, but also have a mode
called Persistent DIPP that is supposed to provide consistent port
mapping suitable for STUN resolution of public ports. Persistent DIPP
works initially on most Palo Alto firewalls, but some models/software
versions have a bug which this works around.

The bug symptom presents as follows:

- STUN sessions resolve a consistent public IP:port to start with
- Much later netchecks report the same IP:Port for a subset of
  sessions, most often the users active DERP, and/or the port related
  to sustained traffic.
- The broader set of DERPs in a full netcheck will now consistently
  observe a new IP:Port.
- After this point of observation, new inbound connections will only
  succeed to the new IP:Port observed, and existing/old sessions will
  only work to the old binding.

In this patch we now advertise the lowest latency global endpoint
discovered as we always have, but in addition any global endpoints that
are observed more than once in a single netcheck report. This should
provide viable endpoints for potential connection establishment across
a NAT with this behavior.

Updates tailscale/corp#19106

Signed-off-by: James Tucker <james@tailscale.com>
6 months ago
Andrea Gottardo e5f67f90a2
xcode: allow ICMP ping relay on macOS + iOS platforms (#12048)
Fixes tailscale/tailscale#10393
Fixes tailscale/corp#15412
Fixes tailscale/corp#19808

On Apple platforms, exit nodes and subnet routers have been unable to relay pings from Tailscale devices to non-Tailscale devices due to sandbox restrictions imposed on our network extensions by Apple. The sandbox prevented the code in netstack.go from spawning the `ping` process which we were using.

Replace that exec call with logic to send an ICMP echo request directly, which appears to work in userspace, and not trigger a sandbox violation in the syslog.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
6 months ago
Irbe Krumina 76c30e014d
cmd/containerboot: warn when an ingress proxy with an IPv4 tailnet address is being created for an IPv6 backend(s) (#12159)
Updates tailscale/tailscale#12156

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
6 months ago
Maisem Ali 486a423716 tsnet: split user facing and backend logging
This adds a new `UserLogf` field to the `Server` struct.
When set this any logs generated by Server are logged using
`UserLogf` and all spammy backend logs are logged to `Logf`.

If it `UserLogf` is unset, we default to `log.Printf` and
if `Logf` is unset we discard all the spammy logs.

Fixes #12094

Signed-off-by: Maisem Ali <maisem@tailscale.com>
7 months ago
Irbe Krumina d86d1e7601
cmd/k8s-operator,cmd/containerboot,ipn,k8s-operator: turn off stateful filter for egress proxies. (#12075)
Turn off stateful filtering for egress proxies to allow cluster
traffic to be forwarded to tailnet.

Allow configuring stateful filter via tailscaled config file.

Deprecate EXPERIMENTAL_TS_CONFIGFILE_PATH env var and introduce a new
TS_EXPERIMENTAL_VERSIONED_CONFIG env var that can be used to provide
containerboot a directory that should contain one or more
tailscaled config files named cap-<tailscaled-cap-version>.hujson.
Containerboot will pick the one with the newest capability version
that is not newer than its current capability version.

Proxies with this change will not work with older Tailscale
Kubernetes operator versions - users must ensure that
the deployed operator is at the same version or newer (up to
4 version skew) than the proxies.

Updates tailscale/tailscale#12061

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
7 months ago
Maisem Ali 21abb7f402 cmd/tailscale: add missing set flags for linux
We were missing `snat-subnet-routes`, `stateful-filtering`
and `netfilter-mode`. Add those to set too.

Fixes #12061

Signed-off-by: Maisem Ali <maisem@tailscale.com>
7 months ago
Irbe Krumina b5dbf155b1
cmd/k8s-operator: default nameserver image to tailscale/k8s-nameserver:unstable (#11991)
We are now publishing nameserver images to tailscale/k8s-nameserver,
so we can start defaulting the images if users haven't set
them explicitly, same as we already do with proxy images.

The nameserver images are currently only published for unstable
track, so we have to use the static 'unstable' tag.
Once we start publishing to stable, we can make the operator
default to its own tag (because then we'll know that for each
operator tag X there is also a nameserver tag X as we always
cut all images for a given tag.

Updates tailscale/tailscale#10499

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
7 months ago
Brad Fitzpatrick e968b0ecd7 cmd/tailscale,controlclient,ipnlocal: fix 'up', deflake tests more
The CLI's "up" is kinda chaotic and LocalBackend.Start is kinda
chaotic and they both need to be redone/deleted (respectively), but
this fixes some buggy behavior meanwhile. We were previously calling
StartLoginInteractive (to start the controlclient's RegisterRequest)
redundantly in some cases, causing test flakes depending on timing and
up's weird state machine.

We only need to call StartLoginInteractive in the client if Start itself
doesn't. But Start doesn't tell us that. So cheat a bit and a put the
information about whether there's a current NodeKey in the ipn.Status.
It used to be accessible over LocalAPI via GetPrefs as a private key but
we removed that for security. But a bool is fine.

So then only call StartLoginInteractive if that bool is false and don't
do it in the WatchIPNBus loop.

Fixes #12028
Updates #12042

Change-Id: I0923c3f704a9d6afd825a858eb9a63ca7c1df294
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick 727c0d6cfd ipn/ipnserver: close a small race in ipnserver, ~simplify code
There was a small window in ipnserver after we assigned a LocalBackend
to the ipnserver's atomic but before we Start'ed it where our
initalization Start could conflict with API calls from the LocalAPI.

Simplify that a bit and lay out the rules in the docs.

Updates #12028

Change-Id: Ic5f5e4861e26340599184e20e308e709edec68b1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Andrew Lytvynov 471731771c
ipn/ipnlocal: set default NoStatefulFiltering in ipn.NewPrefs (#12031)
This way the default gets populated on first start, when no existing
state exists to migrate. Also fix `ipn.PrefsFromBytes` to preserve empty
fields, rather than layering `NewPrefs` values on top.

Updates https://github.com/tailscale/corp/issues/19623

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
7 months ago
Paul Scott 78fa698fe6 cmd/tailscale/cli/ffcomplete: remove fullstop from ShortHelp
Updates #cleanup

Signed-off-by: Paul Scott <paul@tailscale.com>
7 months ago
Andrew Lytvynov c28f5767bf
various: implement stateful firewalling on Linux (#12025)
Updates https://github.com/tailscale/corp/issues/19623


Change-Id: I7980e1fb736e234e66fa000d488066466c96ec85

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Co-authored-by: Andrew Dunham <andrew@du.nham.ca>
7 months ago
Brad Fitzpatrick f3d2fd22ef cmd/tailscale/cli: don't start WatchIPNBus until after up's initial Start
The CLI "up" command is a historical mess, both on the CLI side and
the LocalBackend side. We're getting closer to cleaning it up, but in
the meantime it was again implicated in flaky tests.

In this case, the background goroutine running WatchIPNBus was very
occasionally running enough to get to its StartLoginInteractive call
before the original goroutine did its Start call. That meant
integration tests were very rarely but sometimes logging in with the
default control plane URL out on the internet
(controlplane.tailscale.com) instead of the localhost control server
for tests.

This also might've affected new Headscale etc users on initial "up".

Fixes #11960
Fixes #11962

Change-Id: I36f8817b69267a99271b5ee78cb7dbf0fcc0bd34
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Nick Khyl caa3d7594f ipn/ipnlocal, net/tsdial: plumb routes into tsdial and use them in UserDial
We'd like to use tsdial.Dialer.UserDial instead of SystemDial for DNS over TCP.
This is primarily necessary to properly dial internal DNS servers accessible
over Tailscale and subnet routes. However, to avoid issues when switching
between Wi-Fi and cellular, we need to ensure that we don't retain connections
to any external addresses on the old interface. Therefore, we need to determine
which dialer to use internally based on the configured routes.

This plumbs routes and localRoutes from router.Config to tsdial.Dialer,
and updates UserDial to use either the peer dialer or the system dialer,
depending on the network address and the configured routes.

Updates tailscale/corp#18725
Fixes #4529

Signed-off-by: Nick Khyl <nickk@tailscale.com>
7 months ago
Brad Fitzpatrick c3c18027c6 all: make more tests pass/skip in airplane mode
Updates tailscale/corp#19786

Change-Id: Iedc6730fe91c627b556bff5325bdbaf7bf79d8e6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Irbe Krumina 406293682c
cmd/k8s-operator: cleanup runReconciler signature (#11993)
Updates#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
7 months ago
Brad Fitzpatrick 15fc6cd966 derp/derphttp: fix netcheck HTTPS probes
The netcheck client, when no UDP is available, probes distance using
HTTPS.

Several problems:

* It probes using /derp/latency-check.
* But cmd/derper serves the handler at /derp/probe
* Despite the difference, it work by accident until c8f4dfc8c0
  which made netcheck's probe require a 2xx status code.
* in tests, we only use derphttp.Handler, so the cmd/derper-installed
  mux routes aren't preesnt, so there's no probe. That breaks
  tests in airplane mode. netcheck.Client then reports "unexpected
  HTTP status 426" (Upgrade Required)

This makes derp handle both /derp/probe and /derp/latency-check
equivalently, and in both cmd/derper and derphttp.Handler standalone
modes.

I notice this when wgengine/magicsock TestActiveDiscovery was failing
in airplane mode (no wifi). It still doesn't pass, but it gets
further.

Fixes #11989

Change-Id: I45213d4bd137e0f29aac8bd4a9ac92091065113f
7 months ago
Brad Fitzpatrick 1fe0983f2d cmd/derper,tstest/nettest: skip network-needing test in airplane mode
Not buying wifi on a short flight is a good way to find tests
that require network. Whoops.

Updates #cleanup

Change-Id: Ibe678e9c755d27269ad7206413ffe9971f07d298
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Irbe Krumina cd633a7252
cmd/k8s-operator/deploy,k8s-operator: document that metrics are unstable (#11979)
Updates#11292

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
7 months ago
Irbe Krumina 19b31ac9a6
cmd/{k8s-operator,k8s-nameserver},k8s-operator: update nameserver config with records for ingress/egress proxies (#11019)
cmd/k8s-operator: optionally update dnsrecords Configmap with DNS records for proxies.

This commit adds functionality to automatically populate
DNS records for the in-cluster ts.net nameserver
to allow cluster workloads to resolve MagicDNS names
associated with operator's proxies.

The records are created as follows:
* For tailscale Ingress proxies there will be
a record mapping the MagicDNS name of the Ingress
device and each proxy Pod's IP address.
* For cluster egress proxies, configured via
tailscale.com/tailnet-fqdn annotation, there will be
a record for each proxy Pod, mapping
the MagicDNS name of the exposed
tailnet workload to the proxy Pod's IP.

No records will be created for any other proxy types.
Records will only be created if users have configured
the operator to deploy an in-cluster ts.net nameserver
by applying tailscale.com/v1alpha1.DNSConfig.

It is user's responsibility to add the ts.net nameserver
as a stub nameserver for ts.net DNS names.
https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/#configuration-of-stub-domain-and-upstream-nameserver-using-coredns
https://cloud.google.com/kubernetes-engine/docs/how-to/kube-dns#upstream_nameservers

See also https://github.com/tailscale/tailscale/pull/11017

Updates tailscale/tailscale#10499

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
7 months ago
Paul Scott 4c08410011 cmd/tailscale/cli: set localClient.UseSocketOnly during flag parsing
This configures localClient correctly during flag parsing, so that the --socket
option is effective when generating tab-completion results. For example, the
following would not connect to the system Tailscale for tab-completion results:

    tailscale --socket=/tmp/tailscaled.socket switch <TAB>

Updates #3793

Signed-off-by: Paul Scott <paul@tailscale.com>
7 months ago
Paul Scott ba34943133 cmd/tailscale/cli/ffcomplete: omit and clean completion results
Updates #3793

Signed-off-by: Paul Scott <paul@tailscale.com>
7 months ago
Gabe Gorelick de85610be0
cmd/k8s-operator/deploy/chart: allow users to configure additional labels for the operator's Pod via Helm chart values.
cmd/k8s-operator/deploy/chart: allow users to configure additional labels for the operator's Pod via Helm chart values.

Fixes #11947

Signed-off-by: Gabe Gorelick <gabe@hightouch.io>
7 months ago
Irbe Krumina 44aa809cb0
cmd/{k8s-nameserver,k8s-operator},k8s-operator: add a kube nameserver, make operator deploy it (#11919)
* cmd/k8s-nameserver,k8s-operator: add a nameserver that can resolve ts.net DNS names in cluster.

Adds a simple nameserver that can respond to A record queries for ts.net DNS names.
It can respond to queries from in-memory records, populated from a ConfigMap
mounted at /config. It dynamically updates its records as the ConfigMap
contents changes.
It will respond with NXDOMAIN to queries for any other record types
(AAAA to be implemented in the future).
It can respond to queries over UDP or TCP. It runs a miekg/dns
DNS server with a single registered handler for ts.net domain names.
Queries for other domain names will be refused.

The intended use of this is:
1) to allow non-tailnet cluster workloads to talk to HTTPS tailnet
services exposed via Tailscale operator egress over HTTPS
2) to allow non-tailnet cluster workloads to talk to workloads in
the same cluster that have been exposed to tailnet over their
MagicDNS names but on their cluster IPs.

DNSConfig CRD can be used to configure
the operator to deploy kube nameserver (./cmd/k8s-nameserver) to cluster.

Updates tailscale/tailscale#10499

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
7 months ago
Irbe Krumina 7d9c3f9897
cmd/k8s-operator/deploy/manifests: check if IPv6 module is loaded before using it (#11867)
Before attempting to enable IPv6 forwarding in the proxy init container
check if the relevant module is found, else the container crashes
on hosts that don't have it.

Updates#11860

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
7 months ago
Andrew Lytvynov ce5c80d0fe
clientupdate: exec systemctl instead of using dbus to restart (#11923)
Shell out to "systemctl", which lets us drop an extra dependency.

Updates https://github.com/tailscale/corp/issues/18935

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
7 months ago
Irbe Krumina 1452faf510
cmd/containerboot,kube,ipn/store/kubestore: allow interactive login on kube, check Secret create perms, allow empty state Secret (#11326)
cmd/containerboot,kube,ipn/store/kubestore: allow interactive login and empty state Secrets, check perms

* Allow users to pre-create empty state Secrets

* Add a fake internal kube client, test functionality that has dependencies on kube client operations.

* Fix an issue where interactive login was not allowed in an edge case where state Secret does not exist

* Make the CheckSecretPermissions method report whether we have permissions to create/patch a Secret if it's determined that these operations will be needed

Updates tailscale/tailscale#11170

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
7 months ago
Brad Fitzpatrick b9adbe2002 net/{interfaces,netmon}, all: merge net/interfaces package into net/netmon
In prep for most of the package funcs in net/interfaces to become
methods in a long-lived netmon.Monitor that can cache things.  (Many
of the funcs are very heavy to call regularly, whereas the long-lived
netmon.Monitor can subscribe to things from the OS and remember
answers to questions it's asked regularly later)

Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299

Change-Id: Ie4e8dedb70136af2d611b990b865a822cd1797e5
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick 6b95219e3a net/netmon, add: add netmon.State type alias of interfaces.State
... in prep for merging the net/interfaces package into net/netmon.

This is a no-op change that updates a bunch of the API signatures ahead of
a future change to actually move things (and remove the type alias)

Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299

Change-Id: I477613388f09389214db0d77ccf24a65bff2199c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Irbe Krumina 45f0721530
cmd/containerboot: wait on tailscaled process only (#11897)
Modifies containerboot to wait on tailscaled process
only, not on any child process of containerboot.
Waiting on any subprocess was racing with Go's
exec.Cmd.Run, used to run iptables commands and
that starts its own subprocesses and waits on them.

Containerboot itself does not run anything else
except for tailscaled, so there shouldn't be a need
to wait on anything else.

Updates tailscale/tailscale#11593

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
7 months ago
Brad Fitzpatrick 3672f29a4e net/netns, net/dns/resolver, etc: make netmon required in most places
The goal is to move more network state accessors to netmon.Monitor
where they can be cheaper/cached. But first (this change and others)
we need to make sure the one netmon.Monitor is plumbed everywhere.

Some notable bits:

* tsdial.NewDialer is added, taking a now-required netmon

* because a tsdial.Dialer always has a netmon, anything taking both
  a Dialer and a NetMon is now redundant; take only the Dialer and
  get the NetMon from that if/when needed.

* netmon.NewStatic is added, primarily for tests

Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299

Change-Id: I877f9cb87618c4eb037cee098241d18da9c01691
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick 7a62dddeac net/netcheck, wgengine/magicsock: make netmon.Monitor required
This has been a TODO for ages. Time to do it.

The goal is to move more network state accessors to netmon.Monitor
where they can be cheaper/cached.

Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299

Change-Id: I60fc6508cd2d8d079260bda371fc08b6318bcaf1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick 745931415c health, all: remove health.Global, finish plumbing health.Tracker
Updates #11874
Updates #4136

Change-Id: I414470f71d90be9889d44c3afd53956d9f26cd61
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick 6d69fc137f ipn/{ipnlocal,localapi},wgengine{,/magicsock}: plumb health.Tracker
Down to 25 health.Global users. After this remains controlclient &
net/dns & wgengine/router.

Updates #11874
Updates #4136

Change-Id: I6dd1856e3d9bf523bdd44b60fb3b8f7501d5dc0d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Irbe Krumina df8f40905b
cmd/k8s-operator,k8s-operator: optionally serve tailscaled metrics on Pod IP (#11699)
Adds a new .spec.metrics field to ProxyClass to allow users to optionally serve
client metrics (tailscaled --debug) on <Pod-IP>:9001.
Metrics cannot currently be enabled for proxies that egress traffic to tailnet
and for Ingress proxies with tailscale.com/experimental-forward-cluster-traffic-via-ingress annotation
(because they currently forward all cluster traffic to their respective backends).

The assumption is that users will want to have these metrics enabled
continuously to be able to monitor proxy behaviour (as opposed to enabling
them temporarily for debugging). Hence we expose them on Pod IP to make it
easier to consume them i.e via Prometheus PodMonitor.

Updates tailscale/tailscale#11292

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
7 months ago
Brad Fitzpatrick 723c775dbb tsd, ipnlocal, etc: add tsd.System.HealthTracker, start some plumbing
This adds a health.Tracker to tsd.System, accessible via
a new tsd.System.HealthTracker method.

In the future, that new method will return a tsd.System-specific
HealthTracker, so multiple tsnet.Servers in the same process are
isolated. For now, though, it just always returns the temporary
health.Global value. That permits incremental plumbing over a number
of changes. When the second to last health.Global reference is gone,
then the tsd.System.HealthTracker implementation can return a private
Tracker.

The primary plumbing this does is adding it to LocalBackend and its
dozen and change health calls. A few misc other callers are also
plumbed. Subsequent changes will flesh out other parts of the tree
(magicsock, controlclient, etc).

Updates #11874
Updates #4136

Change-Id: Id51e73cfc8a39110425b6dc19d18b3975eac75ce
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Lee Briggs 14ac41febc
cmd/k8s-operator,k8s-operator: proxyclass affinity (#11862)
add ability to set affinity rules to proxyclass

Updates#11861

Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
7 months ago