Commit Graph

388 Commits (6ac80b7334eb978390c75134a82462d43c78f029)

Author SHA1 Message Date
Percy Wegmann 2e95313b8b ssh,tempfork/gliderlabs/ssh: replace github.com/tailscale/golang-x-crypto/ssh with golang.org/x/crypto/ssh
The upstream crypto package now supports sending banners at any time during
authentication, so the Tailscale fork of crypto/ssh is no longer necessary.

github.com/tailscale/golang-x-crypto is still needed for some custom ACME
autocert functionality.

tempfork/gliderlabs is still necessary because of a few other customizations,
mostly related to TTY handling.

Originally implemented in 46fd4e58a2,
which was reverted in b60f6b849a to
keep the change out of v1.80.

Updates #8593

Signed-off-by: Percy Wegmann <percy@tailscale.com>
10 months ago
Irbe Krumina a49af98b31
cmd/k8s-operator: temporarily disable HA Ingress controller (#14833)
The HA Ingress functionality is not actually doing anything
valuable yet, so don't run the controller in 1.80 release yet.

Updates tailscale/tailscale#24795

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
10 months ago
Irbe Krumina 3f39211f98
cmd/k8s-operator: check that cluster traffic is routed to egress ProxyGroup Pod before marking it as ready (#14792)
This change builds on top of #14436 to ensure minimum downtime during egress ProxyGroup update rollouts:

- adds a readiness gate for ProxyGroup replicas that prevents kubelet from marking
the replica Pod as ready before a corresponding readiness condition has been added
to the Pod

- adds a reconciler that reconciles egress ProxyGroup Pods and, for each that is not ready,
if cluster traffic for relevant egress endpoints is routed via this Pod- if so add the
readiness condition to allow kubelet to mark the Pod as ready.

During the sequenced StatefulSet update rollouts kubelet does not restart
a Pod before the previous replica has been updated and marked as ready, so
ensuring that a replica is not marked as ready allows to avoid a temporary
post-update situation where all replicas have been restarted, but none of the
new ones are yet set up as an endpoint for the egress service, so cluster traffic is dropped.

Updates tailscale/tailscale#14326

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
10 months ago
Percy Wegmann b60f6b849a Revert "ssh,tempfork/gliderlabs/ssh: replace github.com/tailscale/golang-x-crypto/ssh with golang.org/x/crypto/ssh"
This reverts commit 46fd4e58a2.

We don't want to include this in 1.80 yet, but can add it back post 1.80.

Updates #8593

Signed-off-by: Percy Wegmann <percy@tailscale.com>
10 months ago
Irbe Krumina 52f88f782a
cmd/k8s-operator: don't set deprecated configfile hash on new proxies (#14817)
Fixes the configfile reload logic- if the tailscale capver can not
yet be determined because the device info is not yet written to the
state Secret, don't assume that the proxy is pre-110.

Updates tailscale/tailscale#13032

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
10 months ago
Irbe Krumina b406f209c3
cmd/{k8s-operator,containerboot},kube: ensure egress ProxyGroup proxies don't terminate while cluster traffic is still routed to them (#14436)
cmd/{containerboot,k8s-operator},kube: add preshutdown hook for egress PG proxies

This change is part of work towards minimizing downtime during update
rollouts of egress ProxyGroup replicas.
This change:
- updates the containerboot health check logic to return Pod IP in headers,
if set
- always runs the health check for egress PG proxies
- updates ClusterIP Services created for PG egress endpoints to include
the health check endpoint
- implements preshutdown endpoint in proxies. The preshutdown endpoint
logic waits till, for all currently configured egress services, the ClusterIP
Service health check endpoint is no longer returned by the shutting-down Pod
(by looking at the new Pod IP header).
- ensures that kubelet is configured to call the preshutdown endpoint

This reduces the possibility that, as replicas are terminated during an update,
a replica gets terminated to which cluster traffic is still being routed via
the ClusterIP Service because kube proxy has not yet updated routig rules.
This is not a perfect check as in practice, it only checks that the kube
proxy on the node on which the proxy runs has updated rules. However, overall
this might be good enough.

The preshutdown logic is disabled if users have configured a custom health check
port via TS_LOCAL_ADDR_PORT env var. This change throws a warnign if so and in
future setting of that env var for operator proxies might be disallowed (as users
shouldn't need to configure this for a Pod directly).
This is backwards compatible with earlier proxy versions.

Updates tailscale/tailscale#14326


Signed-off-by: Irbe Krumina <irbe@tailscale.com>
10 months ago
Percy Wegmann 46fd4e58a2 ssh,tempfork/gliderlabs/ssh: replace github.com/tailscale/golang-x-crypto/ssh with golang.org/x/crypto/ssh
The upstream crypto package now supports sending banners at any time during
authentication, so the Tailscale fork of crypto/ssh is no longer necessary.

github.com/tailscale/golang-x-crypto is still needed for some custom ACME
autocert functionality.

tempfork/gliderlabs is still necessary because of a few other customizations,
mostly related to TTY handling.

Updates #8593

Signed-off-by: Percy Wegmann <percy@tailscale.com>
10 months ago
Brad Fitzpatrick 2691b9f6be tempfork/acme: add new package for x/crypto package acme fork, move
We've been maintaining temporary dev forks of golang.org/x/crypto/{acme,ssh}
in https://github.com/tailscale/golang-x-crypto instead of using
this repo's tempfork directory as we do with other packages. The reason we were
doing that was because x/crypto/ssh depended on x/crypto/ssh/internal/poly1305
and I hadn't noticed there are forwarding wrappers already available
in x/crypto/poly1305. It also depended internal/bcrypt_pbkdf but we don't use that
so it's easy to just delete that calling code in our tempfork/ssh.

Now that our SSH changes have been upstreamed, we can soon unfork from SSH.

That leaves ACME remaining.

This change copies our tailscale/golang-x-crypto/acme code to
tempfork/acme but adds a test that our vendored copied still matches
our tailscale/golang-x-crypto repo, where we can continue to do
development work and rebases with upstream. A comment on the new test
describes the expected workflow.

While we could continue to just import & use
tailscale/golang-x-crypto/acme, it seems a bit nicer to not have that
entire-fork-of-x-crypto visible at all in our transitive deps and the
questions that invites. Showing just a fork of an ACME client is much
less scary. It does add a step to the process of hacking on the ACME
client code, but we do that approximately never anyway, and the extra
step is very incremental compared to the existing tedious steps.

Updates #8593
Updates #10238

Change-Id: I8af4378c04c1f82e63d31bf4d16dba9f510f9199
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
10 months ago
Brad Fitzpatrick bce05ec6c3 control/controlclient,tempfork/httprec: don't link httptest, test certs for c2n
The c2n handling code was using the Go httptest package's
ResponseRecorder code but that's in a test package which brings in
Go's test certs, etc.

This forks the httptest recorder type into its own package that only
has the recorder and adds a test that we don't re-introduce a
dependency on httptest.

Updates #12614

Change-Id: I3546f49972981e21813ece9064cc2be0b74f4b16
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
10 months ago
Brad Fitzpatrick 8c925899e1 go.mod: bump depaware, add --internal flag to stop hiding internal packages
The hiding of internal packages has hidden things I wanted to see a
few times now. Stop hiding them. This makes depaware.txt output a bit
longer, but not too much. Plus we only really look at it with diffs &
greps anyway; it's not like anybody reads the whole thing.

Updates #12614

Change-Id: I868c89eeeddcaaab63e82371651003629bc9bda8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
10 months ago
Brad Fitzpatrick 68a66ee81b feature/capture: move packet capture to feature/*, out of iOS + CLI
We had the debug packet capture code + Lua dissector in the CLI + the
iOS app. Now we don't, with tests to lock it in.

As a bonus, tailscale.com/net/packet and tailscale.com/net/flowtrack
no longer appear in the CLI's binary either.

A new build tag ts_omit_capture disables the packet capture code and
was added to build_dist.sh's --extra-small mode.

Updates #12614

Change-Id: I79b0628c0d59911bd4d510c732284d97b0160f10
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
10 months ago
Brad Fitzpatrick d6abbc2e61 net/tstun: move TAP support out to separate package feature/tap
Still behind the same ts_omit_tap build tag.

See #14738 for background on the pattern.

Updates #12614

Change-Id: I03fb3d2bf137111e727415bd8e713d8568156ecc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
10 months ago
Tom Proctor 3033a96b02
cmd/k8s-operator: fix reconciler name clash (#14712)
The new ProxyGroup-based Ingress reconciler is causing a fatal log at
startup because it has the same name as the existing Ingress reconciler.
Explicitly name both to ensure they have unique names that are consistent
with other explicitly named reconcilers.

Updates #14583

Change-Id: Ie76e3eaf3a96b1cec3d3615ea254a847447372ea
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
10 months ago
Brad Fitzpatrick 1562a6f2f2 feature/*: make Wake-on-LAN conditional, start supporting modular features
This pulls out the Wake-on-LAN (WoL) code out into its own package
(feature/wakeonlan) that registers itself with various new hooks
around tailscaled.

Then a new build tag (ts_omit_wakeonlan) causes the package to not
even be linked in the binary.

Ohter new packages include:

   * feature: to just record which features are loaded. Future:
     dependencies between features.
   * feature/condregister: the package with all the build tags
     that tailscaled, tsnet, and the Tailscale Xcode project
     extension can empty (underscore) import to load features
     as a function of the defined build tags.

Future commits will move of our "ts_omit_foo" build tags into this
style.

Updates #12614

Change-Id: I9c5378dafb1113b62b816aabef02714db3fc9c4a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
11 months ago
Adrian Dewhurst 0fa7b4a236 tailcfg: add ServiceName
Rather than using a string everywhere and needing to clarify that the
string should have the svc: prefix, create a separate type for Service
names.

Updates tailscale/corp#24607

Change-Id: I720e022f61a7221644bb60955b72cacf42f59960
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
11 months ago
Brad Fitzpatrick 150cd30b1d ipn/ipnlocal: also use LetsEncrypt-baked-in roots for cert validation
We previously baked in the LetsEncrypt x509 root CA for our tlsdial
package.

This moves that out into a new "bakedroots" package and is now also
shared by ipn/ipnlocal's cert validation code (validCertPEM) that
decides whether it's time to fetch a new cert.

Otherwise, a machine without LetsEncrypt roots locally in its system
roots is unable to use tailscale cert/serve and fetch certs.

Fixes #14690

Change-Id: Ic88b3bdaabe25d56b9ff07ada56a27e3f11d7159
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
11 months ago
Irbe Krumina 817ba1c300
cmd/{k8s-operator,containerboot},kube/kubetypes: parse Ingresses for ingress ProxyGroup (#14583)
cmd/k8s-operator: add logic to parse L7 Ingresses in HA mode

- Wrap the Tailscale API client used by the Kubernetes Operator
into a client that knows how to manage VIPServices.
- Create/Delete VIPServices and update serve config for L7 Ingresses
for ProxyGroup.
- Ensure that ingress ProxyGroup proxies mount serve config from a shared ConfigMap.

Updates tailscale/corp#24795


Signed-off-by: Irbe Krumina <irbe@tailscale.com>
11 months ago
Irbe Krumina 97a44d6453
go.{mod,sum},cmd/{k8s-operator,derper,stund}/depaware.txt: bump kube deps (#14601)
Updates kube deps and mkctr, regenerates kube yamls with the updated tooling.

Updates#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
11 months ago
Tom Proctor 2d1f6f18cc
cmd/k8s-operator: require namespace config (#14648)
Most users should not run into this because it's set in the helm chart
and the deploy manifest, but if namespace is not set we get confusing
authz errors because the kube client tries to fetch some namespaced resources
as though they're cluster-scoped and reports permission denied. Try to
detect namespace from the default projected volume, and otherwise fatal.

Fixes #cleanup

Change-Id: I64b34191e440b61204b9ad30bbfa117abbbe09c3

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
11 months ago
Aaron Klotz fcf90260ce atomicfile: use ReplaceFile on Windows so that attributes and ACLs are preserved
I moved the actual rename into separate, GOOS-specific files. On
non-Windows, we do a simple os.Rename. On Windows, we first try
ReplaceFile with a fallback to os.Rename if the target file does
not exist.

ReplaceFile is the recommended way to rename the file in this use case,
as it preserves attributes and ACLs set on the target file.

Updates #14428

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
11 months ago
Brad Fitzpatrick 414a01126a go.mod: bump mdlayher/netlink and u-root/uio to use Go 1.21 NativeEndian
This finishes the work started in #14616.

Updates #8632

Change-Id: I4dc07d45b1e00c3db32217c03b21b8b1ec19e782
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
11 months ago
Brad Fitzpatrick 69b90742fe util/uniq,types/lazy,*: delete code that's now in Go std
sync.OnceValue and slices.Compact were both added in Go 1.21.

cmp.Or was added in Go 1.22.

Updates #8632
Updates #11058

Change-Id: I89ba4c404f40188e1f8a9566c8aaa049be377754
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
11 months ago
Irbe Krumina 48a95c422a
cmd/containerboot,cmd/k8s-operator: reload tailscaled config (#14342)
cmd/{k8s-operator,containerboot}: reload tailscaled configfile when its contents have changed

Instead of restarting the Kubernetes Operator proxies each time
tailscaled config has changed, this dynamically reloads the configfile
using the new reload endpoint.
Older annotation based mechanism will be supported till 1.84
to ensure that proxy versions prior to 1.80 keep working with
operator 1.80 and newer.

Updates tailscale/tailscale#13032
Updates tailscale/corp#24795

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
11 months ago
Irbe Krumina 68997e0dfa
cmd/k8s-operator,k8s-operator: allow users to set custom labels for the optional ServiceMonitor (#14475)
* cmd/k8s-operator,k8s-operator: allow users to set custom labels for the optional ServiceMonitor

Updates tailscale/tailscale#14381

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
11 months ago
Irbe Krumina 8d4ca13cf8
cmd/k8s-operator,k8s-operator: support ingress ProxyGroup type (#14548)
Currently this does not yet do anything apart from creating
the ProxyGroup resources like StatefulSet.

Updates tailscale/corp#24795

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
11 months ago
Will Norris 60daa2adb8 all: fix golangci-lint errors
These erroneously blocked a recent PR, which I fixed by simply
re-running CI. But we might as well fix them anyway.
These are mostly `printf` to `print` and a couple of `!=` to `!Equal()`

Updates #cleanup

Signed-off-by: Will Norris <will@tailscale.com>
11 months ago
Tom Proctor 3adad364f1
cmd/k8s-operator,k8s-operator: include top-level CRD descriptions (#14435)
When reading https://doc.crds.dev/github.com/tailscale/tailscale/tailscale.com/ProxyGroup/v1alpha1@v1.78.3
I noticed there is no top-level description for ProxyGroup and Recorder. Add
one to give some high-level direction.

Updates #cleanup

Change-Id: I3666c5445be272ea5a1d4d02b6d5ad4c23afb09f

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
12 months ago
Irbe Krumina cc168d9f6b
cmd/k8s-operator: fix ProxyGroup hostname (#14336)
Updates tailscale/tailscale#14325

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
12 months ago
James Tucker aa04f61d5e net/netcheck: adjust HTTPS latency check to connection time and avoid data race
The go-httpstat package has a data race when used with connections that
are performing happy-eyeballs connection setups as we are in the DERP
client. There is a long-stale PR upstream to address this, however
revisiting the purpose of this code suggests we don't really need
httpstat here.

The code populates a latency table that may be used to compare to STUN
latency, which is a lightweight RTT check. Switching out the reported
timing here to simply the request HTTP request RTT avoids the
problematic package.

Fixes tailscale/corp#25095

Signed-off-by: James Tucker <james@tailscale.com>
12 months ago
Tom Proctor f1ccdcc713
cmd/k8s-operator,k8s-operator: operator integration tests (#12792)
This is the start of an integration/e2e test suite for the tailscale operator.
It currently only tests two major features, ingress proxy and API server proxy,
but we intend to expand it to cover more features over time. It also only
supports manual runs for now. We intend to integrate it into CI checks in a
separate update when we have planned how to securely provide CI with the secrets
required for connecting to a test tailnet.

Updates #12622

Change-Id: I31e464bb49719348b62a563790f2bc2ba165a11b
Co-authored-by: Irbe Krumina <irbe@tailscale.com>
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
12 months ago
Tom Proctor df94a14870
cmd/k8s-operator: don't error for transient failures (#14073)
Every so often, the ProxyGroup and other controllers lose an optimistic locking race
with other controllers that update the objects they create. Stop treating
this as an error event, and instead just log an info level log line for it.

Fixes #14072

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Irbe Krumina 2aac916888
cmd/{containerboot,k8s-operator},kube/kubetypes: kube Ingress L7 proxies only advertise HTTPS endpoint when ready (#14171)
cmd/containerboot,kube/kubetypes,cmd/k8s-operator: detect if Ingress is created in a tailnet that has no HTTPS

This attempts to make Kubernetes Operator L7 Ingress setup failures more explicit:
- the Ingress resource now only advertises HTTPS endpoint via status.ingress.loadBalancer.hostname when/if the proxy has succesfully loaded serve config
- the proxy attempts to catch cases where HTTPS is disabled for the tailnet and logs a warning

Updates tailscale/tailscale#12079
Updates tailscale/tailscale#10407

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina aa43388363
cmd/k8s-operator: fix a bunch of status equality checks (#14270)
Updates tailscale/tailscale#14269

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Oliver Rahner cbf1a4efe9
cmd/k8s-operator/deploy/chart: allow reading OAuth creds from a CSI driver's volume and annotating operator's Service account (#14264)
cmd/k8s-operator/deploy/chart: allow reading OAuth creds from a CSI driver's volume and annotating operator's Service account

Updates #14264

Signed-off-by: Oliver Rahner <o.rahner@dke-data.com>
1 year ago
Tom Proctor efdfd54797
cmd/k8s-operator: avoid port collision with metrics endpoint (#14185)
When the operator enables metrics on a proxy, it uses the port 9001,
and in the near future it will start using 9002 for the debug endpoint
as well. Make sure we don't choose ports from a range that includes
9001 so that we never clash. Setting TS_SOCKS5_SERVER, TS_HEALTHCHECK_ADDR_PORT,
TS_OUTBOUND_HTTP_PROXY_LISTEN, and PORT could also open arbitrary ports,
so we will need to document that users should not choose ports from the
10000-11000 range for those settings.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Irbe Krumina 9f9063e624
cmd/k8s-operator,k8s-operator,go.mod: optionally create ServiceMonitor (#14248)
* cmd/k8s-operator,k8s-operator,go.mod: optionally create ServiceMonitor

Adds a new spec.metrics.serviceMonitor field to ProxyClass.
If that's set to true (and metrics are enabled), the operator
will create a Prometheus ServiceMonitor for each proxy to which
the ProxyClass applies.
Additionally, create a metrics Service for each proxy that has
metrics enabled.

Updates tailscale/tailscale#11292

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina eabb424275
cmd/k8s-operator,docs/k8s: run tun mode proxies in privileged containers (#14262)
We were previously relying on unintended behaviour by runc where
all containers where by default given read/write/mknod permissions
for tun devices.
This behaviour was removed in https://github.com/opencontainers/runc/pull/3468
and released in runc 1.2.
Containerd container runtime, used by Docker and majority of Kubernetes distributions
bumped runc to 1.2 in 1.7.24 https://github.com/containerd/containerd/releases/tag/v1.7.24
thus breaking our reference tun mode Tailscale Kubernetes manifests and Kubernetes
operator proxies.

This PR changes the all Kubernetes container configs that run Tailscale in tun mode
to privileged. This should not be a breaking change because all these containers would
run in a Pod that already has a privileged init container.

Updates tailscale/tailscale#14256
Updates tailscale/tailscale#10814

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Tom Proctor 24095e4897
cmd/containerboot: serve health on local endpoint (#14246)
* cmd/containerboot: serve health on local endpoint

We introduced stable (user) metrics in #14035, and `TS_LOCAL_ADDR_PORT`
with it. Rather than requiring users to specify a new addr/port
combination for each new local endpoint they want the container to
serve, this combines the health check endpoint onto the local addr/port
used by metrics if `TS_ENABLE_HEALTH_CHECK` is used instead of
`TS_HEALTHCHECK_ADDR_PORT`.

`TS_LOCAL_ADDR_PORT` now defaults to binding to all interfaces on 9002
so that it works more seamlessly and with less configuration in
environments other than Kubernetes, where the operator always overrides
the default anyway. In particular, listening on localhost would not be
accessible from outside the container, and many scripted container
environments do not know the IP address of the container before it's
started. Listening on all interfaces allows users to just set one env
var (`TS_ENABLE_METRICS` or `TS_ENABLE_HEALTH_CHECK`) to get a fully
functioning local endpoint they can query from outside the container.

Updates #14035, #12898

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Irbe Krumina 13faa64c14
cmd/k8s-operator: always set stateful filtering to false (#14216)
Updates tailscale/tailscale#12108

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina f8587e321e
cmd/k8s-operator: fix port name change bug for egress ProxyGroup proxies (#14247)
Ensure that the ExternalName Service port names are always synced to the
ClusterIP Service, to fix a bug where if users created a Service with
a single unnamed port and later changed to 1+ named ports, the operator
attempted to apply an invalid multi-port Service with an unnamed port.
Also, fixes a small internal issue where not-yet Service status conditons
were lost on a spec update.

Updates tailscale/tailscale#10102

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Tom Proctor 74d4652144
cmd/{containerboot,k8s-operator},k8s-operator: new options to expose user metrics (#14035)
containerboot:

Adds 3 new environment variables for containerboot, `TS_LOCAL_ADDR_PORT` (default
`"${POD_IP}:9002"`), `TS_METRICS_ENABLED` (default `false`), and `TS_DEBUG_ADDR_PORT`
(default `""`), to configure metrics and debug endpoints. In a follow-up PR, the
health check endpoint will be updated to use the `TS_LOCAL_ADDR_PORT` if
`TS_HEALTHCHECK_ADDR_PORT` hasn't been set.

Users previously only had access to internal debug metrics (which are unstable
and not recommended) via passing the `--debug` flag to tailscaled, but can now
set `TS_METRICS_ENABLED=true` to expose the stable metrics documented at
https://tailscale.com/kb/1482/client-metrics at `/metrics` on the addr/port
specified by `TS_LOCAL_ADDR_PORT`.

Users can also now configure a debug endpoint more directly via the
`TS_DEBUG_ADDR_PORT` environment variable. This is not recommended for production
use, but exposes an internal set of debug metrics and pprof endpoints.

operator:

The `ProxyClass` CRD's `.spec.metrics.enable` field now enables serving the
stable user metrics documented at https://tailscale.com/kb/1482/client-metrics
at `/metrics` on the same "metrics" container port that debug metrics were
previously served on. To smooth the transition for anyone relying on the way the
operator previously consumed this field, we also _temporarily_ serve tailscaled's
internal debug metrics on the same `/debug/metrics` path as before, until 1.82.0
when debug metrics will be turned off by default even if `.spec.metrics.enable`
is set. At that point, anyone who wishes to continue using the internal debug
metrics (not recommended) will need to set the new `ProxyClass` field
`.spec.statefulSet.pod.tailscaleContainer.debug.enable`.

Users who wish to opt out of the transitional behaviour, where enabling
`.spec.metrics.enable` also enables debug metrics, can set
`.spec.statefulSet.pod.tailscaleContainer.debug.enable` to false (recommended).

Separately but related, the operator will no longer specify a host port for the
"metrics" container port definition. This caused scheduling conflicts when k8s
needs to schedule more than one proxy per node, and was not necessary for allowing
the pod's port to be exposed to prometheus scrapers.

Updates #11292

---------

Co-authored-by: Kristoffer Dalby <kristoffer@tailscale.com>
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Irbe Krumina c59ab6baac
cmd/k8s-operator/deploy: ensure that operator can write kube state Events (#14177)
A small follow-up to #14112- ensures that the operator itself can emit
Events for its kube state store changes.

Updates tailscale/tailscale#14080

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina ebeb5da202
cmd/k8s-operator,kube/kubeclient,docs/k8s: update rbac to emit events + small fixes (#14164)
This is a follow-up to #14112 where our internal kube client was updated
to allow it to emit Events - this updates our sample kube manifests
and tsrecorder manifest templates so they can benefit from this functionality.

Updates tailscale/tailscale#14080

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
James Stocker 303a4a1dfb
Make the deployment of an IngressClass optional, default to true (#14153)
Fixes tailscale/tailscale#14152
Signed-off-by: James Stocker jamesrstocker@gmail.com

Co-authored-by: James Stocker <james.stocker@intenthq.co.uk>
1 year ago
Irbe Krumina 00517c8189
kube/{kubeapi,kubeclient},ipn/store/kubestore,cmd/{containerboot,k8s-operator}: emit kube store Events (#14112)
Adds functionality to kube client to emit Events.
Updates kube store to emit Events when tailscaled state has been loaded, updated or if any errors where
encountered during those operations.
This should help in cases where an error related to state loading/updating caused the Pod to crash in a loop-
unlike logs of the originally failed container instance, Events associated with the Pod will still be
accessible even after N restarts.

Updates tailscale/tailscale#14080

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina cf41cec5a8
cmd/{k8s-operator,containerboot},k8s-operator: remove support for proxies below capver 95. (#13986)
Updates tailscale/tailscale#13984

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Tom Proctor d8a3683fdf
cmd/k8s-operator: restart ProxyGroup pods less (#14045)
We currently annotate pods with a hash of the tailscaled config so that
we can trigger pod restarts whenever it changes. However, the hash
updates more frequently than is necessary causing more restarts than is
necessary. This commit removes two causes; scaling up/down and removing
the auth key after pods have initially authed to control. However, note
that pods will still restart on scale-up/down because of the updated set
of volumes mounted into each pod. Hopefully we can fix that in a planned
follow-up PR.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Irbe Krumina b9ecc50ce3
cmd/k8s-operator,k8s-operator,kube/kubetypes: add an option to configure app connector via Connector spec (#13950)
* cmd/k8s-operator,k8s-operator,kube/kubetypes: add an option to configure app connector via Connector spec

Updates tailscale/tailscale#11113

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Brad Fitzpatrick 020cacbe70 derp/derphttp: don't link websockets other than on GOOS=js
Or unless the new "ts_debug_websockets" build tag is set.

Updates #1278

Change-Id: Ic4c4f81c1924250efd025b055585faec37a5491d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick c3306bfd15 control/controlhttp/controlhttpserver: split out Accept to its own package
Otherwise all the clients only using control/controlhttp for the
ts2021 HTTP client were also pulling in WebSocket libraries, as the
server side always needs to speak websockets, but only GOOS=js clients
speak it.

This doesn't yet totally remove the websocket dependency on Linux because
Linux has a envknob opt-in to act like GOOS=js for manual testing and force
the use of WebSockets for DERP only (not control). We can put that behind
a build tag in a future change to eliminate the dep on all GOOSes.

Updates #1278

Change-Id: I4f60508f4cad52bf8c8943c8851ecee506b7ebc9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Irbe Krumina 8ba9b558d2
envknob,kube/kubetypes,cmd/k8s-operator: add app type for ProxyGroup (#14029)
Sets a custom hostinfo app type for ProxyGroup replicas, similarly
to how we do it for all other Kubernetes Operator managed components.

Updates tailscale/tailscale#13406,tailscale/corp#22920

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Brad Fitzpatrick 01185e436f types/result, util/lineiter: add package for a result type, use it
This adds a new generic result type (motivated by golang/go#70084) to
try it out, and uses it in the new lineutil package (replacing the old
lineread package), changing that package to return iterators:
sometimes over []byte (when the input is all in memory), but sometimes
iterators over results of []byte, if errors might happen at runtime.

Updates #12912
Updates golang/go#70084

Change-Id: Iacdc1070e661b5fb163907b1e8b07ac7d51d3f83
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Irbe Krumina 809a6eba80
cmd/k8s-operator: allow to optionally configure tailscaled port (#14005)
Updates tailscale/tailscale#13981

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Nick Khyl 3f626c0d77 cmd/tailscale/cli, client/tailscale, ipn/localapi: add tailscale syspolicy {list,reload} commands
In this PR, we add the tailscale syspolicy command with two subcommands: list, which displays
policy settings, and reload, which forces a reload of those settings. We also update the LocalAPI
and LocalClient to facilitate these additions.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Irbe Krumina 1103044598
cmd/k8s-operator,k8s-operator: add topology spread constraints to ProxyClass (#13959)
Now when we have HA for egress proxies, it makes sense to support topology
spread constraints that would allow users to define more complex
topologies of how proxy Pods need to be deployed in relation with other
Pods/across regions etc.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Nick Kirby 6ab39b7bcd
cmd/k8s-operator: validate that tailscale.com/tailnet-ip annotation value is a valid IP
Fixes #13836
Signed-off-by: Nick Kirby <nrkirb@gmail.com>
1 year ago
Nick Khyl e815ae0ec4 util/syspolicy, ipn/ipnlocal: update syspolicy package to utilize syspolicy/rsop
In this PR, we update the syspolicy package to utilize syspolicy/rsop under the hood,
and remove syspolicy.CachingHandler, syspolicy.windowsHandler and related code
which is no longer used.

We mark the syspolicy.Handler interface and RegisterHandler/SetHandlerForTest functions
as deprecated, but keep them temporarily until they are no longer used in other repos.

We also update the package to register setting definitions for all existing policy settings
and to register the Registry-based, Windows-specific policy stores when running on Windows.

Finally, we update existing internal and external tests to use the new API and add a few more
tests and benchmarks.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Maisem Ali d4d21a0bbf net/tstun: restore tap mode functionality
It had bit-rotted likely during the transition to vector io in
76389d8baf. Tested on Ubuntu 24.04
by creating a netns and doing the DHCP dance to get an IP.

Updates #2589

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Andrea Gottardo fd77965f23
net/tlsdial: call out firewalls blocking Tailscale in health warnings (#13840)
Updates tailscale/tailscale#13839

Adds a new blockblame package which can detect common MITM SSL certificates used by network appliances. We use this in `tlsdial` to display a dedicated health warning when we cannot connect to control, and a network appliance MITM attack is detected.

Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Mario Minardi d32d742af0
ipn/ipnlocal: error when trying to use exit node on unsupported platform (#13726)
Adds logic to `checkExitNodePrefsLocked` to return an error when
attempting to use exit nodes on a platform where this is not supported.
This mirrors logic that was added to error out when trying to use `ssh`
on an unsupported platform, and has very similar semantics.

Fixes https://github.com/tailscale/tailscale/issues/13724

Signed-off-by: Mario Minardi <mario@tailscale.com>
1 year ago
Percy Wegmann f9949cde8b client/tailscale,cmd/{cli,get-authkey,k8s-operator}: set distinct User-Agents
This helps better distinguish what is generating activity to the
Tailscale public API.

Updates tailscale/corp#23838

Signed-off-by: Percy Wegmann <percy@tailscale.com>
1 year ago
Brad Fitzpatrick 1938685d39 clientupdate: don't link distsign on platforms that don't download
Updates tailscale/corp#20099

Change-Id: Ie3b782379b19d5f7890a8d3a378096b4f3e8a612
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Irbe Krumina 89ee6bbdae
cmd/k8s-operator,k8s-operator/apis: set a readiness condition on egress Services for ProxyGroup (#13746)
cmd/k8s-operator,k8s-operator/apis: set a readiness condition on egress Services

Set a readiness condition on ExternalName Services that define a tailnet target
to route cluster traffic to via a ProxyGroup's proxies. The condition
is set to true if at least one proxy is currently set up to route.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina f6d4d03355
cmd/k8s-operator: don't error out if ProxyClass for ProxyGroup not found. (#13736)
We don't need to error out and continuously reconcile if ProxyClass
has not (yet) been created, once it gets created the ProxyGroup
reconciler will get triggered.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina 60011e73b8
cmd/k8s-operator: fix Pod IP selection (#13743)
Ensure that .status.podIPs is used to select Pod's IP
in all reconcilers.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Nick Khyl da40609abd util/syspolicy, ipn: add "tailscale debug component-logs" support
Fixes #13313
Fixes #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Tom Proctor 07c157ee9f
cmd/k8s-operator: base ProxyGroup StatefulSet on common proxy.yaml definition (#13714)
As discussed in #13684, base the ProxyGroup's proxy definitions on the same
scaffolding as the existing proxies, as defined in proxy.yaml

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Irbe Krumina 861dc3631c
cmd/{k8s-operator,containerboot},kube/egressservices: fix Pod IP check for dual stack clusters (#13721)
Currently egress Services for ProxyGroup only work for Pods and Services
with IPv4 addresses. Ensure that it works on dual stack clusters by reading
proxy Pod's IP from the .status.podIPs list that always contains both
IPv4 and IPv6 address (if the Pod has them) rather than .status.podIP that
could contain IPv6 only for a dual stack cluster.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Tom Proctor 36cb2e4e5f
cmd/k8s-operator,k8s-operator: use default ProxyClass if set for ProxyGroup (#13720)
The default ProxyClass can be set via helm chart or env var, and applies
to all proxies that do not otherwise have an explicit ProxyClass set.
This ensures proxies created by the new ProxyGroup CRD are consistent
with the behaviour of existing proxies

Nearby but unrelated changes:

* Fix up double error logs (controller runtime logs returned errors)
* Fix a couple of variable names

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Irbe Krumina 7f016baa87
cmd/k8s-operator,k8s-operator: create ConfigMap for egress services + small fixes for egress services (#13715)
cmd/k8s-operator, k8s-operator: create ConfigMap for egress services + small reconciler fixes

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Tom Proctor e48cddfbb3
cmd/{containerboot,k8s-operator},k8s-operator,kube: add ProxyGroup controller (#13684)
Implements the controller for the new ProxyGroup CRD, designed for
running proxies in a high availability configuration. Each proxy gets
its own config and state Secret, and its own tailscale node ID.

We are currently mounting all of the config secrets into the container,
but will stop mounting them and instead read them directly from the kube
API once #13578 is implemented.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Irbe Krumina e8bb5d1be5
cmd/{k8s-operator,containerboot},k8s-operator,kube: reconcile ExternalName Services for ProxyGroup (#13635)
Adds a new reconciler that reconciles ExternalName Services that define a
tailnet target that should be exposed to cluster workloads on a ProxyGroup's
proxies.
The reconciler ensures that for each such service, the config mounted to
the proxies is updated with the tailnet target definition and that
and EndpointSlice and ClusterIP Service are created for the service.

Adds a new reconciler that ensures that as proxy Pods become ready to route
traffic to a tailnet target, the EndpointSlice for the target is updated
with the Pods' endpoints.

Updates tailscale/tailscale#13406

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina c62b0732d2
cmd/k8s-operator: remove auth key once proxy has logged in (#13612)
The operator creates a non-reusable auth key for each of
the cluster proxies that it creates and puts in the tailscaled
configfile mounted to the proxies.
The proxies are always tagged, and their state is persisted
in a Kubernetes Secret, so their node keys are expected to never
be regenerated, so that they don't need to re-auth.

Some tailnet configurations however have seen issues where the auth
keys being left in the tailscaled configfile cause the proxies
to end up in unauthorized state after a restart at a later point
in time.
Currently, we have not found a way to reproduce this issue,
however this commit removes the auth key from the config once
the proxy can be assumed to have logged in.

If an existing, logged-in proxy is upgraded to this version,
its redundant auth key will be removed from the conffile.

If an existing, logged-in proxy is downgraded from this version
to a previous version, it will work as before without re-issuing key
as the previous code did not enforce that a key must be present.

Updates tailscale/tailscale#13451

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Tom Proctor cab2e6ea67
cmd/k8s-operator,k8s-operator: add ProxyGroup CRD (#13591)
The ProxyGroup CRD specifies a set of N pods which will each be a
tailnet device, and will have M different ingress or egress services
mapped onto them. It is the mechanism for specifying how highly
available proxies need to be. This commit only adds the definition, no
controller loop, and so it is not currently functional.

This commit also splits out TailnetDevice and RecorderTailnetDevice
into separate structs because the URL field is specific to recorders,
but we want a more generic struct for use in the ProxyGroup status field.

Updates #13406

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Cameron Stokes 65c26357b1
cmd/k8s-operator, k8s-operator: fix outdated kb links (#13585)
updates #13583

Signed-off-by: Cameron Stokes <cameron@tailscale.com>
1 year ago
Tom Proctor 98f4dd9857
cmd/k8s-operator,k8s-operator,kube: Add TSRecorder CRD + controller (#13299)
cmd/k8s-operator,k8s-operator,kube: Add TSRecorder CRD + controller

Deploys tsrecorder images to the operator's cluster. S3 storage is
configured via environment variables from a k8s Secret. Currently
only supports a single tsrecorder replica, but I've tried to take early
steps towards supporting multiple replicas by e.g. having a separate
secret for auth and state storage.

Example CR:

```yaml
apiVersion: tailscale.com/v1alpha1
kind: Recorder
metadata:
  name: rec
spec:
  enableUI: true
```

Updates #13298

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Irbe Krumina 209567e7a0
kube,cmd/{k8s-operator,containerboot},envknob,ipn/store/kubestore,*/depaware.txt: rename packages (#13418)
Rename kube/{types,client,api} -> kube/{kubetypes,kubeclient,kubeapi}
so that we don't need to rename the package on each import to
convey that it's kubernetes specific.

Updates#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina d6dfb7f242
kube,cmd/{k8s-operator,containerboot},envknob,ipn/store/kubestore,*/depaware.txt: split out kube types (#13417)
Further split kube package into kube/{client,api,types}. This is so that
consumers who only need constants/static types don't have to import
the client and api bits.

Updates#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina ecd64f6ed9
cmd/k8s-operator,kube: set app name for Kubernetes Operator proxies (#13410)
Updates tailscale/corp#22920

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina 8e1c00f841
cmd/k8s-operator,k8s-operator/sessionrecording: ensure recording header contains terminal size for terminal sessions (#12965)
* cmd/k8s-operator,k8s-operator/sessonrecording: ensure CastHeader contains terminal size

For tsrecorder to be able to play session recordings, the recording's
CastHeader must have '.Width' and '.Height' fields set to non-zero.
Kubectl (or whoever is the client that initiates the 'kubectl exec'
session recording) sends the terminal dimensions in a resize message that
the API server proxy can intercept, however that races with the first server
message that we need to record.
This PR ensures we wait for the terminal dimensions to be processed from
the first resize message before any other data is sent, so that for all
sessions with terminal attached, the header of the session recording
contains the terminal dimensions and the recording can be played by tsrecorder.

Updates tailscale/tailscale#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Andrew Dunham 1c972bc7cb wgengine/magicsock: actually use AF_PACKET socket for raw disco
Previously, despite what the commit said, we were using a raw IP socket
that was *not* an AF_PACKET socket, and thus was subject to the host
firewall rules. Switch to using a real AF_PACKET socket to actually get
the functionality we want.

Updates #13140

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: If657daeeda9ab8d967e75a4f049c66e2bca54b78
1 year ago
Nick Khyl 961ee321e8 ipn/{ipnauth,ipnlocal,ipnserver,localapi}: start baby step toward moving access checks from the localapi.Handler to the LocalBackend
Currently, we use PermitRead/PermitWrite/PermitCert permission flags to determine which operations are allowed for a LocalAPI client.
These checks are performed when localapi.Handler handles a request. Additionally, certain operations (e.g., changing the serve config)
requires the connected user to be a local admin. This approach is inherently racey and is subject to TOCTOU issues.
We consider it to be more critical on Windows environments, which are inherently multi-user, and therefore we prevent more than one
OS user from connecting and utilizing the LocalBackend at the same time. However, the same type of issues is also applicable to other
platforms when switching between profiles that have different OperatorUser values in ipn.Prefs.

We'd like to allow more than one Windows user to connect, but limit what they can see and do based on their access rights on the device
(e.g., an local admin or not) and to the currently active LoginProfile (e.g., owner/operator or not), while preventing TOCTOU issues on Windows
and other platforms. Therefore, we'd like to pass an actor from the LocalAPI to the LocalBackend to represent the user performing the operation.
The LocalBackend, or the profileManager down the line, will then check the actor's access rights to perform a given operation on the device
and against the current (and/or the target) profile.

This PR does not change the current permission model in any way, but it introduces the concept of an actor and includes some preparatory
work to pass it around. Temporarily, the ipnauth.Actor interface has methods like IsLocalSystem and IsLocalAdmin, which are only relevant
to the current permission model. It also lacks methods that will actually be used in the new model. We'll be adding these gradually in the next
PRs and removing the deprecated methods and the Permit* flags at the end of the transition.

Updates tailscale/corp#18342

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Kristoffer Dalby a2c42d3cd4 usermetric: add initial user-facing metrics
This commit adds a new usermetric package and wires
up metrics across the tailscale client.

Updates tailscale/corp#22075

Co-authored-by: Anton Tolchanov <anton@tailscale.com>
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
1 year ago
Percy Wegmann d00d6d6dc2 go.mod: update to github.com/tailscale/netlink library that doesn't require vishvananda/netlink
After the upstream PR is merged, we can point directly at github.com/vishvananda/netlink
and retire github.com/tailscale/netlink.

See https://github.com/vishvananda/netlink/pull/1006

Updates #12298

Signed-off-by: Percy Wegmann <percy@tailscale.com>
1 year ago
Ilarion Kovalchuk 0cb7eb9b75 net/dns: updated gonotify dependency to v2 that supports closable context
Signed-off-by: Ilarion Kovalchuk <illarion.kovalchuk@gmail.com>
1 year ago
Brad Fitzpatrick 696711cc17 all: switch to and require Go 1.23
Updates #12912

Change-Id: Ib4ae26eb5fb68ad2216cab4913811b94f7eed5b6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Brad Fitzpatrick 0ff474ff37 all: fix new lint warnings from bumping staticcheck
In prep for updating to new staticcheck required for Go 1.23.

Updates #12912

Change-Id: If77892a023b79c6fa798f936fc80428fd4ce0673
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Jordan Whited df6014f1d7
net/tstun,wgengine{/netstack/gro}: refactor and re-enable gVisor GRO for Linux (#13172)
In 2f27319baf we disabled GRO due to a
data race around concurrent calls to tstun.Wrapper.Write(). This commit
refactors GRO to be thread-safe, and re-enables it on Linux.

This refactor now carries a GRO type across tstun and netstack APIs
with a lifetime that is scoped to a single tstun.Wrapper.Write() call.

In 25f0a3fc8f we used build tags to
prevent importation of gVisor's GRO package on iOS as at the time we
believed it was contributing to additional memory usage on that
platform. It wasn't, so this commit simplifies and removes those
build tags.

Updates tailscale/corp#22353
Updates tailscale/corp#22125
Updates #6816

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
ChandonPierre 93dc2ded6e
cmd/k8s-operator: support default proxy class in k8s-operator (#12711)
Signed-off-by: ChandonPierre <cpierre@coreweave.com>

Closes #12421
1 year ago
pierig-n3xtio 2105773874
cmd/k8s-operator/deploy: replace wildcards in Kubernetes Operator RBAC role definitions with verbs
cmd/k8s-operator/deploy: replace wildcards in Kubernetes Operator RBAC role definitions with verbs

fixes: #13168

Signed-off-by: Pierig Le Saux <pierig@n3xt.io>
1 year ago
tomholford 16bb541adb wgengine/magicsock: replace deprecated poly1305 (#13184)
Signed-off-by: tomholford <tomholford@users.noreply.github.com>
1 year ago
Kyle Carberry 6c852fa817 go.{mod,sum}: migrate from nhooyr.io/websocket to github.com/coder/websocket
Coder has just adopted nhooyr/websocket which unfortunately changes the import path.

`github.com/coder/coder` imports `tailscale.com/net/wsconn` which was still pointing
to `nhooyr.io/websocket`, but this change updates it.

See https://coder.com/blog/websocket

Updates #13154

Change-Id: I3dec6512472b14eae337ae22c5bcc1e3758888d5
Signed-off-by: Kyle Carberry <kyle@carberry.com>
1 year ago
Irbe Krumina a15ff1bade
cmd/k8s-operator,k8s-operator/sessionrecording: support recording kubectl exec sessions over WebSockets (#12947)
cmd/k8s-operator,k8s-operator/sessionrecording: support recording WebSocket sessions

Kubernetes currently supports two streaming protocols, SPDY and WebSockets.
WebSockets are replacing SPDY, see
https://github.com/kubernetes/enhancements/issues/4006.
We were currently only supporting SPDY, erroring out if session
was not SPDY and relying on the kube's built-in SPDY fallback.

This PR:

- adds support for parsing contents of 'kubectl exec' sessions streamed
over WebSockets

- adds logic to distinguish 'kubectl exec' requests for a SPDY/WebSockets
sessions and call the relevant handler

Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Brad Fitzpatrick 4c2e978f1e cmd/tailscale/cli: support passing network lock keys via files
Fixes tailscale/corp#22356

Change-Id: I959efae716a22bcf582c20d261fb1b57bacf6dd9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Irbe Krumina adbab25bac
cmd/k8s-operator: fix DNS reconciler for dual-stack clusters (#13057)
* cmd/k8s-operator: fix DNS reconciler for dual-stack clusters

This fixes a bug where DNS reconciler logic was always assuming
that no more than one EndpointSlice exists for a Service.
In fact, there can be multiple, for example, in dual-stack
clusters, but also in other cases this is valid (as per kube docs).
This PR:
- allows for multiple EndpointSlices
- picks out the ones for IPv4 family
- deduplicates addresses

Updates tailscale/tailscale#13056

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Nick Khyl 67df9abdc6 util/syspolicy/setting: add package that contains types for the next syspolicy PRs
Package setting contains types for defining and representing policy settings.
It facilitates the registration of setting definitions using Register and RegisterDefinition,
and the retrieval of registered setting definitions via Definitions and DefinitionOf.
This package is intended for use primarily within the syspolicy package hierarchy,
and added in a preparation for the next PRs.

Updates #12687

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Jordan Whited f0230ce0b5
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GRO for Linux (#12921)
This commit implements TCP GRO for packets being written to gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported IP checksum functions.
gVisor is updated in order to make use of newly exported
stack.PacketBuffer GRO logic.

TCP throughput towards gVisor, i.e. TUN write direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement, sometimes as high as 2x. High bandwidth-delay product
paths remain receive window limited, bottlenecked by gVisor's default
TCP receive socket buffer size. This will be addressed in a  follow-on
commit.

The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.

The first result is from commit 57856fc without TCP GRO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  4.77 GBytes  4.10 Gbits/sec   20 sender
[  5]   0.00-10.00  sec  4.77 GBytes  4.10 Gbits/sec      receiver

The second result is from this commit with TCP GRO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.6 GBytes  9.14 Gbits/sec   20 sender
[  5]   0.00-10.00  sec  10.6 GBytes  9.14 Gbits/sec      receiver

Updates #6816

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Jordan Whited 7bc2ddaedc
go.mod,net/tstun,wgengine/netstack: implement gVisor TCP GSO for Linux (#12869)
This commit implements TCP GSO for packets being read from gVisor on
Linux. Windows support will follow later. The wireguard-go dependency is
updated in order to make use of newly exported GSO logic from its tun
package.

A new gVisor stack.LinkEndpoint implementation has been established
(linkEndpoint) that is loosely modeled after its predecessor
(channel.Endpoint). This new implementation supports GSO of monster TCP
segments up to 64K in size, whereas channel.Endpoint only supports up to
32K. linkEndpoint will also be required for GRO, which will be
implemented in a follow-on commit.

TCP throughput from gVisor, i.e. TUN read direction, is dramatically
improved as a result of this commit. Benchmarks show substantial
improvement through a wide range of RTT and loss conditions, sometimes
as high as 5x.

The iperf3 results below demonstrate the effect of this commit between
two Linux computers with i5-12400 CPUs. There is roughly ~13us of round
trip latency between them.

The first result is from commit 57856fc without TCP GSO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.51 GBytes  2.15 Gbits/sec  154 sender
[  5]   0.00-10.00  sec  2.49 GBytes  2.14 Gbits/sec      receiver

The second result is from this commit with TCP GSO.

Starting Test: protocol: TCP, 1 streams, 131072 byte blocks
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  12.6 GBytes  10.8 Gbits/sec    6 sender
[  5]   0.00-10.00  sec  12.6 GBytes  10.8 Gbits/sec      receiver

Updates #6816

Signed-off-by: Jordan Whited <jordan@tailscale.com>
1 year ago
Irbe Krumina a21bf100f3
cmd/k8s-operator,k8s-operator/sessionrecording,sessionrecording,ssh/tailssh: refactor session recording functionality (#12945)
cmd/k8s-operator,k8s-operator/sessionrecording,sessionrecording,ssh/tailssh: refactor session recording functionality

Refactor SSH session recording functionality (mostly the bits related to
Kubernetes API server proxy 'kubectl exec' session recording):

- move the session recording bits used by both Tailscale SSH
and the Kubernetes API server proxy into a shared sessionrecording package,
to avoid having the operator to import ssh/tailssh

- move the Kubernetes API server proxy session recording functionality
into a k8s-operator/sessionrecording package, add some abstractions
in preparation for adding support for a second streaming protocol (WebSockets)

Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina c5623e0471
go.{mod,sum},tstest/tools,k8s-operator,cmd/k8s-operator: autogenerate CRD API docs (#12884)
Re-instates the functionality that generates CRD API docs, but using
a different library as the one we were using earlier seemed to have
some issues with its Git history.
Also regenerates the docs (make kube-generate-all).

Updates tailscale/tailscale#12859

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Andrea Gottardo 90be06bd5b
health: introduce captive-portal-detected Warnable (#12707)
Updates tailscale/tailscale#1634

This PR introduces a new `captive-portal-detected` Warnable which is set to an unhealthy state whenever a captive portal is detected on the local network, preventing Tailscale from connecting.



ipn/ipnlocal: fix captive portal loop shutdown


Change-Id: I7cafdbce68463a16260091bcec1741501a070c95

net/captivedetection: fix mutex misuse

ipn/ipnlocal: ensure that we don't fail to start the timer


Change-Id: I3e43fb19264d793e8707c5031c0898e48e3e7465

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Signed-off-by: Andrea Gottardo <andrea@gottardo.me>
1 year ago
Lee Briggs 32ce18716b
Add extra environment variables in deployment template (#12858)
Fixes #12857

Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
1 year ago
Irbe Krumina 0f57b9340b
cmd/k8s-operator,tstest,go.{mod,sum}: remove fybrik.io/crdoc dependency (#12862)
Remove fybrik.io/crdoc dependency as it is causing issues for folks attempting
to vendor tailscale using GOPROXY=direct.
This means that the CRD API docs in ./k8s-operator/api.md will no longer
be generated- I am going to look at replacing it with another tool
in a follow-up.

Updates tailscale/tailscale#12859

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina 2742153f84
cmd/k8s-operator: add a metric to track the amount of ProxyClass resources (#12833)
Updates tailscale/tailscale#10709

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Brad Fitzpatrick c6af5bbfe8 all: add test for package comments, fix, add comments as needed
Updates #cleanup

Change-Id: Ic4304e909d2131a95a38b26911f49e7b1729aaef
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
1 year ago
Irbe Krumina 986d60a094
cmd/k8s-operator: add metrics for attempted/uploaded session recordings (#12765)
Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Irbe Krumina 6a982faa7d
cmd/k8s-operator: send container name to session recorder (#12763)
Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
Nick Khyl 726d5d507d cmd/k8s-operator: update depaware.txt
This fixes an issue caused by the merge order of 2b638f550d and 8bd442ba8c.

Updates #Cleanup

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Nick Khyl e21d8768f9 types/opt: add generic Value[T any] for optional values of any types
Updates #12736

Signed-off-by: Nick Khyl <nickk@tailscale.com>
1 year ago
Irbe Krumina ba517ab388
cmd/k8s-operator,ssh/tailssh,tsnet: optionally record 'kubectl exec' sessions via Kubernetes operator's API server proxy (#12274)
cmd/k8s-operator,ssh/tailssh,tsnet: optionally record kubectl exec sessions

The Kubernetes operator's API server proxy, when it receives a request
for 'kubectl exec' session now reads 'RecorderAddrs', 'EnforceRecorder'
fields from tailcfg.KubernetesCapRule.
If 'RecorderAddrs' is set to one or more addresses (of a tsrecorder instance(s)),
it attempts to connect to those and sends the session contents
to the recorder before forwarding the request to the kube API
server. If connection cannot be established or fails midway,
it is only allowed if 'EnforceRecorder' is not true (fail open).

Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Maisem Ali 2b638f550d cmd/k8s-operator: add depaware.txt
Updates #12742

Signed-off-by: Maisem Ali <maisem@tailscale.com>
1 year ago
Tom Proctor 01a7726cf7
cmd/containerboot,cmd/k8s-operator: enable IPv6 for fqdn egress proxies (#12577)
cmd/containerboot,cmd/k8s-operator: enable IPv6 for fqdn egress proxies

Don't skip installing egress forwarding rules for IPv6 (as long as the host
supports IPv6), and set headless services `ipFamilyPolicy` to
`PreferDualStack` to optionally enable both IP families when possible. Note
that even with `PreferDualStack` set, testing a dual-stack GKE cluster with
the default DNS setup of kube-dns did not correctly set both A and
AAAA records for the headless service, and instead only did so when
switching the cluster DNS to Cloud DNS. For both IPv4 and IPv6 to work
simultaneously in a dual-stack cluster, we require headless services to
return both A and AAAA records.

If the host doesn't support IPv6 but the FQDN specified only has IPv6
addresses available, containerboot will exit with error code 1 and an
error message because there is no viable egress route.

Fixes #12215

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Tom Proctor 3099323976
cmd/k8s-operator,k8s-operator,go.{mod,sum}: publish proxy status condition for annotated services (#12463)
Adds a new TailscaleProxyReady condition type for use in corev1.Service
conditions.

Also switch our CRDs to use metav1.Condition instead of
ConnectorCondition. The Go structs are seralized identically, but it
updates some descriptions and validation rules. Update k8s
controller-tools and controller-runtime deps to fix the documentation
generation for metav1.Condition so that it excludes comments and
TODOs.

Stop expecting the fake client to populate TypeMeta in tests. See
kubernetes-sigs/controller-runtime#2633 for details of the change.

Finally, make some minor improvements to validation for service hostnames.

Fixes #12216

Co-authored-by: Irbe Krumina <irbe@tailscale.com>
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
1 year ago
Irbe Krumina 8cc2738609
cmd/{containerboot,k8s-operator}: store proxy device ID early to help with cleanup for broken proxies (#12425)
* cmd/containerboot: store device ID before setting up proxy routes.

For containerboot instances whose state needs to be stored
in a Kubernetes Secret, we additonally store the device's
ID, FQDN and IPs.
This is used, between other, by the Kubernetes operator,
who uses the ID to delete the device when resources need
cleaning up and writes the FQDN and IPs on various kube
resource statuses for visibility.

This change shifts storing device ID earlier in the proxy setup flow,
to ensure that if proxy routing setup fails,
the device can still be deleted.

Updates tailscale/tailscale#12146

Signed-off-by: Irbe Krumina <irbe@tailscale.com>

* code review feedback

Signed-off-by: Irbe Krumina <irbe@tailscale.com>

---------

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
1 year ago
JunYanBJSS 4c01ce9f43 tsnet: fix error formatting bug
Fixes #12411

Signed-off-by: JunYanBJSS <johnnycocoyan@hotmail.com>
1 year ago
Irbe Krumina c3e2b7347b
tailcfg,cmd/k8s-operator,kube: move Kubernetes cap to a location that can be shared with control (#12236)
This PR is in prep of adding logic to control to be able to parse
tailscale.com/cap/kubernetes grants in control:
- moves the type definition of PeerCapabilityKubernetes cap to a location
shared with control.
- update the Kubernetes cap rule definition with fields for granting
kubectl exec session recording capabilities.
- adds a convenience function to produce tailcfg.RawMessage from an
arbitrary cap rule and a test for it.

An example grant defined via ACLs:
"grants": [{
      "src": ["tag:eng"],
      "dst": ["tag:k8s-operator"],
      "app": {
        "tailscale.com/cap/kubernetes": [{
            "recorder": ["tag:my-recorder"]
	    “enforceRecorder”: true
        }],
      },
    }
]
This grant enforces `kubectl exec` sessions from tailnet clients,
matching `tag:eng` via API server proxy matching `tag:k8s-operator`
to be recorded and recording to be sent to a tsrecorder instance,
matching `tag:my-recorder`.

The type needs to be shared with control because we want
control to parse this cap and resolve tags to peer IPs.

Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 807934f00c
cmd/k8s-operator,k8s-operator: allow proxies accept advertized routes. (#12388)
Add a new .spec.tailscale.acceptRoutes field to ProxyClass,
that can be optionally set to true for the proxies to
accept routes advertized by other nodes on tailnet (equivalent of
setting --accept-routes to true).

Updates tailscale/tailscale#12322,tailscale/tailscale#10684

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 53d9cac196
k8s-operator/apis/v1alpha1,cmd/k8s-operator/deploy/examples: update DNSConfig description (#11971)
Also removes hardcoded image repo/tag from example DNSConfig resource
as the operator now knows how to default those.

Updates tailscale/tailscale#11019

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Tom Proctor 23e26e589f
cmd/k8s-operator,k8s-opeerator: include Connector's MagicDNS name and tailnet IPs in status (#12359)
Add new fields TailnetIPs and Hostname to Connector Status. These
contain the addresses of the Tailscale node that the operator created
for the Connector to aid debugging.

Fixes #12214

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2 years ago
Irbe Krumina 3a6d3f1a5b
cmd/k8s-operator,k8s-operator,go.{mod,sum}: make individual proxy images/image pull policies configurable (#11928)
cmd/k8s-operator,k8s-operator,go.{mod,sum}: make individual proxy images/image pull policies configurable

Allow to configure images and image pull policies for individual proxies
via ProxyClass.Spec.StatefulSet.Pod.{TailscaleContainer,TailscaleInitContainer}.Image,
and ProxyClass.Spec.StatefulSet.Pod.{TailscaleContainer,TailscaleInitContainer}.ImagePullPolicy
fields.
Document that we have images in ghcr.io on the relevant Helm chart fields.

Updates tailscale/tailscale#11675

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Andrew Dunham e88a5dbc92 various: fix lint warnings
Some lint warnings caught by running 'make lint' locally.

Updates #cleanup

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I1534ed6f2f5e1eb029658906f9d62607dad98ca3
2 years ago
Irbe Krumina 82576190a7
tailcfg,cmd/k8s-operator: moves tailscale.com/cap/kubernetes peer cap to tailcfg (#12235)
This is done in preparation for adding kubectl
session recording rules to this capability grant that will need to
be unmarshalled by control, so will also need to be
in a shared location.

Updates tailscale/corp#19821

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
signed-long 1dc3136a24
cmd/k8s-operator: Support image 'repo' or 'repository' keys in helm values file (#12285)
cmd/k8s-operator/deploy/chart: Support image 'repo' or 'repository' keys in helm values

Fixes #12100

Signed-off-by: Michael Long <michaelongdev@gmail.com>
2 years ago
signed-long 5ad0dad15e
go generate directives reorder for 'make kube-generate-all' (#12210)
Fixes #11980

Signed-off-by: Michael Long <michaelongdev@gmail.com>
2 years ago
Irbe Krumina d0d33f257f
cmd/k8s-operator: add a note pointing at ProxyClass (#12246)
Updates tailscale/tailscale#12242

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 72f0f53ed0
cmd/k8s-operator: fix typo (#12217)
Fixes#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina d86d1e7601
cmd/k8s-operator,cmd/containerboot,ipn,k8s-operator: turn off stateful filter for egress proxies. (#12075)
Turn off stateful filtering for egress proxies to allow cluster
traffic to be forwarded to tailnet.

Allow configuring stateful filter via tailscaled config file.

Deprecate EXPERIMENTAL_TS_CONFIGFILE_PATH env var and introduce a new
TS_EXPERIMENTAL_VERSIONED_CONFIG env var that can be used to provide
containerboot a directory that should contain one or more
tailscaled config files named cap-<tailscaled-cap-version>.hujson.
Containerboot will pick the one with the newest capability version
that is not newer than its current capability version.

Proxies with this change will not work with older Tailscale
Kubernetes operator versions - users must ensure that
the deployed operator is at the same version or newer (up to
4 version skew) than the proxies.

Updates tailscale/tailscale#12061

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Irbe Krumina b5dbf155b1
cmd/k8s-operator: default nameserver image to tailscale/k8s-nameserver:unstable (#11991)
We are now publishing nameserver images to tailscale/k8s-nameserver,
so we can start defaulting the images if users haven't set
them explicitly, same as we already do with proxy images.

The nameserver images are currently only published for unstable
track, so we have to use the static 'unstable' tag.
Once we start publishing to stable, we can make the operator
default to its own tag (because then we'll know that for each
operator tag X there is also a nameserver tag X as we always
cut all images for a given tag.

Updates tailscale/tailscale#10499

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 406293682c
cmd/k8s-operator: cleanup runReconciler signature (#11993)
Updates#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina cd633a7252
cmd/k8s-operator/deploy,k8s-operator: document that metrics are unstable (#11979)
Updates#11292

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 19b31ac9a6
cmd/{k8s-operator,k8s-nameserver},k8s-operator: update nameserver config with records for ingress/egress proxies (#11019)
cmd/k8s-operator: optionally update dnsrecords Configmap with DNS records for proxies.

This commit adds functionality to automatically populate
DNS records for the in-cluster ts.net nameserver
to allow cluster workloads to resolve MagicDNS names
associated with operator's proxies.

The records are created as follows:
* For tailscale Ingress proxies there will be
a record mapping the MagicDNS name of the Ingress
device and each proxy Pod's IP address.
* For cluster egress proxies, configured via
tailscale.com/tailnet-fqdn annotation, there will be
a record for each proxy Pod, mapping
the MagicDNS name of the exposed
tailnet workload to the proxy Pod's IP.

No records will be created for any other proxy types.
Records will only be created if users have configured
the operator to deploy an in-cluster ts.net nameserver
by applying tailscale.com/v1alpha1.DNSConfig.

It is user's responsibility to add the ts.net nameserver
as a stub nameserver for ts.net DNS names.
https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/#configuration-of-stub-domain-and-upstream-nameserver-using-coredns
https://cloud.google.com/kubernetes-engine/docs/how-to/kube-dns#upstream_nameservers

See also https://github.com/tailscale/tailscale/pull/11017

Updates tailscale/tailscale#10499

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Gabe Gorelick de85610be0
cmd/k8s-operator/deploy/chart: allow users to configure additional labels for the operator's Pod via Helm chart values.
cmd/k8s-operator/deploy/chart: allow users to configure additional labels for the operator's Pod via Helm chart values.

Fixes #11947

Signed-off-by: Gabe Gorelick <gabe@hightouch.io>
2 years ago
Irbe Krumina 44aa809cb0
cmd/{k8s-nameserver,k8s-operator},k8s-operator: add a kube nameserver, make operator deploy it (#11919)
* cmd/k8s-nameserver,k8s-operator: add a nameserver that can resolve ts.net DNS names in cluster.

Adds a simple nameserver that can respond to A record queries for ts.net DNS names.
It can respond to queries from in-memory records, populated from a ConfigMap
mounted at /config. It dynamically updates its records as the ConfigMap
contents changes.
It will respond with NXDOMAIN to queries for any other record types
(AAAA to be implemented in the future).
It can respond to queries over UDP or TCP. It runs a miekg/dns
DNS server with a single registered handler for ts.net domain names.
Queries for other domain names will be refused.

The intended use of this is:
1) to allow non-tailnet cluster workloads to talk to HTTPS tailnet
services exposed via Tailscale operator egress over HTTPS
2) to allow non-tailnet cluster workloads to talk to workloads in
the same cluster that have been exposed to tailnet over their
MagicDNS names but on their cluster IPs.

DNSConfig CRD can be used to configure
the operator to deploy kube nameserver (./cmd/k8s-nameserver) to cluster.

Updates tailscale/tailscale#10499

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 7d9c3f9897
cmd/k8s-operator/deploy/manifests: check if IPv6 module is loaded before using it (#11867)
Before attempting to enable IPv6 forwarding in the proxy init container
check if the relevant module is found, else the container crashes
on hosts that don't have it.

Updates#11860

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina df8f40905b
cmd/k8s-operator,k8s-operator: optionally serve tailscaled metrics on Pod IP (#11699)
Adds a new .spec.metrics field to ProxyClass to allow users to optionally serve
client metrics (tailscaled --debug) on <Pod-IP>:9001.
Metrics cannot currently be enabled for proxies that egress traffic to tailnet
and for Ingress proxies with tailscale.com/experimental-forward-cluster-traffic-via-ingress annotation
(because they currently forward all cluster traffic to their respective backends).

The assumption is that users will want to have these metrics enabled
continuously to be able to monitor proxy behaviour (as opposed to enabling
them temporarily for debugging). Hence we expose them on Pod IP to make it
easier to consume them i.e via Prometheus PodMonitor.

Updates tailscale/tailscale#11292

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Lee Briggs 14ac41febc
cmd/k8s-operator,k8s-operator: proxyclass affinity (#11862)
add ability to set affinity rules to proxyclass

Updates#11861

Signed-off-by: Lee Briggs <lee@leebriggs.co.uk>
2 years ago
Irbe Krumina 3af0f526b8
cmd{containerboot,k8s-operator},util/linuxfw: support ExternalName Services (#11802)
* cmd/containerboot,util/linuxfw: support proxy backends specified by DNS name

Adds support for optionally configuring containerboot to proxy
traffic to backends configured by passing TS_EXPERIMENTAL_DEST_DNS_NAME env var
to containerboot.
Containerboot will periodically (every 10 minutes) attempt to resolve
the DNS name and ensure that all traffic sent to the node's
tailnet IP gets forwarded to the resolved backend IP addresses.

Currently:
- if the firewall mode is iptables, traffic will be load balanced
accross the backend IP addresses using round robin. There are
no health checks for whether the IPs are reachable.
- if the firewall mode is nftables traffic will only be forwarded
to the first IP address in the list. This is to be improved.

* cmd/k8s-operator: support ExternalName Services

 Adds support for exposing endpoints, accessible from within
a cluster to the tailnet via DNS names using ExternalName Services.
This can be done by annotating the ExternalName Service with
tailscale.com/expose: "true" annotation.
The operator will deploy a proxy configured to route tailnet
traffic to the backend IPs that service.spec.externalName
resolves to. The backend IPs must be reachable from the operator's
namespace.

Updates tailscale/tailscale#10606

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina bbe194c80d
cmd/k8s-operator: correctly determine cluster domain (#11512)
Kubernetes cluster domain defaults to 'cluster.local', but can also be customized.
We need to determine cluster domain to set up in-cluster forwarding to our egress proxies.
This was previously hardcoded to 'cluster.local', so was the egress proxies were not usable in clusters with custom domains.
This PR ensures that we attempt to determine the cluster domain by parsing /etc/resolv.conf.
In case the cluster domain cannot be determined from /etc/resolv.conf, we fall back to 'cluster.local'.

Updates tailscale/tailscale#10399,tailscale/tailscale#11445

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Brad Fitzpatrick 7c1d6e35a5 all: use Go 1.22 range-over-int
Updates #11058

Change-Id: I35e7ef9b90e83cac04ca93fd964ad00ed5b48430
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Irbe Krumina 26f9bbc02b
cmd/k8s-operator,k8s-operator: document tailscale.com Custom Resource Definitions better. (#11665)
Updates tailscale/tailscale#10880

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 38fb23f120
cmd/k8s-operator,k8s-operator: allow users to configure proxy env vars via ProxyClass (#11743)
Adds new ProxyClass.spec.statefulSet.pod.{tailscaleContainer,tailscaleInitContainer}.Env field
that allow users to provide key, value pairs that will be set as env vars for the respective containers.
Allow overriding all containerboot env vars,
but warn that this is not supported and might break (in docs + a warning when validating ProxyClass).

Updates tailscale/tailscale#10709

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 231e44e742
Revert "cmd/{k8s-nameserver,k8s-operator},k8s-operator: add a kube nameserver, make operator deploy it (#11017)" (#11669)
Temporarily reverting this PR to avoid releasing
half finished featue.

This reverts commit 9e2f58f846.

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 9b5176c4d9
cmd/k8s-operator: fix failing tests (#11541)
Updates#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 9e2f58f846
cmd/{k8s-nameserver,k8s-operator},k8s-operator: add a kube nameserver, make operator deploy it (#11017)
* cmd/k8s-nameserver,k8s-operator: add a nameserver that can resolve ts.net DNS names in cluster.

Adds a simple nameserver that can respond to A record queries for ts.net DNS names.
It can respond to queries from in-memory records, populated from a ConfigMap
mounted at /config. It dynamically updates its records as the ConfigMap
contents changes.
It will respond with NXDOMAIN to queries for any other record types
(AAAA to be implemented in the future).
It can respond to queries over UDP or TCP. It runs a miekg/dns
DNS server with a single registered handler for ts.net domain names.
Queries for other domain names will be refused.

The intended use of this is:
1) to allow non-tailnet cluster workloads to talk to HTTPS tailnet
services exposed via Tailscale operator egress over HTTPS
2) to allow non-tailnet cluster workloads to talk to workloads in
the same cluster that have been exposed to tailnet over their
MagicDNS names but on their cluster IPs.

Updates tailscale/tailscale#10499

Signed-off-by: Irbe Krumina <irbe@tailscale.com>

* cmd/k8s-operator/deploy/crds,k8s-operator: add DNSConfig CustomResource Definition

DNSConfig CRD can be used to configure
the operator to deploy kube nameserver (./cmd/k8s-nameserver) to cluster.

Signed-off-by: Irbe Krumina <irbe@tailscale.com>

* cmd/k8s-operator,k8s-operator: optionally reconcile nameserver resources

Adds a new reconciler that reconciles DNSConfig resources.
If a DNSConfig is deployed to cluster,
the reconciler creates kube nameserver resources.
This reconciler is only responsible for creating
nameserver resources and not for populating nameserver's records.

Signed-off-by: Irbe Krumina <irbe@tailscale.com>

* cmd/{k8s-operator,k8s-nameserver}: generate DNSConfig CRD for charts, append to static manifests

Signed-off-by: Irbe Krumina <irbe@tailscale.com>

---------

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 4cbef20569
cmd/k8s-operator: redact auth key from debug logs (#11523)
Updates#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Chris Milson-Tokunaga b6dfd7443a
Change type of installCRDs (#11478)
Including the double quotes (`"`) around the value made it appear like the helm chart should expect a string value for `installCRDs`.

Signed-off-by: Chris Milson-Tokunaga <chris.w.milson@gmail.com>
2 years ago
Irbe Krumina b0c3e6f6c5
cmd/k8s-operator,ipn/conf.go: fix --accept-routes for proxies (#11453)
Fix a bug where all proxies got configured with --accept-routes set to true.
The bug was introduced in https://github.com/tailscale/tailscale/pull/11238.

Updates#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 95dcc1745b
cmd/k8s-operator: reconcile tailscale Ingresses when their backend Services change. (#11255)
This is so that if a backend Service gets created after the Ingress, it gets picked up by the operator.

Updates tailscale/tailscale#11251

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Anton Tolchanov <1687799+knyar@users.noreply.github.com>
2 years ago
Irbe Krumina 303125d96d
cmd/k8s-operator: configure all proxies with declarative config (#11238)
Containerboot container created for operator's ingress and egress proxies
are now always configured by passing a configfile to tailscaled
(tailscaled --config <configfile-path>.
It does not run 'tailscale set' or 'tailscale up'.
Upgrading existing setups to this version as well as
downgrading existing setups at this version works.

Updates tailscale/tailscale#10869

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 45d27fafd6
cmd/k8s-operator,k8s-operator,go.{mod,sum},tstest/tools: add Tailscale Kubernetes operator API docs (#11246)
Add logic to autogenerate CRD docs.
.github/workflows/kubemanifests.yaml CI workflow will fail if the doc is out of date with regard to the current CRDs.
Docs can be refreshed by running make kube-generate-all.

Updates tailscale/tailscale#11023

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 5bd19fd3e3
cmd/k8s-operator,k8s-operator: proxy configuration mechanism via a new ProxyClass custom resource (#11074)
* cmd/k8s-operator,k8s-operator: introduce proxy configuration mechanism via ProxyClass custom resource.

ProxyClass custom resource can be used to specify customizations
for the proxy resources created by the operator.

Add a reconciler that validates ProxyClass resources
and sets a Ready condition to True or False with a corresponding reason and message.
This is required because some fields (labels and annotations)
require complex validations that cannot be performed at custom resource apply time.
Reconcilers that use the ProxyClass to configure proxy resources are expected to
verify that the ProxyClass is Ready and not proceed with resource creation
if configuration from a ProxyClass that is not yet Ready is required.

If a tailscale ingress/egress Service is annotated with a tailscale.com/proxy-class annotation, look up the corresponding ProxyClass and, if it is Ready, apply the configuration from the ProxyClass to the proxy's StatefulSet.

If a tailscale Ingress has a tailscale.com/proxy-class annotation
and the referenced ProxyClass custom resource is available and Ready,
apply configuration from the ProxyClass to the proxy resources
that will be created for the Ingress.

Add a new .proxyClass field to the Connector spec.
If connector.spec.proxyClass is set to a ProxyClass that is available and Ready,
apply configuration from the ProxyClass to the proxy resources created for the Connector.

Ensure that when Helm chart is packaged, the ProxyClass yaml is added to chart templates. Ensure that static manifest generator adds ProxyClass yaml to operator.yaml. Regenerate operator.yaml


Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina a6cc2fdc3e
cmd/{containerboot,k8s-operator/deploy/manifests}: optionally allow proxying cluster traffic to a cluster target via ingress proxy (#11036)
* cmd/containerboot,cmd/k8s-operator/deploy/manifests: optionally forward cluster traffic via ingress proxy.

If a tailscale Ingress has tailscale.com/experimental-forward-cluster-traffic-via-ingress annotation, configure the associated ingress proxy to have its tailscale serve proxy to listen on Pod's IP address. This ensures that cluster traffic too can be forwarded via this proxy to the ingress backend(s).

In containerboot, if EXPERIMENTAL_PROXY_CLUSTER_TRAFFIC_VIA_INGRESS is set to true
and the node is Kubernetes operator ingress proxy configured via Ingress,
make sure that traffic from within the cluster can be proxied to the ingress target.

Updates tailscale/tailscale#10499

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 370ec6b46b
cmd/k8s-operator: don't proceed with Ingress that has no valid backends (#10919)
Do not provision resources for a tailscale Ingress that has no valid backends.

Updates tailscale/tailscale#10910

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
ChandonPierre 2ce596ea7a
cmd/k8s-operator/deploy: allow modifying operator tags via Helm values
Updates tailscale/tailscale#10659

Signed-off-by: Chandon Pierre <cpierre@coreweave.com>
2 years ago
Joe Tsai c25968e1c5
all: make use of ctxkey everywhere (#10846)
Also perform minor cleanups on the ctxkey package itself.
Provide guidance on when to use ctxkey.Key[T] over ctxkey.New.
Also, allow for interface kinds because the value wrapping trick
also happens to fix edge cases with interfaces in Go.

Updates #cleanup

Signed-off-by: Joe Tsai <joetsai@digital-static.net>
2 years ago
Irbe Krumina 1c3c3d6752
cmd/k8s-operator: warn if unsupported Ingress Exact path type is used. (#10865)
To reduce the likelihood of breaking users,
if we implement stricter Exact path type matching in the future.

Updates tailscale/tailscale#10730

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 50b52dbd7d
cmd/k8s-operator: sync StatefulSet labels to their Pods (#10861)
So that users have predictable label values to use when configuring network policies.

Updates tailscale/tailscale#10854

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina d0492fdee5
cmd/k8s-operator: adds a tailscale IngressClass resource, prints warning if class not found. (#10823)
* cmd/k8s-operator/deploy: deploy a Tailscale IngressClass resource.

Some Ingress validating webhooks reject Ingresses with
.spec.ingressClassName for which there is no matching IngressClass.

Additionally, validate that the expected IngressClass is present,
when parsing a tailscale `Ingress`. 
We currently do not utilize the IngressClass,
however we might in the future at which point
we might start requiring that the right class
for this controller instance actually exists.

Updates tailscale/tailscale#10820

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Anton Tolchanov <anton@tailscale.com>
2 years ago
Irbe Krumina 169778e23b
cmd/k8s-operator: minor fix in name gen (#10830)
Updates#cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 5cc1bfe82d
cmd/k8s-operator: remove configuration knob for Connector (#10791)
The configuration knob (that defaulted to Connector being disabled)
was added largely because the Connector CRD had to be installed in a separate step.
Now when the CRD has been added to both chart and static manifest, we can have it on by default.

Updates tailscale/tailscale#10878

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 469af614b0
cmd/k8s-operator: fix base truncating for extra long Service names (#10825)
cmd/k8s-operator: fix base truncating for extra long Service names

StatefulSet names for ingress/egress proxies are calculated
using Kubernetes name generator and the parent resource name
as a base.
The name generator also cuts the base, but has a higher max cap.
This commit fixes a bug where, if we get a shortened base back
from the generator, we cut off too little as the base that we
have cut will be passed into the generator again, which will
then itself cut less because the base is shorter- so we end up
with a too long name again.

Updates tailscale/tailscale#10807

Co-authored-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Irbe Krumina <irbekrm@gmail.com>
2 years ago
Irbe Krumina 35f49ac99e
cmd/k8s-operator: add Connector CRD to Helm chart and static manifests (#10775)
cmd/k8s-operator: add CRD to chart and static manifest

Add functionality to insert CRD to chart at package time.
Insert CRD to static manifests as this is where they are currently consumed from.

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 05093ea7d9
cmd/k8s-operator,k8s-operator: allow the operator to deploy exit nodes via Connector custom resource (#10724)
cmd/k8s-operator/deploy/crds,k8s-operator/apis/v1alpha1: allow to define an exit node via Connector CR.

Make it possible to define an exit node to be deployed to a Kubernetes cluster
via Connector Custom resource.

Also changes to Connector API so that one Connector corresponds
to one Tailnet node that can be either a subnet router or an exit
node or both.

The Kubernetes operator parses Connector custom resource and,
if .spec.isExitNode is set, configures that Tailscale node deployed
for that connector as an exit node.

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Anton Tolchanov <anton@tailscale.com>
2 years ago
Irbe Krumina 38b4eb9419
cmd/k8s-operator/deploy/chart: document passing multiple proxy tags + log level values (#10624)
Updates #cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 1a08ea5990
cmd/k8s-operator: operator can create subnetrouter (#9505)
* k8s-operator,cmd/k8s-operator,Makefile,scripts,.github/workflows: add Connector kube CRD.

Connector CRD allows users to configure the Tailscale Kubernetes operator
to deploy a subnet router to expose cluster CIDRs or
other CIDRs available from within the cluster
to their tailnet.

Also adds various CRD related machinery to
generate CRD YAML, deep copy implementations etc.

Engineers will now have to run
'make kube-generate-all` after changing kube files
to ensure that all generated files are up to date.

* cmd/k8s-operator,k8s-operator: reconcile Connector resources

Reconcile Connector resources, create/delete subnetrouter resources in response to changes to Connector(s).

Connector reconciler will not be started unless
ENABLE_CONNECTOR env var is set to true.
This means that users who don't want to use the alpha
Connector custom resource don't have to install the Connector
CRD to their cluster.
For users who do want to use it the flow is:
- install the CRD
- install the operator (via Helm chart or using static manifests).
For Helm users set .values.enableConnector to true, for static
manifest users, set ENABLE_CONNECTOR to true in the static manifest.

Updates tailscale/tailscale#502


Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Maisem Ali fb632036e3 cmd/k8s-operator: drop https:// in capName
Add the new format but keep respecting the old one.

Updates #4217

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Irbe Krumina 49fd0a62c9
cmd/k8s-operator: generate static kube manifests from the Helm chart. (#10436)
* cmd/k8s-operator: generate static manifests from Helm charts

This is done to ensure that there is a single source of truth
for the operator kube manifests.
Also adds linux node selector to the static manifests as
this was added as a default to the Helm chart.

Static manifests can now be generated by running
`go generate tailscale.com/cmd/k8s-operator`.

Updates tailscale/tailscale#9222

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 18ceb4e1f6
cmd/{containerboot,k8s-operator}: allow users to define tailnet egress target by FQDN (#10360)
* cmd/containerboot: proxy traffic to tailnet target defined by FQDN

Add a new Service annotation tailscale.com/tailnet-fqdn that
users can use to specify a tailnet target for which
an egress proxy should be deployed in the cluster.

Updates tailscale/tailscale#10280

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Gabriel Martinez 128d3ad1a9
cmd/k8s-operator: helm chart add missing keys (#10296)
* cmd/k8s-operator: add missing keys to Helm values file

Updates  #10182

Signed-off-by: Gabriel Martinez <gabrielmartinez@sisti.pt>
2 years ago
Irbe Krumina 66471710f8
cmd/k8s-operator: truncate long StatefulSet name prefixes (#10343)
Kubernetes can generate StatefulSet names that are too long and result in invalid Pod revision hash label values.
Calculate whether a StatefulSet name generated for a Service or Ingress
will be too long and if so, truncate it.

Updates tailscale/tailscale#10284

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina dd8bc9ba03
cmd/k8s-operator: log user/group impersonated by apiserver proxy (#10334)
Updates tailscale/tailscale#10127

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina 4f80f403be
cmd/k8s-operator: fix chart syntax error (#10333)
Updates #9222

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina af49bcaa52
cmd/k8s-operator: set different app type for operator with proxy (#10081)
Updates tailscale/tailscale#9222

plain k8s-operator should have hostinfo.App set to 'k8s-operator', operator with proxy should have it set to 'k8s-operator-proxy'. In proxy mode, we were setting the type after it had already been set to 'k8s-operator'

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
David Anderson 37863205ec cmd/k8s-operator: strip credentials from client config in noauth mode
Updates tailscale/corp#15526

Signed-off-by: David Anderson <danderson@tailscale.com>
2 years ago
Rhea Ghosh 970eb5e784
cmd/k8s-operator: sanitize connection headers (#10063)
Fixes tailscale/corp#15526

Signed-off-by: Rhea Ghosh <rhea@tailscale.com>
2 years ago
Irbe Krumina c2b87fcb46
cmd/k8s-operator/deploy/chart,.github/workflows: use helm chart API v2 (#10055)
API v1 is compatible with helm v2 and v2 is not.
However, helm v2 (the Tiller deployment mechanism) was deprecated in 2020
and no-one should be using it anymore.
This PR also adds a CI lint test for helm chart

Updates tailscale/tailscale#9222

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina ed1b935238
cmd/k8s-operator: allow to install operator via helm (#9920)
Initial helm manifests.

Updates tailscale/tailscale#9222

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Irbe Krumina e9956419f6
cmd/k8s-operator: allow cleanup of cluster resources for deleted devices (#9917)
Users should delete proxies by deleting or modifying the k8s cluster resources
that they used to tell the operator to create they proxy. With this flow,
the tailscale operator will delete the associated device from the control.
However, in some cases users might have already deleted the device from the control manually.

Updates tailscale/tailscale#9773

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina cac290da87
cmd/k8s-operator: users can configure firewall mode for kube operator proxies (#9769)
* cmd/k8s-operator: users can configure operator to set firewall mode for proxies

Users can now pass PROXY_FIREWALL_MODE={nftables,auto,iptables} to operator to make it create ingress/egress proxies with that firewall mode

Also makes sure that if an invalid firewall mode gets configured, the operator will not start provisioning proxy resources, but will instead log an error and write an error event to the related Service.

Updates tailscale/tailscale#9310

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Maisem Ali f53c3be07c cmd/k8s-operator: use our own container image instead of busybox
We already have sysctl in the `tailscale/tailscale` image, just use that.

Updates #cleanup

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali 1294b89792 cmd/k8s-operator: allow setting same host value for tls and ingress rules
We were too strict and required the user not specify the host field at all
in the ingress rules, but that degrades compatibility with existing helm charts.

Relax the constraint so that rule.Host can either be empty, or match the tls.Host[0]
value exactly.

Fixes #9548

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali 1cb9e33a95 cmd/k8s-operator: update env var in manifest to APISERVER_PROXY
Replace the deprecated var with the one in docs to avoid confusion.
Introduced in 335a5aaf9a.

Updates #8317
Fixes #9764

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick 93c6e1d53b tstest/deptest: add check that x/exp/{maps,slices} imported as xfoo
Updates #cleanup

Change-Id: I4cbb5e477c739deddf7a46b66f286c9fdb106279
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Irbe Krumina bdd9eeca90
cmd/k8s-operator: fix reconcile filters (#9533)
Ensure that when there is an event on a Tailscale managed Ingress or Service child resource, the right parent type gets reconciled

Updates tailscale/tailscale#502

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Irbe Krumina c5b2a365de
cmd/k8s-operator: fix egress service name (#9494)
Updates https://github.com/tailscale/tailscale/issues/502

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2 years ago
Maisem Ali 5f4d76c18c cmd/k8s-operator: rename egress annotation
It was tailscale.com/ts-tailnet-target-ip, which was pretty
redundant. Change it to tailscale.com/tailnet-ip.

Updates #502

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali d06b48dd0a tailcfg: add RawMessage
This adds a new RawMessage type backed by string instead of the
json.RawMessage which is backed by []byte. The byte slice makes
the generated views be a lot more defensive than the need to be
which we can get around by using a string instead.

Updates #cleanup

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali 335a5aaf9a cmd/k8s-operator: add APISERVER_PROXY env
The kube-apiserver proxy in the operator would only run in
auth proxy mode but thats not always desirable. There are
situations where the proxy should just be a transparent
proxy and not inject auth headers, so do that using a new
env var APISERVER_PROXY and deprecate the AUTH_PROXY env.

THe new env var has three options `false`, `true` and `noauth`.

Updates #8317

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali 794650fe50 cmd/k8s-operator: emit event if HTTPS is disabled on Tailnet
Instead of confusing users, emit an event that explicitly tells the
user that HTTPS is disabled on the tailnet and that ingress may not
work until they enable it.

Updates #9141

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali 306b85b9a3 cmd/k8s-operator: add metrics to track usage
Updates #502

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Irbe Krumina 17438a98c0
cm/k8s-operator,cmd/containerboot: fix STS config, more tests (#9155)
Ensures that Statefulset reconciler config has only one of Cluster target IP or tailnet target IP.
Adds a test case for containerboot egress proxy mode.

Updates tailscale/tailscale#8184

Signed-off-by: irbekrm <irbekrm@gmail.com>
2 years ago
Irbe Krumina fe709c81e5
cmd/k8s-operator,cmd/containerboot: add kube egress proxy (#9031)
First part of work for the functionality that allows users to create an egress
proxy to access Tailnet services from within Kubernetes cluster workloads.
This PR allows creating an egress proxy that can access Tailscale services over HTTP only.

Updates tailscale/tailscale#8184

Signed-off-by: irbekrm <irbekrm@gmail.com>
Co-authored-by: Maisem Ali <maisem@tailscale.com>
Co-authored-by: Rhea Ghosh <rhea@tailscale.com>
2 years ago
Maisem Ali c919ff540f cmd/k8s-operator,ipn/store/kubestore: patch secrets instead of updating
We would call Update on the secret, but that was racey and would occasionaly
fail. Instead use patch whenever we can.

Fixes errors like
```
boot: 2023/08/29 01:03:53 failed to set serve config: sending serve config: updating config: writing ServeConfig to StateStore: Operation cannot be fulfilled on secrets "ts-webdav-kfrzv-0": the object has been modified; please apply your changes to the latest version and try again

{"level":"error","ts":"2023-08-29T01:03:48Z","msg":"Reconciler error","controller":"ingress","controllerGroup":"networking.k8s.io","controllerKind":"Ingress","Ingress":{"name":"webdav","namespace":"default"},"namespace":"default","name":"webdav","reconcileID":"96f5cfed-7782-4834-9b75-b0950fd563ed","error":"failed to provision: failed to create or get API key secret: Operation cannot be fulfilled on secrets \"ts-webdav-kfrzv-0\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226"}
```

Updates #502
Updates #7895

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali 0c6fe94cf4 cmd/k8s-operator: add matching family addresses to status
This was added in 3451b89e5f, but
resulted in the v6 Tailscale address being added to status when
when the forwarding only happened on the v4 address.

Updates #502

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali f92e6a1be8 cmd/k8s-operator: update RBAC to allow creating events
The new ingress reconcile raises events on failure, but I forgot to
add the updated permission.

Updates #502

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Mike Beaumont 3451b89e5f cmd/k8s-operator: put Tailscale IPs in Service ingress status
Updates #502

Signed-off-by: Mike Beaumont <mjboamail@gmail.com>
2 years ago
Mike Beaumont ce4bf41dcf cmd/k8s-operator: support being the default loadbalancer controller
Updates #502

Signed-off-by: Mike Beaumont <mjboamail@gmail.com>
2 years ago
Maisem Ali c8dea67cbf cmd/k8s-operator: add support for Ingress resources
Previously, the operator would only monitor Services and create
a Tailscale StatefulSet which acted as a L3 proxy which proxied
traffic inbound to the Tailscale IP onto the services ClusterIP.

This extends that functionality to also monitor Ingress resources
where the `ingressClassName=tailscale` and similarly creates a
Tailscale StatefulSet, acting as a L7 proxy instead.

Users can override the desired hostname by setting:

```
- tls
  hosts:
  - "foo"
```

Hostnames specified under `rules` are ignored as we only create a single
host. This is emitted as an event for users to see.

Fixes #7895

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Maisem Ali 12ac672542 cmd/k8s-operator: handle changes to services w/o teardown
Previously users would have to unexpose/expose the service in order to
change Hostname/TargetIP. This now applies those changes by causing a
StatefulSet rollout now that a61a9ab087 is in.

Updates #502

Signed-off-by: Maisem Ali <maisem@tailscale.com>
2 years ago
Brad Fitzpatrick 98a5116434 all: adjust some build tags for plan9
I'm not saying it works, but it compiles.

Updates #5794

Change-Id: I2f3c99732e67fe57a05edb25b758d083417f083e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago