39 Commits (808b4139eec4f9ffcf8fc7a39b0519395efcc165)

Author SHA1 Message Date
Anton Tolchanov c8f258a904 prober: propagate DERPMap request creation errors
Updates tailscale/corp#8497

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
4 months ago
Brad Fitzpatrick 8a11a43c28 cmd/derpprobe: support 'local' derpmap to get derp map via LocalAPI
To make it easier for people to monitor their custom DERP fleet.

Updates tailscale/corp#20654

Change-Id: Id8af22936a6d893cc7b6186d298ab794a2672524
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
5 months ago
Brad Fitzpatrick 6877d44965 prober: plumb a now-required netmon to derphttp
Updates #11896

Change-Id: Ie2f9cd024d85b51087d297aa36c14a9b8a2b8129
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
6 months ago
Brad Fitzpatrick 3672f29a4e net/netns, net/dns/resolver, etc: make netmon required in most places
The goal is to move more network state accessors to netmon.Monitor
where they can be cheaper/cached. But first (this change and others)
we need to make sure the one netmon.Monitor is plumbed everywhere.

Some notable bits:

* tsdial.NewDialer is added, taking a now-required netmon

* because a tsdial.Dialer always has a netmon, anything taking both
  a Dialer and a NetMon is now redundant; take only the Dialer and
  get the NetMon from that if/when needed.

* netmon.NewStatic is added, primarily for tests

Updates tailscale/corp#10910
Updates tailscale/corp#18960
Updates #7967
Updates #3299

Change-Id: I877f9cb87618c4eb037cee098241d18da9c01691
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Brad Fitzpatrick 7c1d6e35a5 all: use Go 1.22 range-over-int
Updates #11058

Change-Id: I35e7ef9b90e83cac04ca93fd964ad00ed5b48430
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
7 months ago
Anton Tolchanov 5336362e64 prober: export probe class and metrics from bandwidth prober
- Wrap each prober function into a probe class that allows associating
  metric labels and custom metrics with a given probe;
- Make sure all existing probe classes set a `class` metric label;
- Move bandwidth probe size from being a metric label to a separate
  gauge metric; this will make it possible to use it to calculate
  average used bandwidth using a PromQL query;
- Also export transfer time for the bandwidth prober (more accurate than
  the total probe time, since it excludes connection establishment
  time).

Updates tailscale/corp#17912

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
7 months ago
Anton Tolchanov 21671ca374 prober: remove unused notification code
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
7 months ago
Andrew Dunham 7d7d159824 prober: support creating multiple probes in ForEachAddr
So that we can e.g. check TLS on multiple ports for a given IP.

Updates tailscale/corp#16367

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I81d840a4c88138de1cbb2032b917741c009470e6
7 months ago
Andrew Dunham ac574d875c prober: add helper function to check all IPs for a DNS hostname
This allows us to check all IP addresses (and address families) for a
given DNS hostname while dynamically discovering new IPs and removing
old ones as they're no longer valid.

Also add a testable example that demonstrates how to use it.

Alternative to #11610
Updates tailscale/corp#16367

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I6d6f39bafc30e6dfcf6708185d09faee2a374599
7 months ago
Anton Tolchanov f12d2557f9 prober: add a DERP bandwidth probe
Updates tailscale/corp#17912

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
8 months ago
Anton Tolchanov 5018683d58 prober: remove unused derp prober latency measurements
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
8 months ago
Anton Tolchanov 205a10b51a prober: export probe counters and cumulative latency
Updates #cleanup

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
8 months ago
Brad Fitzpatrick a4a909a20b prober: add TLS probe constructor to split dial addr from cert name
So we can probe load balancers by their unique DNS name but without
asking for that cert name.

Updates tailscale/corp#13050

Change-Id: Ie4c0a2f951328df64281ed1602b4e624e3c8cf2e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
9 months ago
Anton Tolchanov 869b34ddeb prober: log HTTP response body on failure
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
11 months ago
Thomas Kosiewski 96a80fcce3 Add support for custom DERP port in TLS prober
Updates #10146

Signed-off-by: Thomas Kosiewski <thoma471@googlemail.com>
1 year ago
valscale f314fa4a4a prober: fix data race when altering derpmap (#8397)
Move the clearing of STUNOnly flag to the updateMap() function.

Fixes #8395

Signed-off-by: Val <valerie@tailscale.com>
1 year ago
valscale 88097b836a prober: allow monitoring of nodes marked as STUN only in default derpmap (#8391)
prober uses NewRegionClient() to connect to a derper using a faked up
single-node region, but NewRegionClient() fails to connect if there is
no non-STUN only client in the region. Set the STUN only flag to false
before we call NewRegionClient() so we can monitor nodes marked as
STUN only in the default derpmap.

Updates #11492

Signed-off-by: Val <valerie@tailscale.com>
1 year ago
Mihai Parparita 7330aa593e all: avoid repeated default interface lookups
On some platforms (notably macOS and iOS) we look up the default
interface to bind outgoing connections to. This both duplicates work
and results in logspam when the default interface is not available
(e.g. when a phone has no connectivity, we log an error and thus cause
more things that we will try to upload and fail).

Fixed by passing around a netmon.Monitor to more places, so that we can
use its cached interface state.

Fixes #7850
Updates #7621

Signed-off-by: Mihai Parparita <mihai@tailscale.com>
2 years ago
Andrew Dunham 280255acae various: add golangci-lint, fix issues (#7905)
This adds an initial and intentionally minimal configuration for
golangci-lint, fixes the issues reported, and adds a GitHub Action to check
new pull requests against this linter configuration.

Signed-off-by: Andrew Dunham <andrew@du.nham.ca>
Change-Id: I8f38fbc315836a19a094d0d3e986758b9313f163
2 years ago
Anton Tolchanov c153e6ae2f prober: migrate to Prometheus metric library
This provides an example of using native Prometheus metrics with tsweb.

Prober library seems to be the only user of PrometheusVar, so I am
removing support for it in tsweb.

Updates https://github.com/tailscale/corp/issues/10205

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2 years ago
Anton Tolchanov 7083246409 prober: only record latency for successful probes
This will make it easier to track probe latency on a dashboard.

Updates https://github.com/tailscale/corp/issues/9916

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2 years ago
Anton Tolchanov e59dc29a55 prober: log client pubkeys on derp mesh probe failures
Updates https://github.com/tailscale/corp/issues/9916

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2 years ago
Anton Tolchanov 100d8e909e cmd/derpprobe: migrate to the prober framework
`prober.DERP` was created in #5988 based on derpprobe. Having used it
instead of derpprobe for a few months, I think we have enough confidence
that it works and can now migrate derpprobe to use the prober framework
and get rid of code duplication.

A few notable changes in behaviour:
- results of STUN probes over IPv4 and IPv6 are now reported separately;
- TLS probing now includes OCSP verification;
- probe names in the output have changed;
- ability to send Slack notifications from the prober has been removed.
  Instead, the prober now exports metrics in Expvar (/debug/vars) and
  Prometheus (/debug/varz) formats.

Fixes https://github.com/tailscale/corp/issues/8497

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2 years ago
Will Norris 71029cea2d all: update copyright and license headers
This updates all source files to use a new standard header for copyright
and license declaration.  Notably, copyright no longer includes a date,
and we now use the standard SPDX-License-Identifier header.

This commit was done almost entirely mechanically with perl, and then
some minimal manual fixes.

Updates #6865

Signed-off-by: Will Norris <will@tailscale.com>
2 years ago
Brad Fitzpatrick a1b4ab34e6 util/httpm: add new package for prettier HTTP method constants
See package doc.

Change-Id: Ibbfc8e1f98294217c56f3a9452bd93ffa3103572
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Andrew Dunham 06b55ab50f prober: fix test flake
This was tested by running 10000 test iterations and observing no flakes
after this change was made.

Change-Id: Ib036fd03a3a17800132c53c838cc32bfe2961306
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2 years ago
Anton Tolchanov bd47e28638 prober: optionally spread probes over time
By default all probes with the same probe interval that have been added
together will run on a synchronized schedule, which results in spiky
resource usage and potential throttling by third-party systems (for
example, OCSP servers used by the TLS probes).

To address this, prober can now run in "spread" mode that will
introduce a random delay before the first run of each probe.

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2 years ago
Anton Tolchanov 69f61dcad8 prober: add a DERP probe manager based on derpprobe
This ensures that each DERP server is probed individually (TLS and STUN)
and also manages per-region mesh probing. Actual probing code has been
copied from cmd/derpprobe.

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2 years ago
Denton Gentry b55761246b prober: add utilities to generate alerts and warnings.
sendAlert will trigger the Incident Response system.
sendWarning will post to Slack.

Co-authored-by: M. J. Fromberger <fromberger@tailscale.com>
Signed-off-by: Denton Gentry <dgentry@tailscale.com>
2 years ago
Anton Tolchanov 26af329fde prober: expand certificate verification logic in the TLS prober
TLS prober now checks validity period for all server certificates
and verifies OCSP revocation status for the leaf cert.

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2 years ago
Josh Soref d4811f11a0 all: fix spelling mistakes
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2 years ago
Brad Fitzpatrick 4950fe60bd syncs, all: move to using Go's new atomic types instead of ours
Fixes #5185

Change-Id: I850dd532559af78c3895e2924f8237ccc328449d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
Brad Fitzpatrick 116f55ff66 all: gofmt for Go 1.19
Updates #5210

Change-Id: Ib02cd5e43d0a8db60c1f09755a8ac7b140b670be
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2 years ago
David Anderson 7c7f37342f prober: use keyed initializer for LimitedReader.
Reported by go vet.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
Dave Anderson 0968b2d55a prober: support adding key/value labels to probes. (#4250)
prober: add labels to Probe instances.

This allows especially dynamically-registered probes to have a bunch
more dimensions along which they can be sliced in Prometheus.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson a09c30aac2 prober: refactor probe state into a Probe struct.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 94aaec5c66 prober: rename Probe to ProbeFunc.
Making way for a future Probe struct to encapsulate per-probe state.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson 19f61607b6 prober: run all probes once on initial registration.
Turns out, it's annoying to have to wait the entire interval
before getting any monitorable data, especially for very long
interval probes like hourly/daily checks.

Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago
David Anderson e41a3b983c prober: library to build healthchecking probers.
Signed-off-by: David Anderson <danderson@tailscale.com>
3 years ago