tailscale/util
Nick Khyl f58cbffda1 util/eventbus: use unbounded event queues for DeliveredEvents in subscribers
Bounded DeliveredEvent queues reduce memory usage, but they can deadlock under load.
Two common scenarios trigger deadlocks when the number of events published in a short
period exceeds twice the queue capacity (there's a PublishedEvent queue of the same size):
 - a subscriber tries to acquire the same mutex as held by a publisher, or
 - a subscriber for A events publishes B events

Avoiding these scenarios is not practical; it would limit eventbus usefulness and reduce its adoption,
pushing us back to callbacks and other legacy mechanisms. These deadlocks have already occurred on customer
devices, on dev machines, and in tests. They also make it harder to identify and fix slow subscribers and similar
issues we have been seeing recently.

Choosing an arbitrarily large fixed queue capacity would only mask the problem. A client running
in a sufficiently large and complex customer environment can exceed any meaningful constant limit,
since event volume depends on the number of peers and other factors. Behavior also changes
based on scheduling of publishers and subscribers by the Go runtime, OS, and hardware, as the issue
is essentially a race between publishers and subscribers. Additionally, on lower-end devices,
an unreasonably high constant capacity is practically the same as using unbounded queues.

Therefore, this PR changes the event queue implementation to be unbounded by default.
The PublishedEvent queue keeps its existing capacity of 16 items, while subscribers'
DeliveredEvent queues become unbounded.

This change fixes known deadlocks and makes the system stable under load,
at the cost of higher potential memory usage, including cases where a queue grows
during an event burst and does not shrink when load decreases.

Further improvements can be implemented in the future as needed.

Fixes #17973
Fixes #18012

Signed-off-by: Nick Khyl <nickk@tailscale.com>
(cherry picked from commit 1ccece0f78)
1 week ago
backoff util/backoff: rename logtail/backoff package to util/backoff 2 months ago
cache util/cache: fix missing interface methods (#11275) 2 years ago
checkchange net/dns, ipn/ipnlocal: fix regressions from change moving away from deephash 2 months ago
cibuild all: update copyright and license headers 3 years ago
clientmetric feature/featuretags: make clientmetrics optional 2 months ago
cloudenv feature/featuretags, all: add build features, use existing ones in more places 2 months ago
cmpver util/cmpver: add Less/LessEq helper funcs 2 years ago
codegen cmd/viewer, types/views: implement support for json/v2 (#16852) 4 months ago
cstruct all: use Go 1.21's binary.NativeEndian 11 months ago
ctxkey all: use reflect.TypeFor now available in Go 1.22 (#11078) 2 years ago
deephash util/deephash: move tests that depend on other tailscale packages to deephash_test 7 months ago
dirwalk all: use tstest.Replace more 3 years ago
dnsname tailcfg: adjust ServiceName.Validate to use vizerror 10 months ago
eventbus util/eventbus: use unbounded event queues for DeliveredEvents in subscribers 1 week ago
execqueue control/controlclient,util/execqueue: extract execqueue into a package 2 years ago
expvarx util/expvarx: deflake TestSafeFuncHappyPath with synctest 3 months ago
goroutines ipn/ipnlocal, util/goroutines: track goroutines for tests, shutdown 11 months ago
groupmember util/groupmember: fail earlier if group doesn't exist, use slices.Contains 2 years ago
hashx all: use Go 1.22 range-over-int 2 years ago
httphdr util/httphdr: add new package for parsing HTTP headers (#9797) 2 years ago
httpm util/httpm: don't run test if .git doesn't exist 2 years ago
limiter all: add test for package comments, fix, add comments as needed 1 year ago
lineiter types/result, util/lineiter: add package for a result type, use it 1 year ago
lineread all: update copyright and license headers 3 years ago
linuxfw util/linuxfw: fix 32-bit arm regression with iptables 1 month ago
lru util/slicesx: add MapKeys and MapValues from golang.org/x/exp/maps 11 months ago
mak util/mak: delete long-deprecated, unused, pre-generics NonNil func 7 months ago
multierr all: use Go 1.22 range-over-int 2 years ago
must util/must: add Get2 for functions that return two values 6 months ago
nocasemaps all: use Go 1.22 range-over-int 2 years ago
osdiag all: add test for package comments, fix, add comments as needed 1 year ago
osshare clientupdate, util/osshare, util/winutil, version: improve Windows GUI filename resolution and WinUI build awareness 2 months ago
osuser ssh/tailssh: add Plan 9 support for Tailscale SSH 8 months ago
pidowner types/result, util/lineiter: add package for a result type, use it 1 year ago
pool util/pool: add package for storing and using a pool of items 2 years ago
precompress all: update copyright and license headers 3 years ago
progresstracking ipn/localapi: add support for multipart POST to file-put 2 years ago
prompt util/prompt: add a default and take default in non-interactive cases 2 months ago
quarantine all: update copyright and license headers 3 years ago
race all: use Go 1.22 range-over-int 2 years ago
racebuild all: update copyright and license headers 3 years ago
rands wgengine/magicsock: use math/rands/v2 2 years ago
reload all: use math/rand/v2 more 2 years ago
ringlog util/ringbuffer: rename to ringlog 3 months ago
set control/controlclient: restore aggressive Direct.Close teardown 2 months ago
singleflight util/singleflight: add DoChanContext 2 years ago
slicesx util/slicesx: add AppendNonzero 10 months ago
stringsx util/stringsx: add package for extra string functions, like CompareFold 12 months ago
syspolicy types/persist: add AttestationKey (#17281) 2 months ago
sysresources util/sysresources, magicsock: scale DERP buffer based on system memory 3 years ago
testenv nettest, *: add option to run HTTP tests with in-memory network 8 months ago
topk all: use Go 1.22 range-over-int 2 years ago
truncate util/truncate: support []byte as well (#11614) 2 years ago
usermetric feature/featuretags: make usermetrics modular 2 months ago
vizerror util/vizerror: add WrapWithMessage 1 year ago
winutil clientupdate, util/osshare, util/winutil, version: improve Windows GUI filename resolution and WinUI build awareness 2 months ago
zstdframe all: use Go 1.22 range-over-int 2 years ago