mitogen

Commit Graph

Author	SHA1	Message	Date
David Wilson	f78a5f08c6	issue #605 : ansible: share a sem_t instead of a pthread_mutex_t The previous version quite reliably causes worker deadlocks within 10 minutes running: # 100 times: - import_playbook: integration/async/runner_one_job.yml # 100 times: - import_playbook: integration/module_utils/adjacent_to_playbook.yml via .ci/soak/mitogen.sh with PLAYBOOK= set to the above playbook. Attaching to the worker with gdb reveals it in an instruction immediately following a futex() call, which likely returned EINTR due to attaching gdb. Examining the pthread_mutex_t state reveals it to be completely unlocked. pthread_mutex_t on Linux should have zero trouble living in shmem, so it's not clear how this deadlock is happening. Meanwhile POSIX semaphores are explicitly designed for cross-process use and have a completely different internal implementation, so try those instead. 1 hour of soaking reveals no deadlock. This is about avoiding managing a lockable temporary file on disk to contain our counter, and somehow communicating a reference to it into subprocesses (despite the subprocess module closing inherited fds, etc), somehow deleting it reliably at exit, and somehow avoiding concurrent Ansible runs stepping on the same file. For now ctypes is still less pain. A final possibility would be to abandon a shared counter and instead pick a CPU based on the hash of e.g. the new child's process ID. That would likely balance equally well, and might be worth exploring when making this code work on BSD.	5 years ago
David Wilson	5af6c9b26f	issue #615 : use FileService for target->controller file transfers	5 years ago
David Wilson	6f12980611	[linear2] merge fallout: re-enable _send_module_forwards().	5 years ago
David Wilson	5298e87548	Split out and make readable more log messages across both packages	5 years ago
David Wilson	0f23a90d50	ansible: log affinity assignments	5 years ago
David Wilson	4f051a38a7	ansible: improve docstring	5 years ago
David Wilson	5811909c8d	[linear2] simplify _listener_for_name()	5 years ago
David Wilson	c68dbdd569	ansible: stop relying on SIGTERM to shut down service pool It's no longer necessary, since connection attempts are no longer truly blocking. When CTRL+C is hit in the top-level process, broker will begin shutdown, which will cancel all pending connection attempts, causing pool threads to wake. The pool can't block during shutdown anymore.	5 years ago
David Wilson	f4ca926b21	ansible: cleanup various docstrings	5 years ago
David Wilson	edde251d58	issue #549 : ansible: reduce risk by capping RLIM_INFINITY	5 years ago
David Wilson	d408caccf5	issue #573 : guard against a forked top-level Ansible process See comment.	5 years ago
David Wilson	3ceac2c9ed	[linear2] simplify ClassicWorkerModel and fix repeat initialization "self.initialized = False" slipped in a few days ago. On second thoughts, that flag is not needed at all: simply rearrange ClassicWorkerModel to have a regular constructor. This hierarchy is still squishy, it needs more love. Remaining MuxProcess class attributes should be eliminated.	5 years ago
David Wilson	395b03a77d	issue #549 : fix setrlimit() crash and hard-wire OS X default OS X advertised unlimited, but really it means kern.maxfilesperproc.	5 years ago
David Wilson	33bceb6eb4	issue #602 : recover task_vars for synchronize and meta: reset_connection	5 years ago
David Wilson	6b4bcf4fe0	ansible: remove cutpasted docstring	5 years ago
David Wilson	619f4dee07	[linear2] merge fallout: restore optimization from #491 / `7b129e857`	5 years ago
David Wilson	e4321f81a0	issue #600 : /etc/environment may be non-ASCII in an unknown encoding	5 years ago
David Wilson	75d179e4b9	remove unused imports flagged by lgtm	5 years ago
David Wilson	c80fddd487	[linear2]: merge fallout flagged by LGTM	5 years ago
David Wilson	eeb7150f24	issue #549 : increase open file limit automatically if possible While catching every possible case where "open file limit exceeded" is not possible, we can at least increase the soft limit to the available hard limit without any user effort. Do this in Ansible top-level process, even though we probably only need it in the MuxProcess. It seems there is no reason this could hurt	5 years ago
David Wilson	acab26d796	ansible: improve process.py docs	5 years ago
David Wilson	4dfbe82e76	tests: hide ugly error during Ansible tests	5 years ago
David Wilson	108015aa22	ansible: gracefully handle failure to connect to MuxProcess It's possible to hit an ugly exception during early CTRL+C	5 years ago
David Wilson	bf1f3682aa	ansible: pin per-CPU muxes to their corresponding CPU This slightly breaks the old scheme, in that CPU 1 may now end up with a mux and the top-level process pinned to it.	5 years ago
David Wilson	dc9f4e89e6	ansible: reap mux processes on shut down Previously we exited without calling waitpid(), which meant the top-level process struct rusage did not reflect the resource usage consumed by the multiplexer processes. Existing benchmarks are made using perf so this never created a problem, but it could be confusing to others using the "time" command. Reaping also allows logging the final exit status of the process.	5 years ago
David Wilson	136dee1fb4	[linear2] more merge fallout, fix Connection._mitogen_reset(mode=)	5 years ago
David Wilson	a9755d4ad0	[linear2] update mitogen_get_stack for new _build_stack() return value	5 years ago
David Wilson	1fca0b7a94	[linear2] fix MuxProcess test fixture and some merge fallout	5 years ago
David Wilson	0f63ca4c68	Make setting affinity optional.	5 years ago
David Wilson	9035884c77	ansible: abstract worker process model. Move all details of broker/router setup out of connection.py, instead deferring it to a WorkerModel class exported by process.py via get_worker_model(). The running strategy can override the configured worker model via _get_worker_model(). ClassicWorkerModel is installed by default, which implements the extension's existing process model. Add optional support for the third party setproctitle module, so children have pretty names in ps output. Add optional support for per-CPU multiplexers to classic runs.	5 years ago
David Wilson	402dba4197	module_finder: pass raw file to compile() Newer Ansibles have e.g. UTF-8 present in apt.py.	5 years ago
David Wilson	1aceacf89e	[stream-refactor] replace old detach_popen() reference	5 years ago
David Wilson	300f8b2ff9	ansible: fixturize creation of MuxProcess This relies on the previous commit resetting global variables. Update clean_shutdown() to handle duplicate calls, due to tests repeatedly installing it.	5 years ago
David Wilson	26b6333787	[stream-refactor] fix unix.Listener construction	5 years ago
Jordan Webb	1a02a86331	Add buildah transport	6 years ago
David Wilson	7ae926b325	ansible: prevent tempfile.mkstemp() leaks. This avoids a leak present in Ansible 2.7.0..current HEAD, and all similar leaks. See ansible/ansible#57327.	6 years ago
David Wilson	3620fce071	issue #593 : expose configurables for SSH keepalive and increase the default	6 years ago
David Wilson	0b7fd3f290	issue #591 : ansible: restore CWD prior to AnsibleModule initialization.	6 years ago
David Wilson	4f23f0bec1	issue #590 : update comment to indicate the hack is permanent	6 years ago
David Wilson	1a92995a24	issue #590 : include nasty workaround for sys.modules junk	6 years ago
David Wilson	92b4724010	issue #587 : consistent become_exe() behaviour for older Ansibles.	6 years ago
David Wilson	f35194fe0f	issue #587 : mitogen_doas should not become_exe for doas_path Looks like this has always been wrong - when used as a connection method, PlayContext.become_method/become_exe may hold totally unrelated data.	6 years ago
David Wilson	c1c8d5c31e	issue #587 : 2.8 PlayContext lacks sudo_flags attribute. This is a huge bodge.	6 years ago
David Wilson	e11b251c75	issue #587 : 2.8 PluginLoader.get() introduced new collection_list kwarg	6 years ago
David Wilson	46dde95962	issue #587 : 2.8 PlayContext.connection no longer contains connection name Not clear what the intention is here. Either need to ferret it out of some other location, or just stop preloading the connection class in the top-level process.	6 years ago
David Wilson	4a614c3950	issue #587 : bump max Ansible version	6 years ago
David Wilson	f105a81e20	ansible: descriptive version check during startup.	6 years ago
David Wilson	f30a4c05c8	issue #581 : expose mitogen_mask_remote_name variable.	6 years ago
David Wilson	65deb3feac	issue #575 : fix exception text rendering	6 years ago
David Wilson	34fb9da1be	issue #570 : add firewalld to always-fork list for now.	6 years ago
David Wilson	3ff6123483	issue #557 : support correct cpu_set_t size	6 years ago
David Wilson	2bd0bbd4df	issue #555 : ansible: workaround ancient reload(sys) hack. This is the most minimal change for what might be a relatively minimal edge case. Alternative is replacing reload(), but let's not do that yet. Closes #555	6 years ago
David Wilson	6309774be2	issue #554 : fix Ansible 2.4 compatibility	6 years ago
David Wilson	7743e57ff3	issue #554 : track and remove multiple make_tmp_path() calls.	6 years ago
David Wilson	7dacb68eeb	issue #552 : include process identity in log messages.	6 years ago
David Wilson	26e6194d0a	issue #548 : always treat transport=smart as 'ssh' for mitogen_via=. The idea behind transport=smart is to select between paramiko and OpenSSH given the availability of connection multiplexing and/or OSX kernel bugs. We need to make no such choice.	6 years ago
David Wilson	458a4faa97	ansible: create stub __init__.py for sdist. This went into 0.2.5 sdist tarball but it's not checked in.	6 years ago
David Wilson	8f9c67daf1	ansible: refactor affinity class and add abstract tests.	6 years ago
David Wilson	0f30808234	ansible: quiesce boto logger; closes #541 .	6 years ago
David Wilson	7fd0d34910	tests/ansible: Spec.port() test & mitogen_via= fix. ansible_ssh_port was not respected.	6 years ago
David Wilson	1f77d24bec	Update copyright year everywhere.	6 years ago
David Wilson	b5b23e8f3d	tests/ansible: Spec.become_pass() test.	6 years ago
David Wilson	ae5a471e31	issue #539 : disable logger propagation.	6 years ago
David Wilson	1c955a9876	ansible: capture stderr stream of async tasks. Closes #540 .	6 years ago
David Wilson	7ff4e6694c	issue #536 : rework how 2.3-compatible simplejson is served Regardless of the version of simplejson loaded in the master, load up the ModuleResponder cache with our 2.4-compatible version. To cope with simplejson being loaded due to modules like ec2_group that try to import it before importing 'json', also update target.py to remove it from the whitelist if a local 'json' module import succeeds.	6 years ago
David Wilson	8ae6ca1d5b	tests/ansible: Spec.become_method() test & mitogen_via= fix. ansible_become_method hostvar was not taken into account.	6 years ago
David Wilson	d1cadf8ac8	tests/ansible: Spec.password() test, document interactive pw limitation.	6 years ago
David Wilson	21ad299d7b	tests/ansible: Spec.remote_user() test & mitogen_via= fix. ansible_ssh_user precedence was incorrect.	6 years ago
David Wilson	748f5f675d	tests/ansible: Spec.remote_addr() test & mitogen_via= fix. ansible_ssh_host was not respected.	6 years ago
David Wilson	e1df98168c	issue #536 : add mitogen_via= tests too.	6 years ago
David Wilson	604b418412	ansible: fix a crash on 2.3 when mitogen_via= host is missing.	6 years ago
David Wilson	001e3fee86	issue #536 : restore correct Python interpreter selection behaviour.	6 years ago
David Wilson	05b1ccb658	ansible: stash PID files in CWD if requested for debugging.	6 years ago
David Wilson	eb67fbe9d2	ansible: double the default pool size. Tempted to push this up to 64, but let's do it incrementally just in case.	6 years ago
David Wilson	b89e53fd70	ansible: raise error with correct exception type.	6 years ago
David Wilson	0e193c223c	issue #508 : master: minify all Mitogen/ansible_mitogen sources. Minify-safe files are marked with a magical "# !mitogen: minify_safe" comment anywhere in the file, which activates the minifier. The result is naturally cached by ModuleResponder, therefore lru_cache is gone too. Given: import os, mitogen @mitogen.main() def main(router): c = router.ssh(hostname='k3') c.call(os.getpid) router.sudo(via=c) SSH footprint drops from 56.2 KiB to 42.75 KiB (-23.9%) Ansible "shell: hostname" drops 149.26 KiB to 117.42 KiB (-21.3%)	6 years ago
David Wilson	7badb4a25b	ansible: hacky parser to allow bools to be specified on command line	6 years ago
David Wilson	b499fbe29b	ansible: add mitogen_ssh_compression variable.	6 years ago
David Wilson	a2ae4ed696	SyntaxError.	6 years ago
David Wilson	a9d48a8fdc	ansible: don't pin controller if <4 cores.	6 years ago
David Wilson	4531338b12	ansible: document and make affinity stuff portable to non-Linux Portable as in it does nothing there, at least for now.	6 years ago
David Wilson	de5c050707	ansible: fix affinity.py test failure on 2 cores.	6 years ago
David Wilson	00ae90b2b2	ansible: preheat PluginLoader caches before fork. This has been broken for some time, but somehow it has become noticeable on recent Ansible. loop-100-tasks.yml before: 15.532724001 seconds time elapsed 8.453850000 seconds user 5.808627000 seconds sys loop-100-tasks.yml after: 8.991635735 seconds time elapsed 5.059232000 seconds user 2.578842000 seconds sys	6 years ago
David Wilson	7b129e8576	ansible: use Poller for WorkerProcess; closes #491 .	6 years ago
David Wilson	c6d5aa29ba	ansible: new multiplexer/workers configuration Following on from 152effc26c9a5918cb7ead7a97fe7fa7f81b6764, * Pin mux to CPU 0 * Pin top-level CPU 1 * Pin workers sequentially to CPU 2..n Nets 19.5% improvement on issue_140__thread_pileup.yml when targetting 64 Docker containers on the same 8 core/16 thread machine. Before (prior to last scheme, no affinity at all): 2294528.731458 task-clock (msec) # 6.443 CPUs utilized 10,429,745 context-switches # 0.005 M/sec 2,049,618 cpu-migrations # 0.893 K/sec 8,258,952 page-faults # 0.004 M/sec 5,532,719,253,824 cycles # 2.411 GHz (83.35%) 3,267,471,616,230 instructions # 0.59 insn per cycle # 1.22 stalled cycles per insn (83.35%) 662,006,455,943 branches # 288.515 M/sec (83.33%) 39,453,895,977 branch-misses # 5.96% of all branches (83.37%) 356.148064576 seconds time elapsed After: 2226463.958975 task-clock (msec) # 7.784 CPUs utilized 9,831,466 context-switches # 0.004 M/sec 180,065 cpu-migrations # 0.081 K/sec 5,082,278 page-faults # 0.002 M/sec 5,592,548,587,259 cycles # 2.512 GHz (83.35%) 3,135,038,855,414 instructions # 0.56 insn per cycle # 1.32 stalled cycles per insn (83.32%) 636,397,509,232 branches # 285.833 M/sec (83.30%) 39,135,441,790 branch-misses # 6.15% of all branches (83.35%) 286.036681644 seconds time elapsed	6 years ago
David Wilson	1b909e8697	ansible: pin connection multiplexer to a single core Nets a reliable 8% improvement in issue_140__thread_pileup.yml when targetting 64 Docker containers on the same 8 core/16 thread machine. Before: 2294528.731458 task-clock (msec) # 6.443 CPUs utilized 10,429,745 context-switches # 0.005 M/sec 2,049,618 cpu-migrations # 0.893 K/sec 8,258,952 page-faults # 0.004 M/sec 5,532,719,253,824 cycles # 2.411 GHz (83.35%) 4,001,276,805,120 stalled-cycles-frontend # 72.32% frontend cycles idle (83.30%) 2,024,159,442,463 stalled-cycles-backend # 36.59% backend cycles idle (66.65%) 3,267,471,616,230 instructions # 0.59 insn per cycle # 1.22 stalled cycles per insn (83.35%) 662,006,455,943 branches # 288.515 M/sec (83.33%) 39,453,895,977 branch-misses # 5.96% of all branches (83.37%) 356.148064576 seconds time elapsed After: 2208247.938562 task-clock (msec) # 6.735 CPUs utilized 8,489,840 context-switches # 0.004 M/sec 1,432,967 cpu-migrations # 0.649 K/sec 7,508,957 page-faults # 0.003 M/sec 5,477,293,750,357 cycles # 2.480 GHz (83.31%) 3,984,360,350,811 stalled-cycles-frontend # 72.74% frontend cycles idle (83.32%) 1,976,646,418,711 stalled-cycles-backend # 36.09% backend cycles idle (66.64%) 3,196,197,480,792 instructions # 0.58 insn per cycle # 1.25 stalled cycles per insn (83.36%) 648,247,332,967 branches # 293.557 M/sec (83.35%) 39,004,881,070 branch-misses # 6.02% of all branches (83.37%) 327.876903668 seconds time elapsed	6 years ago
David Wilson	e587396e70	ansible: hook strategy and worker processes into profiler	6 years ago
David Wilson	84944a9a61	ansible: ensure MuxProcess MITOGEN_PROFILING results reach disk. This has been broken for quite some time.	6 years ago
David Wilson	954f874085	issue #527 : catch new-style module tracebacks like vanilla.	6 years ago
David Wilson	a1121c5a84	issue #499 : respect C.BECOME_ALLOW_SAME_USER.	6 years ago
David Wilson	be6ab52fe1	issue #488 : fix shutdown damage caused in `6ca2677de5` os._exit() subverted calm shutdown, meaning unix.Listener never had a chance to cleanup its socket. Move unix.Listener socket cleanup into its class so it is automatic during shutdown, rather than cutpasted for each consumer. Disable the watcher thread in the MuxProcess, it is useless. Add .sock extension to /tmp/mitogen_unix_*, so we can write a test.	6 years ago
David Wilson	38a553d42d	issue #490 : prevent double close() destroying unrelated Connection.	6 years ago
David Wilson	e7fe95af88	issue #477 : fix sudo_args selection.	6 years ago
David Wilson	599da0689a	issue #477 / ansible: avoid a race in async job startup. Ansible 2.3/Python 2.4 work revealed there is no guarantee a slow target will have written the initial job status file out before a fast controller makes an initial check for it. Therefore, provide AsyncRunner with a sender it should send a message to when the initial job file has been written. As a bonus, also catch and report exceptions happening early in AsyncRunner, rather than leaving them to end up in -vvv output.	6 years ago
David Wilson	0175052099	issue #477 : fix source of become_flags on 2.3.	6 years ago
David Wilson	97f3cfe4f4	issue #477 : target.file_exists() wrapper. os.path.exists physical module name varies across major Python versions.	6 years ago
David Wilson	8f5b65f7ec	issue #477 : introduce subprocess isolation. Since Python 2.4 fork is so defective, we must use subprocesses for mitogen_task_isolation=fork. This has plenty of upside, since the long term goal is to dump forking altogether. This allows a gentle introduction of its replacement.	6 years ago
David Wilson	b9924683ac	ansible: docstring fixes.	6 years ago
David Wilson	75f53faf8c	issue #477 : shlex.split() in 2.4 required bytes input.	6 years ago
David Wilson	dc1d4251e3	ansible: synchronize module needs '.docker_cmd' attr for Docker plugin.	6 years ago
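
The issue #549 commits above (eeb7150f24, 395b03a77d, edde251d58) describe raising the soft open-file limit to the hard limit and capping an advertised RLIM_INFINITY. A minimal sketch of that idea follows, using only the standard `resource` module. The function names and the `cap` default are hypothetical and chosen for illustration; they are not taken from the Mitogen source.

```python
import resource

# Hypothetical cap used when the hard limit is advertised as unlimited.
# The value is an assumption for illustration only.
DEFAULT_CAP = 524288


def pick_soft_limit(soft, hard, cap=DEFAULT_CAP):
    """Return the soft limit to request, never lower than the current one."""
    inf = resource.RLIM_INFINITY
    if soft == inf:
        # Already unlimited: nothing to raise.
        return soft
    # An "unlimited" hard limit is replaced by the cap, since the kernel
    # may enforce a smaller real ceiling than it advertises.
    target = cap if hard == inf else hard
    return max(soft, target)


def raise_nofile_soft_limit(cap=DEFAULT_CAP):
    """Try to raise RLIMIT_NOFILE; return the soft limit now in effect."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    target = pick_soft_limit(soft, hard, cap)
    if target != soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        except (ValueError, OSError):
            # The request was rejected; keep the existing limit.
            pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

Keeping the decision in a pure function separates the policy (what to ask for) from the system call, so the policy can be checked without changing the limits of the running process.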


522 Commits (101e20053887bf33d6753afb572cc697f137f112)
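
Commit c6d5aa29ba describes a pinning scheme: the multiplexer on CPU 0, the top-level process on CPU 1, and workers assigned sequentially to CPU 2..n. The sketch below shows one way such a plan could be computed. The names `plan_cpu` and `pin_current_process` are hypothetical, and leaving every process unpinned on machines with fewer than 4 cores is a simplifying assumption (commit a9d48a8fdc only states that the controller is not pinned in that case).

```python
import os


def plan_cpu(role, cpu_count, worker_index=0):
    """Return the CPU index for a process, or None to leave it unpinned."""
    if cpu_count < 4:
        # Assumption: skip pinning entirely on small machines.
        return None
    if role == "mux":
        return 0
    if role == "top":
        return 1
    if role == "worker":
        # Workers cycle through CPUs 2..cpu_count-1 in order.
        return 2 + worker_index % (cpu_count - 2)
    raise ValueError("unknown role: %r" % (role,))


def pin_current_process(cpu):
    """Pin the calling process to one CPU; return True on success."""
    if cpu is None or not hasattr(os, "sched_setaffinity"):
        # Not requested, or the platform lacks the call (non-Linux).
        return False
    try:
        os.sched_setaffinity(0, {cpu})
    except OSError:
        return False
    return True
```

Here `worker_index` stands in for the shared counter discussed in commit f78a5f08c6; how that counter is shared between processes is outside the scope of this sketch.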