Without this, MuxProcess will start dying too early, before Ansible /
TaskQueueManager.cleanup() has a chance to wait on worker processes.
That would allow WorkerProcess to see ECONNREFUSED from the MuxProcess
socket much more easily.
This 4GB limit was already set for MuxProcess and inherited by all
descendents including the context running on the target host, but it was
not applied to the WorkerProcess router.
That explains why the error from the ticket is being raised by the
router within the WorkerProcess rather than the router on the original
target.
It's no longer necessary, since connection attempts are no longer truly
blocking. When CTRL+C is hit in the top-level process, broker will begin
shutdown, which will cancel all pending connection attempts, causing
pool threads to wake. The pool can't block during shutdown anymore.
"self.initialized = False" slipped in a few days ago, on second thoughts
that flag is not needed at all, by simply rearranging ClassicWorkerModel
to have a regular constructor.
This hierarchy is still squishy, it needs more love. Remaining
MuxProcess class attributes should eliminated.
While catching every possible case where "open file limit exceeded" is
not possible, we can at least increase the soft limit to the available
hard limit without any user effort.
Do this in Ansible top-level process, even though we probably only need
it in the MuxProcess. It seems there is no reason this could hurt
Previously we exitted without calling waitpid(), which meant the
top-level process struct rusage did not reflect the resource usage
consumed by the multiplexer processes.
Existing benchmarks are made using perf so this never created a problem,
but it could be confusing to others using the "time" command, and also
allows logging the final exit status of the process.
Move all details of broker/router setup out of connection.py, instead
deferring it to a WorkerModel class exported by process.py via
get_worker_model(). The running strategy can override the configured
worker model via _get_worker_model().
ClassicWorkerModel is installed by default, which implements the
extension's existing process model.
Add optional support for the third party setproctitle module, so
children have pretty names in ps output.
Add optional support for per-CPU multiplexers to classic runs.
This relies on the previous commit resetting global variables.
Update clean_shutdown() to handle duplicate calls, due to tests
repeatedly installing it.
Regardless of the version of simplejson loaded in the master, load up
the ModuleResponder cache with our 2.4-compatible version.
To cope with simplejson being loaded due to modules like ec2_group that
try to import it before importing 'json', also update target.py to
remove it from the whitelist if a local 'json' module import succeeds.
os._exit() subverted calm shutdown, meaning unix.Listener never had a
chance to cleanup its socket.
Move unix.Listener socket cleanup into its class so it is automatic
during shutdown, rather than cutpasted for each consumer.
Disable the watcher thread in the MuxProcess, it is useless.
Add .sock extension to /tmp/mitogen_unix_*, so we can write a test.
This is in part so image_prep can run against an ancient CentOS 5 image
without any upfront help, and in part simply because it's very easy to
support.
The connection multiplexer can expect to not be scheduled at least until
every $forks worker processes has attempted a connection, so the backlog
must be able to hold every worker.
* Always enable the faulthandler module in the top-level process if it
is available.
* Make MITOGEN_DUMP_THREAD_STACKS interval configurable, to better
handle larger runs.
* Add docs subsection on diagnosing hangs.
Conflicts:
ansible_mitogen/process.py