When Ansible shuts down abnormally, the broker begins
force-disconnecting every context, including those for which a
connection is still in progress.
When that happens, .call(init_child) raises ChannelError, and that
error needs to be returned to the worker, assuming the worker even
still exists.
This solution is incomplete: with sick nodes it is also possible the
connection died naturally, in which case the worker should perhaps
respond by retrying the connection.
Previously, the unhandled ChannelError would spam the console when e.g.
fork() began returning EAGAIN.
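A rough sketch of the intended handling, with hypothetical names
(connect_and_init, router, spec, reply_to): the point is only that
ChannelError raised by .call(init_child) is caught and reported back to
the worker instead of propagating unhandled and spamming the console.

```python
import mitogen.core

def connect_and_init(router, spec, reply_to, init_child):
    # The connection attempt can race broker shutdown; in that case
    # .call() raises ChannelError as the context is force-disconnected.
    try:
        context = router.connect(**spec)
        context.call(init_child)
        return context
    except mitogen.core.ChannelError as e:
        # Relay the failure to the worker (if it still exists) rather
        # than letting the exception escape to the console.
        reply_to.send({'msg': str(e)})
        return None
```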
In some scenarios, Ansible's worker seems to exit early, resulting in
EPIPE during .recv() or .send(). Log an error and gracefully disconnect
in that case.
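A minimal sketch of the graceful path, with hypothetical names (conn,
on_disconnect): if the worker has already exited, its end of the pipe
is gone, so EPIPE is logged and the connection torn down instead of
crashing.

```python
import errno
import logging

LOG = logging.getLogger(__name__)

def safe_transact(conn, msg, on_disconnect):
    try:
        conn.send(msg)
        return conn.recv()
    except (IOError, OSError) as e:
        if e.errno == errno.EPIPE:
            # Worker exited early: log and disconnect gracefully.
            LOG.error('worker appears to have exited early: %s', e)
            on_disconnect()
            return None
        raise
```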
The connection multiplexer may not be scheduled until every one of the
$forks worker processes has attempted a connection, so the listen
backlog must be large enough to hold every worker.
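A minimal sketch of sizing the backlog, assuming a UNIX socket listener
and a `forks` parameter taken from Ansible's configuration:

```python
import socket

def make_listener(path, forks):
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    # The backlog must hold every worker, since all of them may attempt
    # to connect before the multiplexer gets scheduled again.
    sock.listen(forks)
    return sock
```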
* Always enable the faulthandler module in the top-level process if it
is available.
* Make the MITOGEN_DUMP_THREAD_STACKS interval configurable, to better
handle larger runs (see the sketch after this list).
* Add docs subsection on diagnosing hangs.
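A minimal sketch of the first two points, not the project's actual
implementation: enable faulthandler when it is importable, and treat
MITOGEN_DUMP_THREAD_STACKS as a dump interval in seconds (a
hypothetical interpretation of the setting).

```python
import os

try:
    import faulthandler
except ImportError:
    faulthandler = None

def setup_diagnostics():
    if faulthandler is None:
        return
    # Always enable fatal-signal tracebacks in the top-level process.
    faulthandler.enable()
    # Periodically dump all thread stacks when an interval is configured.
    interval = int(os.environ.get('MITOGEN_DUMP_THREAD_STACKS', '0') or '0')
    if interval > 0:
        faulthandler.dump_traceback_later(interval, repeat=True)
```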
Conflicts:
ansible_mitogen/process.py
Without this, an invocation like:
sudo ansible-playbook foo.yml
where foo.yml uses setns, could inherit the HOME environment variable
from the external non-root user, which broke /usr/bin/mysql_upgrade and
plenty more.
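One plausible shape of the fix, purely illustrative: after entering the
target namespace, reset HOME from that environment's password database
instead of trusting whatever the invoking shell exported.

```python
import os
import pwd

def reset_home():
    # Derive HOME for the effective user inside the namespace, so tools
    # like mysql_upgrade do not see the external user's home directory.
    os.environ['HOME'] = pwd.getpwuid(os.geteuid()).pw_dir
```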
There were two problems with detection and handling of class methods as call targets in Python 3:
* Methods no longer define `im_self` -- this is now only `__self__`
* The `types` module no longer defines a `ClassType`
The universally compatible (v2.6+) solution was to switch to the `inspect` module -- whose interface has been stable -- and to check the method attribute `__self__`.
(It doesn't hurt that `inspect` checks are more brief and we now no longer need the `types` module here.)
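A minimal sketch of the version-neutral check, assuming `fn` is the
candidate call target: on both 2.6+ and 3.x a bound method exposes
`__self__`, and for a classmethod that attribute is the class itself,
so neither `im_self` nor `types.ClassType` is required.

```python
import inspect

def is_classmethod_target(fn):
    # Bound methods expose __self__ on Python 2.6+ and 3.x alike;
    # for a classmethod, __self__ is the class it is bound to.
    return inspect.ismethod(fn) and inspect.isclass(getattr(fn, '__self__', None))
```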