To support detach, we must be able to preload the target with every
module it will need prior to detachment. This implements the
intermediary part of the process (i.e. the Ansible fork parent) --
receiving LOAD_MODULE/FORWARD_MODULE pairs and ensuring they reach the
child.
This may come back to bite later, but in the meantime it avoids shipping
up to 12KiB of junk metadata for every single task invocation.
For detachment (aka. async), we must ensure the target has two types of
preloads completed (modules and module_utils files) before detaching.
Bootstrap would hang if (as of writing) a pipe sufficient to hold 42,006
bytes was not handed out by the kernel to the first stage. It was luck
that this didn't manifest before, as first stage could write the full
source and exit completely before reading begun.
It is not clear under which circumstances this could previously occur,
but at least since Linux 4.5, it can be triggered if
/proc/sys/fs/pipe-max-size is reduced from the default of 1MiB, which
can have the effect of capping the default pipe buffer size of 64KiB to
something lower.
Suspicion is that buffer pipe size could also be reduced under memory
pressure, as reference to busy machines appeared a few times in the bug
report.
The OpenShift installer modifies /etc/resolv.conf then tests the new
resolver configuration, however, there was no mechanism to reload
resolv.conf in our reuseable interpreter.
https://github.com/openshift/openshift-ansible/blob/release-3.9/roles/openshift_web_console/tasks/install.yml#L137
This inserts an explicit call to res_init() for every new style
invocation, with an approximate cost of ~1usec on Linux since glibc
verifies resolv.conf has changed before reloading it.
There is little to be done for users of the thread-safe resolver APIs,
their state is hidden from us. If bugs like that manifest, whack-a-mole
style 'del sys.modules[thatmod]' patches may suffice.
The module the connection class is now loaded as is
"ansible.plugins.connection.mitogen_ssh", etc., which breaks the test.
Instead, check if the connection is an instance of the base Connection
class.
Traced git log all the way back to beginning of time, and checked
Ansible versions starting Jan 2016. Zero clue where this came from, but
the convention suggests it came from Ansible at some point.
While adding support for non-new style module types, NewStyleRunner
began writing modules to a temporary file, and sys.argv was patched to
actually include the script filename. The argv change was never required
to fix any particular bug, and a search of the standard modules reveals
no argv users. Update argv[0] to be '', like an interactive interpreter
would have.
While fixing #210, new style runner began setting __file__ to the
temporary file path in order to allow apt.py to discover the Ansiballz
temporary directory. 5 out of 1,516 standard modules follow this
pattern, but in each case, none actually attempt to access __file__,
they just call dirname on it. Therefore do not write the contents of
file, simply set it to the path as it would exist, within a real
temporary directory.
Finally move temporary directory creation out of runner and into target.
Now a single directory exists for the duration of a run, and is emptied
by runner.py as necessary after each task invocation.
This could be further extended to stop rewriting non-new-style modules
in a with_items loop, but that's another step.
Finally the last bullet point in the documentation almost isn't a lie
again.