Minify-safe files are marked with a magical "# !mitogen: minify_safe"
comment anywhere in the file, which activates the minifier. The result
is naturally cached by ModuleResponder, therefore lru_cache is gone too.
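A minimal sketch of how such a marker could be detected, assuming a plain substring test is enough. The helper name `is_minify_safe` is hypothetical and is not taken from Mitogen's API.

```python
# Hypothetical helper: decides whether a source file opts in to minification.
MINIFY_SAFE_MARKER = "# !mitogen: minify_safe"


def is_minify_safe(source):
    """Return True if the marker comment appears anywhere in the source text."""
    return MINIFY_SAFE_MARKER in source
```

Files without the marker would be sent unmodified, so opting in stays an explicit per-file decision.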
Given:
    import os, mitogen

    @mitogen.main()
    def main(router):
        c = router.ssh(hostname='k3')
        c.call(os.getpid)
        router.sudo(via=c)

SSH footprint drops from 56.2 KiB to 42.75 KiB (-23.9%)
Ansible "shell: hostname" drops from 149.26 KiB to 117.42 KiB (-21.3%)
This has been broken for some time, but somehow it has become noticeable
on recent Ansible.
loop-100-tasks.yml before:
15.532724001 seconds time elapsed
8.453850000 seconds user
5.808627000 seconds sys
loop-100-tasks.yml after:
8.991635735 seconds time elapsed
5.059232000 seconds user
2.578842000 seconds sys
os._exit() subverted calm shutdown, meaning unix.Listener never had a
chance to clean up its socket.
Move unix.Listener socket cleanup into its class so it is automatic
during shutdown, rather than copy-pasted for each consumer.
Disable the watcher thread in the MuxProcess, since it is useless.
Add .sock extension to /tmp/mitogen_unix_*, so we can write a test.
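The idea of a listener that owns its socket cleanup can be sketched as follows. This is an illustration only, not the real unix.Listener: the class body, the `close()` method and the default path are assumptions.

```python
import os
import socket
import tempfile


class Listener(object):
    """Sketch only: owns a UNIX socket and removes its path on close()."""

    def __init__(self, path=None, backlog=100):
        if path is None:
            # Hypothetical default; the .sock suffix mirrors the naming above.
            path = os.path.join(tempfile.mkdtemp(), 'mitogen_unix_demo.sock')
        self.path = path
        self._sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self._sock.bind(self.path)
        self._sock.listen(backlog)

    def close(self):
        """Close the socket and unlink its path; safe to call twice."""
        self._sock.close()
        try:
            os.unlink(self.path)
        except OSError:
            pass  # Path was already removed.
```

Because cleanup lives in the class, every consumer gets it by calling `close()` during shutdown instead of repeating the unlink logic.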
Ansible 2.3/Python 2.4 work revealed there is no guarantee a slow target
will have written the initial job status file out before a fast
controller makes an initial check for it. Therefore, provide AsyncRunner
with a sender it should send a message to when the initial job file has
been written.
As a bonus, also catch and report exceptions happening early in
AsyncRunner, rather than leaving them to end up in -vvv output.
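A rough sketch of the handshake, written for Python 3 and using a `queue.Queue` as a stand-in for the sender. The function names and the status file contents are hypothetical, not the AsyncRunner implementation.

```python
import json
import os
import queue
import tempfile
import threading


def run_job(job_path, started):
    """Write the initial status file, then report that it exists."""
    try:
        with open(job_path, 'w') as fp:
            json.dump({'started': 1, 'finished': 0}, fp)
    except Exception as exc:
        # Report early failures to the caller instead of only logging them.
        started.put(('error', str(exc)))
        return
    started.put(('ok', job_path))
    # ... the long-running module would execute here ...


def start_job(job_path=None, timeout=5.0):
    """Start the job and block until its status file is known to exist."""
    if job_path is None:
        job_path = os.path.join(tempfile.mkdtemp(), 'job.json')
    started = queue.Queue()
    worker = threading.Thread(target=run_job, args=(job_path, started))
    worker.start()
    result = started.get(timeout=timeout)
    # Joining is only reasonable here because the sketch job ends at once.
    worker.join()
    return result
```

The caller never polls for the file before the message arrives, so a fast controller cannot race a slow target.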
Since Python 2.4 fork is so defective, we must use subprocesses for
mitogen_task_isolation=fork. This has plenty of upside, since the long
term goal is to dump forking altogether. This allows a gentle
introduction of its replacement.
This is in part so image_prep can run against an ancient CentOS 5 image
without any upfront help, and in part simply because it's very easy to
support.
This refactors connection.py to pull the two huge dict-building
functions out into new transport_config.PlayContextSpec and
MitogenViaSpec classes, leaving a lot more room to breathe in both files
to figure out exactly how connection configuration should work.
The changes made in 1f21a30 / 3d58832 are updated or completely removed.
The original change was misguided: it was a bid to fix connection
delegation taking variables from the wrong place when delegate_to was
active.
The Python path no longer defaults to '/usr/bin/python', as this does
not appear to be Ansible's normal behaviour. This has changed several
times, so it may have to change again, and it may cause breakage after
release.
Connection delegation respects c.DEFAULT_REMOTE_USER, whereas the
previous version simply tried to fetch whatever was in the
'ansible_user' hostvar. Many more connection delegation variables now
match vanilla's handling more closely, but this still requires more
work. Some of the variables need access to the command line, and
upstream is in the process of changing all that stuff around.
This replaces the previous method for capping poor Popen()
performance, instead entirely monkey-patching the problem function
rather than simply working around it.
Ideally it would only be called once, and in future maybe it can, but
right now we need to cope with these cases:
* Downstream parent notifies us of disconnection (DEL_ROUTE)
* We notify ourselves of disconnection
* We notify ourselves and so does downstream parent
It's the third case that causes the error.
Simply listen to RouteMonitor's Context "disconnect" and forget
contexts according to RouteMonitor's rules, rather than duplicate them
(and screw it up).
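The hazard in the cases above is a handler that fires twice for one context. The sketch below does not reproduce the signal-based fix; it shows a different, defensive technique (an idempotent `dict.pop`) purely to illustrate why a duplicate notification is dangerous. The `ContextTracker` class is hypothetical.

```python
class ContextTracker(object):
    """Sketch: bookkeeping that tolerates duplicate disconnect notices."""

    def __init__(self):
        self._refs_by_context = {}
        self.forgotten = []

    def add(self, context):
        """Start tracking a context with a zero reference count."""
        self._refs_by_context[context] = 0

    def on_disconnect(self, context):
        """Forget a context; a second notification is a harmless no-op."""
        if self._refs_by_context.pop(context, None) is None:
            return False  # Unknown or already forgotten.
        self.forgotten.append(context)
        return True
```

Indexing the map directly instead of using `pop()` with a default would raise KeyError on the second notification.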
Update _via_by_context earlier; fixes:
    Traceback (most recent call last):
      File "/Users/dmw/src/mitogen/mitogen/service.py", line 519, in _on_service_call
        return invoker.invoke(method_name, kwargs, msg)
      File "/Users/dmw/src/mitogen/mitogen/service.py", line 253, in invoke
        response = self._invoke(method_name, kwargs, msg)
      File "/Users/dmw/src/mitogen/mitogen/service.py", line 239, in _invoke
        ret = method(**kwargs)
      File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 454, in get
        reraise(*result)
      File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 412, in _wait_or_start
        response = self._connect(key, spec, via=via)
      File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 363, in _connect
        self._update_lru(context, spec, via)
      File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 266, in _update_lru
        self._update_lru_unlocked(new_context, spec, via)
      File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 253, in _update_lru_unlocked
        if self._refs_by_context[context] == 0:
    KeyError: Context(1008, u'ssh.localhost.sudo.mitogen__user3')
An earlier commit moved the Stream.routes attribute into a private map
belonging to RouteMonitor, to make upgrades smoother. This adds a new
accessor method to RouteMonitor.
(Pull #377)
Changes:
- additional_parameters -> extra_args
- Merge with kubectl changes from dmw branch
- Update docs
- Remove unused username class member
- Avoid mutable kubectl_args class member
- Use six.iteritems
This change allows the kubectl connector to support the same options as
Ansible's original connector.
The playbook sample includes an example pod with two containers, and
checks that the version of Python changes as expected when moving from
one container to the other.
Reverts 49736b3a: large file copies can't avoid the RTT.
The parent stack must be blocked while FileService progresses, as unlike
the small file path, it does not make a snapshot of the (possibly
temporary) file passed by the action plug-in. So we need to keep that
file alive while the service runs.
Add a new integration test and a new soak test to cover both.
When Ansible abnormally shuts down, the broker begins
force-disconnecting every context, including those for which connection
is currently in-progress.
When that happens, .call(init_child) throws ChannelError, and that needs
to be returned to the worker, assuming the worker even still exists.
This solution is incomplete: with sick nodes, it's also possible the
worker died naturally, and so the worker should perhaps respond by
retrying the connection.
Previously, the unhandled ChannelError would spam the console when e.g.
fork() began returning EAGAIN.
The connection multiplexer can expect not to be scheduled at least until
each of the $forks worker processes has attempted a connection, so the
backlog must be able to hold every worker.
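As a sketch, sizing the backlog from the fork count might look like this. The function name and socket path are hypothetical, and the operating system may still cap the effective backlog below the requested value.

```python
import os
import socket
import tempfile


def make_listener(forks):
    """Bind a UNIX listening socket whose backlog can hold every worker."""
    path = os.path.join(tempfile.mkdtemp(), 'mux.sock')
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    # One pending-connection slot per worker process.
    sock.listen(forks)
    return sock, path
```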
* Always enable the faulthandler module in the top-level process if it
is available.
* Make MITOGEN_DUMP_THREAD_STACKS interval configurable, to better
handle larger runs.
* Add docs subsection on diagnosing hangs.
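Enabling faulthandler only when it is available could be sketched like this. The wrapper function is hypothetical; `faulthandler.enable()` is part of the Python 3 standard library, and the import may fail on interpreters that lack the module.

```python
try:
    import faulthandler
except ImportError:
    # Module is absent on interpreters that do not ship it.
    faulthandler = None


def enable_faulthandler(stream):
    """Enable faulthandler on `stream` if the module exists; return success."""
    if faulthandler is None:
        return False
    # The stream must stay open for as long as the handler is enabled.
    faulthandler.enable(file=stream, all_threads=True)
    return True
```

In the top-level process the stream would typically be `sys.stderr`, so a crash or fatal signal dumps the stack of every thread.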
Conflicts:
ansible_mitogen/process.py
Calls to connect.put_file() where the file is small enough to fit in a
single RPC proceed without waiting for an RPC response. If
the write fails the target context will log an exception, and any
subsequent step depending on the written file will fail.
I verified every built-in action plugin for file transfer calls, and
they all depend on the transferred file in the following step, so this
should be safe.
Reduces template/copy actions to 2 RTTs. Over a 250 ms link,
loop-20-templates.yml runtime drops to 10 seconds, from 30 seconds with
v0.2.2 and from 123 seconds with vanilla Ansible with pipelining
enabled.
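A back-of-envelope check of the figures above, assuming the playbook runs 20 template actions (inferred from its filename) and that link latency dominates the runtime. Both are assumptions, not measurements.

```python
def estimated_runtime(actions, rtts_per_action, rtt_seconds):
    """Latency-only estimate: every round trip costs one full RTT."""
    return actions * rtts_per_action * rtt_seconds


# 20 actions at 2 round trips each over a 250 ms link.
print(estimated_runtime(20, 2, 0.25))
```

Under those assumptions the estimate is 10 seconds, which is consistent with the measured runtime.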
PlayContext.delegate_to is the unexpanded template, Ansible doesn't keep
a copy of it around anywhere convenient. We either need to re-expand it
or take the expanded version that was stored on the Task, which is what
is done here.
This needs more work -- pretty certain that python_path and suchlike are
coming from the wrong place. Possibly we need another config_from_..()
specialized for delegate_to.
This change is relatively incomplete -- ideally we could snapshot
os.environ and /etc/environment at startup and respect key deletions
too, but that's a lot more work. Wait for a bug report instead.
Closes #338.
Concurrent calls to ModuleDepService would cause significant wasted
work, as potentially all pool threads run the same uncached module dep
scan.
Without:

    3243581 function calls (3233009 primitive calls) in 4770.672 seconds

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      2523    0.011    0.000   39.849    0.016 services.py:409(scan)

With:

    2801561 function calls (2800042 primitive calls) in 5166.843 seconds

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      2506    0.009    0.000    1.967    0.001 services.py:411(scan)

Ignore timing variance due to problems with the test job.
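One way to stop pool threads from repeating the same uncached scan is to let the first caller compute while the rest wait. The sketch below illustrates that idea and is not the ModuleDepService code; `DedupingCache` and its `compute` callable are hypothetical.

```python
import threading


class DedupingCache(object):
    """Sketch: only one thread computes a missing key; others wait for it."""

    def __init__(self, compute):
        self._compute = compute
        self._lock = threading.Lock()
        self._results = {}
        self._pending = {}  # key -> Event set once the computation ends

    def get(self, key):
        with self._lock:
            if key in self._results:
                return self._results[key]
            event = self._pending.get(key)
            owner = event is None
            if owner:
                event = self._pending[key] = threading.Event()
        if not owner:
            # Another thread is already scanning; wait, then re-check.
            event.wait()
            return self.get(key)
        try:
            value = self._compute(key)
            with self._lock:
                self._results[key] = value
        finally:
            with self._lock:
                del self._pending[key]
            event.set()
        return value
```

If the computation raises, the waiting threads retry and one of them becomes the new owner, so a failure is never cached.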
Given an extracted download of mitogen-2.2.tar.gz, with strategy_plugins
pointing into it, if an old version of the package was pip-installed,
then the old pip-installed package would be imported and override
whatever came from the tarball.
Instead, modify sys.path before attempting any import. This still isn't
perfect, but it's better.
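A minimal sketch of the sys.path adjustment, assuming the extracted tarball directory is known before anything from the package is imported. The helper name is hypothetical.

```python
import os
import sys


def prefer_directory(base_dir):
    """Put base_dir first on sys.path so its packages win over installed ones."""
    base_dir = os.path.abspath(base_dir)
    if base_dir in sys.path:
        sys.path.remove(base_dir)
    sys.path.insert(0, base_dir)
```

This only affects imports made afterwards. Anything already present in `sys.modules` keeps pointing at the old copy, which is one reason the approach is still imperfect.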
When running any kind of script, rewrite the hashbang like Ansible does,
but subsequently ignore it and explicitly use a fragment of shell from
the ansible_*_interpreter variable to call the interpreter, just like
Ansible does.
This fixes hashbangs containing '/usr/bin/env A=1 bash' on Linux, where
putting that into a hashbang line results in an infinite loop.
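On Linux the kernel passes everything after the interpreter path in a hashbang line as a single argument, which is consistent with the loop described above. A sketch of building the command line explicitly instead, where the `build_argv` helper is hypothetical:

```python
import shlex


def build_argv(interpreter_fragment, script_path):
    """Split the interpreter fragment shell-style and append the script."""
    return shlex.split(interpreter_fragment) + [script_path]
```

For example, `build_argv('/usr/bin/env A=1 bash', '/tmp/x.sh')` keeps `A=1` and `bash` as separate arguments, so `env` sees an assignment followed by a command.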
* mitogen/ansible_mitogen should only generate ERROR-level logs in
log_path unless -vvv is enabled.
* Targets were accidentally configured to always have DEBUG set, causing
many log messages to be sent on the wire even though they would be
filtered in the master.
Closes #317.
Vanilla Ansible supports expandvars-like expansions widely in a variety
of places. Prefer to whitelist those we need, rather than sprinkling
hellish semantics everywhere.
On OS X with case-insensitive filenames, resolving
'ansible.module_utils.facts.base.Hardware' finds
'ansible.module_utils.facts.hardware/__init__.py', because
module_finder's procedure is completely wrong for resolving child
modules. Patch over it for now since it otherwise works for Ansible.
* ansible: use unicode_literals everywhere since it only needs to be
compatible back to 2.6.
* compat/collections.py: delete this entirely and rip out the parts of
functools that require it.
* Introduce serializable Kwargs dict subclass that translates keys to
Unicode on instantiation.
* enable_debug_logging() must set _v/_vv globals.
* cStringIO does not exist in 3.x.
* Treat IOLogger and LogForwarder input as latin-1.
* Avoid ResourceWarnings in first stage by explicitly closing fps.
* Fix preamble_size.py syntax errors.
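The Kwargs idea from the list above can be sketched for Python 3 as follows. This is an illustration under the assumption that keys arrive as either bytes or text; it is not the project's actual class.

```python
class Kwargs(dict):
    """Sketch: a dict whose bytes keys are decoded to text on creation."""

    def __init__(self, dct):
        for key, value in dct.items():
            if isinstance(key, bytes):
                key = key.decode('utf-8')
            self[key] = value
```

Python 3 requires keyword argument names to be text, so normalising the keys once at construction lets the dict be passed with `**` without further checks.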
The failed job result is likely to be "interrupted system call", and we
don't want that to overwrite the SIGALRM handler's "the task timed out",
so just discard it.
The controller must know the ID of the forked child in order to
propagate dependencies to it, so forking+starting the module run cannot
happen entirely on the target, without some additional mechanism to
wait-and-repropagate the deps as they arrive on the target.
Rework things so that init_child() also handles starting the fork parent,
and returns it along with the context's home directory in a single round
trip.
Now that the master knows the identity of the fork parent, it can
directly create fork children and call run_module_async() in them. This
necessitates 2 roundtrips to start an asynchronous task.
This whole thing sucks and needs to be simplified entirely, but for now
things almost work, so keeping it.
connection.py:
* Expect ContextService to return the entire dict return value of
init_child(). Store the fork_context from the return value.
planner.py:
* Rework Planner to store the invocation as an instance attribute, to
simplify method calls.
* Add Planner.get_push_files() and Planner.get_module_deps().
* Add _propagate_deps() which takes a Planner and ensures the deps it
describes are sent to a (non forked or forked) context.
* Move async task logic out of target.py and into invoke() /
_invoke_*().
process.py:
* Services no longer need references to each other. planner.py handles
sending module deps with one extra RPC.
services.py:
* Return "init_child_result" key instead of simple "home_dir" key.
* Get rid of dep propagation from ModuleDepService, it lives in
planner.py now.
target.py:
* Get rid of async task start logic, lives in planner.py now.
planner.py:
* Rather than grant FileService access to a file for children, use
PushFileService to trigger deduplicating send of the file through
the hierarchy immediately.
* Send the complete list of Ansible module imports to the target so
runner.py knows which files and scripts must be loaded via
PushFileService prior to detaching.
runner.py:
* Teach NewStyleRunner to use the full module map to block until
everything is loaded prior to detach().
target.py:
* Delete old _get_file(), replace get_file() with get_small_file()
which uses PushFileService instead.
Closes #186
For lack of a better place to keep the client function, make it a
classmethod of FileService itself for now.
The old _get_file() is removed in a subsequent commit.
It's not simple without executing a module to determine whether the
above refers to a submodule of a package, or an object defined within a
module.
Therefore detect when resolution of a child module yields the same path
as the parent, and ignore the result.
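The check described above can be sketched as a simple filter. The function name, the tuple layout and the example paths are hypothetical.

```python
def drop_self_resolutions(parent_path, children):
    """Return only the (name, path) pairs that did not resolve to the parent."""
    return [(name, path) for name, path in children if path != parent_path]


# Example with made-up paths: the first entry resolved to the parent itself.
parent = '/lib/facts/hardware/__init__.py'
found = [
    ('facts.hardware.Hardware', parent),
    ('facts.hardware.linux', '/lib/facts/hardware/linux.py'),
]
print(drop_self_resolutions(parent, found))
```

A name that resolves to the parent's own file is treated as an object defined in the parent, not as a submodule, so it is dropped.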
For "ansible -m setup" over a 25ms link, avoids 65 roundtrips and
reduces runtime from 5.7s to 4.1s (-28%).
For "ansible -m setup" over a simulated 250 ms link, reduces runtime
from 0m27.015s to 0m8.254s (-69%).