This refactors connection.py to pull the two huge dict-building
functions out into new transport_transport_config.PlayContextSpec and
MitogenViaSpec classes, leaving a lot more room to breath in both files
to figure out exactly how connection configuration should work.
The changes made in 1f21a30 / 3d58832 are updated or completely removed,
the original change was misguided, in a bid to fix connection delegation
taking variables from the wrong place when delegate_to was active.
The Python path no longer defaults to '/usr/bin/python', this does not
appear to be Ansible's normal behaviour. This has changed several times,
so it may have to change again, and it may cause breakage after release.
Connection delegation respects the c.DEFAULT_REMOTE_USER whereas the
previous version simply tried to fetch whatever was in the
'ansible_user' hostvar. Many more connection delegation variables closer
match vanilla's handling, but this still requires more work. Some of the
variables need access to the command line, and upstream are in the
process of changing all that stuff around.
This replaces the previous method for capping poorly Popen()
performance, instead entirely monkey-patching the problem function
rather than simply working around it.
Ideally it would only be called once, and in future maybe it can, but
right now we need to cope with these cases:
* Downstream parent notifies us of disconnection (DEL_ROUTE)
* We notify ourself of disconnection
* We notify ourself and so does downstream parent
It's case 3 that causes the error.
Simply listen to RouteMonitor's Context "disconnect" and forget
contexts according to RouteMonitor's rules, rather than duplicate them
(and screw it up).
Update _via_by_context earlier; fixes:
Traceback (most recent call last):
File "/Users/dmw/src/mitogen/mitogen/service.py", line 519, in _on_service_call
return invoker.invoke(method_name, kwargs, msg)
File "/Users/dmw/src/mitogen/mitogen/service.py", line 253, in invoke
response = self._invoke(method_name, kwargs, msg)
File "/Users/dmw/src/mitogen/mitogen/service.py", line 239, in _invoke
ret = method(**kwargs)
File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 454, in get
reraise(*result)
File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 412, in _wait_or_start
response = self._connect(key, spec, via=via)
File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 363, in _connect
self._update_lru(context, spec, via)
File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 266, in _update_lru
self._update_lru_unlocked(new_context, spec, via)
File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 253, in _update_lru_unlocked
if self._refs_by_context[context] == 0:
KeyError: Context(1008, u'ssh.localhost.sudo.mitogen__user3')
Earlier commit moved Stream.routes attribute into a private map
belonging to RouteMonitor, to make upgrades smoother. This adds a new
accessor method to RouteMonitor.
(Pull #377)
Changes:
- additional_parameters -> extra_args
- Merge with kubectl changes from dmw branch
- Update docs
- Remove unused username class member
- Avoid mutable kubectl_args class member
- Use six.iteritems
This change allows the kubectl connector to support the same options as
Ansible's original connector.
The playbook sample comes with an example of a pod containing two containers
and checking that moving from one container to another, the version of Python
changes as expected.
Reverts 49736b3a, large file copies can't avoid the RTT.
The parent stack must be blocked while FileService progresses, as unlike
the small file path, it does not make a snapshot of the (possibly
temporary) file passed by the action plug-in. So we need to keep that
file alive while the service runs.
Add a new integration test and a new soak test to cover both.
When Ansible abnormally shuts down, the broker begins
force-disconnecting every context, including those for which connection
is currently in-progress.
When that happens, .call(init_child) throws ChannelError, and that needs
returned back to the worker, assuming the worker still even exists.
This solution is incomplete: with sick nodes, it's also possible the
worker died naturally, and so the worker should perhaps respond by
retrying the connection.
Previously, the unhandled ChannelError would spam the console when e.g.
fork() began returning EAGAIN.
The connection multiplexer can expect to not be scheduled at least until
every $forks worker processes has attempted a connection, so the backlog
must be able to hold every worker.
* Always enable the faulthandler module in the top-level process if it
is available.
* Make MITOGEN_DUMP_THREAD_STACKS interval configurable, to better
handle larger runs.
* Add docs subsection on diagnosing hangs.
Conflicts:
ansible_mitogen/process.py
Calls to connect.put_file() where the file is sufficiently small enough
to fit in a single RPC proceed without waiting for an RPC response. If
the write fails the target context will log an exception, and any
subsequent step depending on the written file will fail.
I verified every built-in action plugin for file transfer calls, and
they all depend on the transferred file in the following step, so this
should be safe.
Reduces template/copy actions to 2-RTT, loop-20-templates.yml runtime
reduced from 30 seconds to 10 seconds over a 250ms link compared to
v0.2.2, and from 123 seconds compared to vanilla with pipelining
enabled.
PlayContext.delegate_to is the unexpanded template, Ansible doesn't keep
a copy of it around anywhere convenient. We either need to re-expand it
or take the expanded version that was stored on the Task, which is what
is done here.
This needs more work -- pretty certain that python_path and suchlike are
coming from the wrong place. Possibly we need another config_from_..()
specialized for delegate_to.
This change is relatively incomplete -- ideally we could snapshot
os.environ and /etc/environment at startup and respect key deletions
too, but that's a lot more work. Wait for a bug report instead.
Closes#338.
Concurrent calls to ModuleDepService would cause significant wasted
work, as potentially all pool threads run the same uncached module dep
scan.
Without:
3243581 function calls (3233009 primitive calls) in 4770.672 seconds
ncalls tottime percall cumtime percall filename:lineno(function)
2523 0.011 0.000 39.849 0.016 services.py:409(scan)
With:
2801561 function calls (2800042 primitive calls) in 5166.843 seconds
ncalls tottime percall cumtime percall filename:lineno(function)
2506 0.009 0.000 1.967 0.001 services.py:411(scan)
Ignore timing variance due to problems with the test job.