Merge remote-tracking branch 'origin/dmw'

* origin/dmw:
  github: tweak issue template.
  docs: include strace wrapper trick.
  docs: docs and docstrings.
issue510
David Wilson 6 years ago
commit c0d1ac291a

@ -5,13 +5,17 @@ Feel free to write an issue in your preferred format, however if in doubt, use
the following checklist as a guide for what to include.
* Have you tried the latest master version from Git?
* Do you have some idea of what the underlying problem may be?
https://mitogen.rtfd.io/en/stable/ansible.html#common-problems has
instructions to help figure out the likely cause and how to gather relevant
logs.
* Mention your host and target OS and versions
* Mention your host and target Python versions
* If reporting a performance issue, mention the number of targets and a rough
description of your workload (lots of copies, lots of tiny file edits, etc.)
* If reporting a crash or hang in Ansible, please rerun with -vvvv and include
the last 200 lines of output, along with a full copy of any traceback or
error text in the log. Beware "-vvvv" may include secret data! Edit as
necessary before posting.
* If reporting a crash or hang in Ansible, please rerun with -vvv and include
200 lines of output around the point of the error, along with a full copy of
any traceback or error text in the log. Beware "-vvv" may include secret
data! Edit as necessary before posting.
* If reporting any kind of problem with Ansible, please include the Ansible
version along with output of "ansible-config dump --only-changed".

@ -487,8 +487,9 @@ class Connection(ansible.plugins.connection.ConnectionBase):
the delegated-to machine.
When running with delegate_to, Ansible tasks have variables associated
with the original machine, therefore it does not make sense to extract
connection-related configuration information from them.
with the original machine, not the delegated-to machine, therefore it
does not make sense to extract connection-related configuration for the
delegated-to machine from them.
"""
if self._task_vars:
if self.delegate_to_hostname is None:
@ -515,7 +516,7 @@ class Connection(ansible.plugins.connection.ConnectionBase):
def _spec_from_via(self, via_spec):
"""
Produce a dict connection specifiction given a string `via_spec`, of
the form `[become_user@]inventory_hostname`.
the form `[[become_method:]become_user@]inventory_hostname`.
"""
become_user, _, inventory_name = via_spec.rpartition('@')
become_method, _, become_user = become_user.rpartition(':')
@ -540,6 +541,33 @@ class Connection(ansible.plugins.connection.ConnectionBase):
via_cycle_msg = 'mitogen_via=%s of %s creates a cycle (%s)'
def _stack_from_spec(self, spec, stack=(), seen_names=()):
"""
Return a tuple of ContextService parameter dictionaries corresponding
to the connection described by `spec`, and any connection referenced by
its `mitogen_via` or `become` fields. Each element is a dict of the
form::
{
# Optional. If present and `True`, this hop is elegible for
# interpreter recycling.
"enable_lru": True,
# mitogen.master.Router method name.
"method": "ssh",
# mitogen.master.Router method kwargs.
"kwargs": {
"hostname": "..."
}
}
:param ansible_mitogen.transport_config.Spec spec:
Connection specification.
:param tuple stack:
Stack elements from parent call (used for recursion).
:param tuple seen_names:
Inventory hostnames from parent call (cycle detection).
:returns:
Tuple `(stack, seen_names)`.
"""
if spec.inventory_name() in seen_names:
raise ansible.errors.AnsibleConnectionFailure(
self.via_cycle_msg % (
@ -599,8 +627,12 @@ class Connection(ansible.plugins.connection.ConnectionBase):
def _connect_stack(self, stack):
"""
Pass `stack` to ContextService, requesting a copy of the context object
representing the target. If no connection exists yet, ContextService
will establish it before returning it or throwing an error.
representing the last tuple element. If no connection exists yet,
ContextService will recursively establish it before returning it or
throwing an error.
See :meth:`ansible_mitogen.services.ContextService.get` docstring for
description of the returned dictionary.
"""
try:
dct = self.parent.call_service(
@ -628,6 +660,11 @@ class Connection(ansible.plugins.connection.ConnectionBase):
self.init_child_result = dct['init_child_result']
def get_good_temp_dir(self):
"""
Return the 'good temporary directory' as discovered by
:func:`ansible_mitogen.target.init_child` immediately after
ContextService constructed the target context.
"""
self._connect()
return self.init_child_result['good_temp_dir']
@ -716,8 +753,8 @@ class Connection(ansible.plugins.connection.ConnectionBase):
# #420: Ansible executes "meta" actions in the top-level process,
# meaning "reset_connection" will cause :class:`mitogen.core.Latch` FDs
# to be cached and subsequently erroneously shared by children on
# subsequent task forks. To handle that, call on_fork() to ensure any
# to be cached and erroneously shared by children on subsequent
# WorkerProcess forks. To handle that, call on_fork() to ensure any
# shared state is discarded.
mitogen.fork.on_fork()
@ -736,7 +773,7 @@ class Connection(ansible.plugins.connection.ConnectionBase):
action, we cannot capture task variables via :meth:`on_action_run`.
Instead walk the parent frames searching for the `all_vars` local from
StrategyBase._execute_meta(). If this fails, just leave task_vars
unset, likely causing the wrong configuration to be created.
unset, likely causing a subtly wrong configuration to be selected.
"""
frame = sys._getframe()
while frame and not self._task_vars:

@ -935,13 +935,20 @@ reporting it:
example, Mitogen may pick the wrong username or SSH parameters.
To detect this, use the special ``mitogen_get_stack`` action described
below to verify all the configuration variables Mitogen has chosen for the
connection make sense.
below to verify the settings Mitogen has chosen for the connection make
sense.
**Process Environment Differences**
Mitogen's process model differs significantly to Ansible's in certain
places. In the past, bugs have been reported because Ansible plug-ins
modify an environment variable after Mitogen processes are started
Mitogen's process model differs significantly to Ansible's in many places.
In the past, bugs have been reported because Ansible plug-ins modify an
environment variable after Mitogen processes are started.
If your task's failure may relate to the process environment in some way,
for example, ``SSH_AUTH_SOCK``, ``LC_ALL`` or ``PATH``, then an environment
difference may explain it. Environment differences are always considered
bugs in the extension, and are very easy to repair, so even if you find a
workaround, please report them to avoid someone else encountering the same
problem.
**Variable Expansion Differences**
To avoid many classes of bugs, Mitogen avoids shell wherever possible.
@ -1069,6 +1076,39 @@ For example, this method can be used to ascertain whether SSH attempted agent
authentication, or what private key files it was able to access and which it tried.
Post-authentication Bootstrap Failure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If logging indicates Mitogen was able to authenticate, but some error occurred
after authentication preventing the Python bootstrap from completing, it can be
immensely useful to temporarily replace ``ansible_python_interpreter`` with a
wrapper that runs Python under ``strace``::
$ ssh badbox
badbox$ cat > strace-python.sh
#!/bin/sh
strace -o /tmp/strace-python.$$ -ff -s 100 python "$@"
^D
badbox$ chmod +x strace-python.sh
badbox$ logout
$ ansible-playbook site.yml \
-e ansible_python_interpreter=./strace-python.sh \
-l badbox
This will produce a potentially large number of log files under ``/tmp/``. The
lowest-numbered traced PID is generally the main Python interpreter. The most
intricate bootstrap steps happen there, any error should be visible near the
end of the trace.
It is also possible the first stage bootstrap failed. That is usually the next
lowest-numbered PID and tends to be the smallest file. Even if you can't
ascertain the problem with your configuration from these logs, including them
in a bug report can save days of detective effort.
.. _diagnosing-hangs:
Diagnosing Hangs

Loading…
Cancel
Save