3610 Commits (b91407a7792e1e827082cbfda0076e5ef45c99f9)
 

Author SHA1 Message Date
David Wilson 75b195ba4b core: race during Receiver construction.
It's possible for a message to arrive after .add_handler() but before
Latch construction.

This is papering over a bigger problem with service pool instantiation.

https://travis-ci.org/dw/mitogen/jobs/390409832#L2901

    TASK [Spin up a few interpreters] **********************************************
    changed: [target] => (item=1)
    ERROR! [pid 5355] 14:47:50.224945 E mitogen.ctx.ssh.localhost:2201.sudo.mitogen__user2: mitogen: Router(Broker(0x7f1e93911450))._invoke(Message(19100, 19095, 19095, 110, 1005, '\x80\x02U\x1fmitogen.service.PushFileServiceq\x01U\x11store_and_f'..8955)): <bound method Receiver._on_receive of Receiver(Router(Broker(0x7f1e93911450)), 110)> crashed
    Traceback (most recent call last):
      File "<stdin>", line 1471, in _invoke
      File "<stdin>", line 491, in _on_receive
    AttributeError: 'Receiver' object has no attribute '_latch'
6 years ago
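A minimal sketch of the ordering problem described above, not mitogen's actual code: `FakeRouter`, `RacyReceiver` and `SafeReceiver` are hypothetical stand-ins, and a `queue.Queue` is assumed in place of the real Latch. The fake router delivers a message as soon as the handler is registered, which mimics a message arriving between `.add_handler()` and Latch construction.

```python
import queue


class FakeRouter:
    """Hypothetical router: delivers a message the moment a handler is added."""

    def __init__(self):
        self.handlers = {}

    def add_handler(self, fn, handle):
        self.handlers[handle] = fn
        # Simulate a message arriving immediately after registration.
        fn("early message")


class RacyReceiver:
    """Registers the handler before the latch exists (the failing order)."""

    def __init__(self, router, handle):
        router.add_handler(self._on_receive, handle)
        self._latch = queue.Queue()

    def _on_receive(self, msg):
        self._latch.put(msg)


class SafeReceiver:
    """Creates the latch first, so an early message always has somewhere to go."""

    def __init__(self, router, handle):
        self._latch = queue.Queue()
        router.add_handler(self._on_receive, handle)

    def _on_receive(self, msg):
        self._latch.put(msg)
```

With the racy order the early message raises AttributeError, as in the traceback above; with the safe order it is queued and can be read later.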
dw 27ab051289
Merge pull request #282 from dw/issue278
Issue278
6 years ago
David Wilson 6d14652077 issue #278: tests: fix fakessh.
See source comment. This behaviour always existed, but it now seems to
be triggered since we started draining the master side input buffer,
which somehow was prolonging the life of the PTY.
6 years ago
David Wilson 0e958ea177 issue #278: tty logger Side constructed with incorrect Stream
Harmless, but produced the wrong log message prefix.
6 years ago
David Wilson 04b65020ac issue #278: ansible: support mitogen_ssh_debug_level variable. 6 years ago
David Wilson b58603c7a4 issue #278: ssh: support ssh_debug_level option and log TTY output.
Now debug logs may be captured all the way through the connection.
6 years ago
dw 29262a6000
Merge pull request #281 from dw/issue280
Issue280
6 years ago
David Wilson 888829544a issue #280: move find_module() log output to IOLOG
It just generates far too much spam, and its final decision is obvious
since a followup load_module() will exist for positive matches.
6 years ago
David Wilson 7853b74e7f issue #280: put 'dnf' on the always fork list 6 years ago
dw c846c3be2d
Merge pull request #269 from moreati/retox-the-freak-in-me
Fix Docker image construction and Tox test invocation
6 years ago
dw bf478b451b
Merge pull request #276 from dw/issue271
ssh: Only match "permission denied" at start of line; closes #271.
6 years ago
David Wilson e3482bdd8f ssh: Only match "permission denied" at start of line; closes #271. 6 years ago
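A rough illustration of the idea only; the real matching code is not shown in this log. `is_permission_denied` is a hypothetical helper that assumes the SSH output has already been split into lines.

```python
def is_permission_denied(line):
    """Return True only when the line itself begins with the phrase.

    A debug line that merely mentions "permission denied" somewhere in
    the middle is not treated as an authentication failure.
    """
    return line.lower().startswith("permission denied")
```

The point of anchoring at the start of the line is that verbose SSH debug output can contain the same phrase without the login having failed.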
dw 58ff550b37
Merge pull request #274 from dw/issue270
Get integration tests running under 2.6.
6 years ago
David Wilson 2fbe1f1b54 Get integration tests running under 2.6.
Closes #270
Closes #273
6 years ago
Alex Willmer 21199f290e Fix bash loop when adding users to docker images 6 years ago
Alex Willmer c4899a0ce4 Fix invocation of test runner by tox
I think tox calls it in a way that #! is ignored
6 years ago
dw 876a82f00d
Merge pull request #263 from dw/dmw
Stray mux process on CTRL+C, EINTR on async task timeout, temp dir cleanup race
6 years ago
David Wilson 9617f4d7bf Revert "try to catch EINTR on travis"
This reverts commit 42797d5cff.
6 years ago
David Wilson 08538d327b ansible: don't write failed job result after async timeout.
The failed job result is likely to be "interrupted system call", and we
don't want that to overwrite the SIGALRM handler's "the task timed out",
so just discard it.
6 years ago
David Wilson 205052ed90 service: fix SerializedInvoker CallError handling.
This cut-and-paste code needs refactoring. Ensure the caller receives a copy of the
exception.
6 years ago
David Wilson 45b748833d ansible: don't randomly fail due to temp directory cleanup.
Happens about 1 time in 3 when async task times out.
6 years ago
David Wilson fbb67e837e tests: import nice_stdout plugin 6 years ago
David Wilson 42797d5cff try to catch EINTR on travis 6 years ago
David Wilson ffc7306cf8 tests: better runner_two_simultaneous_jobs.yml. 6 years ago
David Wilson 1d96d80e8d tests: osx_setup.yml missing line 6 years ago
David Wilson d5c4333b9e debug: functions for triggering EINTR 6 years ago
David Wilson 6377f2d69c issue #257: split pool shutdown and join. 6 years ago
David Wilson d33ef1866e ansible: wrap socket calls in io_op()
Breaks under signal stress test.
6 years ago
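A sketch of the general technique of retrying a call interrupted by a signal. `retry_on_eintr` is a hypothetical name and is not the signature of mitogen's `io_op()`. Note that Python 3.5 and later already retry most system calls on EINTR (PEP 475), so this mainly matters on older interpreters.

```python
import errno


def retry_on_eintr(func, *args):
    """Call func(*args), retrying for as long as it fails with EINTR."""
    while True:
        try:
            return func(*args)
        except OSError as e:
            # Any error other than "interrupted system call" is real.
            if e.errno != errno.EINTR:
                raise
```

A socket call would then be wrapped as, for example, `retry_on_eintr(sock.recv, 4096)`.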
dw 3a2f422725
Merge pull request #262 from dw/dmw
Fully functional async tasks, minify refactor, import trimming, module/script preloading/deduplication, service framework enhancements, fix logging deadlock
6 years ago
David Wilson 3994f1b30a ansible: implement async job time limit. 6 years ago
David Wilson d2accbce53 docs: remove more Ansible limitations 6 years ago
David Wilson e35694acd5 ansible: flake8 fixes. 6 years ago
David Wilson df8fe59eda tests: replace hard-coded sleep with a polling loop 6 years ago
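For illustration, a polling loop of the kind that typically replaces a fixed sleep in tests. `wait_for` and its parameters are hypothetical and are not taken from the mitogen test suite.

```python
import time


def wait_for(predicate, timeout=5.0, interval=0.05):
    """Poll predicate() until it returns true or the timeout expires.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Unlike a hard-coded sleep, this returns as soon as the condition holds and only waits the full timeout when something is actually wrong.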
David Wilson 4bd992e35a issue #186: move module code fetch back to overridden method 6 years ago
David Wilson 3909cb11f6 service: recreate the pool after fork. 6 years ago
David Wilson ae20a689ef issue #186: finally enable detach. 6 years ago
David Wilson 05e0b134f9 service: simplify CALL_SERVICE stub and fix race.
If PushService.store_and_forward() loses the race to arrive at a brand
new context first, and the context's main thread is already executing a
CALL_FUNCTION that is blocked on the result of PushService, deadlock
could occur in the old scheme.

Instead (for now) simply spam a thread for each incoming message, and
use the get_or_create_pool() lock to ensure things work out in the end.
This could potentially generate a huge number of threads given the wrong
app, but we'll fix that problem when it appears.
6 years ago
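A simplified sketch of a lock-guarded get-or-create, assuming a single module-level pool. The function name echoes the commit message, but the body here is illustrative and is not mitogen's implementation; `factory` is a hypothetical callable that builds the pool.

```python
import threading

_pool_lock = threading.Lock()
_pool = None


def get_or_create_pool(factory):
    """Return the shared pool, creating it at most once.

    Every per-message thread can call this safely: the lock guarantees
    that only the first caller runs factory(), and all later callers
    receive the same object.
    """
    global _pool
    with _pool_lock:
        if _pool is None:
            _pool = factory()
        return _pool
```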
David Wilson 92ecf29559 core: check in the hacks that let Ansible work just now. 6 years ago
David Wilson f6d9b074ff master: reduce module verbosity somewhat. 6 years ago
David Wilson caffaa79f7 issue #186: rework async/forked tasks again.
The controller must know the ID of the forked child in order to
propagate dependencies to it, so forking+starting the module run cannot
happen entirely on the target, without some additional mechanism to
wait-and-repropagate the deps as they arrive on the target.

Rework things so that init_child() also handles starting the fork parent,
and returns it along with the context's home directory in a single round
trip.

Now master knows the identity of the fork parent, it can directly create
fork children and call run_module_async() in them. This necessitates 2
roundtrips to start an asynchronous task.

This whole thing sucks and entirely needs simplified, but for now things
almost work, so keeping it.

connection.py:
  * Expect ContextService to return the entire dict return value of
    init_child(). Store the fork_context from the return value.

planner.py:
  * Rework Planner to store the invocation as an instance attribute, to
    simplify method calls.
  * Add Planner.get_push_files() and Planner.get_module_deps().
  * Add _propagate_deps() which takes a Planner and ensures the deps it
    describes are sent to a (non forked or forked) context.
  * Move async task logic out of target.py and into invoke() /
    _invoke_*().

process.py:
  * Services no longer need references to each other. planner.py handles
    sending module deps with one extra RPC.

services.py:
  * Return "init_child_result" key instead of simple "home_dir" key.
  * Get rid of dep propagation from ModuleDepService, it lives in
    planner.py now.

target.py:
  * Get rid of async task start logic, lives in planner.py now.
6 years ago
David Wilson 526590027a issue #186: PushFileService improvements.
New method to send all modules and files in one roundtrip.
6 years ago
David Wilson 9e78c20eba core/parent: add Context.call_no_reply(). 6 years ago
David Wilson b3a5fa70b0 core: copy debug setting to child's Router too.
core.Router doesn't pay attention to this attribute, but after
upgrade_router() has been called, the new parent.Router will.
6 years ago
David Wilson 64b60be50c tests: split runner_new_process out of runner_one_job 6 years ago
David Wilson 23b2a545cf fork: avoid another logging deadlock at startup.
The very first task /must/ be clearing out logging locks, since
_at_fork() functions call LOG.debug() via Side.close(). Additionally,
the root logger is not included in loggerDict, so we must specify it
explicitly.
6 years ago
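A sketch of the lock reset described above, using only the standard `logging` module. `reset_logging_locks` is a hypothetical name; it would be called first thing in the forked child, and it does not touch the module-level lock inside `logging` itself.

```python
import logging


def reset_logging_locks():
    """Give every logging handler a fresh lock; return how many were reset.

    A lock held by another thread at fork time would otherwise stay
    locked forever in the child.
    """
    # The root logger is not listed in loggerDict, so add it explicitly.
    loggers = [logging.getLogger()]
    for obj in logging.Logger.manager.loggerDict.values():
        # loggerDict may also hold PlaceHolder objects with no handlers.
        if isinstance(obj, logging.Logger):
            loggers.append(obj)

    count = 0
    for logger in loggers:
        for handler in logger.handlers:
            handler.createLock()  # replaces handler.lock with a new lock
            count += 1
    return count
```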
David Wilson 785df88fa4 issue #186: core: remove long-forgotten hack.
This is likely to break something, it was definitely needed at some
point, but I never put much effort into figuring out why. Meanwhile,
Python appears to make find_module('ansible.module_utils.facts.')
requests in some circumstances, which causes us to indicate the module
exists while this hack exists.

So remove it, and let's see what breaks.
6 years ago
David Wilson 569c12a2d6 ansible: use PushFileService for module deps.
planner.py:
  * Rather than grant FileService access to a file for children, use
    PushFileService to trigger deduplicating send of the file through
    the hierarchy immediately.
  * Send the complete list of Ansible module imports to the target so
    runner.py knows which files and scripts must be loaded via
    PushFileService prior to detaching.

runner.py:
  * Teach NewStyleRunner to use the full module map to block until
    everything is loaded prior to detach().

target.py:
  * Delete old _get_file(), replace get_file() with get_small_file()
    which uses PushFileService instead.

Closes #186
6 years ago
David Wilson 7d4f4b205f ansible: update module preload list. 6 years ago
David Wilson 76beea6554 issue #186: move target._get_file into mitogen.service
For lack of a better place to keep the client function, make it a
classmethod of FileService itself for now.

The old _get_file() is removed in a subsequent commit.
6 years ago
David Wilson a3b747af1b issue #186: add PushFileService
This is like FileService but blocks until the file is pushed by a parent
context, with deduplicating behaviour at each level in the hierarchy. It
does not stream large files, so it is only suitable for small files like
Python modules.

Additionally add SerializedInvoker for use with PushFileService, which
ensures all method calls to a single service occur in sequence.
6 years ago
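A minimal sketch of serialized invocation, assuming one worker thread per service. `SerialInvoker` is a hypothetical class and is not mitogen's `SerializedInvoker`; it only illustrates how calls can be forced to run in sequence while exceptions still reach the caller.

```python
import queue
import threading


class SerialInvoker:
    """Run submitted calls one at a time, in submission order."""

    def __init__(self):
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel posted by close()
                break
            fn, args, result = item
            try:
                result.put((True, fn(*args)))
            except Exception as e:
                # Hand the exception back rather than killing the worker.
                result.put((False, e))

    def invoke(self, fn, *args):
        """Queue fn(*args), block for its result, re-raise its exception."""
        result = queue.Queue(1)
        self._queue.put((fn, args, result))
        ok, value = result.get()
        if not ok:
            raise value
        return value

    def close(self):
        self._queue.put(None)
        self._thread.join()
```

Because a single thread drains the queue, two calls to the same service can never overlap, whichever threads submitted them.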