Commit Graph

212 Commits (master)

Author SHA1 Message Date
David Wilson 85cfa3b0f5 master: .encode() needed for Py3. 5 years ago
David Wilson 1d509d03ff issue #508: master: minify_safe_re must be bytes for Py3. 5 years ago
David Wilson 0e193c223c issue #508: master: minify all Mitogen/ansible_mitogen sources.
Minify-safe files are marked with a magical "# !mitogen: minify_safe"
comment anywhere in the file, which activates the minifier. The result
is naturally cached by ModuleResponder, therefore lru_cache is gone too.

Given:

    import os, mitogen
    @mitogen.main()
    def main(router):
        c = router.ssh(hostname='k3')
        c.call(os.getpid)
        router.sudo(via=c)

SSH footprint drops from 56.2 KiB to 42.75 KiB (-23.9%)
Ansible "shell: hostname" drops 149.26 KiB to 117.42 KiB (-21.3%)
5 years ago
David Wilson 9adc38d8ec parent: pre-cache bootstrap if possible.
When the interpreter is modern enough, use zlib.compressobj() to
pre-compress the unchanging parts of the bootstrap once, then use
compressobj.copy() to append just the context's config during stream
construction.

Before: 100 loops, best of 3: 5.81 msec per loop
After: 10000 loops, best of 3: 35.9 usec per loop

With 100 targets this is enough to knock 6 seconds off startup, at 500
targets it becomes half a minute.

Test 'program':
        python -m timeit -s '
                import mitogen.parent as p;
                import mitogen.master as m;
                r=m.Router();
                s=p.Stream(r, 0, max_message_size=1);
                r.broker.shutdown()'\
                \
                's.get_preamble()'
5 years ago
David Wilson cafdfdb8e2 master: set Router.profiling if MITOGEN_PROFILING variable present. 5 years ago
David Wilson b8a0e0f929 master: cache sent/forwarded module names
On 32x Docker run of issue_140__thread_pileup.yml

Before:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   384807    3.595    0.000   12.207    0.000 /home/dmw/src/mitogen/mitogen/master.py:896(_forward_one_module)
  1218352    4.867    0.000    7.302    0.000 /home/dmw/src/mitogen/mitogen/master.py:853(_send_module_and_related)

After:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   384807    3.839    0.000    6.543    0.000 /home/dmw/src/mitogen/mitogen/master.py:902(_forward_one_module)
  1218352    0.723    0.000    0.898    0.000 /home/dmw/src/mitogen/mitogen/master.py:856(_send_module_and_related)
5 years ago
David Wilson dc4b27c6bf master: keep is_stdlib_path() result as negative cache entry
On 32x Docker run of issue_140__thread_pileup.yml

Before:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1218500    1.716    0.000    7.325    0.000 /home/dmw/src/mitogen/mitogen/master.py:118(is_stdlib_path)

After:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      166    0.000    0.000    0.001    0.000 /home/dmw/src/mitogen/mitogen/master.py:123(is_stdlib_path)
5 years ago
David Wilson 3435f24e8d issue #479: ModuleFinder special case for __main__ on Py3.x. 5 years ago
David Wilson 4ca3051e39 issue #477: document master.Router.max_message_size. 5 years ago
David Wilson 19eafc5755 issue #477: set().union(a, b, ..) unsupported on Py2.4. 5 years ago
David Wilson e460d648d5 issue #477: Logger.log(extra=) unsupported on Py2.4. 5 years ago
David Wilson 112caa94f9 issue #477: Py2.4 dep scanner bytecode difference 5 years ago
David Wilson 7ecd5d8ba3 issue #477: ModuleFinder test fixes. 5 years ago
David Wilson bc434a4f99 issue #477: backport mitogen.master to Python 2.4. 5 years ago
David Wilson 97a96f5dd8 issue #477: rename and add tests for polyfill functions. 5 years ago
David Wilson 5135ff9068 issue #477: master: ability to override ModuleResponder output.
This is needed to cope Ansible 2.3 doing weird stuff as usual. It serves
up __init__.py for ansible and ansible.module_utils as hard-coded
namespace packages, the real ansible/__init__.py on disk is not 2.4
compatible.
5 years ago
David Wilson 96c35ccab1 issue #61: unused variable (reported by LGTM) 5 years ago
David Wilson 06415bb720 issue #310: fix test failures, teach old import method new tricks
- don't try anything unless something really lives in sys.modules by
  that name
- non-ASCII files are possible
- the unimportable thing might be an extension module, we don't want
  that
5 years ago
David Wilson 6af1a64cce master: handle crazy non-modules in sys.modules again; closes #310. 5 years ago
David Wilson fd90834944 issue #408: fix test fallout. 5 years ago
David Wilson 41626b82dd issue #408: 2.4 compat: remove ternary if use in master.py. 5 years ago
David Wilson cce1dbf3b1 tests: quieten a bunch of spam printed during run 5 years ago
David Wilson b6840aab75 issue #459: one line stats output during shutdown
CI logs are too noisy.
5 years ago
David Wilson f2f41809ae issue #459: initial get_stats() implementation 5 years ago
David Wilson d030decf57 issue #444: master: lower log level for soft import error. 6 years ago
David Wilson 8e4c164d93 issue #388: fix Sphinx markup 6 years ago
David Wilson 804bacdadb docs: move most remaining docstrings back into *.py; closes #388
The remaining ones are decorators which don't seem to have an autodoc
equivlent.
6 years ago
David Wilson 0d04e940b7 master: docstring fixes. 6 years ago
David Wilson 9828588e97 master: group is_stdlib_name() with other module functions. 6 years ago
David Wilson bf597d257f master: document LogForwarder. 6 years ago
David Wilson 74cf9c3c96 master: document ThreadWatcher 6 years ago
David Wilson 4146648759 master: log error an refuse __main__ import if no guard detected.
Closes #366.
6 years ago
David Wilson 32751cd356 master: allow batching context switches for forward_modules()
-7 switches per task.
6 years ago
David Wilson 946c4964bb issue #349: master: fix broken importer logging format string 6 years ago
David Wilson 442d88e3d7 docs: many more fixes/merges. 6 years ago
David Wilson 038f8d5dec master: fix another case where built-in module loader throws ImportError
requests/packages.py just imports urllib3 normally, then makes up new
names for it. pkgutil can't cope with that, and returns the loader
(builtin) for the requests package. The built-in loader obviously can't
find_module() for "requests/packages/urllib3/contrib/pyopenssl" because
it doesn't exist on disk.
6 years ago
David Wilson c141dd10ec master: fix resolve_relpath()
looks like this was just as broken on 2.x, and suddenly we're
finding a bunch more legit Django deps. It seems anywhere
absolute_import appeared in 2.x, we skipped some imports.
6 years ago
David Wilson b0404bef40 tests: fix get_module_via_* encoding issues 6 years ago
David Wilson 9903692811 master: update scan_code_imports to cope with wordcode
Constant-sized opcodes were introduced as an optimization in Python 3.6.
See https://bugs.python.org/issue26647
6 years ago
David Wilson 9fb2371d64 importer: reorder/tweak find_module() tests to cope with six.moves
The old hack on the master side we had is broken for some reason on 3.x.
Instead tweak the client to be more selective: if a request is for a
module within a package, the package must be loaded (in sys.modules),
and its __loader__ must be us. Previously if the module didn't exist in
sys.modules, we'd still try to fetch from the master, which doesn't
appear to ever make sense.
6 years ago
David Wilson 410016ff47 Initial Python 3.x port work.
* ansible: use unicode_literals everywhere since it only needs to be
  compatible back to 2.6.
* compat/collections.py: delete this entirely and rip out the parts of
  functools that require it.
* Introduce serializable Kwargs dict subclass that translates keys to
  Unicode on instantiation.
* enable_debug_logging() must set _v/_vv globals.
* cStringIO does not exist in 3.x.
* Treat IOLogger and LogForwarder input as latin-1.
* Avoid ResourceWarnings in first stage by explicitly closing fps.
* Fix preamble_size.py syntax errors.
6 years ago
David Wilson f6d9b074ff master: reduce module verbosity somewhat. 6 years ago
David Wilson b577a11f86 master: fix IdAllocator log messages. 6 years ago
David Wilson f7d2eace08 tests: importer fixes 6 years ago
David Wilson f7b368b1fb master: implement ModuleResponder.forward_module(). 6 years ago
David Wilson 9492dbc4d7 parent: split out minify.py and add stub where master can install it.
This needs a cleaner mechanism to install it, at least this one is
documented.
6 years ago
David Wilson d2714752ee docs: tidy ups 6 years ago
David Wilson ddf28987a0 master: split Select() into new module to reduce wire size.
service.py currently imports master.py(+parent.py) just to get Select().
6 years ago
David Wilson bc7be1879d issue #249: initial poller implementation (BSD only) 6 years ago
David Wilson 356647bef4 issue #132: initial unidirectional routing mode. 6 years ago
David Wilson e615265ebd master: make is_stdlib_path() a free function
For Ansible module loader work.
6 years ago
David Wilson 7f1060f54a issue #186: initial version of subtree detachment. 6 years ago
David Wilson 6fb3a76e68 master: annotate LogForwarder messages.
mitogen/master.py:
    Annotate forwarded log entries with their original source, logger
    name, and message.

ansible:
    mark stderr in red with -vvv

    Tempting to make this appaer 100% of the time, but some crappy
    bashrcs may cause lots of junk to be printed.
6 years ago
David Wilson 7c88e4d013 Move _DEAD into header, autogenerate dead messages
This change blocks off 2 common scenarios where a race condition is
upgraded to a hang, when the library could internally do better.

* Since we don't know whether the receiver of a `reply_to` is expecting
  a raw or pickled message, and since in the case of a raw reply, there
  is no way to signal "dead" to the receiver, override the reply_to
  field to explicitly mark a message as dead using a special handle.

  This replaces the serialized _DEAD sentinel value with a slightly
  neater interface, in the form of the reserved IS_DEAD handle, and
  enables an important subsequent change: when a context cannot route a
  message, it can send a generic 'dead' reply back towards the message
  source, ensuring any sleeping thread is woken with ChannelError.

  The use of this field could potentially be extended later on if
  additional flags are needed, but for now this seems to suffice.

* Teach Router._invoke() to reply with a dead message when it receives a
  message for an invalid local handle.

* Teach Router._async_route() to reply with a dead message when it
  receives an unroutable message.
6 years ago
David Wilson ce6fb05d87 tests: 'fix' responder test.
Needs a complete rewrite, but this will do for now.
6 years ago
David Wilson f9eb66e76e _py_filename() must handle None too. 6 years ago
David Wilson 34a1e3337f Fix get_module_via_sys_modules when running under unit2. 6 years ago
David Wilson 6670cba41c Introduce handler policy functions; closes #138.
Now you can specify a function to add_handler() that authenticates the
message header, with has_parent_authority() and is_immediate_child()
built in.
6 years ago
David Wilson 1ff27ada49 Add maximum message size checks. Closes #151. 6 years ago
David Wilson cba3347556 issue #155: move connection factories to parent.py. 6 years ago
David Wilson 6a74edce6b issue #155: parent: move master.Context into parent.
The Context and Router APIs for constructing children and making
function calls should be available in every parent context, as user code
wants to have access to the same API.
6 years ago
David Wilson 90791768be issue #155: core: slightly rearrange how shutdown works
This eliminates Context.on_disconnect() and instead moves its
functionality to a signal wired up by ExternalContext.main().

It leaves mitogen.master.Context is in a better condition to move into
mitogen.parent where it belongs.
6 years ago
David Wilson 1a8ac9f4d1 issue #155: introduce mitogen.fork / Router.fork() 6 years ago
David Wilson 54ff1c90fa issue #155: add DEL_ROUTE, propagate ADD_ROUTE upwards
* IDs are allocated by the parent responsible for contructing a new
  child, using ALLOCATE_ID to the master as necessary to allocate new ID
  ranges.

* ADD_ROUTE is sent up the tree rather than down. This permits
  construction of the new context to complete concurrent to parent
  contexts learning about its existence. Since all streams are strictly
  ordered, it's not possible for any parent to observe messages from the
  new context prior to arrival of an ADD_ROUTE from the parent notifying
  of its existence.

  If the new context, for example, implements an Ansible async task, its
  parent can start executing that without waiting for any synchronous
  confirmation from any parent or the master.

* Since routes propagate up, it's no longer possible for a plain
  non-parent child to ever receive ADD_ROUTE, so that code can be moved
  out of core.py and into parent.py (-0.2kb compressed).

* Add a .routes attribute to parent.Stream, and respond to disconnection
  signal on the stream by propagating DEL_ROUTE for any ADD_ROUTE ever
  received from that stream.

* Centralize route management in a new parent.RouteMonitor class
6 years ago
David Wilson 469279d9ca master: refactor ThreadWatcher
In order to support a .remove() method, to prevent a minor but annoying
(log visible) memory leak while running the tests.
6 years ago
David Wilson f4ba66e3ee issue #155: allocate child IDs in batches of 1000.
Avoids a roundtrip for every fork.
6 years ago
David Wilson 4f361be7e7 issue #144: teach Select() to close its latch
Causes all threads sleeping on the select to wake.
6 years ago
David Wilson 65df36895e issue #140: prevent duplicate watcher thread creation
When a Broker() is running with install_watcher=True, arrange for only
one watcher thread to exist for each target thread, and to reset the
mapping of watchers to targets after process fork.

This is probably the last change I want to make to the watcher feature
before deciding to rip it out, it may be more trouble than it is worth.
6 years ago
Alex Willmer f999b9adbf Crank zlib.compress() upto 9
SSH command size: 482 bytes (no change)
Preamble size: 8946 bytes (down 33)
6 years ago
David Wilson b243da087c issue #121: fix call_function_test by not raising the dead
A first small mea culpa to all my testing sins of late :)
6 years ago
David Wilson 7aca02c2c7 importer: don't include related modules that are blacklisted
Cuts down on even more spam
6 years ago
David Wilson 37d38f8d9f importer: _is_stdlib_name() needed to handle relative paths
Was causing tons of log spam due to 'skipping absent related name'
6 years ago
David Wilson 6ec349b386 importer: quieten 'cannot find source' warning, closes #110 6 years ago
David Wilson d348a826ff master: tidy up trixxy importer syntax slightly 6 years ago
David Wilson ff617824a1 ansible: fix some flake8 errors
* Unused imports
* Undefined names in helpers.py
* Copyright header wrapping
6 years ago
David Wilson 5036a9448f Add Docker method. 6 years ago
Alex Willmer 2dcab3fe06 master: Reword ModuleFinder.find_realted*() docstrings
Hopefully these are correct, and clearerabout the
behaviour/pre-conditions of these methods.
6 years ago
Alex Willmer c2405369d2 ModuleFinder: Get stdlib paths by sys.prefix, sys.real_prefix etc.
I took the liberty of renaming ModuleFinder.STDLIB_DIRS to
_STDLIB_PATHS, since it felt like an implementation detail that
shouldn't be baked into a public API and stdlib can also be imported
from e.g. a zip file.

I also changed it to a set to handle any duplicates.

Fixes #86
6 years ago
David Wilson afc8697288 core: Ensure add_handler() callbacks really receive _DEAD on shutdown 6 years ago
David Wilson 4d940f08ae importer: drop redundant prefix from pkg_present
For the 52 submodules of ansible.modules.system, this produced a 1602
byte pkg_present list. After stripping it becomes 406 bytes, and the
entire LOAD_MODULE size drops from 1988 bytes to 792 bytes (-60%).

For the 68 submodules of ansible.module_utils, 1902 bytes pkg_present
becomes 474 bytes (-75%), and LOAD_MODULE size drops from 2867 bytes to
1439 bytes (-49%).

In a simple test running Ansible's "setup" module followed by its "apt"
module, wire bytes sent drops from 140,357 to 135,531 (-3.4%).
6 years ago
David Wilson 71a6b9e32a master: Select.all() sugar 6 years ago
David Wilson b543b84e80 importer: share blacklist logic between master/parent 6 years ago
David Wilson 8ec6ae1da0 importer: module whitelist/blacklist support
Hoped to avoid it, but it's the obvious solution for Ansible.
6 years ago
David Wilson 6905dc4e8d master: use queue-like Latch in Select() too. 6 years ago
David Wilson e6a107c5aa core: replace Queue with Latch
On Python 2.x, operations on pthread objects with a timeout set actually
cause internal polling. When polling fails to yield a positive result,
it quickly backs off to a 50ms loop, which results in a huge amount of
latency throughout.

Instead, give up using Queue.Queue.get(timeout=...) and replace it with
the UNIX self-pipe trick. Knocks another 45% off my.yml in the Ansible
examples directory against a local VM.

This has the potential to burn a *lot* of file descriptors, but hell,
it's not the 1940s any more, RAM is all but infinite. I can live with
that.

This gets things down to around 75ms per playbook step, still hunting
for additional sources of latency.
6 years ago
David Wilson beb16d0db6 master: helper functions to force disconnect everything 6 years ago
David Wilson 10230f62dd core: Message.reply() helper function 6 years ago
David Wilson dd088908df select: clean up API. 6 years ago
David Wilson df07e47d24 core: de-munge Message.unpickle() vs. Receiver.get(). 6 years ago
David Wilson 2848d35aff importer: warn on duplicate request, simplify preload logic
* Children should never generate a request for a module that has already
  been sent, however there are a variety of edge cases where, e.g.
  asynchronous calls are made into unloaded modules in a set of
  children, causing those children to request modules (and deps) in a
  different order, which might break deduplication. So add a warning to
  catch when this happens, so we can figure out how to handle it.
  Meanwhile it's only a warning since in the worst case, this just adds
  needless latency.

* Don't bother treating sent packages separately, there doesn't seem to
  be any need for this (after docs are updated to match how preloading
  actually works now).
6 years ago
David Wilson 2e729e54cc importer: fix glaring bug in find_related()
Overwriting 'fullname' variable caused basically nonsensical filtering.
Result was including the module being searched in the list of
dependencies, which was causing ModuleResponder to send it early, which
was causing contexts to start importing the module before preloading of
dependencies had completed.
6 years ago
David Wilson 0dbb1ec028 importer: warn once about missing source and cache negative hit 6 years ago
David Wilson b158259c86 Split up parent and master modules
Knocks 4kb off network footprint for a proxy connection.
6 years ago
David Wilson 0f899f34ff importer: new format to signal ImportError
Previously we'd send just None in GET_MODULE reply, but now since there
is no single request-reply structure, we must include the fullname in
the LOAD_MODULE response and make all of its data fields None to
indicate the same.
6 years ago
David Wilson 07ae14f11d importer: semi-functional preloader
Doesn't yet implement the rules in the docs, but I think the doc rules
could maybe change to match this. Needs lots of cleanup work and
thorough testing, but this is a great start.
6 years ago
David Wilson 4d01dc3ba6 Initial pass at module preloading
* Don't implement the rules for when preloading occurs yet
* Don't attempt to streamily preload modules downstream while this
  context hasn't yet received the final module. There is quite
  significant latency buried in here, but for now it's a lot of work to
  fix.

This works well enough to handle at least the mitogen package, but it's
likely broken for anything bigger.
6 years ago
David Wilson b75e77b410 master: force set_block() in tty_create_child too.
For gevent, just as in 5f7633cd56
6 years ago
David Wilson ed71ae72f8 master: make mitogen minimally functional under gevent
It seems gevent automatically sets blocking behaviour on fds produced by
the socket module, which causes the Python process we fork to fail
horribly. So in the child, always reset the blocking flag.
6 years ago
David Wilson 247f50e08d master: add comment 6 years ago
David Wilson bbcfc585a8 master: add a comment to explain what's going on, and fix log msg.
Closes #70
6 years ago