mitogen/master.py:
Annotate forwarded log entries with their original source, logger
name, and message.
ansible:
mark stderr in red with -vvv
Tempting to make this appear 100% of the time, but some crappy
bashrcs may cause lots of junk to be printed.
This change blocks off 2 common scenarios where a race condition is
upgraded to a hang, when the library could internally do better.
* Since we don't know whether the receiver of a `reply_to` is expecting
a raw or pickled message, and since in the case of a raw reply, there
is no way to signal "dead" to the receiver, override the reply_to
field to explicitly mark a message as dead using a special handle.
This replaces the serialized _DEAD sentinel value with a slightly
neater interface, in the form of the reserved IS_DEAD handle, and
enables an important subsequent change: when a context cannot route a
message, it can send a generic 'dead' reply back towards the message
source, ensuring any sleeping thread is woken with ChannelError.
The use of this field could potentially be extended later on if
additional flags are needed, but for now this seems to suffice.
* Teach Router._invoke() to reply with a dead message when it receives a
message for an invalid local handle.
* Teach Router._async_route() to reply with a dead message when it
receives an unroutable message.
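
A minimal sketch of the behaviour described above, not the library's actual
code. The Message and Router classes, the IS_DEAD value, and the field names
are simplified stand-ins chosen for illustration. It shows a dead reply being
sent back towards the source both for an unroutable destination and for an
invalid local handle.

```python
IS_DEAD = 999  # assumed value for the reserved handle; illustrative only


class Message:
    """Simplified stand-in for a routed message."""

    def __init__(self, dst_id, handle, src_id=0, reply_to=None, data=b""):
        self.dst_id = dst_id
        self.src_id = src_id
        self.handle = handle
        self.reply_to = reply_to
        self.data = data

    @property
    def is_dead(self):
        # A message is "dead" when reply_to carries the reserved handle.
        return self.reply_to == IS_DEAD

    @classmethod
    def dead(cls, dst_id, handle, src_id=0):
        return cls(dst_id=dst_id, handle=handle, src_id=src_id,
                   reply_to=IS_DEAD)


class Router:
    """Hypothetical router: one local context plus a table of remote routes."""

    def __init__(self, context_id):
        self.context_id = context_id
        self.handlers = {}  # local handle -> callable(msg)
        self.routes = {}    # remote context id -> callable(msg)

    def route(self, msg):
        if msg.dst_id == self.context_id:
            self._invoke(msg)
        elif msg.dst_id in self.routes:
            self.routes[msg.dst_id](msg)
        else:
            self._reply_dead(msg)  # unroutable message

    def _invoke(self, msg):
        fn = self.handlers.get(msg.handle)
        if fn is None:
            self._reply_dead(msg)  # invalid local handle
        else:
            fn(msg)

    def _reply_dead(self, msg):
        # Never answer a dead message, or one with nowhere to reply to,
        # otherwise two confused routers could bounce replies forever.
        if msg.is_dead or msg.reply_to is None:
            return
        self.route(Message.dead(dst_id=msg.src_id, handle=msg.reply_to,
                                src_id=self.context_id))
```

A receiver would check is_dead on each message it is handed and raise
ChannelError in the sleeping thread rather than waiting forever.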
The Context and Router APIs for constructing children and making
function calls should be available in every parent context, as user code
wants to have access to the same API.
This eliminates Context.on_disconnect() and instead moves its
functionality to a signal wired up by ExternalContext.main().
It leaves mitogen.master.Context in a better condition to move into
mitogen.parent where it belongs.
* IDs are allocated by the parent responsible for constructing a new
child, using ALLOCATE_ID to the master as necessary to allocate new ID
ranges.
* ADD_ROUTE is sent up the tree rather than down. This permits
construction of the new context to complete concurrently with parent
contexts learning about its existence. Since all streams are strictly
ordered, it's not possible for any parent to observe messages from the
new context prior to arrival of an ADD_ROUTE from the parent notifying
of its existence.
If the new context, for example, implements an Ansible async task, its
parent can start executing that without waiting for any synchronous
confirmation from any parent or the master.
* Since routes propagate up, it's no longer possible for a plain
non-parent child to ever receive ADD_ROUTE, so that code can be moved
out of core.py and into parent.py (-0.2kb compressed).
* Add a .routes attribute to parent.Stream, and respond to disconnection
signal on the stream by propagating DEL_ROUTE for any ADD_ROUTE ever
received from that stream.
* Centralize route management in a new parent.RouteMonitor class.
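
A rough sketch of the upward propagation described above, under the
assumption that each context only needs to remember which directly connected
stream leads to each descendant. The Node class and its method names are
hypothetical and much simpler than parent.Stream or parent.RouteMonitor.

```python
class Node:
    """Hypothetical context in the tree; not the real parent.py classes."""

    def __init__(self, context_id, parent=None):
        self.context_id = context_id
        self.parent = parent
        # target context id -> id of the directly connected child leading to it
        self.routes = {}

    def add_child(self, child_id):
        # The constructing parent records the route immediately, then tells
        # its own parent (ADD_ROUTE travels up, never down).
        child = Node(child_id, parent=self)
        self.routes[child_id] = child_id
        if self.parent is not None:
            self.parent.on_add_route(self.context_id, child_id)
        return child

    def on_add_route(self, via, target_id):
        self.routes[target_id] = via
        if self.parent is not None:
            self.parent.on_add_route(self.context_id, target_id)

    def on_stream_disconnect(self, via):
        # Propagate DEL_ROUTE for every route ever learned from that stream.
        gone = [t for t, v in self.routes.items() if v == via]
        for target_id in gone:
            del self.routes[target_id]
            if self.parent is not None:
                self.parent.on_del_route(self.context_id, target_id)

    def on_del_route(self, via, target_id):
        if self.routes.get(target_id) == via:
            del self.routes[target_id]
            if self.parent is not None:
                self.parent.on_del_route(self.context_id, target_id)
```

Here the plain leaf never receives ADD_ROUTE at all, which is the property
that lets the handling code live outside core.py.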
When a Broker() is running with install_watcher=True, arrange for only
one watcher thread to exist for each target thread, and to reset the
mapping of watchers to targets after process fork.
This is probably the last change I want to make to the watcher feature
before deciding to rip it out; it may be more trouble than it is worth.
I took the liberty of renaming ModuleFinder.STDLIB_DIRS to
_STDLIB_PATHS, since it felt like an implementation detail that
shouldn't be baked into a public API and stdlib can also be imported
from e.g. a zip file.
I also changed it to a set to handle any duplicates.
Fixes #86
For the 52 submodules of ansible.modules.system, this produced a 1602
byte pkg_present list. After stripping it becomes 406 bytes, and the
entire LOAD_MODULE size drops from 1988 bytes to 792 bytes (-60%).
For the 68 submodules of ansible.module_utils, 1902 bytes pkg_present
becomes 474 bytes (-75%), and LOAD_MODULE size drops from 2867 bytes to
1439 bytes (-50%).
In a simple test running Ansible's "setup" module followed by its "apt"
module, wire bytes sent drops from 140,357 to 135,531 (-3.4%).
On Python 2.x, operations on pthread objects with a timeout set actually
cause internal polling. When polling fails to yield a positive result,
it quickly backs off to a 50ms loop, which results in a huge amount of
latency throughout.
Instead, give up using Queue.Queue.get(timeout=...) and replace it with
the UNIX self-pipe trick. Knocks another 45% off my.yml in the Ansible
examples directory against a local VM.
This has the potential to burn a *lot* of file descriptors, but hell,
it's not the 1940s any more, RAM is all but infinite. I can live with
that.
This gets things down to around 75ms per playbook step, still hunting
for additional sources of latency.
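
For illustration, a minimal self-pipe sketch in Python 3. The Latch name and
its methods are made up for this example and it assumes a single consumer
thread; it is not the code from this change. The getter sleeps in select() on
the read end of a pipe, so it is woken by the kernel as soon as a producer
writes a byte, with no polling loop.

```python
import collections
import os
import select
import threading


class Latch:
    """Hypothetical single-consumer queue woken through a pipe."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = collections.deque()
        # Each latch costs two file descriptors.
        self._rfd, self._wfd = os.pipe()

    def put(self, item):
        with self._lock:
            self._items.append(item)
            os.write(self._wfd, b"\x00")  # one wake-up byte per queued item

    def get(self, timeout=None):
        # Sleep in the kernel until the pipe is readable or the timeout
        # expires; timeout=None waits indefinitely.
        readable, _, _ = select.select([self._rfd], [], [], timeout)
        if not readable:
            raise TimeoutError("no item arrived before the timeout")
        with self._lock:
            os.read(self._rfd, 1)  # consume the matching wake-up byte
            return self._items.popleft()

    def close(self):
        os.close(self._rfd)
        os.close(self._wfd)
```

The pair of descriptors per latch is where the file descriptor cost
mentioned above comes from.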
* Children should never generate a request for a module that has already
been sent, however there are a variety of edge cases where, e.g.
asynchronous calls are made into unloaded modules in a set of
children, causing those children to request modules (and deps) in a
different order, which might break deduplication. So add a warning to
catch when this happens, so we can figure out how to handle it.
Meanwhile it's only a warning since in the worst case, this just adds
needless latency.
* Don't bother treating sent packages separately, there doesn't seem to
be any need for this (after docs are updated to match how preloading
actually works now).
Overwriting the 'fullname' variable caused basically nonsensical filtering.
The result was including the module being searched in the list of
dependencies, which was causing ModuleResponder to send it early, which
was causing contexts to start importing the module before preloading of
dependencies had completed.
Previously we'd send just None in GET_MODULE reply, but now since there
is no single request-reply structure, we must include the fullname in
the LOAD_MODULE response and make all of its data fields None to
indicate the same.
Doesn't yet implement the rules in the docs, but I think the doc rules
could maybe change to match this. Needs lots of cleanup work and
thorough testing, but this is a great start.
* Don't implement the rules for when preloading occurs yet
* Don't attempt to stream preloads to downstream contexts while this
context hasn't yet received the final module. There is quite
significant latency buried in here, but for now it's a lot of work to
fix.
This works well enough to handle at least the mitogen package, but it's
likely broken for anything bigger.
It seems gevent automatically sets blocking behaviour on fds produced by
the socket module, which causes the Python process we fork to fail
horribly. So in the child, always reset the blocking flag.
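
A sketch of the idea for Python 3.5+ on UNIX, using os.set_blocking(). The
reset_blocking() and spawn() helpers are hypothetical names for this example,
and the real change may well use fcntl directly.

```python
import os
import socket


def reset_blocking(fd):
    """Force fd back into blocking mode, whatever the parent did to it."""
    os.set_blocking(fd, True)


def spawn(argv):
    """Fork a child whose stdin/stdout are one end of a socketpair."""
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        try:
            parent_sock.close()
            # An event loop in the parent may have made this non-blocking;
            # the program we exec expects ordinary blocking stdio.
            reset_blocking(child_sock.fileno())
            os.dup2(child_sock.fileno(), 0)
            os.dup2(child_sock.fileno(), 1)
            os.execvp(argv[0], argv)
        finally:
            os._exit(1)  # only reached if exec failed
    child_sock.close()
    return pid, parent_sock
```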
Since the above if block ends in a call to os.execv() this block will
only ever run when the if condition was false. Hence putting it in an
else clause is unnecessary.
Can't figure out what it's supposed to do any more, and can't find a
version of Ansible before August 2016 (when I wrote that code) that
seems to need it.
Add some more mitigations to avoid sending dylibs.
Now that there is a separate SHUTDOWN message that relies only on being
received by the broker thread, the main thread can be hung horribly and
the process will still eventually receive a SIGTERM.
This version is based on the modulefinder standard library module,
pruned back just to handle modules we know have been loaded already, and
to scan module-level imports only, rather than imports occurring in
class and function scope (crappy heuristic, but assume they are lazy
imports).
The ast and compiler modules were far too slow, whereas this version can
bytecode compile and scan all the imports for django.db.models (58
modules) in around 200ms, or about 3.4ms per dependency. It's probably
not going to get much faster than that.
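
To illustrate the approach (not the code in this change), here is a small
Python 3 sketch built on the standard dis module. The scan_module_imports()
name is made up for this example. It compiles the source and looks only at
the top-level code object, so imports inside function and class bodies are
skipped, matching the heuristic described above.

```python
import dis


def scan_module_imports(source, filename="<string>"):
    """Return the sorted names imported at module level in source."""
    code = compile(source, filename, "exec")
    # dis.get_instructions() yields instructions for this code object only;
    # function and class bodies are separate nested code objects and are
    # therefore never visited.
    return sorted({
        ins.argval
        for ins in dis.get_instructions(code)
        if ins.opname == "IMPORT_NAME"
    })
```

Nothing is imported or executed during the scan, so it is safe to run on
modules with import-time side effects.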
Now ssh requires a tty allocation. This presents a scalability problem;
a future version could selectively allocate a tty only if typing
passwords is desired.
Sudo's tty handling is now moved into mitogen.master.
* Support passing Context() objects in function calls and return values.
Now the fakessh demo from the documentation index would work
correctly.
* Since slaves can communicate with each other now, they should also use
the same approach to unpickling as the master already used. Collapse
away all the unpickle extension crap and hard-wire just the 3 types
that support unpickling.
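
A hedged sketch of what hard-wiring the permitted types can look like, using
the find_class() hook of the standard pickle.Unpickler. The three types
supported by this change are not named here, so the allowlist below uses a
stand-in standard library type purely for illustration.

```python
import collections
import io
import pickle

# Stand-in allowlist; the real set of three types is an assumption left open.
_ALLOWED = {
    ("collections", "OrderedDict"): collections.OrderedDict,
}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream references. Anything
        # not explicitly listed is refused before it can be instantiated.
        try:
            return _ALLOWED[(module, name)]
        except KeyError:
            raise pickle.UnpicklingError(
                "refusing to unpickle %s.%s" % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but limited to the hard-wired allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain containers, strings, and numbers never pass through find_class(), so
they continue to unpickle as before.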