Might want to de-overload the meaning of whitelist in future, but in
the meantime it works fine for Ansible and I can't think of a
whitelisting use case that would break because of it.
Closes #114.
Amazed this one managed to scrape through for so long. Calling
__import__ from within find_module() was causing the target module, in
this case cookielib, to be loaded *then overwritten* by a subsequent
duplicate load higher in the stack.
The result is that cookielib was loaded twice, and, per usual Python
import semantics, a reference to the partially initialized first
cookielib was installed in sys.modules while its code executed.
At the end of cookielib on 2.x, it imports _LWPCookieJar, which in turn
imports the partially built cookielib from sys.modules, then subclasses
the CookieJar from /that/ module.
Everything is wonderful. Then the call returns back up into the import
mechanism which restarts the entire process -- only this time,
_LWPCookieJar is /not/ reinitialized, so the copy in sys.modules is
still left with types pointing at the old module!
So the duplicate import creates a new CookieJar which is not the base
class of LWPCookieJar. Tada! 3 hours debugging.
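The failure mode can be reproduced in miniature. This is only an illustrative sketch: "demo_cookielib" and the load() helper are made up for the example and are not the importer's real code path. It shows why a subclass built against the first copy of a module is unrelated to the class in the second copy.

```python
import sys
import types

SOURCE = "class CookieJar(object):\n    pass\n"


def load(name, source):
    """Execute source in a fresh module object and install it in sys.modules."""
    mod = types.ModuleType(name)
    sys.modules[name] = mod
    exec(compile(source, name, "exec"), mod.__dict__)
    return mod


first = load("demo_cookielib", SOURCE)


# A dependent module subclasses CookieJar from the first copy.
class LWPCookieJar(first.CookieJar):
    pass


# A duplicate load replaces the module, but the subclass is not rebuilt.
second = load("demo_cookielib", SOURCE)

same_class = first.CookieJar is second.CookieJar
still_subclass = issubclass(LWPCookieJar, second.CookieJar)
print(same_class, still_subclass)  # False False
```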
This is probably a performance fix in disguise, didn't realize things
were so broken. It may also be a regression elsewhere. Urgently need to
finish the tests.
Found due to a LGTM warning about unused loop variable (related). As far
as I can tell the callback was sending fullname multiple times. KeyError
check added because I found NestedTest failed - mitogen.parent had
mitogen as one of its related, and mitogen was not in the cache.
Refs #61
Since the for loops don't contain any break statements the StreamErrors
will always be raised when the loop completes without the method
returning.
See https://lgtm.com/rules/5980098/
Refs #61
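A small sketch of the equivalence, assuming a made-up StreamError stand-in and a hypothetical line-matching helper (neither is the real code being changed). With no break in the loop body, the else clause always runs when the loop finishes, so it behaves the same as a plain statement after the loop.

```python
class StreamError(Exception):
    """Stand-in for the real exception type."""


def first_match_with_else(lines, prefix):
    for line in lines:
        if line.startswith(prefix):
            return line
    else:  # runs whenever the loop finishes without hitting `break`
        raise StreamError("no line starts with %r" % (prefix,))


def first_match(lines, prefix):
    for line in lines:
        if line.startswith(prefix):
            return line
    # Same behaviour: only reached when the loop did not return.
    raise StreamError("no line starts with %r" % (prefix,))


def outcome(func, lines, prefix):
    """Return the result, or the string 'StreamError' if the lookup failed."""
    try:
        return func(lines, prefix)
    except StreamError:
        return "StreamError"


LINES = ["EC0", "EC1 ready"]
```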
Python 2.4 does not support explicit relative imports. They were added
at Python 2.5, along with `from __future__ import absolute_import`.
On 2.x this will mean the import is first (implicitly) tried relative,
but on 3.x it will always be tried absolute.
Fixes #92
I took the liberty of renaming ModuleFinder.STDLIB_DIRS to
_STDLIB_PATHS, since it felt like an implementation detail that
shouldn't be baked into a public API and stdlib can also be imported
from e.g. a zip file.
I also changed it to a set to handle any duplicates.
Fixes #86
Using the same test as in 7af97c0365,
transmitted wire bytes drops from 135,531 to 133,071 (-1.81%), while
received drops from 21,073 to 14,775 (-30%).
Combined, both changes shave 13,914 bytes (-8.6%) off aggregate
bandwidth usage.
Make it configurable as compression hurts in some scenarios.
For the 52 submodules of ansible.modules.system, this produced a 1602
byte pkg_present list. After stripping it becomes 406 bytes, and the
entire LOAD_MODULE size drops from 1988 bytes to 792 bytes (-60%).
For the 68 submodules of ansible.module_utils, 1902 bytes pkg_present
becomes 474 bytes (-75%), and LOAD_MODULE size drops from 2867 bytes to
1439 bytes (-49%).
In a simple test running Ansible's "setup" module followed by its "apt"
module, wire bytes sent drops from 140,357 to 135,531 (-3.4%).
It looks ugly as sin, but this nets about a 20% drop in user CPU time,
and close to 15% increase in throughput.
The average log call is around 10 opcodes, prefixing with '_v and' costs
an extra 2, but both are simple operations, and the remaining 10 are
skipped entirely when _v or _vv are False.
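A minimal sketch of the guard pattern, assuming a module-level _v flag; expensive_repr() and handle() are hypothetical names invented for the example. When _v is False the `and` short-circuits, so neither the argument expression nor the logging call is evaluated.

```python
import logging

LOG = logging.getLogger("demo")

# Assumed module-level flag, set once when logging is configured.
_v = False

calls = []


def expensive_repr(obj):
    """Record that we were called, then build the (costly) representation."""
    calls.append(obj)
    return repr(obj)


def handle(msg):
    # Short-circuit: when _v is False, nothing to the right is evaluated.
    _v and LOG.debug("handle(%s)", expensive_repr(msg))
    return len(msg)


result = handle("abc")
```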
Turns out it is far too easy to burn through available file descriptors,
so try something else: self-pipes are per thread, and only temporarily
associated with a Latch that wishes to sleep.
Reduce pointless locking by giving Latch its own queue, and removing
Queue.Queue() use in some places.
Temporarily undo merging of Waker and Latch, let's do this one step
at a time.
On Python 2.x, operations on pthread objects with a timeout set actually
cause internal polling. When polling fails to yield a positive result,
it quickly backs off to a 50ms loop, which results in a huge amount of
latency throughout.
Instead, give up using Queue.Queue.get(timeout=...) and replace it with
the UNIX self-pipe trick. Knocks another 45% off my.yml in the Ansible
examples directory against a local VM.
This has the potential to burn a *lot* of file descriptors, but hell,
it's not the 1940s any more, RAM is all but infinite. I can live with
that.
This gets things down to around 75ms per playbook step, still hunting
for additional sources of latency.
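A rough sketch of the self-pipe trick, written for Python 3 and a single consumer thread only. PipeLatch is a hypothetical name and this is not the real Latch implementation. The sleeper blocks in select() on the read end of a pipe, so a timeout is handled by the kernel instead of a polling loop.

```python
import collections
import os
import select
import threading


class PipeLatch(object):
    """Single-consumer queue whose get() sleeps in select() on a pipe."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queue = collections.deque()
        self._rfd, self._wfd = os.pipe()

    def put(self, obj):
        with self._lock:
            self._queue.append(obj)
        os.write(self._wfd, b"\x00")  # one byte per item wakes the sleeper

    def get(self, timeout=None):
        rfds, _, _ = select.select([self._rfd], [], [], timeout)
        if not rfds:
            raise TimeoutError("no item arrived within the timeout")
        os.read(self._rfd, 1)  # consume the wake byte for this item
        with self._lock:
            return self._queue.popleft()

    def close(self):
        os.close(self._rfd)
        os.close(self._wfd)


latch = PipeLatch()
threading.Timer(0.05, latch.put, args=("hello",)).start()
item = latch.get(timeout=5.0)
latch.close()
```

Each instance here holds two file descriptors for its whole lifetime, which is the cost mentioned above.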
Fix a MyPy warning by only passing lists to select.select(). At least on
Python 2.x, select.select() was internally converting the sets to lists
anyway.
By the time lists become inefficient here, it is likely that
select.select() itself will also be inefficient, and need to be replaced
with .poll() or similar.
No discernible performance difference when transferring django.db.models
to a local VM.
* Children should never generate a request for a module that has already
been sent, however there are a variety of edge cases where, e.g.
asynchronous calls are made into unloaded modules in a set of
children, causing those children to request modules (and deps) in a
different order, which might break deduplication. So add a warning to
catch when this happens, so we can figure out how to handle it.
Meanwhile it's only a warning since in the worst case, this just adds
needless latency.
* Don't bother treating sent packages separately, there doesn't seem to
be any need for this (after docs are updated to match how preloading
actually works now).
Overwriting 'fullname' variable caused basically nonsensical filtering.
Result was including the module being searched in the list of
dependencies, which was causing ModuleResponder to send it early, which
was causing contexts to start importing the module before preloading of
dependencies had completed.
* SIGTERM safety net prevents profiler from writing results, so disable
it when profiling is active.
* fix warning corrupting stream when profiling=True
Previously we'd send just None in GET_MODULE reply, but now since there
is no single request-reply structure, we must include the fullname in
the LOAD_MODULE response and make all of its data fields None to
indicate the same.
Doesn't yet implement the rules in the docs, but I think the doc rules
could maybe change to match this. Needs lots of cleanup work and
thorough testing, but this is a great start.
* Don't implement the rules for when preloading occurs yet
* Don't attempt to streamily preload modules downstream while this
context hasn't yet received the final module. There is quite
significant latency buried in here, but for now it's a lot of work to
fix.
This works well enough to handle at least the mitogen package, but it's
likely broken for anything bigger.
It seems gevent automatically sets blocking behaviour on fds produced by
the socket module, which causes the Python process we fork to fail
horribly. So in the child, always reset the blocking flag.
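A sketch of resetting the flag, assuming a POSIX system. set_block() and is_nonblocking() are illustrative helper names, and the socketpair merely stands in for an fd that gevent has switched to non-blocking mode.

```python
import fcntl
import os
import socket


def set_block(fd):
    """Clear O_NONBLOCK on fd so reads and writes block again."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_NONBLOCK)


def is_nonblocking(fd):
    return bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)


left, right = socket.socketpair()
left.setblocking(False)  # stand-in for what gevent does to socket fds
before = is_nonblocking(left.fileno())
set_block(left.fileno())
after = is_nonblocking(left.fileno())
left.close()
right.close()
```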
Although file() and open() are synonyms in Python 2.x, use of file()
causes spurious errors when type-checking code with MyPy.
This commit also serves as one small step to Python 3.x compatibility,
since 3.x removes the file() builtin.
Since the above if block ends in a call to os.execv() this block will
only ever run when the if condition was false. Hence putting it in an
else clause is unnecessary.
Without this, it's possible for Waker to be start_received() after the
shutdown signal has already been sent, resulting in 5 second delay
during shutdown.
Additionally mask EBADF during os.write() to waker's write side.
Necessary since nothing synchronizes writer threads from the broker
thread during shutdown. Could be done with a lock instead, but this is
cheaper.
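A sketch of the EBADF masking, assuming a POSIX pipe; wake() is a hypothetical helper, not the real Waker method. Any other OSError is re-raised so genuine failures stay visible.

```python
import errno
import os


def wake(fd):
    """Write one wake byte; return False if fd was already closed."""
    try:
        os.write(fd, b" ")
        return True
    except OSError as e:
        if e.errno != errno.EBADF:
            raise
        return False  # the broker closed the pipe during shutdown


rfd, wfd = os.pipe()
ok_before = wake(wfd)
os.close(wfd)          # simulate shutdown closing the write side
ok_after = wake(wfd)   # would raise EBADF without the mask
os.close(rfd)
```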
Can't figure out what it's supposed to do any more, and can't find a
version of Ansible before August 2016 (when I wrote that code) that
seems to need it.
Add some more mitigations to avoid sending dylibs.
Now there is a separate SHUTDOWN message that relies only on being
received by the broker thread, the main thread can be hung horribly and
the process will still eventually receive a SIGTERM.
If no ADD_ROUTE message has been received from the master associating a
stream with a particular context ID, then it is expected messages
originating from that context ID can only be routed via the parent.
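One way to picture the rule, as a loose sketch only. RouteTable and its method names are invented for illustration and do not reflect the real router's API: a source ID with no ADD_ROUTE entry is expected to arrive via the parent stream.

```python
class RouteTable(object):
    """Maps context IDs to the stream they are expected to arrive on."""

    def __init__(self, parent_stream):
        self.parent_stream = parent_stream
        self._stream_by_id = {}

    def add_route(self, context_id, stream):
        # Called when an ADD_ROUTE message associates an ID with a stream.
        self._stream_by_id[context_id] = stream

    def expected_stream(self, src_id):
        # No route known: the message can only have come via the parent.
        return self._stream_by_id.get(src_id, self.parent_stream)

    def accepts(self, src_id, arrived_on):
        return self.expected_stream(src_id) is arrived_on


parent, child = object(), object()
table = RouteTable(parent)
table.add_route(5, child)
```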
This version is based on the modulefinder standard library module,
pruned back just to handle modules we know have been loaded already, and
to scan module-level imports only, rather than imports occurring in
class and function scope (crappy heuristic, but assume they are lazy
imports).
The ast and compiler modules were far too slow, whereas this version can
bytecode compile and scan all the imports for django.db.models (58
modules) in around 200ms, or about 3.4ms per dependency. It's probably
not going to get much faster than that.
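The idea can be sketched on Python 3 with the dis module. This is a simplified stand-in for the modulefinder-derived scanner, and scan_module_level_imports() is a made-up name. Function and class bodies compile to separate code objects, so looking only at the top-level code object skips imports inside them.

```python
import dis


def scan_module_level_imports(source, filename="<demo>"):
    """Return names imported by IMPORT_NAME opcodes at module level."""
    code = compile(source, filename, "exec")
    # get_instructions() does not descend into nested code objects, so
    # imports in function or class scope are not reported.
    return [
        ins.argval
        for ins in dis.get_instructions(code)
        if ins.opname == "IMPORT_NAME"
    ]


SOURCE = """
import os
from collections import deque

def lazy():
    import json

class Holder:
    import re
"""

found = scan_module_level_imports(SOURCE)
print(found)  # ['os', 'collections']
```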
This probably worsens performance in the common case, but it prevents
runaway producers (see e.g. issue #36) from spending all their CPU
copying around huge strings.
It's also a small step towards a solution to issue #6, which will
replace the output buffer with some sort of fancier queue anyway.
This reduces a particular 40 second run of rsync to 1.5 seconds.
The last time I tested set_nonblock() as a fix for the rsync hang, I
used F_SETFD rather than F_SETFL, which resulted in no error, but also
did not set O_NONBLOCK. Turns out missing O_NONBLOCK was the problem.
The rsync hang was due to every context blocking in os.write() waiting
for either a parent or child buffer to empty, which was exacerbated by
rsync's own pipelining, that allows writes from both sides to proceed
even while reads aren't progressing. On a blocking fd, os.write() does
not return until enough buffer space is available to complete the whole
write. Partial writes are only supported when O_NONBLOCK is enabled.
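A sketch of the difference, assuming Linux-like fcntl behaviour. The helper names are illustrative and the 4 MiB payload is just a size larger than a default pipe buffer. F_SETFD changes descriptor flags such as close-on-exec, while F_SETFL changes file status flags such as O_NONBLOCK.

```python
import fcntl
import os


def is_nonblocking(fd):
    return bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)


def set_nonblock_wrong(fd):
    # The mistake described above: F_SETFD does not touch O_NONBLOCK.
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | os.O_NONBLOCK)


def set_nonblock(fd):
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


rfd, wfd = os.pipe()

set_nonblock_wrong(wfd)
after_wrong = is_nonblocking(wfd)

set_nonblock(wfd)
after_right = is_nonblocking(wfd)

# Nobody reads from rfd, so only part of the payload fits in the pipe.
payload = b"x" * (4 * 1024 * 1024)
written = os.write(wfd, payload)  # returns a short count, does not hang

os.close(rfd)
os.close(wfd)
```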
Now ssh requires a tty allocation. This presents a scalability problem;
a future version could selectively allocate a tty only if typing
passwords is desired.
Sudo's tty handling is now moved into mitogen.master.
* Support passing Context() objects in function calls and return values.
Now the fakessh demo from the documentation index would work
correctly.
* Since slaves can communicate with each other now, they should also use
the same approach to unpickling as the master already used. Collapse
away all the unpickle extension crap and hard-wire just the 3 types
that support unpickling.