Commit Graph

265 Commits (1af2d9aef1ecb2ffff2ede9f40ce71713cb78410)

Author SHA1 Message Date
David Wilson 8889708f24 core: blacklist Jython org.* by default too.
1 silly roundtrip.
6 years ago
David Wilson 38c0ad1eea core: don't deregister Router handles until Broker exit.
Lots of "invalid handle: ..., 102" messages started appearing during
exit recently because ordering changed slightly, and local handles were
sent _DEAD even though the broker loop was still progressing through
shutdown.

The "shutdown" event is too early to close handles: it is the start of
the grace period where streams and downstream contexts can finish up any
work and deliver buffered data, including FORWARD_LOG messages that
haven't arrived yet.

So instead,

- move the _DEAD logic to the "exit" event,
- get rid of Context.on_shutdown() entirely, it's been unused for over
  a month,
- get rid of the "crash" event, since it always fires prior to "exit",
  and its only use was to send _DEAD to local handles, which now happens
  during exit anyway.
6 years ago
David Wilson 813d139d48 Import v2.7.11 tokenize.py for use on older Pythons; closes #189.
It's worth note that 2.7.10 shipped with Sierra, managed to not notice
this due to using a Homebrew 2.7.14.
6 years ago
David Wilson 3682ac6e29 fork: ensure importer handle is installed on the new router. 6 years ago
David Wilson 8f175bf7a8 issue #106: _unpickle_context() did not allow nameless contexts.
These are generated by any child calling .context_by_id() without
previously knowing the context's name, such as Contexts set in
ExternalContext.master and ExternalContext.parent.
6 years ago
David Wilson 17bfb596d0 issue #106: mitogen.service missing from modules list. 6 years ago
David Wilson 504032e6e8 issue #106: has_parent_authority should accept own context ID.
When a stream (such as unix.connect()) has its auth_id set to the
current context's, we should allow those requests too, since the request
is working with the privilege of the current context.
6 years ago
David Wilson fa271fcc8e issue #174: select.error interface differs to OSError
Only OSError got the magical attribute treatment, select.error still
behaves like a tuple.
6 years ago
David Wilson ffdd192397 issue #155: must catch select.error too.
Regression caused by merging exception handlers in 9079176.
6 years ago
David Wilson 6670cba41c Introduce handler policy functions; closes #138.
Now you can specify a function to add_handler() that authenticates the
message header, with has_parent_authority() and is_immediate_child()
built in.
6 years ago
David Wilson 46a14d4ae2 core: Fix logging crash if data is non-string. 6 years ago
David Wilson 40b978c9b7 core: Fix source verification.
Previously:

* src_id could be spoofed
* auth_id was checked but the message was still delivered!
6 years ago
David Wilson fe614aa966 core: cleanup handlers on broker crash; closes #112. 6 years ago
David Wilson 1ff27ada49 Add maximum message size checks. Closes #151. 6 years ago
David Wilson 6db3588c93 Only call _start_transmit when required; closes #165. 6 years ago
David Wilson 80a97fbc9b core: Rename Sender.put() to Sender.send().
Been annoying me for months.
6 years ago
David Wilson 085b3d21bd core: fix call_function_test regression
Second time in 3 weeks. So stupid. This time write tests.
6 years ago
David Wilson 0f29baa077 core: support pickling senders, Receiver.to_sender()
CC @moreati, in case this impacts you
6 years ago
David Wilson 692af860ba core: remove use of defer() from _async_route(). 6 years ago
David Wilson 8676c40674 core: make _start_transmit / _stop_transmit async-only
For now at least, these APIs are always used in an asynchronous context,
so stop using the defer mechanism.
6 years ago
David Wilson ee0f21d57f core: remove Queue locking from broker loop.
Move defer handling out of Broker and into Waker (where it belongs?).
Now the lock must only be taken if Waker was actually woken.

Knocks 400-item run_hostname_100_times from 10.62s to 10.05s (-5.3%).
6 years ago
David Wilson 7b12f84366 core: support CallError(str) for service.py. 6 years ago
David Wilson c34f8dbef3 core: Fix Receiver.__iter__ loop termination.
Since the Message refactoring from a few weeks back, __iter__ has had
nothing to throw ChannelError if the remote sent _DEAD.
6 years ago
David Wilson 671f753207 core: cache result of unpickling message. 6 years ago
David Wilson 5b87d10ae6 core: clean up no longer useful Latch.__repr__
Those fields are always None since the recent fork cleanup work.
6 years ago
David Wilson d4bc44468b core: fix crash in fork stress test
14:50:04 E mitogen: mitogen.fork.Stream('fork.7431') crashed
Traceback (most recent call last):
  File "/home/dmw/src/mitogen/mitogen/core.py", line 1287, in _call
    func(self)
  File "/home/dmw/src/mitogen/mitogen/core.py", line 758, in on_receive
    return self.on_disconnect(broker)
  File "/home/dmw/src/mitogen/mitogen/parent.py", line 370, in on_disconnect
    super(Stream, self).on_disconnect(broker)
  File "/home/dmw/src/mitogen/mitogen/core.py", line 721, in on_disconnect
    fire(self, 'disconnect')
  File "/home/dmw/src/mitogen/mitogen/core.py", line 162, in fire
    return [func(*args, **kwargs) for func in signals.get(name, ())]
  File "/home/dmw/src/mitogen/mitogen/core.py", line 1160, in <lambda>
    listen(stream, 'disconnect', lambda: self.on_stream_disconnect(stream))
  File "/home/dmw/src/mitogen/mitogen/core.py", line 1142, in on_stream_disconnect
    for context in self._context_by_id.itervalues():
RuntimeError: dictionary changed size during iteration
6 years ago
David Wilson a868498469 Replace assertions with fixed checks; closes #157. 6 years ago
David Wilson 862ec21160 core: allow shutdown triggered by any parent, not just immediate parent 6 years ago
David Wilson 0f08783330 core: fix NameError on disconnect 6 years ago
David Wilson 872181bebd issue #155: core: implement Side._on_fork()
Central mechanism for deleting all non-Latch file descriptors belonging
to the parent process during fork().
6 years ago
David Wilson 80642ed9ec issue #155: core: remove one duplicate set_cloexec(). 6 years ago
David Wilson cb71ce94c4 issue #155: core: be more careful reconfiguring stdio
Many dragons present!
6 years ago
David Wilson 443c94eb39 issue #155: core: prevent set_cloexec() use on standard handles 6 years ago
David Wilson 22cc1a3689 issue #155: core: refactor Latch to avoid TLS use
TLS destructors are not called after fork, therefore we must explicitly
track a global list of free file descriptors, and arrange for that list
to explicitly be destroyed from fork.py.
6 years ago
David Wilson 2cf9edc895 issue #155: core: ensure reused Importer gets new Context reference.
More hacky layering violations.. force Importer's _context attribute to
our new parent.
6 years ago
David Wilson cd5b37ea5b core: Use Side.read() rather than bare os.read(). 6 years ago
David Wilson 19d17982f3 core: Split blocking and non-blocking Latch.get()
Mostly just to avoid embarrassing function size, but it may come in
useful for testing later.
6 years ago
David Wilson 721caafb33 core: Do not decrement Latch._waking if we weren't woken. 6 years ago
David Wilson 67e0a4fe59 issue #155: add mitogen.fork to Importer list 6 years ago
David Wilson f752653e77 core: IoLogger: don't set O_CLOEXEC on standard handles
nested_test was failing due to the recent change to centralize
O_CLOEXEC, since stdout and stderr were being marked as non-inheritable.
That meant child processes would start with no stdout/stderr, triggering
a race between Waker opening its pipes, and IoLogger dup2'ing its pipes
over the stdio handles.

Since the stdio handles were closed, Waker would receive one of them as
one end of its pipe, and consequently have it overwritten by IoLogger.

When IoLogger dups over the top of fd 2, it becomes possible for
Waker.on_read() to be called due to pipe's other end to be closed,
causing an OSError exception with errno EAGAIN to appear.
6 years ago
David Wilson 0eeba2eaa8 core: include fds in Waker repr 6 years ago
David Wilson 7a061fe18b core: merge restart() into io_op()
Any long-running system call may suffer EINTR, so arrange for all IO
calls to be wrapped in the restart loop.
6 years ago
David Wilson 90791768be issue #155: core: slightly rearrange how shutdown works
This eliminates Context.on_disconnect() and instead moves its
functionality to a signal wired up by ExternalContext.main().

It leaves mitogen.master.Context is in a better condition to move into
mitogen.parent where it belongs.
6 years ago
David Wilson db1b5f7d62 issue #155: core: refactor main() to support forking.
* Split setup_globals() from setup_package() and make package setup
  optional (fork never needs it -- synthetic package already exists in
  children and the real package exists in masters).

* Add main() parameter to allow passing in the existing Importer
  instance. In forks from children, this means we inherit all the cached
  module state along with the __loader__ used to import any existing
  modules.
6 years ago
David Wilson a956aa409e Remove duplicate set_cloexec calls everywhere
Now it's handled in Side() constructor, it can disappear elsewhere.
6 years ago
David Wilson 65fcef2374 core: mark every side O_CLOEXEC
Not sure why this wasn't done before, seems it should have always been
this way, and can't see any reason it wasn't. Without it, many fds are
leaked into at least .local() children. Closes #163.
6 years ago
David Wilson 54ff1c90fa issue #155: add DEL_ROUTE, propagate ADD_ROUTE upwards
* IDs are allocated by the parent responsible for contructing a new
  child, using ALLOCATE_ID to the master as necessary to allocate new ID
  ranges.

* ADD_ROUTE is sent up the tree rather than down. This permits
  construction of the new context to complete concurrent to parent
  contexts learning about its existence. Since all streams are strictly
  ordered, it's not possible for any parent to observe messages from the
  new context prior to arrival of an ADD_ROUTE from the parent notifying
  of its existence.

  If the new context, for example, implements an Ansible async task, its
  parent can start executing that without waiting for any synchronous
  confirmation from any parent or the master.

* Since routes propagate up, it's no longer possible for a plain
  non-parent child to ever receive ADD_ROUTE, so that code can be moved
  out of core.py and into parent.py (-0.2kb compressed).

* Add a .routes attribute to parent.Stream, and respond to disconnection
  signal on the stream by propagating DEL_ROUTE for any ADD_ROUTE ever
  received from that stream.

* Centralize route management in a new parent.RouteMonitor class
6 years ago
David Wilson e3209d1de0 core: log Broker's id in repr. 6 years ago
David Wilson 7ec02f9bb0 issue #156: ensure Latch state is cleaned up if select throws. 6 years ago
David Wilson 20f5d89dfa issue #156: fix several more races
* Don't need to sleep if queue>sleepers, can just pop the right queue
  element and return it.

* If queue>sleeping and waking==sleeping, no mechanism existed to ensure
  a thread newly added to sleeping would ever be woken. Above change
  fixes that.

* Cannot trust select() return value, scheduler might sleep us
  indefinitely while put() writes a byte.

* Sleeping threads didn't pop FIFO, they popped in whatever order
  scheduler woke them up. Must recover index and use it to pick the pop
  index.
6 years ago
David Wilson 526b0a514b issue #156: prevent Latch.close() triggering spurious wakeups 6 years ago
David Wilson 2c22c41819 issue #156: don't decrement `waking` if we timed out rather than being woken. 6 years ago
David Wilson 07a8994ff5 issue #156: waking thread result dictionary with an integer. 6 years ago
David Wilson 001e0163fe issue #156: handle multiple _put() before wake of first sleeper
- If latch.get() is called and the queue is empty, a thread is put to
  sleep.

- If Latch.put() from another thread then appends an item to the queue and
  wakes the sleeping thread, and

- If a subsequent Latch.put() from the same or another thread manages to
  acquire `lock` before the sleeping thread is scheduled,

- The sleeping thread's wake socket would have multiple bytes written to
  it.

Therefore create a new _pending variable to track the only item assigned
to each thread (keyed by its write socket), and remove the socket from
`sleeping` from within put.
6 years ago
David Wilson 168a954d90 issue #156: prefix Latch private variables 6 years ago
David Wilson 512ff77a46 issue #156: prevent non-sleeping threads from starving sleeping threads.
See new docs
6 years ago
David Wilson c20c2587d9 issue #156: make Latch() repr match Pool() repr. 6 years ago
David Wilson 6e368d37da issue #156: log queue size too 6 years ago
David Wilson 037b461c39 issue #156: yet more logging :( 6 years ago
David Wilson 653c73c8f0 issue #156: also log target of wakes 6 years ago
David Wilson a5cc7cb43c issue #156: add extra debugging around Latch
Change from writing '\x00' to writing '\x7f', and verify that is the
byte that woke the sleeping thread. Add a bunch more IO logging.
6 years ago
David Wilson ac7a64dfa3 core: assign common expression to a variable. 6 years ago
David Wilson 148ce1d703 issue #155: increase context ID width to 32 bits
Needed to make large range allocations (1000 per ALLOCATE_ID roundtrip)
feasible.
6 years ago
David Wilson 8aada2646c core: support throwing LatchError in every sleeping thread
This is to allow Select() to be used as a generic queueing primitive
that supports graceful shutdown.
6 years ago
David Wilson ebfe733914 core: tidy up Stream.on_receive() branches. 6 years ago
David Wilson df488237d4 core: fix race in PidfulStreamHandler
Need to re-test with the lock held, else >1 threads can end up waiting
for lock then reopening the log repeatedly.
6 years ago
David Wilson a06c92d285 core: enable_debug_logging() should reopen file post-fork. 6 years ago
David Wilson 728a0da8a4 issue #139: eliminate quadratic behaviour from transmit path
Implication: the entire message remains buffered until its last byte is
transmitted. Not wasting time on it, as there are pieces of work like
issue #6 that might invalidate these problems on the transmit path
entirely.
6 years ago
David Wilson a3b4b459fa issue #139: eliminate quadratic behaviour on input path
Rather than slowly build up a Python string over time, we just store a
deque of chunks (which, in a later commit, will now be around 128KB
each), and track the total buffer size in a separate integer.

The tricky loop is there to ensure the header does not need to be sliced
off the full message (which may be huge, causing yet another spike and
copy), but rather only off the much smaller first 128kb-sized chunk
received.

There is one more problem with this code: the ''.join() causes RAM usage
to temporarily double, but that was true of the old solution too. Shall
wait for bug reports before fixing this, as it gets very ugly very fast.
6 years ago
David Wilson ba9a06d0f5 issue #139: core: Side.write(): let the OS write as much as possible.
There is no penalty for just passing as much data to the OS as possible,
it is not copied, and for a non-blocking socket, the OS will just keep
buffer as much as it can and tell us how much that was.

Also avoids a rather pointless string slice.
6 years ago
David Wilson 49db4125d0 issue #139: core: bump CHUNK_SIZE from 16kb to 128Kb
Reduces the number of IO loop iterations required to receive large
messages at a small cost to RAM usage.

Note that when calling read() with a large buffer value like this,
Python must zero-allocate that much RAM. In other words, for even a
single byte received, 128kb of RAM might need to be written.
Consequently CHUNK_SIZE is quite a sensitive value and this might need
further tuning.
6 years ago
David Wilson 44d36eccba issue #146: don't crash during on_broker_shutdown
There is some insane unidentifiable Mitogen context (the local context?)
that instantly crashes with a higher forks setting. It appears to be
harmless, but meanwhile this naturally shouldn't be happening.
6 years ago
David Wilson cb620500d1 issue #131: log stack and PPID with MITOGEN_ROUTER_DEBUG=1 6 years ago
David Wilson d58b5ad777 core: prevent creation of unicode Message.data
Was triggering a crash indirectly due to Ansible passing us Unicode
strings. Needs a better fix.
6 years ago
Alex Willmer f999b9adbf Crank zlib.compress() upto 9
SSH command size: 482 bytes (no change)
Preamble size: 8946 bytes (down 33)
6 years ago
David Wilson b243da087c issue #121: fix call_function_test by not raising the dead
A first small mea culpa to all my testing sins of late :)
6 years ago
David Wilson f1009b7502 issue #121: fix breakage caused by a9c6c13
This actually addresses multiple problems:

* Single-file programs were broken, since the fix introduced in
  6931cc10c4 caused builtin_find_module()
  to start indicating __main__ can always be loaded locally. That's
  broken, and there might be more cases where the same problem will crop
  up.

  Since it was indicated __main__ could be loaded locally, the built-in
  import machinery was allowed to attempt that (since we remove __main__
  from sys.modules during bootstrap), which caused a safety check to
  fire in the bowels of Python:

      "Cannot re-init internal module %.200s"

* The check for presence of the whitelist was totally broken, since the
  whitelist is never an empty list. Therefore 'self' was being returned
  for every module, including extension modules like 'termios'.

I have hand-verified this does not break the fix for issue #113. I
looked at writing a test for that, but it requires a Docker container
(or similar) with an ancient version of Ansible installed. Will open a
separate ticket tracking this.
6 years ago
David Wilson 5dddee62ea Revert "issue #121: minimal fix for nested_test."
Mega broken.

This reverts commit a7dbbd96aa.
6 years ago
David Wilson a0c4df72b0 issue #121: minimal fix for nested_test. 6 years ago
David Wilson 28afa955a3 importer: take priority over system packages when whitelisting is enabled
Might want to de-overload the meaning of whitelist in future, but in
the meantime it works fine for Ansible and I can't think of a
whitelisting use case that would break because of it.

Closes #114.
6 years ago
David Wilson cf01c6b710 importer: avoid duplicate module load(!); closes #113.
Amazed this one managed to scrape through for so long. Calling
__import__ from within find_module() was causing the target module, in
this case cookielib, to be loaded *then overwritten* by a subsequent
duplicate load higher in the stack.

The result is that cookielib was loaded twice, and, per usual Python
import semantics, a reference to the partially initialized first
cookielib was installed in sys.modules while its code executed.

At the end of cookielib on 2.x, it imports _LWPCookieJar, which in turn
imports the partially built cookielib from sys.modules, then subclasses
the CookieJar from /that/ module.

Everything is wonderful. Then the call returns back up into the import
mechanism which restarts the entire process -- only this time,
_LWPCookieJar is /not/ reinitialized, so the copy in sys.modules is
still left with types pointing at the old module!

So the duplicate import creates a new CookieJar which is not the base
class of LWPCookieJar. Tada! 3 hours debugging.

This is probably a performance fix in disguise, didn't realize things
were so broken. It may also be a regression elsewhere. Urgently need to
finish the tests.
6 years ago
David Wilson ff617824a1 ansible: fix some flake8 errors
* Unused imports
* Undefined names in helpers.py
* Copyright header wrapping
6 years ago
Alex Willmer 33781aba2c core: Correct naming of Stream.sent_modules
Fixes #90
6 years ago
Alex Willmer a1aab30e63 core: Implement Dead.__ne__ & Dead.__hash__
Both these addtions are to address warnings in
https://lgtm.com/projects/g/dw/mitogen/alerts/?mode=list. Namely that if
a class defines an equality method then it should also define an
inequality and a hash method.

Refs #61
6 years ago
Alex Willmer 4b373c421b core: Standardise type of Importer.whitelist
This seemed a reasonable streamlining, but I'm happy to be overruled.
6 years ago
Alex Willmer ecaa8609f3 core: Add docstring to is_blacklisted_import()
This documents the existing behaviour, which may not be the intended.
6 years ago
David Wilson 5855f1739f core: Handle unpicklable data in dispatch_calls()
Sending just via .call_async() would previously crash the child, now it
generates CallError like intended.
6 years ago
David Wilson d4169557f1 Fix some more Python 2.4 syntax 6 years ago
David Wilson afc8697288 core: Ensure add_handler() callbacks really receive _DEAD on shutdown 6 years ago
David Wilson 020036f807 core: add a nasty hack for Ansible modules. 6 years ago
David Wilson 4d940f08ae importer: drop redundant prefix from pkg_present
For the 52 submodules of ansible.modules.system, this produced a 1602
byte pkg_present list. After stripping it becomes 406 bytes, and the
entire LOAD_MODULE size drops from 1988 bytes to 792 bytes (-60%).

For the 68 submodules of ansible.module_utils, 1902 bytes pkg_present
becomes 474 bytes (-75%), and LOAD_MODULE size drops from 2867 bytes to
1439 bytes (-49%).

In a simple test running Ansible's "setup" module followed by its "apt"
module, wire bytes sent drops from 140,357 to 135,531 (-3.4%).
6 years ago
David Wilson b543b84e80 importer: share blacklist logic between master/parent 6 years ago
David Wilson 8ec6ae1da0 importer: module whitelist/blacklist support
Hoped to avoid it, but it's the obvious solution for Ansible.
6 years ago
David Wilson 43ba1c76dc core: wrap selects in EINTR handlers
This isn't nearly enough, but it catches the most common victim of
EINTR.
6 years ago
David Wilson aafe458a13 core: #39: don't call logging framework when logging is disabled
It looks ugly as sin, but this nets about a 20% drop in user CPU time,
and close to 15% increase in throughput.

The average log call is around 10 opcodes, prefixing with '_v and' costs
an extra 2, but both are simple operations, and the remaining 10 are
skipped entirely when _v or _vv are False.
6 years ago
Alex Willmer 3261c561dd Fix AttributeError in mitogen.core.Context.send_await()
As of adc8fe3aed Receiver objects do not
have a get_data() method and Receiver.get() does not unpickle the
message.
6 years ago
David Wilson 6905dc4e8d master: use queue-like Latch in Select() too. 6 years ago
David Wilson 20afa5b90c Latch v2: combined queue + one self-pipe-per-thread
Turns out it is far too easy to burn through available file descriptors,
so try something else: self-pipes are per thread, and only temporarily
associated with a Lack that wishes to sleep.

Reduce pointless locking by giving Latch its own queue, and removing
Queue.Queue() use in some places.

Temporarily undo merging of of Waker and Latch, let's do this one step
at a time.
6 years ago
David Wilson e6a107c5aa core: replace Queue with Latch
On Python 2.x, operations on pthread objects with a timeout set actually
cause internal polling. When polling fails to yield a positive result,
it quickly backs off to a 50ms loop, which results in a huge amount of
latency throughout.

Instead, give up using Queue.Queue.get(timeout=...) and replace it with
the UNIX self-pipe trick. Knocks another 45% off my.yml in the Ansible
examples directory against a local VM.

This has the potential to burn a *lot* of file descriptors, but hell,
it's not the 1940s any more, RAM is all but infinite. I can live with
that.

This gets things down to around 75ms per playbook step, still hunting
for additional sources of latency.
6 years ago
David Wilson a35fcf44cc ansible: restructure to avoid intermediate imports 6 years ago
David Wilson f3e51a7b18 core: CALL_FUNCTION should check auth_id, not src_id 6 years ago
David Wilson 32f6ee7d43 issue #40: mitogen.unix initial implementation. 6 years ago
David Wilson e63e9d299e docs: add Message documentation 6 years ago
David Wilson 10230f62dd core: Message.reply() helper function 6 years ago
David Wilson 6fc8fa5b22 core: Don't crash if a stream is missing a side. 6 years ago
David Wilson 9238e09ae8 core: Restore behaviour of unpickling Router-specific Context subclass 6 years ago
David Wilson dd088908df select: clean up API. 6 years ago
David Wilson df07e47d24 core: de-munge Message.unpickle() vs. Receiver.get(). 6 years ago
David Wilson a39cd44bf2 core: add auth_id field. 6 years ago
David Wilson a54c96faae core: remove unused SecurityError. 6 years ago
David Wilson 07d4d799f1 Add mitogen.main() decorator mainly for docs and demo use. 6 years ago
David Wilson 55c23e1c57 issue #68: replace sets with lists
Fix a MyPy warning by only passing lists to select.select(). At least on
Python 2.x, select.select() was internally converting the sets to lists
anyway.

By the time lists become inefficient here, it is likely that
select.select() itself will also be inefficient, and need replaced with
.poll() or similar.

No discernible performance different when transferring django.db.models
to a local VM.
6 years ago
David Wilson a0d9d34231 core: fix profiling
* SIGTERM safety net prevents profiler from writing results, so disable
  it when profiling is active.
* fix warning corrupting stream when profiling=True
6 years ago
David Wilson 5f2fa2cda6 importer: always refuse builtins and __builtin__. 6 years ago
David Wilson 0f899f34ff importer: new format to signal ImportError
Previously we'd send just None in GET_MODULE reply, but now since there
is no single request-reply structure, we must include the fullname in
the LOAD_MODULE response and make all of its data fields None to
indicate the same.
6 years ago
David Wilson 4d01dc3ba6 Initial pass at module preloading
* Don't implement the rules for when preloading occurs yet
* Don't attempt to streamily preload modules downstream while this
  context hasn't yet received the final module. There is quite
  significant latency buried in here, but for now it's a lot of work to
  fix.

This works well enough to handle at least the mitogen package, but it's
likely broken for anything bigger.
6 years ago
David Wilson ed71ae72f8 master: make mitogen minimally functional under gevent
It seems gevent automatically sets blocking behaviour on fds produced by
the socket module, which causes the Python process we fork to fail
horribly. So in the child, always reset the blocking flag.
6 years ago
David Wilson 326886832e Add license text everywhere. 6 years ago
Alex Willmer 3831ac360f Replace all calls to file() with open()
Although these are synonyms in Python 2.x, when using MyPy to typecheck
code use of file() causes spurious errors.

This commit also serves as one small step to Python 3.x compatibility,
since 3.x removes the file() builtin.
6 years ago
David Wilson 0481c08beb Ensure _run_defer() fully executes at least once before shutdown
Without this, it's possible for Waker to be start_received() after the
shutdown signal has already been sent, resulting in 5 second delay
during shutdown.

Additionally mask EBADF during os.write() to waker's write side.
Necessary since nothing synchronizes writer threads from the broker
thread during shutdown. Could be done with a lock instead, but this is
cheaper.
6 years ago
David Wilson b1ad04330b docs: move Router.route() into Sphinx. 6 years ago
David Wilson fb759f7c16 docs: move Broker docstrings into Sphinx. 6 years ago
David Wilson ffa063cc01 docs: annother barriage of cross-reference fixes. 6 years ago
David Wilson 7f3a58d514 core: Remove unused on_shutdown attribute. 6 years ago
David Wilson ec66152e37 docs: better io_op doc, move Side docs to Sphinx. 6 years ago
David Wilson 0767abf26f docs: move BasicStream docs into Sphinx. 6 years ago
David Wilson b7a9aa46cf core: More robust shutdown
Now there is a separate SHUTDOWN message that relies only on being
received by the broker thread, the main thread can be hung horribly and
the process will still eventually receive a SIGTERM.
6 years ago
David Wilson 3285fc2f75 Implement test_aborted_on_local_context_disconnect 6 years ago
David Wilson 690ee6dbe2 Fix select_test failure, remove crap old timing_test. 6 years ago
David Wilson 2454dcc990 core: loosen assertion to allow fakessh_test to succeed. 6 years ago
David Wilson 9b13a4cc61 Fix 2 call_function_test failures. 6 years ago
David Wilson f1d82c7284 More API documentation. 6 years ago
David Wilson b7f95e558f Better document Router API and constructors. 6 years ago
David Wilson 815f23bddd Sense of block= parameter was inverted. 6 years ago
David Wilson c4d9f124c6 Document Sender and Receiver classes. 6 years ago
David Wilson 849ccebe04 receiver: only permit one notify callback
There is no point spamming a list for every function call, there is no
use case where multiple notify callbacks would be useful.
6 years ago
David Wilson 48bf987570 issue #20: fix queue.get() parameter list. 6 years ago
David Wilson e3d967ebeb issue #20: initial implementation of mitogen.master.Select(). 6 years ago
David Wilson 14783c75e8 issue #9: log warning when a cross-sibling CALL_FUNCTION occurs
First step to making it a fatal error.
6 years ago
David Wilson 9de1fca3bf issue #9: ensure messages arrive on the expected stream
If no ADD_ROUTE message has been received from the master associating a
stream with a particular context ID, then it is expected messages
originating from that context ID can only be routed via the parent.
6 years ago
David Wilson 01729b18a5 core: use an output deque rather than string to improve worst case perf
This probably worsens performance in the common case, but it prevents
runaway producers (see e.g. issue #36) from spending all their CPU
copying around huge strings.

It's also a small step towards a solution to issue #6, which will
replace the output buffer with some sort of fancier queue anyway.

This reduces a particular 40 second run of rsync to 1.5 seconds.
6 years ago
David Wilson effe4117e1 Treat EPIPE as disconnect too; needed for fakessh. 6 years ago
David Wilson 9c4bf37cfc Remove final vestiges of context.key. 6 years ago
David Wilson 05cc74d142 core: Support profiling 6 years ago
David Wilson a9387b0504 core: remove pointless eval() of ARGV0 environment variable. 6 years ago
David Wilson fb9ce1054c core: set O_NONBLOCK on every side. Closes #33
The last time I tested set_nonblock() as a fix for the rsync hang, I
used F_SETFD rather than F_SETFL, which resulted in no error, but also
did not set O_NONBLOCK. Turns out missing O_NONBLOCK was the problem.

The rsync hang was due to every context blocking in os.write() waiting
for either a parent or child buffer to empty, which was exacerbated by
rsync's own pipelining, that allows writes from both sides to proceed
even while reads aren't progressing. The hang was due to os.write() on a
blocking fd blocking until buffer space is available to complete the
write. Partial writes are only supported when O_NONBLOCK is enabled.
6 years ago
David Wilson 7a60b20dc6 core: Generalize/duplicate the call/send_await code using Receiver. 6 years ago
David Wilson e4c832685d core: synchronize Stream._output_buf by deferring send()
Previously _output_buf was racy. This may or may not be cheaper than
simply using a lock, but it requires much less code, so I prefer it for
now.
6 years ago
David Wilson ead67de883 core: make Side.write() return None rather than crash if side already closed. 6 years ago
David Wilson 74b31bbe47 core: better Message.__repr__. 6 years ago