Commit Graph

442 Commits (master)

Author SHA1 Message Date
David Wilson 2a70b3d5f4 issue #493: Py3.x fix. 5 years ago
David Wilson bc84d1e950 issue #493: less CPU-intensive cookie format. 5 years ago
David Wilson 7dae88f0f5 issue #490: have Side._on_fork() empty _fork_refs
This is mostly to avoid ugly debugging that depends on the state of GC.
Discard sides from _fork_refs after they have been closed.
5 years ago
David Wilson 2b234936b8 core: docstring tidyups. 5 years ago
David Wilson f17fb91993 core: ensure early debug messages are logged correctly.
The magical _v and _vv were being set too late. Drag _setup_logging()
out of the Router constructor and call it at the right moment during
bootstrap.
5 years ago
David Wilson 8a931e79b0 core: log disconnection reason. 5 years ago
David Wilson 1f59bcc313 issue #477: fix another Threading.getName() call. 5 years ago
David Wilson d6dcb8d010 issue #477: blacklist 'thread' module to avoid roundtrip on 2.x->3.x 5 years ago
David Wilson 4c1ddf6fc1 issue #477: Python3 does not have Pickler.dispatch. 5 years ago
David Wilson a31718a6bc issue #477: use PY24 constant rather than explicit test. 5 years ago
David Wilson ffd46e9f1c issue #477: parent: make iter_read() log disconnect reason. 5 years ago
David Wilson e9706a4a09 issue #477: _update_linecache() must append newlines. 5 years ago
David Wilson 19b708e141 issue #415, #477: Poller must handle POLLHUP too.
Linux will fire poll() with simply the POLLHUP bit set even though it
was not requested, resulting in an infinite loop.
5 years ago
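
The POLLHUP behaviour described in 19b708e141 can be reproduced with a pipe whose writer has gone away. This is a minimal sketch using only the standard `select` module, not Mitogen's actual Poller; `poll_ready` is a hypothetical helper, and it assumes a platform that provides `select.poll()`.

```python
import os
import select


def poll_ready(poller, timeout_ms=1000):
    """Return (fd, event) pairs that need servicing.

    POLLHUP and POLLERR are treated like POLLIN: if they were ignored,
    the next poll() call would report them again immediately and the
    caller would spin forever.
    """
    wanted = select.POLLIN | select.POLLHUP | select.POLLERR
    return [(fd, ev) for fd, ev in poller.poll(timeout_ms) if ev & wanted]


rfd, wfd = os.pipe()
poller = select.poll()
poller.register(rfd, select.POLLIN)  # POLLHUP is deliberately not requested
os.close(wfd)                        # the writer disappears

events = poll_ready(poller)
data = os.read(rfd, 4096)            # reading after hangup yields EOF

poller.unregister(rfd)
os.close(rfd)
```

Servicing the descriptor on hangup lets the normal EOF path run, after which the descriptor can be unregistered.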
David Wilson 07f1b9bdd0 issue #477: Python 2.5 needs next() polyfill too. 5 years ago
David Wilson 3afd667136 issue #477: explicitly populate Py2.4 linecache from Importer. 5 years ago
David Wilson 97a96f5dd8 issue #477: rename and add tests for polyfill functions. 5 years ago
David Wilson da13415b00 issue #477: various core.py docstring cleanups. 5 years ago
David Wilson dd36450daf issue #477: yet another bug in core._partition(). 5 years ago
David Wilson 1f17422598 issue #477: make CallError serializable on 2.4.
Making CallError inherit from object broke 'raise CallError()'.

Instead use pure-Python pickler on 2.4 (grmbl) and force it to emit
new-style-alike output for what is otherwise a classic class.

Remove needless complexity from _unpickle_call_error() that only worked
for new-style classes.
5 years ago
David Wilson 4b89dc4813 issue #477: log full module name when SyntaxError occurs. 5 years ago
David Wilson d4afa102c7 issue #477: more Py2.4 (str|unicode).partition(). 5 years ago
David Wilson 0ee8ee78b8 issue #477: Py2.4 cannot tolerate unicode kwargs. 5 years ago
David Wilson 08cecb92f6 issue #477: Py2.4 lacks BaseException. 5 years ago
David Wilson 51a07dce70 issue #477: Py2.4: more unicode.rpartition() usage. 5 years ago
David Wilson 07401d767a issue #477: Python 2.4 type(exc) returns old-style instance. 5 years ago
David Wilson 33caea06ed issue #477: Python <2.5 lacked any(). 5 years ago
David Wilson 3109abd518 issue #477: Python <2.6 lacked rpartition(). 5 years ago
David Wilson 84601f41fd issue #477: make CallError inherit from object for 2.4/2.5.
Otherwise cPickle will not call __reduce__().
5 years ago
David Wilson 881dc7d5ca issue #477: more 2.4-compatible thread.get_ident() use. 5 years ago
David Wilson a1e0b4381f issue #477: bump corrupt msg output size to 2Kb
Allows much more of any tracebacks present to become visible.
5 years ago
David Wilson e99b8a8de7 core: replace ancient YOLO loop in fire(). 5 years ago
David Wilson 120c667052 core: many docstring updates and an example substitute for Channel 5 years ago
David Wilson 84f75551a3 core: make Receiver a self-closing context manager. 5 years ago
David Wilson fcc403cc2f core: make Receiver.to_sender() use Router.myself(). 5 years ago
David Wilson abfb6e39a8 issue #61: unused variable (reported by LGTM) 5 years ago
David Wilson 5bd9efb723 issue #61: add missing close() implementation (reported by LGTM) 5 years ago
David Wilson ea9ef50b3c issue #415: replace default Poller with select.poll()
30% latency reduction for IPC.
5 years ago
David Wilson 5b45b5851c issue #408: use compatible method to get thread ID. 5 years ago
David Wilson 5761652e02 core: allow Router.shutdown() to succeed after exit.
For join_thread():

Exception in thread mitogen.master.join_thread_async:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/dmw/src/mitogen/mitogen/master.py", line 249, in _watch
    watcher.on_join()
  File "/home/dmw/src/mitogen/mitogen/master.py", line 816, in shutdown
    super(Broker, self).shutdown()
  File "/home/dmw/src/mitogen/mitogen/core.py", line 2741, in shutdown
    self.defer(_shutdown)
  File "/home/dmw/src/mitogen/mitogen/core.py", line 2142, in defer
    raise Error(self.broker_shutdown_msg)
Error: An attempt was made to enqueue a message with a Broker that has already exitted. It is likely your program called Broker.shutdown() too early.
5 years ago
David Wilson 822978520f issue #446: update Receiver.__iter__ to match
iter() previously relied on the fake dead message being enqueued.
5 years ago
David Wilson 5ef94eb3e2 issue #456: loosen Waker.defer() shutdown test a little
Allow messages to continue being queued during the shutdown period,
right up until the final loop iteration, even though this is racy, as
too many things depend on .defer() during exit right now.

This doesn't hurt the spirit of the check: it still catches the worst
situation where $user accidentally shut down Broker then tried to
continue using it.
5 years ago
David Wilson bcd9827c3b core: Latch.empty() improvements
- throw LatchError if the latch is closed.
- wrap with the lock to avoid unexpected weirdness.
5 years ago
David Wilson 388649df97 core: Receiver.close() now wakes all threads; closes #446. 5 years ago
David Wilson 1d97493fcd tests: fallout from #447. 5 years ago
David Wilson ab8d6afbae core: use ModuleNotFoundError in importer if it is available; closes #448. 5 years ago
David Wilson de719fa249 core: throw error on duplicate add_handler(); closes #447. 5 years ago
David Wilson ec056042e0 core: more Poller docstrings. 6 years ago
David Wilson d286eeb2ea core: more Poller docs 6 years ago
David Wilson 5f5396bcb2 core: more poller doc 6 years ago
David Wilson 499e7273d1 core: poller tidyups and minify fix 6 years ago
David Wilson 3f5774cfd5 core: document/tidy up poller.
Remove duplicate attribute creates in subclasses too.
6 years ago
David Wilson a156d7aab3 core: move importer inline data out to class vars. 6 years ago
David Wilson 824c7931a9 core: improve importer exception messages. 6 years ago
David Wilson 1eb08fb5c5 core: docstring tidyups 6 years ago
David Wilson 81a68223d4 issue #456: exception text typo. 6 years ago
David Wilson 497234e782 issue #456: core: raise error during defer() if Broker shutdown 6 years ago
David Wilson 917a1ffb29 issue #453: prevent accidental child logging loop. 6 years ago
David Wilson 9680a84824 core: rename Router.self() to Router.myself(). 6 years ago
David Wilson 8f85ee038e core: Add Router.self()
Returns a reference to the current context.
6 years ago
David Wilson 300cb41e2e core: detect stream corruption. Closes #438. 6 years ago
David Wilson c2c7caa34f core: ignore DeprecationWarning for imp module.
Closes #399, #437.
6 years ago
David Wilson 57504ba6ec issue #109: core: meta_path regression in newer Pythons
Python at some point (at least since https://bugs.python.org/issue14605)
began populating sys.meta_path with its internal importer classes,
meaning that interpreters no longer start with an empty sys.meta_path.
6 years ago
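
The observation in 57504ba6ec is easy to check on a modern interpreter. The sketch below is illustrative only and is not the fix made in that commit; `RecordingFinder` and the module name are hypothetical. It shows that `sys.meta_path` is already populated at startup, and that a finder inserted at the front is still consulted first.

```python
import importlib
import importlib.abc
import sys


class RecordingFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder: records every lookup and declines all of them."""

    def __init__(self):
        self.seen = []

    def find_spec(self, fullname, path, target=None):
        self.seen.append(fullname)
        return None  # decline, so the import falls through to later finders


# Names of the finders the interpreter installed before our code ran.
preinstalled = [getattr(f, '__name__', type(f).__name__) for f in sys.meta_path]

finder = RecordingFinder()
sys.meta_path.insert(0, finder)  # front of the list, ahead of the built-ins
try:
    try:
        importlib.import_module('hypothetical_missing_module_for_demo')
    except ImportError:
        pass
finally:
    sys.meta_path.remove(finder)
```

Code that assumes an empty `sys.meta_path`, or that appends to it and expects to be asked first, needs revisiting on such interpreters.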
David Wilson 65d9eec353 issue #364: core: have Sender.close() supply reason= to dead() 6 years ago
David Wilson 01c4f3fee1 core: rearrange stdio setup to cope with buffering; closes #422 6 years ago
David Wilson c7931be524 issue #420: core: include PID in Latch cookie data. 6 years ago
David Wilson 6e1f9e2596 core: 2.6 str.decode() compat fix. 6 years ago
David Wilson 76ec4f201c issue #413: paper over harmless duplicate del_route()
Ideally it would only be called once, and in future maybe it can, but
right now we need to cope with these cases:

* Downstream parent notifies us of disconnection (DEL_ROUTE)
* We notify ourself of disconnection
* We notify ourself and so does downstream parent

It's case 3 that causes the error.
6 years ago
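
The third case in 76ec4f201c amounts to removing the same route twice. One way to tolerate that is to make removal idempotent, as in this sketch. `RouteTable` and its methods are hypothetical stand-ins, not Mitogen's Router API.

```python
import logging

LOG = logging.getLogger('route_demo')


class RouteTable:
    """Hypothetical route table mapping context IDs to streams."""

    def __init__(self):
        self._stream_by_id = {}

    def add_route(self, target_id, stream):
        self._stream_by_id[target_id] = stream

    def del_route(self, target_id):
        """Remove a route; return False if it was already gone."""
        stream = self._stream_by_id.pop(target_id, None)
        if stream is None:
            # Second notification for the same disconnect: harmless.
            LOG.debug('route to %d already deleted', target_id)
            return False
        return True


table = RouteTable()
table.add_route(1005, 'stream-to-child')
first = table.del_route(1005)   # downstream parent sends DEL_ROUTE
second = table.del_route(1005)  # we also notify ourselves
```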
David Wilson cf97932fad core: dead messages have optional body, use it everywhere; closes #387. 6 years ago
David Wilson c09780aeb0 core: fix add_handler(respondent=..) memory leak
Closes #416.
6 years ago
David Wilson 10af266678 issue #406: attempt Broker cleanup in case of a crash. 6 years ago
David Wilson d1c2e7a834 issue #406: call Poller.close() during broker shutdown. 6 years ago
David Wilson e4280dc14a core: Don't crash in Waker.__repr__ if partially initialized. 6 years ago
David Wilson 87e8c45f76 core: fix minify_test regression introduced in 804bacdadb
The minifier can't handle empty function bodies, so the pass statements
are necessary.
6 years ago
David Wilson 16c364910a core: avoid redundant write() calls in Waker.defer()
Using _lock we can know for certain whether the Broker has received a
wakeup byte yet. If it has, we can skip the wasted system call.

Now on_receive() can exactly read the single byte that can possibly
exist (modulo FD sharing bugs -- this could be improved on later)
6 years ago
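
The idea in 16c364910a can be sketched as follows. This is an assumption-laden illustration rather than the real `Waker`: the class below, its `writes` counter, and the use of a plain pipe are all hypothetical. A wake byte is written only when the deferred list transitions from empty to non-empty, so the reader has at most one byte to consume.

```python
import os
import threading


class DemoWaker:
    """Hypothetical waker: one wake byte per batch of deferred calls."""

    def __init__(self):
        self._rfd, self._wfd = os.pipe()
        self._lock = threading.Lock()
        self._deferred = []
        self.writes = 0  # counts write() system calls, for demonstration

    def defer(self, func, *args):
        with self._lock:
            # If the list is non-empty, a wake byte is already pending.
            need_wake = not self._deferred
            self._deferred.append((func, args))
        if need_wake:
            os.write(self._wfd, b' ')
            self.writes += 1

    def on_receive(self):
        os.read(self._rfd, 1)  # exactly one byte can be pending
        with self._lock:
            deferred, self._deferred = self._deferred, []
        for func, args in deferred:
            func(*args)

    def close(self):
        os.close(self._rfd)
        os.close(self._wfd)


results = []
waker = DemoWaker()
for i in range(3):
    waker.defer(results.append, i)
writes_after_batch = waker.writes
waker.on_receive()
waker.defer(results.append, 3)
waker.on_receive()
total_writes = waker.writes
waker.close()
```

Three queued calls cost a single `write()`; a new byte is only written once the previous batch has been drained.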
David Wilson 804bacdadb docs: move most remaining docstrings back into *.py; closes #388
The remaining ones are decorators which don't seem to have an autodoc
equivalent.
6 years ago
David Wilson 711aed7a4c core: split _broker_shutdown() out into its own function.
Makes _broker_main() logic much clearer.
6 years ago
David Wilson 1d32ed3b5a core: avoid shutdown() in IoLogger on WSL; closes #333. 6 years ago
David Wilson 5be9a55bf4 core: allow Context to be pickled by non-Mitogen pickler. 6 years ago
David Wilson a7ee23719a issue #388: move a ton of documentation back into the source 6 years ago
David Wilson 73cda2994f issue #333: add versioning, initial batch of poller tests
Now poller is smart enough to know a start_receive() during an iteration
does not cause events yielded by that iteration to associate with the
wrong descriptor.

These changes are tangentially related to the associated ticket, but
event versioning is still the underlying issue.
6 years ago
David Wilson 1cbff1011e core: send dead message if max message size exceeded; closes #405 6 years ago
David Wilson 9ec360c26d core: split out & extend Broker.sync_call() 6 years ago
David Wilson 58d0a45738 issue #76: quieten routing errors.
Receiving DEL_ROUTE without a corresponding ADD_ROUTE is now legit
behaviour, so don't print an error in this case.

Don't print an error for dropped messages if the reply_to indicates the
sender doesn't care about a response (dead and no_reply)
6 years ago
David Wilson b9bafb78af issue #76: add stub DEL_ROUTE handler to core.py.
This handler knows how to fire 'disconnect' event on reception of a
DEL_ROUTE, and nothing more.
6 years ago
David Wilson babe3eec31 issue #76: record egress context IDs
Used in a subsequent change to broadcast DEL_ROUTE to potentially
interested children.
6 years ago
David Wilson d7d40f1123 issue #76: reduce Context duplication during unpickling
When unpickling a context, arrange for there to be a single instance
representing that context, managed by the corresponding router. This
context_by_id() was already in use by parent.py, it just needs to move
down.

This is to eventually reach the point where a single Context exists that
needs 'disconnect' fired on it, so all sleeping receivers are definitely
woken.
6 years ago
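
The single-instance arrangement described in d7d40f1123 can be sketched as a per-router registry. The classes below are simplified, hypothetical stand-ins that borrow the names `Router`, `Context`, and `context_by_id()` from the commit message; the real signatures may differ.

```python
import threading


class Context:
    """Hypothetical, minimal stand-in for a remote context handle."""

    def __init__(self, router, context_id, name=None):
        self.router = router
        self.context_id = context_id
        self.name = name


class Router:
    """Hypothetical router that owns one Context object per context ID."""

    def __init__(self):
        self._context_by_id = {}
        self._lock = threading.Lock()

    def context_by_id(self, context_id, name=None):
        # Lookup and creation happen under one lock so two unpickling
        # threads cannot create duplicate Context objects for the same ID.
        with self._lock:
            ctx = self._context_by_id.get(context_id)
            if ctx is None:
                ctx = Context(self, context_id, name)
                self._context_by_id[context_id] = ctx
            return ctx


router = Router()
a = router.context_by_id(19095, name='demo-child')
b = router.context_by_id(19095)
```

With one object per ID, firing 'disconnect' on it reaches every receiver that holds a reference.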
David Wilson a7b1831ddf core: move IS_DEAD doc into core.py. 6 years ago
Alex Willmer 6da31c9dee docs: Remove unneeded backslash escapes
Python 3.x was emitting a DeprecationWarning. AFAICT there has been no
impact on the HTML rendering.
6 years ago
Yannig Perré 6828926a36 Kubernetes connection support for mitogen. 6 years ago
David Wilson 294f17e491 core: fix econtext on_start parameter, used by fork_test. 6 years ago
David Wilson 4d3873c784 core: call chains v3: abstract it into a new CallChain class. 6 years ago
David Wilson a3957d6aaf parent: add Context.forget_chain(). 6 years ago
David Wilson 37223adacd core: fix Dispatcher race introduced in 3a7815e5ca6255272334415916b6289378173859
It must be constructed before any messages are pumped.
6 years ago
David Wilson 42b1b3d286 core: support mitogen_chain dispatcher option. 6 years ago
David Wilson 92c092d27b core: split Dispatcher out into own class. 6 years ago
David Wilson ba0b3af205 core: remove accidentally checked in debug crap (#337) 6 years ago
David Wilson c6159c9154 core: fix startup logging race. Closes #305. 6 years ago
David Wilson 7d62a53264 issue #337: ssh: disabling PTYs round 2: make it automatic. 6 years ago
David Wilson 2fcea4b199 add extra 'pass' statements to work around minify issues. 6 years ago
David Wilson 27b64a484b docs: document mitogen.core.CHUNK_SIZE. 6 years ago
David Wilson df5342af22 core: split out _internal_receive()
This is needed for libssh2.
6 years ago
David Wilson 442d88e3d7 docs: many more fixes/merges. 6 years ago
David Wilson a561fb79e5 docs: merge more docs back into mitogen/core.py. 6 years ago
David Wilson 81c8156965 Support LXD; closes #339. 6 years ago
David Wilson 5c573f7fcb ansible: insert short sleep when MITOGEN_PROFILING active.
Hacky, but works fine.
6 years ago
David Wilson d26fe5b993 issue #310: fix negative imports on Python 3.x.
On 3.x, Importer() can still have its methods called even if
load_module() raises ImportError.

Closes #310.
6 years ago
David Wilson f7e288fa25 core: fd 0/1 were accidentally made non-blocking.
This breaks regular code. Triggered by a huge pprint() in the child to
stdout.
6 years ago
napkindrawing 745d72bb1d core: support for "doas" become_method 6 years ago
David Wilson 3a8ea930d7 core: fix NameError in Latch.put(), FileService exception 6 years ago
David Wilson 484d4fdb74 core: fix Latch socket sharing race.
If thread A is about to wake as thread B is about to sleep, and A loses
the GIL at an inopportune moment, it was possible for two latches to
share the same socketpair, causing wakeups routed to the wrong latch.

The pair was returned to the 'idle sockets' list before .recv() had been
called. This manifested as TimeoutError() thrown rarely with many active
threads and the host is heavily loaded (such as Travis CI).

Add more documentation and stop writing single wake bytes. Instead, the
recipient's identity is written, making it simpler to detect future
bugs.
6 years ago
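
The identity-based wakeup from 484d4fdb74 can be illustrated with a socketpair. This is a sketch under stated assumptions: the cookie format (PID plus thread ID) and the helper names `make_cookie`, `wake`, and `wait_for_wake` are hypothetical, not Mitogen's actual format or API.

```python
import os
import socket
import threading


def make_cookie():
    """Hypothetical identity: process ID plus thread ID, ASCII encoded."""
    return ('%d-%d' % (os.getpid(), threading.get_ident())).encode('ascii')


def wake(wsock, cookie):
    # Write the sleeper's identity rather than an anonymous wake byte.
    wsock.send(cookie)


def wait_for_wake(rsock, expected_cookie):
    got = rsock.recv(len(expected_cookie))
    if got != expected_cookie:
        # A wakeup meant for another sleeper arrived on our socket.
        raise RuntimeError('cookie mismatch: got %r, expected %r'
                           % (got, expected_cookie))


rsock, wsock = socket.socketpair()
cookie = make_cookie()

wake(wsock, cookie)
wait_for_wake(rsock, cookie)           # correct recipient: no error

wake(wsock, b'x' * len(cookie))        # simulate a misrouted wakeup
try:
    wait_for_wake(rsock, cookie)
    misrouted_detected = False
except RuntimeError:
    misrouted_detected = True

rsock.close()
wsock.close()
```

A bare wake byte would have been accepted silently in the second case; the cookie turns a shared-socket bug into an immediate error.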
David Wilson 29f15c236c core: remove needless size prefix from core_src_fd.
I think this is brainwrong held over from an early attempt to write the
duplicate copy of core_src on stdin.
6 years ago
David Wilson 04e138e060 core: fix serialization of empty bytes() on 3.x. 6 years ago
David Wilson ff2f44b046 core: reduce chance of Latch.read()/write()/close() race.
Previously it was possible for a thread to call Waker.defer() after
Broker has torn its Waker down, and the underlying file descriptor
reallocated by the OS to some other component.

This manifested as latches of a subsequent test invocation receiving the
waker byte (' ') rather than their expected byte '\x7f'.

This doesn't fix the problem, it just significantly reduces the chance
of it occurring. In future Side.write()/read()/close() must be
synchronized with a lock.

Previously the problem could be reliably triggered with:

    while :; do
        python tests/call_function_test.py -vf CallFunctionTest.{test_aborted_on_local_broker_shutdown,test_aborted_on_local_context_disconnect}
    done
6 years ago
David Wilson e24eddb1ce core: move Latch docs back inline. 6 years ago
David Wilson 42276f158b core: log the data received on the latch file handle. 6 years ago
David Wilson a52064a24f core: reordered find_module() test was broken (again)
e81b3bd0652b5eb125eb224ceca281b9d540dd5e

The whitelist check must happen /after/ the other checks, otherwise we
unconditionally return self for crap like 'ansible.module_utils.json'.
6 years ago
David Wilson db529e8228 core: fix Receiver.__iter__ regression on EOF 6 years ago
David Wilson 9fb2371d64 importer: reorder/tweak find_module() tests to cope with six.moves
The old hack on the master side we had is broken for some reason on 3.x.
Instead tweak the client to be more selective: if a request is for a
module within a package, the package must be loaded (in sys.modules),
and its __loader__ must be us. Previously if the module didn't exist in
sys.modules, we'd still try to fetch from the master, which doesn't
appear to ever make sense.
6 years ago
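
The selection rule described in 9fb2371d64 can be written as a small predicate. This is a sketch only: `should_fetch_from_master` is a hypothetical name, and accepting every top-level name is an assumption made for brevity, since the commit message only describes the rule for modules inside a package.

```python
import sys
import types


def should_fetch_from_master(fullname, importer):
    """Decide whether `importer` should try to serve `fullname`."""
    pkgname, _, _ = fullname.rpartition('.')
    if not pkgname:
        return True  # assumption: top-level names are always attempted
    pkg = sys.modules.get(pkgname)
    # The parent package must already be loaded, and loaded by us.
    return pkg is not None and getattr(pkg, '__loader__', None) is importer


importer = object()  # stands in for the importer instance

# Register a fake package that claims to have been loaded by our importer.
ours = types.ModuleType('demo_pkg_ours')
ours.__loader__ = importer
sys.modules['demo_pkg_ours'] = ours
try:
    from_our_package = should_fetch_from_master('demo_pkg_ours.mod', importer)
    from_unloaded = should_fetch_from_master('demo_pkg_missing.mod', importer)
    from_foreign = should_fetch_from_master('os.demo_submodule', importer)
finally:
    del sys.modules['demo_pkg_ours']
```

Only the first request passes: the second names a package that is not in `sys.modules`, and the third names a package loaded by a different loader.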
David Wilson 410016ff47 Initial Python 3.x port work.
* ansible: use unicode_literals everywhere since it only needs to be
  compatible back to 2.6.
* compat/collections.py: delete this entirely and rip out the parts of
  functools that require it.
* Introduce serializable Kwargs dict subclass that translates keys to
  Unicode on instantiation.
* enable_debug_logging() must set _v/_vv globals.
* cStringIO does not exist in 3.x.
* Treat IOLogger and LogForwarder input as latin-1.
* Avoid ResourceWarnings in first stage by explicitly closing fps.
* Fix preamble_size.py syntax errors.
6 years ago
David Wilson e0c116a29f issue #275: logging package uses classic classes in 2.6. 6 years ago
David Wilson 75b195ba4b core: race during Receiver construction.
It's possible for a message to arrive after .add_handler() but before
Latch construction.

This is papering over a bigger problem with service pool instantiation.

https://travis-ci.org/dw/mitogen/jobs/390409832#L2901

    TASK [Spin up a few interpreters] **********************************************
    changed: [target] => (item=1)
    ERROR! [pid 5355] 14:47:50.224945 E mitogen.ctx.ssh.localhost:2201.sudo.mitogen__user2: mitogen: Router(Broker(0x7f1e93911450))._invoke(Message(19100, 19095, 19095, 110, 1005, '\x80\x02U\x1fmitogen.service.PushFileServiceq\x01U\x11store_and_f'..8955)): <bound method Receiver._on_receive of Receiver(Router(Broker(0x7f1e93911450)), 110)> crashed
    Traceback (most recent call last):
      File "<stdin>", line 1471, in _invoke
      File "<stdin>", line 491, in _on_receive
    AttributeError: 'Receiver' object has no attribute '_latch'
6 years ago
David Wilson 888829544a issue #280: move find_module() log output to IOLOG
It just generates far too much spam, and its final decision is obvious
since a followup load_module() will exist for positive matches.
6 years ago
David Wilson 05e0b134f9 service: simplify CALL_SERVICE stub and fix race.
If PushService.store_and_forward() loses the race to arrive at a brand
new context first, and the context's main thread is already executing a
CALL_FUNCTION that is blocked on the result of PushService, deadlock
could occur in the old scheme.

Instead (for now) simply spam a thread for each incoming message, and
use the get_or_create_pool() lock to ensure things work out in the end.
This could potentially generate a huge number of threads given the wrong
app, but we'll fix that problem when it appears.
6 years ago
David Wilson 92ecf29559 core: check in the hacks that let Ansible work just now. 6 years ago
David Wilson 9e78c20eba core/parent: add Context.call_no_reply(). 6 years ago
David Wilson b3a5fa70b0 core: copy debug setting to child's Router too.
core.Router doesn't pay attention to this attribute, but after
upgrade_router() has been called, the new parent.Router will.
6 years ago
David Wilson 785df88fa4 issue #186: core: remove long-forgotten hack.
This is likely to break something, it was definitely needed at some
point, but I never put much effort into figuring out why. Meanwhile,
Python appears to make find_module('ansible.module_utils.facts.')
requests in some circumstances, which causes us to indicate the module
exists while this hack exists.

So remove it, and let's see what breaks.
6 years ago
David Wilson 34daec4a7a core: prevent warning when CALL_FUNCTION used without reply_to
Such as when the stub CALL_SERVICE handler is used.
6 years ago
David Wilson f7d2eace08 tests: importer fixes 6 years ago
David Wilson 9492dbc4d7 parent: split out minify.py and add stub where master can install it.
This needs a cleaner mechanism to install it, at least this one is
documented.
6 years ago
David Wilson 3b0addcfb0 service: v2. Closes #213 6 years ago
David Wilson a4ddef25a1 core: move reader/writer debug prints
They stop working with kqueue/epoll poller in the old location. Also
comment them out again, should never have been checked in uncommented.
6 years ago
David Wilson fc59f57ba2 issue #213: core: split out import_module() for use in services.py. 6 years ago
David Wilson 49fb25ee1c issue #213: core: fix shutdown crash due to member variable rename 6 years ago
David Wilson 40c6c6426f issue #213: core: fix test breakage due to log message change 6 years ago
David Wilson 2310497d55 issue #213: core: have Message.reply() log msg for zero reply_to
It's easy to call msg.reply() by accident on a message that never had
reply_to set, resulting in an "invalid handle" error message coming from
router. Instead log a more accurate message on the stack that actually
caused the problem.
6 years ago
David Wilson d2714752ee docs: tidy ups 6 years ago
David Wilson 61365236ad docs/select: fix up more references, fix headings. 6 years ago
David Wilson ddf28987a0 master: split Select() into new module to reduce wire size.
service.py currently imports master.py(+parent.py) just to get Select().
6 years ago
David Wilson 7a592d1c34 core: better Poller.__repr__ 6 years ago
David Wilson b0ce6eecd7 fork: support on_start= argument. 6 years ago
David Wilson 00edf0d66d core: have ExternalContext accept a config dict rather than kwargs.
The parameter lists had gotten out of control.
6 years ago
David Wilson 55fff54774 core: make try/catch logic a little clearer in Latch.get() 6 years ago
David Wilson 05a5f2b6e5 core: if Poller.poll() fails, TimeoutError would be raised.
We must check whether poller threw an exception both in the case that we
weren't woken and the case that we were.
6 years ago
David Wilson 5bdc1719c5 issue #249: epoll() raises IOError for EINTR, not select.error. 6 years ago
David Wilson 07056b0dd1 issue #249: fix ordering bug masked by previous implementation 6 years ago
David Wilson 36a1024861 issue #249: port Latch to poller too.
This is probably going to suck for perf :/
6 years ago
David Wilson dcf0aa351e issue #249: whoops, fix new poller timeouts. 6 years ago
David Wilson 4df020827d issue #249: explicitly close pollers when done. 6 years ago
David Wilson 9abcf63155 issue #249: Poller API v2 (BSD only).
Now it's BasicStream/Side-agnostic, so it can be reused for Latch and
iter_read().
6 years ago