Commit Graph

1045 Commits (master)

Author SHA1 Message Date
David Wilson 30ae3d85cb compat: fix Py2.4 SyntaxError 5 years ago
David Wilson 2ee0e07037 core: MitogenProtocol.is_privileged was not set in children
Follow the previous unidirectional routing fix, now errors are occurring
where they should not.
5 years ago
David Wilson 5924af1566 [security] core: undirectional routing wasn't respected in some cases
When creating a context using Router.method(via=somechild),
unidirectional mode was set on the new child correctly, however if the
child were to call Router.method(), due to a typing mistake the new
child would start without it.

This doesn't impact the Ansible extension, as only forked tasks are
started directly by children, and they are not responsible for routing
messages.

Add test so it can't happen again.
5 years ago
David Wilson 436a4b3b3c docs: tidy up Select.all() 5 years ago
Marc Hartmayer 2ed8395d6c master: fix TypeError
Add a guard for the case `path == None`.

This commit fixes

`TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType`
5 years ago
Marc Hartmayer 0a6c0cd8fb pkgutil: fix Python3 compatibility
Starting with Python3 the `as` clause must be used to associate a name to the
exception being passed.
5 years ago
Marc Hartmayer 444b7d6d97 parent: use protocol for getting remote_id
Fixes 8d1b01d8ef ("Refactor Stream, introduce quasi-asynchronous connect, much
more").
5 years ago
David Wilson e3dcce2069 os_fork: do not attempt to cork the active thread. 5 years ago
David Wilson 3231c62a66 parent: fix get_log_level() for split out loggers. 5 years ago
David Wilson cc02906d2a issue #547: fix service_test failures. 5 years ago
David Wilson 769a8b2015 issue #547: core/service: race/deadlock-free service pool init
The previous method of spinning up a transient thread to import the
service pool in a child context could deadlock with use of the importer
on the main thread. Therefore wake the main thread to handle import for
us, and use a regular Receiver to buffer messages to the stub, which is
inherited rather than replaced by the real service pool.
5 years ago
David Wilson ecc570cbda select: make Select.add() handle multiple buffered items.
Previously given something like:

    l = mitogen.core.Latch()
    l.put(1)
    l.put(2)

    s = mitogen.select.Select([l], oneshot=False)
    assert 1 == s.get(block=False)
    assert 2 == s.get(block=False)

The second call would throw TimeoutError, because Select.add() only
queued the receiver/latch once if it was non-empty, rather than once for
each item as should happen.
5 years ago
David Wilson 49a6446af8 core/select: add {Select,Latch,Receiver}.size(), deprecate empty()
Knowing an estimate of the buffered items is needed for adding a
latch/receiver with many existing buffered items via Select.add().
5 years ago
David Wilson 95b067a114 parent: docstring fixes 5 years ago
David Wilson b33b29af33 core: remove dead Router.on_shutdown() and Router "shutdown" signal
Its functionality was duplicated by _on_broker_exit() somewhere along
the way, and nothing has referred to it in a long time. I have no idea
how this happened.

Merge its docstring into _on_broker_exit() and delete it, remove the
Router "shutdown" signal after confirming it has no users, and move all
the Router-originated error messages together in a block at the top of
the class.

Already covered by router_test.AddHandlerTest.test_dead_message_sent_at_shutdown
5 years ago
David Wilson 70deb34bce [stream-refactor] stop leaking FD 100 for the life of the child
This prevents successful detachment since [stream-refactor] landed
5 years ago
David Wilson f304ab8dec core: split preserve_tty_fp() out into a function 5 years ago
David Wilson f4cee16526 parent: zombie reaping v3
Improvements:

- Refactored off Process, separately testable without a connection
- Don't delay Broker shutdown indefinitely for detached children
5 years ago
David Wilson 709a0c013f issue #410: fix test failure due to obsolete parentfp/childfp 5 years ago
David Wilson 9c0cb44ee9 issue #170: replace Timer.cancelled with Timer.active
It's more flexable: False can represent 'cancelled' or 'expired',
whereas setting cancelled=True for an expired timer didn't feel right.
5 years ago
David Wilson 9839e6781c core: more descriptive graceful shutdown timeout error
Accounts for timers too
Tidy up a wordy comment further down the file
5 years ago
David Wilson 65bec2244d core: fix Python2.4 crash due to missing Logger.getChild(). 5 years ago
David Wilson e8b1bf5909 issue #410: automatically work around SELinux braindamage. 5 years ago
David Wilson ce04fd39c9 core: cache stream reference in DelimitedProtocol
Stream.set_protocol() was updated to break the reference on the previous
protocol, to encourage a crash should an old protocol continue operating
after it's not supposed to be active any more.

That broke DelimitedProtocol's protocol switching functionality.
5 years ago
David Wilson ad590f3321 parent: docstring formatting 5 years ago
David Wilson a79d2bd50b docs: another round of docstring cleanups. 5 years ago
David Wilson 20532ea591 master: allow filtering forwarded logs using logging package functions.
Given a message sent on "ssh.foo" to "mypkg.mymod", instead of logging
it to "mitogen.ctx.ssh.foo" in the master process, with the message
prefixed with the original logger name, instead log it to
"mypkg.mymod.[ssh.foo]", permitting normal logging package filtering
features to work as they usually do.

This also helps tidy up logging output a little bit.
5 years ago
David Wilson feb1654305 docs: many more internals.rst tidyups 5 years ago
David Wilson 11c7e3f561 service: centralize fetching thread name, and tidy up logs 5 years ago
David Wilson f0782ccd42 [stream-refactor] get caught up on internals.rst updates 5 years ago
David Wilson 7379144a12 Stop using mitogen root logger in more modules, remove unused loggers 5 years ago
David Wilson b76da4698b parent: move subprocess creation to mux thread too
Now connect() really is a pure blocking wrapper.
5 years ago
David Wilson 5298e87548 Split out and make readable more log messages across both packages 5 years ago
David Wilson aa06b960f5 parent: define Connection behaviour during Broker.shutdown()
- Connection attempt fails reliably, and it fails with CancelledError
- Add new mitogen.core.unlisten()
- Add test.
5 years ago
David Wilson cebccf6f41 issue #549 / [stream-refactor]: fix close/poller deregister crash on OSX
See source comment.
5 years ago
David Wilson 2fede49078 service: clean up log messages, especially at shutdown 5 years ago
David Wilson 75d179e4b9 remove unused imports flagged by lgtm 5 years ago
David Wilson 45a3014fd4 parent: decode logged stdout as UTF-8. 5 years ago
David Wilson 3b000c7d15 unix: include more IO in the try/except for connection failure 5 years ago
David Wilson 108015aa22 ansible: gracefully handle failure to connect to MuxProcess
It's possible to hit an ugly exception during early CTRL+C
5 years ago
David Wilson 1fca0b7a94 [linear2] fix MuxProcess test fixture and some merge fallout 5 years ago
David Wilson e93762b3db service: avoid taking another lock in the usual case 5 years ago
David Wilson 50bfe4c746 service: don't acquire lock when pool already initialized 5 years ago
David Wilson f4709b1dc2 profiler: marginal improvements 5 years ago
David Wilson 3b585b841e core: ensure 'exit' signal fires even on Broker crash. 5 years ago
David Wilson d6faff06c1 core: wake Waker outside of lock.
Given:

- Broker asleep in poll()
- thread B calling Latch.put()

Previously,

- B takes lock,
- B wakes socket by dropping GIL and writing to it
- Broker wakes from poll(), acquires GIL only to find Latch._lock is held
- Broker drops GIL, sleeps on futex() for _lock
- B wakes, acquires GIL, releases _lock
- Broker wakes from futex(), acquires lock

Now,

- B takes lock, updates state, releases lock
- B wakes socket by droppping GIL and writing to it
- Broker wakes from poll(), acquires GIL and _lock
- Everyone lives happily ever after.
5 years ago
David Wilson 807cbef9ca core: wake Latch outside of lock.
Given:

- thread A asleep in Latch._get_sleep()
- thread B calling Latch.put()

Previously,

- B takes lock,
- B wakes socket by dropping GIL and writing to it
- A wakes from poll(), acquires GIL only to find Latch._lock is held
- A drops GIL, sleeps on futex() for _lock
- B wakes, acquires GIL, releases _lock
- A wakes from futex(), acquires lock

Now,

- B takes lock, updates state, releases lock
- B wakes socket by droppping GIL and writing to it
- A wakes from poll(), acquires GIL and _lock
- Everyone lives happily ever after.
5 years ago
David Wilson 7e51a93231 core: remove old blocking call guard, it's in the wrong place
It should have been in Receiver.get(). Placing it here prevents
*_async() method calls from broker thread.
5 years ago
David Wilson 9035884c77 ansible: abstract worker process model.
Move all details of broker/router setup out of connection.py, instead
deferring it to a WorkerModel class exported by process.py via
get_worker_model(). The running strategy can override the configured
worker model via _get_worker_model().

ClassicWorkerModel is installed by default, which implements the
extension's existing process model.

Add optional support for the third party setproctitle module, so
children have pretty names in ps output.

Add optional support for per-CPU multiplexers to classic runs.
5 years ago
David Wilson 6b8a7cbcc4 [stream-refactor] parent: fix crash on graceful shutdown
Now it's possible for stream.protocol to not refer to MitogenProtocol,
move the signal handler to a MitogenProtocol subclass instead.

Fixes a crash where CTRL+C during child bootstrap would print
AttributeError.
5 years ago
David Wilson 2ccdeeeb87 parent: tidy up create_socketpair() 5 years ago
David Wilson c0513425ca core: more concise Side.repr. 5 years ago
David Wilson f45d8eae66 [stream-refactor] replace cutpaste with Stream.accept() in mitogen.unix 5 years ago
David Wilson 1843f183a3 [stream-refactor] fix flake8 errors 5 years ago
David Wilson c02358698b [stream-refactor] don't abort Connection until all buffers are empty 5 years ago
David Wilson 93342ba60c Normalize docstring formatting 5 years ago
David Wilson 4e6aadc40a [stream-refactor] fix LogHandler.uncork() race
During early initialization under hackbench, it is possible for Broker
to be in LogHandler._send() while the main thread has already destroyed
_buffer. So we must synchronize them, but only while the handler is
corked.
5 years ago
David Wilson 90c989ee59 [stream-refactor] BufferedWriter must disconenct Stream, not Protocol
Fix a race where if Stream.on_receive() detects disconnect, it calls
Stream.on_disconnect(), which fires Stream 'disconnect' event, whereas
if BufferedWriter.on_transmit() detects disconnect, it called
Protocol.on_disconnect(), which did not fire the Stream 'disconnect'
event.

Since mitogen.parent listens on Stream's 'disconnect' event to reap
children, this was causing a very difficult to trigger test failure.

Triggered after <1000 runs on a Xeon E5530 with hyperthreading using
hackbench running at the same priority:

    $ hackbench -s 1048576 -l 100000000000 -g 4
5 years ago
David Wilson 65e31f63fe [stream-refactor] fix Py2.4 failure by implementing missing Timer method 5 years ago
David Wilson 11ae6f3873 core: better Side attribute docstrings 5 years ago
David Wilson fdf3484a2a [stream-refactor] 3.x socket.send() requires bytes 5 years ago
David Wilson c09bbdc2f9 [stream-refactor] fix 2.4 syntax error. 5 years ago
David Wilson b1379e6f45 [stream-refactor] send MITO002 earlier
Prevents 2.4 bootstrap from attempting to fetch os_fork too early.

Connection(None).connect(): pid:25098 stdin:81 stdout:81 stderr:79
ssh.localhost:2201: (partial): mitogen__has_sudo_nopw@localhost's password:
ssh.localhost:2201: (password prompt): mitogen__has_sudo_nopw@localhost's password:
ssh.localhost:2201: (unrecognized): mitogen__has_sudo_nopw@localhost's password:
BootstrapProtocol(ssh.localhost:2201): first stage started succcessfully
BootstrapProtocol(ssh.localhost:2201): first stage received bootstrap
ssh.localhost:2201: (partial): MIdmitogen.os_fork
ssh.localhost:2201: (unrecognized partial): MIdmitogen.os_fork
ssh.localhost:2201: failing connection due to TimeoutError(u'Failed to setup connection after 10.00 seconds',)
5 years ago
David Wilson 4eecc08047 [stream-refactor] merge stdout+stderr when reporting EofError
Fixes sudo regression
5 years ago
David Wilson 1d2bfc28da [stream-refactor] fix crash in detach() / during async/multiple_items_loop.yml 5 years ago
David Wilson 93abbcaf7a [stream-refactor] fix crash in runner/forking_active.yml 5 years ago
David Wilson 6e33de7cd2 unix: ensure mitogen.context_id is reset when client disconnects
To ensure a test process can successfully recreate an Ansible
MuxProcess, reset fork-inherited globals during disconnection.

There is basically no good place for this. Per the comments on #91, it
would be far better if the context's identity was tied to its router,
rather than some global variable.
5 years ago
David Wilson 7c4621a010 [stream-refactor] make syntax 2.4 compatible 5 years ago
David Wilson 0ff5fb8fc4 [stream-refactor] fix su_test failure (issue #363) 5 years ago
David Wilson 8769c3ce24 [stream-refactor] more readable log string format 5 years ago
David Wilson d411003b64 [stream-refactor] dont doubly log last partial line 5 years ago
David Wilson 1069ca43d6 [stream-refactor] port mitogen.buildah, added to master since work began 5 years ago
David Wilson 26b6333787 [stream-refactor] fix unix.Listener construction 5 years ago
David Wilson 1fb3852fa6 [stream-refactor] fix crash when no stderr present. 5 years ago
David Wilson 4b0870aa6e [stream-refactor] fix Process constructor invocation 5 years ago
David Wilson f039c81bb0 [stream-refactor] rename Process attrs, fix up more create_child_test 5 years ago
David Wilson acade4ce88 ssh: fix issue #271 regression due to refactor, add test. 5 years ago
David Wilson 8d1b01d8ef Refactor Stream, introduce quasi-asynchronous connect, much more
Split Stream into many, many classes

  * mitogen.parent.Connection: Handles connection setup logic only.
    * Maintain references to stdout and stderr streams.
    * Manages TimerList timer to cancel connection attempt after
      deadline
    * Blocking setup code replaced by async equivalents running on the
      broker

  * mitogen.parent.Options: Tracks connection-specific options. This
    keeps the connection class small, but more importantly, it is
    generic to the future desire to build and execute command lines
    without starting a full connection.

  * mitogen.core.Protocol: Handles program behaviour relating to events
    on a stream. Protocol performs no IO of its own, instead deferring
    it to Stream and Side. This makes testing much easier, and means
    libssh can reimplement Stream and Side to reuse MitogenProtocol

  * mitogen.core.MitogenProtocol: Guts of the old Mitogen stream
    implementtion

  * mitogen.core.BufferedWriter: Guts of the old Mitogen buffered
    transmit implementation, made generic

  * mitogen.core.DelineatedProtocol: Guts of the old IoLogger, knows how
    to split up input and pass it on to a
    on_line_received()/on_partial_line_received() callback.

  * mitogen.parent.BootstrapProtocol: Asynchronous equivalent of the old
    blocking connect code. Waits for various prompts (MITO001 etc) and
    writes the bootstrap using a BufferedWriter. On success, switches
    the stream to MitogenProtocol.

  * mitogen.core.Message: move encoding parts of MitogenProtocol out to
    Message (where it belongs) and write a bunch of new tests for
    pickling.

  * The bizarre Stream.construct() is gone now, Option.__init__ is its
    own constructor. Should fix many LGTM errors.

* Update all connection methods:  Every connection method is updated to
  use async logic, defining protocols as required to handle interactive
  prompts like in SSH or su. Add new real integration tests for at least
  doas and su.

* Eliminate manual fd management: File descriptors are trapped in file
  objects at their point of origin, and Side is updated to use file
  objects rather than raw descriptors. This eliminates a whole class of
  bugs where unrelated FDs could be closed by the wrong component. Now
  an FD's open/closed status is fused to it everywhere in the library.

* Halve file descriptor usage: now FD open/close state is tracked by
  its file object, we don't need to duplicate FDs everywhere so that
  receive/transmit side can be closed independently. Instead both sides
  back on to the same file object. Closes #26, Closes #470.

* Remove most uses of dup/dup2: Closes #256. File descriptors are
  trapped in a common file object and shared among classes. The
  remaining few uses for dup/dup2 are as close to minimal as possible.

* Introduce mitogen.parent.Process: uniform interface for subprocesses
  created either via mitogen.fork or the subprocess module. Remove all
  the crap where we steal a pid from subprocess guts. Now we use
  subprocess to manage its processes as it should be. Closes #169 by
  using the new Timers facility to poll for a slow-to-exit subprocess.

* Fix su password race: Closes #363. DelineatedProtocol naturally
  retries partially received lines, preventing the cause of the original
  race.

* Delete old blocking IO utility functions
  iter_read()/write_all()/discard_until().

Closes #26
Closes #147
Closes #169
Closes #256
Closes #363
Closes #419
Closes #470
5 years ago
David Wilson 37beb3a5c5 core: teach iter_split() to break on callback returning False. 5 years ago
David Wilson 33ecc8a5d2 issue #507: log fatal errors to syslog.
Next round should log entire exception text, but this is useful enough
already.
5 years ago
David Wilson 46ebd56c7a core/master: docstring, repr, and debug log message cleanups
Debug output is vastly more readable now.
5 years ago
David Wilson 0f7bbcece9 parent: remove unused Timer parameter. 5 years ago
David Wilson f66611dc83 parent: docstring improvements, cfmakeraw() regression. 5 years ago
David Wilson c7ebb39ad4 core: introduce Protocol, DelimitedProtocol and BufferedWriter.
They aren't wired in yet as of this commit, and continue duplicating
other code.
5 years ago
David Wilson d368971749 core: introduce mitogen.core.pipe()
It's used in later commit. This is an os.pipe() wrapper that traps the
file descriptors in a file object, to ensure leaked objects will
eventually be collected, and a central place exists to track open/closed
status.
5 years ago
David Wilson d1f5e0663d core: move message encoding to Message.pack(), add+refactor tests.
The old inline pack is still present in the old location but will be
removed in a followup commit.
5 years ago
David Wilson 3f1ef6e243 master: expect forwarded logs to be in UTF-8.
latin1 was causing corruption of internationalized messages.
5 years ago
David Wilson aba396d4c5 core: bootstrap FD management improvements
- open fd 0/1/2 with correct file mode
- trap reserve_tty_fd in a file object, since all other FD manual FD
  management is going away.
5 years ago
David Wilson 4f0a946f30 core: pending timers should keep broker alive. 5 years ago
David Wilson 237a3babaf core: more succinct iter_split(). 5 years ago
David Wilson dfefc4c05c core: replace UTF8_CODEC with encodings.utf_8.encode() function. 5 years ago
David Wilson 3f90030c1e core: docstring style cleanups, dead code. 5 years ago
David Wilson 70ff4b674c parent: discard cancelled events in TimerList.get_timeout().
Otherwise get_timeout() keeps broker alive via keep_alive() for a
cancelled timer during shutdown.
5 years ago
David Wilson 5aca9d6c3f core: split out iter_split() for use in parent.py. 5 years ago
David Wilson f43f886c37 parent: various style cleanups, remove unused function. 5 years ago
David Wilson 3aded0ca95 issue #170: add TimerList docstrings. 5 years ago
David Wilson a5536c3514 core: eliminate some quadratric behaviour from IoLogger
This is the same problem as used to afflict Stream: large input buffers
containing many small messages cause intense string copying. This
eliminates the worst of it.
5 years ago
David Wilson 2fbc77a155 issue #170: implement timers. 5 years ago
Jordan Webb 1a02a86331
Add buildah transport 5 years ago
David Wilson 874e75276f issue #589: ensure real FileService/PushFileService are in the docs 5 years ago
David Wilson ed8acb5153 master: sysconfig did not exist until 2.7. 5 years ago
David Wilson 5eb10aacef master: fix _is_stdlib_path() failure on Ubuntu. 5 years ago
David Wilson 72ab917c89 issue #590: add FinderMethod docstrings. 5 years ago
David Wilson 875ff5c060 issue #590: refactor ModuleFinder and teach it a new special case.
Now it's possible to find both packages and modules when the
sys.modules[...] state for the package/module is junk. Previously only
modules were possible.

This also refactors things to make writing better tests for all these
cases much simpler.
5 years ago
David Wilson 8f940e2ccb issue #590: teach importer to handle self-replacing modules 5 years ago
David Wilson ba2c65d5ef Bump version for release. 5 years ago
David Wilson ee62c57c9d issue #576: fix Kwargs minor version check.
Unicode kwargs were introduced in Python 2.6.5, not 2.6.0.
5 years ago
David Wilson c255a8b562 Bump version for release. 5 years ago
David Wilson bd4d55dc90 issue #550: parent: add explanatory comment. 5 years ago
David Wilson 87c8ab4323 issue #550: fix up TTY ioctls on WSL 2016 Anniversary Update 5 years ago
David Wilson ae8ba24f59 service: make service list optional.
Used by the new work.
5 years ago
David Wilson d51e70636d os_fork: more doc tweaks 5 years ago
David Wilson 7763549653 os_fork: more doc tweaks 5 years ago
David Wilson add357a029 os_fork: yet more doc tidyup 5 years ago
David Wilson 0a66ca72ef os_fork: more doc tweaks 5 years ago
David Wilson 5dc0bd6f8d os_fork: clean up docs 5 years ago
David Wilson c413d53144 os_fork: python 3 fixes and tests. 5 years ago
David Wilson 18b984a0b4 issue #535: activate Corker on 2.4 in master too. 5 years ago
David Wilson 06e52ca89f issue #535: wire mitogen.os_fork into Broker and Pool. 5 years ago
David Wilson c1d73e1f4f issue #535: parent: add create_socketpair(size=..) parameter. 5 years ago
David Wilson 63f4864b21 issue #535: introduce mitogen.os_fork module and Corker class. 5 years ago
David Wilson 514d35fd10 issue #535: service: support Pool.defer() like Broker.defer() 5 years ago
David Wilson eb9ec26622 issue #535: core: unicode.encode() may take importer lock on 2.x
Found on Python 2.4, where import happens immediately following connect.

- Main thread executes import statement, triggers request to parent
- Broker thread attempts to deliver request via Router
- Router discovers parent has disconnected, prepares a dead message
- .dead() calls unicode.encode() to format reason string
- .encode() attemptsto import a codec module
- deadlock

----

(gdb) pystack
/usr/local/python2.4.6/lib/python2.4/encodings/__init__.py (69): search_function
<stdin> (733): dead
<stdin> (2717): _maybe_send_dead
<stdin> (2724): _invoke
<stdin> (2749): _async_route
<stdin> (1635): _receive_one
<stdin> (1603): _internal_receive
<stdin> (1613): on_receive
<stdin> (2931): _call
<stdin> (2942): _loop_once
<stdin> (2988): _do_broker_main
<stdin> (545): _profile_hook
<stdin> (3007): _broker_main
/usr/local/python2.4.6/lib/python2.4/threading.py (420): run
/usr/local/python2.4.6/lib/python2.4/threading.py (424): __bootstrap
5 years ago
David Wilson 72862f0bb9 issue #535: docs: fix up Select doc 5 years ago
David Wilson b3f592acee issue #535: core/select: support selecting from Latches. 5 years ago
David Wilson 7d0480e8bd core: increase cookie field lengths to 64-bit; closes #545. 5 years ago
David Wilson ca63c26e01 core: Make Latch.put(obj=) optional. 5 years ago
David Wilson 9bcd2ec56c issue #542: return of select poller, new selection logic 5 years ago
David Wilson e010667230 Bump version for release. 5 years ago
David Wilson 2a8567b432 core: serialize calls to _service_stub_main().
See comment.
5 years ago
David Wilson d4c0250083 issue #532: PushFileService race.
There has always been a race in PushFileService since given a parent
asked to forward modules to two children via some intermediary:

    interm = router.local()
    c1 = router.local(via=interm)
    c2 = router.local(via=interm)

    service.propagate_to(c1, 'foo/bar.py')
    service.propagate_to(c2, 'foo/bar.py')

Two calls will be emitted to 'interm':

    PushFileService.store_and_forward(c1, 'foo/bar.py', [blob])
    PushFileService.store(c2, 'foo/bar.py')

Which will be processed in-order up to the point where service pool
threads in 'interm' are woken to process the message.

While it is guaranteed store_and_forward() will be processed first, no
guarantee existed that its assigned pool thread would wake and take
_lock first, thus it was possible for forward() to win the race, and for
a request to arrive to forward a file that had not been placed in local
cache yet.

Here we get rid of SerializedInvoker entirely, as it is partially to
blame for hiding the race: SerializedInvoker can only ensure no two
messages are processed simultaneously, it cannot ensure the messages are
processed in their intended order.

Instead, teach forward() that it may be called before
store_and_forward(), and if that is the case, to place the forward
request on to _waiters alongside any local threads blocked in get().
5 years ago
David Wilson 1f77d24bec Update copyright year everywhere. 5 years ago
David Wilson 7ff4e6694c issue #536: rework how 2.3-compatible simplejson is served
Regardless of the version of simplejson loaded in the master, load up
the ModuleResponder cache with our 2.4-compatible version.

To cope with simplejson being loaded due to modules like ec2_group that
try to import it before importing 'json', also update target.py to
remove it from the whitelist if a local 'json' module import succeeds.
5 years ago
David Wilson fa0c25bb2d Bump version for release. 5 years ago
David Wilson 78ec634dab issue #481: core: preserve stderr TTY FD if one is present.
Since 802de6a8d5, sudo on CentOS 5 had
begun failing due to a TTY FD leak in the parent process being fixed.

The old versions of sudo doesn't hang around after starting a child --
they exec the privilege-escalated child process on top of themselves,
meaning no spare copy of the TTY FD is kept alive by sudo.

When the child starts up, it replaces stdio with IoLoggers, including
the inherited stderr FD connected to DiagLogStream/the slave PTY. When
the last process closes a slave PTY, the kernel sends SIGHUP to any
processes still having it as the controlling TTY.

Therefore we must either ignore SIGHUP until the first stage has been
waited on (since the first stage also preserve the FD), or dup the
inherited TTY FD and keep it around forever.

Wasting one FD seems less annoying than modifying process signals for
all potential library users, so that is the approach taken here.
5 years ago
David Wilson b263e01867 issue #481: avoid crash if disconnect occurs during forward_modules() 5 years ago
David Wilson 4abd34e7a6 issue #520: add AIX auth failure string to su. 5 years ago
David Wilson 5ae7464011 core: cProfile is not available in 2.4. 5 years ago
David Wilson 54835d4c9b service: fix PushFileService exception
[costapp]

ERROR! [pid 25135] 21:10:56.284733 E mitogen.ctx.ssh.35.200.203.48: mitogen: While calling no-reply method PushFileService.forward
Traceback (most recent call last):
  File "master:/home/dmw/src/mitogen/mitogen/service.py", line 260, in _invoke
    ret = method(**kwargs)
  File "master:/home/dmw/src/mitogen/mitogen/service.py", line 718, in forward
    self._forward(path, context)
  File "master:/home/dmw/src/mitogen/mitogen/service.py", line 633, in _forward
    stream = self.router.stream_by_id(context.context_id)
AttributeError: 'unicode' object has no attribute 'context_id'
^C [ERROR]: User interrupted execution
5 years ago
David Wilson 23d7a961e7 service: start pool shutdown on broker shutdown. 5 years ago
David Wilson 85cfa3b0f5 master: .encode() needed for Py3. 5 years ago
David Wilson 1d509d03ff issue #508: master: minify_safe_re must be bytes for Py3. 5 years ago
David Wilson 0e193c223c issue #508: master: minify all Mitogen/ansible_mitogen sources.
Minify-safe files are marked with a magical "# !mitogen: minify_safe"
comment anywhere in the file, which activates the minifier. The result
is naturally cached by ModuleResponder, therefore lru_cache is gone too.

Given:

    import os, mitogen
    @mitogen.main()
    def main(router):
        c = router.ssh(hostname='k3')
        c.call(os.getpid)
        router.sudo(via=c)

SSH footprint drops from 56.2 KiB to 42.75 KiB (-23.9%)
Ansible "shell: hostname" drops 149.26 KiB to 117.42 KiB (-21.3%)
5 years ago
David Wilson cfb94e463f parent: PartialZlib docstrings. 5 years ago
David Wilson 9adc38d8ec parent: pre-cache bootstrap if possible.
When the interpreter is modern enough, use zlib.compressobj() to
pre-compress the unchanging parts of the bootstrap once, then use
compressobj.copy() to append just the context's config during stream
construction.

Before: 100 loops, best of 3: 5.81 msec per loop
After: 10000 loops, best of 3: 35.9 usec per loop

With 100 targets this is enough to knock 6 seconds off startup, at 500
targets it becomes half a minute.

Test 'program':
        python -m timeit -s '
                import mitogen.parent as p;
                import mitogen.master as m;
                r=m.Router();
                s=p.Stream(r, 0, max_message_size=1);
                r.broker.shutdown()'\
                \
                's.get_preamble()'
5 years ago
David Wilson d6c4a983e1 service: PushFileService never recorded a file as sent.
Ansible modules were being resent continuously - but only the main
script module, and any custom modutils if any were present.

Wire footprint drops by ~1/3rd for a 500 task run of 'shell: hostname':

-rw-r--r-- 1 root root 584K Jan 31 22:06 500mito-before2
-rw-r--r-- 1 root root 434K Jan 31 22:04 500mito-filesbugonly
5 years ago
David Wilson 7ca927608c parent: synchronize get_core_source()
Single task 100 SSH target run, before:

        3533181 function calls (3533083 primitive calls) in 616.688 seconds
        User time (seconds): 32.52
        System time (seconds): 2.71
        Percent of CPU this job got: 64%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:54.88

After:

        451602 function calls (451504 primitive calls) in 570.746 seconds
        User time (seconds): 29.48
        System time (seconds): 2.81
        Percent of CPU this job got: 67%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:48.20
5 years ago
David Wilson 2399a9e621 service: use correct profile aggregation name. 5 years ago
David Wilson c6d5aa29ba ansible: new multiplexer/workers configuration
Following on from 152effc26c9a5918cb7ead7a97fe7fa7f81b6764,

* Pin mux to CPU 0
* Pin top-level CPU 1
* Pin workers sequentially to CPU 2..n

Nets 19.5% improvement on issue_140__thread_pileup.yml when targetting
64 Docker containers on the same 8 core/16 thread machine.

Before (prior to last scheme, no affinity at all):

    2294528.731458      task-clock (msec)         #    6.443 CPUs utilized
        10,429,745      context-switches          #    0.005 M/sec
         2,049,618      cpu-migrations            #    0.893 K/sec
         8,258,952      page-faults               #    0.004 M/sec
 5,532,719,253,824      cycles                    #    2.411 GHz                      (83.35%)
 3,267,471,616,230      instructions              #    0.59  insn per cycle
                                                  #    1.22  stalled cycles per insn  (83.35%)
   662,006,455,943      branches                  #  288.515 M/sec                    (83.33%)
    39,453,895,977      branch-misses             #    5.96% of all branches          (83.37%)

     356.148064576 seconds time elapsed

After:

    2226463.958975      task-clock (msec)         #    7.784 CPUs utilized
         9,831,466      context-switches          #    0.004 M/sec
           180,065      cpu-migrations            #    0.081 K/sec
         5,082,278      page-faults               #    0.002 M/sec
 5,592,548,587,259      cycles                    #    2.512 GHz                      (83.35%)
 3,135,038,855,414      instructions              #    0.56  insn per cycle
                                                  #    1.32  stalled cycles per insn  (83.32%)
   636,397,509,232      branches                  #  285.833 M/sec                    (83.30%)
    39,135,441,790      branch-misses             #    6.15% of all branches          (83.35%)

     286.036681644 seconds time elapsed
5 years ago
David Wilson e77048ec2d utils: pad out reset_affinity() and integrate with detach_popen() 5 years ago