846 Commits (8fc491ac4309b6e18b5bf68e6569b18cbd37e040)

Author SHA1 Message Date
David Wilson 874e75276f issue #589: ensure real FileService/PushFileService are in the docs 5 years ago
David Wilson ed8acb5153 master: sysconfig did not exist until 2.7. 5 years ago
David Wilson 5eb10aacef master: fix _is_stdlib_path() failure on Ubuntu. 5 years ago
David Wilson 72ab917c89 issue #590: add FinderMethod docstrings. 5 years ago
David Wilson 875ff5c060 issue #590: refactor ModuleFinder and teach it a new special case.
Now it's possible to find both packages and modules when the
sys.modules[...] state for the package/module is junk. Previously only
modules were possible.

This also refactors things to make writing better tests for all these
cases much simpler.
5 years ago
David Wilson 8f940e2ccb issue #590: teach importer to handle self-replacing modules 5 years ago
David Wilson ba2c65d5ef Bump version for release. 5 years ago
David Wilson ee62c57c9d issue #576: fix Kwargs minor version check.
Unicode kwargs were introduced in Python 2.6.5, not 2.6.0.
5 years ago
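The version boundary named in the ee62c57c9d message above can be sketched as a plain tuple comparison. `needs_bytes_kwargs` is a hypothetical helper written for illustration, not a function taken from Mitogen; only the 2.6.5 boundary comes from the commit message.

```python
import sys


def needs_bytes_kwargs(version_info=None):
    """Return True when the interpreter predates Python 2.6.5, the release
    the commit message names as the first to accept Unicode keyword names.

    Hypothetical helper: the name and signature are assumptions.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare (major, minor, micro) so that 2.6.0 through 2.6.4 are caught,
    # which a check on (major, minor) alone would miss.
    return tuple(version_info[:3]) < (2, 6, 5)
```

Comparing three components rather than two is the point of the fix: a `(2, 6)` check would treat 2.6.0 as safe.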
David Wilson c255a8b562 Bump version for release. 5 years ago
David Wilson bd4d55dc90 issue #550: parent: add explanatory comment. 5 years ago
David Wilson 87c8ab4323 issue #550: fix up TTY ioctls on WSL 2016 Anniversary Update 5 years ago
David Wilson ae8ba24f59 service: make service list optional.
Used by the new work.
5 years ago
David Wilson d51e70636d os_fork: more doc tweaks 5 years ago
David Wilson 7763549653 os_fork: more doc tweaks 5 years ago
David Wilson add357a029 os_fork: yet more doc tidyup 5 years ago
David Wilson 0a66ca72ef os_fork: more doc tweaks 5 years ago
David Wilson 5dc0bd6f8d os_fork: clean up docs 5 years ago
David Wilson c413d53144 os_fork: python 3 fixes and tests. 5 years ago
David Wilson 18b984a0b4 issue #535: activate Corker on 2.4 in master too. 5 years ago
David Wilson 06e52ca89f issue #535: wire mitogen.os_fork into Broker and Pool. 5 years ago
David Wilson c1d73e1f4f issue #535: parent: add create_socketpair(size=..) parameter. 5 years ago
David Wilson 63f4864b21 issue #535: introduce mitogen.os_fork module and Corker class. 5 years ago
David Wilson 514d35fd10 issue #535: service: support Pool.defer() like Broker.defer() 5 years ago
David Wilson eb9ec26622 issue #535: core: unicode.encode() may take importer lock on 2.x
Found on Python 2.4, where import happens immediately following connect.

- Main thread executes import statement, triggers request to parent
- Broker thread attempts to deliver request via Router
- Router discovers parent has disconnected, prepares a dead message
- .dead() calls unicode.encode() to format reason string
- .encode() attempts to import a codec module
- deadlock

----

(gdb) pystack
/usr/local/python2.4.6/lib/python2.4/encodings/__init__.py (69): search_function
<stdin> (733): dead
<stdin> (2717): _maybe_send_dead
<stdin> (2724): _invoke
<stdin> (2749): _async_route
<stdin> (1635): _receive_one
<stdin> (1603): _internal_receive
<stdin> (1613): on_receive
<stdin> (2931): _call
<stdin> (2942): _loop_once
<stdin> (2988): _do_broker_main
<stdin> (545): _profile_hook
<stdin> (3007): _broker_main
/usr/local/python2.4.6/lib/python2.4/threading.py (420): run
/usr/local/python2.4.6/lib/python2.4/threading.py (424): __bootstrap
5 years ago
David Wilson 72862f0bb9 issue #535: docs: fix up Select doc 5 years ago
David Wilson b3f592acee issue #535: core/select: support selecting from Latches. 5 years ago
David Wilson 7d0480e8bd core: increase cookie field lengths to 64-bit; closes #545. 5 years ago
David Wilson ca63c26e01 core: Make Latch.put(obj=) optional. 5 years ago
David Wilson 9bcd2ec56c issue #542: return of select poller, new selection logic 5 years ago
David Wilson e010667230 Bump version for release. 5 years ago
David Wilson 2a8567b432 core: serialize calls to _service_stub_main().
See comment.
5 years ago
David Wilson d4c0250083 issue #532: PushFileService race.
There has always been a race in PushFileService. Consider a parent
asked to forward modules to two children via some intermediary:

    interm = router.local()
    c1 = router.local(via=interm)
    c2 = router.local(via=interm)

    service.propagate_to(c1, 'foo/bar.py')
    service.propagate_to(c2, 'foo/bar.py')

Two calls will be emitted to 'interm':

    PushFileService.store_and_forward(c1, 'foo/bar.py', [blob])
    PushFileService.forward(c2, 'foo/bar.py')

Which will be processed in-order up to the point where service pool
threads in 'interm' are woken to process the message.

While it is guaranteed store_and_forward() will be processed first, no
guarantee existed that its assigned pool thread would wake and take
_lock first, thus it was possible for forward() to win the race, and for
a request to arrive to forward a file that had not been placed in local
cache yet.

Here we get rid of SerializedInvoker entirely, as it is partially to
blame for hiding the race: SerializedInvoker can only ensure no two
messages are processed simultaneously, it cannot ensure the messages are
processed in their intended order.

Instead, teach forward() that it may be called before
store_and_forward(), and if that is the case, to place the forward
request on to _waiters alongside any local threads blocked in get().
5 years ago
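The approach in the d4c0250083 message above, where forward() tolerates arriving before store_and_forward(), might look roughly like the following. This is a minimal sketch and not the PushFileService code: `FileCache`, the `send` callback and the argument order are assumptions made for illustration; only `_lock`, `_waiters` and the two method names come from the message.

```python
import threading


class FileCache:
    """Hypothetical stand-in for the cache described in the commit message."""

    def __init__(self, send):
        self._lock = threading.Lock()
        self._cache = {}    # path -> file data already stored locally
        self._waiters = {}  # path -> callbacks waiting for that data
        self._send = send   # assumed callable(context, path, data)

    def forward(self, path, context):
        with self._lock:
            if path not in self._cache:
                # store_and_forward() has not run yet: queue the request
                # instead of failing on a missing cache entry.
                self._waiters.setdefault(path, []).append(
                    lambda data: self._send(context, path, data))
                return
            data = self._cache[path]
        self._send(context, path, data)

    def store_and_forward(self, path, data, context):
        with self._lock:
            self._cache[path] = data
            waiters = self._waiters.pop(path, [])
        self._send(context, path, data)
        # Release every request that arrived before the data did.
        for waiter in waiters:
            waiter(data)
```

With this shape, the result is the same whichever pool thread takes the lock first, so no ordering guarantee between the two messages is required.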
David Wilson 1f77d24bec Update copyright year everywhere. 5 years ago
David Wilson 7ff4e6694c issue #536: rework how 2.3-compatible simplejson is served
Regardless of the version of simplejson loaded in the master, load up
the ModuleResponder cache with our 2.4-compatible version.

To cope with simplejson being loaded due to modules like ec2_group that
try to import it before importing 'json', also update target.py to
remove it from the whitelist if a local 'json' module import succeeds.
5 years ago
David Wilson fa0c25bb2d Bump version for release. 5 years ago
David Wilson 78ec634dab issue #481: core: preserve stderr TTY FD if one is present.
Since 802de6a8d5, sudo on CentOS 5 has been failing because a TTY FD
leak in the parent process was fixed.

Old versions of sudo don't hang around after starting a child --
they exec the privilege-escalated child process on top of themselves,
meaning no spare copy of the TTY FD is kept alive by sudo.

When the child starts up, it replaces stdio with IoLoggers, including
the inherited stderr FD connected to DiagLogStream/the slave PTY. When
the last process closes a slave PTY, the kernel sends SIGHUP to any
processes still having it as the controlling TTY.

Therefore we must either ignore SIGHUP until the first stage has been
waited on (since the first stage also preserves the FD), or dup the
inherited TTY FD and keep it around forever.

Wasting one FD seems less annoying than modifying process signals for
all potential library users, so that is the approach taken here.
5 years ago
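The second option from the 78ec634dab message above, duplicating the inherited TTY FD and keeping the copy alive, can be sketched with `os.isatty()` and `os.dup()`. `preserve_tty_fd` and the module-level variable are hypothetical names, and the real change may differ in where and when it performs the dup.

```python
import os

# Assumption: holding the duplicate in a module-level variable is enough to
# keep it open for the life of the process.
_preserved_tty_fd = None


def preserve_tty_fd(fd=2):
    """Duplicate fd if it refers to a TTY; return the duplicate or None."""
    global _preserved_tty_fd
    try:
        if os.isatty(fd):
            # The extra descriptor means the slave PTY still has an open
            # reference after stdio is replaced, so no SIGHUP is triggered
            # by the last close.
            _preserved_tty_fd = os.dup(fd)
            return _preserved_tty_fd
    except OSError:
        pass
    return None
```

The cost is one descriptor per process, which matches the trade-off the message describes.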
David Wilson b263e01867 issue #481: avoid crash if disconnect occurs during forward_modules() 5 years ago
David Wilson 4abd34e7a6 issue #520: add AIX auth failure string to su. 5 years ago
David Wilson 5ae7464011 core: cProfile is not available in 2.4. 5 years ago
David Wilson 54835d4c9b service: fix PushFileService exception
[costapp]

ERROR! [pid 25135] 21:10:56.284733 E mitogen.ctx.ssh.35.200.203.48: mitogen: While calling no-reply method PushFileService.forward
Traceback (most recent call last):
  File "master:/home/dmw/src/mitogen/mitogen/service.py", line 260, in _invoke
    ret = method(**kwargs)
  File "master:/home/dmw/src/mitogen/mitogen/service.py", line 718, in forward
    self._forward(path, context)
  File "master:/home/dmw/src/mitogen/mitogen/service.py", line 633, in _forward
    stream = self.router.stream_by_id(context.context_id)
AttributeError: 'unicode' object has no attribute 'context_id'
^C [ERROR]: User interrupted execution
5 years ago
David Wilson 23d7a961e7 service: start pool shutdown on broker shutdown. 5 years ago
David Wilson 85cfa3b0f5 master: .encode() needed for Py3. 5 years ago
David Wilson 1d509d03ff issue #508: master: minify_safe_re must be bytes for Py3. 5 years ago
David Wilson 0e193c223c issue #508: master: minify all Mitogen/ansible_mitogen sources.
Minify-safe files are marked with a magical "# !mitogen: minify_safe"
comment anywhere in the file, which activates the minifier. The result
is naturally cached by ModuleResponder, therefore lru_cache is gone too.

Given:

    import os, mitogen
    @mitogen.main()
    def main(router):
        c = router.ssh(hostname='k3')
        c.call(os.getpid)
        router.sudo(via=c)

SSH footprint drops from 56.2 KiB to 42.75 KiB (-23.9%)
Ansible "shell: hostname" drops 149.26 KiB to 117.42 KiB (-21.3%)
5 years ago
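Detecting the marker comment from the 0e193c223c message above could be done with a bytes regular expression, which also fits the Python 3 note in 1d509d03ff. This is a sketch under assumptions: the exact pattern and the `is_minify_safe` helper are illustrative, and only the marker text `# !mitogen: minify_safe` comes from the message.

```python
import re

# A bytes pattern, so it can be matched against source read in binary mode
# on Python 3. The whitespace tolerance here is an assumption.
MINIFY_SAFE_RE = re.compile(rb'#\s*!mitogen:\s*minify_safe')


def is_minify_safe(source):
    """Return True if the marker comment appears anywhere in source (bytes)."""
    return MINIFY_SAFE_RE.search(source) is not None
```

For example, `is_minify_safe(b"# !mitogen: minify_safe\nimport os\n")` matches, while a file without the comment would be served unminified.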
David Wilson cfb94e463f parent: PartialZlib docstrings. 5 years ago
David Wilson 9adc38d8ec parent: pre-cache bootstrap if possible.
When the interpreter is modern enough, use zlib.compressobj() to
pre-compress the unchanging parts of the bootstrap once, then use
compressobj.copy() to append just the context's config during stream
construction.

Before: 100 loops, best of 3: 5.81 msec per loop
After: 10000 loops, best of 3: 35.9 usec per loop

With 100 targets this is enough to knock 6 seconds off startup, at 500
targets it becomes half a minute.

Test 'program':
        python -m timeit -s '
                import mitogen.parent as p;
                import mitogen.master as m;
                r=m.Router();
                s=p.Stream(r, 0, max_message_size=1);
                r.broker.shutdown()'\
                \
                's.get_preamble()'
5 years ago
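The pre-compression technique from the 9adc38d8ec message above relies on `zlib.compressobj()` and the compressor's `copy()` method. The sketch below shows the idea in isolation; `PrefixCompressor` is a hypothetical name, and the real PartialZlib class may handle older interpreters and flushing differently.

```python
import zlib


class PrefixCompressor:
    """Compress a fixed prefix once, then cheaply append varying suffixes."""

    def __init__(self, prefix):
        self._compressor = zlib.compressobj(9)
        # Output produced so far; the rest stays buffered in the compressor.
        self._out = self._compressor.compress(prefix)

    def append(self, suffix):
        # copy() clones the compressor state, so the shared prefix state is
        # never consumed and each call starts from the same point.
        compressor = self._compressor.copy()
        return self._out + compressor.compress(suffix) + compressor.flush()
```

Each `append()` returns a complete zlib stream that decompresses to the prefix followed by the suffix, while only the suffix is compressed per call.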
David Wilson d6c4a983e1 service: PushFileService never recorded a file as sent.
Ansible modules were being resent continuously - but only the main
script module, and any custom modutils if any were present.

Wire footprint drops by ~1/3rd for a 500 task run of 'shell: hostname':

-rw-r--r-- 1 root root 584K Jan 31 22:06 500mito-before2
-rw-r--r-- 1 root root 434K Jan 31 22:04 500mito-filesbugonly
5 years ago
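The bookkeeping that the d6c4a983e1 message above says was missing can be sketched as a set of pairs already pushed. `SentTracker` and its method are hypothetical and only illustrate the idea of recording a file as sent; the real service stores this state differently.

```python
class SentTracker:
    """Remember which (context, path) pairs have already been pushed."""

    def __init__(self):
        self._sent = set()

    def should_send(self, context_id, path):
        """Return True the first time a pair is seen, False afterwards."""
        key = (context_id, path)
        if key in self._sent:
            return False
        # Recording the pair is the step whose absence caused the resends.
        self._sent.add(key)
        return True
```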
David Wilson 7ca927608c parent: synchronize get_core_source()
Single task 100 SSH target run, before:

        3533181 function calls (3533083 primitive calls) in 616.688 seconds
        User time (seconds): 32.52
        System time (seconds): 2.71
        Percent of CPU this job got: 64%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:54.88

After:

        451602 function calls (451504 primitive calls) in 570.746 seconds
        User time (seconds): 29.48
        System time (seconds): 2.81
        Percent of CPU this job got: 67%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:48.20
5 years ago
David Wilson 2399a9e621 service: use correct profile aggregation name. 5 years ago
David Wilson c6d5aa29ba ansible: new multiplexer/workers configuration
Following on from 152effc26c9a5918cb7ead7a97fe7fa7f81b6764,

* Pin mux to CPU 0
* Pin top-level to CPU 1
* Pin workers sequentially to CPU 2..n

Nets 19.5% improvement on issue_140__thread_pileup.yml when targeting
64 Docker containers on the same 8 core/16 thread machine.

Before (prior to last scheme, no affinity at all):

    2294528.731458      task-clock (msec)         #    6.443 CPUs utilized
        10,429,745      context-switches          #    0.005 M/sec
         2,049,618      cpu-migrations            #    0.893 K/sec
         8,258,952      page-faults               #    0.004 M/sec
 5,532,719,253,824      cycles                    #    2.411 GHz                      (83.35%)
 3,267,471,616,230      instructions              #    0.59  insn per cycle
                                                  #    1.22  stalled cycles per insn  (83.35%)
   662,006,455,943      branches                  #  288.515 M/sec                    (83.33%)
    39,453,895,977      branch-misses             #    5.96% of all branches          (83.37%)

     356.148064576 seconds time elapsed

After:

    2226463.958975      task-clock (msec)         #    7.784 CPUs utilized
         9,831,466      context-switches          #    0.004 M/sec
           180,065      cpu-migrations            #    0.081 K/sec
         5,082,278      page-faults               #    0.002 M/sec
 5,592,548,587,259      cycles                    #    2.512 GHz                      (83.35%)
 3,135,038,855,414      instructions              #    0.56  insn per cycle
                                                  #    1.32  stalled cycles per insn  (83.32%)
   636,397,509,232      branches                  #  285.833 M/sec                    (83.30%)
    39,135,441,790      branch-misses             #    6.15% of all branches          (83.35%)

     286.036681644 seconds time elapsed
5 years ago
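The pinning scheme from the c6d5aa29ba message above can be sketched as a pure function that assigns CPUs, plus a guarded call to `os.sched_setaffinity()`, which exists only on some platforms such as Linux. The function names are hypothetical, and wrapping around when there are more workers than spare CPUs is an assumption, since the message only says workers are pinned sequentially to CPU 2..n.

```python
import os


def plan_cpu_pinning(cpu_count, worker_count):
    """Return the CPU assigned to the mux, the top-level process and workers."""
    if cpu_count < 3:
        raise ValueError('scheme needs at least 3 CPUs')
    spare = cpu_count - 2
    return {
        'mux': 0,
        'top': 1,
        # Assumption: wrap back to CPU 2 once the last CPU has been used.
        'workers': [2 + (i % spare) for i in range(worker_count)],
    }


def pin_current_process(cpu):
    """Pin the calling process to one CPU where the platform supports it."""
    if hasattr(os, 'sched_setaffinity'):
        os.sched_setaffinity(0, {cpu})
```

For instance, `plan_cpu_pinning(4, 3)` keeps CPUs 0 and 1 reserved and spreads the three workers over CPUs 2 and 3.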