517 Commits (a3b4b459fa578be674867174e53793233c4af091)
 

Author SHA1 Message Date
David Wilson a3b4b459fa issue #139: eliminate quadratic behaviour on input path
Rather than slowly build up a Python string over time, we just store a
deque of chunks (which, after a later commit, will be around 128KB
each), and track the total buffer size in a separate integer.

The tricky loop is there to ensure the header does not need to be sliced
off the full message (which may be huge, causing yet another spike and
copy), but rather only off the much smaller first 128KB-sized chunk
received.

There is one more problem with this code: the ''.join() causes RAM usage
to temporarily double, but that was true of the old solution too. Shall
wait for bug reports before fixing this, as it gets very ugly very fast.
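
A minimal sketch of the buffering scheme described above, not the code from this commit. The class name `ChunkBuffer`, its method names, and the 4-byte big-endian length header are assumptions made purely for illustration.

```python
import collections
import struct

HEADER = struct.Struct('>L')  # assumed header: 4-byte big-endian body length


class ChunkBuffer(object):
    """Accumulate received chunks without repeatedly growing one string."""

    def __init__(self):
        self.chunks = collections.deque()
        self.size = 0  # total bytes buffered, tracked in a separate integer

    def feed(self, chunk):
        # O(1) append; previously received data is never copied.
        self.chunks.append(chunk)
        self.size += len(chunk)

    def pop_message(self):
        """Return one complete message body, or None if more data is needed."""
        # Merge leading chunks only until the header is contiguous.
        while len(self.chunks) > 1 and len(self.chunks[0]) < HEADER.size:
            first = self.chunks.popleft()
            self.chunks[0] = first + self.chunks[0]
        if self.size < HEADER.size:
            return None
        (body_len,) = HEADER.unpack_from(self.chunks[0])
        if self.size < HEADER.size + body_len:
            return None
        # Slice the header off the first chunk only, never the full message.
        self.chunks[0] = self.chunks[0][HEADER.size:]
        self.size -= HEADER.size
        parts, remaining = [], body_len
        while remaining:
            chunk = self.chunks.popleft()
            if len(chunk) > remaining:
                # The tail belongs to the next message; put it back.
                self.chunks.appendleft(chunk[remaining:])
                chunk = chunk[:remaining]
            parts.append(chunk)
            remaining -= len(chunk)
        self.size -= body_len
        # The join still copies the body once, as noted above.
        return b''.join(parts)
```

Feeding a stream in arbitrary slices and calling `pop_message()` until it returns None yields each body in order; only the final join touches the whole message.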
7 years ago
David Wilson ba9a06d0f5 issue #139: core: Side.write(): let the OS write as much as possible.
There is no penalty for just passing as much data to the OS as possible,
it is not copied, and for a non-blocking socket, the OS will just
buffer as much as it can and tell us how much that was.

Also avoids a rather pointless string slice.
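
The behaviour relied on here can be sketched as follows. This is an illustration under assumptions, not the body of Side.write(): the helper name `write_some` is hypothetical, and a socketpair stands in for the real stream.

```python
import errno
import os
import socket


def write_some(fd, buf):
    """Offer the whole buffer to the OS; return the bytes it did not take."""
    try:
        written = os.write(fd, buf)  # no pre-slicing to a chunk size
    except OSError as e:
        if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return buf  # kernel buffer is full; retry once the fd is writable
        raise
    return buf[written:]


if __name__ == '__main__':
    left, right = socket.socketpair()
    left.setblocking(False)
    pending = write_some(left.fileno(), b'x' * (8 * 1024 * 1024))
    print('bytes still pending:', len(pending))
    left.close()
    right.close()
```

On a non-blocking descriptor the write returns as soon as the kernel buffer is full, so the caller only has to remember the unwritten remainder.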
7 years ago
David Wilson 49db4125d0 issue #139: core: bump CHUNK_SIZE from 16KB to 128KB
Reduces the number of IO loop iterations required to receive large
messages at a small cost to RAM usage.

Note that when calling read() with a large buffer value like this,
Python must zero-allocate that much RAM. In other words, for even a
single byte received, 128KB of RAM might need to be written.
Consequently CHUNK_SIZE is quite a sensitive value and this might need
further tuning.
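
As a rough illustration of the trade-off, the best-case number of read() calls can be computed directly. The 100 MiB message size and the helper name `reads_needed` are assumptions; real reads often return less than a full buffer.

```python
def reads_needed(message_size, chunk_size):
    """Best-case number of read() calls: every read fills its buffer."""
    return -(-message_size // chunk_size)  # ceiling division


MESSAGE_SIZE = 100 * 1024 * 1024  # a hypothetical 100 MiB message

print(reads_needed(MESSAGE_SIZE, 16 * 1024))   # old CHUNK_SIZE
print(reads_needed(MESSAGE_SIZE, 128 * 1024))  # new CHUNK_SIZE
```

For this message size the count drops from 6400 to 800 iterations, at the price of a 128KB allocation per read.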
7 years ago
David Wilson 8e2b07a54e issue #139: add profiling=True option to mitogen.main(). 7 years ago
David Wilson 017e8105cf issue #131: disable non-blocking IO during UNIX accept()
accept() (per interface) returns a non-blocking socket because the
listener socket is in non-blocking mode, therefore it is pure scheduling
luck that a connecting-in child has a chance to write anything for the
top-level process to read during the subsequent .recv().

A higher forks setting in ansible.cfg was enough to cause our luck to
run out, causing the .recv() to crash with EAGAIN, and the multiplexer
to respond to the handler's crash by calling its disconnect method. This
is why some reports mentioned ECONNREFUSED -- the listener really was
gone, because its Stream class had crashed.

Meanwhile since the window where we're waiting for the remote process to
identify itself is tiny, simply flip off O_NONBLOCK for the duration of
the connection handshake. Stream.accept() (via Side.__init__) will
re-enable O_NONBLOCK for the descriptors it duplicates, so we don't even
need to bother turning it back on.

A better solution entails splitting Stream up into a state machine and
doing the handshake with non-blocking IO, but that isn't going to be
available until asynchronous connect is implemented. Meanwhile in
reality this solution is probably 100% fine.
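
A small sketch of the failure and the workaround, using a socketpair in place of the real listener. The helper name `read_handshake` and the explicit timeout are assumptions; unlike the commit, this sketch restores non-blocking mode itself rather than relying on Stream.accept().

```python
import socket
import threading


def read_handshake(sock, size, timeout=5.0):
    """Read exactly `size` bytes with blocking IO, then go non-blocking."""
    sock.settimeout(timeout)  # blocking mode, bounded by a timeout
    try:
        data = b''
        while len(data) < size:
            part = sock.recv(size - len(data))
            if not part:
                raise EOFError('peer disconnected during handshake')
            data += part
        return data
    finally:
        sock.setblocking(False)  # back to non-blocking for the IO loop


if __name__ == '__main__':
    parent, child = socket.socketpair()
    parent.setblocking(False)
    try:
        parent.recv(4)  # the child has not written yet
    except BlockingIOError as e:
        print('non-blocking recv failed:', e)  # EAGAIN, as in the report
    threading.Timer(0.1, child.sendall, args=(b'1234',)).start()
    print(read_handshake(parent, 4))
```

The first recv() fails only because the peer has not been scheduled yet, which is the "luck" described above; the blocking read simply waits for it.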
7 years ago
David Wilson 44d36eccba issue #146: don't crash during on_broker_shutdown
There is some insane unidentifiable Mitogen context (the local context?)
that instantly crashes with a higher forks setting. It appears to be
harmless, but meanwhile this naturally shouldn't be happening.
7 years ago
David Wilson cb620500d1 issue #131: log stack and PPID with MITOGEN_ROUTER_DEBUG=1 7 years ago
David Wilson 0f5a31fb52 issue #131: test with forks=50 7 years ago
David Wilson cd455e8c58 ansible: minor tidy up 7 years ago
David Wilson d1888f1908 docs: reorder sections 7 years ago
David Wilson 3e40b9ab8e issue #131: import something clean that might tickle the problem 7 years ago
David Wilson 014247ce66 docs: another crazy Ansible success story 7 years ago
David Wilson 87435bf45d issue #140: nicer filetree construction 7 years ago
David Wilson 3584084be6 issue #140: explicit Broker management, and guard against crap plug-ins.
Implement Connection.__del__, which is almost certainly going to trigger
more bugs down the line, because the state of the Connection instance is
not guaranteed during __del__. Meanwhile, it is temporarily needed for
deployed-today Ansibles that have a buggy synchronize action that does
not call Connection.close().

A better approach to this would be to virtualize the guts of Connection,
and move its management to one central place where we can guarantee
resource destruction happens reliably, but that may entail another
Ansible monkey-patch to give us such a reliable hook.
7 years ago
David Wilson 83c8412474 issue #140: permit mitogen.unix.connect() to accept preconstructed Broker.
Part of an effort to make resource management a little more explicit.
7 years ago
David Wilson 65df36895e issue #140: prevent duplicate watcher thread creation
When a Broker() is running with install_watcher=True, arrange for only
one watcher thread to exist for each target thread, and to reset the
mapping of watchers to targets after process fork.

This is probably the last change I want to make to the watcher feature
before deciding to rip it out; it may be more trouble than it is worth.
7 years ago
David Wilson 1b93a4f51a issue #141: remove reference to incomplete change 7 years ago
Alex Willmer e3b700b553 tests: Fix no such option -o running FakeSsh.test_okay()
Full output of failed test

```
ERROR: test_okay (__main__.FakeSshTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/ssh_test.py", line 16, in test_okay
    ssh_path=testlib.data_path('fakessh.py'),
  File "/home/alex/src/mitogen/mitogen/master.py", line 650, in ssh
    return self.connect('ssh', **kwargs)
  File "/home/alex/src/mitogen/mitogen/parent.py", line 463, in connect
    return self._connect(context_id, klass, name=name, **kwargs)
  File "/home/alex/src/mitogen/mitogen/parent.py", line 449, in _connect
    stream.connect()
  File "/home/alex/src/mitogen/mitogen/ssh.py", line 104, in connect
    super(Stream, self).connect()
  File "/home/alex/src/mitogen/mitogen/parent.py", line 395, in connect
    self._connect_bootstrap()
  File "/home/alex/src/mitogen/mitogen/ssh.py", line 116, in _connect_bootstrap
    time.time() + 10.0):
  File "/home/alex/src/mitogen/mitogen/parent.py", line 207, in iter_read
    (''.join(bits)[-300:],)
mitogen.core.StreamError: EOF on stream; last 300 bytes received:
'Usage: fakessh.py [options]\n\nfakessh.py: error: no such option: -o\n'
```
7 years ago
Alex Willmer 7063d172e9 tests: Add Tox config for Python 2.6 and 2.7
I could not get Python 2.5 or earlier to work. Too many packages
(critically docker) don't support it.
7 years ago
Alex Willmer 332e3ec5d0 setup: Scan project dir to find packages
This eliminates the possibility of the filesystem and setup.py
diverging, as had happened with ansible_mitogen/connection/ vs
ansible_mitogen/connection.py
7 years ago
David Wilson 88c198ea05 issue #141: copy Ansible's connect_timeout for sudo too. 7 years ago
David Wilson 63c3fc623c docs: note the semantic difference in Mitogen vs. Ansible timeouts
Related to issue #141.
7 years ago
David Wilson 587256bbce issue #141: unify connect deadline handling
Now there is a single deadline calculated by the parent.Stream
constructor, and reused for both SSH and sudo.
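
The idea of a single shared deadline can be sketched like this. `ConnectDeadline` and its injectable `clock` argument are hypothetical names for illustration; the actual change stores the deadline on parent.Stream.

```python
import time


class ConnectDeadline(object):
    """One absolute deadline, computed once and shared by every step."""

    def __init__(self, timeout, clock=time.time):
        self._clock = clock
        self.expires_at = clock() + timeout

    def remaining(self):
        """Seconds left before the deadline, never negative."""
        return max(0.0, self.expires_at - self._clock())

    def expired(self):
        return self._clock() >= self.expires_at
```

Each connection step (for example the SSH and sudo prompts) would then wait for at most `remaining()` seconds rather than starting its own `time.time() + N` countdown.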
7 years ago
David Wilson d58b5ad777 core: prevent creation of unicode Message.data
Was triggering a crash indirectly due to Ansible passing us Unicode
strings. Needs a better fix.
7 years ago
David Wilson 31065ffe4a issue #143: avoid long-form options in sudo.py. 7 years ago
David Wilson 21a8026a63 issue #140: import reproduction 7 years ago
David Wilson 8f85943083 issue #139: mention related buffering issue 7 years ago
David Wilson 1f1d691a28 docs: update to match @moreati's code golf birdies :) 7 years ago
Alex Willmer f95b37429f parent: Read preamble in first stage with os.fdopen()
SSH command size: 439 (+4 bytes)
Preamble size: 8941 (no change)

This _increases_ the size of the first stage, but
- Eliminates one of the two remaining uses of `sys`
- Reads the preamble as a byte-string, so no call to `.encode()`
   is needed on Python 3 before calling `_()`
7 years ago
Alex Willmer a62edd0b7e parent: Use os.execl in first stage
SSH command size: 435 (-4 bytes)
Preamble size: 8962 (no change)

os.execl is the same as os.execv, but it takes a variable number of
arguments instead of a single sequence.
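
The two call forms can be compared with a small POSIX-only sketch. It runs each exec in a child interpreter so the calling process survives; the `CHILD` script and the `run` helper are illustrative names, not part of the first stage.

```python
import subprocess
import sys

# Script run in a child interpreter, which then replaces itself via exec.
CHILD = (
    "import os, sys\n"
    "exe = sys.executable\n"
    "if sys.argv[1] == 'execl':\n"
    "    # Arguments passed individually.\n"
    "    os.execl(exe, exe, '-c', 'print(\"via execl\")')\n"
    "else:\n"
    "    # Arguments passed as one sequence.\n"
    "    os.execv(exe, [exe, '-c', 'print(\"via execv\")'])\n"
)


def run(style):
    """Run CHILD with the given exec style and return what it printed."""
    out = subprocess.check_output([sys.executable, '-c', CHILD, style])
    return out.decode().strip()


if __name__ == '__main__':
    print(run('execl'))
    print(run('execv'))
```

Both forms replace the process image identically; execl only saves the list brackets, which is where the byte saving comes from.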
7 years ago
Alex Willmer 545652c34f parent: Trim whitespace & e variable in first stage
SSH command size: 439 (-4 bytes)
Preamble size: 8962 (no change)
7 years ago
Alex Willmer 0336de6722 parent: Combine first stage imports
SSH command size: 443 (-5 bytes)
Preamble size: 8962
7 years ago
Alex Willmer 48949cd249 parent: Use 'zip' alias of 'zlib' decoder
SSH command size: 448 (-5 bytes)
Preamble size: 8941 (no change)

NB: The 'zip' alias was absent in Python 3.x, until Python 3.4. This
change should be reverted if Python 3.0, 3.2, or 3.3 support is
required.
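
A quick check of the alias, assuming Python 2 or Python 3.4 and later. The payload is a made-up stand-in for the real preamble.

```python
import codecs
import zlib

payload = b'print("hello from the first stage")'  # stand-in for the preamble
compressed = zlib.compress(payload, 9)

# 'zip' and 'zlib' both resolve to the zlib codec, so either name decodes it.
via_zip = codecs.decode(compressed, 'zip')
via_zlib = codecs.decode(compressed, 'zlib')
print(via_zip == via_zlib == payload)
```

The shorter codec name is the only difference between the two decode calls.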
7 years ago
Alex Willmer 0f82f68fee parent: Precompute preamble sizes for first stage
SSH command size: 453 (no change)
Preamble size: 8941 (-5 bytes)
7 years ago
Alex Willmer dfd7070ceb parent: reuse _=codecs.decode alias in exec'd first stage
SSH command size: 453 (-8 bytes)
Preamble size: 8946 (no change)
7 years ago
Alex Willmer 53a8c59ae5 parent: Remove redundant os.exit() in first stage
SSH command size: 461 (-8 bytes)
Preamble size: 8946 (no change)

Since Python has reached the last statement, this should occur anyway.
7 years ago
Alex Willmer e051cf0ea0 parent: Unroll os.close() loop in first stage
SSH command size: 469 (-11 bytes)
Preamble size: 8946 (no change)

Although the source is longer, the _compressed_ length is reduced.
7 years ago
Alex Willmer 85f36f4cb1 parent: Prefer "import foo;x=foo" in first stage
SSH command size: 481 (down 1)
Preamble size: 8946 (no change)
7 years ago
Alex Willmer f999b9adbf Crank zlib.compress() up to 9
SSH command size: 482 bytes (no change)
Preamble size: 8946 bytes (down 33)
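
The effect of the compression level can be measured with a short sketch. The generated `source` text and the helper `compressed_sizes` are assumptions for illustration; the real preamble will give different numbers.

```python
import zlib

# Hypothetical stand-in for the preamble: repetitive Python source text.
source = b''.join(
    ('def handler_%d(msg):\n    return msg.unpickle()\n' % i).encode()
    for i in range(200)
)


def compressed_sizes(data, levels=(1, 6, 9)):
    """Map each zlib level to the compressed size of `data` in bytes."""
    return dict((level, len(zlib.compress(data, level))) for level in levels)


if __name__ == '__main__':
    for level, size in sorted(compressed_sizes(source).items()):
        print('level %d: %d bytes' % (level, size))
```

Level 9 costs more CPU time in the parent, but the output still decompresses with the same call on the receiving side.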
7 years ago
Alex Willmer 9aa83ef77f docs: First round of Pickle-likes survey 7 years ago
Alex Willmer a1fc21bb06 docs: Maximum size of pencode values 7 years ago
Alex Willmer e24db89f3a docs: Disco comparison 7 years ago
Alex Willmer 04f4851138 docs: multiprocessing comparison
Not strictly a rival, but has enough commonalities to be worth noting
7 years ago
Alex Willmer 8c227b2bdd docs: More detail about Baker 7 years ago
Alex Willmer e06e438228 docs: More detail about execnet 7 years ago
Alex Willmer da58f8595d docs: More detail about chopsticks 7 years ago
Alex Willmer d7fbb9aef6 docs: Link compared projects to their website
All outgoing links checked with

```bash
cd docs
make linkcheck
```
7 years ago
Alex Willmer 4615ab1a8e docs: Enable sphinx-autobuild
```bash
cd docs
make
```

to run a webserver that automatically rerenders whenever the rST is
modified.
7 years ago
David Wilson b243da087c issue #121: fix call_function_test by not raising the dead
A first small mea culpa to all my testing sins of late :)
7 years ago
David Wilson f1009b7502 issue #121: fix breakage caused by a9c6c13
This actually addresses multiple problems:

* Single-file programs were broken, since the fix introduced in
  6931cc10c4 caused builtin_find_module()
  to start indicating __main__ can always be loaded locally. That's
  broken, and there might be more cases where the same problem will crop
  up.

  Since it was indicated __main__ could be loaded locally, the built-in
  import machinery was allowed to attempt that (since we remove __main__
  from sys.modules during bootstrap), which caused a safety check to
  fire in the bowels of Python:

      "Cannot re-init internal module %.200s"

* The check for presence of the whitelist was totally broken, since the
  whitelist is never an empty list. Therefore 'self' was being returned
  for every module, including extension modules like 'termios'.

I have hand-verified this does not break the fix for issue #113. I
looked at writing a test for that, but it requires a Docker container
(or similar) with an ancient version of Ansible installed. Will open a
separate ticket tracking this.
7 years ago