In some scenarios, Ansible's worker seems to exit early, resulting in
EPIPE during .recv() or .send(). Log an error and gracefully disconnect
in that case.
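
A minimal sketch of the send-side handling described above, not the actual implementation. The `safe_send` helper and the `on_disconnect` callback are hypothetical names used only for illustration; a receive path could be wrapped the same way.

```python
import errno
import logging
import socket

LOG = logging.getLogger(__name__)


def safe_send(sock, data, on_disconnect):
    """Send data, treating EPIPE as a peer that has already exited.

    `on_disconnect` is a hypothetical callback standing in for whatever
    cleanup the real stream performs when it disconnects.
    """
    try:
        sock.sendall(data)
        return True
    except (IOError, OSError) as e:
        # Anything other than EPIPE is unexpected here, so re-raise it.
        if e.errno != errno.EPIPE:
            raise
        LOG.error('peer exited early: %s', e)
        on_disconnect()
        return False
```

The helper returns False rather than raising, so the caller can stop using the socket without its own handler crashing.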
The connection multiplexer can expect not to be scheduled at least until
every one of the $forks worker processes has attempted a connection, so
the backlog must be able to hold every worker.
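
As a rough illustration of sizing the backlog, assuming a UNIX-domain listener; the `make_listener` name and its parameters are hypothetical. Note the kernel may silently cap the requested backlog (on Linux, at net.core.somaxconn).

```python
import socket


def make_listener(path, forks):
    """Create a listening UNIX socket whose backlog covers every worker.

    `forks` is assumed to mirror the forks setting from ansible.cfg.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    # One pending-connection slot per worker, so that every worker can
    # connect before the multiplexer is next scheduled to call accept().
    sock.listen(forks)
    return sock
```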
* ansible: use unicode_literals everywhere since it only needs to be
compatible back to 2.6.
* compat/collections.py: delete this entirely and rip out the parts of
functools that require it.
* Introduce serializable Kwargs dict subclass that translates keys to
Unicode on instantiation.
* enable_debug_logging() must set _v/_vv globals.
* cStringIO does not exist in 3.x.
* Treat IOLogger and LogForwarder input as latin-1.
* Avoid ResourceWarnings in first stage by explicitly closing fps.
* Fix preamble_size.py syntax errors.
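
A sketch of what the Kwargs subclass mentioned above might look like. The UTF-8 decoding is an assumption, and the serialization support is omitted here.

```python
class Kwargs(dict):
    """dict subclass that converts byte-string keys to Unicode.

    Assumption: keys are decoded as UTF-8; the encoding used by the real
    class is not stated in this message.
    """

    def __init__(self, dct=()):
        super(Kwargs, self).__init__()
        for key, value in dict(dct).items():
            # On 2.x `bytes` is an alias of `str`, so this catches native
            # byte-string keys there as well as bytes keys on 3.x.
            if isinstance(key, bytes):
                key = key.decode('utf-8')
            self[key] = value
```

For example, `Kwargs({b'a': 1})` yields a dict whose only key is the Unicode string `a`.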
The Context and Router APIs for constructing children and making
function calls should be available in every parent context, as user code
wants to have access to the same API.
accept() (per interface) returns a non-blocking socket because the
listener socket is in non-blocking mode, therefore it is pure scheduling
luck that a connecting-in child has a chance to write anything for the
top-level process to read during the subsequent .recv().
A higher forks setting in ansible.cfg was enough to cause our luck to
run out, causing the .recv() to crash with EAGAIN, and the multiplexer
to respond to the handler's crash by calling its disconnect method. This
is why some reports mentioned ECONNREFUSED -- the listener really was
gone, because its Stream class had crashed.
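
The failure mode can be reproduced in isolation with a socket pair. This is only an illustration: `recv_would_block` is a hypothetical helper, and the socket is put into non-blocking mode explicitly rather than relying on what accept() returns.

```python
import errno
import socket


def recv_would_block(sock):
    """Return True if a read on `sock` would fail with EAGAIN right now."""
    try:
        # MSG_PEEK leaves any available data in the receive queue.
        sock.recv(1, socket.MSG_PEEK)
    except (IOError, OSError) as e:
        if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return True
        raise
    return False
```

On a non-blocking socket this returns True until the peer has written something, which is the race the connecting child was losing.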
Since the window where we're waiting for the remote process to
identify itself is tiny, simply flip off O_NONBLOCK for the duration of
the connection handshake. Stream.accept() (via Side.__init__) will
reenable O_NONBLOCK for the descriptors it duplicates, so we don't even
need to bother turning it back on.
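
A minimal sketch of the blocking handshake read, assuming the handshake has a known fixed length. The `read_handshake` name and the `size` parameter are hypothetical.

```python
import socket


def read_handshake(sock, size):
    """Read exactly `size` handshake bytes with O_NONBLOCK switched off."""
    # Clear O_NONBLOCK so recv() waits for the peer instead of failing
    # with EAGAIN when nothing has been written yet.
    sock.setblocking(True)
    chunks = []
    remaining = size
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise EOFError('peer disconnected during handshake')
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)
```

This sketch leaves the socket in blocking mode, on the assumption that whatever takes ownership of the descriptor afterwards sets the mode it needs.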
A better solution entails splitting Stream up into a state machine and
doing the handshake with non-blocking IO, but that isn't going to be
available until asynchronous connect is implemented. Meanwhile in
reality this solution is probably 100% fine.