mitogen: Fix non-blocking IO errors in first stage of bootstrap

When /etc/sudoers has log_output (or similar) enabled the process spawned by
`ctx.sudo()` via `mitogen.parent.Connection.start_child()` receives a stdin
that is in non-blocking mode. The immediate symptom is that `os.openfd(0,
...).read(n)` sometimes returns `None`, causing the first stage to raise an
unhandled TypeError.

The fix (for now) is to use `select.select()` in a while loop to read stdin.
This increases the command size slightly, but I think it's a reasonable
tradeoff until/unless the cause is more fully understood.

All CI tests are now run with sudoers log_output enabled, in order to catch
regressions. `first_stage_test.CommandLineTest` has been amended, because it
relied on implementation details of the bootstrap process that are no longer
true.

Before
```
SSH command size: 755
Preamble (mitogen.core + econtext) size: 18227 (17.80KiB)

                        Original           Minimized           Compressed
mitogen.core         152218 148.7KiB  68437 66.8KiB 45.0%  18124 17.7KiB 11.9%
mitogen.parent        98853  96.5KiB  51103 49.9KiB 51.7%  12881 12.6KiB 13.0%
mitogen.fork           8445   8.2KiB   4139  4.0KiB 49.0%   1652  1.6KiB 19.6%
mitogen.ssh           10827  10.6KiB   6893  6.7KiB 63.7%   2099  2.0KiB 19.4%
mitogen.sudo          12089  11.8KiB   5924  5.8KiB 49.0%   2249  2.2KiB 18.6%
mitogen.select        12325  12.0KiB   2929  2.9KiB 23.8%    964  0.9KiB  7.8%
mitogen.service       41581  40.6KiB  22398 21.9KiB 53.9%   5847  5.7KiB 14.1%
mitogen.fakessh       15767  15.4KiB   8149  8.0KiB 51.7%   2676  2.6KiB 17.0%
mitogen.master        55317  54.0KiB  28846 28.2KiB 52.1%   7528  7.4KiB 13.6%
```

After
```
SSH command size: 798
Preamble (mitogen.core + econtext) size: 18227 (17.80KiB)

                        Original           Minimized           Compressed
mitogen.core         152218 148.7KiB  68437 66.8KiB 45.0%  18124 17.7KiB 11.9%
mitogen.parent        98944  96.6KiB  51180 50.0KiB 51.7%  12910 12.6KiB 13.0%
mitogen.fork           8445   8.2KiB   4139  4.0KiB 49.0%   1652  1.6KiB 19.6%
mitogen.ssh           10827  10.6KiB   6893  6.7KiB 63.7%   2099  2.0KiB 19.4%
mitogen.sudo          12089  11.8KiB   5924  5.8KiB 49.0%   2249  2.2KiB 18.6%
mitogen.select        12325  12.0KiB   2929  2.9KiB 23.8%    964  0.9KiB  7.8%
mitogen.service       41581  40.6KiB  22398 21.9KiB 53.9%   5847  5.7KiB 14.1%
mitogen.fakessh       15767  15.4KiB   8149  8.0KiB 51.7%   2676  2.6KiB 17.0%
mitogen.master        55317  54.0KiB  28846 28.2KiB 52.1%   7528  7.4KiB 13.6%
```
pull/1307/head
Alex Willmer 4 months ago
parent c508bfb58b
commit 85d6046f2f

@ -41,6 +41,8 @@ IMAGE_TEMPLATE = os.environ.get(
'MITOGEN_TEST_IMAGE_TEMPLATE',
'ghcr.io/mitogen-hq/%(distro)s-test:2021',
)
SUDOERS_DEFAULTS_SRC = './tests/image_prep/files/sudoers_defaults'
SUDOERS_DEFAULTS_DEST = '/etc/sudoers.d/mitogen_test_defaults'
TESTS_SSH_PRIVATE_KEY_FILE = os.path.join(GIT_ROOT, 'tests/data/docker/mitogen__has_sudo_pubkey.key')

@ -2,6 +2,7 @@
# Run the Mitogen tests.
import os
import subprocess
import ci_lib
@ -13,6 +14,14 @@ os.environ.update({
if not ci_lib.have_docker():
os.environ['SKIP_DOCKER_TESTS'] = '1'
subprocess.check_call(
"umask 0022; sudo cp '%s' '%s'"
% (ci_lib.SUDOERS_DEFAULTS_SRC, ci_lib.SUDOERS_DEFAULTS_DEST),
shell=True,
)
subprocess.check_call(['sudo', 'visudo', '-cf', ci_lib.SUDOERS_DEFAULTS_DEST])
subprocess.check_call(['sudo', '-l'])
interesting = ci_lib.get_interesting_procs()
ci_lib.run('./run_tests -v')
ci_lib.check_stray_processes(interesting)

@ -21,6 +21,8 @@ To avail of fixes in an unreleased version, please download a ZIP file
In progress (unreleased)
------------------------
* :gh:issue:`1306` :mod:`ansible_mitogen`: Fix non-blocking IO errors in
first stage of bootstrap
* :gh:issue:`1306` CI: Report sudo version on Ansible targets
* :gh:issue:`1306` CI: Move sudo test users defaults into ``/etc/sudoers.d``
* :gh:issue:`1306` preamble_size: Fix variability of measured command size

@ -1440,9 +1440,9 @@ class Connection(object):
os.environ['ARGV0']=sys.executable
os.execl(sys.executable,sys.executable+'(mitogen:CONTEXT_NAME)')
os.write(1,'MITO000\n'.encode())
fp=os.fdopen(0,'rb')
C=zlib.decompress(fp.read(PREAMBLE_COMPRESSED_LEN))
fp.close()
C=''.encode()
while PREAMBLE_COMPRESSED_LEN-len(C)and select.select([0],[],[]):C+=os.read(0,PREAMBLE_COMPRESSED_LEN-len(C))
C=zlib.decompress(C)
fp=os.fdopen(W,'wb',0)
fp.write(C)
fp.close()
@ -1478,11 +1478,12 @@ class Connection(object):
# Just enough to decode, decompress, and exec the first stage.
# Priorities: wider compatibility, faster startup, shorter length.
# `import os` here, instead of stage 1, to save a few bytes.
# `sys.path=...` for https://github.com/python/cpython/issues/115911.
# `import os,select` here (not stage 1) to save a few bytes overall.
return self.get_python_argv() + [
'-c',
'import sys;sys.path=[p for p in sys.path if p];import binascii,os,zlib;'
'import sys;sys.path=[p for p in sys.path if p];'
'import binascii,os,select,zlib;'
'exec(zlib.decompress(binascii.a2b_base64("%s")))' % (encoded.decode(),),
]

@ -1,5 +1,6 @@
import subprocess
import mitogen.core
import mitogen.parent
from mitogen.core import b
@ -21,14 +22,18 @@ class CommandLineTest(testlib.RouterMixin, testlib.TestCase):
conn.context = mitogen.core.Context(None, 123)
args = conn.get_boot_command()
# Executing the boot command will print "EC0" and expect to read from
# stdin, which will fail because it's pointing at /dev/null, causing
# the forked child to crash with an EOFError and disconnect its write
# pipe. The forked and freshly execed parent will get a 0-byte read
# from the pipe, which is a valid script, and therefore exit indicating
# success.
# The boot command should write an ECO marker to stdout, read the
# preamble from stdin, then execute it.
fp = open("/dev/null", "r")
# This test attaches /dev/zero to stdin to create a specific failure
# 1. Fork child reads PREAMBLE_COMPRESSED_LEN bytes of junk (all `\0`)
# 2. Fork child crashes (trying to decompress the junk data)
# 3. Fork child's file descriptors (write pipes) are closed by the OS
# 4. Fork parent does `dup(<read pipe>, <stdin>)` and `exec(<python>)`
# 5. Python reads `b''` (i.e. EOF) from stdin (a closed pipe)
# 6. Python runs `''` (a valid script) and exits with success
fp = open("/dev/zero", "r")
try:
proc = subprocess.Popen(args,
stdin=fp,
@ -39,6 +44,9 @@ class CommandLineTest(testlib.RouterMixin, testlib.TestCase):
self.assertEqual(0, proc.returncode)
self.assertEqual(stdout,
mitogen.parent.BootstrapProtocol.EC0_MARKER+b('\n'))
self.assertIn(b("Error -5 while decompressing data"), stderr)
self.assertIn(
b("Error -3 while decompressing data"), # Unknown compression method
stderr,
)
finally:
fp.close()

@ -1,3 +1,7 @@
# Testing non-blocking stdio during bootstrap
# https://github.com/mitogen-hq/mitogen/issues/1306
Defaults log_output
Defaults>mitogen__pw_required targetpw
Defaults>mitogen__require_tty requiretty
Defaults>mitogen__require_tty_pw_required requiretty,targetpw

Loading…
Cancel
Save