diff --git a/docs/ansible_detailed.rst b/docs/ansible_detailed.rst index 65f68efb..22d7223f 100644 --- a/docs/ansible_detailed.rst +++ b/docs/ansible_detailed.rst @@ -140,7 +140,7 @@ Testimonials Noteworthy Differences ---------------------- -* Ansible 2.3-2.7 are supported along with Python 2.6, 2.7, 3.6 and 3.7. Verify +* Ansible 2.3-2.8 are supported along with Python 2.6, 2.7, 3.6 and 3.7. Verify your installation is running one of these versions by checking ``ansible --version`` output. @@ -164,6 +164,12 @@ Noteworthy Differences - initech_app - y2k_fix +* Ansible 2.8 `interpreter discovery + `_ + and `become plugins + `_ are not yet + supported. + * The ``doas``, ``su`` and ``sudo`` become methods are available. File bugs to register interest in more. diff --git a/docs/changelog.rst b/docs/changelog.rst index 51cdd2df..fe15ca27 100644 --- a/docs/changelog.rst +++ b/docs/changelog.rst @@ -15,61 +15,65 @@ Release Notes -v0.2.8 (unreleased) +v0.2.9 (unreleased) ------------------- To avail of fixes in an unreleased version, please download a ZIP file `directly from GitHub `_. +*(no changes)* + + +v0.2.8 (2019-08-18) +------------------- + +This release includes Ansible 2.8 and SELinux support, fixes for two deadlocks, +and major internal design overhauls in preparation for future functionality. + + Enhancements ~~~~~~~~~~~~ * :gh:issue:`556`, - :gh:issue:`587`: Ansible 2.8 is partially - supported. `Become plugins + :gh:issue:`587`: Ansible 2.8 is supported. `Become plugins `_ and `interpreter discovery `_ are not yet handled. -* :gh:issue:`419`, :gh:issue:`470`, file descriptor usage during large runs is - halved, as it is no longer necessary to manage read and write sides - distinctly in order to work around a design problem. +* :gh:issue:`419`, :gh:issue:`470`, file descriptor usage is approximately + halved, as it is no longer necessary to separately manage read and write + sides to work around a design problem. 
-* :gh:issue:`419`: almost all connection setup happens on one thread, reducing - contention and context switching early in a run. +* :gh:issue:`419`: setup for all connections happens almost entirely on one + thread, reducing contention and context switching early in a run. * :gh:issue:`419`: Connection setup is better pipelined, eliminating some network round-trips. Most infrastructure is in place to support future - removal of the final round-trips between a target fully booting and receiving + removal of the final round-trips between a target booting and receiving function calls. * :gh:pull:`595`: the :meth:`~mitogen.parent.Router.buildah` connection method is available to manipulate `Buildah `_ containers, and is exposed to Ansible as the :ans:conn:`buildah`. -* :gh:issue:`615`: the ``mitogen_fetch`` - action is included, and the standard Ansible :ans:mod:`fetch` is redirected - to it. This implements streaming file transfer in every case, including when - ``become`` is active, preventing excessive CPU usage and memory spikes, and - significantly improving throughput. A copy of 2 files of 512 MiB each drops - from 47 seconds to just under 7 seconds, with peak memory usage dropping from - 10.7 GiB to 64.8 MiB. +* :gh:issue:`615`: a modified :ans:mod:`fetch` implements streaming transfer + even when ``become`` is active, avoiding excess CPU usage and memory spikes, + and improving performance. A copy of two 512 MiB files drops from 47 seconds + to 7 seconds, with peak memory usage dropping from 10.7 GiB to 64.8 MiB. * `Operon `_ no longer requires a custom - installation, both Operon and Ansible are supported by a unified release. + library installation, both Ansible and Operon are supported by a single + Mitogen release. -* The ``MITOGEN_CPU_COUNT`` environment variable shards the connection - multiplexer into per-CPU workers. This may improve throughput for runs - involving large file transfers, and is required for future in-process SSH - support. 
One multiplexer starts by default, to match existing behaviour. +* The ``MITOGEN_CPU_COUNT`` variable shards the connection multiplexer into + per-CPU workers. This may improve throughput for large runs involving file + transfer, and is required for future functionality. One multiplexer starts by + default, to match existing behaviour. -* :gh:commit:`d6faff06`, - :gh:commit:`807cbef9`, - :gh:commit:`e93762b3`, - :gh:commit:`50bfe4c7`: locking is - avoided on hot paths, and some locks are released earlier, before waking a - thread that must immediately take the same lock. +* :gh:commit:`d6faff06`, :gh:commit:`807cbef9`, :gh:commit:`e93762b3`, + :gh:commit:`50bfe4c7`: locking is avoided on hot paths, and some locks are + released before waking a thread that must immediately acquire the same lock. Mitogen for Ansible @@ -80,46 +84,41 @@ Mitogen for Ansible * :gh:issue:`410`: Uses of :linux:man7:`unix` sockets are replaced with traditional :linux:man7:`pipe` pairs when SELinux is detected, to work around - a broken heuristic in popular SELinux policies that prevents inheriting + a broken heuristic in common SELinux policies that prevents inheriting :linux:man7:`unix` sockets across privilege domains. * `#467 `_: an incompatibility - running Mitogen under Molecule was resolved. - -* :gh:issue:`547`, :gh:issue:`598`: fix a serious deadlock - possible while initializing the service pool of any child, such as during - connection, ``async`` tasks, tasks using custom :mod:`module_utils`, - ``mitogen_task_isolation: fork`` modules, and those present on an internal - blacklist of misbehaving modules. - - This deadlock is relatively easy hit, has been present since 0.2.0, and is - likely to have impacted many users. For new connections it could manifest as - a *Connection timed out* error, for forked tasks it could manifest as a - timeout or an apparent hang. - -* :gh:issue:`549`: the open file descriptor limit for the Ansible process is - increased to the available hard limit. 
It is common for distributions to ship - with a much higher hard limit than their default soft limit, allowing *"too - many open files"* errors to be avoided more often in large runs without user - configuration. - -* :gh:issue:`558`, :gh:issue:`582`: on Ansible 2.3 a remote directory was + running Mitogen under `Molecule + `_ was resolved. + +* :gh:issue:`547`, :gh:issue:`598`: fix a deadlock during initialization of + connections, ``async`` tasks, tasks using custom :mod:`module_utils`, + ``mitogen_task_isolation: fork`` modules, and modules present on an internal + blacklist. This would manifest as a timeout or hang, was easily hit, had been + present since 0.2.0, and likely impacted many users. + +* :gh:issue:`549`: the open file limit is increased to the permitted hard + limit. It is common for distributions to ship with a higher hard limit than + the default soft limit, allowing *"too many open files"* errors to be avoided + more often in large runs without user intervention. + +* :gh:issue:`558`, :gh:issue:`582`: on Ansible 2.3 a directory was unconditionally deleted after the first module belonging to an action plug-in - had executed, causing the :ans:mod:`unarchive` module to fail. + had executed, causing the :ans:mod:`unarchive` to fail. -* :gh:issue:`578`: the extension could crash while rendering an error message, - due to an incorrect format string. +* :gh:issue:`578`: the extension could crash while rendering an error due to an + incorrect format string. * :gh:issue:`590`: the importer can handle modules that replace themselves in - :data:`sys.modules` during import. + :data:`sys.modules` with completely unrelated modules during import, as in + the case of Ansible 2.8 :mod:`ansible.module_utils.distro`. 
-* :gh:issue:`591`: the target's current working directory is restored to a - known-existent directory between tasks to ensure :func:`os.getcwd` will not - fail when called, in the same way that :class:`AnsibleModule` restores it - during initialization. However this restore happens before the module ever - executes, ensuring any code that calls :func:`os.getcwd` prior to +* :gh:issue:`591`: the working directory is reset between tasks to ensure + :func:`os.getcwd` cannot fail, in the same way :class:`AnsibleModule` + resets it during initialization. However this restore happens before the + module executes, ensuring code that calls :func:`os.getcwd` prior to :class:`AnsibleModule` initialization, such as the Ansible 2.7 - :ans:mod:`pip`, cannot fail due to the behavior of a prior task. + :ans:mod:`pip`, cannot fail due to the actions of a prior task. * :gh:issue:`593`: the SSH connection method exposes ``mitogen_ssh_keepalive_interval`` and ``mitogen_ssh_keepalive_count`` @@ -131,32 +130,47 @@ Mitogen for Ansible encoding. * :gh:issue:`602`: connection configuration is more accurately inferred for - `meta: reset_connection`, the `synchronize` module, and for any action - plug-ins that establish additional connections. + :ans:mod:`meta: reset_connection ` the :ans:mod:`synchronize`, and for + any action plug-ins that establish additional connections. * :gh:issue:`598`, :gh:issue:`605`: fix a deadlock managing a shared counter - used for load balancing. + used for load balancing, present since 0.2.4. -* :gh:issue:`615`: streaming file transfer is implemented for ``fetch`` and - other actions that transfer files from the target to the controller. - Previously the file was sent in one message, requiring it to fit in RAM and - be smaller than the internal message size limit. +* :gh:issue:`615`: streaming is implemented for the :ans:mod:`fetch` and other + actions that transfer files from targets to the controller. 
Previously files + delivered were sent in one message, requiring them to fit in RAM and be + smaller than an internal message size sanity check. Transfers from controller + to targets have been streaming since 0.2.0. -* :gh:commit:`7ae926b3`: the Ansible :ans:mod:`lineinfile` began leaking - writable temporary file descriptors since Ansible 2.7.0. When - :ans:mod:`~lineinfile` was used to create or modify a script, and that script - was later executed, the execution could fail with "*text file busy*" due to - the leaked descriptor. Temporary descriptors are now tracked and cleaned up - on exit for all modules. +* :gh:commit:`7ae926b3`: the :ans:mod:`lineinfile` leaks writable temporary + file descriptors since Ansible 2.7.0. When :ans:mod:`~lineinfile` created or + modified a script, and that script was later executed, the execution could + fail with "*text file busy*". Temporary descriptors are now tracked and + cleaned up on exit for all modules. Core Library ~~~~~~~~~~~~ -* Log readability is improving, and many :func:`repr` strings are more - descriptive. The old pseudo-function-call format is slowly migrating to - human-readable output where possible. For example, - *"Stream(ssh:123).connect()"* might be written *"connecting to ssh:123"*. +* Log readability is improving and many :func:`repr` strings are more + descriptive. The old pseudo-function-call format is migrating to + readable output where possible. For example, *"Stream(ssh:123).connect()"* + might be written *"connecting to ssh:123"*. + +* In preparation for reducing default log output, many messages are delivered + to per-component loggers, including messages originating from children, + enabling :mod:`logging` aggregation to function as designed. 
An importer + message like:: + + 12:00:00 D mitogen.ctx.remotehost mitogen: loading module "foo" + + Might instead be logged to the ``mitogen.importer.[remotehost]`` logger:: + + 12:00:00 D mitogen.importer.[remotehost] loading module "foo" + + Allowing a filter or handler for ``mitogen.importer`` to select that logger + in every process. This introduces a small risk of leaking memory in + long-lived programs, as logger objects are internally persistent. * :func:`bytearray` was removed from the list of supported serialization types. It was never portable between Python versions, unused, and never made much @@ -168,27 +182,24 @@ Core Library asynchronous context. * :gh:issue:`419`: the internal - :class:`~mitogen.core.Stream` has been refactored into 7 new classes, + :class:`~mitogen.core.Stream` has been refactored into many new classes, modularizing protocol behaviour, output buffering, line-oriented input parsing, option handling and connection management. Connection setup is - internally asynchronous, laying almost all the groundwork needed for fully - asynchronous connect, proxied Ansible become plug-ins, and integrating - `libssh `_. + internally asynchronous, laying most groundwork for fully asynchronous + connect, proxied Ansible become plug-ins, and in-process SSH. * :gh:issue:`169`, :gh:issue:`419`: zombie subprocess reaping - has vastly improved, by using timers to efficiently poll for a slow child to - finish exiting, and delaying broker shutdown while any subprocess remains. - Polling avoids relying on process-global configuration such as a `SIGCHLD` - handler, or :func:`signal.set_wakeup_fd` available in modern Python. - -* :gh:issue:`256`, - :gh:issue:`419`: most :func:`os.dup` use - was eliminated, along with almost all manual file descriptor management. 
- Descriptors are trapped in :func:`os.fdopen` objects at creation, ensuring a - leaked object will close itself, and ensuring every descriptor is fused to a - `closed` flag, preventing historical bugs where a double close could destroy - descriptors belonging to unrelated streams. + has vastly improved, by using timers to efficiently poll for a child to exit, + and delaying shutdown while any subprocess remains. Polling avoids + process-global configuration such as a `SIGCHLD` handler, or + :func:`signal.set_wakeup_fd` available in modern Python. + +* :gh:issue:`256`, :gh:issue:`419`: most :func:`os.dup` use was eliminated, + along with most manual file descriptor management. Descriptors are trapped in + :func:`os.fdopen` objects at creation, ensuring a leaked object will close + itself, and ensuring every descriptor is fused to a `closed` flag, preventing + historical bugs where a double close could destroy unrelated descriptors. * :gh:issue:`533`: routing accounts for a race between a parent (or cousin) sending a message to a child via an @@ -218,13 +229,13 @@ Core Library deliver a message for some reason other than the sender cannot or should not reach the recipient, and no reply-to address is present on the message, instead send a :ref:`dead message ` to the original recipient. This - ensures a descriptive messages is delivered to a thread sleeping on the reply + ensures a descriptive message is delivered to a thread sleeping on the reply to a function call, where the reply might be dropped due to exceeding the maximum configured message size. -* :gh:issue:`624`: the number of threads used for a child's auto-started thread - pool has been reduced from 16 to 2. This may drop to 1 in future, and become - configurable via a :class:`Router` option. +* :gh:issue:`624`: the number of threads used for a child's automatically + initialized service thread pool has been reduced from 16 to 2. 
This may drop + to 1 in future, and become configurable via a :class:`Router` option. * :gh:commit:`a5536c35`: avoid quadratic buffer management when logging lines received from a child's redirected @@ -264,6 +275,7 @@ bug reports, testing, features and fixes in this release contributed by `Florent Dutheil `_, `James Hogarth `_, `Jordan Webb `_, +`Julian Andres Klode `_, `Marc Hartmayer `_, `Nigel Metheringham `_, `Orion Poplawski `_, @@ -271,6 +283,7 @@ bug reports, testing, features and fixes in this release contributed by `Stefane Fermigier `_, `Szabó Dániel Ernő `_, `Ulrich Schreiner `_, +`Vincent S. Cojot `_, `yen `_, `Yuki Nishida `_, `@alexhexabeam `_, diff --git a/docs/conf.py b/docs/conf.py index 86332cd2..1a6a117b 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -10,6 +10,10 @@ author = u'Network Genomics' copyright = u'2019, Network Genomics' exclude_patterns = ['_build', '.venv'] extensions = ['sphinx.ext.autodoc', 'sphinx.ext.intersphinx', 'sphinxcontrib.programoutput', 'domainrefs'] + +# get rid of version from <title>, it messes with piwik +html_title = 'Mitogen Documentation' + html_show_copyright = False html_show_sourcelink = False html_show_sphinx = False @@ -51,11 +55,11 @@ domainrefs = { 'url': 'https://github.com/dw/mitogen/pull/%s', }, 'ans:mod': { - 'text': '%s Module', + 'text': '%s module', 'url': 'https://docs.ansible.com/ansible/latest/modules/%s_module.html', }, 'ans:conn': { - 'text': '%s Connection Plug-in', + 'text': '%s connection plug-in', 'url': 'https://docs.ansible.com/ansible/latest/plugins/connection/%s.html', }, 'freebsd:man2': { diff --git a/docs/toc.rst b/docs/toc.rst index 2bbd0f9a..e43326f1 100644 --- a/docs/toc.rst +++ b/docs/toc.rst @@ -7,11 +7,11 @@ Table Of Contents index Mitogen for Ansible <ansible_detailed> - contributors changelog + contributors howitworks - getting_started api + getting_started examples internals diff --git a/mitogen/__init__.py b/mitogen/__init__.py index 5e2e29b6..47e570ab 100644 ---
a/mitogen/__init__.py +++ b/mitogen/__init__.py @@ -35,7 +35,7 @@ be expected. On the slave, it is built dynamically during startup. #: Library version as a tuple. -__version__ = (0, 2, 7) +__version__ = (0, 2, 8) #: This is :data:`False` in slave contexts. Previously it was used to prevent diff --git a/mitogen/parent.py b/mitogen/parent.py index 6e99bb66..1c3e1874 100644 --- a/mitogen/parent.py +++ b/mitogen/parent.py @@ -2607,9 +2607,9 @@ class Reaper(object): if not self.kill: pass - elif self._tries == 1: + elif self._tries == 2: self._signal_child(signal.SIGTERM) - elif self._tries == 5: # roughly 4 seconds + elif self._tries == 6: # roughly 4 seconds self._signal_child(signal.SIGKILL) diff --git a/tests/reaper_test.py b/tests/reaper_test.py new file mode 100644 index 00000000..e78fdbf2 --- /dev/null +++ b/tests/reaper_test.py @@ -0,0 +1,54 @@ + +import signal +import unittest2 +import testlib +import mock + +import mitogen.parent + + +class ReaperTest(testlib.TestCase): + @mock.patch('os.kill') + def test_calc_delay(self, kill): + broker = mock.Mock() + proc = mock.Mock() + proc.poll.return_value = None + reaper = mitogen.parent.Reaper(broker, proc, True, True) + self.assertEquals(50, int(1000 * reaper._calc_delay(0))) + self.assertEquals(86, int(1000 * reaper._calc_delay(1))) + self.assertEquals(147, int(1000 * reaper._calc_delay(2))) + self.assertEquals(254, int(1000 * reaper._calc_delay(3))) + self.assertEquals(437, int(1000 * reaper._calc_delay(4))) + self.assertEquals(752, int(1000 * reaper._calc_delay(5))) + self.assertEquals(1294, int(1000 * reaper._calc_delay(6))) + + @mock.patch('os.kill') + def test_reap_calls(self, kill): + broker = mock.Mock() + proc = mock.Mock() + proc.poll.return_value = None + + reaper = mitogen.parent.Reaper(broker, proc, True, True) + + reaper.reap() + self.assertEquals(0, kill.call_count) + + reaper.reap() + self.assertEquals(1, kill.call_count) + + reaper.reap() + reaper.reap() + reaper.reap() + self.assertEquals(1, 
kill.call_count) + + reaper.reap() + self.assertEquals(2, kill.call_count) + + self.assertEquals(kill.mock_calls, [ + mock.call(proc.pid, signal.SIGTERM), + mock.call(proc.pid, signal.SIGKILL), + ]) + + +if __name__ == '__main__': + unittest2.main()
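The escalation that the `mitogen/parent.py` hunk and `tests/reaper_test.py` exercise can be sketched in isolation as follows. This is a minimal illustration and not the `Reaper` implementation: `BASE_DELAY`, `GROWTH`, `calc_delay` and `signal_for_attempt` are hypothetical names, and the two constants are assumptions inferred from the expected values in `test_calc_delay` (a 50 ms first delay that grows by a factor of about 1.72 per attempt). The attempt numbers at which signals are sent follow the patched `_tries == 2` and `_tries == 6` comparisons.

```python
import signal

# Assumed constants, inferred from the expected values in test_calc_delay;
# they are not copied from mitogen/parent.py.
BASE_DELAY = 0.05   # seconds before the first re-poll
GROWTH = 1.72       # multiplier applied per attempt already made


def calc_delay(count):
    """Return the poll delay in seconds after `count` earlier attempts."""
    return BASE_DELAY * GROWTH ** count


def signal_for_attempt(tries):
    """Return the signal sent on the 1-based reap attempt, or None.

    Mirrors the patched comparisons: SIGTERM on the second attempt,
    SIGKILL on the sixth, nothing on any other attempt.
    """
    if tries == 2:
        return signal.SIGTERM
    if tries == 6:
        return signal.SIGKILL
    return None


if __name__ == '__main__':
    for tries in range(1, 7):
        sig = signal_for_attempt(tries)
        name = sig.name if sig is not None else '-'
        print(tries, name, '%.3f' % calc_delay(tries - 1))
```

The first attempt sends nothing, which matches `test_reap_calls` asserting a `kill.call_count` of 0 after the first `reap()`, 1 after the second, and 2 after the sixth. Pairing attempt `n` with `calc_delay(n - 1)` in the printed table is also an assumption made for the sketch.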