
Error Handling In Playbooks
===========================

.. contents:: Topics

By default, Ansible checks the return codes of commands and modules and
fails fast, forcing you to deal with an error unless you decide otherwise.

Sometimes a command that returns a non-zero code isn't an error.  Sometimes a command does not
need to report that it 'changed' the remote system.  This section describes how to change
the default behavior of Ansible for certain tasks so that output and error handling behave
as desired.

.. _ignoring_failed_commands:

Ignoring Failed Commands
````````````````````````

By default, Ansible stops executing tasks on a host once a task fails on that host.
Sometimes, though, you want to continue.  To do so, write a task that looks like this::

    - name: this will not be counted as a failure
      command: /bin/false
      ignore_errors: yes

Note that ``ignore_errors`` only governs the 'failed' return value of that particular task,
so an undefined variable or a syntax error will still raise an error that you need to address.
It also does not prevent failures caused by connection or execution issues.
The keyword only works when the task is able to run and returns a value of 'failed'.
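
An ignored failure can still be acted on later in the play. The following is a minimal sketch, assuming a hypothetical ``/usr/bin/example-check`` command and the illustrative variable name ``check_result``; registering the result lets a following task test it with the ``failed`` test::

    - name: run a check whose failure should not stop the play
      command: /usr/bin/example-check
      register: check_result
      ignore_errors: yes

    - name: report the ignored failure
      debug:
        msg: "example-check failed with return code {{ check_result.rc }}"
      when: check_result is failed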

Ignoring Unreachable Host Errors
````````````````````````````````````````

.. versionadded:: 2.7

You may ignore task failure due to the host instance being 'UNREACHABLE' with the ``ignore_unreachable`` keyword.
Note that task errors are what's being ignored, not the unreachable host.

Here's an example explaining the behavior for an unreachable host at the task level::

    - name: this executes, fails, and the failure is ignored
      command: /bin/true
      ignore_unreachable: yes

    - name: this executes, fails, and ends the play for this host
      command: /bin/true

And at the playbook level::

    - hosts: all
      ignore_unreachable: yes
      tasks:
      - name: this executes, fails, and the failure is ignored
        command: /bin/true

      - name: this executes, fails, and ends the play for this host
        command: /bin/true
        ignore_unreachable: no

.. _resetting_unreachable:

Resetting Unreachable Hosts
```````````````````````````

.. versionadded:: 2.2

Connection failures set hosts as 'UNREACHABLE', which will remove them from the list of active hosts for the run.
To recover from these issues you can use ``meta: clear_host_errors`` to have all currently flagged hosts reactivated,
so subsequent tasks can try to use them again.
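
As a sketch, the directive is written as an ordinary task; the ``ping`` task here is only an illustrative follow-up and is not required::

    - name: reactivate hosts that were flagged as unreachable
      meta: clear_host_errors

    - name: try the previously unreachable hosts again
      ping: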


.. _handlers_and_failure:

Handlers and Failure
````````````````````

When a task fails on a host, handlers which were previously notified
will *not* be run on that host. This can lead to cases where an unrelated failure
can leave a host in an unexpected state. For example, a task could update
a configuration file and notify a handler to restart some service. If a
task later on in the same play fails, the service will not be restarted despite
the configuration change.

You can change this behavior with the ``--force-handlers`` command-line option,
or by including ``force_handlers: True`` in a play, or ``force_handlers = True``
in ansible.cfg. When handlers are forced, they will run when notified even
if a task fails on that host. (Note that certain errors could still prevent
the handler from running, such as a host becoming unreachable.)
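
A minimal sketch of the play-level setting, assuming a hypothetical ``webservers`` group, ``app.conf.j2`` template, and ``example-service`` service. The second task fails on purpose, yet a handler notified by the first task still runs::

    - hosts: webservers
      force_handlers: True
      tasks:
        - name: update the configuration file
          template:
            src: app.conf.j2
            dest: /etc/example/app.conf
          notify: restart example service

        - name: a later task that fails
          command: /bin/false

      handlers:
        - name: restart example service
          service:
            name: example-service
            state: restarted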

.. _controlling_what_defines_failure:

Controlling What Defines Failure
````````````````````````````````

Ansible lets you define what "failure" means in each task using the ``failed_when`` conditional. As with all conditionals in Ansible, lists of multiple ``failed_when`` conditions are joined with an implicit ``and``, meaning the task only fails when *all* conditions are met. If you want to trigger a failure when any of the conditions is met, you must define the conditions in a string with an explicit ``or`` operator.

You may check for failure by searching for a word or phrase in the output of a command::

    - name: Fail task when the command error output prints FAILED
      command: /usr/bin/example-command -x -y -z
      register: command_result
      failed_when: "'FAILED' in command_result.stderr"

or based on the return code::

    - name: Fail task when both files are identical
      raw: diff foo/file1 bar/file2
      register: diff_cmd
      failed_when: diff_cmd.rc == 0 or diff_cmd.rc >= 2

In previous versions of Ansible, this could be accomplished as follows::

    - name: this command prints FAILED when it fails
      command: /usr/bin/example-command -x -y -z
      register: command_result
      ignore_errors: True

    - name: fail the play if the previous command did not succeed
      fail:
        msg: "the command failed"
      when: "'FAILED' in command_result.stderr"

You can also combine multiple conditions for failure. This task will fail if both conditions are true::

    - name: Check if a file exists in temp and fail task if it does
      command: ls /tmp/this_should_not_be_here
      register: result
      failed_when:
        - result.rc == 0
        - '"No such" not in result.stdout'

If you want the task to fail when either condition is satisfied, change the ``failed_when`` definition to::

      failed_when: result.rc == 0 or "No such" not in result.stdout

If you have too many conditions to fit neatly into one line, you can split them into a multi-line YAML value with ``>``::


    - name: example of many failed_when conditions with OR
      shell: "./myBinary"
      register: ret
      failed_when: >
        ("No such file or directory" in ret.stdout) or
        (ret.stderr != '') or
        (ret.rc == 10)

.. _override_the_changed_result:

Overriding The Changed Result
`````````````````````````````

When a shell/command or other module runs, it typically reports
"changed" status based on whether it thinks it affected machine state.

Sometimes you will know, based on the return code
or output, that it did not make any changes, and wish to override
the "changed" result so that it neither appears in report output nor
causes handlers to fire::

    tasks:

      - shell: /usr/bin/billybass --mode="take me to the river"
        register: bass_result
        changed_when: "bass_result.rc != 2"

      # this will never report 'changed' status
      - shell: wall 'beep'
        changed_when: False

You can also combine multiple conditions to override the "changed" result::

    - command: /bin/fake_command
      register: result
      ignore_errors: True
      changed_when:
        - '"ERROR" in result.stderr'
        - result.rc == 2

Aborting the play
`````````````````

Sometimes it's desirable to abort the entire play on failure, not just skip remaining tasks for a host.

The ``any_errors_fatal`` option will end the play and prevent any subsequent plays from running. When an error is encountered, all hosts in the current batch are given the opportunity to finish the fatal task and then the execution of the play stops. ``any_errors_fatal`` can be set at the play or block level::

     - hosts: somehosts
       any_errors_fatal: true
       roles:
         - myrole

     - hosts: somehosts
       tasks:
         - block:
             - include_tasks: mytasks.yml
           any_errors_fatal: true

For finer-grained control, ``max_fail_percentage`` can be used to abort the run after a given percentage of hosts has failed.
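
A sketch of how the setting might look, assuming a hypothetical ``webservers`` group, a hypothetical command, and illustrative values for ``serial`` and ``max_fail_percentage``::

    - hosts: webservers
      max_fail_percentage: 30
      serial: 10
      tasks:
        - name: a task that may fail on some hosts
          command: /usr/bin/example-command

With these values, the play aborts when more than 3 of the 10 hosts in a batch fail, because the percentage must be exceeded, not just reached.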

Using blocks
````````````

Most of what you can apply to a single task (with the exception of loops) can be applied at the :ref:`playbooks_blocks` level, which also makes it much easier to set data or directives common to the tasks.
Blocks also introduce the ability to handle errors in a way similar to exceptions in most programming languages.
Blocks only deal with the 'failed' status of a task. A bad task definition or an unreachable host is not a 'rescuable' error::

    tasks:
    - name: Handle the error
      block:
        - debug:
            msg: 'I execute normally'
        - name: i force a failure
          command: /bin/false
        - debug:
            msg: 'I never execute, due to the above task failing, :-('
      rescue:
        - debug:
            msg: 'I caught an error, can do stuff here to fix it, :-)'

This will 'revert' the failed status of the outer ``block`` task for the run and the play will continue as if it had succeeded.
See :ref:`block_error_handling` for more examples.
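
A block can also carry an ``always`` section, which runs whether or not the tasks in ``block`` failed. The following is a minimal sketch with illustrative task names and messages::

    tasks:
    - name: Handle the error and always clean up
      block:
        - name: i force a failure
          command: /bin/false
      rescue:
        - debug:
            msg: 'I run only because a task in the block failed'
      always:
        - debug:
            msg: 'I run regardless of what happened in block and rescue'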

.. seealso::

   :ref:`playbooks_intro`
       An introduction to playbooks
   :ref:`playbooks_best_practices`
       Best practices in playbooks
   :ref:`playbooks_conditionals`
       Conditional statements in playbooks
   :ref:`playbooks_variables`
       All about variables
   `User Mailing List <https://groups.google.com/group/ansible-devel>`_
       Have a question?  Stop by the google group!
   `irc.freenode.net <http://irc.freenode.net>`_
       #ansible IRC chat channel