Starting with Ansible version 2.1, you can now use the familiar Ansible models of playbook authoring and module development to manage heterogeneous networking devices. Ansible supports a growing number of network devices using both CLI over SSH and API (when available) transports.
This section discusses how to debug and troubleshoot network modules in Ansible 2.3.

How to troubleshoot
===================

This section covers troubleshooting issues with network modules.

Errors generally fall into one of the following categories:

:Authentication issues:
  * Not correctly specifying credentials
  * Remote device (network switch/router) not falling back to other authentication methods
  * SSH key issues
:Timeout issues:
  * Can occur when trying to pull a large amount of data

Because logging is very verbose, it is disabled by default. You can enable it with the :envvar:`ANSIBLE_LOG_PATH` and :envvar:`ANSIBLE_DEBUG` environment variables on the Ansible controller, that is, the machine running ``ansible-playbook``.
Before running ``ansible-playbook`` run the following commands to enable logging::
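A minimal sketch of those commands, assuming a POSIX-style shell on the controller. The log location ``$HOME/ansible.log`` is only an illustrative choice; any writable path should work.

```shell
# Specify the location for the log file (illustrative path)
export ANSIBLE_LOG_PATH="$HOME/ansible.log"
# Enable debug output
export ANSIBLE_DEBUG=True
```

With both variables exported, run ``ansible-playbook`` with ``-vvvv`` to get connection-level verbosity in the same shell session.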
Because the log files are verbose, you can use grep to look for specific information. For example, once you have identified the ``pid`` from the ``creating new control socket for host`` line you can search for other connection log entries::
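As a sketch of that search, the helper below prints every log line tagged with a given ``p=<pid>`` field. The name ``log_entries_for_pid`` is hypothetical and not part of Ansible, and the helper assumes log lines follow the ``p=<pid> u=<user> |`` layout seen in the log excerpts in this document.

```shell
# Hypothetical helper: print all log entries written by one process.
# $1: pid taken from the "creating new control socket for host" line
# $2: log file to search (defaults to $ANSIBLE_LOG_PATH)
log_entries_for_pid() {
    # -F treats the pattern as a fixed string; the trailing space
    # keeps pid 1859 from also matching pid 18591
    grep -F "p=$1 " "${2:-$ANSIBLE_LOG_PATH}"
}
```

For example, ``log_entries_for_pid 18591`` searches the file named by ``ANSIBLE_LOG_PATH`` for entries from process 18591.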
Ansible 2.8 added logging of device interaction to the log file to help diagnose and troubleshoot
issues with Ansible network modules. The messages are written to the file specified by the ``log_path``
option in the Ansible configuration file or by the :envvar:`ANSIBLE_LOG_PATH` environment variable, as described in the previous section.

.. warning::
  The device interaction messages consist of the commands executed on the target device and the returned responses. Because this
  log data can contain sensitive information, including passwords in plain text, it is disabled by default.
  Additionally, to prevent accidental leakage of data, a warning is shown on every task with this
  setting enabled, specifying which host has it enabled and where the data is being logged.

Be sure to fully understand the security implications of enabling this option. Device interaction logging can be enabled globally, either in the configuration file or through an environment variable, or on a per-task basis by passing a special variable to the task.

Before running ``ansible-playbook``, run the following commands to enable logging::

   # Specify the location for the log file
   export ANSIBLE_LOG_PATH=~/ansible.log

Enable device interaction logging for a given task:

.. code-block:: yaml

   - name: get version information
     ios_command:
       commands:
         - show version
     vars:
       ansible_persistent_log_messages: True

To make this a global setting, add the following to your ``ansible.cfg`` file:

.. code-block:: ini

   [persistent_connection]
   log_messages = True

Alternatively, set the :envvar:`ANSIBLE_PERSISTENT_LOG_MESSAGES` environment variable::

   # Enable device interaction logging
   export ANSIBLE_PERSISTENT_LOG_MESSAGES=True

If the task fails during connection initialization, enable this option globally. If an individual task fails
intermittently, enable the option for that task only to find the root cause.

After Ansible has finished running, you can inspect the log file that was created on the Ansible controller.

.. note:: Be sure to fully understand the security implications of enabling this option, as it can log sensitive
          information in the log file and thus create a security vulnerability.

`ad-hoc` refers to running Ansible to perform some quick command using ``/usr/bin/ansible``, rather than the orchestration language, which is ``/usr/bin/ansible-playbook``. In this case we can ensure connectivity by attempting to execute a single command on the remote device::
The ``socket_path does not exist or cannot be found`` and ``unable to connect to socket`` messages are new in Ansible 2.5. These messages indicate that the socket used to communicate with the remote network device is unavailable or does not exist.
For example:

.. code-block:: none

   fatal: [spine02]: FAILED! => {
       "changed": false,
       "failed": true,
       "module_stderr": "Traceback (most recent call last):\n File \"/tmp/ansible_TSqk5J/ansible_modlib.zip/ansible/module_utils/connection.py\", line 115, in _exec_jsonrpc\nansible.module_utils.connection.ConnectionError: socket_path does not exist or cannot be found\n",
       "module_stdout": "",
       "msg": "MODULE FAILURE",
       "rc": 1
   }

or

.. code-block:: none

   fatal: [spine02]: FAILED! => {
       "changed": false,
       "failed": true,
       "module_stderr": "Traceback (most recent call last):\n File \"/tmp/ansible_TSqk5J/ansible_modlib.zip/ansible/module_utils/connection.py\", line 123, in _exec_jsonrpc\nansible.module_utils.connection.ConnectionError: unable to connect to socket\n",
       "module_stdout": "",
       "msg": "MODULE FAILURE",
       "rc": 1
   }

Suggestions to resolve:

Follow the steps detailed in :ref:`enable network logging <enable_network_logging>`.

If the identified error message from the log file is:

.. code-block:: yaml

   2017-04-04 12:19:05,670 p=18591 u=fred | command timeout triggered, timeout value is 10 secs

or

.. code-block:: yaml

   2017-04-04 12:19:05,670 p=18591 u=fred | persistent connection idle timeout triggered, timeout value is 30 secs

Follow the steps detailed in :ref:`timeout issues <timeout_issues>`.

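If the log shows one of these timeouts, one option is to raise the corresponding limit before re-running the play. The sketch below uses the ``ANSIBLE_PERSISTENT_COMMAND_TIMEOUT`` and ``ANSIBLE_PERSISTENT_CONNECT_TIMEOUT`` environment variables. The values of 30 and 60 seconds are arbitrary illustrations, and the variable names are an assumption to check against the documentation for your Ansible version.

```shell
# Allow each command up to 30 seconds to return (illustrative value)
export ANSIBLE_PERSISTENT_COMMAND_TIMEOUT=30
# Keep an idle persistent connection open for 60 seconds (illustrative value)
export ANSIBLE_PERSISTENT_CONNECT_TIMEOUT=60
```

Exporting the variables affects only the current shell session, which keeps the change easy to undo while you are still diagnosing the problem.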
The ``unable to open shell`` message is new in Ansible 2.3. This message means that the ``ansible-connection`` daemon has not been able to successfully talk to the remote network device. This generally means that there is an authentication issue. It is a "catch all" message, meaning you need to enable :ref:`logging <a_note_about_logging>` to find the underlying issues.
* If you are using neither ``provider:`` nor top-level arguments, ensure your inventory file is correct.
Error: "Authentication failed"
------------------------------
**Platforms:** Any
Occurs if the credentials (username, passwords, or SSH keys) passed to ``ansible-connection`` (via ``ansible`` or ``ansible-playbook``) cannot be used to connect to the remote device.
For example:

.. code-block:: yaml

   <ios01> ESTABLISH CONNECTION FOR USER: cisco on PORT 22 TO ios01

If you are specifying credentials via ``password:`` (either directly or via ``provider:``) or the environment variable :envvar:`ANSIBLE_NET_PASSWORD`, it is possible that ``paramiko`` (the Python SSH library that Ansible uses) is using SSH keys, and therefore the credentials you are specifying are being ignored. To find out if this is the case, disable "look for keys". This can be done like this:
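One way to do this, assuming the Paramiko connection plugin in your Ansible version honours the ``ANSIBLE_PARAMIKO_LOOK_FOR_KEYS`` environment variable, is to set it before running the play:

```shell
# Stop Paramiko from trying SSH keys so the supplied password is used
export ANSIBLE_PARAMIKO_LOOK_FOR_KEYS=False
```

If authentication then succeeds, key lookup was the likely cause of the ignored credentials.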
This may occur if the SSH fingerprint hasn't been added to the known hosts file used by Paramiko (the Python SSH library).
When using persistent connections with Paramiko, the connection runs in a background process. If the host doesn't already have a valid SSH key, by default Ansible will prompt to add the host key. This will cause connections running in background processes to fail.
For example:

.. code-block:: yaml

   2017-04-04 12:06:03,486 p=17981 u=fred | using connection plugin network_cli
   2017-04-04 12:06:04,680 p=17981 u=fred | connecting to host veos01 returned an error

In Ansible 2.3, persistent connection sockets are stored in ``~/.ansible/pc`` for all network devices. When an Ansible playbook runs, the persistent socket connection is displayed when verbose output is specified.
If the user requires a password to go into privileged mode, this can be specified with ``auth_pass``; if ``auth_pass`` isn't set, the environment variable :envvar:`ANSIBLE_NET_AUTH_PASS` will be used instead.