Starting with Ansible version 2.1, you can now use the familiar Ansible models of playbook authoring and module development to manage heterogeneous networking devices. Ansible supports a growing number of network devices using both CLI over SSH and API (when available) transports.
This section discusses how to debug and troubleshoot network modules in Ansible 2.3.
How to troubleshoot
===================
This section covers troubleshooting issues with Network Modules.
Errors generally fall into one of the following categories:
:Authentication issues:
* Not correctly specifying credentials
* Remote device (network switch/router) not falling back to other authentication methods
* SSH key issues
:Timeout issues:
* Can occur when trying to pull a large amount of data
* May actually be masking an authentication issue
:Playbook issues:
* Use of ``delegate_to``, instead of ``ProxyCommand``
* Not using ``connection: local``
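
A minimal sketch of a play that sets ``connection: local``. The group name ``switches`` and the choice of ``eos_command`` are assumptions made for illustration and do not come from this document:

.. code-block:: yaml

   # Hypothetical play: "switches" is an assumed inventory group.
   - hosts: switches
     connection: local        # run the network module on the control node
     gather_facts: no
     tasks:
       - name: run a single show command
         eos_command:
           commands:
             - show version
         register: result

If a play like this still fails, the log file described in this section is usually the quickest way to tell an authentication problem from a timeout.
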
.. warning:: ``unable to open shell``

   The ``unable to open shell`` message is new in Ansible 2.3. It means that the ``ansible-connection`` daemon has not been able to successfully
   talk to the remote network device. This generally means that there is an authentication issue. See the "Authentication and connection issues" section
2017-03-30 13:20:15,321 p=28990 u=fred | ssh connection has completed successfully
2017-03-30 13:20:15,322 p=28990 u=fred | connection established to veos01 in 0:00:22.580626
From the log notice:
* ``p=28990`` is the PID (process ID) of the ``ansible-connection`` process
* ``u=fred`` is the user *running* Ansible, not the remote user you are attempting to connect as
* ``creating new control socket for host veos01:22 as user admin`` shows the host:port and the remote user
* ``control socket path is`` shows the location on disk where the persistent connection socket is created
* ``using connection plugin network_cli`` informs you that a persistent connection is being used
* ``connection established to veos01 in 0:00:22.580626`` shows the time taken to obtain a shell on the remote device
.. note:: Port None ``creating new control socket for host veos01:None``

   If the log reports the port as ``None``, this means that the default port is being used.
   A future Ansible release will improve this message so that the port is always logged.
Because the log files are verbose, you can use ``grep`` to look for specific information. For example, once you have identified the ``pid`` from the ``creating new control socket for host`` line, you can search for other connection log entries::

  grep "p=28990" $ANSIBLE_LOG_PATH
Isolating an error
------------------
**Platforms:** Any
As with any troubleshooting effort, it's important to simplify the test case as much as possible.
An *ad-hoc* command runs Ansible to perform a quick task using ``/usr/bin/ansible``, rather than the orchestration language, which is ``/usr/bin/ansible-playbook``. In this case, we can confirm connectivity by attempting to execute a single command on the remote device::
The ``unable to open shell`` message is new in Ansible 2.3. This message means that the ``ansible-connection`` daemon has not been able to successfully talk to the remote network device. This generally means that there is an authentication issue. It is a "catch all" message, meaning you need to enable ``ANSIBLE_LOG_PATH`` to find the underlying issues.
Follow the steps detailed in enable_network_logging_.
Once you've identified the error message from the log file, the specific solution can be found in the rest of this document.
Error: "[Errno -2] Name or service not known"
---------------------------------------------
**Platforms:** Any
Indicates that the remote host you are trying to connect to cannot be reached.
For example:
.. code-block:: yaml

   2017-04-04 11:39:48,147 p=15299 u=fred | control socket path is /home/fred/.ansible/pc/ca5960d27a
   2017-04-04 11:39:48,147 p=15299 u=fred | current working directory is /home/fred/git/ansible-inc/stable-2.3/test/integration
   2017-04-04 11:39:48,147 p=15299 u=fred | using connection plugin network_cli
   2017-04-04 11:39:48,340 p=15299 u=fred | connecting to host veos01 returned an error
   2017-04-04 11:39:48,340 p=15299 u=fred | [Errno -2] Name or service not known

Suggestions to resolve:
* If you are using the ``provider:`` options, ensure that its suboption ``host:`` is set correctly.
* If you are using neither ``provider:`` nor top-level arguments, ensure your inventory file is correct.
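
As a hedged illustration of the first suggestion, a task that passes its connection details through ``provider:`` might look like the sketch below. The variable names ``net_username`` and ``net_password`` are hypothetical, and the choice of ``ios_command`` is an assumption:

.. code-block:: yaml

   - name: run a single show command
     ios_command:
       commands:
         - show version
       provider:
         host: "{{ inventory_hostname }}"   # must be resolvable from the control node
         username: "{{ net_username }}"     # hypothetical variable
         password: "{{ net_password }}"     # hypothetical variable

If ``host:`` points at a name that cannot be resolved, the log is likely to show the ``[Errno -2] Name or service not known`` entry seen above.
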
Error: "Authentication failed"
------------------------------
**Platforms:** Any
Occurs if the credentials (username, passwords, or SSH keys) passed to ``ansible-connection`` (via ``ansible`` or ``ansible-playbook``) cannot be used to connect to the remote device.
For example:
.. code-block:: yaml

   <ios01> ESTABLISH CONNECTION FOR USER: cisco on PORT 22 TO ios01
   <ios01> Authentication failed.

Suggestions to resolve:
If you are specifying credentials via ``password:`` (either directly or via ``provider:``) or the environment variable ``ANSIBLE_NET_PASSWORD``, it is possible that ``paramiko`` (the Python SSH library that Ansible uses) is using SSH keys, and therefore the credentials you are specifying are being ignored. To find out if this is the case, disable "look for keys" as follows:
.. code-block:: sh

   export ANSIBLE_PARAMIKO_LOOK_FOR_KEYS=False

To make this a permanent change, add the following to your ``ansible.cfg`` file:
.. code-block:: ini

   [paramiko_connection]
   look_for_keys = False

Error: "connecting to host <hostname> returned an error" or "Bad address"
---------------------------------------------------------------------------
This may occur if the SSH fingerprint hasn't been added to the known hosts file used by Paramiko (the Python SSH library).
When using persistent connections with Paramiko, the connection runs in a background process. If the host doesn't already have a valid SSH key, by default Ansible will prompt to add the host key. This will cause connections running in background processes to fail.
For example:
.. code-block:: yaml

   2017-04-04 12:06:03,486 p=17981 u=fred | using connection plugin network_cli
   2017-04-04 12:06:04,680 p=17981 u=fred | connecting to host veos01 returned an error

In Ansible 2.3, persistent connection sockets are stored in ``~/.ansible/pc`` for all network devices. When an Ansible playbook runs, the persistent socket connection is displayed when verbose output is specified.
If the user requires a password to go into privileged mode, this can be specified with ``auth_pass``; if ``auth_pass`` isn't set, the environment variable ``ANSIBLE_NET_AUTH_PASS`` will be used instead.
Add ``authorize: yes`` to the task. For example:

.. code-block:: yaml

   - name: configure hostname
     ios_system:
       hostname: foo              # module option, not a provider suboption
       provider:
         authorize: yes           # enter privileged mode before configuring
         auth_pass: "{{ mypasswordvar }}"
     register: result

.. delegate_to not honoured
   ------------------------

   FIXME Do we get an error message

   FIXME Link to howto

fixmes
======
Error: "number of connection attempts exceeded, unable to connect to control socket"