Why is Ansible not failing this string in stdout conditional?

I am running Ansible version 2.7 on CentOS 7 using the network_cli connection method.
I have a playbook that:
Instructs a networking device to pull in a new firmware image via TFTP
Instructs the networking device to calculate the md5 hash value
Stores the output of the calculation in .stdout
Has a conditional when: statement that checks for a given md5 value in the .stdout before proceeding with the task block.
No matter what md5 value I give, it always runs the task block.
The conditional statement is:
when: '"new_ios_md5" | string in md5_result.stdout'
Here is the full playbook:
- name: UPGRADE SUP8L-E SWITCH FIRMWARE
  hosts: switches
  connection: network_cli
  gather_facts: no
  vars_prompt:
    - name: "compliant_ios_version"
      prompt: "What is the compliant IOS version?"
      private: no
    - name: "new_ios_bin"
      prompt: "What is the name of the new IOS file?"
      private: no
    - name: "new_ios_md5"
      prompt: "What is the MD5 value of the new IOS file?"
      private: no
    - name: "should_reboot"
      prompt: "Do you want Ansible to reboot the hosts? (YES or NO)"
      private: no
  tasks:
    - name: GATHER SWITCH FACTS
      ios_facts:

    - name: UPGRADE IOS IMAGE IF NOT COMPLIANT
      block:
        - name: COPY OVER IOS IMAGE
          ios_command:
            commands:
              - command: "copy tftp://X.X.X.X/45-SUP8L-E/{{ new_ios_bin }} bootflash:"
                prompt: '[{{ new_ios_bin }}]'
                answer: "\r"
          vars:
            ansible_command_timeout: 1800

        - name: CHECK MD5 HASH
          ios_command:
            commands:
              - command: "verify /md5 bootflash:{{ new_ios_bin }}"
          register: md5_result
          vars:
            ansible_command_timeout: 300

        - name: CONTINUE UPGRADE IF MD5 HASH MATCHES
          block:
            - name: SETTING BOOT IMAGE
              ios_config:
                lines:
                  - no boot system
                  - boot system flash bootflash:{{ new_ios_bin }}
                match: none
                save_when: always

            - name: REBOOT SWITCH IF INSTRUCTED
              block:
                - name: REBOOT SWITCH
                  ios_command:
                    commands:
                      - command: "reload"
                        prompt: '[confirm]'
                        answer: "\r"
                  vars:
                    ansible_command_timeout: 30

                - name: WAIT FOR SWITCH TO RETURN
                  wait_for:
                    host: "{{ inventory_hostname }}"
                    port: 22
                    delay: 60
                    timeout: 600
                  delegate_to: localhost

                - name: GATHER ROUTER FACTS FOR VERIFICATION
                  ios_facts:

                - name: ASSERT THAT THE IOS VERSION IS CORRECT
                  assert:
                    that:
                      - compliant_ios_version == ansible_net_version
                    msg: "New IOS version matches compliant version. Upgrade successful."
              when: should_reboot == "YES"
          when: '"new_ios_md5" | string in md5_result.stdout'
      when: ansible_net_version != compliant_ios_version
...
The other two conditionals in the playbook work as expected. I cannot figure out how to get ansible to fail the when: '"new_ios_md5" | string in md5_result.stdout' conditional and stop the play if the md5 value is wrong.
When you run the play with debug output the value of stdout is:
"stdout": [
".............................................................................................................................................Done!",
"verify /md5 (bootflash:cat4500es8-universalk9.SPA.03.10.02.E.152-6.E2.bin) = c1af921dc94080b5e0172dbef42dc6ba"
]
You can clearly see the calculated md5 in the string but my conditional doesn't seem to care either way.
Does anyone have any advice?

When you write:
when: '"new_ios_md5" | string in md5_result.stdout'
you are looking for the literal string "new_ios_md5" inside the variable md5_result.stdout. Since you actually want to refer to the new_ios_md5 variable, you need to remove the quotes around it:
when: 'new_ios_md5 | string in md5_result.stdout'

Credit goes to zoredache on reddit for the final solution:
BTW, you know that for most of the networking modules such as ios_command, the results come back as a list, right? So you need to index into the list relative to the command you ran.
Say you had this task:
ios_command:
  commands:
    - command: "verify /md5 bootflash:{{ new_ios_bin }}"
    - command: show version
    - command: show config
register: results
You would have output in the list like this.
# results.stdout[0] = verify
# results.stdout[1] = show version
# results.stdout[2] = show config
So the correct conditional statement would be:
when: 'new_ios_md5 in md5_result.stdout[0]'
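
Putting both answers together, a fail-fast version of the check could be sketched like this (the task name is mine; assert aborts the play for that host when the calculated hash does not contain the prompted value):

```yaml
- name: ABORT UPGRADE IF MD5 HASH DOES NOT MATCH
  assert:
    that:
      - new_ios_md5 in md5_result.stdout[0]
    msg: "Calculated MD5 does not match {{ new_ios_md5 }} -- stopping the upgrade."
```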

Related

AWX playbook failing with "network os cisco.iosxr.iosxr is not supported"

I have a playbook that works just great locally, but when trying to run it with AWX I run into an error that seems to indicate the device type in the task is not supported.
Loading collection ansible.netcommon from /runner/requirements_collections/ansible_collections/ansible/netcommon
Loading callback plugin awx_display of type stdout, v2.0 from /usr/local/lib/python3.8/site-packages/ansible_runner/callbacks/awx_display.py
Skipping callback 'awx_display', as we already have a stdout callback.
Skipping callback 'default', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.
PLAYBOOK: soft_reset.yml *******************************************************
Positional arguments: soft_reset.yml
verbosity: 4
remote_user: root
connection: smart
timeout: 10
become: True
become_method: sudo
tags: ('all',)
inventory: ('/runner/inventory/hosts',)
extra_vars: ('@/runner/env/extravars',)
forks: 5
1 plays in soft_reset.yml
PLAY [aggs] ****************************************************************
META: ran handlers
TASK [soft reset bgp peers] ****************************************************
task path: /runner/project/soft_reset.yml:5
fatal: [lab-core-blue]: FAILED! => {
"msg": "network os cisco.iosxr.iosxr is not supported"
}
inventory/hosts.yml
aggs:
  hosts:
    lab-core-blue:
      ansible_host: lab-core-blue.mylab.com
  vars:
    ansible_network_os: cisco.iosxr.iosxr
    ansible_connection: ansible.netcommon.network_cli
soft_reset.yml
- hosts: aggs
  gather_facts: no
  tasks:
    - name: soft reset bgp peers
      ansible.netcommon.cli_command:
        command: clear bgp vpnv4 unicast {{ item }} soft
      loop:
        - 1.2.3.2
        - 1.2.3.3
        - 1.2.3.4
collections/requirements.yml
---
collections:
  - ansible.netcommon
Thanks to #sivel in the #ansible IRC channel for the answer!
I think the AWX EEs only have ansible-core by default; you need cisco.iosxr in that list.
Turns out I needed to add a line to collections/requirements.yml
---
collections:
  - ansible.netcommon
  - cisco.iosxr
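
For reproducible AWX environment builds you can also pin collection versions in the same file; requirements.yml accepts either bare names or name/version entries (the version below is illustrative, pick whatever your playbooks are tested against):

```yaml
---
collections:
  - name: ansible.netcommon
  - name: cisco.iosxr
    version: ">=2.0.0"   # illustrative pin, not a recommendation
```

Locally you can verify the file with `ansible-galaxy collection install -r collections/requirements.yml` before pushing it to AWX.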

Write a playbook which accesses the network devices present in the host file and checks the host OS of each device (Nexus/Cisco/Arista)

I am writing a playbook which dynamically enters the network servers provided by the user at a prompt into the host file. Further, I want to create a task which will SSH to the servers one by one and run the show version command to fetch the OS of those network devices.
As of now, I am getting the below error:
1. "Unable to automatically determine host network os. Please manually configure ansible_network_os value for this host"
I don't want to define the OS beforehand; rather, I want the playbook to do it.
I tried entering some configurations in host file as below:
[device]
192.1xx.xxx.xx
[all:vars]
ansible_connection=network_cli
ansible_user=user
ansible_ssh_pass=password
ssh_args = -o GSSAPIAuthentication=no
The below playbook takes a device name from the user and adds it to the file; further, it needs to SSH to those servers and fetch the OS details:
---
- name: add host dynamically
  hosts: localhost
  gather_facts: no
  vars:
    Server: null_val1
  vars_prompt:
    - name: "Server"
      prompt: "Please enter the non-reporting server/IP"
      private: no
      default: null_val1
  pre_tasks:
    - name: Save variable
      set_fact:
        Server: "{{ Server }}"

- name: Copying servers to host file
  hosts: localhost
  become: true
  tasks:
    - name: copying variable value
      lineinfile:
        path: "{{ playbook_dir }}/hosts.txt"
        line: "{{ Server }}"
        insertafter: '^\[device\]'
        state: present

- name: Main execution task
  hosts: device
  gather_facts: true
  tasks:
    - name: Run command on devices
      cli_command:
        command: show version
      register: result
    - name: display result
      debug:
        var: result.stdout_lines
The actual result should show me the OS of network device(s) entered in host file by running show version command remotely.
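
Note that network_cli cannot autodetect the platform, which is exactly what the error message is saying. One workaround (a sketch; it assumes you know, or can map, the OS for the prompted device) is to skip the host file entirely and add the host in memory with an explicit ansible_network_os:

```yaml
- name: add the prompted server with an explicit network OS
  add_host:
    name: "{{ Server }}"
    groups: device
    ansible_connection: network_cli
    ansible_network_os: ios   # assumption: replace with nxos/eos as appropriate
```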

How to repeat an Ansible task until the result is failed + show timestamps of every retry?

I am trying to solve a network automation issue. We see strange behaviour from network devices (SNOM phones) connected in a chain to a certain Cisco switch port.
One of these phones (a different one every time) disappears randomly, and after that the device can't get an IP address via DHCP. We still have not found a way to reproduce the issue, so I've enabled debug logs on the DHCP server and am now waiting until one of the MAC addresses disappears from the switch interface's mac address table.
And as Cisco does not support the Linux 'watch' command, I wrote a simple Ansible playbook for this purpose:
---
- name: show mac address-table
  hosts: ios
  gather_facts: no
  tasks:
    - name: show mac address-table interface Fa0/31
      ios_command:
        commands: show mac address-table interface Fa0/31
        wait_for:
          - result[0] contains 0004.1341.799e
          - result[0] contains 0004.134a.f67d
          - result[0] contains 0004.138e.1a53
      register: result
      until: result is failed
      retries: 1000
    - debug: var=result
But in that configuration I see only
FAILED - RETRYING: show mac address-table interface Fa0/31 (660 retries left).
FAILED - RETRYING: show mac address-table interface Fa0/31 (659 retries left).
FAILED - RETRYING: show mac address-table interface Fa0/31 (658 retries left).
FAILED - RETRYING: show mac address-table interface Fa0/31 (657 retries left).
at the output.
I've tried the anstomlog callback plugin, but it shows a timestamp only once the condition succeeds (i.e. in my case, when the result is failed).
So, I am looking for advice on how to achieve both goals:
run the task forever until its status becomes failed
write a timestamp for every single retry
Thanks in advance!
It's better to rewrite it as a normal loop (with include_tasks) and report all the information you need in that task.
Relying on 'retry' as a watchdog is not a great idea.
Moreover, I think it's better to rewrite it as an independent program. If you are worried about SSH access to the switch, netmiko is a great library of ready-to-use quirks for all kinds of network devices. It has a send_command method to execute commands on switches.
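
A minimal sketch of that include_tasks loop (file name, variable names, and the stop condition are illustrative): an included task file logs a timestamp, runs the check, and re-includes itself while the MAC is still in the table:

```yaml
# watch.yml -- the main playbook includes this once; it re-includes itself
- name: log attempt time
  debug:
    msg: "checked at {{ lookup('pipe', 'date +%F_%T') }}"

- name: show mac address-table interface Fa0/31
  ios_command:
    commands: show mac address-table interface Fa0/31
  register: watch_result

- name: keep watching while the MAC is still present
  include_tasks: watch.yml
  when: "'0004.1341.799e' in watch_result.stdout[0]"
```

Be aware that each recursion accumulates task results in memory, so a run that iterates for days may grow the controller process noticeably.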
Well, as the initial question was about Ansible, I solved the issue just by saving the timestamp, getting the DHCP log from the router, and filtering the log by timestamp and MAC addresses:
---
- name: Find switch port by host ip address
  hosts: all
  gather_facts: no
  connection: local
  roles:
    - Juniper.junos
  vars:
    systime: "{{ ansible_date_time.time }}"
    timestamp: "{{ ansible_date_time.date }}_{{ systime }}"
    connection_settings:
      host: "{{ ansible_host }}"
      timeout: 120
    snom_mac_addresses:
      - '00_04:13_41:79_9e'
      - '00_04:13_4a:f6_7d'
      - '00_04:13_8e:1a_53'
  tasks:
    - name: show mac address-table interface Fa0/31
      ios_command:
        commands: show mac address-table interface Fa0/31
        wait_for:
          - result[0] contains {{ snom_mac_addresses[0] | replace(':', '.') | replace('_', '') }}
          - result[0] contains {{ snom_mac_addresses[1] | replace(':', '.') | replace('_', '') }}
          - result[0] contains {{ snom_mac_addresses[2] | replace(':', '.') | replace('_', '') }}
      register: result
      until: result is failed
      retries: 1000
      ignore_errors: True
      when: inventory_hostname == 'access-switch'

    - name: save timestamp in Junos format
      set_fact:
        junos_timestamp: "{{ lookup('pipe','date +%b_%_d_%H:%M') | replace('_', ' ') }}"
      run_once: yes
      delegate_to: localhost

    - debug:
        var: junos_timestamp
      run_once: yes
      delegate_to: localhost

    - name: get dhcp log from router
      junos_scp:
        provider: "{{ connection_settings }}"
        src: /var/log/dhcp-service.log
        remote_src: true
      when: inventory_hostname == 'router'

    - name: filter log for time
      run_once: yes
      shell: "egrep -i '{{ junos_timestamp }}' dhcp-service.log"
      register: grep_time_output
      delegate_to: localhost

    - debug: var=grep_time_output.stdout_lines

    - name: filter log for time and mac
      run_once: yes
      shell: "egrep -i '{{ snom_mac_addresses | join('|') | replace(':', ' ') | replace('_', ' ') }}' dhcp-service.log"
      register: grep_mac_output
      delegate_to: localhost

    - debug: var=grep_mac_output.stdout_lines
It's certainly not an elegant solution, but at least I did all the work within a single Ansible environment, and anyone could re-use part of my code without significant refactoring.
Just one caveat: I had to use my own format for the MAC addresses, because the Cisco and Juniper debug logs print them differently:
Juniper debug log:
Mar 6 13:14:19.582886 [MSTR][DEBUG] client_key_compose: Composing key (0x1c6aa00) for cid_l 7, cid d4 a3 3d a1 e2 38, mac d4 a3 3d a1 e2 38, htype 1, subnet 10.111.111.1, ifindx 0, opt82_l 0, opt82 NULL
Cisco:
30 0004.133d.39fb DYNAMIC Po1
But maybe there is a clever way to handle all different formats for mac addresses in Ansible.
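
There is in fact a filter for this: the hwaddr filter (from the ipaddr filter set; it requires the Python netaddr library, and in newer Ansible releases ships in the ansible.utils collection) converts between common MAC notations, which may be cleaner than chained replace() calls:

```yaml
- name: normalise a MAC into Cisco dotted notation
  debug:
    msg: "{{ 'd4:a3:3d:a1:e2:38' | hwaddr('cisco') }}"   # -> d4a3.3da1.e238
```

Supported formats include 'cisco', 'unix', 'linux', and 'bare', so one canonical variable can be rendered per-platform instead of inventing a custom encoding.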

SSH connectivity issues with ntc-ansible modules

I am trying to use the ntc-ansible module with Ansible running on Ubuntu (WSL). I have SSH connectivity to my remote device (Cisco 2960X), and I can run Ansible playbooks against the same remote switch using the built-in Ansible networking modules (ios_command) and it works fine.
Issue:
When I try to run any of the ntc-ansible modules, it fails, unable to connect to the device. Probably something simple, but I have hit a wall. There is something I am missing about how to use ntc-ansible modules. Ansible is seeing the modules as I can look at the docs as was suggested as a test in the readme.
I have ntc-ansible module installed here: /home/melshman/.ansible/plugins/modules/ntc-ansible
I am running my playbooks from here: ~/projects/ansible/
The first time I ran the playbook with the ntc-ansible modules it failed and based on error message and some research I installed sshpass (sudo apt-get install sshpass). But still having ssh problems using ntc-ansible… (playbook and traceback below)
I hear folks taking about an index file, but I can’t find that file? Where does it live and what do I need to do with it?
What is my connection supposed to be setup to be? Local? SSH? Netmiko_ssh?
What should I be using for platform? Cisco_ios? cisco_ios_ssh?
Appreciate any help I can get. I have been running in circles for hours and hours.
Ansible Version Info:
VTMNB17024:~/projects/ansible $ ansible --version
ansible 2.5.3
config file = /home/melshman/projects/ansible/ansible.cfg
configured module search path = [u'/home/melshman/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/lib/python2.7/dist-packages/ansible
executable location = /usr/local/bin/ansible
python version = 2.7.12 (default, Dec 4 2017, 14:50:18) [GCC 5.4.0 20160609]
Working playbook (ios_command); note: ansible_ssh_pass and ansible_user are defined in group vars:
- name: Test Net Automation
  hosts: ctil-ios-upgrade
  connection: local
  gather_facts: no
  tasks:
    - name: Grab run config
      ios_command:
        commands:
          - show run
      register: config
    - name: Create backup of running configuration
      copy:
        content: "{{ config.stdout[0] }}"
        dest: "backups/show_run_{{ inventory_hostname }}.txt"
Playbook (not working) using the ntc-ansible module (note: username and password are defined in group vars):
- name: Cisco IOS Automation
  hosts: ctil-ios-upgrade
  connection: local
  gather_facts: no
  tasks:
    - name: GET UPTIME
      ntc_show_command:
        connection: ssh
        platform: "cisco_ios"
        command: 'show version | inc uptime'
        host: "{{ inventory_hostname }}"
        username: "{{ username }}"
        password: "{{ password }}"
        use_templates: True
        template_dir: /home/melshman/.ansible/plugins/modules/ntc-ansible/ntc-templates/templates
Here is the traceback I get when the error occurs:
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: netmiko.ssh_exception.NetMikoTimeoutException: Connection to device timed-out: cisco_ios VTgroup_SW:22
fatal: [VTgroup_SW]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n File \"/tmp/ansible_RJRY9m/ansible_module_ntc_save_config.py\", line 279, in \n main()\n File \"/tmp/ansible_RJRY9m/ansible_module_ntc_save_config.py\", line 251, in main\n device = ntc_device(device_type, host, username, password, **kwargs)\n File \"/usr/local/lib/python2.7/dist-packages/pyntc-0.0.6-py2.7.egg/pyntc/__init__.py\", line 35, in ntc_device\n return device_class(*args, **kwargs)\n File \"/usr/local/lib/python2.7/dist-packages/pyntc-0.0.6-py2.7.egg/pyntc/devices/ios_device.py\", line 39, in __init__\n self.open()\n File \"/usr/local/lib/python2.7/dist-packages/pyntc-0.0.6-py2.7.egg/pyntc/devices/ios_device.py\", line 55, in open\n verbose=False)\n File \"build/bdist.linux-x86_64/egg/netmiko/ssh_dispatcher.py\", line 178, in ConnectHandler\n File \"build/bdist.linux-x86_64/egg/netmiko/base_connection.py\", line 207, in __init__\n File \"build/bdist.linux-x86_64/egg/netmiko/base_connection.py\", line 693, in establish_connection\nnetmiko.ssh_exception.NetMikoTimeoutException: Connection to device timed-out: cisco_ios VTgroup_SW:22\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 1}
Here is a working solution using ntc_show_command to a Cisco IOS device.
- name: Cisco IOS Automation
  hosts: pynet-rtr1
  connection: local
  gather_facts: no
  tasks:
    - name: GET UPTIME
      ntc_show_command:
        connection: ssh
        platform: "cisco_ios"
        command: 'show version'
        host: "{{ ansible_host }}"
        username: "{{ ansible_user }}"
        password: "{{ ansible_ssh_pass }}"
        use_templates: True
        template_dir: '/home/kbyers/ntc-templates/templates'
If you are going to use ntc-templates, I probably would not have the '| include uptime' in the 'show version'. In other words, let TextFSM convert the output to structured data first and then grab the uptime from that structured data.
I modified inventory_hostname to ansible_host to be consistent with my inventory format (my inventory_hostname doesn't actually resolve in DNS).
I modified username and password to 'ansible_user' and 'ansible_ssh_pass' to be consistent with my inventory and also to be more consistent with Ansible 2.5/2.6 variable naming.
On your above issue, your exception message does not match your playbook (i.e. are you sure that is the exception you get for that playbook).
Here is my inventory file (I simplified this to remove some unnecessary devices and to hide confidential information)
[all:vars]
ansible_connection=local
ansible_python_interpreter=/home/kbyers/VENV/ansible/bin/python
ansible_user=user
ansible_ssh_pass=password
[local]
localhost ansible_connection=local
[cisco]
pynet-rtr1 ansible_host=cisco1.domain.com
pynet-rtr2 ansible_host=cisco2.domain.com

OS::Heat::SoftwareDeployment is staying stuck in CREATE_IN_PROGRESS status

I am trying to customise new instances created within OpenStack Mitaka, using Heat templates. Using OS::Nova::Server with a script in user_data works fine.
Next, the idea is to do additional steps via OS::Heat::SoftwareConfig.
The config is:
type: OS::Nova::Server
....
user_data_format: SOFTWARE_CONFIG
user_data:
  str_replace:
    template:
      get_file: vm_init1.sh

config1:
  type: OS::Heat::SoftwareConfig
  depends_on: vm
  properties:
    group: script
    config: |
      #!/bin/bash
      echo "Running $0 OS::Heat::SoftwareConfig look in /var/tmp/test_script.log" | tee /var/tmp/test_script.log

deploy:
  type: OS::Heat::SoftwareDeployment
  properties:
    config:
      get_resource: config1
    server:
      get_resource: vm
The instance is set up nicely (the script vm_init1.sh above runs fine) and one can log in, but the "config1" example above is never executed.
Analysis
- The base image is Ubuntu 16.04, created with disk-image-create and including "vm ubuntu os-collect-config os-refresh-config os-apply-config heat-config heat-config-script"
- From "openstack stack resource list $vm" one sees that the deployment never finishes, with OS::Heat::SoftwareDeployment status=CREATE_IN_PROGRESS
- "openstack stack resource show $vm config1" shows resource_status=CREATE_COMPLETE
- Within the VM, /var/log/cloud-init-output.log shows the output of the script vm_init1.sh, but no trace of the 'config1' script. The log os-apply-config.log is empty; is that normal?
How does one troubleshoot OS::Heat::SoftwareDeployment configs?
(I have read https://docs.openstack.org/developer/heat/template_guide/software_deployment.html#software-deployment-resources)
