How to handle expected prompt in ansible ios_config module - networking

I am trying to implement couple simple commands on cisco ios devices using Ansible (ios_config module).
Especially, I want to remove user profile, but it requires to answer on a prompt and I am getting timeout error...
I have noticed that there are prompt/answer parameters in ios_command module, but it seems that it is not supported in ios_config module.
Has anyone run into the similar problem?
Ansible Task:
- name: remove user on remote devices
ios_config:
lines:
- no username testuser
provider: "{{ provider }}"
Output from Cisco device:
Cisco_Router(config)#no username testuser
This operation will remove all username related configurations with same name.Do you want to continue? [confirm]
Playbook output:
TASK [remove user on remote devices] *************************************************************************************************************************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ansible.module_utils.connection.ConnectionError: timeout trying to send command: end
fatal: [Cisco_Router]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n File \"/tmp/ansible_3_OlXK/ansible_module_ios_config.py\", line 583, in <module>\n main()\n File \"/tmp/ansible_3_OlXK/ansible_module_ios_config.py\", line 512, in main\n load_config(module, commands)\n File \"/tmp/ansible_3_OlXK/ansible_modlib.zip/ansible/module_utils/network/ios/ios.py\", line 168, in load_config\n File \"/tmp/ansible_3_OlXK/ansible_modlib.zip/ansible/module_utils/connection.py\", line 149, in __rpc__\nansible.module_utils.connection.ConnectionError: timeout trying to send command: end\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 1}

Starting with Ansible 2.4 there is an ios_user module that can be used to create, edit and remove users.
Removing a specific user with state: absent
- name: set user view/role
ios_user:
name: testuser
state: absent
provider: "{{ provider }}"
The full documentation and further examples can be found at: https://docs.ansible.com/ansible/latest/modules/ios_user_module.html
_command modules and prompts
The various _command modules, including ios_command support passing prompts.
For example:
- name: run commands that require answering a prompt
ios_command:
commands:
- command: 'clear counters GigabitEthernet0/1'
prompt: 'Clear "show interface" counters on this interface \[confirm\]'
answer: 'y'
- command: 'clear counters GigabitEthernet0/2'
prompt: '[confirm]'
answer: "\r"
See https://docs.ansible.com/ansible/latest/modules/ios_command_module.html for further info.

the prompt waits for a confirmation it seems so you need to confirm the command with a second line, so you likely have to do that.
- name: remove user on remote devices
ios_config:
lines:
- no username testuser
- yes
provider: "{{ provider }}"

I have tried this as well.
It seems that ios_config module is looking for a hostname(config)# prefix after executing each line. Thats why second line is not processing at all and I got the same notification - timeout.

Related

Odd error when attempting net_put via Ansible

Looking for assistance with an odd error I am troubleshooting with a playbook.
I have a working SSH session to a switch, but having difficulty with transferring files via SCP on Ansible. I can start a SCP session directly from the same server with no issues and can transfer a text file (the same one references below) but it does not seem to work in Ansible.
I enabled verbose logging via Ansible and this is what I am seeing in the logfile generated.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/ansible/utils/jsonrpc.py", line 46, in handle_request
result = rpc_method(*args, **kwargs)
File "/root/.ansible/collections/ansible_collections/ansible/netcommon/plugins/connection/network_cli.py", line 1282, in copy_file
self.ssh_type_conn.put_file(source, destination, proto=proto)
File "/root/.ansible/collections/ansible_collections/ansible/netcommon/plugins/connection/libssh.py", line 498, in put_file
raise AnsibleError(
ansible.errors.AnsibleError: Error transferring file to flash:test.txt: Initializing SCP session of remote file [flash:test.txt] for w>
2022-10-06 11:58:35,671 p=535932 u=root n=ansible | fatal: [%remoteSwitch%]: FAILED! => {
"changed": false,
"destination": "flash:test.txt",
"msg": "Exception received: Error transferring file to flash:test.txt: Initializing SCP session of remote file [flash:test.txt] fo>
}
Afraid Google is not helping me too much with this one. If it helps, this is on Ubuntu 22.04, with Ansible 2.10.8.
Play attempting to be ran is:
- hosts: %remoteSwitch%
vars:
- firmware_image_name: "test.txt"
tasks:
- name: Copying image to the switch... This can take time, please wait...
net_put:
src: "/etc/ansible/firmware_images/C2960X/{{ firmware_image_name }}"
dest: "flash:{{ firmware_image_name }}"
vars:
ansible_command_timeout: 20
protocol: scp
I ran into this same error. I was able to fix it by switching back to Paramiko SSH. This can be accomplished by either pip uninstall ansible-pylibssh (note, this very likely has other side-effects).
Alternatively, you can force Paramiko usage at the Ansible play level:
---
- name: Test putting a file onto Cisco IOS/IOS-XE device
hosts: cisco1
# ansible-pylibssh errors out here (force paramiko usage)
vars:
ansible_network_cli_ssh_type: paramiko
tasks:
- name: Copy file
ansible.netcommon.net_put:
src: my_file1.txt
dest : flash:/my_file1.txt
protocol: scp
It would be helpful to know what type of connection it is and what platform.
I see from the file that it is a Cisco IOS device. Do you have the following settings?
ansible_connection: ansible.netcommon.network_cli
ansible_network_os: cisco.ios.ios
The following documentation mentions the need for paramiko. Will it work if you change the ssh_type to paramiko?
https://docs.ansible.com/ansible/latest/collections/ansible/netcommon/net_put_module.html
ssh_type can be set as follows:
configuration:
INI entry:
[persistent_connection]
ssh_type = paramiko
Environment variable: ANSIBLE_NETWORK_CLI_SSH_TYPE
Variable: ansible_network_cli_ssh_type
Variable: ansible_network_cli_ssh_type

Euca 5.0 Ansible Console Task Failing

Background:
I am only able to get past the ansible console install/config tasks by adding --region localhost to anywhere in: /usr/share/eucalyptus-ansible/roles/cloud-post/tasks/console.yml wherever it calls tools that take that argument.
Otherwise each sub task fails like this: ["euca-describe-images: error: connection error (('Connection aborted.', gaierror(-2, 'Name or service not known')))"]
Running the commands from that playbook directly on the euca server being configured gives the same result unless I specify --region localhost
Problem:
I'm stuck here: [cloud-post : update console route53 system domain for eucalyptus-cloud authentication]
Error: "euform-update-stack: error (ValidationError): No updates are to be performed.", "stderr_lines": ["euform-update-stack: error (ValidationError): No updates are to be performed."]
All services are running except the ImagingBackend is Not Ready
No instances are running according to euca-describe-instances
Images are available:
IMAGE ami-5be483c81cf8bd65c eucalyptus-console-image-5-0-823/eucalyptus-console-image-5-0-823.raw.manifest.xml 000216594841 available private x86_64 machine instance-store hvm
TAG image ami-5be483c81cf8bd65c type eucalyptus-console-image
TAG image ami-5be483c81cf8bd65c version 5.0.823
IMAGE ami-f31092ddb73e29af9 eucalyptus-service-image-v5.0.100/eucalyptus-service-image.raw.manifest.xml 000216594841 available privatx86_64 machine instance-store hvm
TAG image ami-f31092ddb73e29af9 provides imaging,loadbalancing
TAG image ami-f31092ddb73e29af9 type eucalyptus-service-image
TAG image ami-f31092ddb73e29af9 version 5.0.100
---
all:
hosts:
exp-euca.lan.com:
exp-enc-[01:02].lan.com:
vars:
vpcmido_public_ip_range: "192.168.100.5-192.168.100.254"
vpcmido_public_ip_cidr: "192.168.100.1/24"
cloud_system_dns_dnsdomain: "cloud.lan.com"
cloud_public_port: 443
eucalyptus_console_cloud_deploy: yes
cloud_service_image_rpm: no
cloud_properties:
services.imaging.worker.ntp_server: "x.x.x.x"
services.loadbalancing.worker.ntp_server: "x.x.x.x"
children:
cloud:
hosts:
exp-euca.lan.com:
console:
hosts:
exp-euca.lan.com:
node:
hosts:
exp-enc-[01:02].lan.com:
EDIT:
Solved. Details are in the comments of the marked answer.
The name error most likely means that DNS for the domain cloud.lan.com is not being correctly delegated to your deployment. To test this, check if the nameserver is found:
dig +short NS cloud.lan.com
you should see "ns1.cloud.lan.com" and then should be able to use that nameserver to resolve services, e.g.
dig +short ec2.cloud.lan.com #ns1.cloud.lan.com
which should be the IP of the host for the compute service.
The second item is a bug in the ansible playbook that occurs when the stack is already present and up to date. To work around it, you can either update your playbook or delete the stack before running the playbook. Depending on how far the playbook progressed you may have a script to do this:
/usr/local/bin/console-manage-stack -a delete
the related playbook change is https://github.com/AppScale/ats-deploy/pull/36

Task fails due to not being able to read log file

Composer is failing a task due to it not being able to read a log file, it's complaining about incorrect encoding.
Here's the log that appears in the UI:
*** Unable to read remote log from gs://bucket/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** 'ascii' codec can't decode byte 0xc2 in position 6986: ordinal not in range(128)
*** Log file does not exist: /home/airflow/gcs/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Fetching from: http://airflow-worker-68dc66c9db-x945n:8793/log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-68dc66c9db-x945n', port=8793): Max retries exceeded with url: /log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1c9ff19d10>: Failed to establish a new connection: [Errno -2] Name or service not known',))
I try viewing the file in the google cloud console and it also throws an error:
Failed to load
Tracking Number: 8075820889980640204
But I am able to download the file via gsutil.
When I view the file, it seems to have text overriding other text.
I can't show the entire file but it looks like this:
--------------------------------------------------------------------------------
Starting attempt 1 of 1
--------------------------------------------------------------------------------
#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,313] {models.py:1569} INFO - Executing <Task(BigQueryOperator): merge_campaign_exceptions> on 2019-08-03T10:00:00+00:00#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,314] {base_task_runner.py:124} INFO - Running: ['bash', '-c', u'airflow run __campaign_exceptions_0_0_1 merge_campaign_exceptions 2019-08-03T10:00:00+00:00 --job_id 22767 --pool _bq_pool --raw -sd DAGS_FOLDER//-campaign-exceptions.py --cfg_path /tmp/tmpyBIVgT']#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:24,658] {base_task_runner.py:107} INFO - Job 22767: Subtask merge_campaign_exceptions [2019-08-04 10:01:24,658] {settings.py:176} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
Where the #-#{} pieces seems to be "on top of" the typical log.
I faced the same problem. In my case the problem was that I removed the google_gcloud_default connection that was being used to retrieve the logs.
Check the configuration and look for the connection name.
[core]
remote_log_conn_id = google_cloud_default
Then check the credentials used for that connection name has the right permissions to access the GCS bucket.
I'm having a similar problem with viewing logs in GCP Cloud Composer. It doesn't appear to be preventing the failing DAG task from running though. What it looks like is a permissions error between the GKE and Storage Bucket where the log files are kept.
You can still view the logs by going into your cluster's storage bucket in the same directory as your /dags folder where you should also see a logs/ folder.
Your helm chart should setup global env:
- name: AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT
value: "google-cloud-platform://"
Then, you should deploy a Dockerfile with root account only (not airflow account), additionaly, you set up your helm uid, gid as:
uid: 50000 #airflow user
gid: 50000 #airflow group
Then upgrade helm chart with new config
*** Unable to read remote log from gs://bucket
1)Found the solution after assigning the roles to the service account
2)The SA key(json or txt) to be added and configured to the connection in the
remote_log_conn_id = google_cloud_default
3)restart the scheduler and webserver of the airflow
4)restart the dags on the airflow
you can find the logs on the GCS bucket where its configured

SSH connectivity issues with ntc-ansible modules

I am trying to using the ntc-ansible module with Ansible running on Ubuntu (WSL). I have ssh connectivity to my remote device (Cisco 2960X) and I can run ansible playbooks to the same remote switch using the built in Ansible networking modules (ios_command) and it works fine.
Issue:
When I try to run any of the ntc-ansible modules, it fails, unable to connect to the device. Probably something simple, but I have hit a wall. There is something I am missing about how to use ntc-ansible modules. Ansible is seeing the modules as I can look at the docs as was suggested as a test in the readme.
I have ntc-ansible module installed here: /home/melshman/.ansible/plugins/modules/ntc-ansible
I am running my playbooks from here: ~/projects/ansible/
The first time I ran the playbook with the ntc-ansible modules it failed and based on error message and some research I installed sshpass (sudo apt-get install sshpass). But still having ssh problems using ntc-ansible… (playbook and traceback below)
I hear folks taking about an index file, but I can’t find that file? Where does it live and what do I need to do with it?
What is my connection supposed to be setup to be? Local? SSH? Netmiko_ssh?
What should I be using for platform? Cisco_ios? cisco_ios_ssh?
Appreciate any help I can get. I have been running in circles for hours and hours.
Ansible Version Info:
VTMNB17024:~/projects/ansible $ ansible --version
ansible 2.5.3
config file = /home/melshman/projects/ansible/ansible.cfg
configured module search path = [u'/home/melshman/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/lib/python2.7/dist-packages/ansible
executable location = /usr/local/bin/ansible
python version = 2.7.12 (default, Dec 4 2017, 14:50:18) [GCC 5.4.0 20160609]
Working playbook (ios_command:) note: ansible_ssh_pass and ansible_user in group var:
- name: Test Net Automation
hosts: ctil-ios-upgrade
connection: local
gather_facts: no
tasks:
- name: Grab run config
ios_command:
commands:
- show run
register: config
- name: Create backup of running configuration
copy:
content: "{{config.stdout[0]}}"
dest: "backups/show_run_{{inventory_hostname}}.txt"
Playbook (not working) using ntc-ansible module (Note: username and password are defined in Group VAR:
- name: Cisco IOS Automation
hosts: ctil-ios-upgrade
connection: local
gather_facts: no
tasks:
- name: GET UPTIME
ntc_show_command:
connection: ssh
platform: "cisco_ios"
command: 'show version | inc uptime'
host: "{{ inventory_hostname }}"
username: "{{ username }}"
password: "{{ password }}"
use_templates: True
template_dir: /home/melshman/.ansible/plugins/modules/ntc-ansible/ntc-templates/templates
Here is the traceback I get when the error occurs:
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: netmiko.ssh_exception.NetMikoTimeoutException: Connection to device timed-out: cisco_ios VTgroup_SW:22
fatal: [VTgroup_SW]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n File \"/tmp/ansible_RJRY9m/ansible_module_ntc_save_config.py\", line 279, in \n main()\n File \"/tmp/ansible_RJRY9m/ansible_module_ntc_save_config.py\", line 251, in main\n device = ntc_device(device_type, host, username, password, **kwargs)\n File \"/usr/local/lib/python2.7/dist-packages/pyntc-0.0.6-py2.7.egg/pyntc/__init__.py\", line 35, in ntc_device\n return device_class(*args, **kwargs)\n File \"/usr/local/lib/python2.7/dist-packages/pyntc-0.0.6-py2.7.egg/pyntc/devices/ios_device.py\", line 39, in __init__\n self.open()\n File \"/usr/local/lib/python2.7/dist-packages/pyntc-0.0.6-py2.7.egg/pyntc/devices/ios_device.py\", line 55, in open\n verbose=False)\n File \"build/bdist.linux-x86_64/egg/netmiko/ssh_dispatcher.py\", line 178, in ConnectHandler\n File \"build/bdist.linux-x86_64/egg/netmiko/base_connection.py\", line 207, in __init__\n File \"build/bdist.linux-x86_64/egg/netmiko/base_connection.py\", line 693, in establish_connection\nnetmiko.ssh_exception.NetMikoTimeoutException: Connection to device timed-out: cisco_ios VTgroup_SW:22\n", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 1}
Here is a working solution using ntc_show_command to a Cisco IOS device.
- name: Cisco IOS Automation
hosts: pynet-rtr1
connection: local
gather_facts: no
tasks:
- name: GET UPTIME
ntc_show_command:
connection: ssh
platform: "cisco_ios"
command: 'show version'
host: "{{ ansible_host }}"
username: "{{ ansible_user }}"
password: "{{ ansible_ssh_pass }}"
use_templates: True
template_dir: '/home/kbyers/ntc-templates/templates'
If you are going to use ntc-templates, I probably would not have the '| include uptime' in the 'show version'. In other words, let TextFSM convert the output to structured data first and then grab the uptime from that structured data.
I modified inventory_hostname to ansible_host to be consistent with my inventory format (my inventory_hostname doesn't actually resolve in DNS).
I modified username and password to 'ansible_user' and 'ansible_ssh_pass' to be consistent with my inventory and also to be more consistent with Ansible 2.5/2.6 variable naming.
On your above issue, your exception message does not match your playbook (i.e. are you sure that is the exception you get for that playbook).
Here is my inventory file (I simplified this to remove some unnecessary devices and to hide confidential information)
[all:vars]
ansible_connection=local
ansible_python_interpreter=/home/kbyers/VENV/ansible/bin/python
ansible_user=user
ansible_ssh_pass=password
[local]
localhost ansible_connection=local
[cisco]
pynet-rtr1 ansible_host=cisco1.domain.com
pynet-rtr2 ansible_host=cisco2.domain.com

~/.ssh/id_rsa.pub not found error while installing capistrano as ansible playbook

I try to install https://github.com/roots/bedrock-ansible to get a bedrock deployment (http://roots.io/wordpress-stack/) running.
When I run "vagrant up", after some time I get the error:
TASK: [capistrano-setup | Setup deploy group] *********************************
skipping: [default]
TASK: [capistrano-setup | Setup deploy user] **********************************
skipping: [default]
TASK: [capistrano-setup | Adding public key to server] ************************
fatal: [default] => could not locate file in lookup: ~/.ssh/id_rsa.pub
FATAL: all hosts have already failed -- aborting
PLAY RECAP ********************************************************************
to retry, use: --limit #/Users/johannes/site.retry
default : ok=46 changed=16 unreachable=1 failed=0
Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.
I do not have a clou how i can fix this. Do you have an idea?
It seems the role is trying to find your local public key. It should be in the location in the error message '~/.ssh/id_rsa.pub', but it's not. So either you don't have one, or you keep it in another location.
If you're not familiar with generating SSH keys you probably don't have one. I personally like the GitHub help page for this: https://help.github.com/articles/generating-ssh-keys/
(you only have to perform steps 1 and 2).
If you do have SSH keys, but in a different location, the capistrano-install role in bedrock uses some variables:
deploy_user: deploy
deploy_keys:
- "~/.ssh/id_rsa.pub"
So you can set (multiple) public key files in the deploy_keys list and they will be added to the deploy_user's authorized keys.
All this is needed because Capistrano will use the deploy user to connect to the remote server later. http://blakesmith.me/2010/02/08/understanding-public-key-private-key-concepts.html

Resources