I use Ansible for network automation.
Usually everything works great, but when I try to create a VLAN on one particular Juniper switch, I get an "ncclient timed out while waiting for an rpc reply" error. I use the junos_vlan module.
I tried extending the timeout periods in ansible.cfg, switching ansible_connection from network_cli to netconf, and so on, but none of that helped.
Can something be done on either the server side or the switch side?
I read about someone who found a workaround by editing the module file:
"Ansible, Juniper CLI commands. Timeout Error?"
Can I achieve the desired effect with the same approach?
I use Ansible 2.8.1 and Python 3.6.3. The device runs JUNOS 14.1X53-D47.3 firmware.
Any suggestions?
Here is the output of the failure:
TASK [Setting vlan description and giving vlanID] **********************************************
task path: /opt/ansible/roles/juniper/tasks/add_vlan_sw.yml:2
META: noop
META: noop
<x.x.x.x> ESTABLISH NETCONF SSH CONNECTION FOR USER: Ansible on PORT 22 TO x.x.x.x
<x.x.x.x> ESTABLISH LOCAL CONNECTION FOR USER: root
<x.x.x.x> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp/ansible-local-233162pe2oyh4/ansible-tmp-1565773542.046394-83274056275477 `" && echo ansible-tmp-1565773542.046394-83274056275477="` echo /root/.ansible/tmp/ansible-local-233162pe2oyh4/ansible-tmp-1565773542.046394-83274056275477 `" ) && sleep 0'
Using module file /opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/ansible/modules/network/junos/junos_vlan.py
<x.x.x.x> PUT /root/.ansible/tmp/ansible-local-233162pe2oyh4/tmpyr3gfpgt TO /root/.ansible/tmp/ansible-local-233162pe2oyh4/ansible-tmp-1565773542.046394-83274056275477/AnsiballZ_junos_vlan.py
<x.x.x.x> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-local-233162pe2oyh4/ansible-tmp-1565773542.046394-83274056275477/ /root/.ansible/tmp/ansible-local-233162pe2oyh4/ansible-tmp-1565773542.046394-83274056275477/AnsiballZ_junos_vlan.py && sleep 0'
<x.x.x.x> EXEC /bin/sh -c '/usr/bin/python /root/.ansible/tmp/ansible-local-233162pe2oyh4/ansible-tmp-1565773542.046394-83274056275477/AnsiballZ_junos_vlan.py && sleep 0'
<x.x.x.x> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-local-233162pe2oyh4/ansible-tmp-1565773542.046394-83274056275477/ > /dev/null 2>&1 && sleep 0'
The full traceback is:
WARNING: The below traceback may *not* be related to the actual failure.
File "/tmp/ansible_junos_vlan_payload_a3tXYK/ansible_junos_vlan_payload.zip/ansible/module_utils/network/junos/junos.py", line 204, in unlock_configuration
response = conn.unlock()
File "/tmp/ansible_junos_vlan_payload_a3tXYK/ansible_junos_vlan_payload.zip/ansible/module_utils/network/common/netconf.py", line 76, in __rpc__
return self.parse_rpc_error(to_bytes(rpc_error, errors='surrogate_then_replace'))
File "/tmp/ansible_junos_vlan_payload_a3tXYK/ansible_junos_vlan_payload.zip/ansible/module_utils/network/common/netconf.py", line 108, in parse_rpc_error
raise ConnectionError(rpc_error)
fatal: [chi-leafsw06]: FAILED! => {
"changed": false,
"invocation": {
"module_args": {
"active": true,
"aggregate": null,
"description": "Client-100001-dedicated-network",
"filter_input": null,
"filter_output": null,
"host": null,
"interfaces": null,
"l3_interface": null,
"name": "vlan777",
"password": null,
"port": null,
"provider": null,
"ssh_keyfile": null,
"state": "present",
"timeout": null,
"transport": null,
"username": null,
"vlan_id": 777
}
},
"msg": "ncclient timed out while waiting for an rpc reply."
}
Any help would be greatly appreciated.
Someone offered me a solution on Reddit, and I thought it might help others with the same problem (https://www.reddit.com/r/ansible/comments/cq7joa/help_ncclient_timed_out_while_waiting_for_rpc/)
What helped was downgrading ncclient on my Ansible server from 0.6.6 to 0.6.4 and raising the timeout values in ansible.cfg to at least 120 seconds.
So I am saved. Thanks!
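For readers who prefer to set the timeouts per host group rather than globally in ansible.cfg, the following is a minimal sketch of the equivalent inventory variables. The file name group_vars/junos.yml and the group name are assumptions made for illustration; the original fix used ansible.cfg. The ncclient downgrade is a separate step on the control node (for example with pip) and is not shown here.

```yaml
# group_vars/junos.yml (hypothetical file and group name)
# Per-host equivalents of the persistent connection timeouts
# that can also be set in ansible.cfg. Values are in seconds.
ansible_connection: netconf
ansible_network_os: junos
ansible_command_timeout: 120   # how long to wait for an RPC reply
ansible_connect_timeout: 120   # how long to wait for the initial connection
```

Variables set this way apply only to hosts in that group, so other devices keep the default timeouts.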
My playbook runs a long-running task with async and also uses a "creates:" condition, so that it runs only once on the server. When I was writing the playbook yesterday, I am pretty sure the task was skipped when the log file named in "creates:" existed.
Now, though, it shows changed every time I run it.
I am confused, as I do not think I changed anything, and I'd like my registered variable to be reported correctly as unchanged when the condition is true.
Output of ansible-playbook (the debug section shows the task as changed: true):
TASK [singleserver : Install Assure1 SingleServer role] *********************************************************************************************************************************
changed: [crassure1]
TASK [singleserver : Debug] *************************************************************************************************************************************************************
ok: [crassure1] => {
"msg": {
"ansible_job_id": "637594935242.28556",
"changed": true,
"failed": false,
"finished": 0,
"results_file": "/root/.ansible_async/637594935242.28556",
"started": 1
}
}
But if I check the actual results file on the target machine, it correctly resolved the condition and did not actually execute the shell script, so the task should be unchanged (the message shows that the task was skipped because the log exists):
[root@crassure1 assure1]# cat "/root/.ansible_async/637594935242.28556"
{"invocation": {"module_args": {"warn": true, "executable": null, "_uses_shell": true, "strip_empty_ends": true, "_raw_params": "/opt/install/install_command.sh", "removes": null, "argv": null, "creates": "/opt/assure1/logs/SetupWizard.log", "chdir": null, "stdin_add_newline": true, "stdin": null}}, "cmd": "/opt/install/install_command.sh", "changed": false, "rc": 0, "stdout": "skipped, since /opt/assure1/logs/SetupWizard.log exists"}[root@crassure1 assure1]# Connection reset by 172.24.36.123 port 22
My playbook section looks like this:
- name: Install Assure1 SingleServer role
  shell:
    #cmd: "/opt/assure1/bin/SetupWizard -a --Depot /opt/install/:a1-local --First --WebFQDN crassure1.tspdata.local --Roles All"
    cmd: "/opt/install/install_command.sh"
  async: 7200
  poll: 0
  register: Assure1InstallWait
  args:
    creates: /opt/assure1/logs/SetupWizard.log
- name: Debug
  debug:
    msg: "{{ Assure1InstallWait }}"
- name: Check on Installation status every 15 minutes
  async_status:
    jid: "{{ Assure1InstallWait.ansible_job_id }}"
  register: job_result
  until: job_result.finished
  retries: 30
  delay: 900
  when: Assure1InstallWait is changed
Is there something I am missing, or is that some kind of a bug?
I am limited to the Ansible version available in the configured trusted repo, so I am using Ansible 2.9.25.
Q: "The module shell shows changed every time I run it"
A: In async mode the task can't be skipped immediately. First, the shell module must find out whether the file /opt/assure1/logs/SetupWizard.log exists on the remote host. If the file exists, the module decides to skip the execution of the command. But you run the task asynchronously. In this case, Ansible starts the module and returns without waiting for it to complete. That's what the registered variable Assure1InstallWait says: the task has started but has not finished yet.
"msg": {
"ansible_job_id": "637594935242.28556",
"changed": true,
"failed": false,
"finished": 0,
"results_file": "/root/.ansible_async/637594935242.28556",
"started": 1
}
The decision to report such a task as changed is correct, I think, because execution on the remote host is still in progress.
Print the registered result of the async_status module instead. You'll see that the command was skipped because the file exists (you printed the async results file on the remote host instead). Here the attribute changed is false, because now we know the command didn't execute:
job_result:
  ...
  attempts: 1
  changed: false
  failed: false
  finished: 1
  msg: Did not run command since '/tmp/SetupWizard.log' exists
  rc: 0
  ...
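If the goal is for the launching task itself to report no change when the log already exists, one option is to test for the file before starting the async job. The following is a minimal sketch under that assumption, not the answerer's method: the stat task and the variable name setup_log are introduced here for illustration, while the paths and the other variable names come from the question.

```yaml
# Check for the log up front, so the async task can be skipped outright
# instead of being launched and reported as changed.
- name: Check whether the setup log already exists
  stat:
    path: /opt/assure1/logs/SetupWizard.log
  register: setup_log            # hypothetical variable name

- name: Install Assure1 SingleServer role
  shell: /opt/install/install_command.sh
  args:
    creates: /opt/assure1/logs/SetupWizard.log
  async: 7200
  poll: 0
  register: Assure1InstallWait
  when: not setup_log.stat.exists

# Only poll the job if it was actually started.
- name: Check on installation status every 15 minutes
  async_status:
    jid: "{{ Assure1InstallWait.ansible_job_id }}"
  register: job_result
  until: job_result.finished
  retries: 30
  delay: 900
  when: Assure1InstallWait is not skipped
```

With this layout the "creates:" argument is kept only as a safety net; the skip decision that shows up in the play output comes from the stat result.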
I'm learning SaltStack right now, and I was wondering if there is a way to get the stdout of a Salt state, put it into a document, and then send it to the master. Or is there a better way to do this?
To achieve this, we have to save the result of running the script in a variable. It will contain a dictionary with the keys that show up under changes:. The contents of this variable (stdout) can then be written to a file.
{% set script_res = salt['cmd.script']('salt://test.sh') %}
create-stdout-file:
  file.managed:
    - name: /tmp/script-stdout.txt
    - contents: {{ script_res.stdout }}
The output already goes to the master. It would be better to output JSON and filter it down to the data you want in your document on the master,
as in the following example.
Normal output
$ sudo salt salt00\* state.apply tests.test3
salt00.wolfnet.bad4.us:
----------
ID: test_run
Function: cmd.run
Name: echo test
Result: True
Comment: Command "echo test" run
Started: 10:39:51.103057
Duration: 18.281 ms
Changes:
----------
pid:
8661
retcode:
0
stderr:
stdout:
test
Summary for salt00.wolfnet.bad4.us
------------
Succeeded: 1 (changed=1)
Failed: 0
------------
Total states run: 1
Total run time: 18.281 ms
json output
$ sudo salt salt00\* state.apply tests.test3 --out json
{
"salt00.wolfnet.bad4.us": {
"cmd_|-test_run_|-echo test_|-run": {
"name": "echo test",
"changes": {
"pid": 9057,
"retcode": 0,
"stdout": "test",
"stderr": ""
},
"result": true,
"comment": "Command \"echo test\" run",
"__sls__": "tests.test3",
"__run_num__": 0,
"start_time": "10:40:55.582273",
"duration": 19.374,
"__id__": "test_run"
}
}
}
json parsed down with jq to just the stdout
$ sudo salt salt00\* state.apply tests.test3 --out=json | jq '.|.[]|."cmd_|-test_run_|-echo test_|-run"|.changes.stdout'
"test"
Also, for the record, it is considered bad practice to put code that changes the system into Jinja. Jinja always runs when a template is rendered, and there is no way to control whether that happens, so even a test=true run will still execute the Jinja code that makes changes, which could be very harmful to your systems.
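To keep the script execution inside a state, so that a test=true run does not execute it, something along these lines could be used. This is only a sketch: the state IDs and the path /usr/local/bin/test.sh are hypothetical, and salt://test.sh and /tmp/script-stdout.txt are taken from the earlier answer.

```yaml
# Place the script on the minion first (hypothetical target path).
deploy-test-script:
  file.managed:
    - name: /usr/local/bin/test.sh
    - source: salt://test.sh
    - mode: '0755'

# Run it as a state and redirect stdout into the file on the minion.
write-script-stdout:
  cmd.run:
    - name: /usr/local/bin/test.sh > /tmp/script-stdout.txt
    - require:
      - file: deploy-test-script
```

Because both steps are ordinary states, they follow the usual state run controls rather than running at render time.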
I want to create a customized OpenStack OpenSUSE 15 image that contains some custom software and a graphical interface. I used an existing OpenSUSE 15.0 image and Packer to build that image, and it works fine. The Packer JSON file is as follows:
{
"builders": [
{
"type" : "openstack",
"ssh_username" : "root",
"image_name": "OpenSUSE_15_custom_kde",
"source_image": "OpenSUSE 15",
"flavor": "m1.medium",
"networks": "public-network"
}
],
"provisioners":[
{
"type": "shell",
"inline": [
"sleep 10",
"sudo -s",
"zypper --gpg-auto-import-keys refresh",
"zypper -n up -y",
"zypper -n clean -a",
"zypper -n addrepo -f http://download.opensuse.org/repositories/devel\\:/languages\\:/R\\:/patched/openSUSE_Leap_15.0/ R-patched",
"zypper -n addrepo -f http://download.opensuse.org/repositories/devel\\:/languages\\:/R\\:/released/openSUSE_Leap_15.0/ R-released",
"zypper --gpg-auto-import-keys refresh",
"zypper -n install -y R-base R-base-devel R-recommended-packages rstudio",
"zypper -n clean -a",
"zypper --non-interactive install -y -t pattern kde kde_plasma devel_kernel devel_python3 devel_C_C++ office x11",
"zypper -n install xrdp",
"zypper -n clean -a",
"zypper -n dup -y",
"systemctl enable xrdp",
"systemctl start xrdp",
"cloud-init clean --logs",
"zypper -n install -y cloud-init growpart yast2-network yast2-services-manager acpid",
"cat /dev/null > /etc/udev/rules.d/70-persistent-net.rules",
"systemctl disable cloud-init.service cloud-final.service cloud-init-local.service cloud-config.service",
"systemctl enable cloud-init.service cloud-final.service cloud-init-local.service cloud-config.service sshd",
"sudo systemctl stop firewalld",
"sudo systemctl disable firewalld",
"sed -i 's/GRUB_TIMEOUT=.*$/GRUB_TIMEOUT=0/g' /etc/default/grub",
"exec grub2-mkconfig -o /boot/grub2/grub.cfg '$#'",
"systemctl restart cloud-init",
"systemctl daemon-reload",
"cat /dev/null > ~/.bash_history && history -c && sudo su",
"cat /dev/null > /var/log/wtmp",
"cat /dev/null > /var/log/btmp",
"cat /dev/null > /var/log/lastlog",
"cat /dev/null > /var/run/utmp",
"cat /dev/null > /var/log/auth.log",
"cat /dev/null > /var/log/kern.log",
"cat /dev/null > ~/.bash_history && history -c",
"rm ~/.ssh/authorized_keys"
]
},
{
"type": "file",
"source": "./cloud_init/cloud.cfg",
"destination": "/etc/cloud/cloud.cfg"
}
]
}
There are no errors in the building and provisioning phases with packer.
In a second stage, when this base image is spawned through a Heat template via the OpenStack client, I want some personalized tasks to be completed: user creation and granting SSH access (including adjusting the sshd_config file...). This is done through the init_image.sh file.
#!/bin/bash
useradd -m $USERNAME -p $PASSWD -s /bin/bash
usermod -a -G sudo $USERNAME
tee /etc/ssh/banner <<EOF
You are one lucky user, if you bear the key...
EOF
tee /etc/ssh/sshd_config <<EOF
## SOME IMPORTANT SSHD CONFIGURATIONS
EOF
sudo -u $USERNAME -H sh -c 'cd ~;mkdir ~/.ssh/;echo "$SSHPUBKEY" > ~/.ssh/authorized_keys;chmod -R 700 ~/.ssh/;chmod 600 ~/.ssh/authorized_keys;'
systemctl restart sshd.service
voldata_dev="/dev/disk/by-id/virtio-$(echo $VOLDATA | cut -c -20)"
mkfs.ext4 $voldata_dev
mkdir -pv /home/$USERNAME/share
echo "$voldata_dev /home/$USERNAME/share ext4 defaults 1 2" >> /etc/fstab
mount /home/$USERNAME/share
chown -R $USERNAME:users /home/$USERNAME/share/
systemctl enable xrdp
systemctl start xrdp
For this purpose, I have created the following heat template.
heat_template_version: "2018-08-31"
description: "version 2017-09-01 created by HOT Generator at Fri, 05 Jul 2019 12:56:22 GMT."
parameters:
  username:
    type: string
    label: User Name
    description: This is the user name, and will be also the name of the key and the server
    default: test
  imagename:
    type: string
    label: Image Name
    description: This is the Name of the Image e.g. Ubuntu 18.04
    default: "OpenSUSE Leap 15"
  ssh_pub_key:
    type: string
    label: ssh public key
  flavorname:
    type: string
    label: Flavor Name
    description: This is the Name of the Flavor e.g. m1.small
    default: "m1.small"
  vol_size:
    type: number
    label: Volume Size
    description: This is the size of the volume that should be attached in GB
    default: 10
  password:
    type: string
    label: password
    description: This is the su password and user password
resources:
  init:
    type: OS::Heat::SoftwareConfig
    properties:
      group: ungrouped
      config:
        str_replace:
          template:
            {get_file: init_image.sh}
          params:
            $USERNAME: {get_param: username}
            $SSHPUBKEY: {get_param: ssh_pub_key}
            $PASSWD: {get_param: password}
            $VOLDATA: {get_resource: volume}
  my_key:
    type: "OS::Nova::KeyPair"
    properties:
      name:
        list_join:
          ["_", [ {get_param: username}, 'key']]
      public_key: {get_param: ssh_pub_key}
  my_server:
    type: "OS::Nova::Server"
    properties:
      block_device_mapping_v2: [{ device_name: "vda", image : { get_param : imagename }, delete_on_termination : "false", volume_size: 20 }]
      name: {get_param: username}
      flavor: {get_param: flavorname}
      key_name: {get_resource: my_key}
      admin_pass: {get_param: password}
      user_data_format: RAW
      user_data: {get_resource: init}
      networks:
        - network: "public-network"
    depends_on:
      - my_key
      - init
      - volume
  volume:
    type: "OS::Cinder::Volume"
    properties:
      # Size is given in GB
      size: {get_param: vol_size}
      name:
        list_join: ["-", ["vol_",{get_param: username }]]
  volume_attachment:
    type: "OS::Cinder::VolumeAttachment"
    properties:
      volume_id: { get_resource: volume }
      instance_uuid: { get_resource: my_server }
    depends_on:
      - volume
outputs:
  instance_ip:
    description: The IP address of the deployed instances
    value: { get_attr: [my_server, first_address] }
If I use the original image in the template, I have no problems (although the build process takes a very long time), and I need to restart to get the graphical KDE interface.
However, if I use the image built with Packer, my user_data is ignored. I cannot log in, and the personalized user is not created... What have I missed? Why does it not work? As you can see, I clean cloud-init and restart the services... I am stuck big time...
UPDATE
Here is the accessible boot log from the machine.
UPDATE 2
This is the output of cloud-init analyze show:
-- Boot Record 01 --
The total time elapsed since completing an event is printed after the "@" character.
The time the event takes is printed after the "+" character.
Starting stage: init-local
|`->no cache found @00.01000s +00.00000s
|`->no local data found from DataSourceOpenStackLocal @00.04700s +15.23000s
Finished stage: (init-local) 15.31200 seconds
Starting stage: init-network
|`->no cache found @16.01000s +00.00100s
|`->no network data found from DataSourceOpenStack @16.01700s +00.02600s
|`->found network data from DataSourceNone @16.04300s +00.00100s
|`->setting up datasource @16.09000s +00.00000s
|`->reading and applying user-data @16.10000s +00.00200s
|`->reading and applying vendor-data @16.10200s +00.00000s
|`->activating datasource @16.12100s +00.00100s
|`->config-migrator ran successfully @16.17900s +00.00100s
|`->config-seed_random ran successfully @16.18000s +00.00100s
|`->config-bootcmd ran successfully @16.18200s +00.00000s
|`->config-write-files ran successfully @16.18200s +00.00100s
|`->config-growpart ran successfully @16.18300s +00.46100s
|`->config-resizefs ran successfully @16.64500s +01.33400s
|`->config-disk_setup ran successfully @17.98100s +00.00300s
|`->config-mounts ran successfully @17.98500s +00.00400s
|`->config-set_hostname ran successfully @17.99000s +00.09800s
|`->config-update_hostname ran successfully @18.08900s +00.01000s
|`->config-update_etc_hosts ran successfully @18.10000s +00.00100s
|`->config-rsyslog ran successfully @18.10100s +00.00200s
|`->config-users-groups ran successfully @18.10400s +00.00200s
|`->config-ssh ran successfully @18.10700s +00.61400s
Finished stage: (init-network) 02.73600 seconds
Starting stage: modules-config
|`->config-locale ran successfully @35.00200s +00.00400s
|`->config-set-passwords ran successfully @35.00600s +00.00100s
|`->config-zypper-add-repo ran successfully @35.00700s +00.00200s
|`->config-ntp ran successfully @35.01000s +00.00100s
|`->config-timezone ran successfully @35.01100s +00.00200s
|`->config-disable-ec2-metadata ran successfully @35.01300s +00.00100s
|`->config-runcmd ran successfully @35.01800s +00.00200s
Finished stage: (modules-config) 00.05100 seconds
Starting stage: modules-final
|`->config-package-update-upgrade-install ran successfully @35.87400s +00.00000s
|`->config-puppet ran successfully @35.87500s +00.00000s
|`->config-chef ran successfully @35.87600s +00.00000s
|`->config-mcollective ran successfully @35.87600s +00.00100s
|`->config-salt-minion ran successfully @35.87700s +00.00100s
|`->config-rightscale_userdata ran successfully @35.87800s +00.00100s
|`->config-scripts-vendor ran successfully @35.87900s +00.00500s
|`->config-scripts-per-once ran successfully @35.88400s +00.00100s
|`->config-scripts-per-boot ran successfully @35.88500s +00.00000s
|`->config-scripts-per-instance ran successfully @35.88500s +00.00100s
|`->config-scripts-user ran successfully @35.88600s +00.00100s
|`->config-ssh-authkey-fingerprints ran successfully @35.88700s +00.00100s
|`->config-keys-to-console ran successfully @35.88800s +00.09000s
|`->config-phone-home ran successfully @35.97900s +00.00100s
|`->config-final-message ran successfully @35.98000s +00.00600s
|`->config-power-state-change ran successfully @35.98700s +00.00100s
Finished stage: (modules-final) 00.13600 seconds
Total Time: 18.23500 seconds
1 boot records analyzed
Update 3
Apparently, when one does not update with zypper up, cloud-init behaves well and finds the user data. Hence, I will not update the image in provisioning. However, once provisioned it makes sense to update.
At the end of your provisioning, you should stop cloud-init and wipe its state. Otherwise, when the image is launched, cloud-init thinks it has already executed the first boot.
systemctl stop cloud-init
rm -rf /var/lib/cloud/
I am unable to run the Symfony Flex command composer dump-env prod using the Ansible composer module. I wonder if it's even possible? My task looks something like this:
- name: Composer dump env for production
  composer:
    command: dump-env
    working_dir: "{{ app_composer_package_dir }}"
    arguments: prod
  become_user: "{{app_apache_user}}"
  become: yes
The error I get is:
"stderr": "\n
\n [Symfony\Component\Console\Exception\CommandNotFoundException]
\n There are no commands defined \"dump-env\".
\n
ansible verbose logs:
fatal: [testhost.com]: FAILED! => {
"changed": false,
"invocation": {
"module_args": {
"apcu_autoloader": false,
"arguments": "prod",
"classmap_authoritative": false,
"command": "dump-env",
"executable": null,
"global_command": false,
"ignore_platform_reqs": false,
"no_dev": true,
"no_plugins": false,
"no_scripts": false,
"optimize_autoloader": true,
"prefer_dist": false,
"prefer_source": false,
"working_dir": "/var/www/source"
}
},
"msg": "[Symfony\\Component\\Console\\Exception\\CommandNotFoundException] Command \"dump-env\" is not defined. help [--xml] [--format FORMAT] [--raw] [--] [<command_name>]"
}
",
I tried the Ansible command module to run the command directly, but I get the same error.
However, I am able to run the command by SSHing into the remote (CentOS) instance:
sudo -u apache composer dump-env prod
Restricting packages listed in "symfony/symfony" to "4.3.*"
Successfully dumped .env files in .env.local.php
So far I have been unable to run composer dump-env prod using the Ansible composer module. However, the following task using the Ansible command module runs successfully:
- name: Composer dump env for production
  command: "{{composer_install_path}} --working-dir={{ app_composer_package_dir }} dump-env prod"
  become_user: "{{app_apache_user}}"
  become: yes
which translates to something like:
sudo -u apache /usr/local/bin/composer --working-dir=/var/www/source dump-env prod
I have the following role:
---
- name: "Copying {{source_directory}} to {{destination_directory}}"
  shell: cp -r "{{source_directory}}" "{{destination_directory}}"
It is used as follows:
- { role: copy_folder, source_directory: "{{working_directory}}/ipsc/dist", destination_directory: "/opt/apache-tomcat-base/webapps/ips" }
with the parameter working_directory: /opt/demoServer
It is executed after I remove the directory using this role (as I do not want the previous contents):
- name: "Removing Folder {{path_to_file}}"
  command: rm -r "{{path_to_file}}"
with the parameter path_to_file: "/opt/apache-tomcat-base/webapps/ips"
I get the following output:
TASK: [copy_folder | Copying /opt/demoServer/ipsc/dist to /opt/apache-tomcat-base/webapps/ips] ***
<md1cat01-demo.lnx.ix.com> ESTABLISH CONNECTION FOR USER: my.user
<md1cat01-demo.lnx.ix.com> REMOTE_MODULE command cp -r "/opt/demoServer/ipsc/dist" "/opt/apache-tomcat-base/webapps/ips" #USE_SHELL
...
changed: [md1cat01-demo.lnx.ix.com] => {"changed": true, "cmd": "cp -r \"/opt/demoServer/ipsc/dist\" \"/opt/apache-tomcat-base/webapps/ips\"", "delta": "0:00:00.211759", "end": "2016-02-05 11:05:37.459890", "rc": 0, "start": "2016-02-05 11:05:37.248131", "stderr": "", "stdout": "", "warnings": []}
What happens is that the folder never appears in that directory.
Basically, the cp command is not doing its job, but I get no error. If I run the copy command manually on the machine, however, it works.
Use the copy module instead of shelling out to cp, and set directory_mode to the permissions the created directories should have.
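A sketch of what that could look like for the role above follows. It rests on several assumptions that are not in the original answer: the source directory lives on the managed host (hence remote_src: yes), the Ansible version is 2.8 or later (recursive copying with remote_src is only supported from that version, which is newer than the one that produced the output in the question), and 0755 is only an illustrative mode. Whether directory_mode affects directories copied with remote_src may depend on the Ansible version.

```yaml
# Replacement for the cp-based task in the copy_folder role.
# The trailing slash on src copies the directory's contents into dest.
- name: "Copying {{ source_directory }} to {{ destination_directory }}"
  copy:
    src: "{{ source_directory }}/"
    dest: "{{ destination_directory }}/"
    remote_src: yes          # assumption: source is on the managed host
    directory_mode: "0755"   # illustrative permissions for created directories
```

Unlike the shell task, the copy module reports changed only when files actually differ, which also makes repeated runs easier to read.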