Ansible task with async and become giving Job not found error - asynchronous

When I try to run a task asynchronously as another user using become in an Ansible playbook, I get a "Job not found" error. Can someone suggest how I can successfully check the async job status?
I am using Ansible version 2.7.
I read in some articles that the async_status task should use the same become user as the async task in order to read the job status.
I tried that solution, but I am still getting the same "job not found" error.
- hosts: localhost
  tasks:
    - shell: startInstance.sh
      register: start_task
      async: 180
      poll: 0
      become: yes
      become_user: venu

    - async_status:
        jid: "{{ start_task.ansible_job_id }}"
      register: start_status
      until: start_status.finished
      retries: 30
      become: yes
      become_user: venu
Expected result:
I should be able to fire and forget the job.
Actual result:
{"ansible_job_id": "386361757265.15925428", "changed": false, "finished": 1, "msg": "could not find job", "started": 1}


Ansible showing task changed but the task has condition (creates: ) and does not actually execute

My ansible-playbook runs a long-running task with the async tag and also uses the "creates:" condition, so it is run only once on the server. When I was writing the playbook yesterday, I am pretty sure the task was skipped when the log set in the "creates:" tag existed.
It shows changed now though, every time I run it.
I am confused, as I do not think I changed anything, and I'd like my registered variable to be set correctly as unchanged when the condition is true.
Output of ansible-playbook (the debug section shows the task is changed: true):
TASK [singleserver : Install Assure1 SingleServer role] *********************************************************************************************************************************
changed: [crassure1]
TASK [singleserver : Debug] *************************************************************************************************************************************************************
ok: [crassure1] => {
    "msg": {
        "ansible_job_id": "637594935242.28556",
        "changed": true,
        "failed": false,
        "finished": 0,
        "results_file": "/root/.ansible_async/637594935242.28556",
        "started": 1
    }
}
But if I check the actual results file on the target machine, it correctly resolved the condition and did not actually execute the shell script, so the task should be unchanged (the message shows the task was skipped because the log exists):
[root@crassure1 assure1]# cat "/root/.ansible_async/637594935242.28556"
{"invocation": {"module_args": {"warn": true, "executable": null, "_uses_shell": true, "strip_empty_ends": true, "_raw_params": "/opt/install/install_command.sh", "removes": null, "argv": null, "creates": "/opt/assure1/logs/SetupWizard.log", "chdir": null, "stdin_add_newline": true, "stdin": null}}, "cmd": "/opt/install/install_command.sh", "changed": false, "rc": 0, "stdout": "skipped, since /opt/assure1/logs/SetupWizard.log exists"}[root@crassure1 assure1]# Connection reset by 172.24.36.123 port 22
My playbook section looks like this:
- name: Install Assure1 SingleServer role
  shell:
    #cmd: "/opt/assure1/bin/SetupWizard -a --Depot /opt/install/:a1-local --First --WebFQDN crassure1.tspdata.local --Roles All"
    cmd: "/opt/install/install_command.sh"
  async: 7200
  poll: 0
  register: Assure1InstallWait
  args:
    creates: /opt/assure1/logs/SetupWizard.log

- name: Debug
  debug:
    msg: "{{ Assure1InstallWait }}"

- name: Check on Installation status every 15 minutes
  async_status:
    jid: "{{ Assure1InstallWait.ansible_job_id }}"
  register: job_result
  until: job_result.finished
  retries: 30
  delay: 900
  when: Assure1InstallWait is changed
Is there something I am missing, or is that some kind of a bug?
I am limited by the Ansible version available in the configured trusted repo, so I am using Ansible 2.9.25.
Q: "The module shell shows changed every time I run it"
A: In async mode the task can't be skipped immediately. First, the shell module must find out whether the file /opt/assure1/logs/SetupWizard.log exists on the remote host or not. Then, if the file exists, the module will decide to skip the execution of the command. But you run the task asynchronously. In this case, Ansible starts the module and returns without waiting for the module to complete. That's what the registered variable Assure1InstallWait says: the task started but hasn't finished yet.
"msg": {
"ansible_job_id": "637594935242.28556",
"changed": true,
"failed": false,
"finished": 0,
"results_file": "/root/.ansible_async/637594935242.28556",
"started": 1
}
The decision to report such a task as changed is correct, I think, because the execution on the remote host is still going on.
Print the registered result of the async_status module instead. You'll see that the command was skipped because the file exists (you printed the async results file on the remote host instead). Here the attribute changed is set to false, because now we know the command didn't execute:
job_result:
  ...
  attempts: 1
  changed: false
  failed: false
  finished: 1
  msg: Did not run command since '/tmp/SetupWizard.log' exists
  rc: 0
  ...
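For reference, a minimal way to print that registered result from the playbook itself (a sketch; the task name is illustrative):
- name: Show final async_status result
  debug:
    var: job_result
  when: Assure1InstallWait is changed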

How do I get just the STDOUT of a salt state?

I'm learning SaltStack right now and I was wondering if there is a way to get the stdout of a salt state, put it into a document, and then send it to the master. Or is there a better way to do this?
To achieve this, we'll have to save the result of the script execution in a variable. It will contain a hash with the keys that show up under changes:. The contents of this variable (stdout) can then be written to a file.
{% set script_res = salt['cmd.script']('salt://test.sh') %}

create-stdout-file:
  file.managed:
    - name: /tmp/script-stdout.txt
    - contents: {{ script_res.stdout }}
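To get the file onto the minion, apply the state as usual; the state file name tests.stdout below is only an assumption, and the second command simply reads the result back from the master:
$ sudo salt 'minion*' state.apply tests.stdout
$ sudo salt 'minion*' cmd.run 'cat /tmp/script-stdout.txt'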
The output is already going to the master. It would be better to output JSON and query down to the data you want for your document on the master, such as the following.
Normal output
$ sudo salt salt00\* state.apply tests.test3
salt00.wolfnet.bad4.us:
----------
          ID: test_run
    Function: cmd.run
        Name: echo test
      Result: True
     Comment: Command "echo test" run
     Started: 10:39:51.103057
    Duration: 18.281 ms
     Changes:
              ----------
              pid:
                  8661
              retcode:
                  0
              stderr:
              stdout:
                  test

Summary for salt00.wolfnet.bad4.us
------------
Succeeded: 1 (changed=1)
Failed:    0
------------
Total states run:     1
Total run time:  18.281 ms
JSON output
$ sudo salt salt00\* state.apply tests.test3 --out json
{
    "salt00.wolfnet.bad4.us": {
        "cmd_|-test_run_|-echo test_|-run": {
            "name": "echo test",
            "changes": {
                "pid": 9057,
                "retcode": 0,
                "stdout": "test",
                "stderr": ""
            },
            "result": true,
            "comment": "Command \"echo test\" run",
            "__sls__": "tests.test3",
            "__run_num__": 0,
            "start_time": "10:40:55.582273",
            "duration": 19.374,
            "__id__": "test_run"
        }
    }
}
json parsed down with jq to just the stdout
$ sudo salt salt00\* state.apply tests.test3 --out=json | jq '.|.[]|."cmd_|-test_run_|-echo test_|-run"|.changes.stdout'
"test"
Also, for the record, it is considered bad practice to put code that changes the system into Jinja. Jinja always runs when a template is rendered, and there is no way to control whether that happens, so even test=true runs will still execute the Jinja code that makes changes, which could be very harmful to your systems.

Euca 5 Ansible Install Skipping Node Actions

I'm trying to use the Euca 5 Ansible installer to install a single server for all services, "exp-euca.lan.com", with two node controllers, "exp-enc-[01:02].lan.com", running VPCMIDO. The install goes okay and I end up with a single server running all Euca services, including being able to run instances, but the Ansible scripts never take action to install and configure my node servers. I think I'm misunderstanding the inventory format. What could be wrong with the following? I don't want my main Euca server to run instances, and I do want the two node controllers installed and running instances.
---
all:
  hosts:
    exp-euca.lan.com:
    exp-enc-[01:02].lan.com:
  vars:
    vpcmido_public_ip_range: "192.168.100.5-192.168.100.254"
    vpcmido_public_ip_cidr: "192.168.100.1/24"
    cloud_system_dns_dnsdomain: "cloud.lan.com"
    cloud_public_port: 443
    eucalyptus_console_cloud_deploy: yes
    cloud_service_image_rpm: no
    cloud_properties:
      services.imaging.worker.ntp_server: "x.x.x.x"
      services.loadbalancing.worker.ntp_server: "x.x.x.x"
  children:
    cloud:
      hosts:
        exp-euca.lan.com:
    console:
      hosts:
        exp-euca.lan.com:
    zone:
      hosts:
        exp-euca.lan.com:
    nodes:
      hosts:
        exp-enc-[01:02].lan.com:
All of the plays related to nodes follow a pattern similar to this, where they succeed and acknowledge the main server exp-euca but then skip the nodes.
2021-01-14 08:15:23,572 p=57513 u=root n=ansible | TASK [zone assignments default] ***********************************************************************************************************************
2021-01-14 08:15:23,596 p=57513 u=root n=ansible | ok: [exp-euca.lan.com] => (item=[0, u'exp-euca.lan.com']) => {"ansible_facts": {"host_zone_key": "1"}, "ansible_loop_var": "item", "changed": false, "item": [0, "exp-euca.lan.com"]}
2021-01-14 08:15:23,604 p=57513 u=root n=ansible | skipping: [exp-enc-01.lan.com] => (item=[0, u'exp-euca.lan.com']) => {"ansible_loop_var": "item", "changed": false, "item": [0, "exp-euca.lan.com"], "skip_reason": "Conditional result was False"}
It should be node, not nodes, i.e.:
node:
  hosts:
    exp-enc-[01:02].lan.com:
The documentation for this is currently incorrect.
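As a quick sanity check before re-running the installer, ansible-inventory can confirm which hosts end up in the node group (the inventory file name is illustrative, output trimmed):
$ ansible-inventory -i inventory.yml --graph node
@node:
  |--exp-enc-01.lan.com
  |--exp-enc-02.lan.com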

Wordpress on bluemix cloudfoundry crashing

I followed the steps on this blog to get WordPress going:
https://blog.pivotal.io/pivotal-cloud-foundry/products/getting-started-with-wordpress-on-cloud-foundry
When I do a cf push it keeps crashing, with the following lines in the error:
2016-05-14T15:41:44.22-0700 [App/0] OUT total size is 2,574,495 speedup is 0.99
2016-05-14T15:41:44.24-0700 [App/0] ERR fusermount: entry for /home/vcap/app/htdocs/wp-content not found in /etc/mtab
2016-05-14T15:41:44.46-0700 [App/0] OUT 22:41:44 sshfs | fuse: mountpoint is not empty
2016-05-14T15:41:44.46-0700 [App/0] OUT 22:41:44 sshfs | fuse: if you are sure this is safe, use the 'nonempty' mount option
2016-05-14T15:41:44.64-0700 [DEA/86] ERR Instance (index 0) failed to start accepting connections
2016-05-14T15:41:44.68-0700 [API/1] OUT App instance exited with guid cf2ea899-3599-429d-a39d-97d0e99280e4 payload: {"cc_partition"=>"default", "droplet"=>"cf2ea899-3599-429d-a39d-97d0e99280e4", "version"=>"c94b7baf-4da4-44b5-9565-dc6945d4b3ce", "instance"=>"c4f512149613477baeb2988b50f472f2", "index"=>0, "reason"=>"CRASHED", "exit_status"=>1, "exit_description"=>"failed to accept connections within health check timeout", "crash_timestamp"=>1463265704}
2016-05-14T15:41:44.68-0700 [API/1] OUT App instance exited with guid cf2ea899-3599-429d-a39d-97d0e99280e4 payload: {"cc_partition"=>"default", "droplet"=>"cf2ea899-3599-429d-a39d-97d0e99280e4", "version"=>"c94b7baf-4da4-44b5-9565-dc6945d4b3ce", "instance"=>"c4f512149613477baeb2988b50f472f2", "index"=>0, "reason"=>"CRASHED", "exit_status"=>1, "exit_description"=>"failed to accept connections within health check timeout", "crash_timestamp"=>1463265704}
my manifest file:
cf-ex-wordpress$ cat manifest.yml
---
applications:
- name: myapp
  memory: 128M
  path: .
  buildpack: https://github.com/cloudfoundry/php-buildpack
  host: near
  services:
  - mysql-db
  env:
    SSH_HOST: user@abc.com
    SSH_PATH: /home/user
    SSH_KEY_NAME: sshfs_rsa
    SSH_OPTS: '["cache=yes", "kernel_cache", "compression=no", "large_read"]'
Please check your SSH mount; more details at https://github.com/dmikusa-pivotal/cf-ex-wordpress/issues
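If the sshfs mount is the suspect, one way to narrow it down is to try the same mount by hand with the values from the manifest (the local mount point and key path are just examples):
$ mkdir -p /tmp/wp-test
$ sshfs user@abc.com:/home/user /tmp/wp-test -o IdentityFile=~/.ssh/sshfs_rsa -o cache=yes,kernel_cache,compression=no,large_read
$ ls /tmp/wp-test              # should show the shared wp-content
$ fusermount -u /tmp/wp-test   # unmount when done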

Saltstack: ignoring result of cmd.run

I am trying to invoke a command on provisioning via SaltStack. If the command fails, then the state fails, and I don't want that (the retcode of the command doesn't matter).
Currently I have the following workaround:
Run something:
  cmd.run:
    - name: command_which_can_fail || true
Is there any way to make such a state ignore the retcode using Salt features? Or maybe I can exclude this state from the logs?
Use check_cmd:
fails:
  cmd.run:
    - name: /bin/false

succeeds:
  cmd.run:
    - name: /bin/false
    - check_cmd:
      - /bin/true
Output:
local:
----------
          ID: fails
    Function: cmd.run
        Name: /bin/false
      Result: False
     Comment: Command "/bin/false" run
     Started: 16:04:40.189840
    Duration: 7.347 ms
     Changes:
              ----------
              pid:
                  4021
              retcode:
                  1
              stderr:
              stdout:
----------
          ID: succeeds
    Function: cmd.run
        Name: /bin/false
      Result: True
     Comment: check_cmd determined the state succeeded
     Started: 16:04:40.197672
    Duration: 13.293 ms
     Changes:
              ----------
              pid:
                  4022
              retcode:
                  1
              stderr:
              stdout:

Summary
------------
Succeeded: 1 (changed=2)
Failed:    1
------------
Total states run: 2
If you don't care what the result of the command is, you can use:
Run something:
  cmd.run:
    - name: command_which_can_fail; exit 0
This was tested in Salt 2017.7.0 but would probably work in earlier versions.
