Saltstack events not firing or reactor system not responding

I am firing events from the salt minion using the command:
salt-call event.send 'kubemaster/kubernetes/started' '{master_ip:"10.102.28.170"}'
The master has reactor config:
reactor:
- kubemaster/kubernetes/started:
- /srv/reactor/testfile.sls
In the master's debug output I can see the event arriving on the message bus, but there is no message about rendering the SLS file:
[DEBUG ] Gathering reactors for tag kubemaster/kubernetes/started
[DEBUG ] Compiling reactions for tag kubemaster/kubernetes/started
[DEBUG ] LazyLoaded local_cache.prep_jid
Is there a better way to debug?
Is there any way to check if it is even looking for the sls file?
Thanks!

Could be a copy/paste error, but you are missing an indentation level for the second and third lines in the reactor config that you pasted.
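For reference, with the indentation restored the reactor section would look roughly like this (and the salt-master needs a restart after the config change):

reactor:
  - 'kubemaster/kubernetes/started':
    - /srv/reactor/testfile.sls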

Related

use saltstack state.sls to install mysql but not return

I have been searching online for a long time, but with no luck. Please help or give me some ideas on how to achieve this.
My SaltStack files are on GitHub.
The Salt state that installs MySQL:
[root@salt_master srv]# cat salt/base/lnmp_yum/mysql/mysql_install.sls
repo_init:
  file.managed:
    - name: /etc/yum.repos.d/mysql-{{pillar['mysql_version']}}.repo
    - source: salt://lnmp_yum/mysql/files/mysql-{{pillar['mysql_version']}}.repo
    - user: root
    - group: root
    - mode: 644

mysql_install:
  pkg.installed:
    - names:
      - mysql
      - mysql-server
      - mysql-devel
    - require:
      - file: repo_init
  service.running:
    - name: mysqld
    - enable: True
After running the command:
salt 'lnmp_base' state.sls lnmp_yum.mysql.mysql_install -l debug
it keeps printing the following log output:
[DEBUG ] Checking whether jid 20170526144936867490 is still running
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/master', 'salt_master_master', 'tcp://127.0.0.1:4506', 'clear')
[DEBUG ] Passing on saltutil error. This may be an error in saltclient. 'retcode'
[DEBUG ] Checking whether jid 20170526144936867490 is still running
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/master', 'salt_master_master', 'tcp://127.0.0.1:4506', 'clear')
[DEBUG ] Passing on saltutil error. This may be an error in saltclient. 'retcode'
[DEBUG ] Checking whether jid 20170526144936867490 is still running
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/master', 'salt_master_master', 'tcp://127.0.0.1:4506', 'clear')
[DEBUG ] Passing on saltutil error. This may be an error in saltclient. 'retcode'
When I look at the Salt minion, MySQL is already installed and started, but the salt master keeps printing the log above and never exits.
I searched for days, but I could not solve it.
The same thing happens when I install JBoss.
Thanks in advance.
Two thoughts occur to me:
I think MySQL has a basic ncurses configuration GUI that requires user input (setting the default password etc.). If I remember correctly, your salt state is still running and waiting for a human to type at the screen. You can fix this by feeding it an answer/config file.
Stolen shamelessly from another post:
sudo debconf-set-selections <<< 'mysql-server-5.6 mysql-server/root_password password your_password'
sudo debconf-set-selections <<< 'mysql-server-5.6 mysql-server/root_password_again password your_password'
sudo apt-get -y install mysql-server-5.6
The other is that it may simply take longer than your default salt timeout for the task. That can be configured at the salt command line with -t, or in the config file (I forget which setting).
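For example, to give this particular run a longer timeout from the command line (300 seconds here is just an arbitrary value for illustration):

salt 'lnmp_base' state.sls lnmp_yum.mysql.mysql_install -t 300 -l debug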

salt-master is not receiving scheduled job event fired on salt-minion

I would like to ask for your help. I use SaltStack as a job scheduler for my slaves (minions), and I would like to be able to see, on the master, the job events fired on a minion.
My setup
The job is scheduled on the salt-master using a pillar for the given minion. The pillar is:
schedule_returner: mongo

schedule:
  cmd:
    function: cmd.run
    args:
      - date +%s >> /tmp/job_runs
    minutes: 1
    maxrunning: 1
The scheduled job is executed on the minion without any problem. I can see the returned data in MongoDB and a new timestamp in my dummy file /tmp/job_runs. The configuration file on the minion, /etc/salt/minion.d/_schedule.conf, is:
schedule:
  __mine_interval: {enabled: true, function: mine.update, jid_include: true, maxrunning: 2, minutes: 60, return_job: false}
  cmd:
    args: [date +%s >> /tmp/job_runs]
    function: cmd.run
    maxrunning: 1
    minutes: 1
This file was generated and I didn't modify it.
In the minion log I can see:
[DEBUG ] SaltEvent PUB socket URI: /var/run/salt/minion/minion_event_1fa42d8010_pub.ipc
[DEBUG ] SaltEvent PULL socket URI: /var/run/salt/minion/minion_event_1fa42d8010_pull.ipc
[DEBUG ] Initializing new IPCClient for path: /var/run/salt/minion/minion_event_1fa42d8010_pull.ipc
[DEBUG ] Sending event: tag = __schedule_return; data = {'fun_args': ['date +%s >> /tmp/job_runs'], 'jid': 'req', 'return': '', 'retcode': 0, 'success': True, 'schedule': 'cmd', 'cmd': '_return', 'pid': 10264, '_stamp': '2017-02-22T10:03:05.750874', 'fun': 'cmd.run', 'id': 'vagrant.vm'}
[DEBUG ] Minion of "salt" is handling event tag '__schedule_return'
[DEBUG ] schedule.handle_func: Removing /var/cache/salt/minion/proc/20170222100305532940
[DEBUG ] LazyLoaded mongo.returner
Now I'm interested in listening to those events with tag __schedule_return.
On the minion, I can run the following commands:
wget https://raw.github.com/saltstack/salt/develop/tests/eventlisten.py
sudo python eventlisten.py -n minion
The output of eventlisten.py is correct and I can see this event.
Now my question is: is there any way to listen to these events on the salt-master?
When I run almost the same commands on master:
wget https://raw.github.com/saltstack/salt/develop/tests/eventlisten.py
sudo python eventlisten.py
I'm not able to see the events fired on the minion by my scheduled job.
My motivation for this is that I'm running SaltPad on my master and I would like to see my scheduled jobs among the recent jobs (websockets...).
Thank you for any help.
Listening for Events
The quickest way to watch the event bus is by calling the state.event runner on your salt-master:
salt-run state.event pretty=True
Firing Events
It's possible to fire an event to be sent up to the master from the minion using the event.send execution function:
salt-call event.send '__schedule_return' '{success: True, message: "It works!"}'
Reactor System
Salt's Reactor System gives the ability to trigger actions in response to an event. Reactor SLS files and event tags are associated in the master config file (by default /etc/salt/master or /etc/salt/master.d/reactor.conf).
In the master config section 'reactor:' you can specify a list of event tags to be matched. Each event tag can have a list of reactor SLS files to be run.
# Master config section "reactor"
reactor:
  # Match tag "__schedule_return"
  - '__schedule_return':
    # Things to run when the tag matches
    - /srv/reactor/do_stuff.sls
See the Reactor System documentation for more information.
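As a minimal sketch of what /srv/reactor/do_stuff.sls might contain (the state ID and the command are only placeholders), reacting to the schedule return by running a command back on the minion that sent the event:

# /srv/reactor/do_stuff.sls -- illustrative only
record_schedule_return:
  local.cmd.run:
    - tgt: '{{ data["id"] }}'
    - arg:
      - 'echo "__schedule_return received" >> /tmp/reactor_hits'

Reactor SLS files are rendered with Jinja and receive the event payload as data, so data["id"] targets the minion that fired the event.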

salt-cp error/timeout copying files

The files are there, but when I run salt-cp to collect them I get this funky error with no clear message. FWIW, I'm using different ports than the defaults:
publish_port: 44505
ret_port: 44506
The files are in place and the server responds normally:
salt 'test1' cmd.run 'ls /tmp/test*'
test1:
/tmp/test1-1.json
/tmp/test1-2.json
/tmp/test1-3.json
...
I wonder what's wrong with it (I also tried /tmp/test/ as the destination):
salt-cp 'test1' /tmp/test* salt:// -l debug
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Configuration file path: /etc/salt/master
[WARNING ] Insecure logging configuration detected! Sensitive data may be logged.
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Missing configuration file: /root/.saltrc
[DEBUG ] MasterEvent PUB socket URI: /var/run/salt/master/master_event_pub.ipc
[DEBUG ] MasterEvent PULL socket URI: /var/run/salt/master/master_event_pull.ipc
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/master', 'XXX_master', 'tcp://127.0.0.1:44506', 'clear')
[DEBUG ] Initializing new IPCClient for path: /var/run/salt/master/master_event_pub.ipc
[DEBUG ] SaltReqTimeoutError, retrying. (1/3)
[DEBUG ] SaltReqTimeoutError, retrying. (2/3)
[DEBUG ] SaltReqTimeoutError, retrying. (3/3)
[ERROR ] An un-handled exception was caught by salt's global exception handler:
SaltClientError: Salt request timed out. The master is not responding. If this error persists after verifying the master is up, worker_threads may need to be increased.
Traceback (most recent call last):
  File "/usr/bin/salt-cp", line 10, in <module>
    salt_cp()
  File "/usr/lib/python2.7/dist-packages/salt/scripts.py", line 359, in salt_cp
    client.run()
  File "/usr/lib/python2.7/dist-packages/salt/cli/cp.py", line 38, in run
    cp_.run()
  File "/usr/lib/python2.7/dist-packages/salt/cli/cp.py", line 105, in run
    ret = local.cmd(*args)
  File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 568, in cmd
    **kwargs)
  File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 317, in run_job
    raise SaltClientError(general_exception)
SaltClientError: Salt request timed out. The master is not responding. If this error persists after verifying the master is up, worker_threads may need to be increased.
salt.exceptions.SaltClientError: Salt request timed out. The master is not responding. If this error persists after verifying the master is up, worker_threads may need to be increased.
Appreciate any help.
Gosh dern it. From the docs: Salt copy is only intended for use with small files (< 100KB). If you need to copy large files out to minions, please use the cp.get_file function.
Sure... only files under 100KB... very useful!
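For what it's worth, the cp.get_file route the docs point to looks roughly like this for a single file (the salt:// path is just a placeholder for something served from the master's file_roots):

salt 'test1' cp.get_file salt://files/test1-1.json /tmp/test1-1.json

Note that this copies from the master down to the minion. If the goal is the opposite direction, pulling files from the minion up to the master, cp.push is the usual candidate and requires file_recv: True in the master config, e.g.:

salt 'test1' cp.push /tmp/test1-1.json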

Saltstack apply state on minions through salt reactors and runners

I have multiple salt deployment environments.
I have a requirement in which I raise an event from the minions; the master, upon receiving the event, generates a few files which I then want to copy to the minions.
How do I do this?
I was trying to get it to work using orchestrate. This is what I have right now:
The reactor SLS:
copy_cert:
  runner.state.orchestrate:
    - mods: _orch.copy_certs
    - saltenv: 'central'
The copy_certs SLS:
copy_kube_certs:
  salt.state:
    - tgt: 'kubeminion'
    - tgt_type: nodegroup
    - sls:
      - kubemaster.copy_certs
The problem is that I want this to happen for all the environments, not just one. How do I do that?
Or is there a way to loop over the environments using Jinja in some way?
Also, is it possible to do this with anything other than orchestrate?
You don't need to use orchestrate for this; all you need is the salt reactor.
Let's say you fire an event from the minion with salt-call event.send tag='event/test' (you can watch the salt event bus using salt-run state.event pretty=True):
event/test {
    "_stamp": "2017-05-24T10:36:05.907438",
    "cmd": "_minion_event",
    "data": {
        "__pub_fun": "event.send",
        "__pub_jid": "20170524133601757005",
        "__pub_pid": 4590,
        "__pub_tgt": "salt-call"
    },
    "id": "minion_A",
    "tag": "event/test"
}
Now you need to decide what happens when salt receives the event. Edit/create /etc/salt/master.d/reactor.conf (remember to restart the salt-master after editing this file):
reactor:
  - event/test:                      # event tag to match
    - /srv/reactor/some_state.sls    # sls file to run
some_state.sls:

some_state:
  local.state.apply:
    - tgt: kubeminion
    - tgt_type: nodegroup
    - arg:
      - kubemaster.copy_certs
    - kwarg:
        saltenv: central
This will in turn apply the state kubemaster.copy_certs to all minions in the "kubeminion" nodegroup.
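If the same state also needs to run for several environments (the other part of the question), reactor SLS files are rendered through Jinja, so one possible sketch is to loop over the saltenvs and emit one reaction per environment (the environment names below are made up):

{% for env in ['central', 'dev', 'prod'] %}
copy_certs_{{ env }}:
  local.state.apply:
    - tgt: kubeminion
    - tgt_type: nodegroup
    - arg:
      - kubemaster.copy_certs
    - kwarg:
        saltenv: {{ env }}
{% endfor %}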
Hope this helps.

Introspection into forever-running salt highstate?

I've been experimenting with Salt, and I've managed to lock up my highstate command. It's been running for hours now, even though there's nothing that warrants that kind of time.
The last change I made was to modify the service.watch state for nginx. It currently reads:
nginx:
  pkg.installed:
    - name: nginx
  service:
    - running
    - enable: True
    - restart: True
    - watch:
      - file: /etc/nginx/nginx.conf
      - file: /etc/nginx/sites-available/default.conf
      - pkg: nginx
The last change I made was to add the second file: argument to watch.
After letting it run all night, with no change in state, I subsequently Ctrl-C'd the process. The last output from sudo salt -v 'web*' state.highstate -l debug was:
[DEBUG ] Checking whether jid 20140403022217881027 is still running
[DEBUG ] get_returns for jid 20140403103702550977 sent to set(['web1.mysite.com']) will timeout at 10:37:04
[DEBUG ] jid 20140403103702550977 found all minions
Execution is still running on web1.mysite.com
^CExiting on Ctrl-C
This job's jid is:
20140403022217881027
The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later run:
salt-run jobs.lookup_jid 20140403022217881027
Running it again immediately, I got this:
$ sudo salt -v 'web*' state.highstate -l debug
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Missing configuration file: /home/eykd/.salt
[DEBUG ] Configuration file path: /etc/salt/master
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Missing configuration file: /home/eykd/.salt
[DEBUG ] LocalClientEvent PUB socket URI: ipc:///var/run/salt/master/master_event_pub.ipc
[DEBUG ] LocalClientEvent PULL socket URI: ipc:///var/run/salt/master/master_event_pull.ipc
Executing job with jid 20140403103715454952
-------------------------------------------
[DEBUG ] Checking whether jid 20140403103715454952 is still running
[DEBUG ] get_returns for jid 20140403103720479720 sent to set(['web1.praycontinue.com']) will timeout at 10:37:22
[INFO ] jid 20140403103720479720 minions set(['web1.mysite.com']) did not return in time
[DEBUG ] Loaded no_out as virtual quiet
[DEBUG ] Loaded json_out as virtual json
[DEBUG ] Loaded yaml_out as virtual yaml
[DEBUG ] Loaded pprint_out as virtual pprint
web1.praycontinue.com:
Minion did not return
I then ran the same command, and received this:
$ sudo salt -v 'web*' state.highstate -l debug
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Missing configuration file: /home/eykd/.salt
[DEBUG ] Configuration file path: /etc/salt/master
[DEBUG ] Reading configuration from /etc/salt/master
[DEBUG ] Missing configuration file: /home/eykd/.salt
[DEBUG ] LocalClientEvent PUB socket URI: ipc:///var/run/salt/master/master_event_pub.ipc
[DEBUG ] LocalClientEvent PULL socket URI: ipc:///var/run/salt/master/master_event_pull.ipc
Executing job with jid 20140403103729848942
-------------------------------------------
[DEBUG ] Loaded no_out as virtual quiet
[DEBUG ] Loaded json_out as virtual json
[DEBUG ] Loaded yaml_out as virtual yaml
[DEBUG ] Loaded pprint_out as virtual pprint
web1.mysite.com:
Data failed to compile:
----------
The function "state.highstate" is running as PID 4417 and was started at 2014, Apr 03 02:22:17.881027 with jid 20140403022217881027
There is no process running under PID 4417. Running sudo salt-run jobs.lookup_jid 20140403022217881027 displays nothing.
Unfortunately, I can't connect to the minion via ssh, as salt hasn't provisioned my authorized_keys yet. :\
So, to my question: what the heck is wrong, and how in the world do I find that out?
So, after a lot of debugging, this was a result of an improperly configured Nginx service. service nginx start was hanging, and thus so was salt-minion.
I had this happen when I aborted a run of state.highstate on the salt-master with Ctrl-C. It turned out that the PID referenced in the error message was actually the PID of a salt-minion process on the minion machine.
I was able to resolve the problem by restarting the salt-minion process on the minion and then re-executing state.highstate on the master.
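In practice that recovery was something along these lines (service commands as on the 2014-era init system described above):

# on the stuck minion
service salt-minion restart

# back on the master
sudo salt -v 'web*' state.highstate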
