We are new to SaltStack. Our use case is to schedule a Python/Go script to run on the salt-master. Is there any way or configuration to do this?
Yes, you can schedule anything; see Job Management in the Salt docs.
To run a script from the master process (not a minion):
schedule:
  job1:
    function: salt.cmd
    args:
      - cmd.run
      - /path/to/script
    when: whenever
That uses the salt.cmd runner to execute the cmd.run function on the master, which runs the given script.
If you have a minion on the master server, you can use that instead and save a level of indirection.
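If you do run a minion on the master box, the equivalent schedule lives in the minion config and calls the execution module directly. A minimal sketch, assuming a cron-style trigger (which requires croniter on the minion) and a placeholder script path:
schedule:
  job1:
    function: cmd.run
    args:
      - /path/to/script
    cron: '0 2 * * *'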
I have allowed ports 4505 and 4506 on the master node.
When I use cp.get_dir to transfer a directory to a v3000.5 minion (CentOS 6), it takes a long time and then returns:
Minion did not return. [No response]
The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later, run the following command:
salt-run jobs.lookup_jid 20221208081502397105
and the minion log shows:
2022-12-08 16:16:25,987 [salt.utils.parsers :1082][WARNING ][21455] Minion received a SIGTERM. Exiting.
2022-12-08 16:16:28,141 [salt.minion :2003][WARNING ][21819] The minion failed to return the job information for job 20221208081547958420. This is often due to the master being shut down or overloaded. If the master is running, consider increasing the worker_threads value.
When I use the same command to transfer the same directory to a v3005.1 minion (CentOS 7), it works.
I tried several CentOS 6 minion nodes and got the same error.
It transfers the name of the first file and directory to the minion, but the file content is empty. It seems the communication between the master and the minion breaks when the first file starts to be transferred.
But when I use the cmd.run module, it works.
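For reference, the cp.get_dir call was of roughly this shape (the target and paths here are placeholders, not the exact ones used):
salt 'centos6-minion' cp.get_dir salt://mydir /tmp/mydir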
In Airflow, how should I handle the error "This DAG isn't available in the webserver DagBag object. It shows up in this list because the scheduler marked it as active in the metadata database"?
I've copied a new DAG to an Airflow server, and have tried:
unpausing it and refreshing it (basic operating procedure, given in this previous answer https://stackoverflow.com/a/42291683/160406)
restarting the webserver
restarting the scheduler
stopping the webserver and scheduler, resetting the database (airflow resetdb), then starting the webserver and scheduler again
running airflow backfill (suggested in Airflow "This DAG isnt available in the webserver DagBag object")
running airflow trigger_dag
The scheduler log shows it being processed with no errors, and I can interact with it and view its state through the CLI, but it still does not appear in the web UI.
Edit: the webserver and scheduler are running on the same machine with the same airflow.cfg. They're not running in Docker.
They're run by Supervisor, which runs them both as the same user (airflow). The airflow user has read, write and execute permission on all of the dag files.
This helped me...
pkill -9 -f "airflow scheduler"
pkill -9 -f "airflow webserver"
pkill -9 -f "gunicorn"
then restart the airflow scheduler and webserver.
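Since the question's setup runs both services under Supervisor, that restart would be something along these lines (the program names are assumptions about your supervisord config):
supervisorctl start airflow-scheduler airflow-webserver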
Just had this issue myself. After changing permissions, resetting the metadata database, restarting the webserver, and even making some potential code changes to rectify the situation, the DAG still didn't show up.
However, I noticed that even though we were stopping the webserver, our gunicorn processes were still running. Killing those processes and then starting everything back up resulted in success.
I had the same problem on an Airflow instance installed from a Docker image.
What I did was:
1- delete all .pyc files
2- delete the DAG's rows from the metadata database using:
# hook is a DbApiHook (e.g. PostgresHook) connected to the Airflow metadata DB,
# and dag_input is the dag_id being cleaned up.
for t in ["xcom", "task_instance", "sla_miss", "log", "job", "dag_run", "dag"]:
    sql = "delete from {} where dag_id='{}'".format(t, dag_input)
    hook.run(sql, True)
3- restart webserver & scheduler
4- Execute airflow upgradedb
It resolved the problem for me.
If the airflow_home / dags_folder config parameters are the same for the scheduler, the web UI, and the command-line interface, the only causes of the error:
This DAG isn't available in the webserver DagBag object
are file permissions or an error in the Python script.
Please check:
Run the DAG file as a normal Python script and check for errors (see the example after this list).
The user in airflow.cfg and the user creating the DAG should be the same, or the DAG file should have execute permission for the airflow user.
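For the first check, simply running the DAG file with the Python interpreter will surface import and syntax errors (the path is a placeholder for a file in your dags_folder):
python ~/airflow/dags/my_dag.py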
With Airflow 1.9 I don't experience the problem with zombie gunicorn processes.
I do a simple restart, systemctl restart airflow-webserver, and it forces the webserver to refresh the DAG status.
I need some advice on how to restart all airflow services on deploy without killing the workers in the middle of a task.
I've written a deployment procedure for my DAGs which installs airflow and any other pip dependencies in a virtualenv. Once my release directory is ready, I:
Stop airflow-flower, airflow-worker, airflow-scheduler, and airflow-webserver
Update the "current" symlink to point to my new release
Start airflow-flower, airflow-worker, airflow-scheduler, and airflow-webserver
The problem with this deploy procedure is that the workers get killed immediately. I'd like to add some sort of monitoring to the script to pause all DAGs, wait for the workers to idle, then restart the services, but the airflow CLI has no way to learn which dags are enabled nor whether the workers are idle.
I understand that many of the airflow services can auto-detect changes in the dags folder, but I want each deployment to have its own virtualenv. If I don't restart all services then a new deployment won't pick up a new line in my requirements.txt file.
You have access to the Airflow DB, so consider developing a deployment script that does this process for you (a sketch follows the steps below):
Update the DAG table to pause all DAGs.
Read the TASK_INSTANCE table and wait until all RUNNING-state tasks complete.
Restart the Airflow services.
Update the DAG table to unpause the DAGs.
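A minimal sketch of steps 1, 2, and 4, assuming a PostgreSQL metadata DB reachable via psycopg2; the dag.is_paused and task_instance.state columns match the Airflow 1.x schema, but verify them against your version before relying on this:
import time

import psycopg2

# Assumed connection settings for the Airflow metadata DB.
conn = psycopg2.connect("dbname=airflow user=airflow host=localhost")
conn.autocommit = True
cur = conn.cursor()

# Step 1: pause every DAG so the scheduler stops queuing new tasks.
cur.execute("UPDATE dag SET is_paused = TRUE")

# Step 2: wait until no task instance is still running.
while True:
    cur.execute("SELECT count(*) FROM task_instance WHERE state = 'running'")
    if cur.fetchone()[0] == 0:
        break
    time.sleep(30)

# Step 3 happens outside this script: restart the Airflow services.

# Step 4: unpause the DAGs. In practice you would only unpause the ones that
# were unpaused before, e.g. by recording their dag_id values during step 1.
cur.execute("UPDATE dag SET is_paused = FALSE")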
Airflow workers quit gracefully on SIGINT. Update your process monitor to stop them with SIGINT instead of the default SIGTERM. If you're using systemd, the unit will look something like this:
...
[Service]
EnvironmentFile=/etc/sysconfig/airflow
User=airflow
Group=airflow
Type=simple
ExecStart=...
KillSignal=SIGINT
Restart=on-failure
RestartSec=10s
...
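After editing the unit file, reload systemd and restart the service so the new KillSignal takes effect (the unit name airflow-worker is assumed here):
systemctl daemon-reload
systemctl restart airflow-worker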
How can I get the PID of a service like Tomcat or JBoss?
I used salt.modules.service to start/stop a service, but it seems I cannot get the PID of the service.
You can execute arbitrary shell commands in Salt, which can be used to retrieve a process's PID:
salt minion-name cmd.run 'pgrep tomcat'
You can also use this pattern in a salt state, templating it up with jinja. For example:
linux-headers:
  pkg.installed:
    - name: linux-headers-{{ salt['cmd.run']('uname -r') }}
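If the minion has psutil installed, the ps execution module offers a similar lookup without shelling out (a hedged alternative, not part of the original answer):
salt minion-name ps.pgrep tomcat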
I am a total beginner with SaltStack, but I have managed to set up some states on a machine and run them on a minion.
What I have right now is a Debian machine set up with salt-master, as well as another Debian machine set up as a salt-minion.
Since I am using the salt-master also as a development machine, I would like to know if I can somehow apply the states on the master itself as well. And if so, how?
Is there a command I can run to apply the states on the master? (so far I was unable to find it)
Should I install salt-minion on the same machine as well to be able to do this and simply register the same machine as a minion on itself?
Thanks!
Since I am using the salt-master also as a development machine, I would like to know if I can somehow apply the states on the master itself as well. And if so, how?
You can do that with the following steps:
Install salt-minion on your development machine
Edit /etc/salt/minion to point to your master (vi /etc/salt/minion and change master: salt to master: 127.0.0.1)
(optional) Edit /etc/salt/minion_id to something that is meaningful to you
Start up your salt-minion
Use salt-key to accept your minion's key (see the commands after this list)
Use your salt-master to control your minion as if it were any other salt-minion
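Concretely, steps 4 and 5 might look like this on a Debian box with systemd (older installs would use service salt-minion start instead; the minion id is whatever you set in step 3):
systemctl start salt-minion     # on the master/dev machine
salt-key -L                     # list pending keys on the master
salt-key -a your-minion-id      # accept the new minion's key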
Is there a command I can run to apply the states on the master?
The salt-master doesn't really run the state files; the salt-minions do. If you followed the above steps, then you can target your salt-master to run highstate with the following command:
salt 'the_value_of_/etc/salt/minion_id' state.highstate
Should I install salt-minion on the same machine as well to be able to do this and simply register the same machine as a minion on itself?
Yup. I think you have an idea as to what you need to do and just need a push in the right direction.
Install both Minion and Master on a single node
I call such a node a Master Minion. No steps provided here; you already know them based on the question.
Some conceptual info instead:
In short, Master never applies states. Instead, it triggers Minions (the local Master Minion in this case).
Salt Minion and Master are two separate services with independent runtime & configuration.
While the instances share common software, at runtime they talk over the network (location-independent).
If you can apply states on a remote Minion, the same mechanism will work for the local Minion as well.
Additional info
There are two ways to apply states:
Master-side salt command to "push" states to multiple remote minions.
rpm -qf $(which salt)
salt-master-2015.5.3-4.fc22.noarch
Minion-side salt-call command to "pull" states on single local minion.
rpm -qf $(which salt-call)
salt-minion-2015.5.3-4.fc22.noarch
As long as only one minion is involved, it's better to use salt-call for the same effect:
salt-call state.highstate
Minion-side salt-call provides advantages, especially for testing, isolation, and troubleshooting:
It makes network issues (if any) more obvious.
It safely applies states only on single local minion (no way to specify more than one).
It shows debug output directly in the local terminal:
salt-call -l debug test.ping
On the last point: salt-call --local can also be used in a masterless setup, using no network at all.
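For example, to apply states from the local file roots without any master involved:
salt-call --local state.highstate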
Now it's near the end of 2015. Let's review some more possibilities for salt master self-control:
Install a minion alongside the salt master on the same box
This one has been widely discussed in the two answers above.
Use salt-ssh + salt-run state.orchestrate
Setup steps:
Step 1: install salt-ssh
Step 2: modify the roster file (e.g. /etc/salt/roster on CentOS 6). The default installation already provides an example. Since you probably ssh into the salt master, the username / password / private key setup should not be a problem for you. For example, to control a salt master vagrant box, this sample should do:
localhost:
  host: 127.0.0.1
  user: vagrant
  passwd: vagrant
  sudo: True
Now, steal from the official tutorial with a little twist:
# /srv/salt/orch/cleanfoo.sls
cmd.run:
  salt.function:
    - tgt: 'localhost'
    - ssh: 'true'
    - arg:
      - touch /tmp/test.txt
And run it with:
salt-run state.orchestrate orch.cleanfoo
Check the /tmp directory on your salt master vagrant box to see whether the test.txt file is there.
This approach should also work for states (see the sketch below). Either way you need to install something. I prefer the second way since, in general, calling salt master self-control (to provision some work) is just a step before I actually call minions to process other states.
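A hedged sketch of that state variant, under the same roster setup (the file name and state ID here are illustrative, not from the tutorial):
# /srv/salt/orch/highstate_master.sls
run_highstate_on_master:
  salt.state:
    - tgt: 'localhost'
    - ssh: 'true'
    - highstate: True
Run it the same way, with salt-run state.orchestrate orch.highstate_master.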