How to kill airflow worker properly (with celery executor)? - airflow

I can start an "airflow worker" (with a celery executor) but I don't know how to kill it properly. I have the impression that many subprocesses are created and I don't know how to shut them gracefully.
Thanks in advance.

I normally set up the Airflow processes as systemd services and just run systemctl stop airflow-*. However, I think you can simply kill the main airflow worker process with a standard Linux kill command? That's probably what the systemd daemon does anyway.
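If you go the systemd route, a minimal unit file for the worker could look like the sketch below. Treat it as an illustration only: the paths, user, and AIRFLOW_HOME are assumptions, and Airflow ships its own reference unit files in its scripts/systemd directory.
# /etc/systemd/system/airflow-worker.service (example path; adjust to your installation)
[Unit]
Description=Airflow celery worker
After=network.target

[Service]
User=airflow
Environment=AIRFLOW_HOME=/home/airflow/airflow
# On Airflow 1.10 the command is "airflow worker" instead of "airflow celery worker"
ExecStart=/usr/local/bin/airflow celery worker
Restart=on-failure
RestartSec=10s

[Install]
WantedBy=multi-user.target
systemctl stop airflow-worker then sends SIGTERM to the worker, which triggers Celery's warm shutdown: the worker stops accepting new tasks and waits for the ones already running to finish.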

To stop a worker running on a machine you can use:
airflow celery stop
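As far as I can tell, airflow celery stop reads the worker's PID file (by default $AIRFLOW_HOME/airflow-worker.pid, or the path given with --pid) and sends it SIGTERM, which makes Celery do a warm shutdown: running tasks are allowed to finish before the process exits. A rough manual equivalent, assuming the default PID file location:
kill -TERM $(cat $AIRFLOW_HOME/airflow-worker.pid)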

Related

How to shut down a computer (host)

I know it sounds weird, but I have a case where Deno would need to shut down its own host (and therefore kill its own process). Is this possible?
I specifically need this for Linux (Lubuntu), if that's relevant. I guess this requires sudo rights, which sucks, but it would be an option.
For those interested in the details: I'm coding Minecraft server software, and if the server has no players for 30 minutes, it shuts itself down to save some power. The Raspberry Pi it runs on, which would otherwise run 24/7 anyway, has a wake-on-LAN feature, so it can be booted again. After boot, the server manager software starts automatically as a Linux service.
You can create a subprocess to do this:
await Deno.run({ cmd: ["shutdown", "-h", "now"] }).status();
Concepts
Deno is capable of spawning a subprocess via Deno.run.
--allow-run permission is required to spawn a subprocess.
Spawned subprocesses do not run in a security sandbox.
Communicate with the subprocess via the stdin, stdout and stderr streams.
Use a specific shell by providing its path/name and its string input switch, e.g. Deno.run({cmd: ["bash", "-c", "ls -la"]});
See also the Ask Ubuntu question "Shutdown from terminal without entering password?" for ideas on how to avoid needing sudo to call shutdown, or for alternative commands that you can invoke from Deno instead.
To expand on what mfulton2 wrote, here is how I made it work so that I did not need to start the program with sudo rights, but was still able to shut down the computer without using sudo, either outside or within the app.
Open the sudoers file for editing: sudo visudo (this edits /etc/sudoers and validates the syntax before saving)
Add the line username ALL = NOPASSWD: /sbin/shutdown (replace username with your user), or grant it to the whole admin group instead: %admin ALL = NOPASSWD: /sbin/shutdown
In your Deno script, write await Deno.run({ cmd: ["shutdown", "-h", "now"] }).status();
Execute the script!
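To run the script you also need to grant the subprocess permission; the file name below is just a placeholder:
deno run --allow-run shutdown.ts
# newer Deno versions let you restrict the permission to specific binaries:
deno run --allow-run=shutdown shutdown.ts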
Keep in mind that any experienced Linux user would probably tell you that this is very dangerous (it probably is) and that it might not be the best way. But IMHO the damage this can cause is minor, as it only affects the shutdown command.

How to stop airflow? [duplicate]

I am new to Airflow and tried to run a DAG by starting the airflow webserver and scheduler. After I closed the scheduler and webserver, the Airflow processes were still running.
ps aux | grep airflow shows 2 airflow webserver running, and scheduler running for all dags.
I tried running kill $(ps aux | grep airflow | awk '{print $2}') but it did not help.
I don't have sudo permissions and webserver UI access.
If you run Airflow locally and start it with the two commands airflow scheduler and airflow webserver, then those processes will run in the foreground. So, simply hitting Ctrl-C for each of them should terminate them and all their child processes.
If you don't have those two processes running in the foreground, there is another way. Airflow creates files with process IDs of the scheduler and gunicorn server in its home directory (by default ~/airflow/).
Running
kill $(cat ~/airflow/airflow-scheduler.pid)
should terminate the scheduler.
Unfortunately, airflow-webserver.pid contains the PID of the gunicorn server and not the initial Airflow command that started it (which is the parent of the gunicorn process). So, we will first have to find the parent PID of the gunicorn process and then kill the parent process.
Running
kill $(ps -o ppid= -p $(cat ~/airflow/airflow-webserver.pid))
should terminate the webserver.
If simply running kill (i.e., sending SIGTERM) for these processes does not work you can always try sending SIGKILL: kill -9 <pid>. This should definitely kill them.
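Putting both together, a small shell sketch (assuming the default ~/airflow home and PID file names) would be:
# stop the scheduler
kill $(cat ~/airflow/airflow-scheduler.pid)
# stop the webserver by killing the parent of the gunicorn master recorded in the PID file
kill $(ps -o ppid= -p $(cat ~/airflow/airflow-webserver.pid))
# if either process refuses to exit, escalate: kill -9 <pid>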

Running airflow worker gives error: Address already in use

I am running Airflow with CeleryExecutor. I am able to run the commands airflow webserver and airflow scheduler but trying to run airflow worker gives the error: socket.error: [Errno 98] Address already in use.
In the Docker container running the Airflow server, a process was already listening on port 8793, which is the default value of the worker_log_server_port setting in airflow.cfg. I changed the port to 8795 and airflow worker then started fine.
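For reference, the setting can be changed either in airflow.cfg or via an environment variable; the section it lives in depends on your Airflow version ([celery] in 1.x, [logging] in 2.x), so check your own config:
# $AIRFLOW_HOME/airflow.cfg
[celery]
worker_log_server_port = 8795
# or, as an environment variable override for the same 1.x setting:
export AIRFLOW__CELERY__WORKER_LOG_SERVER_PORT=8795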
Or you can check which process is listening on 8793 with lsof -i :8793, and if you don't need that process, kill it with kill $(lsof -t -i:8793). I was running an Ubuntu container in Docker, so I had to install lsof first:
apt-get update
apt-get install lsof
See if there's a serve_logs process running, e.g.:
/usr/bin/python2 /usr/bin/airflow serve_logs
If so, kill it and try again.
I had the same problem, and Javed's answer about changing worker_log_server_port in airflow.cfg worked for me.

How do you keep your airflow scheduler running in AWS EC2 while exiting ssh?

Hi, I'm using Airflow and have my Airflow project on EC2. However, how would one keep the Airflow scheduler running when my Mac goes to sleep or I exit the SSH session?
You have a few options, but none will keep it active on a sleeping laptop. On a server:
Can use --daemon to run as daemon: airflow scheduler --daemon
Or, maybe run in background: airflow scheduler >& log.txt &
Or, run it inside screen, then detach using Ctrl-a d and reattach as needed with screen -r. That works across SSH connections.
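For the screen option, a typical session (the session name is arbitrary) looks like:
screen -S airflow-scheduler   # start a named session
airflow scheduler             # run the scheduler inside it
# press Ctrl-a then d to detach; the scheduler keeps running
screen -r airflow-scheduler   # reattach later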
I use nohup to keep the scheduler running and redirect the output to a log file like so:
nohup airflow scheduler >> ${AIRFLOW_HOME}/logs/scheduler.log 2>&1 &
Note: Assuming you are running the scheduler here on your EC2 instance and not on your laptop.
In case you need more details on running it as a daemon, i.e. detaching completely from the terminal and redirecting stdout and stderr, here is an example:
airflow webserver -p 8080 -D --pid /your-path/airflow-webserver.pid --stdout /your-path/airflow-webserver.out --stderr /your-path/airflow-webserver.err
airflow scheduler -D --pid /your-path/airflow-scheduler.pid --stdout /your-path/airflow-scheduler.out --stderr /your-path/airflow-scheduler.err
The most robust solution would be to register it as a service on your EC2 instance. Airflow provides systemd and upstart scripts for that (https://github.com/apache/incubator-airflow/tree/master/scripts/systemd and https://github.com/apache/incubator-airflow/tree/master/scripts/upstart).
For Amazon Linux, you'd need the upstart scripts, and for e.g. Ubuntu, you would use the systemd scripts.
Registering it as a system service is much more robust because Airflow will be restarted after a reboot or a crash. That is not the case when you use e.g. nohup, as other people suggest here.
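Assuming you install the unit files from those scripts (the repository names them airflow-scheduler.service, airflow-webserver.service, airflow-worker.service, and so on, but check your copy), enabling them so they survive a reboot is roughly:
sudo systemctl daemon-reload
sudo systemctl enable airflow-scheduler airflow-webserver
sudo systemctl start airflow-scheduler airflow-webserver
systemctl status airflow-scheduler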

How to daemonize process?

On my hosting account I run a chat in Node.js. All works fine; however, my host times out processes every 12 hours. Apparently, when the process is daemonized it will not time out, so I tried to daemonize it by:
Using forever: running forever start chat.js. Running forever list confirms it runs, and ps -ef shows ? in the TTY column.
Trying nohup node chat.js: in ps -ef the TTY column shows pts/0 and the PPID is 1.
Disconnecting stdin, stdout, and stderr and making it ignore the hangup signal (SIGHUP): nohup ./myscript 0<&- &> my.admin.log.file & with no luck. In ps -ef the TTY column is pts/0 and the PPID is anything but 1.
Running (nohup ./myscript 0<&- &>my.admin.log.file &) with no luck again. In ps -ef the TTY column is pts/0 and the PPID is 1.
After all this, the process still times out in about 12 hours.
Now I have tried (nohup ./myscript 0<&- &>my.admin.log.file &) & and am waiting, but I'm not keeping my hopes up and need someone's help.
The hosting guys claim that daemon processes do not time out, but how can I make sure my process is a daemon? Nothing I tried seems to work, even though, with my limited understanding, ps -ef seems to suggest the process is daemonized.
What should I do to daemonize the process without moving to a much more expensive hosting plan? Can I argue with the host that, after all this, the process is a daemon and they just got it wrong somewhere?
Upstart is a really easy way to daemonize processes
http://upstart.ubuntu.com/
There's some info on using it with node and monit, which will restart Node for you if it crashes
http://howtonode.org/deploying-node-upstart-monit
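If your host actually lets you add Upstart jobs, a minimal job file might look like the following sketch; the paths and log location are assumptions:
# /etc/init/chat.conf
description "node.js chat server"
start on runlevel [2345]
stop on runlevel [016]
respawn
exec /usr/bin/node /path/to/chat.js >> /var/log/chat.log 2>&1
You would then start it with sudo start chat (or sudo service chat start), and Upstart restarts it if it crashes; add monit on top if you want extra monitoring, as the linked article describes.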
