When I try to run the airflow scheduler CLI command locally, I keep getting the error Connection in use: ('0.0.0.0', 8794). No scheduler process is running, but the port seems to be in use. Has anyone had this problem and can help me?
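A leftover child process (for example a log-serving subprocess from a previous scheduler run) can keep the port bound after the scheduler itself has exited. One quick way to confirm the port really is taken is to try binding it yourself; a minimal sketch, with the port number taken from the error above:

```python
import socket

def port_in_use(port, host="0.0.0.0"):
    """Try to bind host:port; if the bind fails, something already holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            return True
    return False

print(port_in_use(8794))  # True means some process still holds the port
```

If it prints True while no scheduler is running, find the owner with `lsof -i :8794` (or `fuser 8794/tcp`) and stop it before restarting the scheduler.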
Related
I have a problem with Apache Airflow logging. My Airflow cluster has one webserver, one scheduler and five Celery workers. All workers run normally, but only four of them can serve logs; one worker fails with the error: "Failed to fetch log file from worker. Client error '404 NOT FOUND' for url". I checked the /etc/hosts file; it is fine and contains the hostname. Please help me. Thank you very much <3 <3
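The 404 usually means the webserver is building a log URL from a worker hostname it cannot reach. Besides /etc/hosts, it is worth confirming that the name each worker reports actually resolves from the webserver host. A small sketch (the helper name and the `worker-5` hostname are mine, not an Airflow API):

```python
import socket

def resolvable(hostname):
    """Return the IP a hostname resolves to, or None if the lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None

# Run this on the webserver host with the hostnames your workers report
# (visible in the failing log URL); "worker-5" is a placeholder.
for name in ["localhost", "worker-5"]:
    print(name, "->", resolvable(name))
```

If a worker's name does not resolve, the usual fixes are correcting DNS or /etc/hosts on the webserver host, or pointing `hostname_callable` in airflow.cfg at something that returns a resolvable name.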
Hi, based on the Airflow docs I was able to set up cloud/remote logging.
Remote logging works for DAG and task logs, but it does not back up or remotely store the following logs:
scheduler
dag_processing_manager
I am using the Docker Hub Airflow image.
Is it possible to execute an airflow DAG remotely via command line?
I know there is an airflow command line tool but it seems to allow executing from the server's terminal rather than from any external client.
I can think of two options here:
Experimental REST API (preferable): use good old GET/POST requests to trigger/cancel DagRuns
Airflow CLI: SSH into the machine running Airflow and trigger the DAG via command-line
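For the first option, a trigger boils down to a single POST. A minimal stdlib-only sketch; the endpoint shape follows the experimental API (`/api/experimental/dags/<dag_id>/dag_runs`), and auth and exact paths vary by Airflow version, so treat it as a template:

```python
import json
import urllib.request

def trigger_dag(base_url, dag_id, conf=None):
    """POST a new DagRun to Airflow's experimental REST API."""
    url = f"{base_url}/api/experimental/dags/{dag_id}/dag_runs"
    payload = json.dumps({"conf": conf or {}}).encode()
    req = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# e.g. trigger_dag("http://localhost:8080", "my_dag_id")
```

The second option is as simple as `ssh user@airflow-host 'airflow trigger_dag my_dag_id'` (in Airflow 2.x the subcommand is `airflow dags trigger`).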
I currently have the airflow scheduler set up on Linux server A and the airflow webserver on Linux server B. Neither server has Internet access. I ran initdb on server A and keep all the DAGs on server A.
However, when I refresh the webserver UI, it keeps showing the error message:
This DAG isn't available in the webserver DagBag object
How do I configure the DAG folder for the webserver (server B) so it reads the DAGs from the scheduler (server A)?
I am using BashOperator. Is the CeleryExecutor a must?
Thanks in advance
The scheduler has found your dags_folder, and its processes, and is scheduling them accordingly. The webserver, however, can only "see" these DAGs through their existence in the database; it can't find them in its own dags_folder path.
You need to ensure that the dags_folder on both servers contains the same files and that the two are kept in sync with one another. This is out of scope for Airflow, and it won't handle it on your behalf.
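In practice that means a deploy step that mirrors one dags_folder to the other: `rsync -a --delete` over SSH, a shared network mount, or both servers pulling the same git repo. The mirroring logic itself is simple; a pure-Python sketch of the same "copy new, delete stale" behaviour (paths are hypothetical):

```python
import shutil
from pathlib import Path

def mirror_dags(src, dst):
    """Make dst an exact copy of src: copy every file, remove strays.

    Equivalent in spirit to `rsync -a --delete src/ dst/` run between
    server A and server B (e.g. across a shared mount).
    """
    src, dst = Path(src), Path(dst)
    dst.mkdir(parents=True, exist_ok=True)
    wanted = {p.relative_to(src) for p in src.rglob("*") if p.is_file()}
    for rel in wanted:
        target = dst / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src / rel, target)
    for p in list(dst.rglob("*")):
        if p.is_file() and p.relative_to(dst) not in wanted:
            p.unlink()
```

Whatever mechanism you pick, run it on every change (or on a short timer) so the scheduler and webserver never disagree about which DAG files exist.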
I have an Airflow service running normally on a remote machine, which can be accessed through the browser at http://airflow.xxx.com
Now I want to dynamically upload DAGs to the Airflow at airflow.xxx.com from another machine, and have those DAGs run automatically.
After reading the Airflow documentation (http://airflow.incubator.apache.org/), I found a way to dynamically create DAGs and run them automatically, but only on the Airflow machine airflow.xxx.com itself.
How can I accomplish this from another machine? Is there something like WebHDFS that would let me send commands directly to the remote Airflow?
You should upload your new DAG to the Apache Airflow DAG directory.
If you didn't set Airflow up in a cluster environment, the webserver, scheduler and worker should all be running on the same machine.
On that machine, if you did not amend airflow.cfg, your DAG directory is dags_folder = /usr/local/airflow/dags
If you can access the Airflow machine from the other machine over SFTP (or FTP), you can simply put the file in that directory.
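Once the file has been transferred (e.g. `sftp` to the host, then `put my_dag.py`), dropping it into dags_folder is all that is required; the scheduler picks up new files on its periodic scan of that directory (`dag_dir_list_interval`, 300 seconds by default). A sketch of that final step, using the default path from above:

```python
import shutil
from pathlib import Path

# Default path from the answer above; adjust if you changed airflow.cfg.
DAGS_FOLDER = Path("/usr/local/airflow/dags")

def deploy_dag(local_dag_file, dags_folder=DAGS_FOLDER):
    """Copy a DAG file into dags_folder; the scheduler will discover it
    on its next scan of the directory."""
    dags_folder = Path(dags_folder)
    dags_folder.mkdir(parents=True, exist_ok=True)
    return shutil.copy(local_dag_file, dags_folder)
```

There is no need to restart anything after the copy; wait one scan interval and the DAG appears in the UI.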