Airflow task initiation issue

We are running more than 150 DAGs in our production Airflow environment and we are facing task initiation issues very often. We are running Airflow 1.7.2 with the LocalExecutor, hosted on Google Compute Engine, with Cloud SQL as our metadata database.
How can we fix this? I have upgraded Airflow to 1.8.2, but no luck. As a temporary workaround we change the DAG name and start date, but that is not a real solution. What is the proper fix, and why is this happening?

It appears that the Airflow scheduler and webserver are working fine, but there is an issue with the Airflow workers. If you review the worker logs, you will likely find an error there.

Related

Airflow: Why do DAG tasks run outdated DAG code?

I am running Airflow (1.10.9) through Cloud Composer (1.11.1) on GCP.
Whenever I update a DAG's code, I can see the updated code refreshed in the Airflow GUI, but for at least 10 minutes the DAG's tasks still run the old code.
A couple of questions:
Why does this delay occur and can it be decreased?
How can I know when the task's code has been updated to make sure nobody runs old code?
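Not a definitive answer, but for context: in stock Airflow the scheduler only re-reads DAG files on a polling interval, and Cloud Composer additionally syncs the DAGs bucket out to the workers, so some lag between saving a file and tasks picking up the new code is expected. A minimal airflow.cfg sketch of the scheduler keys involved (values are illustrative, not Composer's defaults):

[scheduler]
# How often (in seconds) to scan the DAGs folder for new files
dag_dir_list_interval = 300
# Minimum interval (in seconds) before the same DAG file is re-parsed
min_file_process_interval = 30

In Composer these would typically be changed through the environment's Airflow configuration overrides rather than by editing the file directly.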

Airflow SparkSubmitOperator with Yarn Cluster Mode not being able to track application status

I started reading about how we could run Spark batch jobs using Airflow.
I have tried using SparkSubmitOperator in local mode and it works fine. However, I need a recommendation on whether we can use it in cluster mode.
The only problem I see with cluster mode is that the application status cannot be tracked; see the reference in the link below:
https://albertusk95.github.io/posts/2019/12/airflow-tracks-spark-driver-status-cluster-deployment/
Please share your experience if you have tried this operator in cluster mode and it works well, or if there are any issues with using it.
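For context, here is a minimal sketch (not an authoritative answer) of how SparkSubmitOperator is typically wired up for YARN cluster mode. The connection id spark_yarn_cluster is a made-up example: it is assumed to be a Spark connection whose host is yarn and whose extra sets deploy-mode to cluster, configured in the Airflow UI. Whether the driver status is then tracked correctly is exactly what the linked post discusses.

from datetime import datetime

from airflow import DAG
# Airflow 1.10.x import path; on Airflow 2.x use
# airflow.providers.apache.spark.operators.spark_submit instead.
from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator

with DAG(
    dag_id="spark_batch_example",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    submit_job = SparkSubmitOperator(
        task_id="submit_spark_job",
        # Hypothetical connection: host=yarn, extra={"deploy-mode": "cluster"}
        conn_id="spark_yarn_cluster",
        application="/path/to/job.py",   # your Spark application
        name="airflow-spark-batch",
        executor_cores=2,
        executor_memory="4g",
        num_executors=4,
        verbose=True,
    )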

Airflow web URL unable to connect

I am using the Celery executor with RabbitMQ. I changed the Airflow config file as below:
broker_url = amqp://guest:guest@localhost:5672//
celery_result_backend = amqp://guest:guest@localhost:5672//
I started the webserver and the other services, but the web UI URL is not working.
While listing the DAGs I get the warning below:
WARNING - You have configured a result_backend of amqp://guest:guest@localhost:5672//, it is highly recommended to use an alternative result_backend (i.e. a database).
Kindly help
I had this issue originally too; then I ran pip install celery (after having used a downgraded version per the article I was following). With the newest version of celery, the UI loaded and everything appears to be working well. But I do still get those same warning messages for the RabbitMQ setup using amqp. I assume it is because it is not officially supported. See: https://github.com/apache/airflow/pull/2830/files
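For what it's worth, the warning itself goes away if the result backend points at a database instead of the AMQP broker, as the message recommends. A minimal airflow.cfg sketch, assuming a local PostgreSQL database named airflow (credentials and host are placeholders; depending on the Airflow version the key is celery_result_backend or result_backend, both under [celery]):

[celery]
broker_url = amqp://guest:guest@localhost:5672//
# Celery's database result backend takes a SQLAlchemy URL prefixed with db+
result_backend = db+postgresql://airflow_user:airflow_pass@localhost:5432/airflow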

What is the ideal environment to run Apache Airflow on?

I am currently running Airflow through Ubuntu WSL on my PC, which is working great. However, I am setting up a pipeline that will need to run constantly (24/7), so I am looking for ideas and recommendations on what to run Airflow on. I obviously do not want to have my own system on all the time.
Surprisingly, I cannot find much information on this! It seems it is not discussed at length...
It depends on your workload.
If you have only a few tasks to run, you can just create a VM on any cloud provider (GCP, AWS, Azure, etc.) that runs 24x7 and install Airflow on it.
If your workload is high, you can use Kubernetes (GKE, EKS, etc.) and install Airflow on it.

Paused dag restarted on upgrading airflow from 1.8.0 to 1.8.1?

Recently I upgraded Airflow from 1.8.0 to 1.8.1. The upgrade went fine, but once I restarted the web server and scheduler, all paused DAGs restarted automatically and ran multiple runs from the date they had been stopped. It messed up most of our users' data and we had to clean it up manually. How can we prevent this from happening in future upgrades?
In airflow.cfg just make sure to have dags_are_paused_at_creation = True and I believe that should take care of your issue. It is super annoying to run into things like that so I'm sorry about that!
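For reference, that setting lives under [core] in airflow.cfg. On 1.8+ you may also want to disable catch-up by default, so that an unpaused DAG does not immediately backfill every run missed while it was stopped (the behaviour described above); that key lives under [scheduler]:

[core]
# New DAGs start in the paused state until you enable them explicitly
dags_are_paused_at_creation = True

[scheduler]
# Do not backfill missed schedule intervals unless a DAG opts in with catchup=True
catchup_by_default = False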
