I am running Airflow (1.10.9) through Cloud Composer (1.11.1) on GCP.
Whenever I update a DAG's code, I can see the updated code in the Airflow GUI, but for at least 10 minutes the DAG's tasks still run the old code.
A couple of questions:
Why does this delay occur and can it be decreased?
How can I know when the task's code has been updated to make sure nobody runs old code?
Related
I have this problem: I changed the DAG workflow to replace one task with another, but the replaced task still shows up even though it is no longer part of the workflow (please see image). My question is, how do I remove that task?
Any help is much appreciated. Thanks!
I think your best bet would be to turn off the DAG, restart the scheduler (or just start it with airflow scheduler) and wait. Changes to DAGs are usually picked up only after the scheduler has been running for a while.
Also, while refreshing either the graph view or the tree view of the DAG, you may see the task "randomly" appear and disappear until it finally stabilizes at the latest version.
Once some scheduler cycles have passed and a refresh shows only the new configuration, you can safely turn the DAG back on.
We are running more than 150 DAGs in our production Airflow environment and we face task initiation issues very often. We are running Airflow 1.7.2 with the LocalExecutor, hosted on Google Compute Engine, with Cloud SQL as our metadata DB.
How can we fix this issue? I upgraded Airflow to 1.8.2, but no luck. As a temporary workaround, we change the DAG name and start date, but this is not a real solution. What is the solution for this issue, and why is it happening?
It appears that the airflow scheduler and webserver are working OK, but there is an issue with the airflow workers. If you review the airflow worker log, you may see an error there.
I am following the tutorial in the Airflow docs. When I visit the UI, I don't see the toggle to turn the DAGs on and off (or pause them?).
I tried clicking the trigger DAG button on the right, but I guess this just manually runs it once, ignoring the scheduler. (A side question: it just says it's running now and it isn't finishing. Is it waiting for something?)
So, did I have to do something to schedule the DAG first, and is that why I'm not seeing a pause button, because it isn't scheduled? That would surprise me, because surely I should be able to schedule it from the UI?
Lastly, what are all those other example DAGs and how can I hide them?
Seems to me that some part of your Airflow setup is broken.
Either the scheduler is not working or the files are not deployed.
My suggestion is to check this question as well: Airflow 1.9.0 is queuing but not launching tasks
Recently I upgraded Airflow from 1.8.0 to 1.8.1. The upgrade went fine, but once I restarted the web server and scheduler, all paused DAGs restarted automatically and started running multiple runs from the date they had been stopped. It messed up most of the user data, and we need to clean it up manually. How can we prevent this from happening in future upgrades?
In airflow.cfg just make sure to have dags_are_paused_at_creation = True and I believe that should take care of your issue. It is super annoying to run into things like that so I'm sorry about that!
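To double-check the setting before restarting anything after an upgrade, something like the following might help. It is only a minimal sketch: it assumes the option lives in the [core] section of airflow.cfg, the helper name paused_at_creation and the sample config text are made up for illustration, and it parses the file with Python's standard configparser rather than going through Airflow itself.

```python
import configparser

# Hypothetical sample; a real airflow.cfg contains many more options.
SAMPLE_CFG = """
[core]
dags_are_paused_at_creation = True
"""


def paused_at_creation(cfg_text):
    """Return the [core] dags_are_paused_at_creation value, or None if it is not set."""
    # Interpolation is disabled so that '%' characters elsewhere in the file
    # are not treated as configparser placeholders.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_option("core", "dags_are_paused_at_creation"):
        return None
    return parser.getboolean("core", "dags_are_paused_at_creation")


print(paused_at_creation(SAMPLE_CFG))
```

In practice you would pass in the contents of your own airflow.cfg instead of SAMPLE_CFG. A result of None means the file does not set the option explicitly, so whatever default your Airflow version ships with applies.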
I am new to Airflow. When I click Run with 'ignore all dependencies' in the Task Instance Context Menu, I get the message 'Only works with the CeleryExecutor'.
I tried refreshing the web UI, but that doesn't help.
(I use the LocalExecutor and don't want to switch to the CeleryExecutor.)
Why does this happen, and how can I run a single task, ignoring all dependencies, from the web UI when I use the LocalExecutor?
I had a similar problem. The issues were the following:
With the LocalExecutor you cannot run a single task; you can only run the whole DAG at once. Source code
The DAG was already in the 'success' state.
A possible solution is to change the DAG status to running.
I worked around this issue by selecting the first task in my DAG and marking all downstream tasks as success.
I would then clear the task I actually wanted to run, and the scheduler would pick it up and run it for me.