I placed a DAG file in the dags folder based on a tutorial, with slight modifications, but it doesn't show up in the GUI or when I run airflow dags list.
Answering my own question: check the Python file for exceptions by running it directly. It turned out that an exception in the DAG's Python script, caused by a missing import, kept the DAG from showing up in the list. I note this in case another new user comes across it. To me the moral of the story is that DAG files should be checked by running them directly with python whenever they are modified, because no obvious error shows up otherwise; the DAG may simply disappear from the list.
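Another way to surface the same import errors, as a minimal sketch (the dags path here is an assumption, adjust to your setup), is to load the folder into a DagBag and print whatever it failed to import:

from airflow.models import DagBag

# Parse the dags folder the same way the scheduler does and report import errors.
dag_bag = DagBag(dag_folder="/opt/airflow/dags", include_examples=False)
for path, error in dag_bag.import_errors.items():
    print(path)
    print(error)
print("DAGs loaded:", list(dag_bag.dags))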
I have two files inside the dags directory: dag_1.py and dag_2.py.
dag_1.py creates a static DAG, and dag_2.py creates dynamic DAGs based on external JSON files at some location.
The static DAG (created by dag_1.py) contains a task at a later stage which generates some of these input JSON files for dag_2.py, and the dynamic DAGs are created in this manner (a rough sketch of the pattern is below).
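For context, a hedged sketch of what a dag_2.py like this typically looks like; the paths, JSON fields, and operator choice are assumptions, and Airflow 2.0-style imports are used. One DAG object is registered at module level per JSON file found:

import glob
import json
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy import DummyOperator

# One DAG per JSON config file; the scheduler only picks up DAGs reachable at module level.
for config_path in glob.glob("/data/dag_configs/*.json"):
    with open(config_path) as f:
        config = json.load(f)

    dag = DAG(
        dag_id=config["dag_id"],
        start_date=datetime(2021, 1, 1),
        schedule_interval=config.get("schedule_interval"),
        catchup=False,
    )
    DummyOperator(task_id="start", dag=dag)
    globals()[config["dag_id"]] = dag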
This used to work fine with Airflow 1.x versions, where DAG serialization was not used. But with Airflow 2.0, DAG serialization has become mandatory. Sometimes I get the following exception in the scheduler when dynamic DAGs are spawned:
[2021-01-02 06:17:39,493] {scheduler_job.py:1293} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
Traceback (most recent call last):
File "/global/packages/python/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1275, in _execute
self._run_scheduler_loop()
File "/global/packages/python/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1377, in _run_scheduler_loop
num_queued_tis = self._do_scheduling(session)
File "/global/packages/python/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1474, in _do_scheduling
self._create_dag_runs(query.all(), session)
File "/global/packages/python/lib/python3.7/site-packages/airflow/jobs/scheduler_job.py", line 1557, in _create_dag_runs
dag = self.dagbag.get_dag(dag_model.dag_id, session=session)
File "/global/packages/python/lib/python3.7/site-packages/airflow/utils/session.py", line 62, in wrapper
return func(*args, **kwargs)
File "/global/packages/python/lib/python3.7/site-packages/airflow/models/dagbag.py", line 171, in get_dag
self._add_dag_from_db(dag_id=dag_id, session=session)
File "/global/packages/python/lib/python3.7/site-packages/airflow/models/dagbag.py", line 227, in _add_dag_from_db
raise SerializedDagNotFound(f"DAG '{dag_id}' not found in serialized_dag table")
airflow.exceptions.SerializedDagNotFound: DAG 'dynamic_dag_1' not found in serialized_dag table
After this the scheduler gets terminated, which is expected.
When I check the serialized_dag table manually after this error, I am able to see the DAG entry in it.
The issue is not reproducible every time. What could be the probable cause? Is there any Airflow configuration I should try tweaking?
We had the same issue after updating in the following order:
1.10.12 -> 1.10.14
1.10.14 -> 2.0.0
I followed the upgrade guide all the way through, and we had no issues until, at some random point a few hours later, the scheduler started crashing and complaining about random DAGs not being found in the database.
Our deployment procedure involves clearing out the /opt/airflow/dags folder and doing a clean install every time (we store DAGs and supporting code in Python packages).
So every now and then on 1.10.x we had cases where the scheduler parsed an empty folder and wiped the serialized DAGs from the database, but it was always able to restore the picture on the next parse.
Apparently in 2.0, as part of the effort to make the scheduler highly available, the DAG processor and the scheduler were fully separated. This leads to a race condition:
if the scheduler job hits the database before the DAG processor has updated the serialized_dag table, it finds nothing and crashes;
if luck is on your side, the above does not happen and you won't see this exception.
To get rid of this problem, I disabled scheduling of all DAGs by updating is_paused in the database, restarted the scheduler, and once it had generated the serialized DAGs, turned all DAGs back on (a rough sketch of the pause step is below).
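If you prefer the ORM over raw SQL, a hedged sketch of the pause/unpause step might look like the following; this is my approximation, not necessarily the exact statements used above:

from airflow.models import DagModel
from airflow.utils.session import create_session

# Pause every DAG so the scheduler stays idle while serialized_dag is repopulated.
with create_session() as session:
    session.query(DagModel).update({DagModel.is_paused: True}, synchronize_session=False)

# ...restart the scheduler, wait for the serialized DAGs to appear, then unpause:
with create_session() as session:
    session.query(DagModel).update({DagModel.is_paused: False}, synchronize_session=False)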
I fixed this issue in https://github.com/apache/airflow/pull/13893, which will be released as part of Airflow 2.0.1.
Airflow 2.0.1 will be released next week (8 Feb 2021, most likely).
Not enough rep to comment, so I have to leave an answer, but:
is this a clean 2.0 install or an upgrade of your old 1.10.x instance? and
are you recycling the DAG names?
I literally just hit this problem (I found this question googling to see who else was in the same boat).
In my case, it's an upgraded existing 1.10.x install, and although the dags were generated dynamically, the names were recycled. I was getting errors clicking on the dag in the GUI and it was killing the scheduler.
Turns Out(TM), deleting the dags entirely using the 'trashcan' button in the GUI overview and letting them regenerate fixed it (as in, the problem immediately went away and hasn't recurred in the last 30 minutes).
At a guess, it smells to me like maybe some aspect of the dynamic dags wasn't properly migrated in the db upgrade step, and wiping them out and letting them fully regenerate fixed the problem. Obviously, you lose all your history etc, but (in my case at least) that's not necessarily a big deal.
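For anyone who wants to script the same "delete and let it regenerate" step instead of clicking the trashcan button, here is a hedged sketch using the (experimental in these versions) delete_dag helper; the dag_id is just an example:

from airflow.api.common.experimental.delete_dag import delete_dag

# Removes the DAG's metadata records; the DAG reappears on the next parse of the dags folder.
delete_dag("dynamic_dag_1")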
The selected answer didn't work for me (after bashing my head for a few hours).
Here's what worked:
Go to the backend database (PostgreSQL) and delete all the records regarding logs, task instances, failed tasks and so on, but don't delete the main tables (if you can't tell the difference, just remove the tables I mentioned).
Then run airflow db init.
It seems like old data about obsolete and deleted DAGs and tasks can really turn Airflow into a mess. Delete the mess and it starts working again (a hedged sketch of that cleanup is below).
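This is only my approximation of the cleanup described above, going through Airflow's own models rather than raw SQL; back up the database before trying anything like it, since it deletes history:

from airflow.models import Log, TaskFail, TaskInstance
from airflow.utils.session import create_session

# Wipe log/task history tables (NOT the "main" tables such as dag or serialized_dag).
with create_session() as session:
    for model in (Log, TaskFail, TaskInstance):
        session.query(model).delete(synchronize_session=False)

# Afterwards, run: airflow db init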
I am using some constants in my DAG which are being imported from another (configuration) Python file in the project directory.
Scenario
Airflow is running, and I add a new DAG. I import the schedule_interval from that configuration file, along with another constant, which I pass to a function called by the PythonOperator in my DAG.
I update the code base, so the new DAG gets added to the airflow_dag folder and the configuration file gets updated with the new constants (a rough sketch of this layout is below).
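Roughly, the layout looks like the following; this is only an illustrative sketch, with made-up module, constant, and function names, and Airflow 2.x import paths assumed:

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Shared config module in the project directory; these names are hypothetical.
from configuration_file import SCHEDULE_INTERVAL, SOME_CONSTANT


def process(value):
    print(value)


with DAG(
    dag_id="example_dag",
    start_date=datetime(2020, 1, 1),
    schedule_interval=SCHEDULE_INTERVAL,
    catchup=False,
) as dag:
    PythonOperator(
        task_id="process",
        python_callable=process,
        op_args=[SOME_CONSTANT],
    )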
Problem
The schedule_interval does not work and the DAG does not get scheduled. It also throws an import error for any other constant that is imported in the DAG.
In the web UI I can see the new DAG, but I also see a red error label that says constant XYZ could not be found in configuration_file.py, while it is actually there.
The error does not go away, no matter how long I wait.
Bad Solution
I go and restart the airflow scheduler (and the webserver as well, just in case), and everything starts working again.
Question
Is there a solution where I will not have to restart Airflow for it to pick up those changes?
Note: The proposed solution to refresh the DAG in the question Can airflow load dags file without restart scheduler did not work for me.
Currently I am using Airflow version 1.10.10.
Inside the airflow/logs folder there are many folders named after DAGs, but there is also a folder named scheduler which, when opened, contains folders named in date format (e.g. 2020/07/08), going all the way back to the date when I first started using Airflow. After searching through multiple forums I'm still not sure what these logs are for.
Anyway, the problem is that I keep wondering whether it is okay to delete the contents of the scheduler folder, since it takes up so much space, unlike the rest of the folders that are named after DAGs (I'm assuming that's where the logs of each DAG run are stored). Will deleting the contents of scheduler cause any error or loss of DAG logs?
This might be a silly question, but I want to make sure, since this Airflow is on a production server. I've tried creating an Airflow instance locally and deleting the scheduler folder contents, and no error seems to have occurred. Any feedback and shared experience on handling this is welcome.
Thanks in advance.
It contains the logs of the Airflow scheduler, AFAIK. I have only used it once, for a problem with SLAs.
I've been deleting old files in it for over a year and never encountered a problem. This is my command to delete scheduler log files older than 45 days:
find /etc/airflow/logs/scheduler -type f -mtime +45 -delete
I'm having trouble updating a DAG file. The DAG still has an old version of my DAG file. I added a task, but it does not seem to be updated when I check the log and the UI (DAG -> Code).
I have very simple tasks.
I checked, of course, the DAG directory path in airflow.cfg and restarted the airflow webserver/scheduler.
I have no issue running it (but with the old DAG file).
Looks like a bug in Airflow. A temporary workaround is to delete the task instances from the Airflow database:
delete from task_instance where dag_id='<dag_name>' and task_id='<deleted_task_name>';
This is simpler and less impactful than the resetdb route, which would delete everything, including variables and connections set up before.
Use the terminal and run the command below soon after changing the DAG file:
airflow initdb
This worked for me.
You can try removing the old .pyc file for that DAG in the dags folder and letting it be generated again (a small helper sketch is below).
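A hedged helper along those lines; the dags path is an assumption, so adjust it to match your airflow.cfg:

import pathlib

# Remove stale compiled bytecode so the DAG file is re-parsed from source.
for pyc in pathlib.Path("/opt/airflow/dags").rglob("*.pyc"):
    pyc.unlink()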
The UI is sometimes not up to date for me, but the code is actually there in the DagBag. You can try to:
use the refresh button to see if the code refreshed;
use a higher version, 1.8+; this happened to me before when I used 1.7.x, but after 1.8+ things seem much better once you refresh the DAG in the UI.
You can also use "airflow test" to see if the code is in place, and try the advice from #Him as well.
The same thing happened to me.
In the end the best thing is to resetdb, add connections and import variables again, then run airflow initdb and set the scheduler back up again.
I don't know why this happens. Does anybody know? It seems it is not so easy to add tasks or change names once compiled. Removing *.pyc files or the logs folder did not work for me.
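Since a resetdb wipes variables (and connections), here is a hedged sketch for dumping variables to JSON beforehand so they can be re-imported afterwards; the file name is an example, and connections would need a similar dump:

import json

from airflow import settings
from airflow.models import Variable

# Snapshot all Airflow Variables before running resetdb.
session = settings.Session()
variables = {v.key: v.val for v in session.query(Variable).all()}
with open("variables_backup.json", "w") as f:
    json.dump(variables, f, indent=2)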
On the DAGs page of the Airflow webserver, delete the DAG. This deletes its record in the database. After a while the DAG will appear on the page again, but the old task_id will be gone.