Dag Seems to be missing - airflow

I have a dag which checks for new workflows to be generated (Dynamic DAG) at a regular interval and if found, creates them. (Ref: Dynamic dags not getting added by scheduler )
The above DAG is working and the dynamic DAGs are getting created and listed in the web-server. Two issues here:
When clicking on the DAG in web url, it says "DAG seems to be missing"
The listed DAGs are not listed using "airflow list_dags" command
Error:
DAG "app01_user" seems to be missing.
The same is for all other dynamically generated DAGs. I have compiled the Python script and found no errors.
Edit1:
I tried clearing all data and running "airflow run". It ran successfully but no Dynamic generated DAGs were added to "airflow list_dags". But when running the command "airflow list_dags", it loaded and executed the DAG, (which generated Dynamic DAGs). The dynamic DAGs are also listed as below:
[root#cmnode dags]# airflow list_dags
sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8\nLANG=en_US.UTF-8)
sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8\nLANG=en_US.UTF-8)
[2019-08-13 00:34:31,692] {settings.py:182} INFO - settings.configure_orm(): Using pool settings. pool_size=15, pool_recycle=1800, pid=25386
[2019-08-13 00:34:31,877] {__init__.py:51} INFO - Using executor LocalExecutor
[2019-08-13 00:34:32,113] {__init__.py:305} INFO - Filling up the DagBag from /root/airflow/dags
/usr/lib/python2.7/site-packages/airflow/operators/bash_operator.py:70: PendingDeprecationWarning: Invalid arguments were passed to BashOperator (task_id: tst_dyn_dag). Support for passing such arguments will be dropped in Airflow 2.0. Invalid arguments were:
*args: ()
**kwargs: {'provide_context': True}
super(BashOperator, self).__init__(*args, **kwargs)
-------------------------------------------------------------------
DAGS
-------------------------------------------------------------------
app01_user
app02_user
app03_user
app04_user
testDynDags
Upon running again, all the above generated 4 dags disappeared and only the base DAG, "testDynDags" is displayed.

When I was getting this error, there was an exception showing up in the webserver logs. Once I resolved that error and I restarted the webserver it went through normally.
From what I can see this is the error that is thrown when the webserver tried to parse the dag file and there is an error. In my case it was an error importing a new operator I added to a plugin.

Usually, I check in Airflow UI, sometimes the reason of broken DAG appear in there. But if it is not there, I usually run the .py file of my DAG, and error (reason of DAG cant be parsed) will appear.

I never got to work on dynamic DAG generation but I did face this issue when DAG was not present on all nodes ( scheduler, worker and webserver ). In case you have airflow cluster, please make sure that DAG is present on all airflow nodes.

Same error, the reason was I renamed my dag_id in uppercase. Something like "import_myclientname" into "import_MYCLIENTNAME".

I am little late to the party but I faced the error today:
In short: try executing airflow dags report and/or airflow dags reserialize
Check out my comment here:
https://stackoverflow.com/a/73880927/4437153

I found that airflow fails to recognize a dag defined in a file that does not have from airflow import DAG in it, even if DAG is not explicitly used in that file.
For example, suppose you have two files, a.py and b.py:
# a.py
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
def makedag(dag_id="a"):
with DAG(dag_id=dag_id) as dag:
DummyOperator(task_id="nada")
dag = makedag()
and
# b.py
from a import makedag
dag = makedag(dag_id="b")
Then airflow will only look at a.py. It won't even look at b.py at all, even to notice if there's a syntax error in it! But if you add from airflow import DAG to it and don't change anything else, it will show up.

Related

Working DAG fails when triggered from another dag in CLI

I have a simple DAG which connects to an impala db and runs an sql script. The dag runs fine when running independently:
airflow dags test original_dag_name 2022-9-27
However, when I test using TriggerDagRunOpertor from another DAG, the DAG fails:
airflow tasks test other_dag_name trigger_task 2022-9-27
Looking at the logs for original_dag_name I see the following:
[Cloudera][ODBC] (11560) Unable to locate SQLGetPrivateProfileString function
...which appears to be driver related, which doesn't make sense as it works fine when I trigger the DAG on its own. Is there some sort of config not getting set correctly when triggering via TriggerDagRunOperator?
Here is the TriggerDagRunOperator task:
task_run_original_dag = TriggerDagRunOperator (
task_id='run_original_dag',
trigger_dag_id='original_dag_name',
execution_date='{{ ds }}',
reset_dag_run=True,
wait_for_completion=True,
poke_interval=60
)

Airflow 2 - debugging why dag is not loading

On Airflow 2 my dag is not showing on the UI, and I'm getting DAG Import Errors (...) for it.
The error message is insufficient for me to debug (it's a custom operator, with a lot of custom logic - so I don't want to get into details of the error itself).
On Airflow 1.X I could use cli:
airflow list_dags
to get more elaborated debug message, is there anything analogical on airflow 2 ?
I'm looking for a cli command/UI option that will provide me with more elaborated error message, than the one I'm getting on the main screen of the webserver.
As described in the Airlfow's documentation, to test DAG loading you can simply run:
python your-dag-file.py
If there is any problem during the DAG loading phase you will get a stack trace here.
The later sections also describe how to test custom operators.
As explained in the upgrading manual the
airflow list_dags has been changed to airflow dags list
The full syntax is:
airflow dags list [-h] [-o table, json, yaml] [-S SUBDIR]
for more information see docs

Airflow Scheduler fails to execute Windows EXE via WSL

My Windows 10 machine has Airflow 1.10.11 installed within WSL 2 (Ubuntu-20.04).
I have a BashOperator task which calls an .EXE on Windows (via /mnt/c/... or via symlink).
The task fails. Log shows:
[2020-12-16 18:34:11,833] {bash_operator.py:134} INFO - Temporary script location: /tmp/airflowtmp2gz6d79p/download.legacyFilesnihvszli
[2020-12-16 18:34:11,833] {bash_operator.py:146} INFO - Running command: /mnt/c/Windows/py.exe
[2020-12-16 18:34:11,836] {bash_operator.py:153} INFO - Output:
[2020-12-16 18:34:11,840] {bash_operator.py:159} INFO - Command exited with return code 1
[2020-12-16 18:34:11,843] {taskinstance.py:1150} ERROR - Bash command failed
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/airflow/models/taskinstance.py", line 984, in _run_raw_task
result = task_copy.execute(context=context)
File "/usr/local/lib/python3.8/dist-packages/airflow/operators/bash_operator.py", line 165, in execute
raise AirflowException("Bash command failed")
airflow.exceptions.AirflowException: Bash command failed
[2020-12-16 18:34:11,844] {taskinstance.py:1187} INFO - Marking task as FAILED. dag_id=test-dag, task_id=download.files, execution_date=20201216T043701, start_date=20201216T073411, end_date=20201216T073411
And that's it. Return code 1 with no further useful info.
Running the very same EXE via bash works perfectly, with no error (I also tried it on my own program which emits something to the console - in bash it emits just fine, but via airflow scheduler it's the same error 1).
Some more data and things I've done to rule out any other issue:
airflow scheduler runs as root. I also confirmed it's running in a root context by putting an whoami command in my BashOperator, which indeed emitted root (I should also note that all native Linux programs run just fine! only the Windows programs don't.)
The Windows EXE I'm trying to execute and its directory have full 'Everyone' permissions (on my own program of course, wouldn't dare doing it on my Windows folder - that was just an example.)
The failure happens both when accessing via /mnt/c as well as via symlink. In the case of a symlink, the symlink has 777 permissions.
I tried running airflow test on a BashOperator task - it runs perfectly - emits output to the console and returns 0 (success).
Tried with various EXE files - both "native" (e.g. ones that come with Windows) as well as my C#-made programs. Same behavior in all.
Didn't find any similar issue documented in Airflow's GitHub repo nor here in Stack Overflow.
The question is: How does Airflow's Python usage of a subprocess (which airflow scheduler uses to run Bash Operators) different than a "normal" Bash, causing an error 1?
you can use the library subprocess and sys of Python and PowerShell
In the folder Airflow > Dags, create 2 files: main.py and caller.py
so, main.py call caller.py and caller.py go in machine (Windows) to run the files or routines.
This is the process:
code Main.py:
# Importing the libraries we are going to use in this example
from airflow import DAG
from datetime import datetime, timedelta
from airflow.operators.bash_operator import BashOperator
# Defining some basic arguments
default_args = {
'owner': 'your_name_here',
'depends_on_past': False,
'start_date': datetime(2019, 1, 1),
'retries': 0,
}
# Naming the DAG and defining when it will run (you can also use arguments in Crontab if you want the DAG to run for example every day at 8 am)
with DAG(
'Main',
schedule_interval=timedelta(minutes=1),
catchup=False,
default_args=default_args
) as dag:
# Defining the tasks that the DAG will perform, in this case the execution of two Python programs, calling their execution by bash commands
t1 = BashOperator(
task_id='caller',
bash_command="""
cd /home/[Your_Users_Name]/airflow/dags/
python3 Caller.py
""")
# copy t1, paste, rename t1 to t2 and call file.py
# Defining the execution pattern
t1
# comment: t1 execute and call t2
# t1 >> t2
Code Caller.py
import subprocess, sys
p = subprocess.Popen(["powershell.exe"
,"cd C:\\Users\\[Your_Users_Name]\\Desktop; python file.py"] # file .py
#,"cd C:\\Users\\[Your_Users_Name]\\Desktop; .\file.html"] # file .html
#,"cd C:\\Users\\[Your_Users_Name]\\Desktop; .\file.bat"] # file .bat
#,"cd C:\\Users\\[Your_Users_Name]\\Desktop; .\file.exe"] # file .exe
, stdout=sys.stdout
)
p.communicate()
How to know if your code will work in airflow, if run, its Ok.

Apache Airflow problem - "a task with task_id create_tag_template_field_result is already in the DAG"

So, I have a problem with even the blank Airflow installation.
As soon as I try to run
airflow test tutorial print_date 2015-06-01
I get a raised exception which says
PendingDeprecationWarning: The requested task could not be added to the DAG because a task with task_id create_tag_template_field_result is already in the DAG. Starting in Airflow 2.0, trying to overwrite a task will raise an exception.
What is the reason for this (as I made literally no changes to the installation whatsoever)?
I also got that when, in a previous installation, I tried to run my own dag... but the "create_tag_template_field_result" was nowhere to be found in my code.
you can set the config arg load_examples = False to solve it.
This is the test command will call get_dag function which will construct a DagBag object, in the construction function will call collect_dags function.
The collect_dags function when the conf arg LOAD_EXAMPLES=True(default True), will collect all the dags in the example path, that's where the task create_tag_template_field_result comes from.
And in the collect_dags function will call add_task function of every example task, that's where you add the create_tag_template_field_result task again.
And maybe it's quickstart when you added this task before for the first time while you didn't realize.
you can set the config arg load_examples = False to solve it
This warning is occuring in
/usr/local/lib/python3.7/dist-packages/airflow/example_dags/example_complex.py
so i remove or rename (for example, to not working name *.py.back ) this.
I had the same error with a fresh install.
Then I don't know if this helps, but I downgraded Airflow to version 1.10.10 (with python3.7) and the error was gone.

How to find the number of upstream tasks failed in Airflow?

I am having a tough time in figuring out how to find the failed task for the same dag run running twice on same day(same execution day).
Consider an example when a dag with dag_id=1 has failed on the first run (due to any reason lets say connection timeout maybe) and task got failed. TaskInstance table will contain the entry of the failed task when we try to query it. GREAT!!
But, If I re-run the same dag(note that dag_id is still 1) then in the last task(it has the rule of ALL_DONE so irrespective of the whether upstream task was failed or was successful it will be executed) I want to calculate the number of tasks failed in the current dag_run ignoring the previous dag_runs. I came across dag_run id which could be useful if we can relate it to TaskInstance but I could not. Any suggestions/help is appreciated.
In Airflow 1.10.x the same result can be achieved by much simpler code that avoids touching ORM directly.
from airflow.utils.state import State
def your_python_operator_callable(**context):
tis_dagrun = context['ti'].get_dagrun().get_task_instances()
failed_count = sum([True if ti.state == State.FAILED else False for ti in tis_dagrun])
print(f"There are {failed_count} failed tasks in this execution"
The one unfortunate problem is that context['ti'].get_dagrun() does not return instance of DAGRun when running test of a single task from CLI. In the effect, manual testing of that single task will fail but the standard run will work as expected.
You could create a PythonOperator task which queries the Airflow database to find the information you're looking for. This has the added benefit of passing along the information you need to query for the data you want:
from contextlib import closing
from airflow import models, settings
from airflow.utils.state import State
def your_python_operator_callable(**context):
with closing(settings.Session()) as session:
print("There are {} failed tasks in this execution".format(
session.query(
models.TaskInstance
).filter(
models.TaskInstance.dag_id == context["dag"].dag_id,
models.TaskInstance.execution_date == context["execution_date"],
models.TaskInstance.state == State.FAILED).count()
)
Then add the task to your DAG with a PythonOperator.
(I have not tested the above, but hopefully will send you on the right path)

Resources