Airflow DAG keeps failing - airflow

I am having an issue with an airflow DAG that keeps failing.
The error is shown below:
[2022-08-19 06:49:15,850] {taskinstance.py:1150} ERROR - task is not running but the task data does not show ended
Traceback (most recent call last):
File "path/taskinstance.py", line 984, in _run_raw_task
result = task_copy.execute(context=context)
File "path/asynchronous.py", line 30, in execute
self._execute(task_definition=self.task_definition, logger=self.logger)
File "path/asynchronous.py", line 159, in _execute
raise RuntimeError(f'task is not running but the task data does not show ended')
RuntimeError: task is not running but the task data does not show ended
I am a beginner in Airflow. The code was written by someone who has unfortunately left, and I am trying to troubleshoot it. Can anyone please tell me if they have any suggestions about what could be happening and how to fix it?
When I go to the Code tab in Airflow, the code looks like this:
import json
import logging
from a.component.s3.s3 import S3
from a.context.configuration import Configuration
from a.context.environment import AirflowEnvironment
from a.process.process import Process
airflow_env = AirflowEnvironment()
config = Configuration()
s3 = S3(config=config)
for dag_definition_path in [p for p in airflow_env.sequencing_run_dag_dir_path.glob('*.json') if p.is_file()]:
    with dag_definition_path.open() as inf:
        json_dict = json.load(fp=inf)
    process = Process.from_json_dict(json_dict=json_dict)
    logging.info(f'process found, {process.process_name}')
    logging.info(f'creating and registering dag for process, {process.process_name}')
    dag = process.create_dag(config=config, airflow_env=airflow_env)
    # register the DAG globally
    globals()[dag.dag_id] = dag
There is a manager routine running every 30 minutes that starts new DAGs when new files are added to a specific folder.
Thank you

Related

Airflow : AirflowSkipException doesn't work

I run a python script through Airflow.
The script gets a source file from S3. I want the script to mark the task as 'skipped' when there is no file in the bucket.
from airflow.exceptions import AirflowSkipException
if len(file_list) == 0:  # This means there's no file in the bucket. I omitted some code.
    print('The target file does not exist. Skipping the task...')
    raise AirflowSkipException
However, Airflow still marks this task 'failure' when the file doesn't exist.
Am I missing anything? Should I add something to DAG too?
I think you should call raise AirflowSkipException() with round parentheses at the end; otherwise you are not raising an instance of the AirflowSkipException class but the class itself, which I guess is what causes the task to be set to failed.
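For reference, here is a minimal sketch of how the skip could look inside a PythonOperator callable (Airflow 2 import paths assumed; list_s3_files is a hypothetical stand-in for however you build file_list):
from airflow.exceptions import AirflowSkipException
from airflow.operators.python import PythonOperator

def check_for_file(**context):
    file_list = list_s3_files()  # hypothetical helper that returns the S3 listing
    if len(file_list) == 0:
        print('The target file does not exist. Skipping the task...')
        # raising an instance of the exception marks this task instance as skipped
        raise AirflowSkipException('no file found in the bucket')
    return file_list

# inside your existing `with DAG(...)` block:
check_task = PythonOperator(task_id='check_for_file', python_callable=check_for_file)
Downstream tasks with the default trigger rule will then be skipped as well rather than failed.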

Airflow Scheduler fails to execute Windows EXE via WSL

My Windows 10 machine has Airflow 1.10.11 installed within WSL 2 (Ubuntu-20.04).
I have a BashOperator task which calls an .EXE on Windows (via /mnt/c/... or via symlink).
The task fails. Log shows:
[2020-12-16 18:34:11,833] {bash_operator.py:134} INFO - Temporary script location: /tmp/airflowtmp2gz6d79p/download.legacyFilesnihvszli
[2020-12-16 18:34:11,833] {bash_operator.py:146} INFO - Running command: /mnt/c/Windows/py.exe
[2020-12-16 18:34:11,836] {bash_operator.py:153} INFO - Output:
[2020-12-16 18:34:11,840] {bash_operator.py:159} INFO - Command exited with return code 1
[2020-12-16 18:34:11,843] {taskinstance.py:1150} ERROR - Bash command failed
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/airflow/models/taskinstance.py", line 984, in _run_raw_task
result = task_copy.execute(context=context)
File "/usr/local/lib/python3.8/dist-packages/airflow/operators/bash_operator.py", line 165, in execute
raise AirflowException("Bash command failed")
airflow.exceptions.AirflowException: Bash command failed
[2020-12-16 18:34:11,844] {taskinstance.py:1187} INFO - Marking task as FAILED. dag_id=test-dag, task_id=download.files, execution_date=20201216T043701, start_date=20201216T073411, end_date=20201216T073411
And that's it. Return code 1 with no further useful info.
Running the very same EXE via bash works perfectly, with no error (I also tried it on my own program which emits something to the console - in bash it emits just fine, but via airflow scheduler it's the same error 1).
Some more data and things I've done to rule out any other issue:
airflow scheduler runs as root. I also confirmed it's running in a root context by putting a whoami command in my BashOperator, which indeed emitted root. (I should also note that all native Linux programs run just fine; only the Windows programs don't.)
The Windows EXE I'm trying to execute and its directory have full 'Everyone' permissions (on my own program of course; I wouldn't dare do that on my Windows folder - that was just an example.)
The failure happens both when accessing via /mnt/c as well as via symlink. In the case of a symlink, the symlink has 777 permissions.
I tried running airflow test on a BashOperator task - it runs perfectly - emits output to the console and returns 0 (success).
Tried with various EXE files - both "native" (e.g. ones that come with Windows) as well as my C#-made programs. Same behavior in all.
Didn't find any similar issue documented in Airflow's GitHub repo nor here in Stack Overflow.
The question is: how is Airflow's Python use of a subprocess (which airflow scheduler uses to run BashOperator tasks) different from a "normal" bash session, causing it to exit with code 1?
You can use Python's subprocess and sys libraries together with PowerShell.
In the Airflow > dags folder, create 2 files: Main.py and Caller.py.
Main.py calls Caller.py, and Caller.py goes to the Windows machine to run the files or routines.
This is the process:
Code for Main.py:
# Importing the libraries we are going to use in this example
from airflow import DAG
from datetime import datetime, timedelta
from airflow.operators.bash_operator import BashOperator

# Defining some basic arguments
default_args = {
    'owner': 'your_name_here',
    'depends_on_past': False,
    'start_date': datetime(2019, 1, 1),
    'retries': 0,
}

# Naming the DAG and defining when it will run (you can also use a crontab expression if you want the DAG to run, for example, every day at 8 am)
with DAG(
    'Main',
    schedule_interval=timedelta(minutes=1),
    catchup=False,
    default_args=default_args
) as dag:

    # Defining the tasks that the DAG will perform, in this case the execution of two Python programs, calling them via bash commands
    t1 = BashOperator(
        task_id='caller',
        bash_command="""
        cd /home/[Your_Users_Name]/airflow/dags/
        python3 Caller.py
        """)

    # copy t1, paste, rename t1 to t2 and call file.py

    # Defining the execution pattern
    t1

    # comment: t1 executes and calls t2
    # t1 >> t2
Code for Caller.py:
import subprocess, sys

p = subprocess.Popen(["powershell.exe"
    , "cd C:\\Users\\[Your_Users_Name]\\Desktop; python file.py"]   # a .py file
    #, "cd C:\\Users\\[Your_Users_Name]\\Desktop; .\\file.html"]    # an .html file
    #, "cd C:\\Users\\[Your_Users_Name]\\Desktop; .\\file.bat"]     # a .bat file
    #, "cd C:\\Users\\[Your_Users_Name]\\Desktop; .\\file.exe"]     # a .exe file
    , stdout=sys.stdout
    )
p.communicate()
How do you know whether your code will work in Airflow? If it runs this way, it's OK.
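On a related note, since the task log only shows "return code 1" with nothing else, a hedged way to see what the Windows side is actually complaining about is to capture its output explicitly. This is only a sketch using the standard subprocess.run API (Python 3.7+); the EXE path below is a placeholder, not something from the question:
import subprocess

# Run the Windows executable from WSL and capture whatever it prints, so the
# Airflow task log can show stdout/stderr instead of just an exit code.
result = subprocess.run(
    ["/mnt/c/path/to/your_program.exe"],  # placeholder path; adjust to your EXE
    capture_output=True,
    text=True,
)
print("return code:", result.returncode)
print("stdout:", result.stdout)
print("stderr:", result.stderr)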

Dag Seems to be missing

I have a dag which checks for new workflows to be generated (Dynamic DAG) at a regular interval and if found, creates them. (Ref: Dynamic dags not getting added by scheduler )
The above DAG is working and the dynamic DAGs are getting created and listed in the web-server. Two issues here:
When clicking on the DAG in the web UI, it says "DAG seems to be missing"
The listed DAGs are not shown by the "airflow list_dags" command
Error:
DAG "app01_user" seems to be missing.
The same happens for all other dynamically generated DAGs. I have compiled the Python script and found no errors.
Edit1:
I tried clearing all data and running "airflow run". It ran successfully, but no dynamically generated DAGs were added to "airflow list_dags". However, when running the "airflow list_dags" command, it loaded and executed the DAG (which generates the dynamic DAGs), and the dynamic DAGs are listed as below:
[root@cmnode dags]# airflow list_dags
sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8\nLANG=en_US.UTF-8)
sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8\nLANG=en_US.UTF-8)
[2019-08-13 00:34:31,692] {settings.py:182} INFO - settings.configure_orm(): Using pool settings. pool_size=15, pool_recycle=1800, pid=25386
[2019-08-13 00:34:31,877] {__init__.py:51} INFO - Using executor LocalExecutor
[2019-08-13 00:34:32,113] {__init__.py:305} INFO - Filling up the DagBag from /root/airflow/dags
/usr/lib/python2.7/site-packages/airflow/operators/bash_operator.py:70: PendingDeprecationWarning: Invalid arguments were passed to BashOperator (task_id: tst_dyn_dag). Support for passing such arguments will be dropped in Airflow 2.0. Invalid arguments were:
*args: ()
**kwargs: {'provide_context': True}
super(BashOperator, self).__init__(*args, **kwargs)
-------------------------------------------------------------------
DAGS
-------------------------------------------------------------------
app01_user
app02_user
app03_user
app04_user
testDynDags
Upon running it again, all 4 of the DAGs generated above disappeared and only the base DAG, "testDynDags", is displayed.
When I was getting this error, there was an exception showing up in the webserver logs. Once I resolved that error and restarted the webserver, it went through normally.
From what I can see, this is the error that is thrown when the webserver tries to parse the DAG file and there is an error. In my case it was an error importing a new operator I had added to a plugin.
Usually I check in the Airflow UI, since sometimes the reason for the broken DAG appears there. If it is not there, I run the .py file of my DAG directly, and the error (the reason the DAG can't be parsed) will appear.
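In the same spirit, here is a small sketch that surfaces parse errors programmatically via Airflow's DagBag (a real API in both 1.x and 2.x; the dags folder path is taken from the log above and may differ on your setup):
from airflow.models import DagBag

# Parse the dags folder the same way the scheduler does and print any import
# errors that would make a DAG "seem to be missing".
dagbag = DagBag(dag_folder="/root/airflow/dags", include_examples=False)
for dag_file, error in dagbag.import_errors.items():
    print(dag_file, "->", error)
print("DAGs parsed:", sorted(dagbag.dags))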
I never got to work on dynamic DAG generation, but I did face this issue when the DAG was not present on all nodes (scheduler, worker and webserver). If you have an Airflow cluster, please make sure the DAG is present on all Airflow nodes.
Same error here; the reason was that I had renamed my dag_id to uppercase, something like "import_myclientname" to "import_MYCLIENTNAME".
I am a little late to the party, but I faced the error today:
In short: try executing airflow dags report and/or airflow dags reserialize
Check out my comment here:
https://stackoverflow.com/a/73880927/4437153
I found that airflow fails to recognize a dag defined in a file that does not have from airflow import DAG in it, even if DAG is not explicitly used in that file.
For example, suppose you have two files, a.py and b.py:
# a.py
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
def makedag(dag_id="a"):
    with DAG(dag_id=dag_id) as dag:
        DummyOperator(task_id="nada")
    return dag

dag = makedag()
and
# b.py
from a import makedag
dag = makedag(dag_id="b")
Then airflow will only look at a.py. It won't even look at b.py at all, even to notice if there's a syntax error in it! But if you add from airflow import DAG to it and don't change anything else, it will show up.
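So the minimal fix for b.py, based on that observation, is just the extra import and nothing else:
# b.py
from airflow import DAG  # not used directly, but its presence makes Airflow parse this file
from a import makedag

dag = makedag(dag_id="b")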

TypeError: expected string or Unicode object, NoneType found - Multiprocessing Pool not working in Zope/Plone external methods

I'm using
Zope - 2.13.19
Python - 2.6.8
The piece of code below works when run manually, but not when run as an External Method.
It throws the following error. Am I doing something conceptually wrong?
Exception in thread Thread-3:
Traceback (most recent call last):
File "/opt/python2.6/lib/python2.6/threading.py", line 532, in __bootstrap_inner
self.run()
File "/opt/python2.6/lib/python2.6/threading.py", line 484, in run
self.__target(*self.__args, **self.__kwargs)
File "/opt/python2.6/lib/python2.6/multiprocessing/pool.py", line 225, in _handle_tasks
put(task)
TypeError: expected string or Unicode object, NoneType found
import time
from multiprocessing import Pool
import logging
def func(name):
    print 'hello %s,' % name
    time.sleep(5)
    print 'nice to meet you.'

def get_data():
    pool = Pool(processes=2)
    pool.map(func, ('frank', 'justin', 'osi', 'thomas'))
Make sure all the things you're sending across process boundaries can be pickled.
As stated by Multimedia Mike:
It is possible to send objects across process boundaries to worker
processes as long as the objects can be pickled by Python's pickle
facility.
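A quick way to check that in isolation is to try pickling the callable and its arguments yourself before handing them to the pool; this is just a sketch using the standard pickle module and the names from the snippet above (func is assumed to be in scope):
import pickle

# Anything that raises here will also fail when multiprocessing tries to send
# it across the process boundary to a worker.
args = ('frank', 'justin', 'osi', 'thomas')
for obj in (func,) + args:  # the callable itself must be picklable too
    try:
        pickle.dumps(obj)
    except Exception as exc:
        print('not picklable: %r (%s)' % (obj, exc))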

Why can the imported PowerFactory module in Python only execute a single time?

The script below is able to run a piece of software called PowerFactory externally from Python, as follows:
#add powerfactory.pyd path to the python path
import sys
sys.path.append("C:\\Program Files\\DIgSILENT\\PowerFactory 2017 SP2\\Python\\3.6")
#import the powerfactory module
import powerfactory
#start powerfactory in unattended mode (engine mode)
app=powerfactory.GetApplication()
#get the user
user=app.GetCurrentUser()
#activate project
project=app.ActivateProject('Python Test') #activate the project "Python Test"
prj=app.GetActiveProject() #returns the activated project
#run python code below
ldf=app.GetFromStudyCase('ComLdf') #calling the load flow command object
ldf.Execute() #executing the load flow command
#get the list of lines contained in the project
Lines=app.GetCalcRelevantObjects('*.ElmLne') #returns all relevant objects, i.e. all lines
for line in Lines: #get each element out of the list
    name=line.loc_name #get the name of the line
    value=line.GetAttribute('c:loading') #return the value of the element
    #Print the results
    print('Loading of the line: %s = %.2f'%(name,value))
When the above code is executed in Spyder for the first time, it shows the proper results. However, if the script is re-executed, the following error appears:
Reloaded modules: powerfactory
Traceback (most recent call last):
File "<ipython-input-9-ae989570f05f>", line 1, in <module>
runfile('C:/Users/zd1n14/Desktop/Python Test/Call Digsilent in
Python.py', wdir='C:/Users/zd1n14/Desktop/Python Test')
File "C:\ProgramData\Anaconda3\lib\site-
packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
execfile(filename, namespace)
File "C:\ProgramData\Anaconda3\lib\site-
packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/zd1n14/Desktop/Python Test/Call Digsilent in Python.py",
line 12, in <module>
user=app.GetCurrentUser()
RuntimeError: 'powerfactory.Application' already deleted
As referred to in How can I exit powerfactory using Python in Unattended mode?, this may be because PowerFactory is still running. The only way I have found so far is to restart Spyder and execute the script again, which is very inefficient if I want to rewrite the code and debug it.
It would be much appreciated if anyone could give me some advice on this problem.
I ran into the same problem. Python is still connected to PowerFactory and gives the error if you try to connect again. What basically worked for me was to kill the instance at the end of the script with
del app
Another idea during debugging could be:
try:
    # do something in your script
finally:
    del app
That way the killing of the instance happens in any case.
The way to solve this is to reload the powerfactory module by adding:
if __name__ == "__main__":
before import powerfactory.
The reasoning behind this may be found in: What does if __name__ == "__main__": do?.
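Putting the two suggestions together, a minimal sketch of that layout (using only the calls and paths already shown in the question) might look like this:
import sys

if __name__ == "__main__":
    # add the powerfactory.pyd path and import under the guard
    sys.path.append("C:\\Program Files\\DIgSILENT\\PowerFactory 2017 SP2\\Python\\3.6")
    import powerfactory

    app = powerfactory.GetApplication()
    try:
        project = app.ActivateProject('Python Test')
        ldf = app.GetFromStudyCase('ComLdf')
        ldf.Execute()
    finally:
        del app  # release the PowerFactory instance so the script can be re-run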
