I am getting a DagRunAlreadyExists exception even after providing a custom run id and execution date. This occurs when there are multiple trigger requests within the same second.
Here is the MWAA CLI call:
import json
import random
import string
from datetime import datetime

import shortuuid

DT_FMT_HMSf = "%H%M%S%f"  # assumed time-format constant (not shown in the original snippet)


def get_unique_key():
    timestamp = datetime.now().strftime(DT_FMT_HMSf)
    random_str = timestamp + ''.join(random.choice(string.digits + string.ascii_letters) for _ in range(8))
    uuid_str = shortuuid.ShortUUID().random(length=12)
    return '{}{}'.format(uuid_str, random_str)


execution_date = datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%S.%f")  # ISO-8601 timestamp with microseconds
dag_run_id = get_unique_key()
workflow_id = "my_workflow"
conf = json.dumps({"foo": "bar"})

"dags trigger {0} -c '{1}' -r {2} -e {3}".format(workflow_id, conf, dag_run_id, execution_date)
Here is the error log from the MWAA CLI, in case it helps to debug the issue:
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.7/site-packages/airflow/__main__.py", line 48, in main
args.func(args)
File "/usr/local/lib/python3.7/site-packages/airflow/cli/cli_parser.py", line 48, in command
return func(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/airflow/utils/cli.py", line 92, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/airflow/cli/commands/dag_command.py", line 138, in dag_trigger
dag_id=args.dag_id, run_id=args.run_id, conf=args.conf, execution_date=args.exec_date
File "/usr/local/lib/python3.7/site-packages/airflow/api/client/local_client.py", line 30, in trigger_dag
dag_id=dag_id, run_id=run_id, conf=conf, execution_date=execution_date
File "/usr/local/lib/python3.7/site-packages/airflow/api/common/experimental/trigger_dag.py", line 125, in trigger_dag
replace_microseconds=replace_microseconds,
File "/usr/local/lib/python3.7/site-packages/airflow/api/common/experimental/trigger_dag.py", line 75, in _trigger_dag
f"A Dag Run already exists for dag id {dag_id} at {execution_date} with run id {run_id}"
airflow.exceptions.DagRunAlreadyExists: A Dag Run already exists for dag id my_workflow at 2022-10-18T06:10:28+00:00 with run id CL4Adauihkvz121928332658Gp6bsTWU
The problem is that the execution_date resolution is seconds; Airflow is ignoring the milliseconds. You can see in the error that no milliseconds are mentioned in the execution_date (2022-10-18T06:10:28).
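One possible workaround is to make sure no two trigger requests share the same whole second. Below is a minimal sketch, assuming all triggers come from a single process; next_execution_date is an illustrative helper, not part of the original code:

# Sketch: serialize execution dates so that no two triggers share the same
# whole second, which is the resolution Airflow keeps on this code path.
import threading
from datetime import datetime, timedelta, timezone

_lock = threading.Lock()
_last_second = None


def next_execution_date():
    """Return an ISO-8601 timestamp that is unique at second resolution."""
    global _last_second
    with _lock:
        now = datetime.now(timezone.utc).replace(microsecond=0)
        if _last_second is not None and now <= _last_second:
            now = _last_second + timedelta(seconds=1)
        _last_second = now
    return now.strftime("%Y-%m-%dT%H:%M:%S")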
Related
I'm running Airflow with Docker Swarm on 5 servers. After two months of use, some DAGs started throwing errors like the ones below. The errors occur in DAGs that use a custom Hive operator (similar to the inner function shown in the traceback), and no errors occurred during those first two months (nothing changed in the DAGs).
Also, when I retry a DAG, it sometimes succeeds and sometimes fails.
The really weird thing about this issue is that the Hive job itself did not fail: after the task was marked as failed in the Airflow webserver (SIGTERM), the query still completed 1~10 minutes later.
As a result, the flow is like this:
Task start -> 5~10 mins -> error (SIGTERM, Airflow) -> 1~10 mins -> Hive job success (Hadoop log)
[2023-01-09 08:06:07,583] {local_task_job.py:208} WARNING - State of this instance has been externally set to up_for_retry. Terminating instance.
[2023-01-09 08:06:07,588] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 135213
[2023-01-09 08:06:07,588] {taskinstance.py:1236} ERROR - Received SIGTERM. Terminating subprocesses.
[2023-01-09 08:13:42,510] {taskinstance.py:1463} ERROR - Task failed with exception
Traceback (most recent call last):
File "/opt/airflow/dags/common/operator/hive_q_operator.py", line 81, in execute
cur.execute(statement) # hive query custom operator
File "/home/airflow/.local/lib/python3.8/site-packages/pyhive/hive.py", line 454, in execute
response = self._connection.client.ExecuteStatement(req)
File "/home/airflow/.local/lib/python3.8/site-packages/TCLIService/TCLIService.py", line 280, in ExecuteStatement
return self.recv_ExecuteStatement()
File "/home/airflow/.local/lib/python3.8/site-packages/TCLIService/TCLIService.py", line 292, in recv_ExecuteStatement
(fname, mtype, rseqid) = iprot.readMessageBegin()
File "/home/airflow/.local/lib/python3.8/site-packages/thrift/protocol/TBinaryProtocol.py", line 134, in readMessageBegin
sz = self.readI32()
File "/home/airflow/.local/lib/python3.8/site-packages/thrift/protocol/TBinaryProtocol.py", line 217, in readI32
buff = self.trans.readAll(4)
File "/home/airflow/.local/lib/python3.8/site-packages/thrift/transport/TTransport.py", line 62, in readAll
chunk = self.read(sz - have)
File "/home/airflow/.local/lib/python3.8/site-packages/thrift_sasl/__init__.py", line 173, in read
self._read_frame()
File "/home/airflow/.local/lib/python3.8/site-packages/thrift_sasl/__init__.py", line 177, in _read_frame
header = self._trans_read_all(4)
File "/home/airflow/.local/lib/python3.8/site-packages/thrift_sasl/__init__.py", line 210, in _trans_read_all
return read_all(sz)
File "/home/airflow/.local/lib/python3.8/site-packages/thrift/transport/TTransport.py", line 62, in readAll
chunk = self.read(sz - have)
File "/home/airflow/.local/lib/python3.8/site-packages/thrift/transport/TSocket.py", line 150, in read
buff = self.handle.recv(sz)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1238, in signal_handler
raise AirflowException("Task received SIGTERM signal")
airflow.exceptions.AirflowException: Task received SIGTERM signal
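For context, the execute() of the custom Hive operator referenced in the traceback (hive_q_operator.py) looks roughly like this. This is a minimal sketch reconstructed from the stack trace; the class name, connection handling, and defaults are assumptions, not the real code:

from airflow.models import BaseOperator
from pyhive import hive


class HiveQOperator(BaseOperator):
    # Sketch of the custom operator; host/port handling is assumed.
    def __init__(self, statement, hive_host='hive-server', hive_port=10000, **kwargs):
        super().__init__(**kwargs)
        self.statement = statement
        self.hive_host = hive_host
        self.hive_port = hive_port

    def execute(self, context):
        conn = hive.connect(host=self.hive_host, port=self.hive_port)
        cur = conn.cursor()
        cur.execute(self.statement)  # the call that is interrupted by SIGTERM in the traceback
        return cur.fetchall()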
I already restarted the Airflow servers and nothing changed. Here is the failed task's log from Flower. Is there any helpful guide for me?
Thanks :)
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.8/site-packages/celery/app/trace.py", line 412, in trace_task
R = retval = fun(*args, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/celery/app/trace.py", line 704, in __protected_call__
return self.run(*args, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/celery_executor.py", line 88, in execute_command
_execute_in_fork(command_to_exec)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/celery_executor.py", line 99, in _execute_in_fork
raise AirflowException('Celery command failed on host: ' + get_hostname())
airflow.exceptions.AirflowException: Celery command failed on host: 8be4caa25d17
I have installed Apache Airflow on localhost using Ubuntu. The executor can't be loaded; this is the traceback:
[2022-12-20 22:11:13,927] {manager.py:343} WARNING - Ending without manager process.
[2022-12-20 22:11:13,928] {scheduler_job.py:788} INFO - Exited execute loop
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.8/dist-packages/airflow/__main__.py", line 39, in main
args.func(args)
File "/usr/local/lib/python3.8/dist-packages/airflow/cli/cli_parser.py", line 52, in command
return func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/airflow/utils/cli.py", line 108, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/airflow/cli/commands/scheduler_command.py", line 73, in scheduler
_run_scheduler_job(args=args)
File "/usr/local/lib/python3.8/dist-packages/airflow/cli/commands/scheduler_command.py", line 43, in _run_scheduler_job
job.run()
File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/base_job.py", line 247, in run
self._execute()
File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/scheduler_job.py", line 738, in _execute
self.executor.job_id = self.id
File "/usr/lib/python3.8/functools.py", line 967, in __get__
val = self.func(instance)
File "/usr/local/lib/python3.8/dist-packages/airflow/jobs/base_job.py", line 119, in executor
return ExecutorLoader.get_default_executor()
File "/usr/local/lib/python3.8/dist-packages/airflow/executors/executor_loader.py", line 77, in get_default_executor
cls._default_executor = cls.load_executor(executor_name)
File "/usr/local/lib/python3.8/dist-packages/airflow/executors/executor_loader.py", line 103, in load_executor
raise AirflowConfigException(
airflow.exceptions.AirflowConfigException: The module/attribute could not be loaded. Please check "executor" key in "core" section. Current value: "CeleryExecutor".
This is the relevant setting in airflow.cfg:
# The executor class that airflow should use. Choices include
# ``SequentialExecutor``, ``LocalExecutor``, ``CeleryExecutor``, ``DaskExecutor``,
# ``KubernetesExecutor``, ``CeleryKubernetesExecutor`` or the
# full import path to the class when using a custom executor.
executor = CeleryExecutor
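To confirm this is an import problem rather than a typo in airflow.cfg, the same loader that fails in the traceback can be run by hand (a quick sketch):

# This should raise the same AirflowConfigException seen above if the
# executor class cannot be imported (for example, when the celery extra
# "apache-airflow[celery]" is not installed in this environment).
from airflow.executors.executor_loader import ExecutorLoader

print(ExecutorLoader.load_executor("CeleryExecutor"))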
I am new to Airflow. Using the TaskFlow API, I am trying to dynamically change the flow of a DAG: if a condition is met, the two-step workflow should be executed a second time.
After defining the two functions/tasks, if I fix the DAG sequence as below, everything works fine:
from airflow.decorators import dag, task
from airflow.utils.dates import days_ago

# default_args is defined elsewhere in the file
@dag(default_args=default_args, schedule_interval=None, start_date=days_ago(2))
def genesis(**kwargs):
    @task()
    def extract():
        print("X")
        return "X"

    @task()
    def add_timeframe(data):
        print("Y")

    extracted_data = extract()
    timeframe_data = add_timeframe(extracted_data)

genesis_dag = genesis()
However, when I write any conditional logic to trigger the second run (either inside the DAG or after the function/task definitions), I get the error below. The error seems to be about setting upstream tasks, but the older task.set_upstream(task2) style doesn't work with the TaskFlow API in Airflow 2.0.
All the examples of conditional branching I could find were based on the non-TaskFlow API, roughly like the sketch below; I'd like to do the same with TaskFlow. Please help.
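For reference, the non-TaskFlow pattern those examples use looks roughly like this (a minimal sketch; choose_branch, some_condition, the task ids, and the dag object are placeholders):

from airflow.operators.dummy import DummyOperator
from airflow.operators.python import BranchPythonOperator

def choose_branch(**kwargs):
    # some_condition is a placeholder for whatever check decides the rerun
    return "run_again" if some_condition else "stop"

branch = BranchPythonOperator(task_id="branch", python_callable=choose_branch, dag=dag)
run_again = DummyOperator(task_id="run_again", dag=dag)
stop = DummyOperator(task_id="stop", dag=dag)
branch >> [run_again, stop]

Here is my failing TaskFlow attempt: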
@dag(default_args=default_args, schedule_interval=None, start_date=days_ago(2))
def genesis(**kwargs):
    @task()
    def extract():
        print("X")
        if <condition>:
            # calling the tasks again from inside a running task raises the error below
            extracted_data2 = extract()
            timeframe_data2 = add_timeframe(extracted_data2)

    @task()
    def add_timeframe(data):
        print("Y")

    extracted_data = extract()
    timeframe_data = add_timeframe(extracted_data)
ERROR - Tried to create relationships between tasks that don't have DAGs yet. Set the DAG for at least one task and try again: [<Task(_PythonDecoratedOperator): add_timeframe>, <Task(_PythonDecoratedOperator): extract>]
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1086, in _run_raw_task
self._prepare_and_execute_task_with_callbacks(context, task)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1260, in _prepare_and_execute_task_with_callbacks
result = self._execute_task(context, task_copy)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1300, in _execute_task
result = task_copy.execute(context=context)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/airflow/operators/python.py", line 233, in execute
return_value = self.python_callable(*self.op_args, **self.op_kwargs)
File "/opt/airflow/dags/genesis.py", line 77, in extract
timeframe_data = add_timeframe(extracted_data)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/airflow/operators/python.py", line 294, in factory
**kwargs,
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/airflow/models/baseoperator.py", line 91, in __call__
obj.set_xcomargs_dependencies()
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/airflow/models/baseoperator.py", line 722, in set_xcomargs_dependencies
apply_set_upstream(arg)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/airflow/models/baseoperator.py", line 711, in apply_set_upstream
apply_set_upstream(elem)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/airflow/models/baseoperator.py", line 708, in apply_set_upstream
self.set_upstream(arg.operator)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/airflow/models/baseoperator.py", line 1239, in set_upstream
self._set_relatives(task_or_task_list, upstream=True)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/airflow/models/baseoperator.py", line 1205, in _set_relatives
"task and try again: {}".format([self] + task_list)
While migrating Airflow from v1.10.2 to v1.10.10, one of our DAGs has a task of dagrun_operator type (TriggerDagRunOperator).
A code snippet of the task looks like the one below. Please assume that the DAG dag_process_pos exists.
The DAG being triggered by the TriggerDagRunOperator is dag_process_pos, which starts with a dummy_operator task (just a hint, in case that could be the troublemaker).
import os
from datetime import datetime

from airflow.operators.dagrun_operator import TriggerDagRunOperator


def set_up_dag_run_preprocessing(context, dag_run_obj):
    ti = context['ti']
    dag_name = context['ti'].task.trigger_dag_id
    dag_run = context['dag_run']
    trans_id = dag_run.conf['transaction_id']
    routing_info = ti.xcom_pull(task_ids="json_validation", key="route_info")
    new_file_path = routing_info['file_location']
    new_file_name = os.path.basename(routing_info['new_file_name'])
    file_path = os.path.join(new_file_path, new_file_name)
    batch_id = "123-AD-FF"
    dag_run_obj.payload = {'inputfilepath': file_path,
                           'transaction_id': trans_id,
                           'Id': batch_id}


task_trigger_dag_positional = TriggerDagRunOperator(
    trigger_dag_id="dag_process_pos",
    python_callable=set_up_dag_run_preprocessing,
    task_id="trigger_preprocess_dag",
    on_failure_callback=log_failure,
    execution_date=datetime.now(),
    provide_context=False,
    owner='airflow')
The DAG runs fine up to this point. In fact, the Python callable of this task runs all the way to its last line, and then the operator errors out:
[2020-06-09 11:36:22,838] {taskinstance.py:1145} ERROR - No row was found for one()
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 983, in _run_raw_task
result = task_copy.execute(context=context)
File "/usr/local/lib/python3.6/site-packages/airflow/operators/dagrun_operator.py", line 95, in execute
replace_microseconds=False)
File "/usr/local/lib/python3.6/site-packages/airflow/api/common/experimental/trigger_dag.py", line 141, in trigger_dag
replace_microseconds=replace_microseconds,
File "/usr/local/lib/python3.6/site-packages/airflow/api/common/experimental/trigger_dag.py", line 98, in _trigger_dag
external_trigger=True,
File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/airflow/models/dag.py", line 1471, in create_dagrun
run.refresh_from_db()
File "/usr/local/lib/python3.6/site-packages/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/airflow/models/dagrun.py", line 109, in refresh_from_db
DR.run_id == self.run_id
File "/usr/local/lib64/python3.6/site-packages/sqlalchemy/orm/query.py", line 3446, in one
raise orm_exc.NoResultFound("No row was found for one()")
sqlalchemy.orm.exc.NoResultFound: No row was found for one()
After that, the on_failure_callback of the task is executed, and all of the callback's code runs perfectly, as expected. The question is: why did the dagrun_operator fail after the Python callable had completed?
I'm trying to create an HttpSensor in Airflow using the following code:
import socket

from airflow.operators.sensors import HttpSensor

wait_to_launch = HttpSensor(
    task_id="wait-to-launch",
    endpoint='http://' + socket.gethostname() + ":8500/v1/kv/launch-cluster?raw",
    response_check=lambda response: response.content == 'oui',
    dag=dag
)
But I keep getting this error:
Traceback (most recent call last):
File "http_sensor_test.py", line 30, in <module>
dag=dag
File "/home/me/.local/lib/python2.7/site-packages/airflow/utils/decorators.py", line 86, in wrapper
result = func(*args, **kwargs)
File "/home/me/.local/lib/python2.7/site-packages/airflow/operators/sensors.py", line 663, in __init__
self.hook = hooks.http_hook.HttpHook(method='GET', http_conn_id=http_conn_id)
File "/home/me/.local/lib/python2.7/site-packages/airflow/utils/helpers.py", line 436, in __getattr__
raise AttributeError
AttributeError
What am I missing?
You are running into a known issue; see AIRFLOW-1030. A fix has been merged (#2180), but unfortunately it is not yet in a released version of Airflow. The fix is marked for the next release (1.9.0), but it could be weeks or months until that is out. You can run a fork of Airflow with this change, or add the updated version of the HttpSensor as a custom operator (plugin).
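If you take the plugin route, the shape is roughly the following. This is a minimal sketch: the class and plugin names are made up, and the poke logic is a simplified stand-in for the patched sensor, not the exact code from #2180:

from airflow.hooks.http_hook import HttpHook
from airflow.operators.sensors import BaseSensorOperator
from airflow.plugins_manager import AirflowPlugin
from airflow.utils.decorators import apply_defaults


class PatchedHttpSensor(BaseSensorOperator):
    # Simplified stand-in for the fixed HttpSensor: it builds the HttpHook
    # directly instead of going through the broken hooks.http_hook lookup.
    @apply_defaults
    def __init__(self, endpoint, http_conn_id='http_default',
                 response_check=None, *args, **kwargs):
        super(PatchedHttpSensor, self).__init__(*args, **kwargs)
        self.endpoint = endpoint
        self.http_conn_id = http_conn_id
        self.response_check = response_check

    def poke(self, context):
        hook = HttpHook(method='GET', http_conn_id=self.http_conn_id)
        response = hook.run(self.endpoint)
        if self.response_check:
            return self.response_check(response)
        return True


class PatchedHttpSensorPlugin(AirflowPlugin):
    name = "patched_http_sensor"
    operators = [PatchedHttpSensor]

You can then use PatchedHttpSensor in place of HttpSensor in the DAG above.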