Airflow run task if some of direct upstream are not triggered - airflow

I have a DAG with 5 tasks: A, B, C, D, E. Each has a corresponding task triggered on its failure (A_f, B_f, C_f, D_f and E_f), and similarly five triggered on success. Finally there is task X, which writes the failure results to a database.
Let's say 2 of the first five tasks fail (A and D), so only A_f and D_f get triggered. What can I do to make task X run?
Will all_done work even if some of the upstream tasks were never triggered? I am not so sure about it.

Yes, all_done should work. The all_done trigger will fire as long as none of Task X's upstream tasks has a state of None, which for any given dag run shouldn't be possible, because task states are inferred from upstream states (i.e. a skipped state is propagated, and any children of a failed task are set to upstream_failed).
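A minimal sketch of that layout (Airflow 2.4+ imports assumed; EmptyOperator stands in for the real tasks, and the dag/task ids are placeholders):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator  # DummyOperator before 2.4
from airflow.utils.trigger_rule import TriggerRule

with DAG("failure_report", start_date=datetime(2023, 1, 1), schedule=None) as dag:
    # X runs once every upstream handler has reached a terminal state
    # (success, failed, upstream_failed, or skipped).
    x = EmptyOperator(task_id="X", trigger_rule=TriggerRule.ALL_DONE)

    for name in "ABCDE":
        task = EmptyOperator(task_id=name)
        on_fail = EmptyOperator(task_id=f"{name}_f",
                                trigger_rule=TriggerRule.ONE_FAILED)
        on_ok = EmptyOperator(task_id=f"{name}_s",
                              trigger_rule=TriggerRule.ALL_SUCCESS)
        task >> [on_fail, on_ok] >> x
```

With A and D failed, A_f and D_f run, A_s and D_s are skipped, and X still fires because all of its upstreams are done.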

Related

Airflow task improperly has an `upstream_failed` status after previous task succeeded after 1 retry

I have two tasks A and B. Task A failed once, but the retry succeeded and it is marked as a success (green). I would expect Task B to run normally since Task A's retry succeeded, but it is marked as upstream_failed and was not triggered. Is there a way to fix this behavior?
Task B has the ALL_SUCCESS trigger rule.
I am using Airflow 2.0.2 on AWS (MWAA).
I tried restarting the scheduler.
upstream_failed is set by the scheduler when a task's dependencies end up in a failed state; you can check the states from the Task Instances view.
In retry mode:
Task A will stay in the up_for_retry state until it exceeds its retries number.
If the trigger rule is all_success (the default), Task B will not trigger until Task A has finished, assuming everything runs correctly.
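As a sketch of that retry behaviour (Airflow 2.x imports assumed; the commands are placeholders):

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG("retry_example", start_date=datetime(2023, 1, 1), schedule=None) as dag:
    # A sits in up_for_retry between attempts; only after the last retry
    # fails is it marked failed (and B then becomes upstream_failed).
    task_a = BashOperator(task_id="A", bash_command="./flaky_step.sh",
                          retries=2, retry_delay=timedelta(minutes=5))
    # Default trigger_rule="all_success": B waits for A to finish successfully.
    task_b = BashOperator(task_id="B", bash_command="echo done")
    task_a >> task_b
```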
Could you add the DAG implementation?

Airflow: how to stop next dag run from starting after failure

I'm trying to see whether or not there is a straightforward way to not start the next dag run if the previous dag run has failures. I already set depends_on_past=True, wait_for_downstream=True, max_active_runs=1.
What I have is tasks 1, 2, 3 where they:
create resources
run job
tear down resources
Task 3 always runs with trigger_rule=all_done to make sure we always tear down resources. What I'm seeing is that if task 2 fails and task 3 then succeeds, the next dag run starts; with wait_for_downstream=False it runs task 1 (since the previous task 1 was a success), and with wait_for_downstream=True it doesn't start the dag, as I expect, which is perfect.
The problem is that if tasks 1 and 2 succeed but task 3 fails for some reason, my next dag run starts and task 1 runs immediately, because both task 1 and task 2 (checked due to wait_for_downstream) were successful in the previous run. This is the worst-case scenario: task 1 creates resources and then the job is never run, so the resources just sit there allocated.
What I ultimately want is for any failure to stop the dag from proceeding to the next dag run. If my previous dag run is marked as failed then the next one should not start at all. Is there any mechanism for doing this?
My current 2 best-effort ideas are:
Use a sub dag so that there's only 1 task in the parent dag, and therefore the next dag run will never start if the previous single-task dag run failed. This seems like it would work, but I've seen mixed reviews on the use of sub dag operators.
Do some sort of logic within the dag as a first task that manually queries the DB to see if the dag has previous failures, and fail the task if it does. This seems hacky and not ideal, but it could work as well.
Is there any out-of-the-box solution for this? It seems fairly standard to not want to continue on failure, and to not want step 1 of run 2 to start if not all steps of run 1 were successful or if run 1 itself was marked as failed.
The reason depends_on_past is not helping you is that it's a task parameter, not a dag parameter.
Essentially what you're asking for is for the dag to be disabled after a failure.
I can imagine valid use cases for this, and maybe we should add an AirflowDisableDagException that would trigger this.
The problem with this is you risk having your dag disabled and not noticing for days or weeks.
A better solution would be to build recovery or abort logic into your pipeline so that you don't need to disable the dag.
One way you can do this is to add a cleanup task at the start of your dag, which can check whether resources were left sitting there and tear them down if appropriate, and fail the dag run immediately if it hits an unrecoverable error. You can consider using an Airflow Variable or XCom to store the state of your resources.
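A sketch of that cleanup-first task, assuming a Variable key "resource_state" and a tear_down() helper, both of which are hypothetical placeholders for however your pipeline actually tracks and releases its resources:

```python
from airflow.exceptions import AirflowFailException
from airflow.models import Variable


def cleanup_leftovers():
    """First task of the dag: release anything the previous run left behind."""
    # "resource_state" and tear_down() are placeholders, not real Airflow APIs.
    state = Variable.get("resource_state", default_var="released")
    if state == "allocated":
        try:
            tear_down()  # your real teardown logic goes here
            Variable.set("resource_state", "released")
        except Exception:
            # Can't recover the leftovers: fail this run before task 1
            # allocates anything new on top of them.
            raise AirflowFailException("stale resources could not be torn down")
```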
The other option, notwithstanding the risks, is the disable dag approach: if your process fails to tear down resources appropriately, disable the dag. Something along these lines should work:
from airflow.models import BaseOperator, DagModel

class MyOp(BaseOperator):
    def disable_dag(self):
        # Pause this dag in the metadata DB so no new runs are scheduled
        orm_dag = DagModel(dag_id=self.dag_id)
        orm_dag.set_is_paused(is_paused=True)

    def execute(self, context):
        try:
            print('something')
        except TeardownFailedError:  # placeholder for your teardown error
            self.disable_dag()
The ExternalTaskSensor may work, with an execution_delta of datetime.timedelta(days=1). From the docs:
execution_delta (datetime.timedelta) – time difference with the previous execution to look at, the default is the same execution_date as the current task or DAG. For yesterday, use [positive!] datetime.timedelta(days=1). Either execution_delta or execution_date_fn can be passed to ExternalTaskSensor, but not both.
I've only used it to wait for upstream DAGs to finish, but it seems like it should work as self-referencing because the dag_id and task_id are arguments for the sensor. You'll want to test it first, of course.
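A self-referencing sketch along those lines (the dag and task ids are placeholders, and as the answer says, this idea is untested):

```python
import datetime

from airflow.sensors.external_task import ExternalTaskSensor

# Wait for this same dag's teardown task from the previous daily run.
# "my_dag" and "teardown" are placeholders for your own dag/task ids.
wait_for_previous_run = ExternalTaskSensor(
    task_id="wait_for_previous_run",
    external_dag_id="my_dag",                    # this dag's own dag_id
    external_task_id="teardown",                 # final task of the prior run
    execution_delta=datetime.timedelta(days=1),  # positive = one run back
)
```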

Airflow: Only allow one instance of task

Is there a way specify that a task can only run once concurrently? So in the tree above where DAG concurrency is 4, Airflow will start task 4 instead of a second instance of task 2?
This DAG is a little special because there is no ordering between the tasks. The tasks are independent but related in purpose, and are therefore kept in one DAG so as not to create an excessive number of single-task DAGs.
max_active_runs is 2 and dag_concurrency is 4. I would like it to start all 4 tasks, and only start a task in the next run once the same task in the previous run is done.
I may have misunderstood your question, but I believe you want all the tasks in a single dagrun to finish before the tasks begin in the next dagrun, so a DAG will only execute once the previous execution is complete.
If that is the case, you can make use of the max_active_runs parameter of the dag to limit how many running concurrent instances of a DAG there are allowed to be.
More information here (refer to the last dot point): https://airflow.apache.org/faq.html#why-isn-t-my-task-getting-scheduled
max_active_runs defines how many running concurrent instances of a DAG there are allowed to be.
The Airflow operator documentation describes the argument task_concurrency. Just set it to one.
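For example (in newer Airflow versions the parameter was renamed max_active_tis_per_dag, but task_concurrency works in the versions this question targets; the command is a placeholder):

```python
from airflow.operators.bash import BashOperator

# At most one instance of this task may run at a time, across all active
# dag runs; other runs' copies stay queued until the running one finishes.
task_2 = BashOperator(
    task_id="task_2",
    bash_command="./run_job.sh",  # placeholder command
    task_concurrency=1,           # max_active_tis_per_dag=1 in newer Airflow
)
```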
From the official docs for trigger rules:
depends_on_past (boolean) when set to True, keeps a task from getting triggered if the previous schedule for the task hasn’t succeeded.
So the future DAGs will wait for the previous ones to finish successfully before executing.
On airflow.cfg under [core] you will find:
dag_concurrency = 16
# The number of task instances allowed to run concurrently by the scheduler
You're free to change this to what you desire.

Airflow: what is SubDagOperator success based on?

In Airflow, what is a SubDagOperator's success based on? From the Airflow docs: marking success on a SubDagOperator does not affect the state of the tasks within. But do all tasks within a SubDagOperator have to succeed for it to record success after a run? Or is it entirely separate from the state of its nested tasks? Is there a way to change its success rules?
For instance, let's say in case 1, a SubDagOperator task instance fails without any of the nested tasks being queued (e.g. an SQLAlchemy error). In case 2, nested task1 fails, but task1.trigger_rule is set to ALL_DONE, which triggers task2, and task2 succeeds.
Would Airflow mark case 2 as a success or a failure of the SubDagOperator task instance?
If case 2 is a failure, is there a way to distinguish between a failure like case 1 and a failure like case 2?
The subdag task's success or failure depends on the inner dag's success or failure (when you zoom into it, there's a circle above the run). I believe the inner dag is successful if all of its final tasks are successful or skipped.

Apache Airflow ignore failed task

Is there a way to ignore a failed task and proceed to the next step after, let's say, 2 retries?
Example;
t1 = SomeOperator(...)
t2 = SomeOperator(...)
t2.set_upstream(t1)
# if t1 fails, retry 2 times and then proceed to t2
# else if t1 succeeds, proceed to t2 as usual
Take a look at Airflow's trigger rules.
By default, the trigger rule for every task is 'all_success', meaning the task will only get executed when all directly upstream tasks have succeeded.
What you would want here is the trigger rule 'all_done', meaning all directly upstream tasks are finished, no matter whether they failed or succeeded.
But be careful, as this also means that if a task that is not directly upstream fails, and the tasks following that task get marked as 'upstream_failed', the task with this trigger rule will still get executed.
So in your case, you would have to set retries=2 for t1 and trigger_rule='all_done' for t2.
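Putting that together (BashOperator and the commands stand in for the question's SomeOperator):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG("ignore_failure", start_date=datetime(2023, 1, 1), schedule=None) as dag:
    # t1 is attempted up to 3 times in total (1 try + 2 retries).
    t1 = BashOperator(task_id="t1", bash_command="./step_one.sh", retries=2)
    # all_done: t2 runs once t1 has finished, whether it failed or succeeded.
    t2 = BashOperator(task_id="t2", bash_command="./step_two.sh",
                      trigger_rule="all_done")
    t2.set_upstream(t1)
```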
