Airflow tasks do not fail when an exception occurs while the on_success_callback function is executing, even if the error is caught and an AirflowException is raised inside the callback. Is this normal behaviour?
Is there any other way to make sure the task fails if an exception occurs during the callback function's execution?
I believe that by the time you reach the on_success_callback, the task has already succeeded. If you want the task to still fail because of an error in the on_success_callback, you might need to implement a try/except block in which you mark the task as failed (or clear it) manually. Something like this:
def on_success_callback(context):
    try:
        raise ValueError  # stand-in for the callback logic that may raise
    except Exception:
        dag = context['dag']
        tasks = dag.task_ids
        print(context['execution_date'])
        # Clearing the tasks for this execution date re-queues them
        dag.clear(
            task_ids=tasks,
            start_date=context['execution_date'],
            end_date=dag.end_date,
        )
You can derive the task instance from the context and then error it out.
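A minimal sketch of that idea (the ValueError is a stand-in for whatever may raise, and the availability of TaskInstance.set_state depends on your Airflow version):

from airflow.utils.state import State

def on_success_callback(context):
    try:
        raise ValueError("post-success work failed")  # stand-in for real work
    except Exception:
        ti = context['task_instance']
        # Persist a FAILED state for this task instance in the metadata DB.
        # (set_state may not exist, or may differ, in older Airflow versions.)
        ti.set_state(State.FAILED)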
on_success_callback is executed after the task has finished with Success.
Raising exceptions in on_success_callback will not result in changing the Task status.
If the code you execute in the on_success_callback is supposed to fail the task when an exception occurs, then that code should be in the task code.
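In other words, move that work into the task itself so an exception there fails the task through the normal path. A minimal sketch (the task id and callable body are placeholders):

from airflow.operators.python import PythonOperator

def do_work_and_post_steps():
    print("main work")                   # stand-in for the real task logic
    raise ValueError("post-step broke")  # an exception here fails the task normally

task = PythonOperator(
    task_id="work_with_post_steps",      # placeholder task id
    python_callable=do_work_and_post_steps,
    dag=dag,
)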
I followed the docs and created a Slack notification function:
It does work and I get notifications in the channel, but the task name and the link to the log refer to another task, not to the one that actually failed.
It gets the context of the upstream failed task, but not of the failed task itself:
I tried with different operators and hooks, but get the same result.
If anyone could help, I would really appreciate it.
Thank you!
The goal of the on_failure_callback argument at the DAG level is to run this callback once when the DagRun fails, so we provide the context of the DagRun, which is identical between the task instances; because of that, we don't care which task instance's context we provide (I think we provide the context of the last task defined in the DAG, regardless of its state).
If you want to run the callback on each failed ti, you can remove the on_failure_callback argument from the dag and add it to the default args: default_args=dict(on_failure_callback=task_fail_slack_alert).
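For example, a minimal sketch of wiring the callback through default_args (the DAG id, schedule, and the body of the Slack function are placeholders for your own; Airflow 2 import paths assumed):

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

def task_fail_slack_alert(context):
    # stand-in for your Slack notification function
    print(f"Task {context['task_instance'].task_id} failed")

default_args = dict(on_failure_callback=task_fail_slack_alert)

with DAG(
    dag_id="example_per_task_failure_alerts",  # placeholder dag id
    start_date=datetime(2023, 1, 1),
    schedule_interval=None,
    default_args=default_args,  # callback now fires once per failed task instance
) as dag:
    BashOperator(task_id="will_fail", bash_command="exit 1")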
I would like to call a method when an Airflow sensor times out. Is there a hook provided by the Airflow library for such actions? I have looked into the source code and it raises AirflowSensorTimeout once the timeout happens.
Is there a way to catch the above exception, or some sort of hook provided by the sensor, to run some action after the timeout?
While there is no easy option to do this for a specific kind of exception (in your case, the timeout exception), you can achieve it by adding another task that does X and passing the trigger_rule='all_failed' param to it, so that task will only run if the parent failed.
For example:
# Airflow 2 import paths; adjust for older versions
from airflow.sensors.filesystem import FileSensor
from airflow.operators.bash import BashOperator

sensor = FileSensor(
    task_id='sensor',
    filepath='/usr/local/airflow/file',
    timeout=10,
    poke_interval=10,
    dag=dag,
)

when_failed = BashOperator(
    task_id='creator',
    bash_command='touch /usr/local/airflow/file',
    trigger_rule='all_failed',  # runs only when the upstream task failed
    dag=dag,
)

sensor >> when_failed
You can also take this approach if you want to run only on the timeout exception, by using a ShortCircuitOperator with this trigger_rule that checks whether the exception is a timeout, but it will be more complicated.
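Another option for the timeout-specific case, not part of the approach above and assuming your Airflow version passes the raised exception in the failure callback context, is an on_failure_callback on the sensor that checks the exception type:

from airflow.exceptions import AirflowSensorTimeout

def on_sensor_failure(context):
    # The failure context carries the raised exception in recent Airflow versions.
    if isinstance(context.get('exception'), AirflowSensorTimeout):
        print("Sensor timed out")  # stand-in for the action you want on timeout

# Attach it to the sensor: FileSensor(..., on_failure_callback=on_sensor_failure)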
I have a DAG defined that contains a number of tasks, the last of which is only run if any of the previous tasks fail. This task simply posts to a Slack channel that the DAG run experienced errors.
What I would really like is if the message sent to the Slack channel contained the actual error that is logged in the task logs, to provide immediate context to the error and perhaps save Ops from having to dig through the logs.
Is this at all possible?
I have an Airflow DAG running multiple BashOperator tasks. These tasks in turn call an API which submits jobs on EMR clusters. The API returns a different exit code depending on whether the EMR job succeeded (0), failed (1), or timed out (124). Only the timed-out tasks should be retried by Airflow.
Is it possible to have a callback function on error which will capture the exit code and re-trigger the task? If not, can this be handled by some existing functionality that I have missed?
== Edit 1 ==
Did some digging in airflow source and found this:
if sp.returncode:
    raise AirflowException("Bash command failed")
This indicates that unless the exit code is 0, Airflow will mark the task as failed; for any other exit code, Airflow will retry the task based on the configured retry value.
Also, if I add the retry logic to an on-error callback based on the error code, I will not have control over how many times the task is retried.
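One pattern that keeps the retry count under Airflow's normal retries setting is to map the exit codes to different exceptions inside the task itself, so that only the timeout code goes through retries. A sketch, assuming the submission can be wrapped in a Python callable (the submit_emr_job.sh script name is hypothetical):

import subprocess

from airflow.exceptions import AirflowException, AirflowFailException
from airflow.operators.python import PythonOperator

def submit_emr_job():
    # Hypothetical wrapper script around the API that submits the EMR job
    result = subprocess.run(["./submit_emr_job.sh"], check=False)
    if result.returncode == 0:
        return
    if result.returncode == 124:
        # A plain AirflowException marks the attempt as failed and lets the
        # task's `retries` setting kick in.
        raise AirflowException("EMR job timed out")
    # AirflowFailException fails the task immediately, without retries.
    raise AirflowFailException(f"EMR job failed with exit code {result.returncode}")

submit = PythonOperator(
    task_id="submit_emr_job",
    python_callable=submit_emr_job,
    retries=3,  # only the timeout path consumes these retries
    dag=dag,
)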
Any idea if there is a hook to run a task when a DAG is stopped from the Airflow UI?
Something different than the callbacks offered on a task? There are the on-success, on-failure, and on-retry callbacks.