Multiple email IDs on Airflow email alert - airflow

Description
I would like my DAG to send an email on failure to multiple email addresses.
My attempt:
import datetime

default_args = {
    'owner': 'my-project-owner',
    'depends_on_past': False,
    'email': ['email1@org.com', 'email2@org.com'],
    'email_on_failure': True,
    'email_on_retry': True,
    'retries': 2,
    #'retry_exponential_backoff': True,
    'retry_delay': datetime.timedelta(minutes=5),
    'start_date': datetime.datetime(2020, 4, 1)
}
I did not receive an email with the arguments above.
What is the correct way to specify multiple email addresses if these arguments do not work?

Refer to this.
It discusses various strategies for sending emails to multiple recipients.

Related

Airflow not calling sla_miss_callback function

I'm trying to add SLA slack alerts for our airflow. Our slack_failure_notification works fine, but I've been unable to get airflow to call sla_miss_callback.
I've seen in multiple threads people saying to put the sla_miss_callback in the dag definition and NOT in the default_args. If I do this, the dag runs fine, and my test_task gets inserted into the sla miss table, but it says notification sent = False and never seems to trigger the sla_miss_callback function. Looking at the task instance details, it has sla = 0:01:00 and has an on_failure_callback attribute but no sla_miss_callback attribute (not sure if it is supposed to have that or not).
def slack_failure_notification(context):
    failed_alert = SlackWebhookOperator(
        task_id='slack_failure_notification',
        http_conn_id='datascience_alerts_slack',
        message="failed task")
    return failed_alert.execute(context=context)

def slack_sla_notification(dag, task_list, blocking_task_list, slas, blocking_tis):
    alert = SlackWebhookOperator(
        task_id='slack_sla_notification',
        http_conn_id='datascience_alerts_slack',
        message="testing SLA function")
    return alert.execute()
default_args = {
    'owner': 'airflow',
    'start_date': datetime.datetime(2019, 1, 1),
    'depends_on_past': False,
    'retries': 0,
    'on_failure_callback': slack_failure_notification
}
dag = DAG(
    'template_dag',
    default_args=default_args,
    catchup=False,
    schedule_interval="10 13 * * *",
    sla_miss_callback=slack_sla_notification
)
test_task = SnowflakeOperator(
    task_id='test_task',
    dag=dag,
    snowflake_conn_id='snowflake-static-datascience_airflow',
    sql=somelongrunningcodehere,
    sla=datetime.timedelta(minutes=1)
)
If I instead put sla_miss_callback in default_args, the task still gets put into the sla miss table, and it says notification sent = True, but the sla_miss_callback function still never triggers. I also see nothing in our log files.
default_args = {
    'owner': 'airflow',
    'start_date': datetime.datetime(2019, 1, 1),
    'depends_on_past': False,
    'retries': 0,
    'on_failure_callback': slack_failure_notification,
    'sla_miss_callback': slack_sla_notification
}
I have also tried defining the function using def slack_sla_notification(*args, **kwargs): with no change in behavior.
I know the airflow developers say the SLA stuff is a bit of a mess and will be reworked at some point, but I'd love to get this to work in the meantime if anyone has any ideas of things to try.
It looks like you forgot to include context in your execute method.
If you just want to see your message, I would suggest something like this:
def slack_sla_notification(dag, task_list, blocking_task_list, slas, blocking_tis):
    message = "testing SLA function"
    alert = SlackWebhookOperator(
        task_id='slack_sla_notification',
        http_conn_id='datascience_alerts_slack',
        message=message)
    # execute() expects a context argument; the SLA callback receives none,
    # so the message is passed here as a placeholder.
    return alert.execute(message)
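To check whether Airflow invokes the callback at all, independently of Slack, a callback that only logs can help. The sketch below is a debugging aid rather than a fix: `log_sla_miss` is a hypothetical name, and it assumes the five-argument signature used in the question. As far as I know the callback runs in the scheduler process, so its output would be expected in the scheduler logs rather than in the task logs.

```python
import logging

log = logging.getLogger(__name__)


def log_sla_miss(dag, task_list, blocking_task_list, slas, blocking_tis):
    """Log an SLA miss and return the message (hypothetical debugging helper)."""
    # Accept either a DAG object or a plain string so the function
    # can also be tried outside Airflow.
    dag_id = getattr(dag, "dag_id", dag)
    message = "SLA missed in DAG {}: tasks={}".format(dag_id, task_list)
    log.warning(message)
    return message
```

If this message shows up, the callback wiring works and the problem is more likely inside the Slack call.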

Notify after 2 consecutive task failures on Airflow

Is there a way to notify/email only on 2 consecutive task failures? We want a task to retry first if it fails, and to page only if the second try fails as well. We don't want the email to be sent on the first failure, which Airflow's email_on_failure would do.
You need to disable the email_on_retry option and enable email_on_failure in default_args.
DEFAULT_ARGS = {
    'owner': 'me',
    'depends_on_past': False,
    'email': ['example@example.com'],
    'email_on_failure': True,
    'retries': 2,
    'email_on_retry': False,
    'retry_delay': timedelta(seconds=5)
}
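With these settings no email is sent on a retry; you are notified only once the task has failed after exhausting its retries. Note that 'retries': 2 allows three attempts in total, so set 'retries' to 1 to be notified after the second consecutive failure.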
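The interaction between retries and the two email flags can be illustrated with a small simulation. This is a simplified, hypothetical model written for this thread (the function `email_events` is not part of Airflow): it assumes a task that fails on every attempt, where every attempt except the last ends in a retry and the last one ends in failure.

```python
def email_events(retries, email_on_retry, email_on_failure):
    """Return (attempt, kind) pairs for a task that fails on every attempt.

    Simplified model: attempts 1..retries end in a retry,
    the final attempt (retries + 1) ends in failure.
    """
    events = []
    total_attempts = retries + 1
    for attempt in range(1, total_attempts + 1):
        is_final = attempt == total_attempts
        if is_final and email_on_failure:
            events.append((attempt, "failure"))
        elif not is_final and email_on_retry:
            events.append((attempt, "retry"))
    return events


# One retry, no retry emails: a single email after the second failed attempt.
print(email_events(retries=1, email_on_retry=False, email_on_failure=True))
```

Under this model, the question's requirement (page only after two consecutive failures) corresponds to one retry with email_on_retry disabled.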

How to identify the user answer from a quiz using the python-telegram-bot library

I am trying to write Python code so that I can give a questionnaire to my physics students remotely. They would receive and reply to the quiz via
a Telegram bot. I'm using the python-telegram-bot module. I can already identify which users replied, but I can't yet get their answers.
I am starting with the example code pollbot.py from here: https://raw.githubusercontent.com/python-telegram-bot/python-telegram-bot/master/examples/pollbot.py
The specific methods are these:
def quiz(update: Update, context: CallbackContext) -> None:
    """Send a predefined poll"""
    questions = ["1", "2", "4", "20"]
    message = update.effective_message.reply_poll(
        "How many eggs do you need for a cake?", questions, type=Poll.QUIZ, correct_option_id=2
    )
    # Save some info about the poll the bot_data for later use in receive_quiz_answer
    payload = {
        message.poll.id: {"chat_id": update.effective_chat.id, "message_id": message.message_id}
    }
    context.bot_data.update(payload)

def receive_quiz_answer(update: Update, context: CallbackContext) -> None:
    """Close quiz after three participants took it"""
    # the bot can receive closed poll updates we don't care about
    if update.poll.is_closed:
        return
    if update.poll.total_voter_count == 3:
        try:
            quiz_data = context.bot_data[update.poll.id]
        # this means this poll answer update is from an old poll, we can't stop it then
        except KeyError:
            return
        context.bot.stop_poll(quiz_data["chat_id"], quiz_data["message_id"])
The Telegram app shows the user whether they answered correctly. I already know which user answered the quiz. I would like to know programmatically which of the quiz options the user chose.
I did print message in the quiz method and I found this:
{'message_id': 93, 'date': 1614250320, 'chat': {'id': xxxxxxxx, 'type': 'private', 'first_name': 'user', 'last_name': 'name'}, 'entities': [], 'caption_entities': [], 'photo': [], 'new_chat_members': [], 'new_chat_photo': [], 'delete_chat_photo': False, 'group_chat_created': False, 'supergroup_chat_created': False, 'channel_chat_created': False, 'poll': {'id': '5024004102909067266', 'question': 'How many eggs do you need for a cake?', 'options': [{'text': '1', 'voter_count': 0}, {'text': '2', 'voter_count': 0}, {'text': '4', 'voter_count': 0}, {'text': '20', 'voter_count': 0}], 'total_voter_count': 0, 'is_closed': False, 'is_anonymous': True, 'type': 'quiz', 'allows_multiple_answers': False, 'correct_option_id': 2, 'explanation_entities': [], 'close_date': None}, 'from': {'id': 144******6, 'first_name': 'my_own', 'is_bot': True, 'username': '_my_own_Bot'}}
It seems the user's answer is nowhere in here.
Also, voter_count is not changing: it stays at zero no matter how many
answers I click on in the chat group.
I added these lines in the method receive_quiz_answer:
quiz_data = context.bot_data[update.poll.id]
print(quiz_data)
and I only got this info:
{'chat_id': 15******53, 'message_id': 87}
which is not relevant for what I want. I believe the answer should be in
telegram.PollAnswer, which is not listed in the code above.
I added the following lines:
answer = update.poll_answer
print("answer is {}".format(answer))
in the receive_quiz_answer method, but I got the reply:
answer is None
Any help appreciated.
Thanks a lot.
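One possible direction, offered as an untested sketch: Telegram delivers per-user answers as separate poll_answer updates, and only for non-anonymous polls, whereas the dump above shows 'is_anonymous': True. The pollbot.py example handles such updates with a PollAnswerHandler, whose update.poll_answer carries the user and a list option_ids; in a handler registered for poll updates, update.poll_answer is None. The helper below is hypothetical and independent of the library: it only maps such indices back to the option texts.

```python
def selected_options(option_ids, options):
    """Map chosen option indices to their texts (hypothetical helper)."""
    return [options[i] for i in option_ids]


questions = ["1", "2", "4", "20"]
# A user who picked the third option would arrive with option_ids == [2].
print(selected_options([2], questions))
```

To use it, the list of options would need to be stored in bot_data next to chat_id and message_id when the quiz is sent.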

How to pass bearer token in the Airflow

I have a job with 3 tasks
1) Get a token using a POST request
2) Get token value and store in a variable
3) Make a GET request by using token from step 2 and pass bearer token
The issue is that step 3 is not working: I am getting an HTTP error. I was able to print the value of the token in step 2 and verified it in the code.
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': airflow.utils.dates.days_ago(2),
    'email': ['airflow@example.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}
token = "mytoken"  # defined with some value which will be updated later
get_token = SimpleHttpOperator(
    task_id='get_token',
    method='POST',
    headers={"Authorization": "Basic xxxxxxxxxxxxxxx=="},
    endpoint='/token?username=user&password=pass&grant_type=password',
    http_conn_id='test_http',
    trigger_rule="all_done",
    xcom_push=True,
    dag=dag
)
def pull_function(**context):
    value = context['task_instance'].xcom_pull(task_ids='get_token')
    print("printing token")
    print(value)
    wjdata = json.loads(value)
    print(wjdata['access_token'])
    token = wjdata['access_token']
    print(token)

run_this = PythonOperator(
    task_id='print_the_context',
    provide_context=True,
    python_callable=pull_function,
    dag=dag,
)
get_config = SimpleHttpOperator(
    task_id='get_config',
    method='GET',
    headers={"Authorization": "Bearer " + token},
    endpoint='someendpoint',
    http_conn_id='test_conn',
    trigger_rule="all_done",
    xcom_push=True,
    dag=dag
)
get_token >> run_this >> get_config
The way you are storing token as a "global" variable won't work. The DAG definition file (the script where you defined the tasks) is not the same runtime context as the one for executing each task. Every task can be run in a separate thread, process, or even on another machine, depending on the executor. The way you pass data between the tasks is not by global variables, but rather by using XCom, which you already partially do.
Try the following:
- remove the global token variable
- in pull_function, instead of print token, do return token; this pushes the value to XCom again, so the next task can access it
- access the token from XCom in your next task.
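The second step could be sketched as follows. `FakeTaskInstance` is a hypothetical stand-in so the function can be tried without Airflow; in a real DAG the task instance comes from the context, and the response is assumed to be JSON with an access_token key, as in the question.

```python
import json


def pull_function(**context):
    """Return the access token so that it becomes this task's XCom value."""
    raw = context["task_instance"].xcom_pull(task_ids="get_token")
    return json.loads(raw)["access_token"]


class FakeTaskInstance:
    """Hypothetical stand-in for Airflow's task instance, for local testing only."""

    def __init__(self, values):
        self._values = values

    def xcom_pull(self, task_ids):
        return self._values[task_ids]


ti = FakeTaskInstance({"get_token": '{"access_token": "abc123"}'})
print(pull_function(task_instance=ti))
```

Returning the value instead of assigning it to a module-level name is what makes it visible to the downstream task.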
The last step is a bit tricky since you are using the SimpleHttpOperator, and its only templated fields are endpoint and data, but not headers.
For example, if you wanted to pass in some data from the XCom of a previous task, you would do something like this:
get_config = SimpleHttpOperator(
    task_id='get_config',
    endpoint='someendpoint',
    http_conn_id='test_conn',
    dag=dag,
    data='{{ task_instance.xcom_pull(task_ids="print_the_context", key="some_key") }}'
)
But you can't do the same with the headers unfortunately, so you have to either do it "manually" via a PythonOperator, or you could inherit SimpleHttpOperator and create your own, something like:
class HeaderTemplatedHttpOperator(SimpleHttpOperator):
    template_fields = ('endpoint', 'data', 'headers')  # added 'headers'
then use that one, something like:
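get_config = HeaderTemplatedHttpOperator(
    task_id='get_config',
    method='GET',
    endpoint='someendpoint',
    http_conn_id='test_conn',
    dag=dag,
    # headers must be a dict; the template inside the value is rendered at run time
    headers={"Authorization": "Bearer {{ task_instance.xcom_pull(task_ids='print_the_context') }}"}
)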
Keep in mind I did no testing on this, it's just for the purpose of explaining the concept. Play around with the approach and you should get there.

Apache airflow sends sla miss emails only to first person on the list

I use Apache Airflow and I would like it to send email notifications on SLA miss. I store the email addresses as an Airflow variable, and I have a DAG in which one of the tasks sends email using EmailOperator.
Here is the issue: when my send-email task runs, it sends emails to all the recipients, but the SLA miss notification goes only to the first address on the list, which in my example is test1@test.com.
Is this a bug, or is there another reason it's not working?
Here's my DAG and Airflow variable:
from airflow import DAG
from datetime import datetime, timedelta
from airflow.operators.email_operator import EmailOperator
from airflow.models import Variable
from airflow.operators.slack_operator import SlackAPIPostOperator
email = Variable.get("test_recipients")
args = {
    'owner': 'airflow'
    , 'depends_on_past': False
    , 'start_date': datetime(2018, 8, 20, 0, 0)
    , 'retries': 0
    , 'email': email
    , 'email_on_failure': True
    , 'email_on_retry': True
    , 'sla': timedelta(seconds=1)
}
dag = DAG('sla-email-test'
    , default_args=args
    , max_active_runs=1
    , schedule_interval="@daily")
....
t2 = EmailOperator(
    dag=dag,
    task_id="send-email",
    to=email,
    subject="Testing",
    html_content="<h3>Welcome to Airflow</h3>"
)
Yes, there is currently a bug in Airflow when it comes to sending the SLA emails: that code path doesn't correctly split a string on commas the way task failure emails do.
The short workaround right now is to make your variable a list (i.e. with a value of ["test1@test.com","test2@test.com"]) and access it like:
email = Variable.get("test_recipients", deserialize_json=True)
That should work in both cases (SLA, and task emails)
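The difference between the two ways of reading the variable can be shown without Airflow. This sketch assumes the variable's stored text is the JSON array above and uses json.loads to stand in for what deserialize_json=True is expected to do; the variable names are illustrative only.

```python
import json

# Assumed raw text stored in the Airflow variable.
stored_value = '["test1@test.com", "test2@test.com"]'

as_string = stored_value            # a plain read keeps the text as one string
as_list = json.loads(stored_value)  # JSON deserialization yields a Python list

print(type(as_string).__name__, type(as_list).__name__, len(as_list))
```

Only the second form hands Airflow one element per recipient.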
I don't see this as a bug, I would say it is an expected behavior. It is because you are passing a string "email1@example.com, email2@example.com" to email argument. You can do the following if you want to use this string as a Python list:
# Value stored in the Airflow variable test_recipients:
# "email1@example.com, email2@example.com"
email = Variable.get("test_recipients")
args = {
    'owner': 'airflow'
    , 'depends_on_past': False
    , 'start_date': datetime(2018, 8, 20, 0, 0)
    , 'retries': 0
    , 'email': [e.strip() for e in email.split(',')]  # split the string into a Python list
    , 'email_on_failure': True
    , 'email_on_retry': True
    , 'sla': timedelta(seconds=10)
}
In general, if you want to send email to multiple email addresses, you should always use a list and not string.
Update: I have fixed this in https://github.com/apache/incubator-airflow/pull/3869 . This will be available in Airflow 1.10.1
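For versions without that fix, the recipients can be normalized before they are handed to Airflow. The helper below is a hypothetical sketch, not Airflow's own implementation: it accepts either a list or a comma- or semicolon-separated string and always returns a list.

```python
import re


def normalize_recipients(value):
    """Return a list of addresses from a list/tuple or a delimited string."""
    if isinstance(value, str):
        parts = re.split(r"[,;]", value)
    else:
        parts = list(value)
    # Drop surrounding whitespace and empty entries.
    return [p.strip() for p in parts if p.strip()]


print(normalize_recipients("email1@example.com, email2@example.com"))
```

The result could then be used for both the 'email' entry in the default arguments and the to argument of EmailOperator.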

Resources