Define prev_ds in manual trigger - airflow

Currently my DAG utilizes the {{ prev_ds }} variable.
I would like to trigger a DAG run manually. However when I trigger manually from the UI with an execution date of '2021-12-14' the {{ prev_ds }} value gets set to '2021-12-14'. Is there a way through the UI or CLI to set that value appropriately?

When a DAG is triggered manually, prev_ds == next_ds == ds (see docs).
These are the proper values, as a manual run has no previous run and no next run. There is no workaround for this from the UI or the CLI.
You can add conditions to your Jinja templates to produce different values based on the run type (manual, scheduled), but that means code changes.
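As a rough illustration of such a run-type condition, here is a minimal sketch. It assumes a daily schedule, and the helper name effective_prev_ds is hypothetical (it is not part of Airflow). The strings "manual" and "scheduled" are assumed to match what dag_run.run_type reports in Airflow 2.x.

```python
from datetime import date, timedelta


def effective_prev_ds(run_type: str, ds: str, prev_ds: str) -> str:
    """Return a usable 'previous date' for both scheduled and manual runs.

    Hypothetical helper: assumes a daily schedule and YYYY-MM-DD date strings.
    """
    if run_type == "manual":
        # A manual run has prev_ds == ds, so derive the previous day ourselves.
        return (date.fromisoformat(ds) - timedelta(days=1)).isoformat()
    # Scheduled runs already carry a meaningful prev_ds.
    return prev_ds


print(effective_prev_ds("manual", "2021-12-14", "2021-12-14"))
```

One possible way to use a helper like this would be to register it through the DAG's user_defined_macros argument and call it from the template with dag_run.run_type, ds and prev_ds. This still requires a code change, as the answer notes.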

Related

Vertex AI Airflow Operators don't render XCom pulls (specifically CreateBatchPredictionJobOperator)

I am trying to run a batch predict job task using the Vertex AI Airflow Operator CreateBatchPredictionJobOperator. This requires pulling a model id from XCom which was pushed by a previous custom container training job. However, CreateBatchPredictionJobOperator doesn't seem to render Xcom pulls as expected.
I am running Airflow 2.3.0 on my local machine.
My code looks something like this:
batch_job_task = CreateBatchPredictionJobOperator(
    gcp_conn_id="gcp_connection",
    task_id="batch_job_task",
    job_display_name=JOB_DISPLAY_NAME,
    model_name="{{ ti.xcom_pull(key='model_conf')['model_id'] }}",
    predictions_format="bigquery",
    bigquery_source=BIGQUERY_SOURCE,
    region=REGION,
    project_id=PROJECT_ID,
    machine_type="n1-standard-2",
    bigquery_destination_prefix=BIGQUERY_DESTINATION_PREFIX,
)
This results in a value error when the task runs:
ValueError: Resource {{ ti.xcom_pull(key='model_conf')['model_id'] }} is not a valid resource id.
The expected behaviour would be to pull that variable by key and render it as a string.
I can also confirm that I am able to see the model id (and other info) in XCom by navigating there in the UI. I attempted using the same syntax with xcom_pull with a PythonOperator and it works.
def print_xcom_value(value):
    print("VALUE:", value)
print_xcom_value_by_key = PythonOperator(
    task_id="print_xcom_value_by_key",
    python_callable=print_xcom_value,
    op_kwargs={"value": "{{ ti.xcom_pull(key='model_conf')['model_id'] }}"},
    provide_context=True,
)
> [2022-12-15, 13:11:19 UTC] {logging_mixin.py:115} INFO - VALUE: 3673414612827265024
CreateBatchPredictionJobOperator does not accept provide_context as a variable. I assumed it would render xcom pulls by default since xcom pulls are used in the CreateBatchPredictionJobOperator in an example on the Airflow docs (link here).
Is there any way I can provide context to this Vertex AI Operator to pull from the XCom storage?
Is something wrong with the syntax that I am not seeing? Is there anything I am misunderstanding in the docs?
UPDATE:
One thing that confuses me is that model_name is a templated field according to the Airflow docs (link here) but the field is not rendering the XCom template.
Did you set render_template_as_native_obj=True in your DAG definition?
What version of apache-airflow-providers-google do you use?
====
From OP:
Your answer was a step in the right direction.
The solution was to upgrade apache-airflow-providers-google to the latest version (at the moment, this is 8.6.0). I'm not able to pinpoint exactly where in the changelog this fix is mentioned.
Setting render_template_as_native_obj=True was not useful for this issue since it rendered the id pulled from XCom as an int, and I found no proper way to convert it to str when passed into CreateBatchPredictionJobOperator in the model_name arg.
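The thread does not confirm why the upgrade fixed the rendering. One check that may help when a template is passed through unrendered is to look at the operator class's template_fields attribute in the installed provider version, since Airflow only renders the fields listed there. The sketch below is a minimal, hedged example: is_templated and FakeOperator are hypothetical names, and FakeOperator merely stands in for a real operator class such as CreateBatchPredictionJobOperator.

```python
def is_templated(operator_cls, field_name: str) -> bool:
    """Report whether field_name is listed in the class's template_fields."""
    # Classes without the attribute are treated as having no templated fields.
    return field_name in getattr(operator_cls, "template_fields", ())


class FakeOperator:
    """Stand-in for a real operator class; hypothetical, for illustration only."""

    template_fields = ("model_name", "region")


print(is_templated(FakeOperator, "model_name"))
print(is_templated(FakeOperator, "machine_type"))
```

Running the same check against the real operator class before and after upgrading the provider package would show whether the set of templated fields changed between versions.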

airflow configure mail template

I'm trying to create a mail template for my DAGs in Airflow. I'm struggling to find the documentation; I tried this but it's poor.
I'm trying to find out which variables I can use in my template. For example, for the subject I want to access the DAG name (not {{ ti }}, which contains other information). For the body, I want to choose which part is sent: in my case the real exception is displayed as a warning in the Airflow log, so I want to send that instead of {{ exception_html }}.

How to pass a value from Xcom to another operator?

DockerOperator has a parameter xcom_push which when set, pushes the output of the Docker container to Xcom:
t1 = DockerOperator(task_id='run-hello-world-container',
                    image='hello-world',
                    xcom_push=True, xcom_all=True,
                    dag=dag)
In the admin interface under Xcom, I can see these values with key return_value. However, how can I access them in the DAG?
If I try:
t1_email_output = EmailOperator(task_id='t1_email_output',
                                to='user@example.com',
                                subject='Airflow sent you an email!',
                                html_content={{ ti.xcom_pull(task_ids='return_value') }},
                                dag=dag)
I get Broken DAG: [PATH] name 'ti' is not defined.
If I try:
t1_email_output = EmailOperator(task_id='t1_email_output',
                                to='user@example.com',
                                subject='Airflow sent you an email!',
                                html_content=t1.xcom_pull(task_ids='return_value'),
                                dag=dag)
I get Broken DAG: [PATH] xcom_pull() missing 1 required positional argument: 'context'.
You need to pass the task id from which you are pulling the xcom and not the variable name
In your example it would be
{{ ti.xcom_pull('run-hello-world-container') }}
Also in the second snippet it should be "ti" instead of "t1"
html_content=ti.xcom_pull('run-hello-world-container'),
I found the problem - turns out I was missing a quote and my parameter was also wrong:
t1_email_output = EmailOperator(task_id='t1_email_output',
                                to='user@example.com',
                                subject='Airflow sent you an email!',
                                html_content="{{ ti.xcom_pull(key='return_value') }}",
                                dag=dag)
Sends an email with the Docker container's output like I expect.
I think what is happening is that the {{ }} syntax gets processed as a Jinja template by Airflow when the DAG is run, but not when it is loaded. So if I don't put the quotes around it, Airflow gets Python exceptions when it tries to detect and load the DAG, because the template hasn't been rendered yet. But if the quotes are added, the templated expression is treated as a string, and ignored by Python interpreter when being loaded by Airflow. However when the EmailOperator is actually triggered during a DAG run, the template is rendered into actual references to the relevant data.
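The difference between the quoted and unquoted forms can be reproduced in plain Python, without Airflow. The sketch below is only an illustration of the parse-time behaviour described above; parse_unquoted is a hypothetical helper name. Unquoted, Python reads {{ ... }} as a set literal nested in another set literal and tries to evaluate ti immediately, which fails because no such name exists when the file is loaded.

```python
# Quoted: just a string at load time; nothing inside it is evaluated by Python.
quoted = "{{ ti.xcom_pull(key='return_value') }}"
assert isinstance(quoted, str)


def parse_unquoted(source: str):
    """Evaluate source as Python code and return the NameError message, if any."""
    try:
        eval(source, {})
    except NameError as exc:
        return str(exc)
    return None


# Unquoted: Python evaluates the braces as nested set literals and looks up `ti`.
error = parse_unquoted("{{ ti.xcom_pull(key='return_value') }}")
print(error)
```

The message mentions the undefined name ti, which matches the "Broken DAG" error from the first attempt in the question.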

set Airflow variable in UI to today.date or {{ds}}

Is there any way to set a variable in Airflow UI to get today.date() or something similar to {{ds}} in the DAG code?
I want to have flexibility to set a hard code date in variable without changing the DAG code for some use cases.
I am getting today date in DAG code right now:
today = datetime.today()
but wanted to get it like this:
today= models.Variable.get('todayVar')
This is a duplicate of stackoverflow post:
Airflow - Get start time of dag run
You can achieve what you want by:
{{ dag_run.start_date }}
In Airflow, the date you mean is also called the 'run_date'.
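For the "hard-coded date without changing the DAG code" part of the question, a minimal sketch of one possible approach is shown below. It is an assumption, not something the answer above proposes: resolve_run_date is a hypothetical helper, the YYYY-MM-DD format is assumed, and in a DAG the override value might come from something like Variable.get('todayVar', default_var=None).

```python
from datetime import date, datetime
from typing import Optional


def resolve_run_date(override: Optional[str], today: Optional[date] = None) -> date:
    """Use the override date when one is set, otherwise fall back to today.

    Hypothetical helper: `override` is expected as a YYYY-MM-DD string or None.
    """
    if override:
        return datetime.strptime(override, "%Y-%m-%d").date()
    # `today` can be injected for testing; by default the current date is used.
    return today or date.today()


print(resolve_run_date("2021-12-14"))
```

Leaving the variable unset (or empty) gives the normal behaviour, and setting it in the UI pins the date for special cases.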

Is it possible to 'tag' an airflow DAG in the UI?

In the airflow DAG UI, I'd like to add a tag for a subset of DAGs. Let's say there is a tag #weekend_runs that I'd like to add to some specific DAGs.
Is it possible to filter your view of DAGs in the UI based on tags in Airflow? Or do I need to do something hacky like add _weekend_run to the end of DAG names in order to use fuzzy search and filter out other scripts?
Thanks!
Adding a tag to a DAG is now possible from Airflow 1.10.9
In order to filter DAGs (e.g by team), you can add tags in each dag.
The filter is saved in a cookie and can be reset by the reset button.
For example:
Dag File:
dag = DAG('dag', tags=['example'])
UI:
Note: This feature is only available for the RBAC UI (enabled using rbac=True in [webserver] section in your airflow.cfg).
This is not possible yet and it's not even in the roadmap for Airflow 2.0.
A hack I used in the past is to abuse one of the fields (DAG name or Owner), as you suggested, for example by adding _weekend_run to the DAG name. I then created a Greasemonkey userscript that filters out the DAGs you don't want to show in the UI. Something along the lines of the following script will do the job for your application:
// ==UserScript==
// @name Only weekend runs
// @match http://<airflow-instance-url-here>/admin/
// @grant none
// ==/UserScript==
(function() {
    'use strict';
    // Hide every data row (a row with <td> cells) that does not mention "weekend_run".
    $('tr').has('td').not(':contains("weekend_run")').hide();
})();
Unfortunately for this to work it needs to be installed on each user's browser, which is far from ideal. Of course, the ideal thing would be to make a PR to the Airflow project :)
