How to trigger an Airflow DAG manually? - airflow

I have been working with Airflow for a while with no problems with the scheduler, but now I have encountered a problem.
Basically I have a script and DAG ready for a task, but the task doesn't run periodically. Instead it needs to be activated at a random time. (External parties will tell us when it's time and we will run it. This may happen many times in the following months.)
Is there any way to trigger the DAG manually? Any other directions/suggestions are welcome as well.
Thanks.

You have a number of options here:
UI: Click the "Trigger DAG" button, either on the main DAGs list or on the page of a specific DAG.
CLI: Run airflow trigger_dag <dag_id>, see docs in https://airflow.apache.org/docs/stable/cli.html#trigger_dag. Note that later versions of Airflow use the syntax airflow dags trigger <dag_id>.
API: Call POST /api/experimental/dags/<dag_id>/dag_runs, see docs in https://airflow.apache.org/docs/stable/api.html#post--api-experimental-dags--DAG_ID--dag_runs.
Operator: Use the TriggerDagRunOperator, see docs in https://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/operators/trigger_dagrun/index.html#airflow.operators.trigger_dagrun.TriggerDagRunOperator and an example in https://github.com/apache/airflow/blob/master/airflow/example_dags/example_trigger_controller_dag.py.
You'll probably go with the UI or CLI if this is truly going to be 100% manual. The API or Operator would be options if you are looking to let the external party indirectly trigger it themselves. Remember to set schedule_interval=None on the DAG.
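For the fully manual case, a minimal sketch of such a DAG could look like the following (the dag_id, task and callable are made-up placeholders); the important part is schedule_interval=None so the scheduler never starts it on its own:
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def do_work():
    # Placeholder for the actual task logic
    print("Running the manually triggered task")


with DAG(
    dag_id="manually_triggered_dag",  # hypothetical dag_id
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,  # never scheduled, only triggered manually
    catchup=False,
) as dag:
    PythonOperator(task_id="do_work", python_callable=do_work)
Once it's deployed, any of the options above (UI button, CLI trigger, REST API, or TriggerDagRunOperator) will start a run of it.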

So a DAG can be triggered in the following ways:
Using the REST API Reference (see documentation)
endpoint:
> POST /api/experimental/dags/<DAG_ID>/dag_runs
Using Curl:
curl -X POST \
http://localhost:8080/api/experimental/dags/<DAG_ID>/dag_runs \
-H 'Cache-Control: no-cache' \
-H 'Content-Type: application/json' \
-d '{"conf": "{\"key\": \"value\"}"}'
Using Python requests:
import json
import requests
# url, data and headers as in the curl example above; a complete sketch follows after this list
response = requests.post(url, data=json.dumps(data), headers=headers)
Using the "Trigger DAG" option in the UI, as mentioned by @Daniel
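Putting the requests snippet together, a self-contained sketch against the experimental endpoint could look like this (host, port, DAG id and conf values are placeholders, and as noted in a later answer this API is deprecated since Airflow 2.0):
import json

import requests

# Placeholders: adjust host, port and DAG id for your environment
url = "http://localhost:8080/api/experimental/dags/<DAG_ID>/dag_runs"
data = {"conf": {"key": "value"}}  # optional parameters made available to the DAG run
headers = {"Cache-Control": "no-cache", "Content-Type": "application/json"}

response = requests.post(url, data=json.dumps(data), headers=headers)
print(response.status_code, response.text)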

Airflow has an API. The method you need is POST /api/experimental/dags/<DAG_ID>/dag_runs. With this method you can also pass config params for the DAG run.
We use Jenkins to trigger DAGs manually. If you are using Jenkins, you could check our Jenkins pipeline library.

The examples given in other answers use the "experimental" API. This REST API is deprecated since version 2.0. Please consider using the stable REST API.
Using method POST dags/<DAG_ID>/dagRuns.
curl -X POST 'http://localhost:8080/api/v1/dags/<DAG_ID>/dagRuns' \
--header 'accept: application/json' \
--header 'Content-Type: application/json' \
--user '<user>:<password>' \
--data '{}'
The --user option is only needed if authentication is enabled.

Related

How do I set my project cloud resource location using the CLI only?

I am running the command 'firebase init storage' and am getting the following error...
Error: Cloud resource location is not set for this project but the operation you are attempting to perform in Cloud Storage requires it.
I am writing a script to initialize my project, so I do not want to have to do things manually in a web console if I don't have to.
How can I set the Cloud resource location for my project using either the Firebase CLI or gcloud tools?
A similar concern: "Firebase Project Initialization Error: Cloud resource location is not set for this project".
This might help you:
Go to the Firebase console and open your project
Go to the Storage tab (side panel on the left)
Click "Set up storage"
Run firebase init again
You may use the REST API to create a script to initialize your project.
REST API sample curl:
curl --request POST \
'https://firebase.googleapis.com/v1beta1/projects/<PROJECT-ID>/defaultLocation:finalize' \
--header 'Authorization: Bearer '$(gcloud auth application-default print-access-token) \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--data '{"locationId": "<REGION-NAME>"}' \
--compressed
Check this documentation for reference:
REST Resource: projects
Method: projects.defaultLocation.finalize
Note: After the default GCP resource location is finalized, or if it was already set, it cannot be changed.
Unfortunately, the Firebase CLI and gcloud tools do not yet support setting the Cloud resource location. However, you can file a feature request if you think it would be helpful.
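If you want to keep everything inside a Python setup script instead of shelling out to curl, a sketch of the same finalize call using the google-auth library could look like this (project id and location are placeholders; the auth pattern is the generic Application Default Credentials flow, not anything Firebase-specific):
import google.auth
from google.auth.transport.requests import AuthorizedSession

# Uses Application Default Credentials, e.g. from `gcloud auth application-default login`
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
session = AuthorizedSession(credentials)

project_id = "<PROJECT-ID>"    # placeholder
location_id = "<REGION-NAME>"  # placeholder

response = session.post(
    f"https://firebase.googleapis.com/v1beta1/projects/{project_id}/defaultLocation:finalize",
    json={"locationId": location_id},
)
response.raise_for_status()
print(response.json())  # a long-running Operation describing the finalize call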

How to trigger a DAG with a future date

I want to trigger an Airflow DAG in the future so the execution date is tomorrow.
This will help us test a file with tomorrow's date.
When I click the execute button I see an option "Trigger DAG w/ config", but could not find any documentation around that.
Go to "trigger with config" and change the date there, next to the calendar icon. Leave the config editor as is.
The Airflow UI doesn't allow you to specify an execution date; it always triggers "right now". However, the REST API and CLI do allow you to specify an execution date.
CLI (docs):
airflow dags trigger -e/--exec-date EXECUTION_DATE DAG_ID
# For example:
airflow dags trigger -e 2022-04-05 mydag
REST API (docs):
curl -X 'POST' \
'http://localhost:8080/api/v1/dags/mydag/dagRuns' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"logical_date": "2022-04-05T00:00:00Z"
}'
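For completeness, the same request from Python; host, DAG id, date and credentials are the placeholders used above, and this assumes basic authentication is enabled for the stable REST API in your deployment:
import requests

response = requests.post(
    "http://localhost:8080/api/v1/dags/mydag/dagRuns",
    auth=("<user>", "<password>"),  # only if basic auth is enabled
    json={"logical_date": "2022-04-05T00:00:00Z"},
)
response.raise_for_status()
print(response.json())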

How to call API hooks or callback functions from JIRA to DocuSign and back from DocuSign to JIRA

Hi guys, I am working on a JIRA service desk. From there I need to pass some information to DocuSign when a button is clicked; a PDF will be generated in DocuSign, and DocuSign will send that PDF file back to JIRA, where it will be attached to the issue. If this doesn't work directly, then any intermediate code in ASP.NET/MVC/Core would be great.
Thanks in advance.
You can use Python in between. You need to set up a webhook from Jira to Airflow, for example.
The webhook configuration in your Jira would look like:
curl -X POST \
https://{airflow.SERVER}/api/v1/dags/{dag_name}/dagRuns \
-H 'Cache-Control: no-cache' \
-H 'Content-Type: application/json' \
-H 'Authorization: Basic *****************' \
-d '{"conf":{"ticket":"{{issue.key}}"}}'
Then host your code there and set up the connection to Jira and your DocuSign.
The code could look like the following:
# Import path for the Jira provider hook may differ between Airflow versions
from airflow.providers.atlassian.jira.hooks.jira import JiraHook

atc_conn = JiraHook("jira").get_conn()
jql = "project =...."
issues = atc_conn.search_issues(jql, fields=["attachment"])
for issue in issues:
    attachments = issue.fields.attachment
    for attachment in attachments:
        attachments_to_sign = attachment.get()
        ...
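On the Airflow side, the "ticket" value sent by the webhook arrives in the DAG run's conf, so a receiving DAG could pick it up roughly like this (the dag_id and task are illustrative; only the "ticket" key comes from the webhook payload above):
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def handle_ticket(**context):
    # "ticket" is the key sent in the Jira webhook payload above
    ticket = context["dag_run"].conf.get("ticket")
    print(f"Processing Jira issue {ticket}")
    # ...fetch the attachments via JiraHook and push the PDF to DocuSign here


with DAG(
    dag_id="jira_to_docusign",  # hypothetical, matches {dag_name} in the webhook URL
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,  # only triggered by the webhook
    catchup=False,
) as dag:
    PythonOperator(task_id="handle_ticket", python_callable=handle_ticket)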

How can I send the content of the file to HTTP Event Collector in Splunk?

I am using a script that gives me some data in JSON format, and I want to send this data to Splunk.
I can store the output of the script in a file, but how can I send it to the HTTP Event Collector?
A couple of things I tried that did not work:
FILE="output.json"
file1="cat answer.txt"
curl -k "https://prd-pxxx.splunkcloud.com:8088/services/collector"  -H "Authorization: Splunk XXXXX"  -d  '{"event": "$file1", "sourcetype": "manual"}'
-----------------------------------------------------------
curl -k "https://prd-pxxx.splunkcloud.com:8088/services/collector"  -H "Authorization: Splunk XXXXX"  -d  '{"event": "#output.json", "sourcetype": "manual"}'
curl -k "https://prd-p-w0gjo.splunkcloud.com:8088/services/collector"  -H "Authorization: Splunk d70b305e-01ef-490d-a6d8-b875d98e689b"   -d '{"sourcetype":"_json", "event": "#output.json", "source": "output.json}
-----------------------------------------------------------------
After trying this I understand that it literally sends everything specified in the event section. Is there a way I can send the content of the file or use a variable?
Thanks in advance!
(Note - I haven't tried this specifically, but it should get you close)
According to Docs.Splunk on HTTP Event Collector Examples #3, it would seem you can do something very similar to this:
curl -k "https://mysplunkserver.example.com:8088/services/collector/raw?channel=00872DC6-AC83-4EDE-8AFE-8413C3825C4C&sourcetype=splunkd_access&index=main" \
-H "Authorization: Splunk CF179AE4-3C99-45F5-A7CC-3284AA91CF67" \
-d @"$FILE"
Presuming the content of the file is formatted correctly, it should go straight in.
How is the file being created? Is it in a deployment app on a managed endpoint? If so, it will likely be simpler to set up a scripted input for the UF to run on whatever schedule you choose.
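If you would rather send the file from a small script than from curl, here is a sketch in Python; the host, token and file name are the placeholders from the question, and it assumes output.json contains a single JSON object that should become one event on the /services/collector/event endpoint:
import json

import requests

HEC_URL = "https://prd-pxxx.splunkcloud.com:8088/services/collector/event"  # placeholder host
HEC_TOKEN = "XXXXX"  # placeholder HEC token

# Read the script output from the file instead of embedding it literally in the payload
with open("output.json") as f:
    event_data = json.load(f)

response = requests.post(
    HEC_URL,
    headers={"Authorization": f"Splunk {HEC_TOKEN}"},
    json={"event": event_data, "sourcetype": "_json"},
    verify=False,  # equivalent of curl -k; only for testing
)
print(response.status_code, response.text)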

How to check compute time usage (GB-seconds & GHz-seconds) of certain Firebase Google Cloud Functions per day?

I have two different background Cloud Functions doing similar things, but using different algorithms and libraries for testing.
Now I want to measure the total GB-seconds & GHz-seconds per function to optimize for pricing.
I think this is available in the Functions metrics explorer, but I can't create the report.
One option is to attach labels to your Cloud Functions, then go to your GCP billing report, group by SKU, and filter by labels to see the breakdown per function.
The only downside is that labels can currently only be configured via the gcloud command, the GCP client libraries, or the REST API; this is not yet available in the Firebase CLI (feature request here).
In this first approach, you'll have to redeploy your functions using gcloud. Here's an example; further information can be found in this documentation:
gcloud functions deploy FUNCTION_NAME \
--runtime RUNTIME \
--trigger-event "providers/cloud.firestore/eventTypes/document.write" \
--trigger-resource "projects/YOUR_PROJECT_ID/databases/(default)/documents/messages/{pushId}" \
--update-labels KEY=VALUE
The second approach, which avoids redeployment, is to send a PATCH request to add or update labels on your function. It can be done by running this command (replace the all-caps placeholders with your values):
curl \
--request PATCH \
--header "Authorization: Bearer $(gcloud auth print-access-token)" \
--header "content-type: application/json" \
--data "{\"labels\":{\"KEY\":\"VALUE\"}}" \
https://cloudfunctions.googleapis.com/v1/projects/PROJECT-ID/locations/REGION/functions/FUNCTION-NAME?updateMask=labels
