How do I get a list of all unpaused (running) DAGs using the Airflow API?
I tried the GET /dags endpoint, but I did not find a query string to filter out paused DAGs. Isn't there something like an is_paused query parameter or body parameter?
P.S. I'm currently using Airflow version 2.2.3+
Currently the Airflow API doesn't support this filter; you have to fetch all the DAGs and filter them locally.
If you really need this filter, you can create an Airflow plugin which exposes a simple API to fetch the unpaused DAGs and return them.
Update: this filter will be available in the Airflow API from 2.6.0 (PR)
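In the meantime, a minimal sketch of the local filtering, assuming the stable REST API is reachable on a local webserver with basic auth (the URL and credentials below are placeholders):

import requests

BASE_URL = "http://localhost:8080/api/v1"  # assumed local webserver
AUTH = ("admin", "admin")                  # hypothetical credentials

resp = requests.get(f"{BASE_URL}/dags", auth=AUTH, params={"limit": 100})
resp.raise_for_status()

# Each entry in the /dags response carries an "is_paused" flag; filter locally.
unpaused = [d["dag_id"] for d in resp.json()["dags"] if not d["is_paused"]]
print(unpaused)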
Actually there is a plugin made for this. You can fetch the DAGs along with their status. Please explore this plugin; maybe it is what you are looking for.
Airflow API Plugin
Dag Run Endpoints
Or else you can write a custom Python script/API to fill the DagBag and then filter the list to get the DAGs with the status you want, as sketched below.
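If you go the custom-script route, here is a rough sketch that skips the DagBag and instead reads the paused flag straight from the metadata database via the Airflow ORM (it assumes it runs somewhere with access to the Airflow metadata DB):

from airflow.models import DagModel
from airflow.utils.session import create_session

# The dag table carries the is_paused flag, so DagModel can be queried directly.
with create_session() as session:
    unpaused_dag_ids = [
        dm.dag_id
        for dm in session.query(DagModel)
        .filter(DagModel.is_paused.is_(False), DagModel.is_active.is_(True))
        .all()
    ]

print(unpaused_dag_ids)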
Related
I was looking through the different API endpoints that Airflow offers, but I could not find one that suits my needs. Essentially, I want to monitor the state of each task within the DAG without having to specify each task I am trying to monitor. Ideally, I would be able to ping the DAG and the response would tell me the state of the task/tasks and which task/tasks are running/retrying, etc.
You can use the REST API which comes with Airflow - https://airflow.apache.org/docs/apache-airflow/stable/stable-rest-api-ref.html
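For example, a rough sketch that lists the state of every task in the most recent run of a DAG via the stable REST API (the URL, credentials, and dag_id are placeholders):

import requests

BASE_URL = "http://localhost:8080/api/v1"  # assumed local webserver
AUTH = ("admin", "admin")                  # hypothetical credentials
DAG_ID = "my_dag"                          # hypothetical DAG id

# Find the most recent run of the DAG.
runs = requests.get(
    f"{BASE_URL}/dags/{DAG_ID}/dagRuns",
    auth=AUTH,
    params={"order_by": "-execution_date", "limit": 1},
).json()["dag_runs"]

if runs:
    run_id = runs[0]["dag_run_id"]
    # List every task instance of that run along with its state.
    tis = requests.get(
        f"{BASE_URL}/dags/{DAG_ID}/dagRuns/{run_id}/taskInstances",
        auth=AUTH,
    ).json()["task_instances"]
    for ti in tis:
        print(ti["task_id"], ti["state"])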
I have set up Airflow on my local machine. I am trying to access the below Airflow link:
http://localhost:8080/api/experimental/test/
I am getting the response "Airflow 404 = lots of circles".
I have tried to set auth_backend to default, but no luck.
What changes do I need to make in airflow.cfg to be able to make REST API calls to Airflow for triggering DAGs?
The experimental API was used in Airflow 1.10, but it has been deprecated and is disabled by default in Airflow 2. Instead, you should use the fully-fledged stable REST API, which uses a completely different URL scheme:
https://airflow.apache.org/docs/apache-airflow/stable/stable-rest-api-ref.html
In the Airflow UI you can even browse and try the API (just look at the Airflow menus).
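As a sketch, assuming you enable basic auth in airflow.cfg (e.g. [api] auth_backend = airflow.api.auth.backend.basic_auth) and have created an Airflow user, triggering a DAG looks roughly like this (URL, credentials, and dag_id are placeholders):

import requests

BASE_URL = "http://localhost:8080/api/v1"
AUTH = ("admin", "admin")  # hypothetical Airflow user
DAG_ID = "my_dag"          # hypothetical DAG id

# POST /dags/{dag_id}/dagRuns triggers a new run of the DAG.
resp = requests.post(
    f"{BASE_URL}/dags/{DAG_ID}/dagRuns",
    auth=AUTH,
    json={"conf": {}},
)
resp.raise_for_status()
print(resp.json()["state"])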
Airflow provides REST API functionality to extract DAG/task status.
https://airflow.apache.org/docs/apache-airflow/stable/stable-rest-api-ref.html#section/Trying-the-API
But I am wondering if there is a way to get the latest DAG/task status of all DAGs for a given DAG owner, without specifying it manually for each dag_id.
This will help us create a workflow dashboard for business users.
You can make use of the dag, dag_run and task_instance tables in the Airflow metadata database. It's fairly straightforward.
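For example, a rough sketch using the Airflow ORM models that map to those tables (the owner value is a placeholder, and the script assumes access to the metadata DB):

from airflow.models import DagModel, DagRun
from airflow.utils.session import create_session

OWNER = "business-team"  # hypothetical owner value stored in the dag table

with create_session() as session:
    # dag.owners is a comma-separated string of the owners declared on the DAG.
    dags = session.query(DagModel).filter(DagModel.owners.contains(OWNER)).all()
    for dm in dags:
        last_run = (
            session.query(DagRun)
            .filter(DagRun.dag_id == dm.dag_id)
            .order_by(DagRun.execution_date.desc())
            .first()
        )
        state = last_run.state if last_run else "no runs yet"
        print(dm.dag_id, state)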
I read the API reference and couldn't find anything on this. Is that possible?
Currently, there is no such feature that does this out of the box, but you can write some custom code in your DAG to work around it. For example, use a PythonOperator (or the MySQL operator, if your metadata DB is MySQL) to get the status of the last X runs of the DAG.
Then use a BranchPythonOperator to see if the number of failures is more than X; if it is, use a BashOperator to run the airflow dags pause CLI command.
You can also make this a two-step task by adding the PythonOperator logic to the BranchPythonOperator. This is just an idea; you can use different logic, as sketched below.
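A rough sketch of that idea, as a separate monitoring DAG that pauses a hypothetical target DAG once its last N runs have all failed (the DAG id, schedule, and threshold are placeholders):

from datetime import datetime

from airflow import DAG
from airflow.models import DagRun
from airflow.operators.bash import BashOperator
from airflow.operators.dummy import DummyOperator
from airflow.operators.python import BranchPythonOperator
from airflow.utils.session import create_session
from airflow.utils.state import State

TARGET_DAG_ID = "my_dag"  # hypothetical DAG to watch
MAX_FAILURES = 3          # hypothetical threshold


def _check_recent_failures():
    # Look at the last MAX_FAILURES runs of the target DAG in the metadata DB.
    with create_session() as session:
        recent = (
            session.query(DagRun)
            .filter(DagRun.dag_id == TARGET_DAG_ID)
            .order_by(DagRun.execution_date.desc())
            .limit(MAX_FAILURES)
            .all()
        )
    failures = sum(1 for run in recent if run.state == State.FAILED)
    return "pause_dag" if failures >= MAX_FAILURES else "do_nothing"


with DAG(
    "pause_on_failures",
    start_date=datetime(2022, 1, 1),
    schedule_interval="@hourly",
    catchup=False,
) as dag:
    check = BranchPythonOperator(task_id="check_failures", python_callable=_check_recent_failures)
    pause = BashOperator(task_id="pause_dag", bash_command=f"airflow dags pause {TARGET_DAG_ID}")
    skip = DummyOperator(task_id="do_nothing")
    check >> [pause, skip]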
We are using Cloud Composer (managed Airflow in GCP) to orchestrate our tasks. We are moving all our logs to Sumo Logic (a standard process in our org). Our requirement is to track the entire log of a single execution of a DAG; as of now there seems to be no way to do this.
Currently, the first task in the DAG generates a unique ID and passes it to the other tasks via XCom. The problem is that we were not able to inject the unique ID into the Airflow operators' logs (like BigQueryOperator).
Is there any other way to inject a custom unique ID into the Airflow operators' logs?
Composer integrates with Stackdriver Logging, and you can filter per-DAG logs by the workflow ({your-dag-name}) and execution-date ({your-dag-run-date}) labels. For example, you can read log entries with the following filters:
resource.type="cloud_composer_environment"
resource.labels.location="your-location"
resource.labels.environment_name="your-environment-name"
logName="projects/your-project-id/logs/airflow-worker"
labels."execution-date"="your-dag-run-date"
labels.workflow="your-dag-id"
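For instance, a sketch that reads those entries with the google-cloud-logging Python client (the project, environment, and DAG values are placeholders):

from google.cloud import logging as gcp_logging

client = gcp_logging.Client(project="your-project-id")

# Same filter as above, scoped to a single DAG run.
log_filter = " AND ".join([
    'resource.type="cloud_composer_environment"',
    'resource.labels.location="your-location"',
    'resource.labels.environment_name="your-environment-name"',
    'logName="projects/your-project-id/logs/airflow-worker"',
    'labels."execution-date"="your-dag-run-date"',
    'labels.workflow="your-dag-id"',
])

for entry in client.list_entries(filter_=log_filter):
    print(entry.timestamp, entry.payload)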