I'm trying to create an external task sensor with the following configuration:
DAG-A runs at 00:00:00
DAG-B runs at 04:00:00
DAG-B.task checks the status of DAG-A.task.
The issue is that when the external task sensor in DAG-B pokes DAG-A.task, it uses the 04:00:00 execution date, like this:
INFO - Poking for DAG-A.task on 2020-06-02T04:00:00+00:00
instead of:
INFO - Poking for DAG-A.task on 2020-06-02T00:00:00+00:00
And the task is not found.
Any ideas which parameter to configure so that it pokes at 00:00:00?
I found the solution: I just needed to add an execution_delta to the external task sensor configuration, like this:
execution_delta=timedelta(hours=4)
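Put together, a minimal sketch of the sensor in DAG-B (the import path below is the Airflow 1.10.x one; in 2.x the sensor lives in airflow.sensors.external_task, and the DAG/task ids are just the placeholders from the question):
from datetime import timedelta
from airflow.sensors.external_task_sensor import ExternalTaskSensor

# DAG-B runs 4 hours after DAG-A, so the sensor has to look 4 hours
# back from DAG-B's execution date to find DAG-A's run for the same day.
wait_for_dag_a = ExternalTaskSensor(
    task_id="wait_for_dag_a_task",
    external_dag_id="DAG-A",
    external_task_id="task",
    execution_delta=timedelta(hours=4),  # 04:00 - 4h = 00:00
    dag=dag,  # the DAG-B object defined elsewhere
)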
So, I have a problem even with a blank Airflow installation.
As soon as I try to run
airflow test tutorial print_date 2015-06-01
I get a raised exception which says
PendingDeprecationWarning: The requested task could not be added to the DAG because a task with task_id create_tag_template_field_result is already in the DAG. Starting in Airflow 2.0, trying to overwrite a task will raise an exception.
What is the reason for this (as I made literally no changes to the installation whatsoever)?
I also got that when, in a previous installation, I tried to run my own dag... but the "create_tag_template_field_result" was nowhere to be found in my code.
You can set the config arg load_examples = False to solve it.
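In airflow.cfg this lives under the [core] section:
[core]
load_examples = False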
The airflow test command calls the get_dag function, which constructs a DagBag object; the DagBag constructor in turn calls its collect_dags function.
When the config arg LOAD_EXAMPLES is True (the default), collect_dags picks up all the DAGs in the examples path as well; that's where the task create_tag_template_field_result comes from.
collect_dags then calls add_task for every example task, and that's where create_tag_template_field_result gets added a second time.
It was probably the quickstart that added this task the first time, without you realizing it.
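If you want to verify this locally, here is a quick sketch (assuming a 1.10.x-era DagBag, which accepts an include_examples flag):
from airflow.models import DagBag

# Build the DagBag without the bundled example DAGs; with the default
# (which follows core.load_examples) the example tasks such as
# create_tag_template_field_result get collected too.
bag = DagBag(include_examples=False)
print(list(bag.dags))      # only your own DAG ids
print(bag.import_errors)   # any files that failed to parse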
This warning is occurring in
/usr/local/lib/python3.7/dist-packages/airflow/example_dags/example_complex.py
so I removed or renamed that file (for example, to a name that won't be imported, such as *.py.back).
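If the path differs on your machine, a one-liner like this should print where the bundled example DAGs live (assuming a standard install):
python -c "import airflow.example_dags as m; print(m.__path__[0])"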
I had the same error with a fresh install.
I don't know if this helps, but I downgraded Airflow to version 1.10.10 (with Python 3.7) and the error was gone.
There is no problem installing at on Termux. But if I try to set up a job, I get the error
"Can't open /var/run/atd.pid to signal atd. No atd running?"
and the job does not execute on the given schedule.
Does anybody have an idea how to fix this?
I just found a way to start this daemon:
atd start
(Your PATH environment variable should be set up in such a way that the atd daemon is found.)
Furthermore, as mentioned in my comment, I advise you to check your /etc/init.d.
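Once atd is running, a quick smoke test of the scheduler could look like this (the output file here is just an example):
echo "date > ~/at-test.txt" | at now + 1 minute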
airflow webserver runs without problems.
airflow scheduler fails with the error message:
Cannot use more than 1 thread when using sqlite. Setting parallelism to 1
airflow.cfg:
sql_alchemy_conn = mysql+pymysql://root:mypassword@localhost:3306/airflow
Have you set $AIRFLOW_HOME wherever you run the scheduler, too?
It looks like the scheduler is not picking up the airflow.cfg file at all.
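For example, assuming the config lives in ~/airflow:
export AIRFLOW_HOME=~/airflow    # the directory that contains airflow.cfg
airflow scheduler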
Let's say today is 2017-10-20. I have an existing DAG which has run successfully up to today. I need to add a task with a start_date of 2017-10-01. How do I make the scheduler trigger the task from 2017-10-01 to 2017-10-20 automatically?
You can use the backfill command line tool.
airflow backfill your_dag_id -s 2017-10-01 -e 2017-10-20 -t task_name_regex
This assumes there are already DAG runs for the dates beginning from 2017-10-01. If that's not the case, make sure the DAG's start date is 2017-10-01 or earlier and that catchup is enabled.
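A minimal sketch of those two settings (the dag_id and schedule are placeholders):
from datetime import datetime
from airflow import DAG

# start_date must cover the range you want to backfill,
# and catchup=True lets the scheduler create the past runs.
dag = DAG(
    dag_id="your_dag_id",
    start_date=datetime(2017, 10, 1),
    schedule_interval="@daily",
    catchup=True,
)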
If you don't mind executing the whole DAG again, you can remove it from the Web UI and it will appear again with status Off. If you enable it again, it will run from the beginning, including the new tasks.
I'm running Airflow and attempting to iterate on some tasks we're building from the command line.
When running airflow webserver, everything works as expected. But when I run airflow backfill dag task '2017-08-12', airflow returns:
[2017-08-15 02:52:55,639] {__init__.py:57} INFO - Using executor LocalExecutor
[2017-08-15 02:52:56,144] {models.py:168} INFO - Filling up the DagBag from /usr/local/airflow/dags
2017-08-15 02:52:59,055 - airflow.jobs.BackfillJob - INFO - Backfill done. Exiting
...and doesn't actually run the dag.
When using airflow test or airflow run (i.e. commands involving running a task rather than a DAG), it works as expected.
Am I making a basic mistake? What can I do to debug from here?
Thanks
Have you already run the DAG on that date range? You will need to clear the DAG first (e.g. with airflow clear), then backfill. Based on what Maxime mentioned here: https://groups.google.com/forum/#!topic/airbnb_airflow/gMY-sc0QVh0
If a task has a @monthly schedule and you try to run it with a start_date mid-month, it will merely state Backfill done. Exiting.. A schedule of '30 5 * * *' also prevents backfilling from the command line.
(Updated to reflect better information, and this discussion)
Two possible reasons:
The execution date specified via the -e option is outside of the DAG's [start_date, end_date) range.
Even if the execution date is between the dates, keep in mind that if your DAG has schedule_interval=None, it won't backfill iteratively: it will only run for a single date (the one specified as --start_date, or as --end_date if the former is omitted).
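For instance, assuming the DAG is given a daily schedule_interval, a single-day backfill would look something like this (your_dag_id is a placeholder):
airflow backfill your_dag_id -s 2017-08-12 -e 2017-08-12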