Airflow: Execution date is in future when launching from web-ui - airflow

I'm using airflow v1.9.0.
When I trigger a dag from the web-ui, the execution date is in my local time, which means that for airflow the task is scheduled in the future.
How can I configure the webserver to use its own local time instead of mine?
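For reference, a minimal sketch of the knobs involved, with a placeholder DAG id: Airflow 1.9 predates the timezone-aware handling added in 1.10, so one workaround is to pass an explicit UTC execution date when triggering from the CLI, while 1.10+ exposes a default_timezone option in airflow.cfg.

# Airflow 1.10+ only; not available in 1.9
[core]
default_timezone = utc

# Possible 1.9 workaround (DAG id is a placeholder): trigger from the CLI
# with an execution date already expressed in UTC
airflow trigger_dag my_dag -e "2018-06-01T00:00:00"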

Related

mwaa restart functionality for requirements.txt updates

Every time our team pushes a new requirements.txt file for our MWAA environment, it requires a restart.
Regardless of the environment being in a PENDING or UPDATING state, I can still access the UI and run/monitor DAGs. I would expect something to at least be unavailable or locked during this process from a user perspective.
So, my questions are: in the MWAA way of things, what exactly is being "restarted" during this process, and why is it applied to the entire so-called MWAA environment?
The Airflow DAG processor, Airflow workers, and Airflow scheduler are rebooted, but not the Airflow web server. This can be confirmed by checking their respective logs.
Beware: long-running tasks can fail during a reboot.
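For context, a rough sketch of the boto3 call behind such an update, assuming the environment name, S3 key, and object version are placeholders; this is the operation that kicks off the restart of the components listed above.

import boto3

# Placeholder names throughout; updating requirements triggers the restart of
# the scheduler, workers, and DAG processor described above.
mwaa = boto3.client("mwaa")
mwaa.update_environment(
    Name="my-mwaa-env",                      # hypothetical environment name
    RequirementsS3Path="requirements.txt",   # key inside the environment's S3 bucket
    RequirementsS3ObjectVersion="example-version-id",
)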

Airflow Scheduler handling queueing of dags

I have the following airflow setup
Executor : KubernetesExecutor
airflow version : 2.1.3
airflow config : parallelism = 256
I have the below scenario
I have a number of DAGs (e.g. 10) that depend on the success state of a task from another DAG. That task kept failing, with retries enabled 6 times.
All the dependent DAGs run hourly, so their runs were put into the queued state by the scheduler. I could see around 800 DAG runs queued and nothing running, so I ended up manually changing their state to failed.
Below are my questions from this event.
Is there a limit on the number of DAGs that can run concurrently in an Airflow setup?
Is there a limit on how many DAG runs can be queued?
When DAG runs are queued, how does the scheduler decide which one to pick? Is it based on queued time?
Is it possible to set a priority among the queued DAGs?
How does Airflow 2.1.3 treat tasks in the queue? Are they counted against the max_active_runs parameter?
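To make the question concrete, here is a minimal sketch of the per-DAG and per-task settings these questions refer to (the DAG id, task, and values are placeholders, not my real setup):

from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical hourly DAG illustrating the limits asked about above.
with DAG(
    dag_id="hourly_dependent_dag",     # placeholder name
    start_date=datetime(2021, 1, 1),
    schedule_interval="@hourly",
    max_active_runs=1,                 # caps concurrent runs of this one DAG
    catchup=False,
) as dag:
    BashOperator(
        task_id="work",
        bash_command="echo run",
        priority_weight=10,            # higher weight is picked from the queue first
        pool="default_pool",           # pools cap concurrent task instances across DAGs
    )

Instance-wide caps such as parallelism = 256 live in airflow.cfg rather than in the DAG definition.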

If the Airflow scheduler crashes, can Airflow restart the in-progress job in another scheduler container?

If the Airflow scheduler crashes, can Airflow restart the in-progress job in another scheduler container? Or does it have to rerun the job from the beginning?
I am considering using Airflow to implement on-demand nearline processing and want to understand its reliability characteristics, but I could not confirm this point from the docs.
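For what it's worth, a minimal sketch of the task-level retry settings that control whether a task is re-attempted after an interruption (all names are placeholders); a retried task is re-run from the start rather than resumed mid-way.

from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash import BashOperator

# Placeholder on-demand DAG; a failed or interrupted task attempt is retried
# from the beginning, not resumed part-way through.
default_args = {
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="nearline_processing",      # hypothetical name
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,            # triggered on demand
    default_args=default_args,
) as dag:
    BashOperator(task_id="process", bash_command="echo processing")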

Airflow dag dependencies

I have an Airflow dag-1 that runs for approximately a week and a dag-2 that runs every day for a few hours. While dag-1 is running I cannot have dag-2 running because of an API rate limit (dag-2 is also supposed to run once dag-1 is finished).
Suppose dag-1 is already running and dag-2, which is supposed to run every day, fails: is there a way I can schedule the DAG dependencies correctly?
Is it possible to stop dag-1 temporarily (while it is running) when dag-2 is supposed to start, and then run dag-1 again without manual intervention?
One of the best ways is to use a defined pool.
Let's say you have a pool named "specefic_pool" with only one slot allocated to it.
Specify that pool name on the tasks in both DAGs (instead of the default pool, use the newly created one), as in the sketch below. That way the two DAGs cannot run in parallel.
Whenever dag-1 is running, dag-2 will not be triggered until the pool is free; and if dag-2 has picked up the pool, dag-1 will not be triggered until dag-2 has completed.
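A rough sketch of that setup, assuming Airflow 2.x import paths and placeholder DAG ids and commands: both DAGs point their tasks at the same one-slot pool, so only one of them can hold the slot at a time.

from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

# The pool "specefic_pool" is assumed to already exist with a single slot
# (created in the UI under Admin -> Pools or via the CLI).
with DAG("dag_1", start_date=datetime(2021, 1, 1), schedule_interval="@weekly") as dag_1:
    BashOperator(task_id="long_job", bash_command="echo dag-1 work", pool="specefic_pool")

with DAG("dag_2", start_date=datetime(2021, 1, 1), schedule_interval="@daily") as dag_2:
    BashOperator(task_id="daily_job", bash_command="echo dag-2 work", pool="specefic_pool")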

Airflow State of this instance has been externally set to shutdown. Taking the poison pill

Some of my Airflow tasks are automatically getting shut down.
I am using Airflow version 1.10.6 with Celery Executor. The Database is PostgreSQL and Broker is Redis. The airflow infrastructure is deployed on Azure.
A few tasks are getting shut down after 15 hours, and a few are getting stopped after 30 minutes. These are long-running tasks, and I have set the execution_timeout to 100 hours.
Is there any configuration that can prevent these tasks from being shut down by Airflow?
{local_task_job.py:167} WARNING - State of this instance has been externally set to shutdown. Taking the poison pill.
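For reference, a rough sketch of how the 100-hour execution_timeout mentioned above would be set (DAG and task names are placeholders); the warning in the log comes from the task instance's state being changed externally in the metadata database, which this timeout does not prevent.

from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash_operator import BashOperator  # 1.10.x import path

# Placeholder long-running DAG; execution_timeout only bounds how long the
# task may run, it does not stop the externally triggered shutdown above.
default_args = {
    "execution_timeout": timedelta(hours=100),
    "retries": 0,
}

with DAG(
    dag_id="long_running_dag",         # hypothetical name
    start_date=datetime(2020, 1, 1),
    schedule_interval=None,
    default_args=default_args,
) as dag:
    BashOperator(task_id="long_task", bash_command="sleep 360000")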
