How does a job behave if I set both a dependency and a schedule on it?
Suppose job J2 depends on J1 with the condition s(J1)|f(J1) and is scheduled at 18:00.
If job J1 is force started at 15:00 and succeeds at 15:05, what will happen to J2? Will it start at 15:05 or wait until 18:00?
Thanks in advance,
Suresh Reddy M
Job2 will wait until 18:00 to become "Active", then will check the dependency on s(Job1). If Job1 is in success at that time, Job2 will run.
The run time of a job (or a box) is always determined by both the condition (dependency) and the schedule set on it. If either of the two is not yet met, the job will not run. So in your case, Job2 will not run until 18:00.
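The rule described in the two answers above can be sketched as a small Python model. This is not AutoSys code; `can_start` and its parameters are hypothetical names used only to illustrate that the start time and the condition must both be satisfied.

```python
from datetime import time

def can_start(now: time, start_time: time, dependency_met: bool) -> bool:
    """Toy model: a job starts only when its scheduled time has arrived
    AND its dependency condition currently evaluates to true."""
    return now >= start_time and dependency_met

# J1 succeeded at 15:05, but J2 is scheduled for 18:00.
print(can_start(time(15, 5), time(18, 0), dependency_met=True))   # False
print(can_start(time(18, 0), time(18, 0), dependency_met=True))   # True
```

In this model the 15:05 success of J1 is not enough on its own; J2 only becomes eligible once 18:00 is reached.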
I set the schedule to '* 1,5,10,18 * * *' in Airflow.
But yesterday's 18:00 run was not executed, so I checked the logs.
I found that the run scheduled for 10:00 was executed at 18:00.
I want to know why this happens and how I can fix it.
Note that if you run a DAG on a schedule_interval of one day, the run
stamped 2016-01-01 will be triggered soon after 2016-01-01T23:59. In
other words, the job instance is started once the period it covers has
ended.
~Airflow scheduler docs
So, as you can see, the run is scheduled once its period (10:00 to 18:00) has closed, that is, after 18:00. Check the previous run; it should have been executed just after 10:00.
You don't understand how the Airflow scheduler works.
Airflow always stamps a DAG Run with the start of its data interval (data_interval_start), which is the previous schedule time. So the run executed at 18:00 in your case carries the DAG Run date of 10:00. In the same way, the next run, executed at 01:00, carries the DAG Run date of 18:00 on the previous day.
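The interval logic from the answers above can be illustrated without Airflow itself. The sketch below is plain Python and assumes the intended schedule fires once at minute 0 of hours 1, 5, 10 and 18 (the cron string in the question, with `*` in the minute field, would literally fire every minute of those hours). `SCHEDULE_HOURS`, `ticks` and `runs` are hypothetical names, not Airflow APIs.

```python
from datetime import datetime, timedelta

SCHEDULE_HOURS = [1, 5, 10, 18]  # assumption: one tick at minute 0 of each hour

def ticks(first_day: datetime, days: int = 2) -> list:
    """All schedule ticks for `days` consecutive days, in order."""
    return [
        (first_day + timedelta(days=d)).replace(hour=h, minute=0, second=0, microsecond=0)
        for d in range(days)
        for h in SCHEDULE_HOURS
    ]

def runs(first_day: datetime) -> list:
    """Pair each tick with the next one: a run is stamped with the start
    of its interval and is launched only when the interval ends."""
    t = ticks(first_day)
    return list(zip(t, t[1:]))

for stamp, launched_at in runs(datetime(2016, 1, 1)):
    print(f"run stamped {stamp:%m-%d %H:%M} is launched at {launched_at:%m-%d %H:%M}")
```

Under these assumptions the run stamped 10:00 is the one launched at 18:00, and the run stamped 18:00 is launched at 01:00 the following day, which matches what the logs in the question show.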
A job has a predecessor, so as soon as the predecessor finishes, the job starts running. Is it possible to schedule the job as follows?
1) When the predecessor finishes before, say, 5 o'clock, the job should wait until 5 to start.
2) When the predecessor finishes after 5 o'clock, the job should wait for it to finish.
Edit the second job to have a "From Time" of 17:00. If you have a condition between the 2 jobs then they will run as you describe.
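The effect of combining a "From Time" with a condition can be sketched as taking the later of the two moments. This is a toy Python model, not scheduler configuration; `earliest_start` is a hypothetical helper name.

```python
from datetime import datetime

def earliest_start(predecessor_finished: datetime, from_time: datetime) -> datetime:
    """The job starts at whichever comes later: the predecessor's
    finish time or the configured "From Time"."""
    return max(predecessor_finished, from_time)

from_time = datetime(2024, 1, 1, 17, 0)  # illustrative date, 17:00

# Case 1: predecessor finishes at 16:20, so the job waits until 17:00.
print(earliest_start(datetime(2024, 1, 1, 16, 20), from_time))
# Case 2: predecessor finishes at 17:45, so the job starts at 17:45.
print(earliest_start(datetime(2024, 1, 1, 17, 45), from_time))
```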
I have a DAG 'abc' scheduled to run every day at 7 AM CST. For some reason, I do not want to run tomorrow's instance. How can I skip that particular instance? Is there any way to do that from the command line? I would appreciate any help on this.
I believe you can preemptively create a DAG Run for the future date in the UI under Browse -> DAG Run -> Create, initializing it in the success (or failed) state, which should prevent the scheduler from creating a new run when the time comes. I think you can do this on the CLI with trigger_dag as well, but you will need to update its state separately because it defaults to running.
I think you can set the start_date to the day after tomorrow, or to whatever date you want the DAG to run, as long as it is in the future. The schedule interval stays the same (every day at 7 AM). You can set start_date in default_args.
I am trying to understand AutoSys jobs. Suppose Job A runs every 15 minutes. If for some reason Job A takes more than 15 minutes, will another instance of it run, or will it wait for the job to finish before running another instance?
In my experience, if the previous run is still running when the next scheduled time comes, another instance will not start. The job next runs once the previous run has finished and the next scheduled time arrives.
Another user also experienced this according to this answer.
I did not find any AutoSys documentation that officially confirms what happens in this situation, but I guess the best way to find out is to test it on your AutoSys instance.
I have experienced this first hand and can confirm that there won't be two instances in the mentioned scenario. The job will wait on the previous run to complete and will immediately kick off the next instance if the time condition is met before the previous completes.
But this is the case only when the job is in the running state. If the job is in any other state, it will kick off based on the given start_time condition.
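The "no overlapping instances" behavior can be sketched as a toy simulation in Python. It models the first answer's description, where a tick that arrives while the job is still running is skipped; the second answer describes a variant in which the next instance starts as soon as the previous one ends, which this sketch does not model. It is not based on AutoSys documentation, and `actual_starts` is a hypothetical name.

```python
def actual_starts(durations, period=15):
    """durations: run length in minutes of each successive run.
    Returns the start minute of each run when ticks that arrive
    while the job is still running are skipped."""
    starts = []
    busy_until = 0  # minute at which the previous run ends
    tick = 0        # next scheduled tick, in minutes
    for d in durations:
        # Skip every tick that falls inside the previous run.
        while tick < busy_until:
            tick += period
        starts.append(tick)
        busy_until = tick + d
        tick += period
    return starts

# First run takes 20 minutes, so the 15-minute tick is skipped.
print(actual_starts([20, 5, 5]))  # [0, 30, 45]
```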
There are 5 jobs inside a MainBox job, like this:
MainBox
Job1 -> Job2 -> Job3 -> Job4 -> Job5
Job2 is dependent on Job1, Job3 is dependent on Job2, and so on.
The dependency is implemented with the condition attribute. For instance, Job2 has condition: success(Job1).
On starting the MainBox, the jobs run in sequence. Suppose Job3 fails. How can I restart the jobs inside MainBox from the failed Job3?
If I manually force start Job3 then it runs but the dependent Job4 does not get started after the success of Job3.
Actually, in the setup you describe, if any job fails, the box will fail and the downstream jobs will go to the inactive state if "box_terminator: y" is set for the jobs. In that case you will have to complete the downstream jobs manually by force starting each one, because the conditions will not work once the box itself has failed.
To achieve what you need, make a small tweak to the jobs: set "box_terminator: n". Then, on any job's failure, the box will stay in the running state and the downstream jobs will stay in the activated state, waiting for the failed job to succeed. You can then either mark the failed job as success to skip its run or force start it to completion. Since the box is still in the running state, the dependencies will work for each job within it.
But be sure to include "term_run_time: value_in_minutes" for the box, so that the box fails after some time if the failed job is left unattended; otherwise the box will run indefinitely.
I think this should do.
You can achieve your requirement through the steps below. It is an alternative method.
Step 1: Put job1 and job2, which have already run and completed successfully, ON_HOLD.
Step 2: Terminate the box job if it is in the running state.
Step 3: Start the box job.
Step 4: Change job1 and job2 from the ON_HOLD state to the success state.
Step 5: Check the status of the box job; you will see that the job sequence starts running from job3.
Regards,
Kaliraja.
To restart the failed job without impacting the successful jobs, and to have the remaining ACTIVE/INACTIVE jobs run automatically (since they depend on the failed job), you need to use the ON_ICE feature of AutoSys. ON_ICE behaves like a SUCCESS status: if a job is ON_ICE, AutoSys reads it as SUCCESS and does not trigger it again. Since the dependency of job3 is the SU of job2, job3 will run automatically once you have force started your main box (job1 and job2 are skipped because they are in ON_ICE status). These are the steps you can perform:
If your main box is still in RU status, force stop it.
Put job1 and job2 ON_ICE.
Force start your main box.
Once the box has been force started, job3 will run automatically.
Once job3 has turned to SU, job4 will run immediately.
Once job4 has completed its run, your main box should turn to SU, since job1 and job2 are in the ON_ICE state and job3 and job4 are in the SU state.
You can now set job1 and job2 OFF_ICE so that they are kicked off in the next scheduled run of the main box.
Note: The start and end times of job1 and job2 should not change. The start and end times of your main box, job3, and job4 should show the updated time.