Autosys Job setup with both Day and Predecessor conditions

The predecessor job (JOB_A) runs M-F.
JOB_B is set up to run M-TH after JOB_A completes.
We have a job (JOB_C) that needs to run only on Fridays, after JOB_A completes.
Because it is Friday-only, we have to use a date condition with day and time.
JOB_C triggers at that time and does not wait for JOB_A to complete.
(JOB_A depends on another job, so it may run any time between 19:00 and 23:00; its run time may be only 5 minutes.)
Can Autosys handle this?
Regards

Please correct me if I misunderstood your problem.
From my understanding, you want JOB_C to run on Friday, only after JOB_A completes. In that case you can add the following attributes to the JIL of JOB_C:
days_of_week: fr
condition: s(JOB_A)
You can use condition: d(JOB_A) instead if there is no actual dependency on JOB_A other than the order in which the jobs run.
d(JOB_A) is for DONE: the condition is satisfied as soon as JOB_A has completed, irrespective of its exit status/code.
s(JOB_A) is for SUCCESS: the condition is satisfied only if JOB_A has completed successfully.

You might want to have a dummy job, let's say JOB_D, just to indicate Friday. The command does not matter; what matters is the schedule: FRIDAY, 00:00. Add this job as a second predecessor for JOB_C with the condition SUCCESS IN LAST 24 HOURS, like
job: JOB_C
condition: d(JOB_A) & s(JOB_D, 24.00)
...

Related

Is there a way to make an Airflow process start at the exact same time every week?

I have an Airflow process that runs every Sunday at 12:00 am. Is there a way to trigger this process at exactly the same (absolute) time every week, regardless of the previous run's duration or outcome? I see that the start time of the process keeps creeping forward, to the point that after a couple of weeks it now gets triggered a full 16 hours later than the scheduled time. How do I make it start at exactly the same time regardless of the previous run's outcome, or of whether it was previously triggered manually (cron-like behaviour)?
Add the depends_on_past argument to your DAG's default_args. A value of False makes sure new DAG runs are created every interval, without depending on the status of the previous DAG run:
'depends_on_past': False
It might not be necessary, but I recommend restarting your scheduler after making this change.

Airflow: Why is there a start_date for operators?

I don't understand why we need a 'start_date' for the operators (task instances). Shouldn't the one that we pass to the DAG suffice?
Also, if the current time is 7th Feb 2018, 8:30 am UTC, and I now set the start_date of the DAG to 7th Feb 2018, 0:00 am, with the cron expression 30 9 * * * as the schedule interval (daily at 9:30 am, i.e. expected to run within the next hour), will my DAG run today at 9:30 am or tomorrow (8th Feb at 9:30 am)?
Regarding start_date on task instance, personally I have never used this, I always just have a single DAG start_date.
However from what I can see this would allow you to specify certain tasks to start at a different time from the main DAG. It appears this is a legacy feature and from reading the FAQ they recommend using time sensors for that type of thing instead and just having one start_date for all tasks passed through the DAG.
Your second question:
The execution date for a run is always the previous period based on your schedule.
From the docs (Airflow Docs)
Note that if you run a DAG on a schedule_interval of one day, the run stamped 2016-01-01 will be trigger soon after 2016-01-01T23:59. In other words, the job instance is started once the period it covers has ended.
To clarify:
If set on a daily schedule, on the 8th it will execute the 7th.
If set to a weekly schedule to run on a Sunday, the execution date for this Sunday would be last Sunday.
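As a rough illustration of the rule above, here is a minimal sketch in plain Python (no Airflow import) applied to the scenario from the question: a start_date of 2018-02-07 00:00 and a daily 09:30 schedule. The function name first_daily_run is hypothetical, and the model is a simplification that assumes the first run is stamped with the first schedule tick at or after start_date and is triggered one full period later.

```python
from datetime import datetime, timedelta


def first_daily_run(start_date: datetime, hour: int, minute: int):
    """Return (execution_date, trigger_time) for the first run of a daily schedule.

    Simplified model (an assumption, not Airflow's actual scheduler code):
    the first run is stamped with the first schedule tick at or after
    start_date, and it is triggered once the one-day period it covers has ended.
    """
    # First schedule tick at or after start_date.
    tick = start_date.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if tick < start_date:
        tick += timedelta(days=1)
    # The run stamped `tick` starts only after its period is over.
    return tick, tick + timedelta(days=1)


execution_date, trigger_time = first_daily_run(datetime(2018, 2, 7, 0, 0), 9, 30)
print(execution_date)  # 2018-02-07 09:30:00
print(trigger_time)    # 2018-02-08 09:30:00
```

Under these assumptions the DAG from the question would first run tomorrow (8th Feb at 9:30 am), stamped with the 7th.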
Some complex requirements may need specific timings at the task level. For example, I may want my DAG to run each day for a full week before some aggregation logging task starts running, so to achieve this I could set different start dates at the task level.
A bit more useful info... looking through the Airflow DAG class source, it appears that setting the start_date at the DAG level simply means it is passed through to a task when no default task start_date was passed in to the DAG via the default_args dict, and when no specific start_date is defined at the per-task level. So for any case where you want all tasks in a DAG to kick off at the same time (dependencies aside), setting start_date at the DAG level is sufficient.
Just to add to what is already here: a task that depends on other task(s) must have a start date >= the start date of its dependencies.
For example:
if task_a depends on task_b
you cannot have
task_a start_date = 1/1/2019
task_b start_date = 1/2/2019
Otherwise, task_a will not be runnable for 1/1/2019, as task_b will not run for that date and you cannot mark it as complete either.
Why would you want this?
I would have liked this logic for a task, which was an external task sensor waiting for the completion of another dag. But the other dag had a start date after the current dag. Therefore, I didn't want the dependency in place for days when the other dag didn't exist
The likely cause is that the dag parameter is not set on your tasks, as described here:
https://stackoverflow.com/a/61749549/1743724

dependent job not starting due to s condition

Job B
Condition s(Job A, 9.00)
Start time: 9:15
Job A has run within 9 hours. Sometimes the Job A starts running again just before the scheduled start time of Job B (9:15). Because of this, Job B cannot start as Job A is RU instead of SU.
What kind of condition on Job B would solve this issue?
Does Job B really need to depend on the success of Job A? If yes, all you need to do is wait for Job A to complete, and Job B will start automatically after that. If no, you can simply remove Job A from the run condition of Job B and keep Job B's start time at 9:15. Also, do your jobs reside in the same box? Does Job A have multiple run times in a day?

Autosys time condition

I have job A, which has a time condition of 6:00 and job B as its predecessor (job B mostly completes by 4:00). Yesterday job B did not run, and today job A started as soon as job B completed, because the time condition had already been satisfied on the previous day. Is there any way to make this time condition become unsatisfied once that particular day is over, so that there won't be any issues on the next day?
If you are using version R11, you can use a lookback condition (refer to the CA manual for details).
The syntax is
status(job_name, hhhh.mm)
e.g.
condition: success(job_b,12.00)
If job_b has succeeded within the last 12 hours, the condition is satisfied; otherwise it is not, and job_a does not run.
Hope this will do.
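To make the lookback rule concrete, here is a minimal sketch in plain Python that mimics how a condition such as success(job_b,12.00) could be evaluated. It is only a model for reasoning about the behaviour, not Autosys code: the function names parse_lookback and success_within are hypothetical, and special cases (for example a lookback of 0) are deliberately left out.

```python
from datetime import datetime, timedelta
from typing import Optional


def parse_lookback(value: str) -> timedelta:
    """Convert a lookback written as 'hhhh.mm' (e.g. '12.00') to a timedelta."""
    hours, _, minutes = value.partition(".")
    return timedelta(hours=int(hours), minutes=int(minutes or 0))


def success_within(last_success: Optional[datetime], now: datetime, lookback: str) -> bool:
    """True if the last SUCCESS falls inside the lookback window ending at `now`."""
    if last_success is None:
        return False  # the job has never succeeded
    return timedelta(0) <= now - last_success <= parse_lookback(lookback)


now = datetime(2024, 1, 2, 6, 0)  # today, 06:00 (illustrative date)
# job_b succeeded today at 04:00: inside the 12-hour window.
print(success_within(datetime(2024, 1, 2, 4, 0), now, "12.00"))  # True
# job_b last succeeded yesterday at 04:00: 26 hours ago, outside the window.
print(success_within(datetime(2024, 1, 1, 4, 0), now, "12.00"))  # False
```

The second call is the stale-success case: an old SUCCESS no longer satisfies the condition once it falls outside the window.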

Autosys Job Preference - start_times vs conditions?

If we specify a start time as well as conditions in an Autosys job, which takes preference over the other?
For example, let's say the start time for my job A is 06:00 AM, but this job also depends on another job B (via the condition attribute), and job B becomes successful at 06:30 AM.
Will my job start running at 06:00 AM or at 06:30 AM?
My guess is that both the start time and the conditions are considered when running the job, so the job should start at 06:30 AM. But I still want to be sure.
-Gaurav
Start time: the job is activated at its start time, and runs right then if its conditions are already met.
Conditions: the job runs only after the given condition is met.
Example:
Suppose job A's start time is 11:00 PM and its condition is s(B), i.e. B must succeed, and B is another job that completes at 11:30 PM. In this scenario A is started at 11:00 PM but stays in the activated state for 30 minutes, and then moves to the running state.
Your job will be in the Activated state at 6:00 AM and will wait until B completes... but
consider a situation where another job, C, runs before B. If C runs long, B is delayed, and B is not in the Activated state (it is still in SUCCESS from, say, yesterday's run and is waiting for C), your job A will trigger anyway, because it is already 6:00 and B is in SUCCESS (even though that is from yesterday).
To avoid this, you have two options:
1) Make B activated before A
2) Make a condition s(A, )
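The interplay between a start time and a plain s(B) condition can be sketched in a few lines of Python. This is only a toy model of the behaviour described in the answers above, not Autosys code; the function name job_a_start and the dates are hypothetical.

```python
from datetime import datetime


def job_a_start(start_time: datetime, b_last_success: datetime) -> datetime:
    """Both the start time and s(B) must hold, so A starts at the later of the two."""
    return max(start_time, b_last_success)


six_am = datetime(2024, 1, 2, 6, 0)  # job A's start time today (illustrative date)

# B succeeds today at 06:30: A waits and starts at 06:30.
print(job_a_start(six_am, datetime(2024, 1, 2, 6, 30)))  # 2024-01-02 06:30:00

# B has not run yet today; its last SUCCESS is from yesterday at 06:30.
# A plain s(B) is already satisfied, so A starts at 06:00 without waiting.
print(job_a_start(six_am, datetime(2024, 1, 1, 6, 30)))  # 2024-01-02 06:00:00
```

The second call is the delayed-B scenario: because the condition only looks at B's current status, yesterday's success lets A through at its start time.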
