Oozie - How to postpone a workflow until another one completes

I have a bundle that runs several coordinators.
While one specific coordinator (a daily time-scheduled workflow) is running, none of the others may start. Is there a way to postpone or cancel all the other coordinators until it completes?
Example:
C1: Runs once a day at 06:00. Has to run alone!
C2: Runs every 15 minutes (and takes about 5 mins to complete)
C3: Runs once a day at 04:00 (and could take more than 2 hours to complete)
What I need is:
C1 starts only if C2 & C3 are not running; otherwise it waits for them to complete before starting
C2 starts only if C1 is not running; otherwise it cancels itself
If that's not possible, is there a workaround?

Related

Kill job if execution time less than 2 min

I have a chain with several jobs, and sometimes a certain job that normally takes about 2 hours, finishes in less than 2 minutes.
What I would like to do is to kill this job if it ends in less than 2 minutes so that the chain won't proceed.
Is this possible?
Thanks
Well you don't really want to kill anything, do you? If you do, see BMC's note (including video) on using ctmkilljob.
In this case your next job is dependent on 2 things, the predecessor job being ok and the duration of the predecessor job. Add another input condition to your next job (in addition to the existing condition) to represent the >2 minutes duration.
On the job that needs to run for more than 2 minutes, add a Notify when the Exectime exceeds 2 mins (or 60 mins or whatever you decide is the threshold) and get it to shout to an entry in your shout destination table.
On the Control-M Server create a new program entry in your shout destination table and create a small script referenced by the shout. The script should use the ctmcontb utility to create a new ODAT condition that your next job is waiting on. If you have a look at the BMC help note for ctmkilljob (and just substitute in ctmcontb) then you'll see how to do this.

dependent job not starting due to s condition

Job B
Condition s(Job A, 9.00)
Start time: 9:15
Job A has run within 9 hours. Sometimes the Job A starts running again just before the scheduled start time of Job B (9:15). Because of this, Job B cannot start as Job A is RU instead of SU.
What kind of condition on Job B would solve this issue?
Should job B really depend on the success of job A? If yes, all you need to do is wait for job A to complete, and job B will start automatically after that. If no, you can simply remove job A from the run condition of job B and keep job B's start time at 9:15. Also, do your jobs reside in the same box? Does job A run multiple times a day?

Autosys Job Queue

I'm trying to set up [1] an AutoSys job configuration that has a "funnel" job-queue behavior or, as I call it, a 'waterdrops' pattern: each job executes in sequence after a given time interval, and a local job failure does not cascade into a failure of the sequence.
[1] (ask for it to be set up, actually, as I do not control the AutoSys machine)
Constraints
- I have an (arbitrary) N jobs (all executing on success of job A)
  - For this discussion, let's say three (B1, B2, B3)
  - Real production numbers might go upward of 100 jobs.
  - These jobs won't all be created at the same time, so adding a new job should be as painless as possible.
- None of those should execute simultaneously.
  - Not actually a direct problem for our machine
  - But there is a side effect on a remote client machine: the jobs include file transfers, which trigger listeners on the client machine, and it doesn't handle concurrent ones well.
  - Adaptation of the client machine's behavior is, unfortunately, not possible.
- Failure of a job is meaningless to the other jobs.
- There should be a regular delay between jobs
  - This is a soft requirement in that, our jobs being batch scripts, we can always append or prepend a sleep command.
  - I'd rather have a more elegant solution, however, especially if the delay is centralised: a parameter that could be set to greater values, should the need arise.
State of my research
Legend
A(s) : Success status of job
A(d) : Done status of job
Solution 1 : Unfailing sequence
This is the current "we should pick this solution" solution.
A(s) --(delay D)--> B1(d) --(delay D)--> B2(d) --(delay D)--> B3 ...
Pros :
Less bookkeeping than solution 2
Cons :
Bookkeeping of the (current) tailing job
The sequence breaks if a job is put ON HOLD (ON ICE is fine).
Solution 2 : Stairway parallelism
A(s) ==(delay D)==> B1
A(s) ==(delay D x2)==> B2
A(s) ==(delay D x3)==> B3
...
Pros :
Jobs can be put ON HOLD without impact.
Cons :
Bookkeeping to know "who is when" (and what's the next delay to implement)
N jobs triggered at the same time
Underlying race condition created
++ Risk of overlapping job executions, especially if small delays accumulate
Solution 3 : The Miracle Box ?
I have read a bit about Job Boxes, but the specific details elude me.
           --------------
A(s) ====> | B1, B2, B3 |
           --------------
Can we limit the number of concurrent executions of the jobs in a box (i.e. a box-local max_load, if I understand that parameter)?
Pros :
Adding jobs would be painless
Little to no bookkeeping (the box name, to add new jobs, and it's constant)
Jobs can be put ON HOLD without impact (unless I'm mistaken)
Cons :
I'm half-convinced it can't be done (but that's why I'm asking you :) )
... any other problem I have failed to foresee
My questions to SO
Is Solution 3 a possibility, and if yes, what are the specific commands and parameters for implementing it ?
Am I correct in favoring Solution 1 over Solution 2 otherwise [2]?
An alternative solution fitting in the constraints is of course more than welcome!
Thanks in advance,
Best regards
PS: By the way, is all of this a giant race condition manager for the remote machine failing behavior ?
Yes, it is.
[2] I'm aware it skirts a bit toward the "subjective" part of the question rejection rules, but I'm asking with regard to the correctness of the solution(s) against my (arguably) objective constraints.
I would suggest the following:
Put all the jobs (B1,B2,B3) in a box job B.
Create another job (say M1) which would run on success of A. This job will call a shell/perl script (say forcejobs.sh)
The shell script will get a list of all the jobs in B and start a loop with a sleep interval of delay period. Inside loop it would force start jobs one by one after the delay period.
So the outline of the script would be:
    get all the jobs in B
    for each job:
        force start the job
        sleep for the delay interval
At the end of the loop, when all jobs are successfully started, you can use an infinite loop and keep checking status of jobs. Once all jobs are SU/FA or whatever, you can end the script and send the result to you/stdout and finish the job M1.
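The start-and-sleep loop described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested AutoSys integration: the function names start_in_sequence and force_start are made up for this example, the list of jobs in box B is passed in explicitly rather than queried from AutoSys, and the sendevent call assumes the AutoSys client is installed and on PATH. The sketch sleeps between jobs rather than after each one, and it leaves out the final status-polling step.

```python
import subprocess
import time
from typing import Callable, Iterable, List


def force_start(job_name: str) -> None:
    """Force-start one job (assumes the AutoSys `sendevent` CLI is on PATH)."""
    subprocess.run(
        ["sendevent", "-E", "FORCE_STARTJOB", "-J", job_name],
        check=True,
    )


def start_in_sequence(
    jobs: Iterable[str],
    delay_seconds: float,
    start_job: Callable[[str], None] = force_start,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Start jobs one by one, waiting delay_seconds between two starts.

    A job that fails to start is recorded and skipped, so one failure
    does not stop the rest of the sequence. Returns the failed job names.
    """
    failed: List[str] = []
    for index, job in enumerate(jobs):
        if index > 0:
            sleep(delay_seconds)  # central delay parameter D
        try:
            start_job(job)
        except Exception:
            failed.append(job)
    return failed
```

Calling start_in_sequence(["B1", "B2", "B3"], 300) would then try to start the three jobs five minutes apart; the start_job and sleep parameters exist mainly so the loop can be exercised without an AutoSys installation.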

Oozie coordinator-app: execute job every Nth minute divisible by M

I have a Hive script that I am executing using an Oozie coordinator every 10 minutes. When I launch my Oozie coordinator-app, say at 08:03, the first workflow starts at that time, the next at 08:13, then 08:23, and so on.
What I want is to execute the workflow every clock time hh:mm, where mm is divisible by 10. Assuming the same scenario above, what I want to happen is this: the first workflow will execute at 08:10, and then 08:20, and so on.
How do I do this in Oozie? How about when every 5 minutes (The last m of the minute is either 5 or 0)? Thanks for your input.
In order to run a coordinator job at a frequency, you can use the following directives
<coordinator-app name="app" frequency="10" start="2015-07-10T12:00Z" end="2016-01-01T00:00Z" timezone="UTC" xmlns="uri:oozie:coordinator:0.1">
This would run every 10 minutes, starting exactly at 12:00 UTC on the given start date. The same goes for running every 5 minutes: just replace frequency="10" with frequency="5". To have it run every Nth minute divisible by M, you have to make sure that the minute of your start parameter is itself divisible by M.
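If the start value is generated at submission time, it can be rounded up to the next aligned minute before being written into the coordinator definition. A minimal sketch, assuming a helper named next_aligned_start (a made-up name, not part of Oozie) and an interval that divides 60:

```python
from datetime import datetime, timedelta, timezone


def next_aligned_start(now: datetime, every_minutes: int) -> str:
    """Return the first time at or after `now` whose minute is a multiple
    of every_minutes, formatted like the coordinator's start attribute.

    `now` must be timezone-aware; the result is expressed in UTC.
    """
    if every_minutes <= 0 or 60 % every_minutes != 0:
        raise ValueError("every_minutes must be a positive divisor of 60")
    utc = now.astimezone(timezone.utc)
    # Round down to the previous aligned minute...
    aligned = utc.replace(second=0, microsecond=0)
    aligned -= timedelta(minutes=aligned.minute % every_minutes)
    # ...then step forward one interval if that moment is already past.
    if aligned < utc:
        aligned += timedelta(minutes=every_minutes)
    return aligned.strftime("%Y-%m-%dT%H:%MZ")
```

For a launch at 08:03 UTC with a 10-minute frequency this gives a start of 08:10, so the following runs fall on 08:20, 08:30, and so on.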
Another option, if you are using a more recent version of Oozie (4.1.0), would be to use the cron-like scheduler. This lets you schedule Oozie coordinators with cron-style expressions, if you're familiar with cron. See http://blog.cloudera.com/blog/2014/04/how-to-use-cron-like-scheduling-in-apache-oozie/ and https://issues.apache.org/jira/browse/OOZIE-1306

Autosys Job Preference - start_times vs conditions?

If we specify start_times as well as conditions in an AutoSys job, which one takes precedence?
For example, let's say the start time for my job A is 06:00 AM, but this job also depends on another job B (via the conditions param), and job B becomes successful at 06:30 AM.
So will my job start running at 06:00 AM or 06:30 AM?
My guess is that both the start time and the conditions are considered when running the job, so the job should start at 06:30 AM. But I still want to be sure.
-Gaurav
Start_time: at the start time the job is activated, and it runs right away if its conditions are already met.
Conditions: the job runs only once the given condition is met.
Example:
Suppose job A's start time is 11:00 PM and its condition is s(B), i.e. B must succeed, where B is another job that completes at 11:30 PM. In this scenario A is started at 11:00 PM but stays in the activated state for 30 minutes, and only then goes into the running state.
Your job will be in the Activated state at 6:00 AM and will wait until B completes... but
consider a situation where another job C runs before B. If C runs long, B is delayed and is not in the Activated state (it is still in success from an earlier run, say yesterday's, and is waiting for C), so your job A will trigger anyway, because it is already 6:00 and B is in success (even if from yesterday).
In order to avoid this, you have two ways:
1) Make B activated before A
2) Make a condition s(A, )

Resources