Oozie frequency 1 minute interval? How to set up - oozie

This may be a simple question, but I need to know.
How do I set an Oozie coordinator to run at a 1-minute frequency?
I need to run a shell script at 1-minute intervals.

Oozie frequency representation:
frequency="${coord:minutes(1)}"
Update:
Cron-style representation:
frequency="0/1 * * * *"

Related

R: taskscheduleR for consecutive task

I have around 10 automation tasks in R every day. Some tasks depend on others. Normally, I use the taskscheduleR package to schedule each task at a different time, 30 minutes apart, e.g. 8:00, 8:30, ..., 12:00. My problems are:
1) For some tasks, the run time ranges from 5 to 25 minutes, so a 30-minute gap is too much.
2) Due to some infrequent errors, some tasks take more than 30 minutes to run.
Therefore, I just want to set up a batch of tasks that run to completion one after another. I would prefer a solution that can be run in R.
Thanks in advance.
I just came up with the answer; I never thought it would be that easy.
Just treat the 10 tasks as parts of one big task, and schedule the big task:
# big task: bigtask.R
source('task1.R')
source('task2.R')
....
source('task10.R')
All we need to do is schedule 'bigtask.R' using the taskscheduleR add-in.

Find optimal time schedule for nodes in a directed graph

So I have a DAG representing a project and each node is a task, which has a variable that says how long it takes to finish that task.
We can assume that it is possible to work on any number of tasks at the same time.
How do I find an optimal schedule of tasks, i.e. one that results in the earliest completion of the project?
If you can work on any number of tasks in parallel, then an optimal schedule can easily be computed. The starting time for a task in an optimal schedule can recursively be defined as the maximum of the optimal end times (i.e. optimal start time plus duration) of all its predecessor nodes in the graph. Tasks without predecessors all start at time 0 (or at whatever time you want them to start).
This recursion can be computed iteratively by iterating over the tasks in a topological order. In pseudocode, the algorithm could look like this:
// Computes the optimal start time for the tasks in the DAG.
// Input:  dag      : DAG with tasks as nodes
//         duration : array with the duration of each task
// Output: array with the optimal start time of each task
def computeSchedule(dag, duration)
    starttime := [0 for each node in dag]
    for t in nodesInTopologicalOrder(dag):
        // a task without predecessors keeps its initial start time of 0
        if predecessors(t) is not empty:
            starttime[t] = max(starttime[p] + duration[p] for p in predecessors(t))
    return starttime
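As a rough illustration, the pseudocode above could be written in Python along the following lines. This is only a sketch: it assumes the DAG is given as a dictionary mapping each task to its predecessors, uses the standard library's graphlib.TopologicalSorter for the topological order, and the four-task project at the bottom is made up for the example.

```python
from graphlib import TopologicalSorter


def compute_schedule(predecessors, duration):
    """Return the earliest start time of every task.

    predecessors maps each task to the tasks that must finish before it starts;
    duration maps each task to how long it takes.
    """
    start = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(predecessors).static_order():
        start[task] = max(
            (start[p] + duration[p] for p in predecessors.get(task, ())),
            default=0,  # tasks without predecessors start at time 0
        )
    return start


# Hypothetical project: b and c depend on a, d depends on b and c.
predecessors = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
duration = {"a": 3, "b": 2, "c": 4, "d": 1}

start = compute_schedule(predecessors, duration)
# The project finishes when the last task finishes.
finish = max(start[t] + duration[t] for t in start)
print(start, finish)
```

The earliest completion time of the whole project is the largest start time plus duration over all tasks, which is what finish computes.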

ISO 8601 Repeating Intervals without Date

With ISO8601, is there a way to specify a repeating interval which starts at a given time for any day, and repeats over time in that day?
For example, does the following hold:
R2/T09:00:00Z/PT1H = R/2000-01-01T09:00:00/P1D + R/2000-01-01T10:00:00/P1D?
Or is the former not correct under the standard?
The motivation behind this is to run a task at 9am and 10am every day.
No, ISO 8601 cannot express irregular repetitions. You would need to evaluate/run both of those expressions.
Cron expressions would be a better option, as they are widely supported, especially for running tasks. You can find cron expression builders on the web and a library for every language (plus OS support via crontab on Unix systems). The expression 0 0 9,10 ? * * * (in the seven-field Quartz format, which starts with a seconds field and ends with a year field) would handle your use case and run at 9 and 10 am every day of every year.
Sorry for the response 2 years later.

How do Cron "Steps" Work?

I'm running into a situation where a cron job I thought was running every 55 minutes is actually running at 55 minutes after the hour and at the top of the hour. Actually, it's not a cron job, but it's a PHP scheduling application that uses cron syntax.
When I ask this application to schedule a job every 55 minutes, it creates a crontab line like the following.
*/55 * * * *
This crontab line does not end up running a job every 55 minutes. Instead, a job runs at 55 minutes after the hour and at the top of the hour. I do not want this. I've run this through a cron tester, and it verifies that the undesired behavior is correct cron behavior.
This led me to look up what the / actually means. When I looked at the cron manual, I learned the slash indicates "steps", but the manual itself is a little fuzzy on what that means:
Step values can be used in conjunction with ranges. Following a range with "/<number>" specifies skips of the number's value through the range. For example, "0-23/2" can be used in the hours field to specify command execution every other hour (the alternative in the V7 standard is "0,2,4,6,8,10,12,14,16,18,20,22"). Steps are also permitted after an asterisk, so if you want to say "every two hours", just use "*/2".
The manual's description ("specifies skips of the number's value through the range") is a little vague, and the "every two hours" example is a little misleading (which is probably what led to the bug in the application)
So, two questions:
How does the unix cron program use the "step" information (the number after a slash) to decide if it should skip running a job? (modular division? If so, on what? With what conditions deciding a "true" run, and which decisions not? Or is it something else?)
Is it possible to configure a unix cron job to run every "N" minutes?
Step values can be used in conjunction with ranges. Following a range
with "/<number>" specifies skips of the number's value through the range. For
example, "0-23/2" can be used in the hours field to specify command
execution every other hour (the alternative in the V7 standard is
"0,2,4,6,8,10,12,14,16,18,20,22"). Steps are also permitted after an
asterisk, so if you want to say "every two hours", just use "*/2".
The "range" being referred to here is the range given before the /, which is a subrange of the range of times for the particular field. The first field specifies minutes within an hour, so */... specifies a range from 0 to 59. A first field of */55 specifies all minutes (within the range 0-59) that are multiples of 55 -- i.e., 0 and 55 minutes after each hour.
Similarly, 0-23/2 or */2 in the second (hours) field specifies all hours (within the range 0-23) that are multiples of 2.
If you specify a range starting other than at 0, the number (say N) after the / specifies every Nth minute/hour/etc starting at the lower bound of the range. For example, 3-23/7 in the second field means every 7th hour starting at 03:00 (03:00, 10:00, 17:00).
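To make the rule concrete, here is a minimal Python sketch of how a single field with a step could be expanded into the values it matches. It is not cron's actual implementation; the function name expand_field and its low/high parameters are made up for illustration, and it only handles "*", "a-b", single values, comma lists, and "/step".

```python
def expand_field(field, low, high):
    """Expand one cron field such as "*/55" or "3-23/7" into the values it matches.

    low and high are the bounds of the field (0 and 59 for minutes, 0 and 23 for hours).
    """
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(rng)
        # Count upward from the start of the range in increments of step.
        values.update(range(start, end + 1, step))
    return sorted(values)


print(expand_field("*/55", 0, 59))    # minutes field
print(expand_field("3-23/7", 0, 23))  # hours field
```

The first call yields minutes 0 and 55, and the second yields hours 3, 10, and 17, matching the examples above. Because the count restarts at the lower bound of the range every hour (or day), a step that does not divide the range evenly leaves a shorter gap at the end.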
This works best when the interval you want happens to divide evenly into the next higher unit of time. For example, you can easily specify an event to occur every 1, 2, 3, 4, 5, 6, 10, 12, 15, 20, or 30 minutes, or every 1, 2, 3, 4, 6, or 12 hours. (Thank the Babylonians for choosing time units with so many nice divisors.)
Unfortunately, cron has no concept of "every 55 minutes" within a time range longer than an hour.
If you want to run a job every 55 minutes (say, at 00:00, 00:55, 01:50, 02:45, etc.), you'll have to do it indirectly. One approach is to schedule a script to run every 5 minutes; the script then checks the current time, and does its work only once every 11 times it's called.
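One possible way to write such a check is sketched below in Python, assuming the script is started by a */5 crontab line. Instead of counting invocations, it tests whether the number of whole minutes since the Unix epoch is a multiple of 55, which is true on exactly every 11th five-minute mark. The name is_due and the placeholder print are illustrative only.

```python
import time

INTERVAL_MINUTES = 55


def is_due(epoch_seconds, interval_minutes=INTERVAL_MINUTES):
    """Return True when the minute containing epoch_seconds is a multiple of the interval."""
    minutes_since_epoch = int(epoch_seconds) // 60
    return minutes_since_epoch % interval_minutes == 0


if __name__ == "__main__":
    if is_due(time.time()):
        print("doing the work")  # placeholder for the real job
```

With this approach the runs are exactly 55 minutes apart, but they are counted from the epoch rather than from midnight, so they will not land on 00:00, 00:55, 01:50 on any particular day.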
Or you can use multiple lines in your crontab file to run the same job at 00:00, 00:55, 01:50, etc. -- except that a day is not a multiple of 55 minutes. If you don't mind having a longer or shorter interval once a day, week, or month, you can write a program to generate a large crontab with as many entries as you need, all running the same command at a specified time.
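A generator for such a crontab might look like the following Python sketch. It assumes the schedule restarts at 00:00 every day, and the command path is a made-up placeholder.

```python
def crontab_lines(interval_minutes, command):
    """Return one crontab line per run, starting at 00:00 and restarting each day."""
    lines = []
    for minute_of_day in range(0, 24 * 60, interval_minutes):
        hour, minute = divmod(minute_of_day, 60)
        lines.append(f"{minute} {hour} * * * {command}")
    return lines


lines = crontab_lines(55, "/path/to/job.sh")  # hypothetical command path
print("\n".join(lines))
```

For a 55-minute interval this produces 27 entries per day. The last one fires at 23:50, so the gap before the next day's 00:00 run is only 10 minutes, which is the once-a-day irregularity mentioned above.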
I came across this website that is helpful with regard to cron jobs.
https://crontab.guru
And specific to your case with */55:
https://crontab.guru/#*/55_*_*_*_*
It helped to get a better understanding of the concept behind it.
There is another tool named at that should be considered. It can be used instead of cron to achieve what the topic starter wants. As far as I remember, it is pre-installed in OS X but it isn't bundled with some Linux distros like Debian (simply apt install at).
It runs a job at a specific time of day and that time can be calculated using a complex specification. In our case the following can be used:
You can also give times like now + count time-units, where the time-units can be minutes, hours, days, or weeks and you
can tell at to run the job today by suffixing the time with today and to run the job tomorrow by suffixing the time with tomorrow.
The script every2min.sh below is executed every 2 minutes. Each time it runs, it schedules its own next execution:
#!/bin/sh
at -f ./every2min.sh now + 2 minutes
echo "$(date +'%F %T') running..." >> /tmp/every2min.log
Which outputs
2019-06-27 14:14:23 running...
2019-06-27 14:16:00 running...
2019-06-27 14:18:00 running...
Since at has no "seconds" unit, the execution time is rounded to the full minute after the first run. For the task in question (a 55-minute interval), that should not be a big problem.
There might also be security considerations:
For both at and batch, commands are read from standard input or the file specified with the -f option and executed. The working directory, the environment (except for the variables BASH_VERSINFO, DISPLAY, EUID, GROUPS, SHELLOPTS, TERM, UID, and _) and the umask are retained from the time of invocation.
This is the easiest way I've seen so far to schedule something to run every X minutes.

cron job running every minute, starting on the half hour

Scenario
We have a use-case where we need to run a program every minute for some number of hours (let's say 8, but ending on an hour boundary), but we need the program to start running at 6:30am (on the half hour)
I was recently asked if this scenario is possible in a single cronstring, and I can't seem to find a way to do it since the minute area is needed to match both the half hour start time and the execute every minute.
I can see how to do it in two cronstrings, 1 from 6:30am - 7:00am then another from 7:00am-3:00pm
30-59 6 * * * /path/to/my/program.sh
*/1 7-15 * * * /path/to/my/program.sh
Is there any way to combine these into a single cronstring?
You could set the clock on your PC to 7:00 or 6:00 when it is actually 6:30, if that doesn't matter to you.
