How to find the total time taken by a job in Oozie

Is there an Oozie command line option, or some other way, to figure out the start and end time of Oozie jobs submitted through a coordinator?

Yes, you can see the start and end time of a coordinator job:
$ oozie job -oozie http://localhost:8080/oozie -info 14-20090525161321-oozie-joe
The same -info command works for workflow, coordinator and bundle jobs.
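If you need the total elapsed time rather than just the two timestamps, you can wrap that CLI call in a small script. The sketch below is untested and makes a few assumptions: the oozie client is on the PATH, the -info output for a workflow job contains "Started" and "Ended" lines in a "YYYY-MM-DD HH:MM GMT" style format (check what your version actually prints), and the URL and job ID are placeholders to adjust.

import subprocess
from datetime import datetime

OOZIE_URL = "http://localhost:8080/oozie"   # placeholder, adjust for your cluster
JOB_ID = "14-20090525161321-oozie-joe"      # the job you are interested in

# Ask the Oozie CLI for the job details and capture its output.
info = subprocess.check_output(
    ["oozie", "job", "-oozie", OOZIE_URL, "-info", JOB_ID], text=True)

times = {}
for line in info.splitlines():
    for field in ("Started", "Ended"):
        if line.startswith(field):
            # e.g. "Started       : 2016-07-01 10:00 GMT"
            value = line.split(":", 1)[1].strip().replace(" GMT", "")
            times[field] = datetime.strptime(value, "%Y-%m-%d %H:%M")

if "Started" in times and "Ended" in times:
    print("Total time:", times["Ended"] - times["Started"])
else:
    print("Job has not finished yet, or the -info output format differs.")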

Related

Autosys job to auto-update itself as SUCCESS if no response from CMD

In my Autosys box scheduled to run every week, I have 2 jobs:
Job1 - Call up a shell script to generate a file
Job2 - Call up a shell script to transfer the generated file
What happened is that for Job2, even though the file was successfully transferred, there was no exit code from the shell script. This left Job2 and the box stuck in the RUNNING state, which prevented the box from running at the next week's schedule.
The ideal way is to amend the transfer shell script (in Job2) to return a proper exit code. But I do not have access to the shell script to make any change.
In JIL, is it possible to achieve either one of the following:
immediate after Job2 CMD execution, mark Job2 as success, OR
after X minutes of Job2 CMD execution, mark Job2 as success
Adding the term_run_time attribute to the JIL of Job2 will terminate the job if it has been running for more than the number of minutes specified.
For example, term_run_time: 60 sets a 60-minute termination timer.

Report of oozie jobs

How can we get status of Oozie jobs running daily? We have many jobs running in Oozie coordinator and currently we are monitoring through Hue/Oozie browser.
Is there any way we can get a single log file which contains coordinator name/workflow name with date and status? Can we write any program or script to achieve this?
You can use the commands below and put them into a script that runs daily via cron.
oozie jobs -oozie http://localhost:11000/oozie -filter status=RUNNING -len 2
oozie jobs -oozie http://localhost:11000/oozie -filter startCreatedTime=2016-06-28T00:00Z\;endcreatedtime=2016-06-28T10:00Z -len 2
Basically you are using Oozie's jobs API with the -filter option to get information about workflows/coordinators/bundles. The -filter option supports several criteria, such as status, startCreatedTime and name.
By default it returns workflow records; if you want coordinator or bundle information, use the -jobtype parameter with the value coord or bundle.
Let me know if you need anything more specific. The Oozie documentation is a little outdated for this feature.
Command to get status of all running oozie coordinators
oozie jobs -jobtype coordinator -filter status=RUNNING -len 1000 -oozie http://localhost:11000/oozie
Command to get status of all running oozie workflows
oozie jobs -filter status=RUNNING -len 1000 -oozie http://localhost:11000/oozie
Command to get status of all workflows for a specific coordinator ID
oozie job -info COORDINATOR_ID_HERE
Based on these queries you can write required scripts to get what you want.
Terms explanation:
oozie: the command that invokes the Oozie CLI
job/jobs: the API (sub-command) being called
-len: number of workflows/coordinators to display
-oozie: parameter specifying the Oozie URL
-filter: parameter specifying a list of filters
Complete documentation: https://oozie.apache.org/docs/3.1.3-incubating/DG_CommandLineTool.html
The command below worked for me.
oozie jobs -oozie http://xx.xxx.xx.xx:11000/oozie -jobtype wf -len 300 | grep 2016-07-01 > OozieJobsStatus_20160701.txt
However, you still need to parse the file.
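If the goal is a single daily file with the job name, ID and status, the parsing can be scripted as well. The following is only a sketch (untested): it assumes the default fixed-width table that oozie jobs prints, where columns are separated by runs of spaces and the first three columns are Job ID, App Name and Status; verify the layout on your Oozie version and adjust the column handling accordingly.

import re
import subprocess
from datetime import date

OOZIE_URL = "http://localhost:11000/oozie"   # placeholder, adjust for your cluster

def job_rows(jobtype, length=1000):
    # Run `oozie jobs` for the given job type and keep only real data rows.
    out = subprocess.check_output(
        ["oozie", "jobs", "-oozie", OOZIE_URL, "-jobtype", jobtype, "-len", str(length)],
        text=True)
    rows = []
    for line in out.splitlines():
        cols = re.split(r"\s{2,}", line.strip())
        # Data rows start with an Oozie job ID (seven digits then a dash);
        # header and separator lines are skipped.
        if len(cols) >= 3 and re.match(r"^\d{7}-", cols[0]):
            job_id, name, status = cols[0], cols[1], cols[2]
            rows.append("%s | %s | %s" % (name, job_id, status))
    return rows

if __name__ == "__main__":
    report_file = "OozieJobsStatus_%s.txt" % date.today().strftime("%Y%m%d")
    with open(report_file, "w") as f:
        for jobtype in ("wf", "coordinator"):
            f.write("== %s ==\n" % jobtype)
            f.write("\n".join(job_rows(jobtype)) + "\n")

The created/started columns sit further to the right in the same rows, so they can be appended in the same way if you also need the dates.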

How to re-run a failed oozie job

I want to know about failed oozie job execution.
Does it run from start or run from the failed point?
I have an MR and a Pig task to be executed.
If the Pig job fails, does it start again from the MR job?
How to re-run the failed oozie jobs?
It will start from the failed action.
oozie job -oozie <oozie_url> -rerun <job_id> -config <job.properties>
Add the following property to job.properties if it is not already present. The -config parameter is needed only if you are updating the job.properties file.
oozie.wf.rerun.failnodes=true
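If you have several failed workflows to re-run, the same idea can be scripted. The snippet below is only a sketch (untested): the Oozie URL and the job.properties path are placeholders, and it assumes that properties file already contains oozie.wf.rerun.failnodes=true as described above.

import subprocess

OOZIE_URL = "http://localhost:11000/oozie"   # placeholder, adjust for your cluster
PROPERTIES = "/path/to/job.properties"       # placeholder

# List failed jobs; workflow job IDs end in "-W", which also filters out
# the header and separator lines of the CLI output.
out = subprocess.check_output(
    ["oozie", "jobs", "-oozie", OOZIE_URL, "-filter", "status=FAILED", "-len", "100"],
    text=True)

for line in out.splitlines():
    parts = line.split()
    if parts and parts[0].endswith("-W"):
        # Rerun each failed workflow from its failed node.
        subprocess.call(["oozie", "job", "-oozie", OOZIE_URL,
                         "-rerun", parts[0], "-config", PROPERTIES])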

Is there a way to direct Oozie to kill all jobs?

I have tried
oozie job -oozie http://sandbox.hortonworks.com:11000/oozie -config ./job.properties -kill *
...to no effect. I have done a few Google searches and checked Oozie's documentation, and there does not appear to be a command for this.
Would any one know of a way to accomplish this?
It seems that recent versions of Oozie (tested on 4.2) have made this a lot easier.
Here is a one-liner that I now use to kill all jobs that I created.
oozie jobs -oozie http://myserver:11000/oozie -kill -filter user=dennis -jobtype bundle & oozie jobs -oozie http://myserver:11000/oozie -kill -filter user=dennis -jobtype coordinator & oozie jobs -oozie http://myserver:11000/oozie -kill -filter user=dennis
First it kills all bundles, then it kills all coordinators and finally all workflows. Note that I set a filter to my own username, as it appears to be mandatory to have a filter set.
Update:
As mentioned in the comments by @Nutle:
Worth noting that (on 4.3 and win7 x64) the suggested command returned
a syntax error, solved by enclosing the filter terms in quotes, i.e.
oozie jobs <...> -kill -filter "user=dennis"
To my knowledge there is no such command.
Try a shell script that lists the jobs (sadly, workflow jobs, coordinators and bundles have to be listed separately), greps out the captions and fancy formatting, cuts out the job IDs and kills them one by one.
Maybe you can use the code below in Python. It lists all running coordinator jobs in Oozie and then kills them iteratively. You can modify it to target other statuses as well. Define your oozieURL parameter (like http://localhost:11000/oozie).
import os, commands

def killAllRunningJobs(oozieURL):
    # List running coordinators and keep only the first column (the job ID).
    runningJobs = commands.getoutput("oozie jobs -oozie " + oozieURL +
        " -jobtype coordinator | grep -i RUNNING | awk '{print $1}'").splitlines()
    print "Current Running Co-ordinator Jobs : " + str(runningJobs)
    for job in runningJobs:
        os.system("oozie job -oozie " + oozieURL + " -kill " + job)
Or simply kill by filter, as in the one-liner above:
oozie jobs -oozie <oozie_url> -filter status=RUNNING -kill

Difference between Cron and Crontab?

I am not able to understand the answer to the question "What's the difference between cron and crontab?" Are they both schedulers, with one executing files once and the other executing them on a regular interval, OR does cron schedule a job while crontab stores them in a table or file for execution?
The Wikipedia page for cron mentions:
Cron is driven by a crontab (cron table) file, a configuration file
that specifies shell commands to run periodically on a given schedule.
But wiki.dreamhost for crontab mentions:
The crontab command, found in Unix and Unix-like operating systems, is
used to schedule commands to be executed periodically. It reads a
series of commands from standard input and collects them into a file
known as a "crontab" which is later read and whose instructions are
carried out.
Specifically, when I schedule a job to be repeated (quoting from the wiki):
1 0 * * * printf "" > /var/log/apache/error_log
or execute a job only once:
at -f myScripts/call_show_fn.sh 1:55 2014-10-14
Am I doing a cron function in both commands, which is pushed into the crontab, OR is the first one a crontab entry and the second a cron function?
cron is the general name for the service that runs scheduled actions. crond is the name of the daemon that runs in the background and reads crontab files. A crontab is a file containing jobs in the format
minute hour day-of-month month day-of-week command
crontabs are normally stored by the system under /var/spool/cron/ (for example /var/spool/cron/crontabs/<username> on Debian-based systems). These files are not meant to be edited directly. You can use the crontab command to invoke a text editor (whatever you have defined in the EDITOR environment variable) to modify a crontab file.
There are various implementations of cron. Commonly there will be per-user crontab files (accessed with the command crontab -e) as well as system crontabs in /etc/cron.daily, /etc/cron.hourly, etc.
In your first example you are scheduling a job via a crontab. In your second example you're using the at command to queue a job for later execution.
