I want to run an Oozie coordinator with the start time set to sysdate. How do I do that?
Is it possible to use sysdate as the start date? Will it catch up?
You can make the coordinator's "start" attribute refer to a variable, startTime, and then override its value with sysdate from the command line, for example:
oozie job -run -config ./coord.properties -DstartTime=`date -u "+%Y-%m-%dT%H:00Z"`
Adjust the time format if your system is not using the UTC time zone.
Sample coordinator job XML:
<coordinator-app name="my-coord"
frequency="${frequency}" start="${startTime}" end="${end}" timezone="UTC"
xmlns="uri:oozie:coordinator:0.4">
<action>
<workflow> ...
Coordinator properties file coord.properties:
...
startTime=2014-05-19T22:00Z
end=2015-01-19T22:08Z
frequency=60 ...
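If the start time is computed in a script rather than with `date`, the same value can be produced as in the minimal sketch below. The function name coordinator_start_time is hypothetical; the format string mirrors the one passed to `date -u` above.

```python
from datetime import datetime, timezone


def coordinator_start_time(now=None):
    """Return a UTC timestamp truncated to the hour, e.g. 2014-05-19T22:00Z."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Convert to UTC first so the result matches what `date -u` would print.
    return now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:00Z")


# Value to pass on the command line as -DstartTime=...
start_arg = f"-DstartTime={coordinator_start_time()}"
```

The minutes are fixed at 00, so the coordinator starts at the top of the current hour, the same as with the shell command.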
Suppose I have an airflow dag file that creates a graph like so...
def get_current_info(filename):
    current_info = {}
    # <fill in info in current_info relevant for today's date for given file>
    return current_info

files = [
    get_current_info("file_001"),
    get_current_info("file_002"),
    ....
]
for f in files:
    # <some BashOperator bo1 using f's current info dict>
    # <some BashOperator bo2 using f's current info dict>
    ....
    bo1 >> bo2
    ....
Since the values in the current_info dict used to define the DAG change periodically (here, daily), I would like to know by what process and on what schedule the DAG definition gets updated. (I print the current_info values on each run and they appear to be updating, but I am curious how and when exactly this happens.)
When does an Airflow DAG definition get evaluated? Is this referenced anywhere in the docs?
The DAGs are evaluated in every run of the scheduler.
This article describes how the scheduler works and at what stage the DAG files are picked up for evaluation.
After some discussion on the Airflow email list, it turns out that Airflow builds the DAG for each task when it is run, so each task includes the overhead of building the DAG again (which in my case was very significant).
See more details on this here: https://stackoverflow.com/a/59995882/8236733
I have an hourly shell script job that takes a date and hour as input parameters. The date and hour are used to construct the input path from which the job DAG fetches its data. When a job fails and I need to rerun it (by clicking "Clear" on the failed task node to reset its status and trigger a new run), how can I make sure the date and hour used for the rerun are the same as for the failed run, given that the rerun could happen in a different hour from the original run?
You have 3 options:
Hover over the failed task you are about to clear; the tooltip shows a value with the key Run:, which is the task's execution date and time.
Click the failed task you are about to clear; the heading of the popup that contains the Clear option reads [taskname] on [executiondatewithtime].
Open the task log; the first line after the attempt count includes a string of the form Executing <Task([TaskName]): task_id> on [ExecutionDate withTime]
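Clearing a task re-runs it within the same DAG run, so the execution date shown in those three places does not change on a rerun. A job that derives its date and hour from that value instead of from the wall clock therefore reads the same input both times. A minimal sketch, assuming a hypothetical /data/input/<date>/<hour> layout and a hypothetical helper name input_path:

```python
from datetime import datetime


def input_path(execution_date, base="/data/input"):
    """Build the hourly input path from the run's execution date (hypothetical layout)."""
    return f"{base}/{execution_date:%Y-%m-%d}/{execution_date:%H}"


# The execution date of the failed run, as shown in the UI or the task log.
original_run = datetime(2020, 1, 31, 14, 0)
print(input_path(original_run))
```

The path depends only on the execution date passed in, so it comes out the same no matter when the rerun is started.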
oozie job -info $coordinator
This command gives you the details of the workflows that belong to the coordinator and prints their ID, status, created time and nominal time.
I'm trying to print the workflows of the oozie coordinator which are executed after a specific date.
As per their documentation,
-filter <arg> <key><comparator><value>[;<key><comparator><value>]*
(All Coordinator actions satisfying the filters will be retrieved).
key: status or nominaltime
comparator: =, !=, <, <=, >, >=. = is used as OR and others as AND
status: values are valid status like SUCCEEDED, KILLED etc. Only = and != apply for status.
nominaltime: time of format yyyy-MM-dd'T'HH:mm'Z'
From this, I understand that the status key supports only "=" or "!=", whereas the nominaltime key supports all comparators.
But when I try to use it, I get the error below.
[hadoop#xx ~]$ oozie job -info $coord -filter status nominalTime>2018-09-01'T'08:00'Z'
Error: E0421 : E0421: Invalid job filter [nominalTime], filter should be of format <key><comparator><value> pairs
The same command works if I use "=" or "!=", but throws an error with the other comparators (>, <, >=, <=).
Kindly suggest how to fix this or any other alternatives for this use case.
Quote the whole filter expression so that the shell does not treat > as output redirection:
-filter 'nominaltime>2019-11-01T01:00Z'
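When the command is assembled by a script, the quoting can be left to the standard library. The sketch below only builds the command string. The helper name oozie_info_command and the coordinator ID my-coord-id are placeholders, and shlex.join needs Python 3.8 or later.

```python
import shlex


def oozie_info_command(coord_id, nominal_after):
    """Build an `oozie job -info` command line with a safely quoted -filter value."""
    filter_expr = f"nominaltime>{nominal_after}"
    args = ["oozie", "job", "-info", coord_id, "-filter", filter_expr]
    # shlex.join quotes any argument containing shell metacharacters such as '>'.
    return shlex.join(args)


print(oozie_info_command("my-coord-id", "2019-11-01T01:00Z"))
# prints: oozie job -info my-coord-id -filter 'nominaltime>2019-11-01T01:00Z'
```

Passing the argument list directly to subprocess.run without a shell would avoid the quoting problem altogether, since no shell parses the > character.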
I'm working on a WebAPI that schedules events and appointments. To do that, I am using the chroniton.NetCore package. I am able to set the schedule using a cron expression or other functions, but is there any way to set a stop date or termination date for the schedule?
var schedule = new EveryXTimeSchedule(TimeSpan.FromSeconds(5));
var scheduledJob = singularity.ScheduleParameterizedJob(
schedule, job, "Hello World", true); //starts immediately
I wish to stop this job the next day at midnight without using Task.Delay (since the API is stateless).
Thank you.
The current version of chroniton.NETCore does not support this behaviour out of the box. Since you know the time range, you can implement the logic yourself in one of two ways.
ScheduleParameterizedJob() returns an IScheduledJob, which can be used to cancel the job if needed. You can create an additional RunOnceSchedule that stops your first job the next day at midnight using
Singularity.Instance.StopScheduledJob(IScheduledJob job);
Alternatively, you can inherit from EveryXTimeSchedule and override the NextScheduledTime method to return Chroniton.Constants.Never if the next scheduled time is >= the next day at midnight.
Best way to create a midnight DateTime in C#:
DateTime nextDayAtMidnight = DateTime.Today.AddDays(1);
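The cutoff check behind the second approach does not depend on the library. The sketch below shows only that logic, in Python rather than C#. It does not use chroniton's API: NEVER stands in for Chroniton.Constants.Never, and both function names are hypothetical.

```python
from datetime import datetime, timedelta

NEVER = None  # stand-in for Chroniton.Constants.Never (assumption, not the real type)


def next_day_at_midnight(now):
    """Midnight at the start of the day after `now`."""
    return datetime(now.year, now.month, now.day) + timedelta(days=1)


def next_scheduled_time(after, interval, cutoff):
    """Return the next run time, or NEVER once it would reach the cutoff."""
    candidate = after + interval
    return NEVER if candidate >= cutoff else candidate


started = datetime(2024, 1, 1, 12, 0, 0)
cutoff = next_day_at_midnight(started)
print(next_scheduled_time(started, timedelta(seconds=5), cutoff))
```

An overridden NextScheduledTime would apply the same comparison to the value the base schedule computes.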
I have a Control-M file watcher job that waits for a specific file. If the file arrives within the specified time, the job ends OK. However, I want the job status set to OK when the file does not arrive within the specified time, instead of the job continuing to wait for the file. Is this possible? How do I implement it?
Thank you.
There are two ways of setting up a file-watcher.
File Watcher Job
File Watcher utility in Control-M (ctmfw)
There are two consequences of FW jobs completing.
Giving the out condition to the next job, so that the successor jobs start executing
Just completing the job, so that it gets cleared off in the New Day process.
Now, if you want the first consequence, then this is one option:
Assume that your FW job [ABC] runs between 0600 and 1800, and the out condition it passes to the successor job is ABC-OK. The successor job [DEF] runs on getting the condition ABC-OK. Keep a dummy job [ABC_DUMMY] that runs at 1805 and sets the same condition ABC-OK. Once ABC_DUMMY completes, DEF will get the condition it is looking for and will execute.
If the file arrives earlier, the FW job ABC will run and set the condition ABC-OK, and DEF will start running.
In both cases, ensure that once DEF is completed, ABC-OK is negated.
If you are looking for the second consequence, then I believe that as long as the job is not failing, FW jobs will stay in 'To Run' status and will get cleared off in the New Day process.
Happy to help further. Just post your doubts here.
JN
Edit your FileWatcher job
In the EXECUTION tab:
Submit between "Enter your beginning time" to "enter your ending time"
In the STEPS tab:
ON (Statement=* CODE=COMPSTAT=0)
DO OK
DO CONDITION, NAME=FILE-FOUND
ON (Statement=* CODE=COMPSTAT!0)
DO OK
DO CONDITION, NAME=FILE-NOT-FOUND
Use the wait until parameter in the file watcher. If you want the job to watch for the file until 06:00 AM, enter 06:00 in the wait until parameter.
At exactly 06:00 AM the job will fail if it has not found the file. You can then use the Steps tab to set the job to OK with either of the following options.
Option 1:
ON (Statement=* CODE=COMPSTAT!0)
DO OK
or
Option 2:
ON (Statement=* CODE=NOTOK)
DO OK