Scheduling a job or box based on a recurring external event - AutoSys

We need to start a processing pipeline whenever an external event (say, an "update" notification) occurs, one or more times a day.
For a predictable or fixed number of occurrences, we can set up a trigger or watcher job to capture the events and use them to kick off the dependent box with the processing jobs. But what if the event can happen a variable number of times a day? Essentially, we need to restart the trigger/watcher job automatically every time the dependent processing completes, so that it is ready for the next time the external event occurs.
Can this be done in AutoSys? The 'continuous' attribute on file trigger jobs doesn't seem to help: it only writes alerts to the scheduler log, whereas we would like it to start a dependent box instead.

A simple loop would perhaps suffice. A sketch in JIL (the job types, file paths, and commands are illustrative and would need adapting; machine/owner attributes are omitted):
/* FT job: waits for the external flag file */
insert_job: flag-file-watcher   job_type: FT
watch_file: /path/to/update.flag

insert_job: flag-reset   job_type: CMD
command: rm -f /path/to/update.flag
condition: s(flag-file-watcher)

insert_job: main-process   job_type: CMD
command: run_pipeline.sh
condition: s(flag-reset)

/* loop back: re-arm the watcher once processing finishes */
update_job: flag-file-watcher
condition: s(main-process)

Related

How to loop workflows with a counter in Oozie using HUE 3.11?

I have a workflow that starts with a shell script node that accepts a numeric parameter and directs the flow into different Hive scripts based on that parameter. How do I loop this workflow so that it executes over a range of numbers as the parameter?
What I do right now is change the parameter in the GUI, execute, wait for it to finish, then change the parameter to the next number and rerun.
You can achieve this using a sub-workflow; read the following blog post to understand how to implement it: http://www.helmutzechmann.com/2015/04/23/oozie-loops/
The shell action's output can be captured and accessed by the other actions (a sketch of the whole pattern follows below):
${wf:actionData('shellAction')['variablename']}
Hope this helps.
-Ravi
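
For illustration, here is a minimal sketch of the self-referencing sub-workflow pattern described in that post (the workflow name, step.sh script, and the counter/continue/next keys are hypothetical; the real Hive actions for each number would sit between the shell step and the decision). The shell action emits key=value pairs via <capture-output/>, a decision node checks them, and the workflow calls itself as a sub-workflow until the script says to stop.

<workflow-app name="loop-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="shellAction"/>
    <action name="shellAction">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>step.sh</exec>
            <argument>${counter}</argument>
            <file>step.sh</file>
            <capture-output/>
        </shell>
        <ok to="moreWork"/>
        <error to="fail"/>
    </action>
    <!-- step.sh does the per-number work and prints, e.g.,
         continue=true and next=<counter+1> to STDOUT -->
    <decision name="moreWork">
        <switch>
            <case to="nextIteration">
                ${wf:actionData('shellAction')['continue'] eq "true"}
            </case>
            <default to="end"/>
        </switch>
    </decision>
    <action name="nextIteration">
        <sub-workflow>
            <app-path>${wf:appPath()}</app-path>
            <propagate-configuration/>
            <configuration>
                <property>
                    <name>counter</name>
                    <value>${wf:actionData('shellAction')['next']}</value>
                </property>
            </configuration>
        </sub-workflow>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Loop step failed</message>
    </kill>
    <end name="end"/>
</workflow-app>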

How to change value in an oozie job coordinator?

I have a MapReduce job which is scheduled by an Oozie coordinator and runs every 4 hours. This MapReduce job takes a parameter, let's say k, whose value is set in the job.config file. I'd like to know: if I change the value of this parameter between two runs, does it pick up the updated (new) value, or does it stick to the original (old) value?
If the job is already running, it will stick to the old parameter value; if the job is still waiting to be scheduled, it will pick up the latest value :).
Actually, there is a devious way to "dynamically" fetch a parameter value at run time:
- insert a dummy Shell Action at the beginning of the Workflow, with <capture-output/> option set
- in the shell script, just download a properties file from HDFS and dump it to STDOUT
- the "capture-output" option tells Oozie to parse STDOUT into a Map (i.e. a key/value list)
- then use an E.L. function to retrieve the appropriate value(s) in the next Actions
${wf:actionData("DummyShellAction")["some.key"]}
http://oozie.apache.org/docs/4.0.0/WorkflowFunctionalSpec.html#a4.2.6_Hadoop_Jobs_EL_Function
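
A minimal sketch of that dummy shell script (the HDFS path is hypothetical): it simply streams a properties file in key=value form to STDOUT, which <capture-output/> then turns into the map that wf:actionData() reads.

#!/bin/bash
# Fetch the latest runtime properties from HDFS and echo them as key=value
# lines; Oozie's <capture-output/> parses this output into a map.
hdfs dfs -cat /apps/myjob/runtime.properties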

How to prevent sbt from running a task multiple times in a session?

I'd like to prevent the following task from getting run multiple times when sbt is running:
val myTask = someSettings map { s => if (!s.isDone) doSomethingAndSetTheFlag() }
So the expected behavior is: when myTask is run for the first time, isDone is false, the task does its work, and then it sets the flag to true. When the task is run a second time, the isDone flag is true, so it skips the actual execution block.
The expected behavior is similar to compile: once the source is compiled, the task doesn't compile the code again the next time it's triggered, until the watchSource task says the code has changed.
Is it possible? How?
sbt already does this: a task will be evaluated only once within a single run. If you want a value evaluated only once, at project load time, you can change it to be a SettingKey (see the sketch below).
This is documented in the sbt documentation (highlighting is mine):
As mentioned in the introduction, a task is evaluated on demand. Each time sampleTask is invoked, for example, it will print the sum. If the username changes between runs, stringTask will take different values in those separate runs. (Within a run, each task is evaluated at most once.) In contrast, settings are evaluated once on project load and are fixed until the next reload.
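
For completeness, a minimal build.sbt sketch of that SettingKey/TaskKey split (the key names and the work inside the task are hypothetical): the setting is computed once when the project loads, and the task body, which is already evaluated at most once per run, only does the work when the flag is false.

lazy val isDone = settingKey[Boolean]("computed once, when the project is loaded")
lazy val myTask = taskKey[Unit]("does the work at most once per run")

isDone := false   // e.g. check a marker file or previous output here

myTask := {
  if (!isDone.value) {
    // expensive work goes here; within a single sbt run this body
    // is evaluated at most once even if several tasks depend on myTask
    println("doing the work")
  }
}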

Cancel a long task that's managed by a web service

I have a web service with three methods: StartReport(...), IsReportFinished(...) and GetReport(...), each with various parameters. I also have a client application (Silverlight) which will first call StartReport to trigger the generation of the report, then it will poll the server with IsReportFinished to see if it's done and once done, it calls GetReport to get the report. Very simple...
StartReport is simple. It first generates an unique ID, then it will use System.Threading.Tasks.Task.Factory.StartNew() to create a new task that will generate the report and finally return the unique ID while the task continues to run in the background. IsReportFinished will just check the system for the unique ID to see if the report is done. Once done, the unique ID can be used to retrieve the report.
But I need a way to cancel the task, which is implemented by adding a new parameter to IsReportFinished. When called with cancel==true it will again check if the report is done. If the report is finished, there's nothing to cancel. Otherwise, it needs to cancel the task.
How do I cancel this task?
You could use a cancellation token to cancel TPL tasks. And here's another example.
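
A rough sketch of how that could look (the class and helper names and the report store are hypothetical): keep one CancellationTokenSource per report ID so IsReportFinished can cancel the corresponding task, and have the report generator observe the token between its processing steps.

using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public class ReportService
{
    // one CancellationTokenSource per running report
    private static readonly ConcurrentDictionary<Guid, CancellationTokenSource> Tokens =
        new ConcurrentDictionary<Guid, CancellationTokenSource>();

    public Guid StartReport()
    {
        var id = Guid.NewGuid();
        var cts = new CancellationTokenSource();
        Tokens[id] = cts;

        // generate the report in the background, passing the token along
        Task.Factory.StartNew(() => GenerateReport(id, cts.Token), cts.Token);

        return id;
    }

    public bool IsReportFinished(Guid id, bool cancel)
    {
        CancellationTokenSource cts;
        if (cancel && !ReportIsDone(id) && Tokens.TryGetValue(id, out cts))
        {
            cts.Cancel();   // the task must check the token and stop cooperatively
        }
        return ReportIsDone(id);
    }

    // hypothetical helpers standing in for the real report generator and store;
    // GenerateReport should call token.ThrowIfCancellationRequested() between steps
    private void GenerateReport(Guid id, CancellationToken token) { /* ... */ }
    private bool ReportIsDone(Guid id) { return false; }
}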

PL/SQL wait for update in Oracle

How do I create a PL/SQL function which waits for an update on some row for a specified timeout and then returns?
What I want to accomplish: I have a long-running process which updates its status in the ASYNC_PROCESS table by process_id. I need a function which returns true/false when this process has completed, but I also need it to wait some time for the process to complete, returning on timeout or returning immediately with true when the process has already completed. I don't want to use sleep(1 sec), because then I'd have up to a 1-second lag. I don't want to use sleep(1 msec), because then I'd be wasting CPU resources (and still have a 1 ms lag).
How would an experienced programmer accomplish this?
The function will be called from .NET (so I need minimal lag between the DB operation and the .NET/UI side).
THNX,
Beef
I think the most sensible thing to do in this case is to use update triggers on that ASYNC_PROCESS table.
You should also look into the DBMS_ALERT package. Here's an edited excerpt from that doc:
Create an alert:
DBMS_ALERT.REGISTER('emp_table_alert');
Create a trigger on your table to fire the alert:
CREATE TRIGGER emptrig AFTER INSERT ON emp
BEGIN
DBMS_ALERT.SIGNAL('emp_table_alert', 'message_text');
END;
From your .NET code, you can then use something that calls this:
DBMS_ALERT.WAITONE('emp_table_alert', :message, :status, :timeout);
Make sure you read the docs for what :status and :timeout do.
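
Putting it together, a minimal sketch of the waiting function the question asks for (the function and alert names are hypothetical; the alert name must match whatever the trigger on ASYNC_PROCESS signals). It returns a NUMBER (1/0) rather than BOOLEAN so it can be called easily from .NET:

CREATE OR REPLACE FUNCTION wait_for_process(p_timeout IN NUMBER)
  RETURN NUMBER
IS
  l_message VARCHAR2(1800);
  l_status  INTEGER;   -- 0 = alert received, 1 = timed out
BEGIN
  DBMS_ALERT.REGISTER('async_process_alert');
  DBMS_ALERT.WAITONE('async_process_alert', l_message, l_status, p_timeout);
  DBMS_ALERT.REMOVE('async_process_alert');
  RETURN CASE WHEN l_status = 0 THEN 1 ELSE 0 END;
END wait_for_process;
/

Note that a session only receives alerts signalled after it has registered, so in practice you would register up front (or check the ASYNC_PROCESS row first) rather than registering inside every call.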
You should look at Oracle Advanced Queuing. It offers the kind of functions you're looking for.
You'll probably need a separate queue table where a trigger on ASYNC_PROCESS inserts messages. You then use the AQ functions to retrieve (or wait for) the next message in the queue table.
If you're doing this in C#.NET, why wouldn't you simply spawn a worker thread to do the update (via ODAC)? Why hand the responsibility over to Oracle when (it seems) you want a .NET process to make the update call in the background and have the main process be notified of its completion?
See here and here for examples, although there are several approaches in .NET for this (delegates, events, async callbacks, thread pools, etc)
