How to loop workflows with a counter in Oozie using HUE 3.11?

I have a workflow that starts with a shell script node that accepts a numeric parameter and routes execution into different Hive scripts based on that parameter. How do I loop this workflow so that it executes once for each number in a range of parameter values?
What I do right now is change the parameter in the GUI, execute, wait for it to finish, then change the parameter to the next number and rerun.

You can achieve this using a sub-workflow; read the following blog post to understand how to implement it: http://www.helmutzechmann.com/2015/04/23/oozie-loops/
The shell action's output can be captured and accessed by the other actions:
${wf:actionData('shellAction')['variablename']}
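For the loop itself, the pattern described there is a workflow that calls itself through a sub-workflow action until a decision node breaks the recursion. A minimal sketch, assuming a counter property, a maxCounter bound, and a step.sh that prints the next value (all names here are illustrative):

<workflow-app name="counter-loop" xmlns="uri:oozie:workflow:0.4">
    <start to="run-step"/>

    <!-- your existing shell node: routes into the Hive scripts using ${counter},
         then prints "next=<counter+1>" and "done=true|false" to STDOUT -->
    <action name="run-step">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>step.sh</exec>
            <argument>${counter}</argument>
            <argument>${maxCounter}</argument>
            <file>step.sh</file>
            <capture-output/>
        </shell>
        <ok to="check-counter"/>
        <error to="fail"/>
    </action>

    <!-- stop when the script reports the range is exhausted, otherwise recurse -->
    <decision name="check-counter">
        <switch>
            <case to="end">${wf:actionData('run-step')['done'] eq "true"}</case>
            <default to="next-iteration"/>
        </switch>
    </decision>

    <!-- the sub-workflow is this same workflow, restarted with the new counter -->
    <action name="next-iteration">
        <sub-workflow>
            <app-path>${wf:appPath()}</app-path>
            <configuration>
                <property><name>jobTracker</name><value>${jobTracker}</value></property>
                <property><name>nameNode</name><value>${nameNode}</value></property>
                <property><name>maxCounter</name><value>${maxCounter}</value></property>
                <property><name>counter</name><value>${wf:actionData('run-step')['next']}</value></property>
            </configuration>
        </sub-workflow>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail"><message>loop step failed</message></kill>
    <end name="end"/>
</workflow-app>

Here step.sh would run the Hive step for $1 and finish with lines like next=$(($1 + 1)) plus done=true once the bound in $2 is reached; <capture-output/> is what turns those lines into the map that wf:actionData() reads.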
Hope this helps.
-Ravi

Related

Check value of a variable while R session is running

Is there a way to check the value of a variable in a script while it is running? One way would be to set up print commands in the script to print the value of the variable periodically, but I forgot to do that and it is a long program. Is there another way?

Airflow - How to pass the output of one operator as input to another task

I have a list of HTTP endpoints, each performing a task on its own. We are trying to write an application which will orchestrate them by invoking these endpoints in a certain order. In this solution we also have to process the output of one HTTP endpoint and generate the input for the next HTTP endpoint. Also, the same workflow can get invoked simultaneously depending on the trigger.
What I have done until now:
1. Defined a new operator deriving from the HttpOperator and introduced capabilities to write the output of the HTTP endpoint to a file.
2. Written a Python operator which can transform the output depending on the necessary logic.
Since I can have multiple instances of the same workflow in execution, I cannot hardcode the output file names. Is there a way to make the HTTP operator I wrote use unique file names, with the same file name available to the next task so that it can read and process the output?
Airflow does have a feature for operator cross-communication called XCom
XComs can be “pushed” (sent) or “pulled” (received). When a task pushes an XCom, it makes it generally available to other tasks. Tasks can push XComs at any time by calling the xcom_push() method.
Tasks call xcom_pull() to retrieve XComs, optionally applying filters based on criteria like key, source task_ids, and source dag_id.
To push to XCom use
ti.xcom_push(key=<variable name>, value=<variable value>)
To pull an XCom value use
myxcom_val = ti.xcom_pull(key=<variable name>, task_ids='<task to pull from>')
With the BashOperator, you just set xcom_push=True and the last line written to stdout is stored as the XCom value.
You can view the XCom values while your task is running by opening the task instance from the Airflow UI and clicking on the XCom tab.
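A minimal sketch of the push/pull pattern with two PythonOperators (Airflow 1.x-style imports; the DAG id, task ids, and key are illustrative, and ti is the task instance Airflow passes to the callable):

from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def fetch(**kwargs):
    # call the first HTTP endpoint here, then push its output downstream
    result = {"status": "ok", "payload": "..."}
    kwargs["ti"].xcom_push(key="endpoint_output", value=result)


def transform(**kwargs):
    # pull the upstream output and build the input for the next endpoint
    upstream = kwargs["ti"].xcom_pull(key="endpoint_output", task_ids="fetch")
    print("received from fetch:", upstream)


with DAG("http_pipeline", start_date=datetime(2017, 1, 1),
         schedule_interval=None) as dag:
    fetch_task = PythonOperator(task_id="fetch", python_callable=fetch,
                                provide_context=True)
    transform_task = PythonOperator(task_id="transform", python_callable=transform,
                                    provide_context=True)
    fetch_task >> transform_task

Because XComs are scoped to the DAG run (keyed by dag id, task id, and execution date), simultaneous runs of the same workflow each see their own values, which avoids the hardcoded-file-name problem.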

pass a directory name from one coordinator to another in oozie

I have a coordinator-A running whose workflow generates output to a directory:
/var/test/output/20161213-randomnumber/
Now I need to pass the directory name "20161213-randomnumber" to another coordinator-B, which needs to start as soon as the workflow of coordinator-A has completed.
I am not able to find any pointers on how to pass the directory name, or on how coordinator-B can be triggered by the directory generated by coordinator-A.
However, I have seen numerous examples of triggering coordinators on a specific date or on a daily, weekly, or monthly dataset. In my case the dataset is not time dependent; it can arrive arbitrarily.
In your case, you can add one more action that puts an empty trigger file (trig.txt) after your data-generation action (the one that writes /var/test/output/20161213-randomnumber/) in coordinator-A. Then, in coordinator-B, add a data dependency pointing to the trigger file; once it exists, coordinator-B will start. After B has started, clear the trigger file for the next run.
You can use this data dependency to solve the problem; you cannot pass a parameter from one coordinator to another coordinator.
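A rough sketch of the receiving side in coordinator-B, assuming the trigger file is written to a fixed parent directory (the randomized part of the path cannot be templated); all names, times, and frequencies here are illustrative:

<coordinator-app name="coordinator-B" frequency="${coord:minutes(15)}"
                 start="2016-12-13T00:00Z" end="2017-12-13T00:00Z" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
    <datasets>
        <dataset name="upstream" frequency="${coord:minutes(15)}"
                 initial-instance="2016-12-13T00:00Z" timezone="UTC">
            <!-- parent directory that coordinator-A writes into -->
            <uri-template>hdfs:///var/test/output</uri-template>
            <!-- B only materializes a run once this file appears -->
            <done-flag>trig.txt</done-flag>
        </dataset>
    </datasets>
    <input-events>
        <data-in name="input" dataset="upstream">
            <instance>${coord:current(0)}</instance>
        </data-in>
    </input-events>
    <action>
        <workflow>
            <app-path>${workflowAppB}</app-path>
        </workflow>
    </action>
</coordinator-app>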

How to change value in an oozie job coordinator?

I have a mapreduce job which is scheduled by an Oozie coordinator and runs every 4 hours. This mapreduce job takes a parameter, let's say k, whose value is set in the job.config file. I'd like to know: if I change the value of this parameter between two runs, does it pick up the updated (new) value or stick to the original (old) value?
If the job is already running, it will stick to the old parameter; if the job is still waiting to be scheduled, it will take the latest value :).
Actually, there is a devious way to "dynamically" fetch a parameter value at run time:
1. Insert a dummy shell action at the beginning of the workflow, with the <capture-output/> option set.
2. In the shell script, just download a properties file from HDFS and dump it to STDOUT.
3. The "capture-output" option tells Oozie to parse STDOUT into a Map (i.e. a key/value list).
4. Then use an E.L. function to retrieve the appropriate value(s) in the next actions:
${wf:actionData("DummyShellAction")["some.key"]}
http://oozie.apache.org/docs/4.0.0/WorkflowFunctionalSpec.html#a4.2.6_Hadoop_Jobs_EL_Function
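A sketch of such a dummy action, assuming a script named fetch-params.sh (the HDFS path and node names are likewise illustrative); the script itself only needs a line like hdfs dfs -cat /apps/myapp/runtime.properties to dump key=value pairs to STDOUT:

<action name="DummyShellAction">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <exec>fetch-params.sh</exec>
        <file>fetch-params.sh</file>
        <!-- tells Oozie to parse the script's STDOUT into a key/value map -->
        <capture-output/>
    </shell>
    <ok to="real-first-action"/>
    <error to="kill"/>
</action>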

Call duration during the call via AMI

How can I get the call duration during the call via AMI?
Something like Status() or CoreShowChannels(), but the seconds need to be counted from when the call was answered.
You have the following options:
1) You can collect "Link" events and store info about call starts in your application. AMI is not designed to report call info. This one is the CORRECT way.
2) Issue a Command action
http://www.voip-info.org/wiki/view/Asterisk+Manager+API+Action+Command
with
"core show channel CHANNEL_NAME_HERE"
The output will have info about the duration.
3) Another option is to get the variable CDR(billsec):
http://www.voip-info.org/wiki/view/Asterisk+Manager+API+Action+GetVar
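A minimal sketch of options 2) and 3) over a raw AMI socket; the host, credentials, and channel name are placeholders, and a real client should parse responses message by message rather than doing a single recv:

import socket

HOST, PORT = "127.0.0.1", 5038          # default AMI port
CHANNEL = "SIP/100-00000001"            # placeholder live channel name

def send(sock, action):
    # AMI actions are CRLF-separated "Key: Value" lines ending with a blank line
    sock.sendall(("".join("%s: %s\r\n" % kv for kv in action.items()) + "\r\n").encode())

with socket.create_connection((HOST, PORT)) as s:
    send(s, {"Action": "Login", "Username": "admin", "Secret": "secret"})
    # option 3): CDR(billsec) counts seconds since the call was answered
    send(s, {"Action": "Getvar", "Channel": CHANNEL, "Variable": "CDR(billsec)"})
    # option 2): full channel status, which includes the duration
    send(s, {"Action": "Command", "Command": "core show channel " + CHANNEL})
    send(s, {"Action": "Logoff"})
    print(s.recv(65536).decode(errors="replace"))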
