Oozie variable for all actions - oozie

Is there a way to set up a global variable for all actions in a workflow?
I need to define a variable with an initial value and then have the actions modify that same variable.
I tried:
<workflow-app name="test" xmlns="uri:oozie:workflow:0.5">
<global>
<configuration>
<property>
<name>variable1</name>
<value>/some/path</value>
</property>
</configuration>
</global>
.....
<action name="wf1">
....
<property>
<name>variable1</name>
<value>/some/other/path</value>
</property>
</action>
....
<action name="wf2">
.....
<property>
<name>variable1</name>
<value>/some/second/path</value>
</property>
....
</action>
<action name="createFolder">
<fs>
<mkdir path="${variable1}"/>
</fs>
<ok to="End"/>
<error to="Kill"/>
</action>
I would like actions to be able to modify the value so that a later action can use it. Is that possible? Right now I'm getting: VARIABLE variable1 cannot be resolved

You can do this using action configuration.
You can even define default values for each action type.
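As a hedged illustration of one way to hand a value from one action to a later one: as far as I know an action cannot overwrite a workflow variable, but it can emit key=value lines with <capture-output/>, and later nodes can read them with wf:actionData. The sketch below assumes a hypothetical script set_path.sh that prints a line such as variable1=/some/other/path, and that ${jobTracker} and ${nameNode} are defined in job.properties.

```xml
<!-- Sketch only: set_path.sh is a hypothetical script that prints a line
     such as "variable1=/some/other/path" to stdout. -->
<action name="wf1">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <exec>set_path.sh</exec>
        <file>set_path.sh#set_path.sh</file>
        <!-- Makes the key=value lines from stdout available to wf:actionData -->
        <capture-output/>
    </shell>
    <ok to="createFolder"/>
    <error to="Kill"/>
</action>
<action name="createFolder">
    <fs>
        <!-- Reads the value emitted by wf1 instead of a workflow-level variable -->
        <mkdir path="${nameNode}${wf:actionData('wf1')['variable1']}"/>
    </fs>
    <ok to="End"/>
    <error to="Kill"/>
</action>
```

The node names wf1, createFolder, End and Kill are taken from the question; everything else is an assumption to adapt to your workflow.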

Related

Oozie coordinator get day of the week

I am trying to create a condition in my Oozie workflow so that an action is executed only on Mondays (at the end of the workflow).
So far I have added a decision node to the workflow and passed the current date as a parameter from the coordinator; now I need to test the day of the week.
coordinator.xml
<coordinator-app name="${project}_coord" frequency="${coord_frequency}" start="${coord_start_date}" end="${coord_end_date}" timezone="UTC" xmlns="uri:oozie:coordinator:0.1">
<controls>
<concurrency>1</concurrency>
<execution>LAST_ONLY</execution>
</controls>
<action>
<workflow>
<app-path>${wf_application_path}</app-path>
<configuration>
<property>
<name>currentDate</name>
<value>${coord:formatTime(coord:dateOffset(coord:nominalTime(), 0, 'DAY'), "yyyyMMdd")}</value>
</property>
</configuration>
</workflow>
</action>
workflow.xml
<decision name = "send_on_monday">
<switch>
<case to = "send_file">
${currentDay} eq "MON" <-------- test on day of the week
</case>
<default to = "sendSuccessEmail" />
</switch>
</decision>
<action name="send_file">
<ssh xmlns="uri:oozie:ssh-action:0.1">
<host>${remoteNode}</host>
<command>/pythonvenv</command>
<args>${fsProjectDir}/send_file.py</args>
</ssh>
<ok to="sendSuccessEmail"/>
<error to="sendTechnicalFailureEmail"/>
</action>
I didn't find information on how to get the day of the week with EL functions.
Any help is appreciated.
I found a solution using wf:actionData in a decision node:
workflow.xml
<action name="getDayOfWeek">
<ssh xmlns="uri:oozie:ssh-action:0.1">
<host>${remoteNode}</host>
<command>${fsProjectDir}/scripts/dayOfWeek.sh</command>
<capture-output/>
</ssh>
<ok to="send_on_monday"/>
<error to="sendTechnicalFailureEmail"/>
</action>
<decision name="send_on_monday">
<switch>
<case to = "send_file">
${wf:actionData('getDayOfWeek')['day'] eq 'Mon'}
</case>
<default to = "sendSuccessEmail" />
</switch>
</decision>
dayOfWeek.sh
#!/bin/sh
DOW=$(date +"%a")
echo day=$DOW
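An alternative that avoids the extra ssh action, offered as an untested sketch: according to the Oozie coordinator documentation, coord:formatTime takes a Java SimpleDateFormat pattern, so the pattern EEE should yield the abbreviated day name of the nominal time. The property name currentDay comes from the question. The value is computed in UTC and presumably depends on the server locale, so check what the coordinator actually passes before relying on 'Mon'.

```xml
<!-- coordinator.xml: pass the abbreviated day name to the workflow -->
<property>
    <name>currentDay</name>
    <value>${coord:formatTime(coord:nominalTime(), 'EEE')}</value>
</property>

<!-- workflow.xml: branch on the value received from the coordinator -->
<decision name="send_on_monday">
    <switch>
        <case to="send_file">${currentDay eq 'Mon'}</case>
        <default to="sendSuccessEmail"/>
    </switch>
</decision>
```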

Oozie workflow not running properly

I have created a new Oozie workflow in the Hue UI based on the HQL query below.
things.hql
drop table output_table;
create table output_table like things;
insert overwrite table output_table select t.* from things_dup td right outer join things t on (td.item_id = t.item_id) where td.item_id is null;
insert overwrite table things_dup select * from things;
The tables are:
things table
item_id product
1 soap
2 chocklate
things_dup
item_id product
1 soap
When I run the HQL separately with
hive -f things.hql
it works fine and the things_dup table is updated properly.
But when I run the workflow, the things_dup table is not updated by this statement:
insert overwrite table things_dup select * from things;
Does anyone know why? Please help me fix this issue.
Workflow.xml
<workflow-app name="Things_workflow" xmlns="uri:oozie:workflow:0.4">
<start to="Things_workflow"/>
<action name="Things_workflow">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>/user/cloudera/hive-site.xml</job-xml>
<script>things.hql</script>
<file>/user/cloudera/hive-site.xml#hive-site.xml</file>
</hive>
<ok to="end"/>
<error to="kill"/>
</action>
<kill name="kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
action
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>localhost.localdomain:8021</job-tracker>
<name-node>hdfs://localhost.localdomain:8020</name-node>
<job-xml>/user/cloudera/hive-site.xml</job-xml>
<script>things.hql</script>
<file>/user/cloudera/hive-site.xml#hive-site.xml</file>
</hive>
Thanks,
manimekalai
I also had an issue with Hive in Oozie and finally solved it.
Please compare your workflow.xml with this one.
Make sure the job-xml line comes before the configuration node.
<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.2" name="pig-hive-wf">
<start to="hive-node" />
<action name="hive-node">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>hive-site.xml</job-xml>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
<property>
<name>oozie.hive.defaults</name>
<value>/user/cloudera/oozie/pig_hive_data_WF/hive-site.xml</value>
</property>
</configuration>
<script>/user/cloudera/oozie/pig_hive_data_WF/load_data.q</script>
<param>LOCATION=/user/${wf:user()}/oozie/pig_hive_data_WF/output/pig_loaded_data</param>
<file>hive-conf.xml#hive-conf.xml</file>
</hive>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Pig Hive job is failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end" />
</workflow-app>
You can copy the hive-site.xml file from /etc/hive/conf/hive-site.xml and put it in the workflow directory.

Pass -Dpig.notification.listener argument to Pig action in Oozie (Oozie client build version: 3.3.2-cdh4.7.1)

I want to pass the -Dpig.notification.listener argument to a Pig action in Oozie. But when I do this, Pig fails and there is no clear error description. Below is the Pig action snippet. Any suggestions on how to pass the listener to the Pig action?
<action name="FinalReport">
<pig>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/home/hadoop/surjan/path" />
</prepare>
<script>${nameNode}//home/hadoop/surjan/wf/pigs/FinalReport.pig</script>
<param>inDate=20160430</param>
</pig>
<ok to="success" />
<error to="kill" />
</action>
EDIT: I found the solution for passing the -D arguments, in case anybody is facing the same issue.
<action name="FinalReport">
<pig>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/home/hadoop/surjan/path" />
</prepare>
<configuration>
<property>
<name>pig.notification.listener</name>
<value>com.surjan.util.counter.MyListener</value>
</property>
</configuration>
<script>${nameNode}//home/hadoop/surjan/wf/pigs/FinalReport.pig</script>
<param>inDate=20160430</param>
<argument>-Dpig.notification.listener=com.surjan.util.counter.CounterListener</argument>
<archive>archive/mycounter.jar#mycounter</archive>
</pig>
<ok to="success" />
<error to="kill" />
</action>

Oozie Invalid Transition

<workflow-app name="Oozie_app" xmlns="uri:oozie:workflow:0.1">
<start to="TransformWeatherData"/>
<action name="TransformWeatherData">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>default</value>
</property>
</configuration>
<exec>/home/kingsly/working_directory/copyFromLocal.sh</exec>
<file>/home/kingsly/working_directory/copyFromLocal.sh</file>
</shell>
<ok to="Oozie_app"/>
<error to="end"/>
</action>
<end name='end' />
I am new to Oozie and I have created a workflow and a job.properties file.
My workflow.xml is shown above.
When I submit this workflow I get the error:
Error: E0708 : E0708: Invalid transition, node [TransformWeatherData] transition [Oozie_app]
Please help me resolve this.
My main objective is to move a file from the local machine to HDFS, and I have included the Hadoop command in the shell script.
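Your <ok> transition pointed to Oozie_app, which is the name of the workflow and not a node defined in it. I pointed it to end instead and added a kill node for the error transition: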
<workflow-app name="Oozie_app" xmlns="uri:oozie:workflow:0.1">
<start to="TransformWeatherData"/>
<action name="TransformWeatherData">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>default</value>
</property>
</configuration>
<exec>/home/kingsly/working_directory/copyFromLocal.sh</exec>
<file>/home/kingsly/working_directory/copyFromLocal.sh</file>
</shell>
<ok to="end"/>
<error to="kill" />
</action>
<kill name="kill">
<message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name='end' />
</workflow-app>

Oozie <job-xml> not working in global section with subworkflow action

I am trying to shorten my Oozie workflow, following this Cloudera post:
http://blog.cloudera.com/blog/2013/11/how-to-shorten-your-oozie-workflow-definitions/
Per this post, we can add a job.xml file in the global section, but it's not working for me.
<global>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>job1.xml</job-xml>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
</global>
-----------------------Main Workflow------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.4" name="main-wf">
<global>
<job-xml>/user/${wf:user()}/${examplesRoot}/apps/map-reduce/job.xml</job-xml>
</global>
<start to="main-node"/>
<action name="main-node">
<sub-workflow>
<app-path>/user/${wf:user()}/${examplesRoot}/apps/map-reduce/workflow.xml</app-path>
</sub-workflow>
<ok to="end" />
<error to="fail" />
</action>
<kill name="fail">
<message>MR failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
-----------------------Sub Workflow------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.4" name="OozieBDU-wf">
<start to="wordcount-node"/>
<action name="wordcount-node">
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}"/>
</prepare>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
<property>
<name>mapred.mapper.new-api</name>
<value>true</value>
</property>
<property>
<name>mapred.reducer.new-api</name>
<value>true</value>
</property>
<property>
<name>mapreduce.map.class</name>
<value>com.yumecorp.WordCount$TokenizerMapper</value>
</property>
<property>
<name>mapreduce.reduce.class</name>
<value>com.yumecorp.WordCount$IntSumReducer</value>
</property>
<property>
<name>mapred.output.key.class</name>
<value>org.apache.hadoop.io.Text</value>
</property>
<property>
<name>mapred.output.value.class</name>
<value>org.apache.hadoop.io.IntWritable</value>
</property>
<property>
<name>mapred.input.dir</name>
<value>/user/${wf:user()}/${examplesRoot}/input-data/decodeAndProfile</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}</value>
</property>
<property>
<name>mapreduce.job.outputformat.class</name>
<value>org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat</value>
</property>
<property>
<name>mapred.mapoutput.key.class</name>
<value>org.apache.hadoop.io.Text</value>
</property>
<property>
<name>mapred.mapoutput.value.class</name>
<value>org.apache.hadoop.io.IntWritable</value>
</property>
</configuration>
</map-reduce>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>MR failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
-----------------------job.xml------------------------------
<configuration>
<property>
<name>jobTracker</name>
<value>pari.hhh.com:8021</value>
</property>
<property>
<name>nameNode</name>
<value>hdfs://pari.hhh.com:8020</value>
</property>
<property>
<name>queueName</name>
<value>default</value>
</property>
<property>
<name>examplesRoot</name>
<value>wordcount</value>
</property>
<property>
<name>outputDir</name>
<value>map-reduce</value>
</property>
</configuration>
-----------------------job.properties------------------------------
examplesRoot=wordcount
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/map-reduce/mainworkflow.xml
-----------------------oozie command -----------------------------
oozie job -config job.properties -run
-----------------------ERROR -------------------------------------
OB[0000012-150131091133585-oozie-oozi-W] ACTION[0000012-150131091133585-oozie-oozi-W#wordcount-node] ELException in ActionStartXCommand
javax.servlet.jsp.el.ELException: variable [jobTracker] cannot be resolved
at org.apache.oozie.util.ELEvaluator$Context.resolveVariable(ELEvaluator.java:106)
at org.apache.commons.el.NamedValue.evaluate(NamedValue.java:124)
at org.apache.commons.el.ExpressionString.evaluate(ExpressionString.java:114)
at org.apache.commons.el.ExpressionEvaluatorImpl.evaluate(ExpressionEvaluatorImpl.java:274)
at org.apache.commons.el.ExpressionEvaluatorImpl.evaluate(ExpressionEvaluatorImpl.java:190)
at org.apache.oozie.util.ELEvaluator.evaluate(ELEvaluator.java:203)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:188)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
at org.apache.oozie.command.XCommand.call(XCommand.java:281)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:323)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:252)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Can anybody help?
Cheers
Pari
To pass properties from a workflow to its sub-workflow you need to add the <propagate-configuration/> tag to the <sub-workflow> action.
For example:
<action name="main-node">
<sub-workflow>
<app-path>/user/${wf:user()}/${examplesRoot}/apps/map-reduce/workflow.xml</app-path>
<propagate-configuration/>
</sub-workflow>
<ok to="end" />
<error to="fail" />
</action>
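If propagating the whole parent configuration is more than you want, the <sub-workflow> action also accepts its own <configuration> block, whose properties are passed to the child workflow job. A minimal sketch, reusing the values from the job.xml in the question; I have not run it against this setup, so treat it as a starting point:

```xml
<action name="main-node">
    <sub-workflow>
        <app-path>/user/${wf:user()}/${examplesRoot}/apps/map-reduce/workflow.xml</app-path>
        <!-- Values copied from the question's job.xml; pass only what the child needs -->
        <configuration>
            <property>
                <name>jobTracker</name>
                <value>pari.hhh.com:8021</value>
            </property>
            <property>
                <name>nameNode</name>
                <value>hdfs://pari.hhh.com:8020</value>
            </property>
            <property>
                <name>queueName</name>
                <value>default</value>
            </property>
            <property>
                <name>examplesRoot</name>
                <value>wordcount</value>
            </property>
            <property>
                <name>outputDir</name>
                <value>map-reduce</value>
            </property>
        </configuration>
    </sub-workflow>
    <ok to="end" />
    <error to="fail" />
</action>
```

This is more verbose, but it makes explicit which variables the sub-workflow depends on.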
