sqoop exec job in oozie is not working - oozie

I am running a 3-node HDP 2.2 cluster. The Oozie version is 4.1.0.2.2 and the Sqoop version is 1.4.5.2.2. I am using a Sqoop job to do incremental imports from an RDBMS into HDFS, as shown below:
sqoop job --create JOB1 --meta-connect "jdbc:hsqldb:hsql://ip-address:16000/sqoop" -- import --connect jdbc:oracle:thin:@ip-address:db --username db_user --password-file hdfs://ip-address:8020/user/oozie/.password_sqoop --table TABLE1 --target-dir /user/incremental/ --incremental lastmodified --check-column LAST_UPDATED --last-value "2013-08-12 18:13:44.0" --merge-key ID --fields-terminated-by '|';
sqoop job --exec JOB1
Both of the above Sqoop commands work fine when run from the command prompt. I am using sqoop-metastore (HSQLDB) to store the Sqoop jobs.
The Sqoop create job also works in Oozie, and I can see the Sqoop job listed in the sqoop-metastore after the Oozie job completes.
But when I put the Sqoop exec job in the Oozie workflow, I get the error "Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]". The underlying MapReduce job, however, shows as completed successfully. I checked for logs in /var/log/oozie, but there is nothing there either.
workflow.xml :
<workflow-app xmlns="uri:oozie:workflow:0.4" name="oozie-wf">
    <start to="sqoop-wf"/>
    <action name="sqoop-wf">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>fs.hdfs.impl.disable.cache</name>
                    <value>true</value>
                </property>
            </configuration>
            <command>job --meta-connect "jdbc:hsqldb:hsql://ip-address:16000/sqoop" --exec JOB1</command>
        </sqoop>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Failed, Error Message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
job.properties :
nameNode=hdfs://ip-address:8020
jobTracker=ip-address:8050
oozie.wf.application.path=hdfs://ip-address:8020/user/oozie/sqoopoozie
oozie.use.system.libpath=true
oozie.sqoop.log.level=DEBUG
I have tried in multiple different ways for sqoop exec job in oozie but nothing works. Please help.

Please ensure you are specifying your metastore in sqoop-site.xml and you are passing the site xml in your oozie workflow folder. You can pass it with xml tag .

It puzzles me how sqoop job --exec JOB1 works at the command prompt without the metastore being specified. The right syntax for executing the job (from the command prompt) would be
sqoop job --exec JOB1 --meta-connect jdbc:hsqldb:hsql://ip-address:16000/sqoop
Perhaps you executed the job from the same machine that is running your Sqoop metastore.
Please make sure you have done the following:
Started the Sqoop metastore on the machine with ip=ip-address
Made the necessary configuration changes in sqoop-site.xml (follow this link)
As for the Oozie workflow, try the following workflow.xml. Each <arg> element is passed to Sqoop as a single argument, so --meta-connect and its URL go in separate <arg> elements:
<workflow-app xmlns="uri:oozie:workflow:0.4" name="oozie-wf">
    <start to="sqoop-wf"/>
    <action name="sqoop-wf">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>fs.hdfs.impl.disable.cache</name>
                    <value>true</value>
                </property>
            </configuration>
            <arg>job</arg>
            <arg>--exec</arg>
            <arg>JOB1</arg>
            <!-- each <arg> is one argument, so the option and its value are separate elements -->
            <arg>--meta-connect</arg>
            <arg>jdbc:hsqldb:hsql://ip-address:16000/sqoop</arg>
        </sqoop>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Failed, Error Message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

In the case of our cluster, creating jobs through the Hue UI without specifying the metastore (using the --meta-connect parameter) resulted in jobs being returned only sometimes by the sqoop job --list command. At the command prompt, creating and listing jobs on the same machine worked; however, machine B could not find jobs created on machine A.
If you also can't access jobs created on a different node, it seems like your Sqoop metastore is not running. To solve it:
start the metastore on one of the nodes with the sqoop-metastore command,
check that the metastore is running: $ ps aux | grep sqoop should list a Sqoop metastore process among its results,
specify the metastore path in the sqoop-site.xml files. In our case we couldn't make it work though, so we used the next step instead,
if necessary, both in the shell and in the Hue UI, specify the metastore path in every sqoop job command (create, list, show, exec, delete) as
$ sqoop job --exec my_job --meta-connect jdbc:hsqldb:hsql://node-running-metastore:16000/sqoop
Note: When starting the metastore, consider using & and nohup to run the metastore process in the background and to capture its logs in a file:
$ nohup sqoop-metastore &>> /tmp/nohup-sqoop-metastore.out &
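
For the sqoop-site.xml step above, a minimal sketch of the client-side settings might look like the following. The two property names are the auto-connect settings described in the Sqoop 1.4.x user guide; the host name node-running-metastore is the same placeholder used in the command above, and whether this is sufficient on a given cluster is an assumption (as noted, it did not work for us).

```xml
<configuration>
    <!-- Let "sqoop job" commands use a metastore without --meta-connect -->
    <property>
        <name>sqoop.metastore.client.enable.autoconnect</name>
        <value>true</value>
    </property>
    <!-- Placeholder host: replace with the node that runs sqoop-metastore -->
    <property>
        <name>sqoop.metastore.client.autoconnect.url</name>
        <value>jdbc:hsqldb:hsql://node-running-metastore:16000/sqoop</value>
    </property>
</configuration>
```

The file has to be the one actually read by the Sqoop client on each node (and by the Oozie launcher, if jobs are run from Oozie); otherwise passing --meta-connect explicitly remains the safer option.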

Related

How to auto rerun of failed action in oozie?

How can I automatically re-run an action that failed in a workflow?
I know how to rerun it manually from the command line or through Hue:
$oozie job -rerun ...
Is there any parameter we can set or provide in the workflow to retry automatically when an action fails?
Most of the time, when an action fails in an Oozie workflow, you need to debug and fix the error and then rerun the workflow. But there are times when you want Oozie to retry the action after an interval, a fixed number of times, before failing the workflow. You can specify retry-max and retry-interval in the action definition.
An example of User-Retry in a workflow action:
<workflow-app xmlns="uri:oozie:workflow:0.5" name="wf-name">
<action name="a" retry-max="2" retry-interval="1">
....
</action>
You can find more information about User-Retry for workflow actions in the link.
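
A slightly fuller sketch of the same idea, as a complete workflow, could look like this. The workflow name, action name, Sqoop job name JOB1 and the retry values are illustrative assumptions; retry-interval is given in minutes. Note that, as far as I know, Oozie only applies User-Retry to a configured set of error codes, so not every failure is retried.

```xml
<workflow-app xmlns="uri:oozie:workflow:0.5" name="retry-example-wf">
    <start to="sqoop-import"/>
    <!-- Retry up to 3 times, waiting 5 minutes between attempts -->
    <action name="sqoop-import" retry-max="3" retry-interval="5">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <arg>job</arg>
            <arg>--exec</arg>
            <arg>JOB1</arg>
        </sqoop>
        <ok to="end"/>
        <!-- Taken only after the retries are exhausted -->
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Failed, Error Message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```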

Hadoop: I am trying to run my Oozie job but am unable to; this is the error

I am trying to run my Oozie job, but I am unable to run it.
This is the error:
./bin/oozie job -oozie http://localhost:11000/oozie/ -config examples/apps/map-reduce/job.properties -run
Error: E0902 : E0902: Exception occured: [User: ubuntu is not allowed to impersonate ubuntu]
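Could you please tell me exactly what the issue is?
Thanks,
Make sure you have the correct proxyuser configured in Hadoop's core-site.xml. The user name in the property names must be the user that runs the Oozie server; the error message above indicates that this is ubuntu, in which case replace oozie with ubuntu. The following properties need to be set in core-site.xml: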
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>HOSTNAME</value> <!-- put suitable hostname -->
</property>
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>*</value> <!-- use correct group here . * is not safe -->
</property>
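
For the error quoted in the question, the same two properties might be written as follows. This is a sketch that assumes the Oozie server runs as the user ubuntu (as the message "User: ubuntu is not allowed to impersonate ubuntu" suggests) and on localhost (as in the -oozie URL); both values are assumptions to adjust for the actual setup.

```xml
<!-- core-site.xml: allow the user running the Oozie server (assumed: ubuntu) to impersonate others -->
<property>
    <name>hadoop.proxyuser.ubuntu.hosts</name>
    <value>localhost</value> <!-- host the Oozie server runs on (assumed) -->
</property>
<property>
    <name>hadoop.proxyuser.ubuntu.groups</name>
    <value>*</value> <!-- restrict to a real group outside of test setups -->
</property>
```

Hadoop has to pick up the change before it takes effect, typically by restarting the NameNode and the JobTracker/ResourceManager.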

Oozie null pointer exception when submitting jobs

Just trying to run a very simple word count example but getting the following null pointer when submitting the job:
oozie job -oozie=http://localhost:11000/oozie/ -config job.properties -run
[cloudera@localhost Oozie_Example]$ oozie job -oozie=http://localhost:11000/oozie/ -config job.properties -run
java.lang.RuntimeException: java.lang.NullPointerException
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1242)
at sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2714)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:477)
at org.apache.oozie.client.OozieClient$JobSubmit.call(OozieClient.java:586)
at org.apache.oozie.client.OozieClient$JobSubmit.call(OozieClient.java:561)
at org.apache.oozie.client.OozieClient$ClientCallable.call(OozieClient.java:479)
at org.apache.oozie.client.OozieClient.run(OozieClient.java:655)
at org.apache.oozie.cli.OozieCLI.jobCommand(OozieCLI.java:918)
at org.apache.oozie.cli.OozieCLI.processCommand(OozieCLI.java:579)
at org.apache.oozie.cli.OozieCLI.run(OozieCLI.java:552)
at org.apache.oozie.cli.OozieCLI.main(OozieCLI.java:199)
Caused by: java.lang.NullPointerException
at java.io.ByteArrayInputStream.<init>(ByteArrayInputStream.java:106)
at sun.misc.CharacterEncoder.encode(CharacterEncoder.java:188)
at sun.net.www.protocol.http.NegotiateAuthentication.setHeaders(NegotiateAuthentication.java:156)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1482)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
... 8 more
java.lang.NullPointerException
here is my job properties file:
nameNode=hdfs://localhost:8020
jobTracker=localhost:8021
queueName=default
inputDir=${nameNode}/user/dev/oozie/workflow/wordcount/input/
outputDir=${nameNode}/user/dev/oozie/workflow/wordcount/output/
oozie.libpath=${nameNode}/user/oozie/workflow/wordcount/lib
oozie.use.system.libpath=true
user.name=dev
oozie.wf.application.path=${nameNode}/user/${user.name}/oozie/workflow/wordcount/
Any ideas? It happens pretty early so I think it has something to do with Oozie or namenode.
I would like to add to the answer to this question.
When the Oozie client gets a NullPointerException, it usually means that the request caused a failure in a server thread. If you want to find out the real reason, look at the server log, such as /var/log/oozie/oozie.log on CDH.
There you will find a detailed stack trace, which proved to be very helpful. In our case, at least, it told us exactly what had happened, for example that a workflow file did not exist in HDFS.
The java.lang.NullPointerException is due to an error in your job.properties file.
Have you placed the workflow XML in HDFS? Also, you should specify:
oozie.wf.application.path=${nameNode}/user/${user.name}/oozie/workflow/wordcount/my-workflow.xml

Running Oozie Workflow gives Error: E0732

I am running an Oozie workflow on Oozie 3.3.2 that gives the following error:
Error: E0732 : E0732: Fork <fork>/Join[join1] not in pair (join should have been [join2])
The same workflow runs fine on Oozie 2.3.2.
Is this something version-specific, or is this error thrown in some other case as well?
Please help.
Can you validate your workflow?
oozie validate workflow.xml
Then see whether it reports any issue. The error suggests that something is wrong with the fork/join structure in the XML.
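
As an illustration of what the fork/join check expects, here is a minimal sketch in which every path started by a fork ends at the same join. All node names and the HDFS paths are hypothetical; the fs actions are stand-ins for real work.

```xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="fork-join-wf">
    <start to="fork1"/>
    <fork name="fork1">
        <path start="action-a"/>
        <path start="action-b"/>
    </fork>
    <action name="action-a">
        <fs><mkdir path="${nameNode}/tmp/fork-join-example/a"/></fs>
        <!-- both branches of fork1 must converge on the same join -->
        <ok to="join1"/>
        <error to="fail"/>
    </action>
    <action name="action-b">
        <fs><mkdir path="${nameNode}/tmp/fork-join-example/b"/></fs>
        <ok to="join1"/>
        <error to="fail"/>
    </action>
    <join name="join1" to="end"/>
    <kill name="fail">
        <message>Failed, Error Message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

If one branch of a fork went to join1 and another to join2, newer Oozie versions would be expected to reject the workflow at submission with an error like E0732. If the workflow is known to be safe, the Oozie documentation describes switching the check off with oozie.wf.validate.ForkJoin=false in job.properties.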

phpUnderControl and PHPUnit always failing build with code 255

I have the following build.xml file setup in phpUnderControl.
<target name="phpunit">
<exec executable="phpunit" dir="${basedir}/httpdocs" failonerror="on">
<arg line="--log-junit ${basedir}/build/logs/phpunit.xml
--coverage-clover ${basedir}/build/logs/phpunit.coverage.xml
--coverage-html ${basedir}/build/coverage
--colors
${basedir}/httpdocs/qc/unit/CalculatorTest.php" />
</exec>
</target>
For some unknown reason the build always fails with the message below.
phpunit:
[exec] PHPUnit 3.4.15 by Sebastian Bergmann.
[exec]
BUILD FAILED
/opt/cruisecontrol-bin-2.8.3/projects/citest.local/build.xml:30: exec returned: 255
I have run the very simple unit test manually within the unit directory and PHPUnit returns.
PHPUnit 3.4.15 by Sebastian Bergmann.
.
Time: 0 seconds, Memory: 5.25Mb
OK (1 test, 1 assertion)
Does anyone know why it keeps failing the build, when all the tests are fine?
My build script does have a clean target which removes any log files, so it's not that. I have also manually removed the log files, just in case it's that script, and changed the owner of the log directories so they are writable.
If it makes any difference, phpunit.xml is empty after PHPUnit is run.
Thanks.
UPDATE: Incidentally, if I remove failonerror="on" it works, obviously, but PHPUnit still returns 255 and I do want it to fail on any errors. The issue is that there aren't any errors, yet it still fails!
255 is the exit code that PHP returns on fatal and (I think) parse errors.
Could there be a fatal or parse error in the script execution, maybe after the tests have run?
Any PHP errors? What happens if you add error_reporting(E_ALL); ?
The 'exec returned: 255' is probably caused by the memory consumption of the code coverage report, which will be displayed after enabling error_reporting(E_ALL) as Pekka mentioned. Try ini_set('display_errors','on'); as well. Usually you will see a 'memory size ... exhausted' fatal error.
In my case, it turned out that I didn't have a web server installed. Installing nginx and setting it up as a web server (even though I'm not using it) fixed the problem.
It might be caused by an empty test suite in phpunit.xml, that is, no assertXXX calls in the test code.
I got Selenium, PHPUnit, PHPUnderControl, and CruiseControl all working together (here http://www.siteconsortium.com/h/p1.php?id=php002)
It is just a very basic example, but I was able to get it to work like this. Start from something simple and copy the other parts back in.
<?xml version="1.0" encoding="UTF-8"?>
<project name="test" default="build" basedir=".">
<target name="build">
<exec executable="C:\\PHP\\phpunit.bat" dir="C:\\workspace\\proj\\phpunit" failonerror="on">
<arg line="--log-junit C:\\workspace\\proj\\build\\logs\\phpunit.xml
--configuration C:\\workspace\\proj\\phpunit.xml
--include-path C:\\trunk\\includes C:\\workspace\\proj\\includes" />
</exec>
</target>
</project>

Resources