I'm using hadoop-2.7.2 and oozie-4.0.1. What should the jobTracker value be in the job.properties file of an Oozie workflow? I referred to this link:
http://hadooptutorial.info/apache-oozie-installation-on-ubuntu-14-04/
which states that in the YARN architecture the job tracker runs on port 8032, and that is what I'm currently using. But in Hadoop's mapred-site.xml I have the value hdfs://localhost:54311 for the job tracker property.
I'm confused. Can anyone explain this or provide some useful links for installing Oozie and running jobs on it?
Currently I'm not able to run workflow jobs on Oozie: a job stays in the Running state for a long time and then gets suspended with a connection error. The job DAG is also not generated; the UI throws an exception.
Please, can anyone help me with this?
In your properties file, just pass the ResourceManager address that you have configured in yarn-site.xml, or put the ResourceManager address directly in the workflow.xml file as
<job-tracker>localhost:8032</job-tracker>
When submitting the job you also need to tell the Oozie CLI on which host the Oozie server is running; I assume you didn't face any issues with that part. If you did, paste the error message and update the question.
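For reference, a minimal job.properties for a pseudo-distributed setup might look like the sketch below. The nameNode value assumes fs.defaultFS is hdfs://localhost:9000, and the application path is just a placeholder; adjust both to your setup.
# job.properties sketch (values are assumptions, adjust to your cluster)
nameNode=hdfs://localhost:9000
jobTracker=localhost:8032
queueName=default
# placeholder application path on HDFS
oozie.wf.application.path=${nameNode}/user/${user.name}/apps/my-workflow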
EDITED:
Configuration needed in yarn-site.xml:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<description>NM Webapp address.</description>
<name>yarn.nodemanager.webapp.address</name>
<value>${yarn.nodemanager.hostname}:8042</value>
</property>
<property>
<description>hostname </description>
<name>yarn.nodemanager.hostname</name>
<value>localhost</value>
</property>
You can specify either the hostname or localhost for a pseudo-distributed (single-node) cluster.
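If jobTracker in job.properties is localhost:8032, it has to match the ResourceManager client address. By default yarn.resourcemanager.address resolves to ${yarn.resourcemanager.hostname}:8032, but you can also set it explicitly; a sketch:
<property>
<name>yarn.resourcemanager.address</name>
<value>${yarn.resourcemanager.hostname}:8032</value>
</property>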
For an HA cluster you need the setup described here:
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
In a production environment you have probably configured a High-Availability YARN cluster. In this case, the Oozie job tracker value in job.properties should be the value of yarn.resourcemanager.cluster-id.
An excerpt of my YARN configuration:
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>datayarn</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>resourcemanager1,resourcemanager2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.resourcemanager1</name>
<value>11.11.11.11</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.resourcemanager2</name>
<value>11.11.11.12</value>
</property>
So the jobTracker value should be: datayarn
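A matching job.properties sketch (the nameNode value is an assumption; use your own HDFS nameservice or NameNode address). The Oozie server also needs to see the HA yarn-site.xml settings, typically through the Hadoop configuration directory referenced by oozie.service.HadoopAccessorService.hadoop.configurations, so that it can resolve the logical cluster id:
# job.properties sketch for the HA cluster above
nameNode=hdfs://mycluster
jobTracker=datayarn
queueName=default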
Related
Sometimes the connection to HDFS fails because of network fluctuations; in that case the task status stays Running forever.
You need to set a timeout and retry limit for the HDFS (IPC) client connection as follows:
<configuration>
<property>
<name>ipc.client.connect.timeout</name>
<value>3000</value>
</property>
<property>
<name>ipc.client.connect.max.retries.on.timeouts</name>
<value>3</value>
</property>
</configuration>
I'm new to Oozie. I was testing the following coordinator.xml; when I submit the job it runs in a loop, but I want it to run every day at 1:00 AM. Can someone tell me what mistake I'm making?
<coordinator-app name="cron-coord-jon" frequency="0 1 * * *" start="2009-01-01T05:00Z" end="2036-01-01T06:00Z" timezone="UTC"
xmlns="uri:oozie:coordinator:0.2">
<action>
<workflow>
<app-path>${workflowAppUri}</app-path>
<configuration>
<property>
<name>jobTracker</name>
<value>${jobTracker}</value>
</property>
<property>
<name>nameNode</name>
<value>${nameNode}</value>
</property>
<property>
<name>queueName</name>
<value>${queueName}</value>
</property>
</configuration>
</workflow>
</action>
</coordinator-app>
Your coordinator is likely not running in a loop, but rather submitting every 'missed' job since the start date you specified. Set the start date to the current day (e.g. 2019-06-03T00:00Z) and relaunch your coordinator.
If the start time is before 01:00, you should see a single job be launched for the day.
You may want to pass this in as a variable. Here is the call to date that will provide the current date & time in the correct format.
date -u "+%Y-%m-%dT%H:%MZ"
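A sketch of how the two pieces could fit together, assuming you name the coordinator property start and change the coordinator-app element to use start="${start}" (the Oozie URL below is a placeholder):
oozie job -oozie http://localhost:11000/oozie -config job.properties -Dstart=$(date -u "+%Y-%m-%dT%H:%MZ") -run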
I'm running a Java application in Oozie, and Oozie is adding something to the classpath. How do I know? When I run this application without Oozie it works perfectly fine, but with Oozie I get
java.lang.NoSuchMethodError: org.apache.hadoop.yarn.webapp.util.WebAppUtils.getProxyHostsAndPortsForAmFilter(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/List;
at org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer.initFilter(AmFilterInitializer.java:40)
at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:272)
at org.apache.hadoop.yarn.webapp.WebApps$Builder$2.<init>(WebApps.java:222)
at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:219)
at org.apache.hadoop.mapreduce.v2.app.client.MRClientService.serviceStart(MRClientService.java:136)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1058)
I even configured
<property>
<name>oozie.use.system.libpath</name>
<value>false</value>
</property>
<property>
<name>oozie.launcher.mapreduce.job.user.classpath.first</name>
<value>true</value>
</property>
But it doesn't help. How can I tell Oozie to keep its libraries off my classpath entirely?
I have configured and built Oozie 4.2.0 with the following components:
* Hadoop - 2.6.0
* OS - Mac 10.7
* Hive - 1.1.2
* Hbase - 1.2.1
The Tomcat version embedded in the Oozie distribution is 6.0.43. As per the Oozie install and configuration instructions the server should run at localhost:11000/oozie, but I am getting the error below when I start Oozie and check its status.
Command (I have removed the http:// as I am not allowed more than two links in the question):
oozie admin -oozie localhost:11000/oozie -status
Exception = Could not authenticate, Authentication failed, status: 404, message: Not Found
My oozie-site.xml configuration is as below:
<property>
<name>oozie.system.id</name>
<value>oozie-iMac</value>
<description>The Oozie system ID.</description>
</property>
<property>
<name>oozie.service.AuthorizationService.security.enabled</name>
<value>false</value>
<description>Specifies whether security is enabled or not.
If disabled any user can manage Oozie system and manage any job.
</description>
</property>
<property>
<name>oozie.service.HadoopAccessorService.kerberos.enabled</name>
<value>false</value>
<description>Indicates if Oozie is configured to use Kerberos.</description>
</property>
<property>
<name>oozie.authentication.type</name>
<value>simple</value>
<description> Defines Authentication used for Oozie HTTP endpoint.
Supported values: simple|kerberos|#AUTHENTICATION_HANDLER_CLASSNAME#
</description>
</property>
<property>
<name>oozie.services</name>
<value>
org.apache.oozie.service.SchedulerService,
org.apache.oozie.service.InstrumentationService,
org.apache.oozie.service.MemoryLocksService,
org.apache.oozie.service.UUIDService,
org.apache.oozie.service.ELService,
org.apache.oozie.service.AuthorizationService,
org.apache.oozie.service.UserGroupInformationService,
org.apache.oozie.service.HadoopAccessorService,
org.apache.oozie.service.JobsConcurrencyService,
org.apache.oozie.service.URIHandlerService,
org.apache.oozie.service.DagXLogInfoService,
org.apache.oozie.service.SchemaService,
org.apache.oozie.service.LiteWorkflowAppService,
org.apache.oozie.service.JPAService,
org.apache.oozie.service.CallbackService,
org.apache.oozie.service.ActionService,
org.apache.oozie.service.ShareLibService,
org.apache.oozie.service.CallableQueueService,
org.apache.oozie.service.ActionCheckerService,
org.apache.oozie.service.RecoveryService,
org.apache.oozie.service.PurgeService,
org.apache.oozie.service.CoordinatorEngineService,
org.apache.oozie.service.BundleEngineService,
org.apache.oozie.service.DagEngineService,
org.apache.oozie.service.CoordMaterializeTriggerService,
org.apache.oozie.service.StatusTransitService,
org.apache.oozie.service.PauseTransitService,
org.apache.oozie.service.GroupsService,
org.apache.oozie.service.ProxyUserService,
org.apache.oozie.service.XLogStreamingService,
org.apache.oozie.service.JvmPauseMonitorService,
org.apache.oozie.service.SparkConfigurationService
</value>
<description>
All services to be created and managed by Oozie Services singleton.
Class names must be separated by commas.
</description>
</property>
<property>
<name>oozie.db.schema.name</name>
<value>oozie</value>
<description> Oozie DataBase Name </description>
</property>
<property>
<name>oozie.service.JPAService.jdbc.driver</name>
<value>com.mysql.jdbc.Driver</value>
<description> JDBC driver class. </description>
</property>
<property>
<name>oozie.service.JPAService.jdbc.url</name>
<value>jdbc:mysql://localhost:3307/oozie?createDatabaseIfNotExist=true</value>
<description> JDBC URL. for MySQL DB connection </description>
</property>
<property>
<name>oozie.service.JPAService.jdbc.username</name>
<value>oozie</value>
<description> DB user name. </description>
</property>
<property>
<name>oozie.service.JPAService.jdbc.password</name>
<value>oozie</value>
<description> DB user password (leave 1 blank space if empty) </description>
</property>
<!-- Added as per Apache OOZIE Install documentation -->
<property>
<name>oozie.service.JPAService.create.db.schema</name>
<value>false</value>
</property>
<property>
<name>oozie.service.JPAService.validate.db.connection</name>
<value>false</value>
</property>
<property>
<name>oozie.service.JPAService.pool.max.active.conn</name>
<value>10</value>
</property>
<property>
<name>oozie.service.HadoopAccessorService.keytab.file</name>
<value>http.keytab</value>
<description>Location of the Oozie user keytab file</description>
</property>
core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
<description> Proxy User Setup for Oozie runs </description>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
<description> Proxy Group configuration for Oozie Run
</description>
</property>
When I open the browser and go to the same address, it shows the error below:
[screenshot: Tomcat error page in Safari]
What am I doing wrong here?
I am trying to execute the map-reduce Oozie example under a Hortonworks distribution, but it's not working yet.
First, here are my custom Hadoop configs (from Ambari). I had to modify core-site.xml to fix my "impersonation" problem:
hadoop.proxyuser.oozie.groups=*
hadoop.proxyuser.oozie.hosts=*
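Outside Ambari, these correspond to core-site.xml entries roughly like the following:
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>*</value>
</property>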
That works well, but now I have this:
Error: E0803 : E0803: IO error, <openjpa-2.1.0-r422266:1071316 fatal store error>
org.apache.openjpa.persistence.RollbackException: The transaction has been rolled back.
See the nested exceptions for details on the errors that occurred. FailedObject: org.apache.oozie.WorkflowJobBean#3f623c47
I have already found people with the same error but no solution... Maybe you can help me!
My job.properties (on local)
nameNode=hdfs://namenode01:8020
jobTracker=namenode01:8021
queueName=default
examplesRoot=examples
oozie.wf.application.path=${nameNode}/oozie/${examplesRoot}/apps/map-reduce
outputDir=map-reduce
and my workflow.xml (on HDFS)
<workflow-app xmlns="uri:oozie:workflow:0.2" name="map-reduce-wf">
<start to="mr-node"/>
<action name="mr-node">
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/oozie/examples/output-data/${outputDir}"/>
</prepare>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
<property>
<name>mapred.mapper.class</name>
<value>org.apache.oozie.example.SampleMapper</value>
</property>
<property>
<name>mapred.reducer.class</name>
<value>org.apache.oozie.example.SampleReducer</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>1</value>
</property>
<property>
<name>mapred.input.dir</name>
<value>/oozie/examples/input-data/text</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>/oozie/examples/output-data/${outputDir}</value>
</property>
</configuration>
</map-reduce>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
I am trying to execute my workflow with:
oozie job -oozie http://edgenode01:11000/oozie -config /home/oozie/examples/apps/no-op/job.properties -run
Thanks a lot!
Open the Job History UI at your-hadoop-host:50030/jobhistory.jsp and find your job. From there, go to the map tasks and check their logs.
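You can also pull the logs through the Oozie CLI; a sketch using the server URL from the question (the job id is a placeholder returned by the -run command):
oozie job -oozie http://edgenode01:11000/oozie -info <job-id>
oozie job -oozie http://edgenode01:11000/oozie -log <job-id>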
If you are using a Derby DB, check the database location for lock files owned by a user other than the intended one. If you find any, remove them, then stop and start Oozie again.
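A sketch of what that check might look like, assuming a default embedded Derby setup (the data directory path is an assumption; Derby keeps its locks in db.lck and dbex.lck inside the database directory):
# check who owns the Derby lock files (path is an assumption)
ls -l /path/to/oozie/data/oozie-db/*.lck
# if they belong to the wrong user, remove them and restart Oozie from the installation directory
rm /path/to/oozie/data/oozie-db/*.lck
bin/oozied.sh stop
bin/oozied.sh start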