When I run the RHadoop example, the errors below occur:
is running beyond virtual memory limits. Current usage: 121.2 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
hadoop streaming failed with error code 1
How can I fix it?
My Hadoop settings:
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/usr/local/hadoop-2.7.3/data/yarn/nm-local-dir</value>
</property>
<property>
<name>yarn.resourcemanager.fs.state-store.uri</name>
<value>/usr/local/hadoop-2.7.3/data/yarn/system/rmstore</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
<property>
<name>yarn.web-proxy.address</name>
<value>0.0.0.0:8089</value>
</property>
</configuration>
I got almost the same error while running a Spark application on a YARN cluster.
"Container [pid=791,containerID=container_1499942756442_0001_02_000001] is running beyond virtual memory limits. Current usage: 135.4 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container."
I resolved it by disabling the virtual memory check in yarn-site.xml:
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
This one setting was enough in my case.
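If you would rather keep the check enabled, another knob that is often mentioned in this context is yarn.nodemanager.vmem-pmem-ratio (default 2.1), which controls how much virtual memory a container may use per unit of physical memory. A minimal sketch for yarn-site.xml; the value 4 is only an illustrative assumption, not something from the question:
<property>
<!-- illustrative ratio; the default is 2.1, tune to your workload -->
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4</value>
</property>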
I referred to the site below.
http://crazyadmins.com/tag/tuning-yarn-to-get-maximum-performance/
There I learned that I can change the memory allocation for MapReduce, so I changed mapred-site.xml to:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>2000</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>2000</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx1600m</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx1600m</value>
</property>
</configuration>
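Note that mapreduce.*.java.opts sets the task JVM heap (hence -Xmx1600m above) and should stay below the matching mapreduce.*.memory.mb container size. For the new limits to take effect, the YARN daemons have to be restarted before re-running the job; a minimal sketch, assuming the standard Hadoop 2.7.x sbin scripts and that HADOOP_HOME points at the install directory:
# restart YARN so the new container sizes are picked up
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/start-yarn.sh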
Related
Sometimes the explorer fails to connect to HDFS because of network fluctuations; in that case the task status stays at Running indefinitely.
You need to set the timeout for HDFS connections as follows:
<configuration>
<property>
<name>ipc.client.connect.timeout</name>
<value>3000</value>
</property>
<property>
<name>ipc.client.connect.max.retries.on.timeouts</name>
<value>3</value>
</property>
</configuration>
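These are client-side IPC settings and normally live in core-site.xml. One way to check that they are actually being picked up is to query the effective configuration; a sketch assuming a Hadoop 2.x hdfs CLI on the PATH:
# print the effective IPC connect timeout (milliseconds) and retry count
hdfs getconf -confKey ipc.client.connect.timeout
hdfs getconf -confKey ipc.client.connect.max.retries.on.timeouts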
I'm running a Java application in Oozie, and Oozie is adding something to the classpath. How do I know? When I run this application without Oozie it works perfectly fine, but with Oozie I get:
java.lang.NoSuchMethodError: org.apache.hadoop.yarn.webapp.util.WebAppUtils.getProxyHostsAndPortsForAmFilter(Lorg/apache/hadoop/conf/Configuration;)Ljava/util/List;
at org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer.initFilter(AmFilterInitializer.java:40)
at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:272)
at org.apache.hadoop.yarn.webapp.WebApps$Builder$2.<init>(WebApps.java:222)
at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:219)
at org.apache.hadoop.mapreduce.v2.app.client.MRClientService.serviceStart(MRClientService.java:136)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1058)
I even configured:
<property>
<name>oozie.use.system.libpath</name>
<value>false</value>
</property>
<property>
<name>oozie.launcher.mapreduce.job.user.classpath.first</name>
<value>true</value>
</property>
But it doesn't help. How can I tell Oozie to keep out of my classpath entirely?
I'm using hadoop-2.7.2 and oozie-4.0.1. What should the jobTracker value be in the job.properties file of an Oozie workflow? I referred to this link:
http://hadooptutorial.info/apache-oozie-installation-on-ubuntu-14-04/
which states that in the YARN architecture the job tracker runs on port 8032, which is what I'm currently using. But in Hadoop's mapred-site.xml I have the value hdfs://localhost:54311 for the job tracker property.
I'm confused; can anyone explain this or provide some useful links for installing Oozie and running jobs on it?
Currently I'm not able to run workflow jobs on Oozie: a job stays in the Running state for a long time and then gets suspended with a connection error. The job DAG is also not generated; it throws a UI exception.
Can anyone please help me with this?
In your properties file, just pass the ResourceManager address that you configured in yarn-site.xml, or set the ResourceManager address directly in the workflow.xml file as:
<job-tracker>localhost:8032</job-tracker>
When running with the properties file you also need to specify the host on which the Oozie server is running; I assume you didn't face any issues with that part. If you did, paste the error message and update the question.
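For reference, a minimal job.properties sketch for a pseudo-distributed setup; the nameNode port, queue, and application path are assumptions and must match your own core-site.xml and HDFS layout:
# minimal job.properties sketch (values are assumptions, adjust to your cluster)
nameNode=hdfs://localhost:9000
jobTracker=localhost:8032
queueName=default
oozie.wf.application.path=${nameNode}/user/${user.name}/my-app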
EDITED:
Configuration needed in yarn-site.xml:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<description>NM Webapp address.</description>
<name>yarn.nodemanager.webapp.address</name>
<value>${yarn.nodemanager.hostname}:8042</value>
</property>
<property>
<description>hostname </description>
<name>yarn.nodemanager.hostname</name>
<value>localhost</value>
</property>
You can specify either the hostname or localhost for a pseudo-distributed (single-node) cluster.
For an HA cluster, see:
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
In a production environment you have probably configured a high-availability YARN cluster. In that case, the Oozie job tracker setting in job.properties should be the value of yarn.resourcemanager.cluster-id.
An excerpt of my YARN configuration:
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>datayarn</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>resourcemanager1,resourcemanager2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.resourcemanager1</name>
<value>11.11.11.11</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.resourcemanager2</name>
<value>11.11.11.12</value>
</property>
So the jobTracker value should be: datayarn
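In other words, in the HA case job.properties points at the cluster id rather than a single host:port. A short sketch based on the configuration above:
# job.properties entry for the HA cluster above;
# must match yarn.resourcemanager.cluster-id in yarn-site.xml
jobTracker=datayarn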
I have configured and built Oozie 4.2.0 with the following components:
* Hadoop - 2.6.0
* OS - Mac 10.7
* Hive - 1.1.2
* Hbase - 1.2.1
The embedded Tomcat version in the Oozie distribution is 6.0.43. As per the Oozie install and configuration instructions the server should run at localhost:11000/oozie, but I get the error below when I start Oozie and check its status.
Command (I have removed the http:// as I am not allowed more than 2 links in the question):
oozie admin -oozie localhost:11000/oozie -status
Exception = Could not authenticate, Authentication failed, status: 404, message: Not Found
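For reference, with the scheme restored, the status check that produced this error looks like:
# same status check with the http:// scheme restored
oozie admin -oozie http://localhost:11000/oozie -status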
My oozie-site.xml configuration is as below:
<property>
<name>oozie.system.id</name>
<value>oozie-iMac</value>
<description>The Oozie system ID.</description>
</property>
<property>
<name>oozie.service.AuthorizationService.security.enabled</name>
<value>false</value>
<description>Specifies whether security is enabled or not.
If disabled any user can manage Oozie system and manage any job.
</description>
</property>
<property>
<name>oozie.service.HadoopAccessorService.kerberos.enabled</name>
<value>false</value>
<description>Indicates if Oozie is configured to use Kerberos.</description>
</property>
<property>
<name>oozie.authentication.type</name>
<value>simple</value>
<description> Defines Authentication used for Oozie HTTP endpoint.
Supported values: simple|kerberos|#AUTHENTICATION_HANDLER_CLASSNAME#
</description>
</property>
<property>
<name>oozie.services</name>
<value>
org.apache.oozie.service.SchedulerService,
org.apache.oozie.service.InstrumentationService,
org.apache.oozie.service.MemoryLocksService,
org.apache.oozie.service.UUIDService,
org.apache.oozie.service.ELService,
org.apache.oozie.service.AuthorizationService,
org.apache.oozie.service.UserGroupInformationService,
org.apache.oozie.service.HadoopAccessorService,
org.apache.oozie.service.JobsConcurrencyService,
org.apache.oozie.service.URIHandlerService,
org.apache.oozie.service.DagXLogInfoService,
org.apache.oozie.service.SchemaService,
org.apache.oozie.service.LiteWorkflowAppService,
org.apache.oozie.service.JPAService,
org.apache.oozie.service.CallbackService,
org.apache.oozie.service.ActionService,
org.apache.oozie.service.ShareLibService,
org.apache.oozie.service.CallableQueueService,
org.apache.oozie.service.ActionCheckerService,
org.apache.oozie.service.RecoveryService,
org.apache.oozie.service.PurgeService,
org.apache.oozie.service.CoordinatorEngineService,
org.apache.oozie.service.BundleEngineService,
org.apache.oozie.service.DagEngineService,
org.apache.oozie.service.CoordMaterializeTriggerService,
org.apache.oozie.service.StatusTransitService,
org.apache.oozie.service.PauseTransitService,
org.apache.oozie.service.GroupsService,
org.apache.oozie.service.ProxyUserService,
org.apache.oozie.service.XLogStreamingService,
org.apache.oozie.service.JvmPauseMonitorService,
org.apache.oozie.service.SparkConfigurationService
</value>
<description>
All services to be created and managed by Oozie Services singleton.
Class names must be separated by commas.
</description>
</property>
<property>
<name>oozie.db.schema.name</name>
<value>oozie</value>
<description> Oozie DataBase Name </description>
</property>
<property>
<name>oozie.service.JPAService.jdbc.driver</name>
<value>com.mysql.jdbc.Driver</value>
<description> JDBC driver class. </description>
</property>
<property>
<name>oozie.service.JPAService.jdbc.url</name>
<value>jdbc:mysql://localhost:3307/oozie?createDatabaseIfNotExist=true</value>
<description> JDBC URL. for MySQL DB connection </description>
</property>
<property>
<name>oozie.service.JPAService.jdbc.username</name>
<value>oozie</value>
<description> DB user name. </description>
</property>
<property>
<name>oozie.service.JPAService.jdbc.password</name>
<value>oozie</value>
<description> DB user password (leave 1 blank space if empty) </description>
</property>
<!-- Added as per Apache OOZIE Install documentation -->
<property>
<name>oozie.service.JPAService.create.db.schema</name>
<value>false</value>
</property>
<property>
<name>oozie.service.JPAService.validate.db.connection</name>
<value>false</value>
</property>
<property>
<name>oozie.service.JPAService.pool.max.active.conn</name>
<value>10</value>
</property>
<property>
<name>oozie.service.HadoopAccessorService.keytab.file</name>
<value>http.keytab</value>
<description>Location of the Oozie user keytab file</description>
</property>
core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
<description> Proxy User Setup for Oozie runs </description>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
<description> Proxy Group configuration for Oozie Run
</description>
</property>
When I open the browser and go to the same address, it shows the error below (screenshot: Tomcat error page in Safari).
What am I doing wrong here?
I'm trying to create an Oozie job with a custom InputFormat. I am using the new API and have set:
<property>
<name>mapred.mapper.new-api</name>
<value>true</value>
</property>
<property>
<name>mapred.reducer.new-api</name>
<value>true</value>
</property>
The property name I'm trying is:
<property>
<name>mapreduce.job.inputformat.class</name>
<value>org.lab41.dendrite.generator.kronecker.mapreduce.lib.input.QuotaInputFormat</value>
</property>
Is this the correct property name?
You can see this page:
https://cwiki.apache.org/confluence/display/OOZIE/Map+Reduce+Cookbook
The correct property name should be 'mapred.input.format.class', so you can write it like this:
<property>
<name>mapred.input.format.class</name>
<value>org.lab41.dendrite.generator.kronecker.mapreduce.lib.input.QuotaInputFormat</value>
</property>
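For context, a sketch of where such a property could sit inside an Oozie map-reduce action in workflow.xml, using the property name recommended above; <job-tracker> and <name-node> are the standard action elements, and their values are placeholders resolved from job.properties:
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<!-- property name as recommended in the answer above -->
<name>mapred.input.format.class</name>
<value>org.lab41.dendrite.generator.kronecker.mapreduce.lib.input.QuotaInputFormat</value>
</property>
</configuration>
</map-reduce>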