I have a setup with Hue 3.8 and HDP 2.3 installed through Ambari.
When I try to run a dummy script from the Oozie dashboard, it creates a job.properties file for it. That file contains a wrong mapping for the HDFS URL, because of which the script fails.
I need help understanding where this properties file gets populated from.
Any help would be highly appreciated.
Thanks.
It comes from the HDFS section of the hue.ini config file.
You should check this value:
[hadoop]
[[hdfs_clusters]]
[[[default]]]
# Enter the filesystem uri
fs_defaultfs=hdfs://localhost:8020
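For example, a corrected entry would point at your actual NameNode rather than localhost (the hostname below is illustrative; use the value of fs.defaultFS from your cluster's core-site.xml, or the nameservice ID if HDFS HA is enabled):
[hadoop]
[[hdfs_clusters]]
[[[default]]]
# Must match fs.defaultFS in core-site.xml, not localhost
fs_defaultfs=hdfs://namenode.example.com:8020
After editing hue.ini, restart Hue so the Oozie dashboard picks up the new value.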
Related
How to pass a file as command line argument to a Spark job in an Oozie workflow? My Spark job expects a file as a command line argument, but when I pass that file in the workflow as /file/location, it does not pick up that file.
I found one workaround: put the file in a custom-directory in the Oozie shared library, with a few additional changes in job.properties:
oozie.use.system.libpath=true
oozie.action.sharelib.for.spark=spark,custom-directory
oozie.libpath=true
Then we need to update the shared lib using the command below:
oozie admin -auth SIMPLE -sharelibupdate
After that, we can reference the file in the Oozie workflow directly by its name, since it now lives in custom-directory.
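As a hedged sketch (the action name, class, and jar/file names below are illustrative), the spark action in workflow.xml can then refer to the file by its bare name:
<action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn-cluster</master>
        <name>MySparkJob</name>
        <class>com.example.MySparkMain</class>
        <jar>${nameNode}/apps/myapp/lib/myapp.jar</jar>
        <!-- the file sits in the custom-directory of the share lib, so the bare name resolves -->
        <arg>input.conf</arg>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>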
The slapd executable file is not present in /etc/rc.d/init.d, and the slapd.d directory is not present either.
Please suggest how I can create them.
PS: I am able to start LDAP with the file "/usr/local/libexec/slapd".
I am new to DevStack and am trying to understand how it works. I have one question regarding the generation of the tempest.conf file: I cannot understand how this file gets generated and which part of the code generates it.
Is it always generated into the /opt/stack/tempest/etc/ directory? What if I have a different folder structure and want to generate my tempest.conf file in, say, the /opt/stack/new/tempest/etc/ directory?
Any help is appreciated, thanks.
First of all, Tempest is the OpenStack integration test framework, used in OpenStack development; you can find it on GitHub.
The tempest.conf file is the configuration file of Tempest. From the README.rst:
To start you need to create a configuration file. The easiest way to create a configuration file is to generate a sample in the etc/ directory
$ cd $TEMPEST_ROOT_DIR
$ oslo-config-generator --config-file \
tempest/cmd/config-generator.tempest.conf \
--output-file etc/tempest.conf
You can just modify the output file path in the --output-file param.
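So for the layout in the question, assuming the Tempest checkout lives at /opt/stack/new/tempest, the same command can write the file there:
$ cd /opt/stack/new/tempest
$ oslo-config-generator --config-file \
    tempest/cmd/config-generator.tempest.conf \
    --output-file /opt/stack/new/tempest/etc/tempest.conf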
I am trying to place an updated jar under the lib path and remove the old jar. Unfortunately, I still see the old log messages from the old jar in the Oozie console. For confidentiality reasons I am unable to show the logs here, but these are the steps I am following:
Replacing the jar (mycode.jar) under the lib folder which is referenced in workflow.xml
Submitting the Oozie job using oozie job -oozie http://host -config job.properties -run
When I look at the logs in the console, I still see the old jar's log messages (from the older version of mycode.jar) even though the jar has been replaced.
If you are talking about the lib directory in the Oozie workflow application, then you do not need to do anything: the next execution of the workflow will automatically pick up the new (updated) jar.
For jars in the share lib (/user/oozie/share/lib/lib_*/*), after replacing the jar you need to execute the following command to refresh the share lib on the Oozie server:
oozie admin -sharelibupdate
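For example, a hedged sketch of the whole sequence (the lib_* timestamp directory, subfolder, and Oozie URL are illustrative):
hdfs dfs -put -f mycode.jar /user/oozie/share/lib/lib_20170101/oozie/
oozie admin -oozie http://host:11000/oozie -sharelibupdate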
Hope this will help. Thanks.
To make sure the issue is the same, I'll narrate what I was facing:
Created a MapReduce JAR and placed it in the lib folder.
Ran the Oozie (MapReduce action) job; it picked up the JAR as expected and ran fine.
I made some functionality changes in my code (JAR), so I added new log statements to make sure the new JAR was being picked up. Built the JAR and replaced the old JAR with the newly built JAR in the lib folder (HDFS).
Ran the Oozie job again; code from the old JAR was executed, because the new log statements did not show up.
After some searching I found the following tips:
Clear the YARN cache: I found this on the Hortonworks site (https://community.hortonworks.com/articles/92339/how-to-clear-local-file-cache-and-user-cache-for-y.html); pasting the content below for reference.
Short description:
To use a different version of a jar file with the same name, clear the cache on all NodeManager hosts to prevent the application from using the old jar.
a. Find out the cache location by checking the value of the yarn.nodemanager.local-dirs property
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/hadoop/yarn/local</value>
</property>
b. Remove the filecache and usercache folders located inside the directories specified in yarn.nodemanager.local-dirs.
[yarn@node2 ~]$ cd /hadoop/yarn/local/
[yarn@node2 local]$ ls
filecache  nmPrivate  spark_shuffle  usercache
[yarn@node2 local]$ rm -rf filecache/ usercache/
c. Restart YARN service.
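For step c, a minimal sketch of the restart, assuming a script-managed (non-Ambari) installation; on an Ambari-managed cluster, restart YARN from the Ambari UI instead:
# run on each NodeManager host
$HADOOP_HOME/sbin/yarn-daemon.sh stop nodemanager
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager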
I was unable to clear the cache because I did not have the necessary access, so I followed the workaround below:
Rename the package or class. Since this package/class was written by me, I had the liberty to simply rename the class; when the new class name was looked up in Oozie, the new functionality was executed automatically.
Option 2 (renaming) may not be viable for many, and the question remains open as to why Oozie does not pick up the new JAR/class.
I'm writing an oozie java action which has my custom code in a jar file in the job ./lib folder.
I would also like to add to the classpath a jar in a folder external to my job (i.e. /home/me/otherjars/spark-assembly.jar).
The ./lib folder gets added to the classpath automatically. How can I get oozie to also add the external jar?
The oozie.libpath property is definitely what you need. Please check:
the Oozie documentation
this Oozie JIRA about global/local scope for that property
this orphan thread about precedence order (search for that phrase)
this post and this other post, for example
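As a minimal sketch, assuming the jar is first uploaded to HDFS (oozie.libpath expects HDFS paths, not local ones such as /home/me/otherjars; the HDFS directory below is hypothetical, and nameNode is assumed to be defined in job.properties as usual):
hdfs dfs -mkdir -p /user/me/otherjars
hdfs dfs -put /home/me/otherjars/spark-assembly.jar /user/me/otherjars/
Then in job.properties:
oozie.use.system.libpath=true
oozie.libpath=${nameNode}/user/me/otherjars
Every jar in that directory is then added to the action's classpath alongside the jars from the workflow's ./lib folder.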
The best way to use any custom jars in Oozie: once the Oozie share lib is installed in the cluster, place the jar in a subfolder and pass the parameter
oozie.use.system.libpath=true
This will make the jar available whenever the jobs are started.
Another option is to add the custom/UDF jar's path to the Hadoop classpath in the hadoop-env.sh file. This requires a Hadoop restart to take effect, and the custom jar must be available at the same path on all the nodes of the Hadoop cluster.
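For example, a hedged sketch of that change (the jar path is illustrative and must exist at the same location on every node):
# in hadoop-env.sh on every node, followed by a restart of the Hadoop services
export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:/opt/custom/jars/my-udf.jar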