I'm looking for a way to add the Phoenix project (http://phoenix.incubator.apache.org/) JARs to an HBase region server running under Cloudera CDH 4.5.
There are several mentions around the web of /usr/lib/.., but this CDH release packages at least the HBase dependencies under directories such as /usr/share/cmf/lib/cdh4/hbase-0.94.6-cdh4.5.0.jar.
I'm looking for a way to do this under Cloudera Manager. Ideally, we'd be able to deploy dependencies like Phoenix via HDFS or similar.
Has anybody tried this integration?
I am using CDH 4.7 and have been working with Phoenix for the past 2-3 months. It's working fine.
Below are the steps I followed:
Phoenix version 2.2.2
Copied phoenix-2.2.2.jar to the HBase lib directory on all region servers, which is under the parcels directory (for me, /opt/cloudera/parcels/CDH-4.7.7-*/lib/hbase/lib).
Then used phoenix-2.2.2-client.jar to create tables and upload data to them.
Note: you must specify the ZooKeeper quorum server for the client JAR.
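The steps above can be sketched as shell commands. The region-server host names and the ZooKeeper host are placeholders, the parcel path follows the answer, and the commands are printed for review rather than executed, since they need a live cluster:

```shell
# Placeholder hosts; the parcel path is taken from the answer above.
JAR=phoenix-2.2.2.jar
HBASE_LIB='/opt/cloudera/parcels/CDH-4.7.7-*/lib/hbase/lib'

# Copy the server JAR to every region server's HBase lib dir:
for host in regionserver1 regionserver2; do
  echo "scp $JAR $host:$HBASE_LIB/"
done

# After restarting the region servers from Cloudera Manager, connect with
# the client tooling, pointing it at the ZooKeeper quorum (placeholder host):
echo "sqlline.py zk-host:2181"
```

The restart matters: region servers only pick up new JARs in their lib directory when they start.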
Related
I am trying to update Hue on HDP in an environment with no internet access.
However, the build process needs to download some Python packages from the internet.
Also, I cannot find any pre-built Hue package without CDH (I am working on HDP, so installing CDH just for Hue is inconvenient).
Does anyone have a good idea for this?
It needs more than just Python packages; there are external build dependencies as well as Cloudera and Maven Central artifacts that it downloads.
Download Hue where you do have internet access. Build a tarball using make clean prod, then copy it to the cluster.
It runs in a virtualenv, so as long as it's built with a matching OS and Python version, it should be fine.
I've done this since Hue 3.11 on an HDP cluster.
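A sketch of that offline workflow: build on a machine that has internet and the same OS/Python as the cluster, then ship the tarball over. The version number, output path, and destination host are placeholders; the commands are printed for review rather than executed:

```shell
HUE_VERSION=3.11.0   # placeholder: use the release you need
cmds=$(cat <<EOF
git clone https://github.com/cloudera/hue.git && cd hue
make clean prod
tar -czf hue-$HUE_VERSION.tgz build/release/prod/hue-$HUE_VERSION
scp hue-$HUE_VERSION.tgz cluster-node:/opt/
EOF
)
echo "$cmds"
```

Because the virtualenv bakes in absolute interpreter paths, unpack the tarball at the same filesystem location on the cluster as on the build host, or relocate it carefully.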
I'm trying to install Spark 2 in my Cloudera cluster (evaluation version) following Cloudera's instructions for this component. I downloaded the CSD and installed it, then downloaded and distributed the parcel, but when I try to activate it I get this message:
CDH (5.8 and higher) parcel required for SPARK2
(2.2.0.cloudera1-1.cdh5.12.0.p0.142354) is not available.
This is the information of the cluster:
Version: Cloudera Enterprise Data Hub Edition Trial 5.12.1 (#6 built
by jenkins on 20170818-0807 git:
9bdee611802535491d400e03c98ef694a2c77d0a)
Java VM Name: Java HotSpot(TM) 64-Bit Server VM
Java VM Vendor: Oracle Corporation
Java Version: 1.7.0_67
CSD
SPARK2_ON_YARN-2.2.0.cloudera1.jar
Parcel
http://archive.cloudera.com/spark2/parcels/2.2.0.cloudera1/
I'm thinking it could be a mismatch between my CDH version (5.12.1) and the version of the latest Spark 2 parcel (cdh5.12.0), but I can't find any other parcel for cdh5.12.1. My next question is: for cdh5.13.0, which is the right Spark 2 parcel?
The error message is misleading. The real issue is that your cluster is running on Java 1.7. Spark 2.2 is only supported on Java 1.8. Upgrade Java on your cluster and you should be able to install the Spark 2.2 parcel.
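A quick way to confirm this on each node is to check the JVM's reported version before retrying the parcel activation. Spark 2.2 needs Java 1.8, while the cluster info above reports 1.7.0_67:

```shell
# Extract the quoted version string from `java -version` (printed on stderr).
ver=$(java -version 2>&1 | awk -F '"' '/version/ {print $2}')
case "$ver" in
  1.7*) echo "Java $ver: too old for Spark 2.2, upgrade to 1.8" ;;
  *)    echo "Java $ver: confirm this is 1.8 or newer" ;;
esac
```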
Finally solved it. The problem was that I needed to update the CDH core; after the update, Spark 2 works fine.
I have installed the Cloudera CDH QuickStart VM 5.5, and I'm running a Sqoop action in my Oozie workflow. I encountered an error saying the MySQL JDBC driver is missing, and I came across an SO answer here saying that mysql-connector-java.jar should be placed in Oozie's HDFS shared lib path, under the sqoop path.
When I browse Oozie's HDFS shared lib path, however, I've noticed two sqoop subdirectories where the jar could go:
/user/oozie/share/lib/sqoop
and
/user/oozie/share/lib/lib_20151118030154/sqoop
Aside from sqoop, hive, pig, distcp, and mapreduce-streaming paths also exist on both lib and lib/lib_20151118030154.
So the question is: where do I place my connector jar, in the first or the second one?
What's the difference (or difference of purpose) of these two paths in relation to jars of sqoop, hive, pig, distcp, and mapreduce-streaming for Oozie?
The lib_20151118030154 sub-dir would be the current version of the ShareLibs, as of 18-NOV-2015. The versioning allows you to make updates without stopping the Oozie service -- check the documentation here.
In other words: the Oozie service keeps in memory a list of the JARs in each ShareLib (based on what was present for the latest version at boot time), so that adding a JAR will not make a difference until (a) you stop/restart the service or (b) you resync the service as explained in the doc above.
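Following that answer, a minimal sketch: put the connector into the current timestamped sharelib directory, then resync Oozie so it picks up the new JAR without a restart. The Oozie server URL is a placeholder, and the commands are printed for review rather than executed, since they need a live cluster:

```shell
# Timestamped sharelib dir from the question above.
SHARELIB=/user/oozie/share/lib/lib_20151118030154
cmds=$(cat <<EOF
hdfs dfs -put mysql-connector-java.jar $SHARELIB/sqoop/
oozie admin -oozie http://oozie-host:11000/oozie -sharelibupdate
EOF
)
echo "$cmds"
```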
It sounds very basic, but I haven't found clear instructions on how to do this. I'm new to OpenStack. I have set up DevStack on my laptop, I have created an instance from a CirrOS image, and now I would like this instance to run a jar. I was expecting this to work similarly to Amazon EMR, for instance, but obviously it doesn't. Any help or hints toward straightforward tutorials will be appreciated.
The CirrOS image doesn't include Java, nor does it include a facility for installing additional packages. You should boot a full distribution of some sort (e.g., Fedora, CentOS, Ubuntu) and then install Java following the instructions appropriate for that distribution.
Once you have Java installed, you can upload and run your jar file.
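For example, on an Ubuntu guest (the package name is Debian/Ubuntu-specific — use yum or dnf on CentOS/Fedora — and the jar path is a placeholder; printed for review rather than executed):

```shell
cmds=$(cat <<'EOF'
sudo apt-get update && sudo apt-get install -y default-jre
java -jar /path/to/your-app.jar
EOF
)
echo "$cmds"
```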
I've successfully installed a hadoop cluster on EC2 using Cloudera Manager. All the services are up and running.
Now I wish to use the command line client to add files to hdfs. I've ssh'd into the server and there is no such executable that I can find. I'm assuming I've overlooking something simple. Thanks for any help.
CDH should set up an alias for hadoop; if not, you can find all the Hadoop ecosystem projects under /opt/cloudera/parcels/CDH/lib.
The binary for hadoop is:
/opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop
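If the hadoop wrapper isn't on your PATH, you can add the parcel's bin directory (path from the answer above) and then use the usual filesystem commands. The put/ls commands below are printed for review rather than executed, since they need a running cluster:

```shell
export PATH="$PATH:/opt/cloudera/parcels/CDH/lib/hadoop/bin"
cmds=$(cat <<'EOF'
hadoop fs -mkdir -p /user/$USER/data
hadoop fs -put localfile.txt /user/$USER/data/
hadoop fs -ls /user/$USER/data
EOF
)
echo "$cmds"
```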