How to use Cloudera Manager to monitor the components of CDH4

I have already installed CDH4 without using Cloudera Manager. I would now like to use Cloudera Manager so that I can monitor the different components of CDH4. Please suggest how I can bring the manager into use at this point.

I recently had to undertake the same task of importing already-installed, running clusters into new Cloudera Manager instances.
I would first suggest taking the time to read through as much documentation as possible to fully understand the process and its key components.
As a short answer: you need to manually import all of your cluster configurations and assignments into Cloudera Manager so that they can be managed. A rough outline of the plan I used is below:
Set up a MySQL instance on NEW hardware (PostgreSQL can also be used)
Create a Cloudera Manager user on all servers (it must be sudo-enabled)
Set up SSH key access between the cloudera-manager server and all other hosts (a command sketch follows this outline)
Useful Docs below:
- http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/Cloudera-Manager-Installation-Guide/cmig_install_mysql.html
- http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/Cloudera-Manager-Installation-Guide/cmig_install_path_B.html
Install the Cloudera Manager server and agent/daemon packages on the Cloudera Manager server
Shut down all services that use the cluster, and then the cluster services themselves
Save the namespace
Back up the metadata and configuration files to MULTIPLE LOCATIONS
Ensure the backup can be loaded by starting a single-instance NameNode
Install the Cloudera Manager agent and daemon on all production servers
Start the services on the Cloudera Manager server
Access the Cloudera Manager interface
Skip the Setup Wizard
Add all hosts to Cloudera Manager
Create the HDFS service - DO NOT start the service
Check that the host assignments are correct
Input all configuration file parameters and verify them (this means each server's conf files need to be input manually)
Run the host inspector and the configuration check
Perform the above process for the remaining services
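As a rough illustration of the user, SSH, and namespace steps above, here is a minimal shell sketch. The hostnames, the cloudera-scm user name, and the /data/dfs/nn metadata path are assumptions; substitute your own values.

    # create a sudo-enabled Cloudera Manager user on every host (assumed name: cloudera-scm)
    for host in host1.example.com host2.example.com host3.example.com; do
      ssh root@$host 'useradd -m cloudera-scm && echo "cloudera-scm ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers'
      # push the manager server's key for passwordless SSH
      ssh-copy-id cloudera-scm@$host
    done
    # quiesce HDFS and save the namespace before backing anything up
    sudo -u hdfs hdfs dfsadmin -safemode enter
    sudo -u hdfs hdfs dfsadmin -saveNamespace
    # copy the NameNode metadata to more than one location (path is an assumption)
    tar czf /backup/nn-meta-$(date +%F).tar.gz /data/dfs/nn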
I hope this provides some assistance. If you have any other questions I will be happy to help as much as I can.
Regards,
James

I just recorded a webinar titled "Installing Cloudera Manager in < 30 mins" for Global Knowledge. It is available at http://www.globalknowledge.com/training/coursewebsem.asp?pageid=9&courseid=20221&catid=248&country=United+States (register in the upper right of the page). In the video, I install CM on Ubuntu, set up the core components (Hadoop only), and then browse through some of the graphs for monitoring.


Cloudify architecture

I am trying to set up Cloudify in an OpenStack installation using this offline guide.
The guide does not say much about the cloud platform, so I have assumed it can be used in an OpenStack environment. I am using the simple manager blueprint YAML file for bootstrapping.
I have the following questions:
Can I use Fabric 1.4.2 with Cloudify 3.4.1?
If not: I am unable to find the wagon (.wgn) file for Fabric 1.4.1.
Architecture: can I use the CLI inside a network to bootstrap a manager within that same network, where the network lies inside the OpenStack environment? Can the Cloudify CLI machine, the Cloudify Manager, and the application all reside within one network inside OpenStack? If so, how? We would like to test it inside one single network.
(Full disclosure: I wrote the document you linked to.)
Yes, you can.
You can find all Wagon files for all versions of the Fabric plugin here: https://github.com/cloudify-cosmo/cloudify-fabric-plugin/releases
Yes.
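For reference, a minimal sketch of the CLI workflow this implies, assuming Cloudify CLI 3.4.x; the blueprint, inputs, and wagon file names are placeholders:

    # bootstrap a manager from the CLI machine inside the network
    cfy init
    cfy bootstrap -p simple-manager-blueprint.yaml -i manager-inputs.yaml
    # upload the matching Fabric plugin wagon from the releases page linked above
    cfy plugins upload -p cloudify_fabric_plugin-1.4.2-py27-none-linux_x86_64.wgn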

Saving instances on OpenStack

I have installed OpenStack as a devstack environment, and I am finding it difficult to save my work after a host reboot. If I instead install OpenStack component-wise, will that help me in any way to save my work after a host reboot, and are there any extra benefits of installing OpenStack component-wise?
Installing OpenStack component-wise would certainly enhance your end-to-end understanding of how the services interact with each other. Devstack is an all-in-one-place sort of installation. For better understanding, I'd recommend installing each component manually following the OpenStack documentation.
The reason the VM data is lost after a reboot is that you launched the VM with an ephemeral disk, which is gone after the VM reboots.
Create the instance with a persistent root disk instead; then you will have a permanent disk.
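A minimal sketch of booting from a persistent Cinder volume with the openstack client; the image, flavor, network, and volume names are placeholders:

    # create a bootable volume from an image, then boot the server from that volume
    openstack volume create --image cirros-0.3.4 --size 10 boot-vol
    openstack server create --volume boot-vol --flavor m1.small --network private myvm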

Can Apache Drill work with Cloudera Hadoop?

I am trying to set up Apache Drill in distributed mode. I already have a Cloudera Hadoop cluster with a master and 2 slaves. From the documentation on Apache Drill, it is not entirely clear whether it can be set up on a typical Cloudera cluster. I could not find any relevant articles. Any kind of help will be appreciated.
Drill can be installed alongside Cloudera on the nodes of the cluster independently, and will be able to query the files on HDFS.
Refer to the link below for installation details:
https://cwiki.apache.org/confluence/display/DRILL/Deploying+Apache+Drill+in+a+Clustered+Environment
I got this working with the Cloudera Hadoop distribution. I already had a Cloudera cluster installed with all services running.
Perform the following steps:
Install Apache Drill on all nodes of the cluster.
Run drill/bin/drillbit.sh on each node.
Configure the storage plugin for dfs using the Apache Drill web interface at host:8047, and update the HDFS configuration there (a sketch follows this list).
Run SQLLine: ./sqlline -u jdbc:drill:zk=host1:2181,host2:2181,host3:2181
(2181 is the port number used by ZooKeeper.)
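As a rough sketch of the storage plugin step: the dfs plugin just needs its connection pointed at the cluster's NameNode. The same JSON can be pasted into the web UI at host:8047, or pushed through Drill's REST API; the hostnames and the 8020 NameNode port here are assumptions:

    # update the dfs storage plugin to point at HDFS instead of the local filesystem
    curl -X POST -H "Content-Type: application/json" http://host1:8047/storage/dfs.json \
      -d '{"name":"dfs","config":{"type":"file","enabled":true,
           "connection":"hdfs://namenode:8020",
           "workspaces":{"root":{"location":"/","writable":false}},
           "formats":{"parquet":{"type":"parquet"}}}}'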
It may only work with a rudimentary insecure cluster, as Drill currently isn't tested / documented to integrate with HDFS + Kerberos for secure Hadoop clusters. Vote for and check back on this ticket for Drill secure HDFS support:
https://issues.apache.org/jira/browse/DRILL-3584

Deploying an ASP.NET web site to a remote VPS with Jenkins

I am just starting to get my head wrapped around continuous deployment with Jenkins, but I am running into some roadblocks, and I haven't really found many good, definitive resources on the topic with regard to ASP.NET applications.
I have set up a local build server that successfully pulls down code from an SVN repo and builds it OK with MSBuild. This works well so far, but now I'd like to automate pushing this compiled code to a development server.
My problem is this: from what I gather from my reading (which may be an incorrect assumption...), the staging server is typically within the same network as the build server, meaning you can share network resources, servers, etc.
In my case, I want to run the Jenkins server on a remote VPS, then deploy to other remote VPSes (so, essentially individual isolated machines communicating with each other).
I have seen a lot of terms, but I am very new in my Sys Admin / DevOps type skills.
So, my question is this:
Is it even possible, using Jenkins on a VPS, to deploy to any particular server I choose? (I have full access to all of them, so if it's a security thing, I can fix that... but they are not within the same network/domain.)
What is the method to achieve this? I've seen xcopy, Web Deployment Packages (msdeploy), batch scripts, etc. mentioned, but not much guidance on what to use in which situations. Are any of these methods useful to achieve my goal?
Thanks for any help or guidance!
How is your PowerShell? ;) You should check out psake.
psake is a build automation tool written in PowerShell. It avoids the angle-bracket tax associated with executable XML by leveraging the PowerShell syntax in your build scripts. psake has a syntax inspired by rake (aka make in Ruby) and bake (aka make in Boo), but is easier to script because it leverages your existing command-line knowledge. psake is pronounced sake - as in Japanese rice wine. It does NOT rhyme with make, bake, or rake.
You can deploy your files to the target server through SSH. Jenkins does support transfers through SSH. All you need to do is set up an SSH server (e.g. CopSSH) and a user account with admin permissions, and configure Jenkins to transfer over SSH:
Create host configurations in the main Jenkins configuration
Add an SSH server
Add the public key to the remote (target) server
Click "Test Configuration"
Save
Configure a job to Publish Over SSH (Post-build Action)
Add a Transfer Set
Refer to Publish Over SSH for more details
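Since the question also mentioned Web Deployment Packages: as an alternative to copying files over SSH, a minimal msdeploy invocation (run as a Jenkins build step on the build server) might look like the sketch below. The paths, site name, host, and credentials are placeholders; 8172 is Web Deploy's default management service port.

    REM push the compiled site to IIS on the remote VPS via Web Deploy
    "C:\Program Files\IIS\Microsoft Web Deploy V3\msdeploy.exe" -verb:sync ^
      -source:contentPath="C:\Jenkins\workspace\MySite\_PublishedWebsites\MySite" ^
      -dest:contentPath="Default Web Site/MySite",computerName=https://my-vps.example.com:8172/msdeploy.axd,userName=deploy,password=secret,authType=Basic ^
      -allowUntrusted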

Best way to install web applications (e.g. Jira) on Unixes?

Can you share some points on the best way, or best practice,
to install a web application on Unixes?
Things like:
where to place the app and its databases and so forth,
how to configure it to be secure and easy to back up,
etc.
For example, I know of one such suggestion - to set up a unique user for each app.
The app in question is Jira on FreeBSD, but more general suggestions are also welcome.
Here's what I did for my JIRA install on Fedora Linux:
Created a separate user to run JIRA (a command sketch of the first few steps follows this list)
Installed JIRA under the JIRA user's home directory
Made a soft link "/home/jira/jira" pointing to the JIRA installation directory (the directory as installed contains the version number, something like /home/jira/atlassian-jira-enterprise-4.0-standalone)
Created an /etc/init.d script to run JIRA as a service, and added it to chkconfig so that it runs at system startup - see these instructions
Created a MySQL database for JIRA on a separate data volume
Set up scheduled XML backups via the JIRA admin interface
Set up a remote backup script to dump the MySQL database and copy the DB dump and XML backups to a separate backup server
To avoid having to open extra firewall ports, set up an Apache virtual host "jira.myhost.com" and used mod_proxy to forward requests to the JIRA URL
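Roughly, the user, install, symlink, and database steps look like the following; the version number, paths, database name, and password are placeholders:

    # dedicated user; install under its home; stable symlink without the version number
    sudo useradd -m jira
    sudo tar xzf atlassian-jira-enterprise-4.0-standalone.tar.gz -C /home/jira
    sudo ln -s /home/jira/atlassian-jira-enterprise-4.0-standalone /home/jira/jira
    sudo chown -R jira:jira /home/jira
    # register the init.d script so JIRA starts at boot
    sudo chkconfig --add jira
    # separate MySQL database for JIRA (name and credentials are placeholders)
    mysql -u root -p -e "CREATE DATABASE jiradb CHARACTER SET utf8;
      GRANT ALL ON jiradb.* TO 'jira'@'localhost' IDENTIFIED BY 'secret';"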
I set everything up on a virtual machine (an Amazon EC2 instance in my case) and cloned the machine image so that I can easily start a new instance if the current one goes down.
