Cloudera - CCP Data Engineer Certification (DE575)

Is it possible to pass the CCP Data Engineer certification (DE575) by mastering only Hive, Sqoop, Flume, and Oozie? If not, what else would be needed?
FYI - I don't have a Cloudera license to post this question in the Cloudera community.
Thanks
Max

Related

Problem INS-35423 on installing Oracle 11g RAC (empty cluster nodes)

I am trying to install Oracle 11g RAC for training purposes on a CentOS 6.9 machine.
I have successfully installed the grid and clusterware services and have two nodes (rac01, rac02).
The following does not report any serious problem:
./cluvfy stage -pre dbinst -n rac01,rac02
In fact, the only problems reported are a missing pdksh package (which is not a real problem) and the fact that the pool of NTP servers used by the nodes returns different IP addresses for each node (to be expected, since the pool does not always return the same IP address).
Similarly, the following reports that the clusterware services are up and running:
[root@rac01 bin]# ./crsctl check cluster -all
**************************************************************
rac01:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
rac02:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
I am trying to install the database as the oracle user, but when the time comes to select a RAC installation, no nodes are reported.
Does anybody have any clue what other possible problems may exist and how/where to look?
I have no idea why the following worked (someone else may explain it), but I re-ran the grid installer from each of the nodes as follows:
[oracle@rac01] rac01$ /u01/app/11.2.0/grid/oui/bin/runInstaller -ignoreSysPrereqs -updateNodeList ORACLE_HOME=/u01/app/11.2.0/grid "CLUSTER_NODES={rac01,rac02}" CRS=true LOCAL_NODE=rac01
[oracle@rac02] rac02$ /u01/app/11.2.0/grid/oui/bin/runInstaller -ignoreSysPrereqs -updateNodeList ORACLE_HOME=/u01/app/11.2.0/grid "CLUSTER_NODES={rac01,rac02}" CRS=true LOCAL_NODE=rac02
Afterwards I re-ran the DB installer and the RAC nodes appeared in the list.
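My guess (only a guess) is that -updateNodeList repaired the node list recorded for the grid home in the OUI central inventory, which the database installer reads to detect cluster members. A quick way to check, assuming the default oraInventory location (adjust the path if yours differs):
# Inspect the central inventory the installer reads; after the fix, the
# grid home entry should list both nodes.
grep -A3 'NODE_LIST' /u01/app/oraInventory/ContentsXML/inventory.xml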

Attempting to encrypt existing PVs in EKS

I'm attempting to encrypt our existing PVs in EKS. I've been searching the net and haven't come up with a solid solution. Our K8s version is 1.11 on EKS. Our PVs are EBS volumes. We currently have account-level EBS encryption enabled, but didn't when these resources were created.
I stopped the ASG and the node, created a snapshot and a new encrypted volume, then ran:
kubectl patch pv pvc-xxxxxxxx-xxxxx-xxxxxxxx -p '{"spec":{"awsElasticBlockStore":{"volumeID":"aws://us-east-1b/vol-xxxxx"}}}'
but was met with:
The PersistentVolume is invalid: spec.persistentvolumesource: Forbidden: is immutable after creation
I'm looking for a solution to this issue, or validation that it is in fact not possible. Potentially relevant GitHub issue:
https://github.com/kubernetes/kubernetes/issues/59642
Thanks in advance
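One commonly suggested workaround, sketched here under the assumption that the data has already been copied to the new encrypted volume (PV names and volume IDs are placeholders): since the volume source is immutable, recreate the PV object instead of patching it.
# Make sure deleting the PV object won't delete the backing EBS volume.
kubectl patch pv pvc-xxxxxxxx -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
# Save the current PV definition, then delete only the PV object.
kubectl get pv pvc-xxxxxxxx -o yaml > pv-backup.yaml
kubectl delete pv pvc-xxxxxxxx
# Edit pv-backup.yaml: point spec.awsElasticBlockStore.volumeID at the new
# encrypted volume and strip status, metadata.uid, metadata.resourceVersion,
# and spec.claimRef.uid so the existing PVC can re-bind.
kubectl create -f pv-backup.yaml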

In which version of Corda (open source, Enterprise) is SGX technology implemented?

We want to ascertain whether, in Corda, transaction data is protected from manipulation and all transactions are secured.
On reading the "Multilateral Ledger" topic at https://www.corda.net/discover/technology.html, we learned that Corda uses SGX technology to provide full encryption of transactions.
Could you tell us in which version of Corda (open source or Enterprise) it is implemented?
The article at the following link states that SGX is a feature of Corda Enterprise:
https://docs.corda.net/design/sgx-infrastructure/design.html

Corda Notary Cluster in Open Source

Is it possible to configure/use a notary cluster in the open-source version? Or is it available only in R3 Corda (Enterprise)?
There is a notary demo which may help you out here:
https://github.com/corda/corda/tree/master/samples/notary-demo
It will just help you get your head around it.
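For orientation only, here is a hedged sketch of what a clustered (Raft) notary member's node.conf looked like in open-source Corda around v3.x, based on the notary demo; key names and availability vary by version, so treat it as an illustration rather than a reference:
notary {
    // Whether this notary validates transactions or only checks for double-spends.
    validating = true
    raft {
        // This member's address within the Raft cluster (placeholder values).
        nodeAddress = "localhost:10008"
        // Addresses of all cluster members, including this node.
        clusterAddresses = ["localhost:10008", "localhost:10012"]
    }
}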

Using snow (and snowfall) with AWS for parallel processing in R

In relation to my earlier, similar SO question, I tried using snow/snowfall on AWS for parallel computing.
What I did was:
In the sfInit() function, I provided the public DNS to the socketHosts parameter, like so:
sfInit(parallel=TRUE, socketHosts=list("ec2-00-00-00-000.compute-1.amazonaws.com"))
The error returned was Permission denied (publickey).
I then followed the instructions (I presume correctly!) on http://www.imbi.uni-freiburg.de/parallel/ in the 'Passwordless Secure Shell (SSH) login' section.
I just cat'ed the contents of the .pem file that I created on AWS into ~/.ssh/authorized_keys on the AWS instance I want to connect to from my master AWS instance, and on the master AWS instance as well.
Is there anything I am missing?
I would be very grateful if users can share their experiences in the use of snow on AWS.
Thank you very much for your suggestions.
UPDATE:
I just wanted to share the solution I found to my specific problem:
I used StarCluster to set up my AWS cluster.
Installed the snowfall package on all the nodes of the cluster.
From the master node, I issued the following commands:
hostslist <- list("ec2-xxx-xx-xxx-xxx.compute-1.amazonaws.com","ec2-xx-xx-xxx-xxx.compute-1.amazonaws.com")
sfInit(parallel=TRUE, cpus=2, type="SOCK",socketHosts=hostslist)
l <- sfLapply(1:2,function(x)system("ifconfig",intern=T))
lapply(l,function(x)x[2])
sfStop()
The IP information confirmed that the AWS nodes were being utilized.
This doesn't look too bad, but the .pem file handling is wrong. It is sometimes not that simple, and many people have to fight with these issues. You can find a lot of tips in this post:
https://forums.aws.amazon.com/message.jspa?messageID=241341
Or check Google for other posts.
From my experience, most people have problems with these steps:
Can you log onto the machines via SSH? (ssh ec2-00-00-00-000.compute-1.amazonaws.com). Try to use the public DNS, not the public IP, to connect (see the key-setup sketch after this list).
Check your "Security groups" in AWS to make sure port 22 is open for all machines!
If you plan to start more than 10 worker machines, you should work on an MPI installation on your machines (much better performance!)
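As a hedged sketch of the passwordless-SSH setup described above (hostnames and key file names are placeholders, and note that authorized_keys needs the public key, not the contents of the .pem private key):
# On the master: generate a key pair with no passphrase, if none exists yet.
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# Push the PUBLIC key to each worker, authenticating with the AWS key pair.
cat ~/.ssh/id_rsa.pub | ssh -i mykey.pem ec2-00-00-00-000.compute-1.amazonaws.com 'cat >> ~/.ssh/authorized_keys'
# Also allow the master to SSH to itself, since it can appear in socketHosts.
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# Verify a non-interactive login works before calling sfInit().
ssh ec2-00-00-00-000.compute-1.amazonaws.com hostname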
Markus from cloudnumbers.com :-)
I believe @Anatoliy is correct: you're using an X.509 certificate. For the precise steps to take to add the SSH keys, look at the "Types of credentials" section of the EC2 Starters Guide.
To upload your own SSH keys, take a look at this page from Alestic.
It is a little confusing at first, but you'll want to keep clear which are your access keys, your certificates, and your key pairs, which may appear in text files with DSA or RSA headers.
