I have installed Spark 2.3.0 on Ubuntu 18.04 with two nodes: a master (IP: 172.16.10.20) and a slave (IP: 172.16.10.30). I can check that this Spark cluster appears to be up and running:
jps -lm | grep spark
14165 org.apache.spark.deploy.master.Master --host 172.16.10.20 --port 7077 --webui-port 8080
13701 org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://172.16.10.20:7077
I give it a try with this simple R script (using the sparklyr package):
library(sparklyr)
library(dplyr)
# Set your SPARK_HOME path
Sys.setenv(SPARK_HOME="/home/master/spark/spark-2.3.0-bin-hadoop2.7/")
config <- spark_config()
# Optionally you can modify config parameters here
sc <- spark_connect(master = "spark://172.16.10.20:7077", spark_home = Sys.getenv("SPARK_HOME"), config = config)
# Some test code, copying data to Spark cluster
iris_tbl <- copy_to(sc, iris)
src_tbls(sc)
spark_apply(iris_tbl, function(data) {
  return(head(data))
})
All commands execute fine and smoothly (though a bit slow for my taste), and the Spark log is kept in a temp file. When looking into the log file I see no mention of the slave node, which makes me wonder whether this Spark setup is really running in cluster mode.
How may I check that the master-slave relation is really working?
In your case, please check the URL 172.16.10.20:8080 and open the Executors tab to see the number of executors running.
Here are the relevant URLs:
http://[driverHostname]:4040 by default (the application/driver UI)
http://<master-ip>:8080 (the master web UI port)
For additional information, see the documentation on how to monitor and inspect Spark job executions, and this Stack Overflow question on command-based status checks.
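If you prefer to check from R rather than from the web UI, here is a minimal sketch of my own (not part of the original answer) that asks the driver how many block managers are registered; the count includes the driver itself, so any value above 1 means remote executors are attached:
library(sparklyr)
library(dplyr)
# 'sc' is the connection created in the question above
spark_context(sc) %>%
  invoke("getExecutorMemoryStatus") %>%  # Scala Map: one entry per executor plus the driver
  invoke("size")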
I have been struggling to get a Multi Node H2O cluster up and running using AWS EC2 instances.
I have followed the advice from this thread, but still struggle with the nodes not seeing each other. The EC2 instances all use the same AMI that I have pre-built, so the same h2o.jar file is on all of them.
I have also tried the following troubleshooting advice:
Name the cluster with -name
Use the -network flag instead
Open port 54321 on the security group to 0.0.0.0/0
Here are my steps:
1) Start the AWS EC2 instances in the same availability zone and get their private IPs and the network CIDR (172.31.0.0/20). Put the IP addresses into flatfile.txt:
172.31.8.210:54321
172.31.9.207:54321
172.31.13.136:54321
2) Copy flatfile.txt to all servers that I want to join as nodes and start H2O:
# cluster_run
library(h2oEnsemble)
library(ssh)

ips <- gsub("(.*):.*", "\\1", readLines("flatfile.txt"))

start_cluster <- function(ip){
  # Copy flatfile across
  session <- ssh_connect(paste0("ubuntu@", ip), keyfile = "mykey.pem")
  scp_upload(session, "flatfile.txt")
  # Ensure no h2o instance is already running
  out <- ssh_exec_wait(session, "sudo pkill java")
  # Start H2O on this node
  cmd <- gsub("\\s+", " ", paste0("ssh -i mykey.pem -o 'StrictHostKeyChecking no' ubuntu@", ip,
               " 'java -Xmx20g
               -jar /home/rstudio/R/x86_64-pc-linux-gnu-library/3.5/h2o/java/h2o.jar
               -name mycluster
               -network 172.31.0.0/20
               -flatfile flatfile.txt
               -port 54321 &'"))
  system(cmd, wait = FALSE)
}
start_cluster(ips[3])
start_cluster(ips[2])
start_cluster(ips[1])
3) Once this has been done, I now want to connect R to my new Multi Node cluster
h2o.init(startH2O = F)
h2o.shutdown(prompt = FALSE)
This is where I see that the nodes aren't being picked up.
I have also seen that, when I start the H2O cluster on the different nodes, it isn't picking up the other machines within the network.
You need to add port 54321+1 (so 54322) to the security group, as well.
The internal communication goes through 54322.
(I would also specify /16 for -network because it’s easier for other people to understand. For example, even if you are sure /20 is technically correct for your network setup, I can’t easily be sure. :-)
Depending on the actual network setup, you probably don’t need -network flag at all. Your instances probably only have one interface.
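As a quick sanity check, here is a sketch of my own (assuming R runs on a machine that can reach the private IPs, e.g. one of the instances) to confirm that the flatfile nodes actually formed one cluster:
library(h2o)
# Attach to an already-running node rather than starting a local H2O
h2o.init(ip = "172.31.8.210", port = 54321, startH2O = FALSE)
h2o.clusterInfo()    # should report 3 total nodes once 54322 is open
h2o.clusterStatus()  # one row per node with its IP and health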
I want to serve an R model using App Engine.
I am following this example, R with App Engine, but am stuck. I tried several methods but still have issues. Any guidance on this?
Please refer to my code
app.yaml
runtime: custom
env: flex
Dockerfile
FROM gcr.io/gcer-public/plumber-appengine
LABEL maintainer="mark"
RUN R -e "install.packages(c('plumber'), repos='http://cran.rstudio.com/')"
WORKDIR /payload/
COPY [".", "./"]
EXPOSE 8080
ENTRYPOINT ["R", "-e", "pr <- plumber::plumb(commandArgs()[4]); pr$run(host='0.0.0.0', port=8080)"]
CMD ["schedule.R"]
schedule.R
#* @get /demoR
get_predict_length <- function(){
  dataset <- iris
  # create the model
  model <- lm(Petal.Length ~ Petal.Width, data = dataset)
  petal_width <- "0.4"
  # convert the input to a number
  petal_width <- as.numeric(petal_width)
  # create the prediction data frame
  prediction_data <- data.frame(Petal.Width = petal_width)
  # create the prediction
  predict(model, prediction_data)
}
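To verify the endpoint before deploying, a minimal local test sketch (my own addition, mirroring the Dockerfile ENTRYPOINT and run from the directory containing schedule.R):
library(plumber)
pr <- plumb("schedule.R")              # parse the #* @get annotation
pr$run(host = "0.0.0.0", port = 8080)  # then: curl http://localhost:8080/demoR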
I deploy using gcloud app deploy and it's successful. I get the link https://iris-custom-dot-my-project-name.appspot.com/.
The final output in the Stackdriver logs shows:
Starting server to listen on port 8080
When I open the App Engine version at https://iris-custom-dot-my-project-name.appspot.com/, I get the message below:
This site can’t be reached
Office network issue
In my case the real issue was that my office network blocked port 8080, so when I connected from home or in the office over a mobile hotspot, it worked.
In general, follow the steps below to resolve the issue:
1) Search Google for "my public IP address"
2) Add your IP to the App Engine firewall rules
Either use the gcloud command:
https://cloud.google.com/sdk/gcloud/reference/app/firewall-rules/create
or the GCP UI (you can use any priority number that has not been used already).
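A hedged example of the gcloud route (the priority value 1000, the IP address, and the description are placeholders you would replace with your own), wrapped in system() so it can be run from R:
# Allow your public IP through the App Engine firewall
system('gcloud app firewall-rules create 1000 --action=allow --source-range="203.0.113.25" --description="office IP"')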
Is it possible to connect sparklyr to a remote Hadoop cluster, or is it only possible to use it locally?
And if it is possible, how? :)
In my opinion, the connection from R to Hadoop via Spark is very important!
Do you mean a Hadoop or a Spark cluster? If Spark, you can try to connect through Livy; details here:
https://github.com/rstudio/sparklyr#connecting-through-livy
Note: Connecting to Spark clusters through Livy is under experimental development in sparklyr
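A minimal sketch based on the sparklyr documentation linked above; the host name is a placeholder for your cluster and 8998 is Livy's default port:
library(sparklyr)
sc <- spark_connect(master = "http://<livy-server>:8998",
                    method = "livy",
                    config = livy_config())
# use the connection as usual, then disconnect
spark_disconnect(sc)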
You could use Livy, which is a REST API service for the Spark cluster.
Once you have set up your HDInsight cluster on Azure, check for the Livy service using curl:
#curl test
curl -k --user "admin:mypassword1!" -v -X GET
#r-studio code
sc <- spark_connect(master = "https://<yourclustername>.azurehdinsight.net/livy/",
method = "livy", config = livy_config(
username = "admin",
password = rstudioapi::askForPassword("Livy password:")))
A useful URL:
https://learn.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-livy-rest-interface
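Once connected, a couple of quick sanity checks (a sketch of my own using standard sparklyr calls) confirm the Livy session is talking to the remote cluster:
spark_version(sc)              # version of the remote Spark cluster
iris_tbl <- copy_to(sc, iris)  # small round trip through Livy
head(iris_tbl)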
I have tried many options, both on Mac and on Ubuntu.
I read the Rserve documentation
http://rforge.net/Rserve/doc.html
and that for the Rserve and RSclient packages:
http://cran.r-project.org/web/packages/RSclient/RSclient.pdf
http://cran.r-project.org/web/packages/Rserve/Rserve.pdf
I cannot figure out what is the correct workflow for opening/closing a connection within Rserve and for shutting down Rserve 'gracefully'.
For example, on Ubuntu, I installed R from source with ./configure --enable-R-shlib (following the Rserve documentation) and also added the 'control enable' line in /etc/Rserve.conf.
In an Ubuntu terminal:
library(Rserve)
library(RSclient)
Rserve()
c<-RS.connect()
c ## this is an Rserve QAP1 connection
## Trying to shutdown the server
RSshutdown(c)
Error in writeBin(as.integer....): invalid connection
RS.server.shutdown(c)
Error in RS.server.shutdown(c): command failed with status code 0x4e: no control line present (control commands disabled or server shutdown)
I can, however, CLOSE the connection:
RS.close(c)
>NULL
c ## Closed Rserve connection
After closing the connection, I also tried the options (also tried with argument 'c', even though the connection is closed):
RS.server.shutdown()
RSshutdown()
So, my questions are:
1- How can I close Rserve gracefully?
2- Can Rserve be used without RSclient?
I also looked at
How to Shutdown Rserve(), running in DEBUG
but the question refers to the debug mode and is also unresolved. (I don't have enough reputation to comment/ask whether the shutdown works in the non-debug mode).
Also looked at:
how to connect to Rserve with an R client
Thanks so much!
Load Rserve and RSclient packages, then connect to the instances.
> library(Rserve)
> library(RSclient)
> Rserve(port = 6311, debug = FALSE)
> Rserve(port = 6312, debug = TRUE)
Starting Rserve...
"C:\..\Rserve.exe" --RS-port 6311
Starting Rserve...
"C:\..\Rserve_d.exe" --RS-port 6312
> rsc <- RSconnect(port = 6311)
> rscd <- RSconnect(port = 6312)
Looks like they're running...
> system('tasklist /FI "IMAGENAME eq Rserve.exe"')
> system('tasklist /FI "IMAGENAME eq Rserve_d.exe"')
Image Name PID Session Name Session# Mem Usage
========================= ======== ================ =========== ============
Rserve.exe 8600 Console 1 39,312 K
Rserve_d.exe 12652 Console 1 39,324 K
Let's shut 'em down.
> RSshutdown(rsc)
> RSshutdown(rscd)
And they're gone...
> system('tasklist /FI "IMAGENAME eq Rserve.exe"')
> system('tasklist /FI "IMAGENAME eq Rserve_d.exe"')
INFO: No tasks are running which match the specified criteria.
Rserve can be used w/o RSclient by starting it with args and/or a config script. Then you can connect to it from some other program (like Tableau) or with your own code. RSclient provides a way to pass commands/data to Rserve from an instance of R.
Hope this helps :)
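To address the 'graceful' shutdown from the question directly, here is a minimal sketch of my own (untested on your setup): the key point is that control commands must be enabled when Rserve starts, for example via --RS-enable-control, otherwise RS.server.shutdown() fails with the 0x4e error seen above.
library(Rserve)
library(RSclient)
# Start Rserve with control commands enabled
Rserve(port = 6311, args = "--RS-enable-control")
conn <- RS.connect(port = 6311)
RS.server.shutdown(conn)  # ask the server to shut down gracefully
RS.close(conn)            # then close the client-side connection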
On a Windows system, if you want to close an RServe instance, you can use the system function in R to close it down.
For example in R:
library(Rserve)
Rserve() # run without any arguments or ports specified
system('tasklist /FI "IMAGENAME eq Rserve.exe"') # run this to see RServe instances and their PIDs
system('TASKKILL /PID {yourPID} /F') # run this to kill off the RServe instance with your selected PID
If you have closed your RServe instance with that PID correctly, the following message will appear:
SUCCESS: The process with PID xxxx has been terminated.
You can check the RServe instance has been closed down by entering
system('tasklist /FI "IMAGENAME eq Rserve.exe"')
again. If there are no RServe instances running any more, you will get the message
INFO: No tasks are running which match the specified criteria.
More help and info on this topic can be seen in this related question.
Note that the 'RSClient' approach mentioned in an earlier answer is tidier and easier than this one, but I put it forward anyway for those who start RServe without knowing how to stop it.
If you are not able to shut it down within R, run the commands below in a terminal to kill it. These commands work on Mac.
$ ps ax | grep Rserve # get active Rserve sessions
You will see output like the below; 29155 is the process id of the active Rserve session.
29155 /Users/userid/Library/R/3.5/library/Rserve/libs/Rserve
38562 0:00.00 grep Rserve
Then run
$ kill 29155
I'm trying to run an analysis in parallel in R on an AWS EC2 cluster. I am using starcluster to set up and manage the EC2 cluster, and am trying to use snow and foreach in R. To start off, I have 2 nodes in the cluster: 1 master and 1 worker.
starcluster start mycluster
starcluster listinstances
-----------------------------------------
mycluster (security group: #sc-mycluster)
-----------------------------------------
....
Cluster nodes:
master running i-xxxxxxxxx masterIP.compute-1.amazonaws.com
node001 running i-xxxxxxxxx node001IP.compute-1.amazonaws.com
Total nodes: 2
starcluster sshmaster mycluster
I then start R, load the snow package, and try to create a cluster object.
R
library("snow")
cl = makeCluster(c("masterIP.compute-1.amazonaws.com", "node001IP.compute-1.amazonaws.com"), type = "SOCK")
This, however, gives me the following error message:
The authenticity of host 'masterIP.compute-1.amazonaws.com (xx.xxx.xx.xx)' can't be established.
ECDSA key fingerprint is xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'masterIP.compute-1.amazonaws.com,xx.xxx.xx.xx' (ECDSA) to the list of known hosts.
Permission denied (publickey).
So I tried copying my ssh key (keyname.rsa to be specific) to the .ssh directory on EC2 and trying again. That still didn't work; I received the same Permission denied (publickey) error. It was my thought that starcluster handled the setup of ssh and communication between nodes, so I'm a little confused as to why I'm not able to set this up. I also tried to just add node001, so cl = makeCluster(c("node001IP.compute-1.amazonaws.com"), type = "SOCK"), but the same error occurs.
It turns out, after much tinkering, that all that was needed was an update to R version 2.15. The command cl = makeCluster(c("masterIP.compute-1.amazonaws.com", "node001IP.compute-1.amazonaws.com"), type = "SOCK") worked perfectly after that.
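For completeness, here is a hedged sketch of how the cluster object is then used with foreach once makeCluster() succeeds (doSNOW is one common adapter; the hostnames are the same placeholders as above):
library(snow)
library(doSNOW)
library(foreach)

cl <- makeCluster(c("masterIP.compute-1.amazonaws.com",
                    "node001IP.compute-1.amazonaws.com"), type = "SOCK")
registerDoSNOW(cl)

# each iteration runs on one of the cluster workers
res <- foreach(i = 1:100, .combine = c) %dopar% sqrt(i)

stopCluster(cl)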