R - Error in Rmpi with snow

I'm trying to set up an MPI cluster across 3 different computers on a local area network with the following R code:
library(plyr)
library(class)
library(snow)
cl <- makeCluster(spec=c("localhost","ip1","ip2"),master="ip3")
but I'm getting an error:
Error in mpi.comm.spawn(slave = mpitask, slavearg = args, nslaves = count, :
Calloc could not allocate memory (18446744071562067968 of 4 bytes)
Warning messages:
1: In if (nslaves <= 0) stop("Choose a positive number of slaves.") : [...]
2: In mpi.comm.spawn(slave = mpitask, slavearg = args, nslaves = count, :
NA produced by coercition
What is causing this error? I couldn't find anything relevant on this subject.

When calling makeCluster to create an MPI cluster, the spec argument should either be a number or missing, depending on whether you want the workers to be spawned or not. You can't specify the hostnames, as you would when creating a SOCK cluster. And in order to start workers on other machines with an MPI cluster, you have to execute your R script using a command such as mpirun, mpiexec, etc., depending on your MPI installation, and you specify the hosts to use via arguments to mpirun, not to makeCluster.
In your case, you might execute your script with:
$ mpirun -n 1 -H ip3,localhost,ip1,ip2 R --slave -f script.R
Since -n 1 is used, your script executes only on "ip3", not all four hosts, but MPI knows about the other three hosts, and will be able to spawn processes to them.
You would create the MPI cluster in that script with:
cl <- makeCluster(3)
This should cause a worker to be spawned on "localhost", "ip1", and "ip2", with the master process running on "ip3" (at least with Open MPI: I'm not sure about other MPI distributions). I don't believe the "master" option is used with the MPI transport: it's primarily used by the SOCK transport.
You can get lots of information about mpirun from its man page.
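As a rough sketch of the launch described above, the following shell snippet builds the mpirun command line from a host list and derives the worker count to pass to makeCluster. The host names are the placeholders from the question, script.R is the script name used above, and the command is only printed rather than executed, since running it assumes a working Open MPI installation and R on every host.

```shell
#!/bin/sh
# Master first, then the machines that should receive workers.
# These are the placeholder names from the question, not real addresses.
master="ip3"
workers="localhost ip1 ip2"

# Join master and workers into the comma-separated list that -H expects,
# counting the workers along the way.
hosts="$master"
n_workers=0
for h in $workers; do
    hosts="$hosts,$h"
    n_workers=$((n_workers + 1))
done

# -n 1 starts a single R process; snow spawns the workers from inside R.
cmd="mpirun -n 1 -H $hosts R --slave -f script.R"

echo "$cmd"
echo "In script.R: cl <- makeCluster($n_workers)"
```

Deriving the worker count from the same list that feeds -H keeps the number passed to makeCluster in step with the hosts MPI knows about.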

You can also run the code on the cluster nodes as follows:
Create a file named nodelist and write the machine names in it, one per line.
Then run the following command in a terminal, replacing the placeholders in angle brackets:
mpirun -np <number of processes> -machinefile <path to your nodelist file> Rscript <filename.R>
By default, the first node is taken as the master, and the processes are spawned on the remaining nodes, and on the master itself, as slaves.
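A minimal sketch of the machine-file approach, assuming three hypothetical hosts named node1 to node3 and one process per host. The file is written to a temporary directory, and the mpirun command is printed rather than run, because it depends on an MPI installation that may not be present.

```shell
#!/bin/sh
# Write a machine file with one host per line.
# node1..node3 are hypothetical names; replace them with real machines.
workdir=$(mktemp -d)
nodelist="$workdir/nodelist"
printf '%s\n' node1 node2 node3 > "$nodelist"

# Assume one process per listed machine: count the non-empty lines.
np=$(grep -c . "$nodelist")

# Print the command instead of running it.
echo "mpirun -np $np -machinefile $nodelist Rscript filename.R"
```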

Related

Vtune throws aps error while running MPI samples

I am trying to profile my MPI application using Intel VTune with I_MPI_DEBUG=5. I tried the following two things, and I get the error shown after them:
export I_MPI_DEBUG=5
mpirun -np 4 aps ~/binary/vasp_std_2022
mpirun -genv I_MPI_DEBUG=5 -np 4 aps ~/binary/vasp_std_2022
vtune: Warning: Memory bandwidth collection is not supported inside a virtual machine since uncore events cannot be collected. For full functionality, consider using a bare-metal environment.
vtune: Warning: CPU frequency data collection is not supported on this platform.
vtune: Error: amplxe-perf:
Using CPUID GenuineIntel-6-6A-6
both cgroup and no-aggregation modes only available in system-wide mode
Usage: perf stat [] []
-G, --cgroup monitor event in cgroup name only
-A, --no-aggr disable CPU count aggregation
-a, --all-cpus system-wide collection from all CPUs
--for-each-cgroup expand events for each cgroup
vtune: Error: Preliminary validation of the requested events failed.
aps Error: Cannot run the collection.
aps Error: Cannot process configs directory.
aps Error: Cannot process configs directory.
aps Error: Cannot process configs directory.
Your command uses aps for profiling.
When you use the aps command, you need to pass some parameters, such as the collection mode: --collection-mode=<mode>
This parameter specifies a comma-separated list of data to collect. Possible values:
hwc : hardware counters
omp : OpenMP statistics
mpi : MPI statistics
all : all possible data (default)
Try the command as below:
mpirun -genv I_MPI_DEBUG=5 -np 4 aps --collection-mode=omp ./obj

MPI: Pin each instance to certain cores on each node

I want to execute several instances of my program with OpenMPI 2.11. Each instance runs on its own node (-N 1) on my cluster. This works fine. I now want to pin each program-instance to the first 2 cores of its node. To do that, it looks like I need to use rankfiles. Here is my rankfile:
rank 0=+n0 slot=0-1
rank 1=+n1 slot=0-1
This, in my opinion, should limit each program-instance to cores 0 and 1 of the local machine it runs on.
I execute mpirun like so:
mpirun -np 2 -N 1 -rf /my/rank/file my_program
But mpirun fails with this error without even executing my program:
Conflicting directives for mapping policy are causing the policy
to be redefined:
New policy: RANK_FILE
Prior policy: UNKNOWN
Please check that only one policy is defined.
What's this? Did I make a mistake in the rankfile?
Instead of using a rankfile, simply use a hostfile:
n0 slots=n max_slots=n
n1 slots=n max_slots=n
Then tell Open MPI to map one process per node with two cores per process using:
mpiexec --hostfile hostfile --map-by ppr:1:node:PE=2 --bind-to core ...
ppr:1:node:PE=2 reads as: 1 process per resource; resource type is node; 2 processing elements per process. You can check the actual binding by adding the --report-bindings option.
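The hostfile and mapping options above could be put together roughly as follows. The node names n0 and n1 come from the question; the core count per node is a hypothetical value standing in for the n placeholder in the hostfile lines, and my_program is the program name from the question. The mpiexec command is only printed, since running it requires the actual cluster.

```shell
#!/bin/sh
# Assumed values: two nodes named n0 and n1 (from the question) and a
# hypothetical core count per node; adjust both to the real hardware.
nodes="n0 n1"
cores_per_node=8
workdir=$(mktemp -d)
hostfile="$workdir/hostfile"

# One line per node, advertising the same number of slots and max_slots.
: > "$hostfile"
for n in $nodes; do
    echo "$n slots=$cores_per_node max_slots=$cores_per_node" >> "$hostfile"
done

cat "$hostfile"

# One process per node, two cores per process, bound to cores;
# --report-bindings shows the resulting binding at startup.
echo "mpiexec --hostfile $hostfile --map-by ppr:1:node:PE=2 --bind-to core --report-bindings my_program"
```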

How to setup workers for parallel processing in R using snowfall and multiple Windows nodes?

I’ve successfully used snowfall to set up a cluster on a single server with 16 processors.
require(snowfall)
if (sfIsRunning() == TRUE) sfStop()
number.of.cpus <- 15
sfInit(parallel = TRUE, cpus = number.of.cpus)
stopifnot( sfCpus() == number.of.cpus )
stopifnot( sfParallel() == TRUE )
# Print the hostname for each cluster member
sayhello <- function()
{
info <- Sys.info()[c("nodename", "machine")]
paste("Hello from", info[1], "with CPU type", info[2])
}
names <- sfClusterCall(sayhello)
print(unlist(names))
Now, I am looking for complete instructions on how to move to a distributed model. I have 4 different Windows machines with a total of 16 cores that I would like to use for a 16-node cluster. So far, I understand that I could manually set up a SOCK connection or leverage MPI. While it appears possible, I haven’t found clear and complete directions on how to do it.
The SOCK route appears to depend on code in a snowlib script. I can generate a stub from the master side with the following code:
winOptions <-
list(host="172.01.01.03",
rscript="C:/Program Files/R/R-2.7.1/bin/Rscript.exe",
snowlib="C:/Rlibs")
cl <- makeCluster(c(rep(list(winOptions), 2)), type = "SOCK", manual = T)
It yields the following:
Manually start worker on 172.01.01.03 with
"C:/Program Files/R/R-2.7.1/bin/Rscript.exe"
C:/Rlibs/snow/RSOCKnode.R
MASTER=Worker02 PORT=11204 OUT=/dev/null SNOWLIB=C:/Rlibs
It feels like a reasonable start. I found code for RSOCKnode.R on GitHub under the snow package:
local({
master <- "localhost"
port <- ""
snowlib <- Sys.getenv("R_SNOW_LIB")
outfile <- Sys.getenv("R_SNOW_OUTFILE") ##**** defaults to ""; document
args <- commandArgs()
pos <- match("--args", args)
args <- args[-(1 : pos)]
for (a in args) {
pos <- regexpr("=", a)
name <- substr(a, 1, pos - 1)
value <- substr(a,pos + 1, nchar(a))
switch(name,
MASTER = master <- value,
PORT = port <- value,
SNOWLIB = snowlib <- value,
OUT = outfile <- value)
}
if (! (snowlib %in% .libPaths()))
.libPaths(c(snowlib, .libPaths()))
library(methods) ## because Rscript as of R 2.7.0 doesn't load methods
library(snow)
if (port == "") port <- getClusterOption("port")
sinkWorkerOutput(outfile)
cat("starting worker for", paste(master, port, sep = ":"), "\n")
slaveLoop(makeSOCKmaster(master, port))
})
It’s not clear how to actually start a SOCK listener on the workers, unless it is buried in snow::recvData.
Looking into the MPI route, as far as I can tell, Microsoft MPI version 7 is a starting point. However, I could not find a Windows alternative for sfCluster. I was able to start the MPI service, but it does not appear to listen on port 22 and no amount of bashing against it with snowfall::makeCluster has yielded a result. I’ve disabled the firewall and tried testing with makeCluster and directly connecting to the worker from the master with PuTTY.
Is there a comprehensive, step-by-step guide to setting up a snowfall cluster on Windows workers that I’ve missed? I am fond of snowfall::sfClusterApplyLB and would like to continue using that, but if there is an easier solution, I’d be willing to change course. Looking into Rmpi and parallel, I found alternative solutions for the master side of the work, but still little to no specific detail on how to setup workers running Windows.
Due to the nature of the work environment, neither moving to AWS, nor Linux is an option.
Related questions without definitive answers for Windows worker nodes:
How to set up cluster slave nodes (on Windows)
Parallel R on a Windows cluster
Create a cluster of co-workers' Windows 7 PCs for parallel processing in R?
Several options for the HPC infrastructure were considered: MPICH, Open MPI, and MS MPI. I initially tried MPICH2 but gave up, because the latest stable release for Windows, 1.4.1, dates back to 2013 and has not been supported since. Open MPI is not supported on Windows. That leaves MS MPI as the only option.
Unfortunately, snowfall does not support MS MPI, so I decided to go with the pbdMPI package, which supports MS MPI by default. pbdMPI implements the SPMD paradigm, in contrast with Rmpi, which uses manager/worker parallelism.
MS MPI installation, configuration, and execution
Install MS MPI v.10.1.2 on all machines in the to-be Windows HPC cluster.
Create a directory accessible to all nodes, where the R scripts and resources will reside, for example, \\HeadMachine\SharedDir.
Check that the MS MPI Launch Service (MsMpiLaunchSvc) is running on all nodes.
Check that MS MPI has the rights to run the R application on all nodes on behalf of the same user, i.e. SharedUser. The user name and the password must be the same on all machines.
Check that R is launched on behalf of the SharedUser user.
Finally, execute mpiexec with the options described below:
mpiexec.exe -n %1 -machinefile "C:\MachineFileDir\hosts.txt" -pwd SharedUserPassword -wdir "\\HeadMachine\SharedDir" Rscript hello.R
where
-wdir is a network path to the directory with shared resources.
-pwd is the password of the SharedUser user, for example, SharedUserPassword.
-machinefile is the path to the hosts.txt text file, for example C:\MachineFileDir\hosts.txt. The hosts.txt file must be readable from the head node at the specified path, and it contains a list of IP addresses of the nodes on which the R script is to be run.
As a result of this command, MPI will log in as SharedUser with the password SharedUserPassword and execute copies of the R processes on each computer listed in the hosts.txt file.
Details
hello.R:
library(pbdMPI, quietly = TRUE)
init()
cat("Hello World from process", comm.rank(), "of", comm.size(), "!\n")
finalize()
hosts.txt
The hosts.txt file (the MPI machine file) is a text file whose lines contain the network names of the computers on which the R scripts will be launched. On each line, the computer name is followed, separated by a space (for MS MPI), by the number of MPI processes to launch on that machine. Usually, this equals the number of processors in the node.
Sample of hosts.txt with three nodes having 2 processors each:
192.168.0.1 2
192.168.0.2 2
192.168.0.3 2

R Running foreach dopar loop on HPC MPIcluster

I got access to an HPC cluster with a MPI partition.
My problem is that -no matter what I try- my code (which works fine on my PC) doesn't run on the HPC cluster. The code looks like this:
library(tm)
library(qdap)
library(snow)
library(doSNOW)
library(foreach)
cl <- makeCluster(30, type="MPI")
registerDoSNOW(cl)
np <- getDoParWorkers()
np
Base = "./Files1a/"
files = list.files(path=Base, pattern="\\.txt");

for(i in 1:length(files)){
  ...some definitions and variable generation...
  text<-foreach(k = 1:10, .combine='c') %do%{
    text= if (file.exists(paste("./Files", k, "a/", files[i], sep=""))) paste(tolower(readLines(paste("./Files", k, "a/", files[i], sep=""))) , collapse=" ") else ""
  }

  docs <- Corpus(VectorSource(text))

  for (k in 1:10){
    ID[k] <- paste(files[i], k, sep="_")
  }
  data <- as.data.frame(docs)
  data[["docs"]]=ID
  rm(docs)
  data <- sentSplit(data, "text")

  frequency=NULL
  cs <- ceiling(length(POLKEY$x) / getDoParWorkers())
  opt <- list(chunkSize=cs)
  frequency<-foreach(j = 2: length(POLKEY$x), .options.mpi=opt, .combine='cbind') %dopar% ...
  write.csv(frequency, file =paste("./Result/output", i, ".csv", sep=""))
  rm(data, frequency)
}
When I run the batch job, the session gets killed at the time limit, and I receive the following messages after the MPI cluster initialization:
Loading required namespace: Rmpi
--------------------------------------------------------------------------
PMI2 initialized but returned bad values for size and rank.
This is symptomatic of either a failure to use the
"--mpi=pmi2" flag in SLURM, or a borked PMI2 installation.
If running under SLURM, try adding "-mpi=pmi2" to your
srun command line. If that doesn't work, or if you are
not running under SLURM, try removing or renaming the
pmi2.h header file so PMI2 support will not automatically
be built, reconfigure and build OMPI, and then try again
with only PMI1 support enabled.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process. Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption. The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.
The process that invoked fork was:
Local host: ...
MPI_COMM_WORLD rank: 0
If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--------------------------------------------------------------------------
30 slaves are spawned successfully. 0 failed.
Unfortunately, it seems that the loop doesn't complete even a single iteration, as no output is returned.
For the sake of completeness, my batch file:
#!/bin/bash -l
#SBATCH --job-name MyR
#SBATCH --output MyR-%j.out
#SBATCH --nodes=5
#SBATCH --ntasks-per-node=6
#SBATCH --mem=24gb
#SBATCH --time=00:30:00
MyRProgram="$HOME/R/hpc_test2.R"
cd $HOME/R
export R_LIBS_USER=$HOME/R/Libs2
# start R with my R program
module load R
time R --vanilla -f $MyRProgram
Does anybody have a suggestion how to solve the problem? What am I doing wrong?
Thanks in advance for your help!
Your script is an MPI application, so you need to execute it appropriately via Slurm. The Open MPI FAQ has a special section on how to do that:
https://www.open-mpi.org/faq/?category=slurm
The most important point is that your script shouldn't execute R directly, but should execute it via the mpirun command, using something like:
mpirun -np 1 R --vanilla -f $MyRProgram
My guess is that the "PMI2" error is caused by not executing R via mpirun. I don't think the "fork" message indicates a real problem and it happens to me at times. I think it happens because R calls "fork" when initializing, but this has never caused a problem for me. I'm not sure why I only get this message occasionally.
Note that it is very important to tell mpirun to only launch one process since the other processes will be spawned, so you should use the mpirun -np 1 option. If Open MPI was properly built with Slurm support, then Open MPI should know where to launch those processes when they are spawned, but if you don't use -np 1, then all 30 processes launched via mpirun will spawn 30 processes each, causing a huge mess.
Finally, I think you should tell makeCluster to spawn only 29 processes to avoid running a total of 31 MPI processes. Depending on your network configuration, even that much oversubscription can cause problems.
I would create the cluster object as follows:
library(snow)
library(Rmpi)
cl<- makeCluster(mpi.universe.size() - 1, type="MPI")
That's safer and makes it easier to keep your R script and job script in sync with each other.
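Putting the advice above together, the job script might look roughly like the sketch below. It reuses the #SBATCH settings, paths, and module name from the question and changes only the final launch line; it assumes Open MPI was built with Slurm support, as discussed above. The snippet writes the job script to a temporary file and prints it, so it can be inspected before being submitted.

```shell
#!/bin/sh
workdir=$(mktemp -d)
jobscript="$workdir/myr.sbatch"

# The quoted 'EOF' keeps $HOME and $MyRProgram literal in the generated file.
cat > "$jobscript" <<'EOF'
#!/bin/bash -l
#SBATCH --job-name MyR
#SBATCH --output MyR-%j.out
#SBATCH --nodes=5
#SBATCH --ntasks-per-node=6
#SBATCH --mem=24gb
#SBATCH --time=00:30:00
MyRProgram="$HOME/R/hpc_test2.R"
cd "$HOME/R"
export R_LIBS_USER="$HOME/R/Libs2"
module load R
# Launch exactly one R process; snow/Rmpi spawns the workers from inside R.
mpirun -np 1 R --vanilla -f "$MyRProgram"
EOF

cat "$jobscript"
```

The generated file would then be submitted with sbatch. With 5 nodes and 6 tasks per node, mpi.universe.size() - 1 in the R script should come out as 29 workers, provided Open MPI picks up the Slurm allocation.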

mpi_comm_spawn on remote nodes

How does one use MPI_Comm_spawn to start worker processes on remote nodes?
Using OpenMPI 1.4.3, I've tried this code:
MPI_Info info;
MPI_Info_create(&info);
MPI_Info_set(info, "host", "node2");
MPI_Comm intercom;
MPI_Comm_spawn("worker",
MPI_ARGV_NULL,
nprocs,
info,
0,
MPI_COMM_SELF,
&intercom,
MPI_ERRCODES_IGNORE);
But that fails with this error message:
--------------------------------------------------------------------------
There are no allocated resources for the application
worker
that match the requested mapping:
Verify that you have mapped the allocated resources properly using the
--host or --hostfile specification.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
launch so we are aborting.
There may be more information reported by the environment (see above).
This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
If I replace the "node2" with the name of my local machine, then it works fine. If I ssh into node2 and run the same thing there (with "node2" in the info dictionary) then it also works fine.
I don't want to start the parent process with mpirun, so I'm just looking for a way to dynamically spawn processes on remote nodes. Is this possible?
"I don't want to start the parent process with mpirun, so I'm just looking for a way to dynamically spawn processes on remote nodes. Is this possible?"
I'm not sure why you don't want to start it with mpirun. You're implicitly starting up the whole MPI machinery anyway as soon as you hit MPI_Init(); with mpirun you just get to pass it options rather than relying on the defaults.
The issue here is simply that when the MPI library starts up (at MPI_Init()) it doesn't see any other hosts available, because you haven't given it any with the --host or --hostfile options to mpirun. It won't just launch processes elsewhere on your say-so (indeed, spawn doesn't require Info host, so in general it wouldn't even know where to go otherwise), so it fails.
So you'll need to do
mpirun --host myhost,host2 -np 1 ./parentjob
or, more generally, provide a hostfile, preferably with a number of slots available
myhost slots=1
host2 slots=8
host3 slots=8
and launch the job this way:
mpirun --hostfile mpihosts.txt -np 1 ./parentjob
This is a feature, not a bug: now it's MPI's job to figure out where the workers go, and if you don't specify a host explicitly in the info, it'll try to put them in the most underutilized place. It also means you don't have to recompile to change the hosts you'll spawn to.
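As a small sketch of the hostfile approach, the snippet below writes the example hostfile shown above to a temporary directory and adds up the advertised slots. The total is a rough upper bound on how many processes (the parent plus any spawned workers) fit without oversubscription; that interpretation assumes Open MPI's default slot handling. The mpirun command is printed, not executed.

```shell
#!/bin/sh
workdir=$(mktemp -d)
hostfile="$workdir/mpihosts.txt"

# Host names and slot counts are the example values from the answer.
cat > "$hostfile" <<'EOF'
myhost slots=1
host2 slots=8
host3 slots=8
EOF

# Strip everything up to "slots=" and sum the remaining numbers.
total=$(sed 's/.*slots=//' "$hostfile" | awk '{s += $1} END {print s}')

echo "total slots: $total"
echo "mpirun --hostfile $hostfile -np 1 ./parentjob"
```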
