I've got a Windows HPC Server running with some nodes in the backend. I would like to run Parallel R using multiple nodes from the backend. I think Parallel R might be using SNOW on Windows, but I'm not too sure about it. My question is: do I also need to install R on the backend nodes?
Say I want to use two nodes, 32 cores per node:
cl <- makeCluster(c(rep("COMP01",32),rep("COMP02",32)),type="SOCK")
Right now, it just hangs.
What else do I need to do? Do the backend nodes need some kind of sshd running to be able to communicate with each other?
Setting up snow on a Windows cluster is rather difficult. Each of the machines needs to have R and snow installed, but that's the easy part. To start a SOCK cluster, you would need an sshd daemon running on each of the worker machines, but you can still run into trouble, so I wouldn't recommend it unless you're good at debugging and Windows system administration.
I think your best option on a Windows cluster is to use MPI. I don't have any experience with MPI on Windows myself, but I've heard of people having success with the MPICH and DeinoMPI distributions of MPI for Windows. Once MPI is installed on your cluster, you also need to install the Rmpi package from source on each of your worker machines. You would then create the cluster object using the makeMPIcluster function. It's a lot of work, but I think it's more likely to eventually work than trying to use a SOCK cluster, due to the problems with ssh/sshd on Windows.
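Once everything is installed, creating the cluster might look something like this (a minimal sketch, assuming MPI, Rmpi, and snow are working on every node; the worker count of 64 matches your two 32-core machines):

library(snow)
# Request 64 workers; which hosts they land on is controlled
# by your MPI launcher/hostfile, not by this call
cl <- makeMPIcluster(64)
# Sanity check: which node is each worker running on?
unlist(clusterCall(cl, function() Sys.info()[["nodename"]]))
stopCluster(cl)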
If you're desperate to run a parallel job once or twice on a Windows cluster, you could try using manual mode. It allows you to create a SOCK cluster without ssh:
workers <- c(rep("COMP01",32), rep("COMP02",32))
cl <- makeSOCKcluster(workers, manual=TRUE)
The makeSOCKcluster function will prompt you to start each one of the workers, displaying the command to use for each. You have to manually open a command window on the specified machine and execute the specified command. It can be extremely tedious, particularly with many workers, but at least it's not complicated or tricky. It can also be very useful for debugging in combination with the outfile='' option.
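For instance, a debugging session might look like this (a sketch; one worker per machine keeps the manual launching manageable):

library(snow)
workers <- c("COMP01", "COMP02")
# manual=TRUE prints the command you must run by hand on each machine;
# outfile="" sends each worker's output to the console it was started in
cl <- makeSOCKcluster(workers, manual = TRUE, outfile = "")
# Verify the workers came up where you expected
unlist(clusterCall(cl, function() Sys.info()[["nodename"]]))
stopCluster(cl)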
Related
I searched and there doesn't seem to be a way to limit RAM usage in core Julia, so I looked for a Linux-level solution instead.
According to this question, I can limit the RAM available to my command to 64 GB with:
ulimit -v 64000000
I am wondering if I do:
$ ulimit -v 64000000
$ julia
julia>
Am I doing this right? That is, will everything I do in that Julia session, like launching a JuMP model from my Julia console, be limited to 64 GB of RAM?
ulimit sets resource limits for the current shell and for every process launched from it (rather than for your entire user account), so a julia process started from that shell should indeed respect the 64 GB virtual-memory limit.
The one exception that comes to mind is if you are running Julia processes on multiple nodes of a Linux cluster with, e.g., the Distributed.jl stdlib: workers started on other nodes won't inherit a limit set in your local shell. But that is a rather niche case.
I have access to (but not authority over) a computing cluster which has R installed. Is there a way for me to use RStudio on my local computer, but have the code run on the cluster via SSH?
To clarify: no, I don't really have non-SSH access, and no, I can't install RStudio (Server or Desktop) on the cluster.
In line with the hackish options @hrbrmstr mentioned...
If your aim is to run mostly non-interactive code, then you can probably establish an n-node parallel::makePSOCKcluster() on the remote machines and run each of your commands via parLapply()-style calls, as sketched below. Similarly, you could use the svSocket package; see this neat demo on YouTube for more detail than fits in a reasonable answer.
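A minimal sketch of the makePSOCKcluster() route, assuming key-based ssh login to the cluster (the hostname and user name below are placeholders):

library(parallel)
cl <- makePSOCKcluster("cluster.example.org",   # placeholder hostname
                       user = "yourname",       # placeholder user
                       rshcmd = "ssh",          # needs passwordless ssh
                       homogeneous = FALSE)     # remote R paths may differ
# These now run on the cluster, not locally
clusterEvalQ(cl, Sys.info()[["nodename"]])
res <- parLapply(cl, 1:10, function(i) i^2)
stopCluster(cl)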
But, given that you said RStudio, I suspect you are thinking of interactive use, and the above would be doable (but painful). Nothing I know of will let you simply pretend that the remote machine is the local machine (which is a pity, to be sure). However, you might be able to hack something together with sink() etc. and a server- and client-side loop; see, e.g., How to connect two computers using R?.
I am working in a large server environment. For example, when I run detectCores() from the parallel library, I get a printout of 48. I would like to use this environment efficiently. What parallel backend should one use in this environment?
I have searched around, and it seems as though certain packages are best for server environments, while others work best within GUI environments. But what about a blended environment such as RStudio Server?
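For what it's worth, here is a minimal sketch using only the base parallel package (an assumption on my part, since the question leaves the backend choice open). On a shared 48-core server it is polite to leave a few cores free:

library(parallel)
# Leave a couple of cores free on a shared server rather than taking all 48
n_workers <- max(1, detectCores() - 2)
cl <- makeCluster(n_workers)   # defaults to a PSOCK cluster
res <- parLapply(cl, 1:1000, function(i) sqrt(i))
stopCluster(cl)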
I'm running Torque with Open MPI on a single machine with 24 cores. Why is it possible to specify, for instance, nodes=1:ppn=2 in my job.sh and still be able to run a job with mpirun -np 12 WhatEverCommand? In that case the job is executed on 12 cores, even though "nodes" says 2 CPUs.
Doesn't specifying the "nodes" option place any restrictions on the resources used by the submitted job? And if it doesn't, how do you prevent users from violating the server rules by exceeding the declared resources?
On the other hand, specifying nodes=1:ppn=8 and running mpirun without the "-np" option gives me only 1 CPU running the job.
Am I that bad and missing something fundamental here?
By default, Open MPI doesn't integrate with Torque at all. You have to compile Open MPI with the --with-tm configure option, which doesn't seem to be enabled in most distro packages. The Open MPI project covers Torque integration in its FAQs on building and running Open MPI.
Similarly, Torque doesn't actually restrict access to CPUs unless cpuset support is enabled. Again, this seems absent in most distro packages. This is why your Open MPI app, when compiled without Torque integration, can hit all the cores without restriction.
Building both packages from source is not too difficult, so it's worth researching the configure options and building the support that makes sense for you.
I am running an external program via R that is pretty memory-hungry and can take more than 8 hours to run. I'd like to open up another instance of R to do other tasks, but am concerned about crashing the external program and having to restart the process. Should I expect any problems under these circumstances? The external program is Windows-only, and I'm running it on a Boot Camp partition on a MacBook Pro.
On a proper operating system, both instances will be independent and will not interfere with each other. (Unless they compete for the same resources, but that does not seem to be the case from your description.)
This is no different from several users logged into a server, each running one or two instances...