“unable to find the specified executable file” when trying to use mpirun with Julia

I am trying to run my Julia code on multiple nodes of a cluster, which uses Moab and Torque as the scheduler and resource manager.
In an interactive session where I requested 3 nodes, I load the julia and openmpi modules and run:
mpirun -np 72 --hostfile $PBS_NODEFILE -display-allocation julia --project=. "./estimation/test.jl"
mpirun does successfully recognize my 3 nodes, since it displays:
====================== ALLOCATED NODES ======================
comp-bc-0383: slots=24 max_slots=0 slots_inuse=0 state=UP
comp-bc-0378: slots=24 max_slots=0 slots_inuse=0 state=UNKNOWN
comp-bc-0372: slots=24 max_slots=0 slots_inuse=0 state=UNKNOWN
=================================================================
However, after that it returns an error message:
--------------------------------------------------------------------------
mpirun was unable to find the specified executable file, and therefore
did not launch the job. This error was first reported for process
rank 48; it may have occurred for other processes as well.
NOTE: A common cause for this error is misspelling a mpirun command
line parameter option (remember that mpirun interprets the first
unrecognized command line token as the executable).
Node: comp-bc-0372
Executable: /opt/aci/sw/julia/1.5.3_gcc-4.8.5-ips/bin/julia
--------------------------------------------------------------------------
What could be the cause of this? Is it because mpirun has trouble accessing Julia from the other nodes? (I think this is the case, because the code runs as long as -np X with X <= 24, which is the number of slots on one node; as soon as X >= 25, it fails to run.)

Here is a good manual on how to work with modules and mpirun: UsingMPIstacksWithModules
To sum up what is written in the manual:
Modules are nothing more than a structured way to manage your environment variables; so whatever hurdles there are with modules apply equally well to environment variables.
What you need is to export the environment variables in your mpirun command with -x PATH -x LD_LIBRARY_PATH. To see whether this worked, you can then run
mpirun -np 72 --hostfile $PBS_NODEFILE -display-allocation -x PATH -x LD_LIBRARY_PATH which julia
Also, you should consider giving the whole path of the file you want to run, i.e. /path/to/estimation/test.jl instead of ./estimation/test.jl, since your working directory is not necessarily the same on every node. (In general, it is always safer to use whole paths.)
By using whole paths, you should also be able to use /path/to/julia (that is, the output of which julia) instead of just julia; this way you should not need to export the environment variables.
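Putting it together, a full invocation along those lines might look like the sketch below. The Julia path is the one reported in the error message above; the project and script paths are placeholders you would replace with the real locations on your system:

mpirun -np 72 --hostfile $PBS_NODEFILE -display-allocation \
    -x PATH -x LD_LIBRARY_PATH \
    /opt/aci/sw/julia/1.5.3_gcc-4.8.5-ips/bin/julia --project=/path/to/project /path/to/estimation/test.jl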

Related

ORTE problem when running MPI on multiple computing nodes

I am trying to run a simple MPI example on a cluster with multiple computing nodes. Now I am just using two test nodes, including gpu8 and gpu12.
What I've done includes:
gpu8 and gpu12 have the correct MPI environment (OpenMPI-4.0.1). I can successfully run the MPI example on a single node.
Passwordless login between gpu8 and gpu12 has been setup. They can ssh to another node with no issues.
There is a hostfile on each node containing
gpu8
gpu12
The executable files are under the same path.
echo $PATH (on both nodes) gives
/home/user_1/share/local/openmpi-4.0.1/bin:xxxxxx
echo $LD_LIBRARY_PATH (on both nodes) gives
/home/t716/shshi/share/local/openmpi-4.0.1/lib:
The ORTE problem:
I am running mpirun -np 2 --hostfile /home/user_2/hosts ./home/user_2/mpi-hello-world/mpi_hello_world. The error output is:
bash: orted: command not found
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
--------------------------------------------------------------------------
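Since bash reports orted: command not found on the remote node, the usual remedy (following the error message's own checklist) is to make sure the Open MPI installation directory is visible to the remote shells, for example by launching mpirun through its full path or with --prefix, or by configuring Open MPI with --enable-orterun-prefix-by-default. A hedged sketch, assuming the install prefix shown in the $PATH output above and that the executable really lives at the absolute path given:

/home/user_1/share/local/openmpi-4.0.1/bin/mpirun --prefix /home/user_1/share/local/openmpi-4.0.1 \
    -np 2 --hostfile /home/user_2/hosts /home/user_2/mpi-hello-world/mpi_hello_world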

Several questions on running Rmpi and foreach on a HPC cluster

I am queueing and running an R script on an HPC cluster via sbatch and mpirun; the script is meant to use foreach in parallel. To do this I've used several useful questions & answers from StackOverflow: R Running foreach dopar loop on HPC MPIcluster, Single R script on multiple nodes, Slurm: Use cores from multiple nodes for R parallelization.
It seems that the script completes, but a couple of strange things happen. The most important is that the slurm job keeps on running afterwards, doing nothing(?). I'd like to understand if I'm doing things properly. I'll first give some more specific information, then explain the strange things I'm seeing, then I'll ask my questions.
– Information:
R is loaded as a module, which also calls an OpenMPI module. The packages Rmpi, doParallel, snow, foreach were already compiled and included in the module.
The cluster has nodes with 20 CPUs each. My sbatch file books 2 nodes and 20 CPUs per node.
The R script myscript.R is called in the sbatch file like this:
mpirun -np 1 Rscript -e "source('myscript.R')"
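For reference, a minimal sbatch file matching this description might look like the sketch below; only the two-nodes-by-20-CPUs layout and the mpirun line come from the question, while the job name, time limit and module name are assumptions:

#!/bin/bash
#SBATCH --nodes=2                 # 2 nodes, as described
#SBATCH --ntasks-per-node=20      # 20 CPUs per node, as described
#SBATCH --time=01:00:00           # assumed time limit
module load R                     # assumed module name (it also pulls in OpenMPI)
mpirun -np 1 Rscript -e "source('myscript.R')"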
My script calls several libraries in this order:
library('snow')
library('Rmpi')
library('doParallel')
library('foreach')
and then sets up parallelization as follows at the beginning:
workers <- mpi.universe.size() - 1
cl <- makeMPIcluster(workers, outfile='', type='MPI')
registerDoParallel(cl)
Then several foreach-dopar are called in succession – that is, each starts after the previous has finished. Finally
stopCluster(cl)
mpi.quit()
are called at the very end of the script.
mpi.universe.size() correctly gives 40, as expected. Also, getDoParName() reports doParallelSNOW. The slurm log encouragingly says
39 slaves are spawned successfully. 0 failed.
starting MPI worker
starting MPI worker
...
Also, calling print(clusterCall(cl, function() Sys.info()[c("nodename","machine")])) from within the script correctly reports the node names shown in the slurm queue.
– What's strange:
The R script completes all its operations, the last one being saving a plot as a PDF, which I do see and which is correct. But the slurm job doesn't end; it remains in the queue indefinitely with status "running".
The slurm log shows very many lines with Type: EXEC. I can't find any relation between their number and the number of foreach calls. At the very end the log shows 19 lines with Type: DONE (which makes sense to me).
– My questions:
Why does the slurm job run indefinitely after the script has finished?
Why the numerous Type: EXEC messages? Are they normal?
There is some masking between packages snow and doParallel. Am I calling the right packages and in the right order?
Some answers to the StackOverflow questions mentioned above recommend calling the script with
mpirun -np 1 R --slave -f 'myscript.R'
instead of using Rscript as I did. What's the difference? Note that the problems I mentioned remain even when I call the script this way.
I thank you very much for your help!

Change the number of cores OpenMPI can "see"

I am running an executable (I don't have access to the source code) that calls mpirun. I'm getting the following error, which is common when more cores are requested than are available on the CPU:
There are not enough slots available in the system to satisfy the 12
slots that were requested by the application:
/Users/me/Library/app/executable
Either request fewer slots for your application, or make more slots
available for use.
A "slot" is the Open MPI term for an allocatable unit where we can
launch a process. The number of slots available are defined by the
environment in which Open MPI processes are run:
1. Hostfile, via "slots=N" clauses (N defaults to number of
processor cores if not provided)
2. The --host command line parameter, via a ":N" suffix on the
hostname (N defaults to 1 if not provided)
3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
4. If none of a hostfile, the --host command line parameter, or an
RM is present, Open MPI defaults to the number of processor cores
In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.
Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
My problem is that I cannot change the command-line options for mpirun, e.g. by using --oversubscribe. Rather, I need to change the default number of cores that OpenMPI "sees". (This would otherwise be an easy fix, as in this case.)
Is there an environment variable or something I can update to trick OpenMPI into working?
Ah. I found the default OpenMPI hostfile at /usr/local/etc/openmpi-default-hostfile (on Mac) and added the following at the end (on a new line):
localhost slots=12
So OpenMPI was reading a default slot count of 6, since I have 6 cores on my system (the error only occurred for >6 requested CPUs). But I have 12 threads and wanted to use the CPU at full capacity.
This worked for me since I was not running mpirun in the command line.
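For completeness, the same edit can be made from a shell. The hostfile path is the one found above on macOS and may differ on other installs; the MCA environment variable in the last line is an alternative worth checking rather than a confirmed fix:

# append a 12-slot entry for localhost to Open MPI's default hostfile
echo "localhost slots=12" | sudo tee -a /usr/local/etc/openmpi-default-hostfile
# alternatively, some Open MPI builds honour an MCA variable for oversubscription
export OMPI_MCA_rmaps_base_oversubscribe=1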

Using MPI communication with containerized applications

I've successfully executed a program on my cluster using
mpirun -np 1 ./my_program
I've containerized my_program using Singularity and produced my_program.simg. The container also includes the same version of OpenMPI as the host. I'd like to have nodes in my cluster execute the Singularity container, so I tried:
mpirun -np 1 \
singularity run my_program.simg
However, this fails:
[pn111:395108] PMIX ERROR: NOT-FOUND in file ../../../../../../../openmpi-3.1.3/opal/mca/pmix/pmix2x/pmix/src/server/pmix_server_ops.c at line 1865
[pn111:395108] PMIX ERROR: NOT-FOUND in file ../../../../../../../openmpi-3.1.3/opal/mca/pmix/pmix2x/pmix/src/server/pmix_server_ops.c at line 1865
[pn111:395108] PMIX ERROR: NOT-FOUND in file ../../../../../../../openmpi-3.1.3/opal/mca/pmix/pmix2x/pmix/src/server/pmix_server_ops.c at line 1865
I'm assuming that I need to expose something to the container so it knows how to speak to the MPI universe. Is there a socket I need to bind into the container, or an environment variable I need to set, or something like that?
The Singularity documentation suggests there's nothing extra to do.
This GitHub issue may have good advice, which I will try.
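One quick sanity check (my suggestion, not from the linked documentation) is to confirm that the Open MPI inside the container really does match the host's, since the hybrid host/container launch model requires compatible versions:

# compare host and container Open MPI versions
mpirun --version
singularity exec my_program.simg mpirun --version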

Error in locking authority file for parallelisation with qmake

I work with Rscript on a cluster. qmake (a specialized version of GNU make for the cluster) is used to parallelize jobs across several nodes. But Rscript seems to need to write an Xauthority file, and this causes an error when several nodes run at the same time. As a result, my makefile-based pipeline stops after the first group of parallelized tasks and doesn't start the next group. The results of the first group are fine, though.
I'm also invoking /usr/bin/xvfb-run (https://en.wikipedia.org/wiki/Xvfb) when running Rscript.
I've already changed the ssh config (ForwardX11 yes) but the problem persists.
I also tried changing the name of the Xauthority file for each job, but it didn't work (option -f of xvfb-run).
Here is the error which appears at the beginning of the process:
/usr/bin/xauth: error in locking authority file .Xauthority
Here is the error which appears before the process stops:
/usr/bin/xvfb-run: line 171: kill: (44402) - No such process
qmake: *** [Data/Median/median_B00FTXC.tmp] Error 1
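A hedged workaround sketch: give each parallel task its own Xvfb display and authority file via xvfb-run's -a and -f options, so jobs on different nodes do not all try to lock ~/.Xauthority at once (the $TMPDIR/$$-based file name is just an illustrative choice, and myscript.R stands in for the actual Rscript call):

# per-job display and auth file, to avoid locking ~/.Xauthority
/usr/bin/xvfb-run -a -f "$TMPDIR/xauth.$$" Rscript myscript.R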
