Including MPI header file with user-defined macro

For including OpenMP, this worked:
#ifdef _OPENMP
#include <omp.h>
#endif
I wanted to include MPI the same way, so I wrote this:
#ifdef MPI
#include <mpi.h>
#endif
But this does not work. Can't MPI be included in the same manner?

Yes, it will work. Unlike _OPENMP, which the compiler defines automatically when you pass -fopenmp, no macro named MPI is predefined for you, so you need to tell the compiler that MPI is a defined macro. The shorthand for that is -D:
gcc -D MPI my.c
Example, test.c:
#include <stdio.h>

int main()
{
#ifdef MPI
    printf("Hello, MPI\n");
#endif
    printf("Hello, world\n");
}
Screen scrape:
jsmith@LOBSTER:~$ gcc test.c
jsmith@LOBSTER:~$ ./a.out
Hello, world
jsmith@LOBSTER:~$ gcc -D MPI test.c
jsmith@LOBSTER:~$ ./a.out
Hello, MPI
Hello, world
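For completeness, here is a minimal sketch (file name and structure are illustrative, not from the original question) of the same guard applied to real MPI calls, so one source file builds serially with gcc test_mpi.c or in parallel with mpicc -D MPI test_mpi.c:
#include <stdio.h>
#ifdef MPI
#include <mpi.h>
#endif

int main(int argc, char **argv)
{
    int rank = 0;                            /* serial build: act as rank 0 */
#ifdef MPI
    MPI_Init(&argc, &argv);                  /* compiled only with -D MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
#endif
    printf("Hello from rank %d\n", rank);
#ifdef MPI
    MPI_Finalize();
#endif
    return 0;
}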

Related

OpenMP threads not activating when run with mpirun

While trying to run a hybrid MPI/OpenMP application I realized that the number of OpenMP threads was always 1, even though I exported OMP_NUM_THREADS=36. I built a small C++ example showing the issue:
#include <vector>
#include <cmath>

int main()
{
    int n = 4000000, m = 1000;
    double x = 0, y = 0;
    double s = 0;
    std::vector<double> shifts(n, 0);

    #pragma omp parallel for reduction(+:x,y)
    for (int j = 0; j < n; j++) {
        double r = 0.0;
        for (int i = 0; i < m; i++) {
            double rand_g1 = cos(i / double(m));
            double rand_g2 = sin(i / double(m));
            x += rand_g1;
            y += rand_g2;
            r += sqrt(rand_g1 * rand_g1 + rand_g2 * rand_g2);
        }
        shifts[j] = r / m;
    }
}
I compile the code using g++:
g++ -fopenmp main.cpp
OMP_NUM_THREADS is still set to 36. When I run the code with just:
time ./a.out
I get a run-time of about 6 seconds and htop shows the command using all 36 cores of my local node, as expected. When I run it with mpirun:
time mpirun -np 1 ./a.out
I get a run-time of 3m20s and htop shows the command is using only one core. I've also tried mpirun -np 1 -x OMP_NUM_THREADS=36 ./a.out, but the results were the same.
I am using GCC 9.2.0 and OpenMPI 4.1.0a1. Since this is a developer version, I've also tried with OpenMPI 4.0.3 with the same result.
Any idea what I am missing?
The default behavior of Open MPI is to
- bind an MPI task to a core if there are two or fewer MPI tasks
- bind an MPI task to a socket otherwise
So you really should
mpirun --bind-to none -np 1 ./a.out
so your MPI task can access all the cores of your host.
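To see the effect, a minimal check (a sketch, assuming GCC and Open MPI as in the question) that prints the size of the OpenMP thread team:
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

int main(void)
{
#ifdef _OPENMP
    #pragma omp parallel
    {
        #pragma omp single   /* one thread reports the team size */
        printf("OpenMP threads: %d\n", omp_get_num_threads());
    }
#else
    printf("Compiled without OpenMP\n");
#endif
    return 0;
}
Compile with gcc -fopenmp. When OMP_NUM_THREADS is not picked up, the OpenMP runtime sizes the team from the cores the process is allowed to use, so a plain mpirun -np 1 ./a.out will typically print 1 here, while mpirun --bind-to none -np 1 ./a.out should print the full core count.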

Limit MPI to run on a single GPU even with a single-node multi-GPU setup

I am new to distributed computing and I am trying to run a program which uses MPI and ROCm (AMD's framework for running on GPUs).
The command I am using to run the program is
mpirun -np 4 ./a.out
But by default it runs on both of the 2 GPUs available in my machine.
Is there a way to make it run on only a single GPU, and if so, how?
Thanks in advance :)
You may control the active GPU(s) by setting some environment variables
(e.g. GPU_DEVICE_ORDINAL, ROCR_VISIBLE_DEVICES or HIP_VISIBLE_DEVICES; see the ROCm documentation for more details).
For instance:
export HIP_VISIBLE_DEVICES=0
mpirun -np 4 ./a.out
# or
HIP_VISIBLE_DEVICES=0 mpirun -np 4 ./a.out
Be careful: some MPI implementations do not export all environment variables, or may reload your bashrc or cshrc. So using your MPI's own syntax for setting environment variables is safer:
# with openmpi
mpirun -x HIP_VISIBLE_DEVICES=0 -np 4 ./a.out
# or with mpich
mpiexec -env HIP_VISIBLE_DEVICES 0 -n 4 ./a.out
To be on the safe side, it's probably a good idea to add this to your C++ code:
#include <cstdlib>   // std::getenv
#include <iostream>
// ...
const char* hip_visible_devices = std::getenv("HIP_VISIBLE_DEVICES");
if (hip_visible_devices) std::cout << "Running on GPUs: " << hip_visible_devices << std::endl;
else std::cout << "Running on all GPUs!" << std::endl;
(Note that CUDA has both an envvar and a C function, cudaSetDevice(id); I'm wondering if there's an equivalent for AMD or OpenCL.)
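There is in fact a HIP equivalent, hipSetDevice(). A minimal sketch (the round-robin rank-to-device mapping is an illustrative choice, and error checking is omitted) of pinning each MPI rank to one GPU:
#include <stdio.h>
#include <mpi.h>
#include <hip/hip_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int ndev = 0;
    hipGetDeviceCount(&ndev);        /* devices visible to this process */

    if (ndev > 0)
        hipSetDevice(rank % ndev);   /* round-robin ranks over devices; with
                                        HIP_VISIBLE_DEVICES=0 every rank sees
                                        only GPU 0 and uses it */

    printf("rank %d: %d visible device(s)\n", rank, ndev);
    MPI_Finalize();
    return 0;
}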

MPI_Comm_rank returns the same process number for all processes

I'm trying to run this sample hello world program with Open MPI and mpirun on Debian 7.
#include <stdio.h>
#include <mpi/mpi.h>

int main(int argc, char **argv) {
    int nProcId, nProcNo;
    int nNameLen;
    char szMachineName[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                             // Start up MPI
    MPI_Comm_size(MPI_COMM_WORLD, &nProcNo);            // Find out number of processes
    MPI_Comm_rank(MPI_COMM_WORLD, &nProcId);            // Find out process rank
    MPI_Get_processor_name(szMachineName, &nNameLen);   // Get machine name

    printf("Hello World from process %d on %s\r\n", nProcId, szMachineName);
    if (nProcId == 0)
        printf("Number of Processes: %d\r\n", nProcNo);

    MPI_Finalize();                                     // Shut down MPI
    return 0;
}
My problem is that MPI_Comm_rank returns 0 for all copies of the process. When I run this command in the shell:
mpirun -np 4 helloWorld
It produces this output:
Hello World from process 0 on debian
Number of Processes: 1
Hello World from process 0 on debian
Number of Processes: 1
Hello World from process 0 on debian
Number of Processes: 1
Hello World from process 0 on debian
Number of Processes: 1
Why is the number of processes still 1?
Make sure that both mpicc and mpirun come from the same MPI implementation. When mpirun fails to provide the necessary universe information to the launched processes, most commonly because the executable was built against a different MPI implementation (or even a different version of the same implementation), MPI_Init() falls back to the so-called singleton MPI initialisation and creates an MPI_COMM_WORLD that contains only the calling process. The result is many MPI processes, each within its own separate MPI_COMM_WORLD instance.
Commands like mpicc --showme (Open MPI), which mpicc and which mpirun can usually help you find out whether that is indeed the case.
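For example (the install path below is purely illustrative), check the two wrappers and, if they disagree, rebuild and run with a matched pair from a single MPI tree:
which mpicc
which mpirun
# if they point into different MPI installations, use one tree for both:
/usr/lib/openmpi/bin/mpicc -o helloWorld helloWorld.c
/usr/lib/openmpi/bin/mpirun -np 4 ./helloWorld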

Wrong MPI number of processes

Sorry, I'm sure I'm making a silly mistake, but I can't work it out.
I'm compiling a simple MPI hello world:
#include <stdio.h>
#include <mpi.h>

int main (argc, argv)
int argc;
char *argv[];
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* starts MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* get current process id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* get number of processes */
    printf("Hello world from process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
And:
> mpicc -o hello_world_c hello_world.c
> mpirun -np 4 hello_world_c
But it returns:
Hello world from process 0 of 1
Hello world from process 0 of 1
Hello world from process 0 of 1
Hello world from process 0 of 1
But my computer is a Core i7 with 4 cores, and everything seems to be OK, i.e. cat /proc/cpuinfo shows the 4 processors.
What's happening?
Thanks in advance!
There is nothing wrong with your code; the only likely problem is with your MPI installation.
Note: there is a difference between a processor and a core; they are not the same thing.
In this case, you need mpiexec from the mpich2 package.
First, remove all MPI packages installed on your computer. If your server is Ubuntu, you can use the command:
sudo apt-get purge mpi mpich2 openmpi-common
To make sure that you have removed all the packages, try this command:
which mpiexec
If you get nothing in response, you have removed all the packages.
Then reinstall mpich2:
sudo apt-get install mpich2
Try to compile and run your code again!
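As a quick sanity check after reinstalling (a sketch; the expected lines come from the program's own printf):
mpicc -o hello_world_c hello_world.c
mpiexec -n 4 ./hello_world_c
# expect "Hello world from process N of 4" for N = 0..3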
Hope this helps!
I don't know how you managed to compile it:
int main (argc, argv)
int argc;
char *argv[];
should be changed to
int main (int argc, char *argv[])
Another point: MPI is a message-passing interface; it passes messages between processes, not cores or processors. If you have a 4-core system, you can run your code with as many processes as your RAM permits, but only 4 processes will be working at any time while the others wait, so it is most efficient to use only 4 processes.
Install:
sudo apt-get install libopenmpi-dev openmpi-bin openmpi-doc
Now compile and execute the code again.
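For instance (a sketch using the standard Open MPI wrappers):
mpicc -o hello_world_c hello_world.c
mpirun -np 4 ./hello_world_c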

Convert a C program to a parallel MPI C program

How do we convert a C program with user-defined functions into a parallel program using the MPI API? A demo would be most useful.
Thank you.
Hari
It depends on what you are trying to execute in parallel and what your requirements are.
There are lots of good tutorials available.
Here's a simple hello world program:
#include "stdio.h"
#include "mpi.h"
int main(int argc, char *argv[])
{
int id, nprocs;
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &id);
MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
printf("Hello World from proc %d out of %d!\n", id, nprocs);
MPI_Finalize();
return 0;
}
Then compile it as follows:
usr@xbc:~> mpicc -Wall -Werror -o mpihello mpihello.c
To run it on 2 separate machines:
usr@xbc:~> mpirun_rsh -ssh -np 2 node0 node1 /home/xbc/mpihello
or
usr@xbc:~> mpirun_rsh -hostfile hostlist -n 2 /home/xbc/mpihello
where your hostlist file would have 2 entries, node0 and node1.
To run it on the same machine as 2 processes:
usr@xbc:~> mpirun_rsh -ssh -np 2 node0:1 /home/xbc/mpihello
or
usr@xbc:~> mpirun_rsh -hostfile hostlist -n 2 /home/xbc/mpihello
where your hostlist file would have 1 entry, node0:1.
The output will be a hello world line with ranks 0 and 1, from either 2 machines or 2 processes, depending on how you run it.
Now, you can run MPI over TCP/IP, over IP over InfiniBand (IPoIB), over InfiniBand over Ethernet, or in many other ways, depending on your requirements.
You will have to configure the MPI build and installation accordingly.
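Since the question mentions a user-defined function, here is a hedged sketch of the usual conversion pattern (f, N and the cyclic work split are illustrative choices, not from the original post): every rank evaluates the function on its own share of the indices, and MPI_Reduce combines the partial results on rank 0.
#include <stdio.h>
#include <mpi.h>

#define N 1000000

double f(long i)                            /* the user-defined function */
{
    return (double)i * i;
}

int main(int argc, char *argv[])
{
    int id, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    double partial = 0.0;
    for (long i = id; i < N; i += nprocs)   /* cyclic distribution of work */
        partial += f(i);

    double total = 0.0;
    MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (id == 0)
        printf("sum = %.0f\n", total);

    MPI_Finalize();
    return 0;
}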
