How to run the same MPI program multiple times - mpi

I have an MPI program that measures a sorting time. I run it with mpirun -np 2 mpiSort, which gives me the sorting time with 2 processes.
I want to measure the sorting time 5 times and average the results. How do I do that automatically?
If I put a loop in the mpiSort program, it actually executes 5 (runs) x 2 (processes) = 10 times.
Edit: mpiSort does the sort in parallel. Basically, I'm trying to run mpirun -np 2 mpiSort without typing it 5 times, because I also want to do the same for 4 cores and 8 cores.

You could run on five cores using mpirun -np 5 mpiSort and add an MPI_Gather at the end. Is the sort code actually using MPI (i.e. does it call MPI_Init at the beginning)? Assuming it does, you can run on 5 cores and simply average at the end with a reduce:
#include <mpi.h>
#include <iostream>
using namespace std;

int main(int argc, char *argv[])
{
    int ierr, rank, nprocs, root = 0;
    double time, buf;

    ierr = MPI_Init(&argc, &argv);
    ierr = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    ierr = MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    time = 0.5;
    // MPI_DOUBLE is the C datatype; MPI_DOUBLE_PRECISION is the Fortran one
    ierr = MPI_Reduce(&time, &buf, 1, MPI_DOUBLE,
                      MPI_SUM, root, MPI_COMM_WORLD);

    if (rank == root) {
        buf = buf / nprocs;
        cout << buf << "\n";
    }

    MPI_Finalize();
}
where time is each process's sort time.

Putting it in a loop is the way to go. I was confused because I got 10 values of endTime = MPI_Wtime() and only used the 5 from the root process. Thanks to @EdSmith and his MPI_Reduce code, the correctly calculated time is the average over the two processes, obtained with MPI_Reduce.
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &nProcs);
for (int run = 0; run < 5; run++) {
    ...
    endTime = MPI_Wtime();
    totalTime = endTime - startTime;
    MPI_Reduce(&totalTime, &workTime, 1, MPI_DOUBLE, MPI_SUM, root, MPI_COMM_WORLD);
    if (rank == root) {
        paraTime = workTime / nProcs;
    }
    ...
}
MPI_Finalize();

Related

How to update the message without using if statements?

I have a number of processors, say 9, that are arranged in a ring. The processors communicate with each other around the ring using the non-blocking calls MPI_Isend() and MPI_Irecv(). The task is for each processor to receive the rank of the previous processor, add it to its own rank, and then pass the result on to its neighbor. This continues until processor 0 is reached again; processor 0 then prints the sum, which is n(n+1)/2 (in this case 45). I know that these non-blocking functions return immediately even if the communication is not finished, and that MPI_Wait() is needed to ensure completion of the communication. I also know that it's better to have a buffer of size 2 to store the rank and the sum. But I don't know how and when to update the message before sending it to the next rank.
I don't want to use if statements like: if (rank == 0), send to 1; then if (rank == 1), receive from 0, add 1, and send to 2; and so on. That approach is highly inefficient for a large number of processors.
int main (int argc, char *argv[])
{
    int size, rank, next, prev;
    int buf[2];
    int tag1 = 1, tag2 = 2, ierror;
    MPI_Request reqs[4];
    MPI_Status stats[4];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    prev = rank - 1;
    next = rank + 1;
    if (rank == 0) prev = size - 1;
    if (rank == (size - 1)) next = 0;

    // MPI_Irecv(&buf, count, datatype, source, tag, comm, &request)
    ierror = MPI_Irecv(&buf[0], 1, MPI_INT, prev, tag1, MPI_COMM_WORLD, &reqs[0]);
    ierror = MPI_Irecv(&buf[1], 1, MPI_INT, next, tag2, MPI_COMM_WORLD, &reqs[1]);
    // MPI_Isend(&buf, count, datatype, dest, tag, comm, &request)
    ierror = MPI_Isend(&buf[0], 1, MPI_INT, prev, tag2, MPI_COMM_WORLD, &reqs[2]);
    ierror = MPI_Isend(&buf[1], 1, MPI_INT, next, tag1, MPI_COMM_WORLD, &reqs[3]);
    ierror = MPI_Waitall(4, reqs, stats);

    MPI_Finalize();
}

Why does this code only run on a limited number of processors?

I wrote this code to measure the wall time of MPI_Bcast(), but it only runs on specific numbers of processors: 8, 16, 24, 32. I need it to run for any number of processors and any array size. There is a segmentation fault that prevents it from running for array sizes of 2^13 and bigger.
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char *argv[]){
    MPI_Init(&argc, &argv);
    int rank, size;
    long int N = pow(2, 12);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    long double start, end;
    float* buffer = (float*)malloc(sizeof(float*) * N + 1);
    if (rank == 0){
        /* Creating data array of N random floating numbers */
        for (int i = 0; i < N; i++){
            buffer[i] = rand();
            buffer[i] = buffer[i] / rand();
        }
        start = MPI_Wtime();
        printf("Start Time:%Lf\nRoot processor-Rank %d- started broadcasting data buffer of %li floating numbers...\n", start, rank, N);
        MPI_Bcast(buffer, N, MPI_FLOAT, 0, MPI_COMM_WORLD);
        end = MPI_Wtime();
        printf("Total time elapsed: %Lf\n", end - start);
    }
    if (rank != 0){
        printf("Processor %d of %d processors, received the broadcast.\n", rank, size);
    }
    /*if(rank == 0){
        end = MPI_Wtime();
        printf("Total time elapsed: %Lf\n", end - start);}*/
    MPI_Finalize();
}

OpenMPI doesn't kill other rank when one rank crashes

I have some sample code:
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(&argc, &argv);
    // Find out rank, size
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    // We are assuming at least 2 processes for this task
    if (world_size < 2) {
        fprintf(stderr, "World size must be greater than 1 for %s\n", argv[0]);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    int number;
    if (world_rank == 1) {
        number = -1;
        MPI_Send(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        raise(SIGSEGV);
    } else if (world_rank == 0) {
        MPI_Recv(&number, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Process 0 received number %d from process 1\n", number);
    }
    printf("rank %d finalize\n", world_rank);
    MPI_Finalize();
}
Rank 1 raises a signal to simulate a crash. After the raise(), rank 1 exits. But rank 0 still prints rank 0 finalize.
Is there any way for rank 0 to know whether rank 1 crashed in this case? Is it possible to make mpirun kill rank 0 when rank 1 crashes?
Note there is a race condition in your problem: mpirun might not have enough time to notice that task 1 crashed and kill task 0 before the message is printed.
You can force Open MPI to kill all tasks as soon as a crash is detected with the option below:
mpirun -mca orte_abort_on_non_zero_status 1 ...

MPI_Wtime timer runs about 2 times faster in OpenMPI 2.0.2

After updating OpenMPI from 1.8.4 to 2.0.2 I ran into erroneous time measurements using MPI_Wtime(). With version 1.8.4 the result was the same as that returned by the omp_get_wtime() timer; now MPI_Wtime runs about 2 times faster.
What can cause such behaviour?
My sample code:
#include <omp.h>
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int some_work(int rank, int tid){
    int count = 10000;
    int arr[count];
    for (int i = 0; i < count; i++)
        arr[i] = i + tid + rank;
    for (int val = 0; val < 4000000; val++)
        for (int i = 0; i < count - 1; i++)
            arr[i] = arr[i+1];
    return arr[0];
}

int main (int argc, char *argv[]) {
    MPI_Init(NULL, NULL);
    int rank, size;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        printf("there are %d mpi processes\n", size);
    MPI_Barrier(MPI_COMM_WORLD);
    double omp_time1 = omp_get_wtime();
    double mpi_time1 = MPI_Wtime();
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();
        if (tid == 0) {
            int nthreads = omp_get_num_threads();
            printf("There are %d threads for process %d\n", nthreads, rank);
            int result = some_work(rank, tid);
            printf("result for process %d thread %d is %d\n", rank, tid, result);
        }
    }
    MPI_Barrier(MPI_COMM_WORLD);
    double mpi_time2 = MPI_Wtime();
    double omp_time2 = omp_get_wtime();
    printf("process %d omp time: %f\n", rank, omp_time2 - omp_time1);
    printf("process %d mpi time: %f\n", rank, mpi_time2 - mpi_time1);
    printf("process %d ratio: %f\n", rank, (mpi_time2 - mpi_time1)/(omp_time2 - omp_time1));
    MPI_Finalize();
    return EXIT_SUCCESS;
}
Compiling
g++ -O3 src/example_main.cpp -o bin/example -fopenmp -I/usr/mpi/gcc/openmpi-2.0.2/include -L /usr/mpi/gcc/openmpi-2.0.2/lib -lmpi
And running
salloc -N2 -n2 mpirun --map-by ppr:1:node:pe=16 bin/example
Gives something like
there are 2 mpi processes
There are 16 threads for process 0
There are 16 threads for process 1
result for process 1 thread 0 is 10000
result for process 0 thread 0 is 9999
process 1 omp time: 5.066794
process 1 mpi time: 10.098752
process 1 ratio: 1.993125
process 0 omp time: 5.066816
process 0 mpi time: 8.772390
process 0 ratio: 1.731342
The ratio is not as consistent as I first wrote, but it is still large.
Results for OpenMPI 1.8.4 are OK:
g++ -O3 src/example_main.cpp -o bin/example -fopenmp -I/usr/mpi/gcc/openmpi-1.8.4/include -L /usr/mpi/gcc/openmpi-1.8.4/lib -lmpi -lmpi_cxx
Gives
result for process 0 thread 0 is 9999
result for process 1 thread 0 is 10000
process 0 omp time: 4.655244
process 0 mpi time: 4.655232
process 0 ratio: 0.999997
process 1 omp time: 4.655335
process 1 mpi time: 4.655321
process 1 ratio: 0.999997
I've got similar behavior on my cluster (same OpenMPI version as yours, 2.0.2), and the problem was the default governor for the CPU frequencies, the 'conservative' one.
Once the governor was set to 'performance', the output of MPI_Wtime() aligned with the correct timings (the output of 'time', in my case).
It appears that, for some older Xeon processors (like the Xeon E5620), some clocking function becomes skewed when overly aggressive dynamic frequency scaling policies are used; the same OpenMPI version does not suffer from this problem on newer Xeons within the same cluster.
Maybe MPI_Wtime() could be a costly operation in itself?
Do the results get more consistent if you avoid measuring the time consumed by MPI_Wtime() itself as part of the OpenMP time?
E.g.:
double mpi_time1 = MPI_Wtime();
double omp_time1 = omp_get_wtime();
/* do something */
double omp_time2 = omp_get_wtime();
double mpi_time2 = MPI_Wtime();

difficulty with MPI_Gather function

I have a value in a local array (named lvotes) on each of the processors (assume 3 processors), and the first element of each stores a value, i.e.:
P0 : 4
P1 : 6
p2 : 7
Now, using MPI_Gather, I want to gather them all on P0, so it will look like:
P0 : 4, 6, 7
I used gather this way:
MPI_Gather(lvotes, P, MPI_INT, lvotes, 1, MPI_INT, 0, MPI_COMM_WORLD);
But I get problems. It's my first time coding in MPI; I could use any suggestions.
Thanks
This is a common issue for people using the gather/scatter collectives for the first time: in both the send and receive counts, you specify the number of items to send to or receive from each process. So although it's true that you'll be getting (say) P items in total, if P is the number of processors, that's not what you specify to the gather operation; you specify that you are sending a count of 1 and receiving a count of 1 (from each process). Like so:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main ( int argc, char **argv ) {
    int rank;
    int size;
    int lvotes;
    int *gvotes;

    MPI_Init ( &argc, &argv );
    MPI_Comm_rank ( MPI_COMM_WORLD, &rank );
    MPI_Comm_size ( MPI_COMM_WORLD, &size );

    if (rank == 0)
        gvotes = malloc(size * sizeof(int));

    /* everyone sets their first lvotes element */
    lvotes = rank + 4;

    /* Gather to process 0 */
    MPI_Gather(&lvotes, 1, MPI_INT,   /* send 1 int from lvotes... */
               gvotes, 1, MPI_INT,    /* gather 1 int from each process into gvotes */
               0, MPI_COMM_WORLD);    /* ...to root process 0 */

    printf("P%d: %d\n", rank, lvotes);

    if (rank == 0) {
        printf("P%d: Gathered ", rank);
        for (int i = 0; i < size; i++)
            printf("%d ", gvotes[i]);
        printf("\n");
    }

    if (rank == 0)
        free(gvotes);

    MPI_Finalize();
    return 0;
}
Running gives
$ mpirun -np 3 ./gather
P1: 5
P2: 6
P0: 4
P0: Gathered 4 5 6