I have this MPI program which hangs without completing. Any idea where it is going wrong? May be I am missing something but I cannot think of a possible issue with the code. Changing the order of send and receive doesn't work either. (But I am guessing any order would do due to nonblocking nature of the calls.)
#include <mpi.h>
int main(int argc, char** argv) {
int p = 2;
int myrank;
double in_buf[1];
double out_buf[1];
MPI_Comm comm = MPI_COMM_WORLD;
MPI_Status stat;
MPI_Init(&argc, &argv);
MPI_Comm_rank(comm, &myrank);
MPI_Comm_size(comm, &p);
MPI_Request requests[2];
MPI_Status statuses[2];
if (myrank == 0) {
MPI_Isend(out_buf,1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &requests[0]);
MPI_Irecv(in_buf, 1, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD, &requests[1]);
} else {
MPI_Irecv(in_buf, 1, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD, &requests[0]);
MPI_Isend(out_buf, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &requests[1]);
}
MPI_Waitall(2, requests, statuses);
printf("Done...\n");
}
Right of the bat it looks like your tags are mismatched.
You post isend from rank 0 with tag=0 but post irecv from rank 1 with tag=1.
I assume you launch two processes as well, right? the int p = 2 doesn't do anything useful.
Related
I am encountering some strange behavior with MPICH. The following minimal example, which sends a message to the non-existing process with rank -1, causes a deadlock:
// Program: send-err
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv) {
MPI_Init(NULL, NULL);
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
// We are assuming at least 2 processes for this task
if (world_size != 2) {
fprintf(stderr, "World size must be 2 for %s\n", argv[0]);
MPI_Abort(MPI_COMM_WORLD, 1);
}
int number;
if (world_rank == 0) {
number = -1;
MPI_Send(&number, // data buffer
1, // buffer size
MPI_INT, // data type
-1, //destination
0, //tag
MPI_COMM_WORLD); // communicator
} else if (world_rank == 1) {
MPI_Recv(&number,
1, // buffer size
MPI_INT, // data type
0, // source
0, //tag
MPI_COMM_WORLD, // communicator
MPI_STATUS_IGNORE);
}
MPI_Finalize();
}
If the call to the send function,
MPI_Send( start, count, datatype, destination_rank, tag, communicator )
uses destination_rank = -2, then the program fails with the error message:
> mpirun -np 2 send-err
Abort(402250246) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Send: Invalid rank, error stack:
PMPI_Send(157): MPI_Send(buf=0x7fffeb411b44, count=1, MPI_INT, dest=MPI_ANY_SOURCE, tag=0, MPI_COMM_WORLD) failed
PMPI_Send(94).: Invalid rank has value -2 but must be nonnegative and less than 2
Based on the error message, I would expect a program that sends a message to the process with rank -1 to fail similarly to the program sending a message to the process with rank -2. What causes this difference in behavior?
I am writing a sample program MPI in which one process send an integer to other process.
This is my source code
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
// Find out rank, size
int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
int world_size;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
int number;
if (world_rank == 0) {
number = -1;
MPI_Send(&number, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
}
else if (world_rank == 1) {
MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
MPI_STATUS_IGNORE);
printf("Process 1 received number %d from process 0\n",
number);
}
}
And this is the error when I run mpiexec in windows cmd line
ERROR: Error reported: failed to set work directory to 'D:\study_documents\Thesis\Nam 4\Demo\Sample Codes\MPI_HelloWorld\Debug' on DESKTOP-EKN1RD3
Error (3) The system cannot find the path specified.
I'm trying to implement the following scenario using mpi_comm_spawn & scatter :
1- Master spawns 2 processes with a job.
2- He scatters an array to those spawned processes.
3- The spawned processes receive the scattered array sort it then send it back.
4- The master receives the sorted parts of the array.
I'd like to know how to do the step 2, so far i've tried with send and receives, they work perfectly but i want to do it with the scatter function.
Edit : Here's what i'd like to do in the master code , i'm missing the part in the slave's where i receive the scattered array
/*Master Here*/
MPI_Comm_spawn(slave, MPI_ARGV_NULL, 2, MPI_INFO_NULL,0, MPI_COMM_WORLD, &inter_comm, array_of_errcodes);
printf("MASTER Sending a message to slaves \n");
MPI_Send(message, 50, MPI_CHAR,0 , tag, inter_comm);
MPI_Scatter(array, 10, MPI_INT, &array_r, 10, MPI_INT, MPI_ROOT, inter_comm);
Thanks.
master.c
#include "mpi.h"
int main(int argc, char *argv[])
{
int n_spawns = 2;
MPI_Comm intercomm;
MPI_Init(&argc, &argv);
MPI_Comm_spawn("worker_program", MPI_ARGV_NULL, n_spawns, MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);
int sendbuf[2] = {3, 5};
int recvbuf; // redundant for master.
MPI_Scatter(sendbuf, 1, MPI_INT, &recvbuf, 1, MPI_INT, MPI_ROOT, intercomm);
MPI_Finalize();
return 0;
}
worker.c
#include "mpi.h"
#include <stdio.h>
int main(int argc, char *argv[])
{
MPI_Init(&argc, &argv);
MPI_Comm intercomm;
MPI_Comm_get_parent(&intercomm);
int sendbuf[2]; // redundant for worker.
int recvbuf;
MPI_Scatter(sendbuf, 1, MPI_INT, &recvbuf, 1, MPI_INT, 0, intercomm);
printf("recvbuf = %d\n", recvbuf);
MPI_Finalize();
return 0;
}
Command line
mpicc master.c -o master_program
mpicc worker.c -o worker_program
mpirun -n 1 master_program
I wonder if it is possible to send data to a 3rd party communicator by having its integer value.
In other words, I would like to send a MPI_Comm (int) to processes in a non-related communicator in order to establish a communication between them.
For example, this picture shows my goal:
For this purpose, I have developed testing codes which intend to transfer the MPI_Comm.
parent.c
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
MPI_Init(NULL, NULL);
int world_size, world_rank;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
MPI_Comm children;
int err[world_size], msg;
MPI_Comm_spawn("./children", NULL, 2, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &children, err);
if (world_rank == 0) {
MPI_Send(&children, 1, MPI_INT, 0, 0, children);
MPI_Recv(&msg, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, children, MPI_STATUS_IGNORE);
}
MPI_Finalize();
return (0);
}
children.c
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
MPI_Init(NULL, NULL);
int world_size, world_rank;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
MPI_Comm parent;
MPI_Comm_get_parent(&parent);
int comm, msg = 123;
if (world_rank == 0) {
MPI_Recv(&comm, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, parent, MPI_STATUS_IGNORE);
MPI_Comm children = (MPI_Comm) comm;
MPI_Send(&msg, 1, MPI_INT, 0, 0, children);
}
MPI_Finalize();
return (0);
}
Of course, the code does not work due to:
Fatal error in MPI_Send: Invalid communicator, error stack
Is there any way two establish that connection?
PS: This is an example, and I know that if I used "parent" comm in the send of "children.c", it would work. But my intention is to send data to a 3rd party communicator with only its integer id.
I write programs using MPI and I have an access to two different clusters. I am not good in system administration, so I can not tell anything about software, OS, compilers which are used there. But, on one machine I have an deadlock using such code:
#include "mpi.h"
#include <iostream>
int main(int argc, char **argv) {
int rank, numprocs;
MPI_Status status;
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
int x = rank;
if (rank == 0) {
for (int i=0; i<numprocs; ++i)
MPI_Send(&x, 1, MPI_INT, i, 100500, MPI_COMM_WORLD);
}
MPI_Recv(&x, 1, MPI_INT, 0, 100500, MPI_COMM_WORLD, &status);
MPI_Finalize();
return 0;
}
The error message is related:
Fatal error in MPI_Send: Other MPI error, error stack:
MPI_Send(184): MPI_Send(buf=0x7fffffffceb0, count=1, MPI_INT, dest=0, tag=100500, MPI_COMM_WORLD) failed
MPID_Send(54): DEADLOCK: attempting to send a message to the local process without a prior matching receive
Why is that so? I can't understand, why does it happen on one machine, but doesn't happen on another?
MPI_Send is a blocking operation. It may not complete until a matching receive is posted. In your case rank 0 is trying to send a message to itself before having posted a matching receive. If you must do something like this you would replace MPI_Send with MPI_Isend(+MPI_Wait...` after the receive). But you might as well just not make him send a message to itself.
The proper thing to use in your case is MPI_Bcast.
Since rank 0 already has the correct value of x, you do not need to send it in a message. This means that in the loop you should skip sending to rank 0 and instead start from rank 1:
if (rank == 0) {
for (int i=1; i<numprocs; ++i)
MPI_Send(&x, 1, MPI_INT, i, 100500, MPI_COMM_WORLD);
}
MPI_Recv(&x, 1, MPI_INT, 0, 100500, MPI_COMM_WORLD, &status);
Now rank 0 won't try to talk to itself, but since the receive is outside the conditional, it will still try to receive a message from itself. The solution is to simply make the receive the alternative branch:
if (rank == 0) {
for (int i=1; i<numprocs; ++i)
MPI_Send(&x, 1, MPI_INT, i, 100500, MPI_COMM_WORLD);
}
else
MPI_Recv(&x, 1, MPI_INT, 0, 100500, MPI_COMM_WORLD, &status);
Another more involved solution is to use non-blocking operations to post the receive before the send operation:
MPI_Request req;
MPI_Irecv(&x, 1, MPI_INT, 0, 100500, MPI_COMM_WORLD, &req);
if (rank == 0) {
int xx = x;
for (int i=0; i<numprocs; ++i)
MPI_Send(&xx, 1, MPI_INT, i, 100500, MPI_COMM_WORLD);
}
MPI_Wait(&req, &status);
Now rank 0 will not block in MPI_Send as there is already a matching receive posted earlier. In all other ranks MPI_Irecv will be immediately followed by MPI_Wait, which is equivalent to a blocking receive (MPI_Recv). Note that the value of x is copied to a different variable inside the conditional as simultaneously sending from and receiving into the same memory location is forbidden by the MPI standard for obvious correctness reasons.