MPI Error: No output

The code below uses 4 nodes that communicate via MPI. It compiles successfully on the cluster with "mpiicpc".
However, when I run it, the output screen shows only a warning, ‘Warning: Cant read mpd.hosts for list of hosts start only on current’, and then hangs.
What does the warning mean, and is it the reason my code hangs?
#include <mpi.h>
#include <fstream>
using namespace std;
#define Cols 96
#define Rows 96
#define beats 1
ofstream fout("Vm0");
ofstream f1out("Vm1");
double V[Cols][Rows];
int r,i,y,ibeat;
int my_rank;
int p;
int source;
int dest;
int tag = 0;
//Allocating Memory
double *A = new double[Rows*sizeof(double)];
double *B = new double[Rows*sizeof(double)];
void prttofile ();
int main (int argc, char *argv[])
//MPI Commands
MPI_Status status;
MPI_Request send_request, recv_request;
MPI_Init (&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
MPI_Comm_size(MPI_COMM_WORLD, &p);
for (ibeat=0;ibeat<beats;ibeat++)
for (i=0; i<Cols/2; i++)
for (y=0; y<Rows/2; y++)
if (my_rank == 0)
if (i < 48)
if (y<48)
V[i][y] = 0;
//Load the Array with the edge values
for (r=0; r<Rows/2; y++)
if ((my_rank == 0) || (my_rank == 1))
A[r] = V[r][48];
BB[r] = V[r][48];
int test = 2;
if ((my_rank%test) == 0)
MPI_Isend(C, Rows, MPI_DOUBLE, my_rank+1, 0, MPI_COMM_WORLD, &send_request);
MPI_Irecv(CC, Rows, MPI_DOUBLE, my_rank+1, MPI_ANY_TAG, MPI_COMM_WORLD, &recv_request);
else if ((my_rank%test) == 1)
ibeat = ibeat+1;
prttofile ();
} //close ibeat
MPI_Finalize ();
} //close main
//Print to File Function to save output values
void prttofile ()
for (i = 0; i<Cols/2; i++)
for (y = 0; y<Rows/2; y++)
if (my_rank == 0)
fout << V[i][y] << " " ;
if (my_rank == 0)
fout << endl;
if ....

When you want to run on multiple nodes you have to tell mpirun which ones you want with the -machinefile switch. This machinefile is just a list of nodes, one per line. If you want to put 2 processes on one node, list it twice.
So if your machines are named node1 and node2 and you want to use two cores from each:
$ cat nodes
node1
node1
node2
node2
$ mpirun -machinefile nodes -np 4 ./a.out
If you're using a batch control system like PBS or TORQUE (you use qsub to submit your job) then this node file is created for you and its location is in the $PBS_NODEFILE environment variable:
mpirun -machinefile $PBS_NODEFILE -np 4 ./a.out


making tree function in xv6

I want to make a tree command in xv6. In case you don't know it, tree lists directories on the terminal. I know this is probably easy for you, but here is my code so far:
#include "types.h"
#include "stat.h"
#include "user.h"
#include "fcntl.h"
#include "fs.h"
#include "file.h"
main(int argc, char *argv[])
if(argc < 2){
printf(2, "Usage: tree [path]...\n");
int fd = open(argv[1],O_RDONLY);
return -1;
struct dirent dir;
printf(1,"|_ %d,%d",,dir.inum);
//struct stat *st;
struct inode ip;
ip= getinode(dir.inum);
int i;
for(i=0;i<NDIRECT;i++ ){
uint add=ip.addrs[i];
return 0;
and it has been giving me numerous errors in the terminal, the first being:
file.h:17:20: error: field ‘lock’ has incomplete type
struct sleeplock lock; // protects everything below here
I searched for sleeplock and there is nothing like that in my code. What is wrong with the code? Thank you for your help.
You cannot use kernel headers (like file.h) in user code. To use kernel functionality in your code, you must use system calls.
To achieve what you want, you could start from the ls function and make it recursive.
Here is an example, made quickly:
I added a parameter to the ls function that carries the depth of the crawl,
and the function calls itself on each directory entry except the first two, which are . and ..
void
ls(char *path, int decal)
{
  char buf[512], *p;
  int fd, i, skip = 2;   // skip counts the first two entries, "." and ".."
  struct dirent de;
  struct stat st;

  if((fd = open(path, 0)) < 0){
    printf(2, "tree: cannot open %s\n", path);
    return;
  }
  if(fstat(fd, &st) < 0){
    printf(2, "tree: cannot stat %s\n", path);
    close(fd);
    return;
  }

  switch(st.type){
  case T_FILE:
    for (i = 0; i < decal; i++)
      printf(1, " ");
    printf(1, "%s %d %d %d\n", fmtname(path), st.type, st.ino, st.size);
    break;

  case T_DIR:
    if(strlen(path) + 1 + DIRSIZ + 1 > sizeof buf){
      printf(1, "tree: path too long\n");
      break;
    }
    strcpy(buf, path);
    p = buf+strlen(buf);
    *p++ = '/';
    while(read(fd, &de, sizeof(de)) == sizeof(de)){
      if(de.inum == 0)
        continue;
      memmove(p, de.name, DIRSIZ);
      p[DIRSIZ] = 0;
      if(stat(buf, &st) < 0){
        printf(1, "tree: cannot stat %s\n", buf);
        continue;
      }
      for (i = 0; i < decal; i++)
        printf(1, " ");
      printf(1, "%s %d %d %d\n", fmtname(buf), st.type, st.ino, st.size);
      if (skip)
        skip--;                  // do not recurse into "." and ".."
      else if (st.type == T_DIR)
        ls(buf, decal+1);        // descend only into directories
    }
    break;
  }
  close(fd);
}
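For comparison, the same recursion can be tried on an ordinary POSIX host before porting it. This is only a sketch and it is not xv6 code: opendir, readdir and lstat do not exist in xv6 user space, where directory entries have to be read as struct dirent records with read(), as in the answer. The function name tree and its return value (the number of entries printed) are my own choices for illustration.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hosted POSIX analogue of the recursive ls (NOT xv6 code).
 * Prints every entry below path, indented by depth, and returns the
 * number of entries printed, or -1 if path cannot be opened as a directory. */
int tree(const char *path, int depth, FILE *out)
{
    DIR *d = opendir(path);
    if (d == NULL)
        return -1;

    int count = 0;
    struct dirent *de;
    while ((de = readdir(d)) != NULL) {
        /* Skip "." and ".." by name rather than by position. */
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        char child[4096];
        int n = snprintf(child, sizeof child, "%s/%s", path, de->d_name);
        if (n < 0 || (size_t)n >= sizeof child)
            continue;                    /* path too long: skip this entry */

        fprintf(out, "%*s|_ %s\n", depth * 2, "", de->d_name);
        count++;

        struct stat st;
        /* lstat, so symbolic links are listed but never followed. */
        if (lstat(child, &st) == 0 && S_ISDIR(st.st_mode)) {
            int sub = tree(child, depth + 1, out);
            if (sub > 0)
                count += sub;
        }
    }
    closedir(d);
    return count;
}
```

Calling tree(".", 0, stdout) prints the current directory. Matching "." and ".." by name avoids relying on them being the first two entries, which the skip counter in the xv6 version assumes.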

Using of MPI Barrier lead to fatal error

My simple MPI program behaves strangely. I spent time trying to find an answer myself, but I couldn't. I read some questions here, like OpenMPI MPI_Barrier problems, MPI_SEND stops working after MPI_BARRIER, and Using MPI_Bcast for MPI communication. I also read the MPI tutorial on mpitutorial.
My program just modifies an array that was broadcast from the root process, then gathers the modified arrays into one array and prints them.
The problem is that when I use the code listed below with MPI_Barrier(MPI_COMM_WORLD) uncommented, I get an error.
#include "mpi/mpi.h"
#define N 4

void transform_row(int* row, const int k) {
  for (int i = 0; i < N; ++i) {
    row[i] *= k;
  }
}

const int root = 0;

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rank, ranksize;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &ranksize);

  if (rank == root) {
    int* arr = new int[N];
    for (int i = 0; i < N; ++i) {
      arr[i] = i * i + 1;
    }
    MPI_Bcast(arr, N, MPI_INT, root, MPI_COMM_WORLD);
  }

  int* arr = new int[N];
  MPI_Bcast(arr, N, MPI_INT, root, MPI_COMM_WORLD);
  transform_row(arr, rank * 100);

  int* transformed = new int[N * ranksize];
  MPI_Gather(arr, N, MPI_INT, transformed, N, MPI_INT, root, MPI_COMM_WORLD);

  if (rank == root) {
    for (int i = 0; i < ranksize; ++i) {
      for (int j = 0; j < N ; j++) {
        printf("%i ", transformed[i * N + j]);
      }
    }
  }
  return 0;
}
The error occurs when the number of processes is > 1. The error:
Fatal error in PMPI_Barrier: Message truncated, error stack:
PMPI_Barrier(425)...................: MPI_Barrier(MPI_COMM_WORLD) failed
MPIR_Barrier_impl(332)..............: Failure during collective
MPIDI_CH3U_Request_unpack_uebuf(568): Message truncated; 16 bytes received but buffer size is 1
I understand that there is some problem with a buffer, but attaching a big buffer to MPI with MPI_Buffer_attach doesn't help.
It seems I need to increase this buffer, but I don't know how to do this.
XXXXXX#XXXXXXXXX:~/test_mpi$ mpirun --version
HYDRA build details:
Version: 3.2
Release Date: Wed Nov 11 22:06:48 CST 2015
So help me please.
One issue is MPI_Bcast() is invoked twice by the root rank, but only once by the other ranks. And then root rank uses an uninitialized arr.
MPI_Barrier() might only hide the problem, but it cannot fix it.
Also, note that if N is "large enough", then the second MPI_Bcast() invoked by root rank will likely hang.
Here is how you can revamp the init/broadcast phase to fix these issues.
int* arr = new int[N];
if (rank == root) {
    for (int i = 0; i < N; ++i) {
        arr[i] = i * i + 1;
    }
}
MPI_Bcast(arr, N, MPI_INT, root, MPI_COMM_WORLD);
Note that in this case you can simply initialize arr on all the ranks, so you do not even need to broadcast the array.
As a side note, an MPI program typically uses
#include <mpi.h>
and is then compiled and linked with mpicc (or mpicxx for C++ code such as this)
(this is a wrapper that invokes the real compiler after setting the include/library paths and adding the MPI libs)

Creating multiple child processes with a single pipe

I need to create three child processes, each of which reads a string from the command line arguments and writes the string to a single pipe. The parent would then read the strings from the pipe and display all three of them on the screen. I tried doing it for two processes to test and it is printing one of the strings twice as opposed to both of them.
#include <stdio.h>
#include <unistd.h>
int main (int argc, char *argv[]) {
char *character1 = argv[1];
char *character2 = argv[2];
char inbuf[100]; //creating an array with a max size of 100
int p[2]; // Pipe descriptor array
pid_t pid1; // defining pid1 of type pid_t
pid_t pid2; // defining pid2 of type pid_t
if (pipe(p) == -1) {
fprintf(stderr, "Pipe Failed"); // pipe fail
pid1 = fork(); // fork
if (pid1 < 0) {
fprintf(stderr, "Fork Failed"); // fork fail
else if (pid1 == 0){ // if child process 1
close(p[0]); // close the read end
write(p[1], character1, sizeof(&inbuf[0])); // write character 1 to the pipe
else { // if parent, create a second child process, child process 2
pid2 = fork();
if (pid2 < 0) {
fprintf(stderr, "Fork Failed"); // fork fail
if (pid2 = 0) { // if child process 2
close(p[0]); // close the read end
write(p[1], character2, sizeof(&inbuf[0])); // write character 2 to the pipe
else { // if parent process
close(p[1]); // close the write end
read(p[0], inbuf, sizeof(&inbuf[0])); // Read the pipe that both children write to
printf("%s\n", inbuf); // print
read(p[0], inbuf, sizeof(&inbuf[0])); // Read the pipe that both children write to
printf("%s\n", inbuf); // print
Your code doesn't keep looping until there's no more data to read. It does a single read. It also doesn't check the value returned by read(), but it should.
I've abstracted the fork() and write() (and error check) code into a function. This seems to work:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
static void child(int fd, const char *string)
{
    pid_t pid = fork();
    int len = strlen(string);
    if (pid < 0)
    {
        fprintf(stderr, "%.5d: failed to fork (%d: %s)\n",
                (int)getpid(), errno, strerror(errno));
        exit(1);
    }
    else if (pid > 0)
        return;     // parent: go back and create the next child
    else if (write(fd, string, len) != len)
    {
        fprintf(stderr, "%.5d: failed to write on pipe %d (%d: %s)\n",
                (int)getpid(), fd, errno, strerror(errno));
        exit(1);
    }
    else
        exit(0);    // child: done after one successful write
}

int main (int argc, char *argv[])
{
    char inbuf[100]; //creating an array with a max size of 100
    int p[2]; // Pipe descriptor array

    if (argc != 4)
    {
        fprintf(stderr, "Usage: %s str1 str2 str3\n", argv[0]);
        return 1;
    }
    if (pipe(p) == -1)
    {
        fprintf(stderr, "Pipe Failed"); // pipe fail
        return 1;
    }
    for (int i = 0; i < 3; i++)
        child(p[1], argv[i+1]);

    int nbytes;
    close(p[1]); // close the write end
    while ((nbytes = read(p[0], inbuf, sizeof(inbuf))) > 0)
        printf("%.*s\n", nbytes, inbuf); // print
    return 0;
}
I ran the command multiple times, each time using the command line:
./p3 'message 1' 'the second message' 'a third message for the third process'
On one run, the output was:
the second messagemessage 1
a third message for the third process
On another, I got:
the second messagemessage 1a third message for the third process
And on another, I got:
message 1
the second messagea third message for the third process
(This is on a MacBook Pro with Intel Core i7, running Mac OS X 10.8.3, and using GCC 4.7.1.)
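The runs above show the messages from different children merging, because a pipe is a byte stream with no message boundaries. One way to let the parent separate them is to prefix each message with its length. The sketch below assumes every framed message fits in 512 bytes, the smallest PIPE_BUF that POSIX allows, so that a single write() is not interleaved with writes from the other children. The names write_msg, read_full and read_msg are hypothetical helpers, not part of the code above.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAME 512   /* assumed upper bound; POSIX guarantees PIPE_BUF >= 512 */

/* Send one message as [4-byte length][payload] in a single write(). */
int write_msg(int fd, const char *s)
{
    unsigned char frame[MAX_FRAME];
    size_t n = strlen(s);
    uint32_t len = (uint32_t)n;

    if (n + sizeof len > sizeof frame)
        return -1;                        /* too large to send atomically */
    memcpy(frame, &len, sizeof len);
    memcpy(frame + sizeof len, s, n);
    ssize_t total = (ssize_t)(sizeof len + n);
    return write(fd, frame, (size_t)total) == total ? 0 : -1;
}

/* Read exactly n bytes, looping over short reads; -1 on EOF or error. */
int read_full(int fd, void *buf, size_t n)
{
    unsigned char *p = buf;
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0)
            return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Receive one message into out (NUL-terminated).
 * Returns the payload length, or -1 on EOF, error, or if it does not fit. */
int read_msg(int fd, char *out, size_t cap)
{
    uint32_t len;
    if (read_full(fd, &len, sizeof len) != 0)
        return -1;
    if (len >= cap)
        return -1;
    if (read_full(fd, out, len) != 0)
        return -1;
    out[len] = '\0';
    return (int)len;
}
```

In the program above, each child would call write_msg(fd, string) instead of write(), and the parent would loop on read_msg(p[0], inbuf, sizeof(inbuf)) until it returns -1. The order in which the messages arrive still depends on scheduling; only the boundaries are preserved.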

MPI hangs on MPI_Send for large messages

Here is a simple C++ / MPI (mpich2) program that sends an array of type double. If the size of the array is more than 9000, my program hangs during the call to MPI_Send. If the array is smaller than 9000 (8000, for example), the program works fine. The source code is below:
using namespace std;
Cube** cubes;
int cubesLen;
double* InitVector(int N) {
double* x = new double[N];
for (int i = 0; i < N; i++) {
x[i] = i + 1;
return x;
void CreateCubes() {
cubes = new Cube*[12];
cubesLen = 12;
for (int i = 0; i < 12; i++) {
cubes[i] = new Cube(9000);
void SendSimpleData(int size, int rank) {
Cube* cube = cubes[0];
int nodeDest = rank + 1;
if (nodeDest > size - 1) {
nodeDest = 1;
double* coefImOut = (double *) malloc(sizeof (double)*cube->coefficentsImLength);
cout << "Before send" << endl;
int count = cube->coefficentsImLength;
MPI_Send(coefImOut, count, MPI_DOUBLE, nodeDest, 0, MPI_COMM_WORLD);
cout << "After send" << endl;
MPI_Status status;
double *coefIm = (double *) malloc(sizeof(double)*count);
int nodeFrom = rank - 1;
if (nodeFrom < 1) {
nodeFrom = size - 1;
MPI_Recv(coefIm, count, MPI_DOUBLE, nodeFrom, 0, MPI_COMM_WORLD, &status);
int main(int argc, char *argv[]) {
int size, rank;
const int root = 0;
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (rank != root) {
SendSimpleData(size, rank);
return 0;
class Cube
class Cube {
Cube(int size);
Cube(const Cube& orig);
virtual ~Cube();
int Id() { return id; }
void Id(int id) { this->id = id; }
int coefficentsImLength;
double* coefficentsIm;
int id;
Cube::Cube(int size) {
this->coefficentsImLength = size;
coefficentsIm = new double[size];
for (int i = 0; i < size; i++) {
coefficentsIm[i] = 1;
Cube::Cube(const Cube& orig) {
Cube::~Cube() {
delete[] coefficentsIm;
The program runs on 4 processes:
mpiexec -n 4 ./myApp1
Any ideas?
The details of the Cube class aren't relevant here: consider a simpler version
#include <mpi.h>
#include <cstdlib>
#include <iostream>
using namespace std;

int main(int argc, char *argv[]) {
    int size, rank;
    const int root = 0;
    int datasize = atoi(argv[1]);

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank != root) {
        // Ring over ranks 1..size-1; the root does not take part.
        int nodeDest = rank + 1;
        if (nodeDest > size - 1) {
            nodeDest = 1;
        }
        int nodeFrom = rank - 1;
        if (nodeFrom < 1) {
            nodeFrom = size - 1;
        }

        MPI_Status status;
        int *data = new int[datasize];
        for (int i=0; i<datasize; i++)
            data[i] = rank;

        cout << "Before send" << endl;
        // Every worker sends first, so nobody has posted a receive yet.
        MPI_Send(data, datasize, MPI_INT, nodeDest, 0, MPI_COMM_WORLD);
        cout << "After send" << endl;
        MPI_Recv(data, datasize, MPI_INT, nodeFrom, 0, MPI_COMM_WORLD, &status);

        delete [] data;
    }

    MPI_Finalize();
    return 0;
}
where running gives
$ mpirun -np 4 ./send 1
Before send
After send
Before send
After send
Before send
After send
$ mpirun -np 4 ./send 65000
Before send
Before send
Before send
If in DDT you looked at the message queue window, you'd see everyone is sending, and no one is receiving, and you have a classic deadlock.
MPI_Send's semantics, weirdly, aren't well defined, but it is allowed to block until "the receive has been posted". MPI_Ssend is clearer in this regard; it will always block until the receive has been posted. Details about the different send modes can be seen here.
The reason it worked for smaller messages is an accident of the implementation; for "small enough" messages (for your case, it looks to be <64kB), your MPI_Send implementation uses an "eager send" protocol and doesn't block on the receive; for larger messages, where it isn't necessarily safe just to keep buffered copies of the message kicking around in memory, the Send waits for the matching receive (which it is always allowed to do anyway).
There are a few things you could do to avoid this; all you have to do is make sure not everyone is calling a blocking MPI_Send at the same time. You could (say) have even-ranked processes send first, then receive, and odd-ranked processes receive first, then send. You could use nonblocking communications (Isend/Irecv/Waitall). But the simplest solution in this case is to use MPI_Sendrecv, which is a blocking (Send + Recv), rather than a blocking send plus a blocking receive. The send and receive will execute concurrently, and the function will block until both are complete. So this works:
#include <mpi.h>
#include <cstdlib>
#include <iostream>
using namespace std;

int main(int argc, char *argv[]) {
    int size, rank;
    const int root = 0;
    int datasize = atoi(argv[1]);

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank != root) {
        // Ring over ranks 1..size-1; the root does not take part.
        int nodeDest = rank + 1;
        if (nodeDest > size - 1) {
            nodeDest = 1;
        }
        int nodeFrom = rank - 1;
        if (nodeFrom < 1) {
            nodeFrom = size - 1;
        }

        MPI_Status status;
        int *outdata = new int[datasize];
        int *indata = new int[datasize];
        for (int i=0; i<datasize; i++)
            outdata[i] = rank;

        cout << "Before sendrecv" << endl;
        // Send and receive are posted together, so no rank waits on another's send.
        MPI_Sendrecv(outdata, datasize, MPI_INT, nodeDest, 0,
                     indata, datasize, MPI_INT, nodeFrom, 0, MPI_COMM_WORLD, &status);
        cout << "After sendrecv" << endl;

        delete [] outdata;
        delete [] indata;
    }

    MPI_Finalize();
    return 0;
}
Running gives
$ mpirun -np 4 ./send 65000
Before sendrecv
Before sendrecv
Before sendrecv
After sendrecv
After sendrecv
After sendrecv
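If you want to try the even/odd ordering instead of MPI_Sendrecv, the neighbour arithmetic and the parity rule can be isolated in plain C with no MPI calls, which makes them easy to check on their own. This is only a sketch; ring_dest, ring_from and sends_first are hypothetical names I made up, and the MPI_Send / MPI_Recv calls themselves are left out.

```c
#include <stdbool.h>

/* Ring over the worker ranks 1..size-1 (rank 0, the root, stays idle),
 * matching the nodeDest / nodeFrom arithmetic in the programs above. */
int ring_dest(int rank, int size)
{
    return (rank + 1 > size - 1) ? 1 : rank + 1;
}

int ring_from(int rank, int size)
{
    return (rank - 1 < 1) ? size - 1 : rank - 1;
}

/* Parity rule: even ranks send first, odd ranks receive first. */
bool sends_first(int rank)
{
    return rank % 2 == 0;
}
```

Each worker would then do Send followed by Recv when sends_first(rank) is true, and Recv followed by Send otherwise. With 4 processes, rank 2 sends to rank 3 while ranks 1 and 3 are already waiting in their receives. Note that with a single worker (size 2) the rank is its own neighbour, so the parity rule alone does not help there.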

Segmentation fault while using MPI_File_open

I'm trying to read from a file in an MPI application. The cluster has 4 nodes with 12 cores in each node. I have tried running a basic program to compute rank and that works. When I added MPI_File_open, it throws an exception at runtime.
The cluster has MPICH2 installed and uses a Network File System. I tried MPI_File_open with different parameters, like read-only mode, MPI_COMM_WORLD, etc.
Can I use MPI_File_open with a Network File System?
int main(int argc, char* argv[])
int myrank = 0;
int nprocs = 0;
int i = 0;
MPI_Comm icomm = MPI_COMM_WORLD;
MPI_Status status;
MPI_Info info;
MPI_File *fh = NULL;
int error = 0;
MPI_Init(&argc, &argv);
MPI_Barrier(MPI_COMM_WORLD); // Wait for all processor to start
MPI_Comm_size(MPI_COMM_WORLD, &nprocs); // Get number of processes
MPI_Comm_rank(MPI_COMM_WORLD, &myrank); // Get own rank
if ( myrank == 1 || myrank == 0 )
printf("Hello from %d\r\n", myrank);
if (myrank == 0)
error = MPI_File_open( MPI_COMM_SELF, "lw1.wei", MPI_MODE_UNIQUE_OPEN,
if ( error )
printf("Error in opening file\r\n");
printf("File successfully opened\r\n");
MPI_Barrier(MPI_COMM_WORLD); //! Wait for all the processors to end
if ( myrank == 0 )
printf("Number of Processes %d\n\r", nprocs);
return 0;
You forgot to allocate an MPI_File object before opening the file. You may either change this line:
MPI_File *fh = NULL;
to
MPI_File fh;
and open the file by passing fh's address to MPI_File_open(..., &fh), or you may allocate the memory on the heap using malloc() and pass fh itself:
MPI_File *fh = malloc(sizeof(MPI_File));
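To see why the NULL pointer crashes, it may help to look at the pattern without MPI. The sketch below uses hypothetical stand-ins (demo_handle, demo_open, demo_is_open) for an opaque handle type such as MPI_File and for an open routine that returns the handle through an out-parameter. It is not how any MPI library is actually implemented.

```c
#include <stddef.h>

/* Hypothetical stand-in for an opaque handle type such as MPI_File. */
struct demo_file { int is_open; };
typedef struct demo_file *demo_handle;

static struct demo_file the_file;   /* storage owned by the "library" */

/* The open routine writes the new handle into storage supplied by the caller. */
int demo_open(demo_handle *out)
{
    if (out == NULL)
        return -1;          /* a real library may simply crash here instead */
    the_file.is_open = 1;
    *out = &the_file;
    return 0;
}

int demo_is_open(demo_handle h)
{
    return h != NULL && h->is_open;
}
```

The caller declares demo_handle fh; and passes &fh, so the routine has somewhere to store the handle. Passing a NULL pointer, as in the question, gives it nowhere to write.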
