Can I link two separate executables with MPI_Open_port and share port information in a text file?

I'm trying to create a shared MPI COMM between two executables which are both started independently, e.g.
mpiexec -n 1 ./exe1
mpiexec -n 1 ./exe2
I use MPI_Open_port to generate port details and write these to a file in exe1 and then read with exe2. This is followed by MPI_Comm_connect/MPI_Comm_accept and then send/recv communication (minimal example below).
My question is: can we write the port information to a file in this way, or is MPI_Publish_name/MPI_Lookup_name required for this to work, as in this, this and this? Since supercomputers usually share a file system, the file-based approach seems simpler and may avoid having to run a name server.
It seems this should work according to the MPI_Open_port documentation in the MPI 3.1 standard,
port_name is essentially a network address. It is unique within the communication universe to which it belongs (determined by the implementation), and may be used by any client within that communication universe. For instance, if it is an internet (host:port) address, it will be unique on the internet. If it is a low level switch address on an IBM SP, it will be unique to that SP
In addition, according to documentation on the MPI forum:
The following should be compatible with MPI: The server prints out an address to the terminal, the user gives this address to the client program.
MPI does not require a nameserver
A port_name is a system-supplied string that encodes a low-level network address at which a server can be contacted.
By itself, the port_name mechanism is completely portable ...
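For reference, the publish/lookup alternative I am asking about would look roughly like the sketch below. The service name "exe_link" is an arbitrary choice of mine, and under Open MPI this route apparently still needs a reachable name server (e.g. ompi-server), so it does not obviously remove the problem:
/* Sketch of the MPI_Publish_name / MPI_Lookup_name alternative.
 * Run with an argument of 0 (publishes the port) or 1 (looks it up). */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm inter;
    int data = 0;
    int runno = atoi(argv[1]);

    MPI_Init(&argc, &argv);
    if (runno == 0)
    {
        MPI_Open_port(MPI_INFO_NULL, port);
        MPI_Publish_name("exe_link", MPI_INFO_NULL, port);   // register name -> port
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
        data = 5;
        MPI_Send(&data, 1, MPI_INT, 0, 0, inter);
        MPI_Unpublish_name("exe_link", MPI_INFO_NULL, port);
        MPI_Close_port(port);
    }
    else
    {
        MPI_Lookup_name("exe_link", MPI_INFO_NULL, port);     // resolve name -> port
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
        MPI_Recv(&data, 1, MPI_INT, 0, 0, inter, MPI_STATUS_IGNORE);
        printf("received %d\n", data);
    }
    MPI_Comm_disconnect(&inter);
    MPI_Finalize();
    return 0;
}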
Writing the port information to a file does work as expected with MPICH (3.2), i.e. it creates a shared communicator and exchanges information, but it hangs at the MPI_Comm_connect/MPI_Comm_accept call with OpenMPI versions 2.0.1 and 4.0.1 (on my local workstation running Ubuntu 12.04, although it eventually needs to work on a tier 1 supercomputer). I have raised this as an issue here but would welcome a solution or workaround in the meantime.
Further Information
If I use the MPMD mode with OpenMPI,
mpiexec -n 1 ./exe1 : -n 1 ./exe2
this works correctly, so it must be an issue with allowing the two jobs to share ompi_global_scope, as in this question. I've also tried adding,
MPI_Info info;
MPI_Info_create(&info);
MPI_Info_set(info, "ompi_global_scope", "true");
with info passed to all the relevant calls, with no success. I'm not running a server/client model, as both codes run simultaneously, so sharing a URL/PID from one of them is not ideal; in any case I cannot get this to work even with the suggested approach, which for OpenMPI 2.0.1,
mpirun -n 1 --report-pid + ./OpenMPI_2.0.1 0
1234
mpirun -n 1 --ompi-server pid:1234 ./OpenMPI_2.0.1 1
gives,
ORTE_ERROR_LOG: Bad parameter in file base/rml_base_contact.c at line 161
This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
pmix server init failed
--> Returned value Bad parameter (-5) instead of ORTE_SUCCESS
and with OpenMPI 4.0.1,
mpirun -n 1 --report-pid + ./OpenMPI_4.0.1 0
1234
mpirun -n 1 --ompi-server pid:1234 ./OpenMPI_4.0.1 1
gives,
ORTE_ERROR_LOG: Bad parameter in file base/rml_base_contact.c at line 50
...
A publish/lookup server was provided, but we were unable to connect
to it - please check the connection info and ensure the server
is alive:
Using 4.0.1 means the error should not be related to this bug in OpenMPI.
Minimal code
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <iostream>
#include <fstream>
using namespace std;
int main( int argc, char *argv[] )
{
    int num_errors = 0;
    int rank, size;
    char port1[MPI_MAX_PORT_NAME];
    char port2[MPI_MAX_PORT_NAME];
    MPI_Status status;
    MPI_Comm comm1, comm2;
    int data = 0;

    char *ptr;
    int runno = strtol(argv[1], &ptr, 10);
    for (int i = 0; i < argc; ++i)
        printf("inputs %d %d %s \n", i, runno, argv[i]);

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (runno == 0)
    {
        printf("0: opening ports.\n"); fflush(stdout);
        MPI_Open_port(MPI_INFO_NULL, port1);
        printf("opened port1: <%s>\n", port1);

        //Write port file
        ofstream myfile;
        myfile.open("port");
        if( !myfile )
            cout << "Opening file failed" << endl;
        myfile << port1 << endl;
        if( !myfile )
            cout << "Write failed" << endl;
        myfile.close();
        printf("Port %s written to file \n", port1); fflush(stdout);

        printf("Attempt to accept port1.\n"); fflush(stdout);
        //Establish connection and send data
        MPI_Comm_accept(port1, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &comm1);
        printf("sending 5 \n"); fflush(stdout);
        data = 5;
        MPI_Send(&data, 1, MPI_INT, 0, 0, comm1);
        MPI_Close_port(port1);
    }
    else if (runno == 1)
    {
        //Read port file
        size_t chars_read = 0;
        ifstream myfile;
        //Wait until the file exists and is available
        myfile.open("port");
        while( !myfile ){
            myfile.clear();              //reset the stream state before retrying
            myfile.open("port");
            cout << "Opening file failed" << endl;
            usleep(30000);
        }
        while( myfile && chars_read < MPI_MAX_PORT_NAME - 1 ) {
            myfile >> port1[ chars_read ];
            if( myfile )
                ++chars_read;
            if( chars_read > 0 && port1[ chars_read - 1 ] == '\n' )
                break;
        }
        port1[ chars_read ] = '\0';      //ensure the port string is null-terminated
        printf("Reading port %s from file \n", port1); fflush(stdout);
        remove( "port" );

        //Establish connection and receive data
        MPI_Comm_connect(port1, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &comm1);
        MPI_Recv(&data, 1, MPI_INT, 0, 0, comm1, &status);
        printf("Received %d 1\n", data); fflush(stdout);
    }

    //Barrier on intercomm before disconnecting
    MPI_Barrier(comm1);
    MPI_Comm_disconnect(&comm1);
    MPI_Finalize();
    return 0;
}
The argument 0 or 1 simply selects whether this code writes the port file or reads it. This is then run with,
mpiexec -n 1 ./a.out 0
mpiexec -n 1 ./a.out 1

Related

printf alternative when using "define _GNU_SOURCE"

After reading https://www.quora.com/How-can-I-bypass-the-OS-buffering-during-I-O-in-Linux I want to try to access data on the serial port with the O_DIRECT option. The only way I can seem to do that is by adding the _GNU_SOURCE define, but when I execute the program nothing at all is printed on the screen.
If I remove "#define _GNU_SOURCE" and compile, then the system gives me an error on O_DIRECT.
If I remove the define and the O_DIRECT flag, then incorrect (possibly outdated) data is always read, but the data is printed on the screen.
I still want to use the O_DIRECT flag and be able to see the data, so I feel I need an alternative command to printf and friends, but I don't know how to continue.
I attached the code below:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>
#include <termios.h>
#define TIMEOUT 5
int main(){
    char inb[3];    //our byte buffer
    int nread=0;    //number bytes read from port
    int n;          //counter
    int iosz=128;   //Lets get 128 bytes
    int fd=open("/dev/ttyS0", O_NOCTTY | O_RDONLY | O_SYNC | O_DIRECT); //Open port
    tcflush(fd,TCIOFLUSH);
    for(n=0;n<iosz;n++){
        int s=time(NULL); //Start timer for 5 seconds
        while (time(NULL)-s < TIMEOUT && nread < 1){
            inb[0]='A'; //Fill buffer with bad data
            inb[1]='B';
            inb[2]='C';
            nread=read(fd,(char*)inb,1); //Read ONE byte
            tcflush(fd,TCIOFLUSH);
            if (nread < 0 || time(NULL)-s >= TIMEOUT){
                close(fd); //Exit if read error or timeout
                return -1;
            }
        }
        printf("%x:%d ",inb[0] & 0xFF,nread); //Print byte as we receive it
    }
    close(fd); //program ends so close and exit
    printf("\n");
    return 0;
}
First off, I'm no expert on this topic, just curious about it, so take this answer with a pinch of salt.
I don't know whether what you're trying to do here (if I'm not looking at it the wrong way, it seems to be to bypass the kernel and read directly from the port into userspace) was ever a possibility; you can find some examples, like this one, but I could not find anything properly documented. However, with recent kernels you should be getting an error when running your code, and you're simply not catching it.
If you add these lines after declaring your port:
...
int fd = open("/dev/ttyS0", O_NOCTTY | O_RDONLY | O_SYNC | O_DIRECT);
if (fd == -1) {
    fprintf(stderr, "Error %d opening SERIALPORT : %s\n", errno, strerror(errno));
    return 1;
}
tcflush(fd, TCIOFLUSH);
...
When you try to run it you'll get: Error 22 opening SERIALPORT : Invalid argument
In my humble and limited understanding, you should be able to get the same effect by changing the termios settings to raw mode; something like this should do:
struct termios t;
tcgetattr(fd, &t); /* get current port state */
cfmakeraw(&t); /* set port state to raw */
tcsetattr(fd, TCSAFLUSH, &t); /* set updated port state */
There are many good sources on termios, but the only place I could find that also refers to O_DIRECT (for files) is this one.
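Putting the pieces together, here is an untested sketch of what I mean: drop O_DIRECT, switch the port to raw mode, and then read single bytes as before (the VMIN/VTIME values are just one reasonable choice):
/* Untested sketch: open the port without O_DIRECT, put it in raw mode,
 * then read bytes one at a time and print them. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <termios.h>

int main(void)
{
    int fd = open("/dev/ttyS0", O_NOCTTY | O_RDONLY | O_SYNC);
    if (fd == -1) {
        fprintf(stderr, "Error %d opening port: %s\n", errno, strerror(errno));
        return 1;
    }

    struct termios t;
    tcgetattr(fd, &t);                /* current settings */
    cfmakeraw(&t);                    /* raw mode: no echo, no line discipline */
    t.c_cc[VMIN]  = 1;                /* block until at least one byte arrives */
    t.c_cc[VTIME] = 0;
    tcsetattr(fd, TCSAFLUSH, &t);

    unsigned char b;
    for (int n = 0; n < 128; n++) {   /* read 128 bytes, one at a time */
        if (read(fd, &b, 1) != 1)
            break;
        printf("%x ", b);
        fflush(stdout);               /* don't rely on line buffering */
    }
    printf("\n");
    close(fd);
    return 0;
}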

How can I run multiple threads inside of a given MPI process?

I understand that a single MPI job launches many processes which could be run on multiple nodes.
How do I run multiple threads inside of a given MPI process using MPI_THREAD_MULTIPLE?
I was unable to find enough information in relation to the topic.
Assuming you're using OpenMP to run multiple threads:
You write the OpenMP code as you would without MPI (this statement is oversimplified).
When MPI comes in, you need to consider how your processes will communicate. MPI does not send messages to individual threads but to individual processes. For that reason MPI provides four levels of thread support:
MPI_THREAD_SINGLE: Only one thread will execute.
MPI_THREAD_FUNNELED: Can provide many threads, but only the master thread can make MPI calls. The master thread is the one that calls MPI_Init_thread.
MPI_THREAD_SERIALIZED: Can provide many threads, but only one can make MPI calls at a time.
MPI_THREAD_MULTIPLE: Can provide many threads, and all of them can make MPI calls at any time.
You need to request the level you want when initializing MPI, so MPI_Init becomes:
MPI_Init_thread(&argc, &argv, REQUIRED_LEVEL, &provided)
Ex:
MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided)
In the provided argument MPI_Init_thread returns the level that was actually granted; make sure it is one your code can cope with.
Also, avoid using MPI_Probe and MPI_Iprobe, because they are not thread safe; use MPI_Mprobe and MPI_Improbe instead.
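For example, a matched probe and receive pair looks roughly like this (a minimal sketch, run with two processes; the tag, count and communicator are arbitrary choices):
/* Sketch: thread-safe receive using MPI_Mprobe/MPI_Mrecv instead of
 * MPI_Probe/MPI_Recv.  Run with two processes. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int provided, rank, count;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int payload[4] = {1, 2, 3, 4};
        MPI_Send(payload, 4, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Message msg;
        MPI_Status  status;
        /* Mprobe removes the message from the matching queue, so another
         * thread cannot steal it between the probe and the receive. */
        MPI_Mprobe(0, 0, MPI_COMM_WORLD, &msg, &status);
        MPI_Get_count(&status, MPI_INT, &count);
        int *buf = malloc(count * sizeof(int));
        MPI_Mrecv(buf, count, MPI_INT, &msg, &status);
        printf("rank 1 received %d ints\n", count);
        free(buf);
    }
    MPI_Finalize();
    return 0;
}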
Here is a simple 'hello world' example as @ab2050 asked:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <omp.h>
#include "mpi.h"
int main(int argc, char *argv[]) {
    int provided;
    int rank;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    if (provided != MPI_THREAD_FUNNELED) {
        fprintf(stderr, "Warning MPI did not provide MPI_THREAD_FUNNELED\n");
    }
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    #pragma omp parallel default(none), \
        shared(rank), \
        shared(ompi_mpi_comm_world), \
        shared(ompi_mpi_int), \
        shared(ompi_mpi_char)
    {
        printf("Hello from thread %d at rank %d parallel region\n",
               omp_get_thread_num(), rank);

        #pragma omp master
        {
            char helloWorld[12];
            if (rank == 0) {
                strcpy(helloWorld, "Hello World");
                MPI_Send(helloWorld, 12, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                printf("Rank %d send: %s\n", rank, helloWorld);
            }
            else {
                MPI_Recv(helloWorld, 12, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                printf("Rank %d received: %s\n", rank, helloWorld);
            }
        }
    }
    MPI_Finalize();
    return 0;
}
You have to run this code with two processes. Because MPI_THREAD_FUNNELED is selected, only the master thread makes MPI calls.
The following variables are listed in the OpenMP data-scoping clauses because gcc 6.1.1 requires it; older versions such as 4.8 do not require declaring them:
ompi_mpi_comm_world
ompi_mpi_int
ompi_mpi_char

Why does reading from STDOUT work?

I encountered a curious case where reading from STDOUT works when running a program in a terminal. The question is: why, and how? Let's start with the code:
#include <QCoreApplication>
#include <QSocketNotifier>
#include <QDebug>
#include <QByteArray>
#include <unistd.h>
#include <errno.h>
int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    const int fd_arg = (a.arguments().length()>=2) ? a.arguments().at(1).toInt() : 0;
    qDebug() << "reading from fd" << fd_arg;
    QSocketNotifier n(fd_arg, QSocketNotifier::Read);
    QObject::connect(&n, &QSocketNotifier::activated, [](int fd) {
        char buf[1024];
        auto len = ::read(fd, buf, sizeof buf);
        if (len < 0) { qDebug() << "fd" << fd << "read error:" << errno; qApp->quit(); }
        else if (len == 0) { qDebug() << "fd" << fd << "done"; qApp->quit(); }
        else {
            QByteArray data(buf, len);
            qDebug() << "fd" << fd << "data" << data.length() << data.trimmed();
        }
    });
    return a.exec();
}
Here's the qmake .pro file for convenience, if someone wants to test the above code:
QT += core
QT -= gui
CONFIG += c++11
TARGET = stdoutreadtest
CONFIG += console
CONFIG -= app_bundle
TEMPLATE = app
SOURCES += main.cpp
And here's the output of four executions:
$ ./stdoutreadtest 0 # input from keyboard, ^D ends, works as expected
reading from fd 0
typtyptyp
fd 0 data 10 "typtyptyp"
fd 0 done
$ echo pipe | ./stdoutreadtest 0 # input from pipe, works as expected
reading from fd 0
fd 0 data 5 "pipe"
fd 0 done
$ ./stdoutreadtest 1 # input from keyboard, ^D ends, works!?
reading from fd 1
typtyp
fd 1 data 7 "typtyp"
fd 1 done
$ echo pipe | ./stdoutreadtest 1 # input from pipe, still reads keyboard!?
reading from fd 1
typtyp
fd 1 data 7 "typtyp"
fd 1 done
So the question is: what is going on? Why do the last two runs above actually read what is typed on the terminal?
I also tried looking at QSocketNotifier sources here leading to here, but didn't really gain any insight.
There is no difference between fds 0, 1 and 2: if they are not redirected, all three point to the terminal; they are strictly IDENTICAL!
Programs usually use 0 for input, 1 for output and 2 for errors, but each of them can be used in other ways.
E.g., for less, ordinary usage is:
prog | less
Now the input of less is redirected to the output of prog, and less cannot read any user input from stdin, so how does less get user input like scroll up/down or page up/down?
less can simply read user input from stdout, which is exactly what it does.
So you can use these fds flexibly once you know how bash handles them.
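Here is a minimal plain-POSIX sketch of the same point, with no Qt involved (the fd number is taken from the command line and defaults to 1); as long as that fd still refers to the terminal, reading from it behaves just like reading from fd 0:
/* Sketch: read from the fd given as argv[1] (default 1) and report what
 * happens.  When the fd refers to the terminal, the read succeeds even
 * though the fd is nominally "stdout". */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    int fd = (argc >= 2) ? atoi(argv[1]) : 1;
    char buf[1024];

    printf("fd %d is %s\n", fd, isatty(fd) ? "a terminal" : "not a terminal");

    ssize_t len = read(fd, buf, sizeof buf);   /* the terminal fd is open read/write */
    if (len < 0)
        perror("read");
    else
        printf("read %zd bytes from fd %d\n", len, fd);
    return 0;
}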

iptables netfilter copying the packet

I was wondering if there is a way to copy a packet using iptables/netfilter, change it and deliver both to the application.
Basically, I want to capture a packet from a flow and redirect it to some queue, then copy it and issue the verdict for it (I know how to do this part in C), then change something in the copied version, AND issue the verdict for that "modified" packet too.
Basically I want the app to receive both the unmodified and the modified version.
Is this possible?
Thanks in advance for any help.
This can be achieved with the libipq library. The tutorial at the following link focuses on copying and modifying a packet in userspace:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.205.2605&rep=rep1&type=pdf
You need to know C to work with it. Alternatively, Scapy, a Python-based packet manipulation tool, can be used.
/*
 * Userspace packet queueing; insert an iptables rule to send packets here:
 * iptables -I [INPUT|OUTPUT|FORWARD] 1 <packet header match> -j QUEUE
 */
#include <linux/netfilter.h>
#include <libipq.h>
#include <stdio.h>
#include <stdlib.h>

#define BUFSIZE 2048
static void die(struct ipq_handle *h)
{
    ipq_destroy_handle(h);
    exit(1);
}

int main(int argc, char **argv)
{
    int status;
    unsigned char buf[BUFSIZE];
    struct ipq_handle *h;

    h = ipq_create_handle(0, NFPROTO_IPV4);
    if (!h)
        die(h);

    status = ipq_set_mode(h, IPQ_COPY_PACKET, BUFSIZE);
    if (status < 0)
        die(h);

    do {
        status = ipq_read(h, buf, BUFSIZE, 0);
        if (status < 0)
            die(h);
        if (ipq_message_type(buf) == IPQM_PACKET) {
            ipq_packet_msg_t *m = ipq_get_packet(buf);
            status = ipq_set_verdict(h, m->packet_id, NF_ACCEPT, 0, NULL);
        }
    } while (1);

    ipq_destroy_handle(h);
    return 0;
}

MPI: Why does my MPICH program fail for a large number of processes?

/* C Example */
#include <mpi.h>
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
int main (int argc, char* argv[])
{
    int rank, size;
    int buffer_length = MPI_MAX_PROCESSOR_NAME;
    char hostname[buffer_length];

    MPI_Init (&argc, &argv);                           /* starts MPI */
    MPI_Comm_rank (MPI_COMM_WORLD, &rank);             /* get current process id */
    MPI_Comm_size (MPI_COMM_WORLD, &size);             /* get number of processes */
    MPI_Get_processor_name(hostname, &buffer_length);  /* get hostname */

    printf( "Hello world from process %d running on %s of %d\n", rank, hostname, size );
    MPI_Finalize();
    return 0;
}
The above program compiles and runs successfully on Ubuntu 12.04 for a small number of processes, but it fails when I try to execute it with thousands of processes. Why is that?
I am expecting that the scheduler would keep the processes in a queue and dispatch them one by one (I am running this code on a single-core machine).
Why does the following error appear for a large number of processes, and how can I resolve this issue?
root@ubuntu:/home# mpiexec -n 1000 ./hello
[proxy:0:0@ubuntu] HYDU_create_process (./utils/launch/launch.c:26): pipe error (Too many open files)
[proxy:0:0@ubuntu] launch_procs (./pm/pmiserv/pmip_cb.c:751): create process returned error
[proxy:0:0@ubuntu] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:935): launch_procs returned error
[proxy:0:0@ubuntu] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:0@ubuntu] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
Killed
You are running into the open-file limit on your system. The default in Ubuntu is 1024. You can try raising the limit for your session with the ulimit command.
ulimit -n 2048
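If you want to check the limit from inside a program, a minimal sketch using getrlimit is shown below (illustrative only; it reports the limits of the calling process, which may differ from those of the mpiexec proxies that actually hit the error):
/* Sketch: print the per-process open-file limits (RLIMIT_NOFILE). */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("open files: soft limit %llu, hard limit %llu\n",
           (unsigned long long)rl.rlim_cur,
           (unsigned long long)rl.rlim_max);
    return 0;
}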
