Query in MPI initialization - mpi

If we call MPI_Init() we know that multiple copies of the same executable run on different machines. Suppose MPI_Init() is in a function f(), then will multiple copies of main() function exist too?
The main problem that I am facing is of taking inputs. In effect, what is happening is that input is being taken once but the main function is running several times. The processor with rank 0 always seems to have the input, rest of them have random values. So to send the values do we have to broadcast the input from processor 0 to all the other processors?

MPI_Init() doesn't create multiple copies, it just initializes in-process MPI library. Multiple copies of your process are created before that, most probably with some kind of mpirun command (that is how you run your MPI application). All processes are independent from the beginning, so answering the first part of your question — yes, multiple copies of main() will exist, and they will exist even if you don't call MPI_Init.
The answer to your question about inputs depends on nature of the inputs: if it's typed in from console, then you have to input the values only in one process (e.g. rank 0) and then broadcast them. If the inputs are in some file or specified as a command-line argument, then all processes can access them.

Related

Using MPI_Gatherv with only a subset of all processes

I want to use MPI_Gatherv to collect data onto a single process. The thing is, I don't need data from all other processes, just a subset of them. The only communicator in the code is MPI_COMM_WORLD. So, does every process in MPI_COMM_WORLD have to call MPI_Gatherv? I have looked at the MPI standard and can't really make out what it is saying. If all processes must make the call, can some of the values in MPI_Gatherv's "recvcounts" array be zero? Or would there be some other way to signal to the root process which processes should be ignored? I guess I could introduce another communicator, but for this particular problem that would be coding overkill.

Is it possible to write with several processors in the same file, at the end of the file, in an ordonated way?

I have 2 processors (this is an example), and I want these 2 processors to write in a file. I want them to write at the end of file, but not in a mixed pattern, like that :
[file content]
proc0
proc1
proc0
proc1
proc0
proc1
(and so on..)
I'd like to make them write following this kind of pattern :
[file content]
proc0
proc0
proc0
proc1
proc1
proc1
(and so on..)
Is it possible? If so, what's the setting to use?
The sequence in which your processes have outputs ready to report is, essentially, unknowable in advance. Even repeated runs of exactly the same MPI program will show differences in the ordering of outputs. So something, somewhere, is going to have to impose an ordering on the writes to the file.
A very common pattern, the one Wesley has already mentioned, is to have all processes send their outputs to one process, often process 0, and let it deal with the writing to file. This master-writer could sort the outputs before writing but this creates a couple of problems: allocating space to store output before writing it and, more difficult to deal with, determining when a collection of output records can be sorted and written to file and the output buffers be reused. How long does the master-writer wait and how does it know if a process is still working ?
So it's common to have the master-writer write outputs as it gets them and for another program to order the output file as desired after the parallel program has finished. You could tack this on to your parallel program as a step after mpi_finalize or you could use a completely separate program (such as sort on a Linux machine). Of course, for this to work each output record has to contain some sequencing information on which to sort.
Another common pattern is to only have one process which does any writing at all, that is, none of the other processes do any output at all. This completely avoids the non-determinism of the sequencing of the writing.
Another pattern, less common partly because it is more difficult to implement and partly because it depends on underlying mechanisms which are not always available, is to use mpi io. With mpi io multiple processes can write to different parts of a file as if simultaneously. To actually write simultaneously the program needs to be executing on hardware, network and operating system which supports parallel i/o. It can be tricky to implement this even with the right platform, and especially when the volume of output from processes is uncertain.
In my experience here on SO people asking question such as yours are probably at too early a stage in their MPI experience to be tackling parallel i/o, even if they have access to the necessary hardware.
I disagree with High Performance Mark. MPI-IO isn't so tricky in 2014 (as long as you have have access to any file system besides NFS -- install PVFS if you need a cheap easy parallel file system).
If you know how much data each process has, you can use MPI_SCAN to efficiently compute how much data was written by "earlier" processes, then use MPI_FILE_WRITE_AT_ALL to carry out the I/O efficiently. Here's one way you might do this:
incr = (count*datatype_size);
MPI_Scan(&incr, &new_offset, 1, MPI_LONG_LONG_INT,
MPI_SUM, MPI_COMM_WORLD);
MPI_File_write_at_all(mpi_fh, new_offset, buf, count,
datatype, status)
The answer to your question is no. If you do things that way, you'll end up with jumbled output from all over the place.
However, you can get the same thing by sending your output to a single processor having it do all of the writing itself. For example, at the end of your application, just have everything send to rank 0 and have rank 0 write it all to a file.

MPI and global variables

I have to implement an MPI program. There are some global variables (4 arrays of float numbers and other 6 single float variables) which are first inizialized by the main process reading data from a file. Then I call MPI_Init and, while process of rank 0 waits for results, the other processes (rank 1,2,3,4) work on the arrays etc...
The problem is that those array seem not to be initialized anymore, all is set to 0. I tried to move global variable inside the main function but the result is the same. When MPI_Init() is called all processes are created by fork right? So everyone has a memory copy of the father so why do they see not initizialized arrays?
I fear you have misunderstood.
It is probably best to think of each MPI process as an independent program, albeit one with the same source code as every other process in the computation. Operations that process 0 carries out on variables in its address space have no impact on the contents of the address spaces of other processes.
I'm not sure that the MPI standard even requires process 0 to have values for variables which were declared and initialised prior to the call to mpi_init, that is before process 0 really exists.
Whether it does or not you will have to write code to get the values into the variables in the address space of the other processes. One way to do this would be to have process 0 send the values to the other processes, either one by one or using a broadcast. Another way would be for all processes to read the values from the input files; if you choose this option watch out for contention over i/o resources.
In passing, I don't think it is common for MPI implementations to create processes by forking at the call to mpi_init, forking is more commonly used for creating threads. I think that most MPI implementations actually create the processes when you make a call to mpiexec, the call to mpi_init is the formality which announces that your program is starting its parallel computations.
When MPI_Init() is called all processes are created by fork right?
Wrong.
MPI spawns multiple instances of your program. These instances are separate processes, each with its own memory space. Each process has its own copy of every variable, including globals. MPI_Init() only initializes the MPI environment so that other MPI functions can be called.
As the other answers say, that's not how MPI works. Data is unique to each process and must be explicitly transferred between processes using the API available in the MPI specification.
However, there are programming models that allow this sort of behavior. If, when you say parallel computing, you mean multiple cores on one processor, you might be better served by using something like OpenMP to share your data between threads.
Alternatively, if you do in fact need to use multiple processors (either because your data is too big to fit in one processor's memory, or some other reason), you can take a look at one of the Parallel Global Address Space (PGAS) languages. In those models, you have memory that is globally available to all processes in an execution.
Last, there is a part of MPI that does allow you to expose memory from one process to other processes. It's the Remote Memory Access (RMA) or One-Sided chapter. It can be complex, but powerful if that's the kind of computing model you need.
All of these models will require changing the way your application works, but it sounds like they might map to your problem better.

Variable memory allocation in MPI Code

In a cluster running MPI code, is a copy of all the declared-variables, sent to all nodes , so that all nodes can access it locally, and not perform a remote memory access ?
No, MPI itself can't do this for you in single call.
There is an own state of memory in every MPI process, and every value may be different in any of MPI process.
The only way of sending/receiving data is to use explicit calls of MPI, like Send or Recv. You can pack most of your data into some memory space and send this area of memory to each MPI Process, but this area will not contain 'every declared variable', only variables placed manually into this area.
Update:
Each node runs a copy of the program. Each copy will initialize variables as it want (it can be the same initialization, or individual, based on MPI Process number, called Rank; got from MPI_Comm_Rank function). So every variable exist in N copyes; one set per MPI Process. Every process sees variables, but only the set it owns. Values of variables are unsyncronized automatically.
So, task of programmer is to syncronize values of variables between Nodes (mpi processes).
E.g. here is small MPI program to compute Pi:
http://www.mcs.anl.gov/research/projects/mpi/usingmpi/examples/simplempi/cpi_c.htm
It will send value of the 'n' variable from first process to all other (MPI_Bcast); and every process will send its own 'mypi' after calculation into 'pi' variable of first process (with addition of individual values via MPI_Reduce function).
Only first process will be able to read N from user (via scanf) and this code is conditionally executed based on rank of process; other processes must get the N from the first because they didn't read it from user directly.
Update2 (sorry for late answer):
This is syntax of MPI_Bcast. Programmer should give an address of variable into this function. Each of MPI processes will give the address of its own 'n' variable (it can be different). And the MPI_Bcast will
check the rank of current process and compare with other argument, the rank of "Broadcaster".
If the current process is broadcaster, MPI_Bcast will read value, placed in memory at given address (it will read value of the 'n' variable on "Broadcaster"); then the value will be send via network.
Else, if the current process is not a broadcaster, it is an "receiver". MPI_Bcast at receiver will get the value from "Broadcaster" (Using MPI Library internals, via network) and store the value in memory of current process at given address.
So, the address is given to this function because on some nodes the function will write to the variable. Only value is send via network.

MPI + Function Pointers?

If I'm running the same binary (which implies the same architecture) on multiple nodes of a Beowulf cluster in an MPI configuration, is it safe to pass function pointers via MPI as a way of telling another node to call a function? Under what circumstances, if any, can the same function in the same binary have different virtual addresses on different machines or different instances?
Passing any kind of pointers other than the one shared file pointer per collective MPI_FILE_OPEN (which MPI maintains) to other processes is meaningless. Separate address spaces mean that the pointer value is useless in any process other than the one that generated it.
On the other hand, you could pass around the information about which function you want each process to call, or make each one decide individually. That depends on what your code is doing, of course.
Simply create array of functions filled with known order and pass functions's ID.

Resources