I would like to understand why I should use MPI_Wait to wait for an MPI request to complete.
So, at the end of the MPI_Send scope I would use the MPI_Wait method, but why?
In my understanding MPI_Send just sends a message and doesn't wait for a request to complete, but
MPI_Isend does!
Thanks.
To summarize what @Hristo Iliev said, you shouldn't (and can't) call MPI_WAIT for a call to MPI_SEND. Calling MPI_WAIT requires that you pass in an MPI_Request object, which you get as a return value from an MPI_I<something> function. Without that object, MPI doesn't know what you're trying to wait on.
So your reasoning in the end is correct. You don't wait on an MPI_SEND, but you do (and must) wait on an MPI_ISEND.
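To make that concrete, here is a minimal, hedged sketch in C (the buffer size, destination rank and tag are placeholders, not taken from the question):
#include <mpi.h>

/* Non-blocking send: MPI_Isend returns a request that must later be
   completed with MPI_Wait (or MPI_Test). A plain MPI_Send needs no such call. */
int data[100];
MPI_Request req;

MPI_Isend(data, 100, MPI_INT, /* dest */ 1, /* tag */ 0, MPI_COMM_WORLD, &req);
/* ... do other work that does not touch 'data' ... */
MPI_Wait(&req, MPI_STATUS_IGNORE);   /* only now is it safe to reuse 'data' */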
Related
I'm a bit confused about the difference between asynchronous calls and callbacks.
I read these posts, which teach about callbacks, but none of the answers addresses how they differ from asynchronous calls.
Are callbacks = lambda expressions?
Are callbacks running in a different thread?
Can anyone explain this in plain simple English?
Very simply, a callback needn't be asynchronous.
http://docs.apigee.com/api-baas/asynchronous-vs-synchronous-calls
Synchronous:
If an API call is synchronous, it means that code execution will
block (or wait) for the API call to return before continuing. This
means that until a response is returned by the API, your application
will not execute any further, which could be perceived by the user as
latency or performance lag in your app. Making an API call
synchronously can be beneficial, however, if there is code in your app
that will only execute properly once the API response is received.
Asynchronous:
Asynchronous calls do not block (or wait) for the API call to return
from the server. Execution continues on in your program, and when the
call returns from the server, a "callback" function is executed.
In Java, C and C#, "callbacks" are usually synchronous (with respect to a "main event loop").
In Javascript, on the other hand, callbacks are usually asynchronous - you pass a function that will be invoked ... but other events will continue to be processed until the callback is invoked.
If you don't care what Javascript events occur in which order - great. Otherwise, one very powerful mechanism for managing asynchronous behavior in Javascript is to use "promises":
http://www.html5rocks.com/en/tutorials/es6/promises/
PS:
To answer your additional questions:
Yes, a callback may be a lambda - but it's not a requirement.
In Javascript, just about every callback will be an "anonymous function" (basically a "lambda expression").
Yes, callbacks may be invoked from a different thread - but it's certainly not a requirement.
Callbacks may also (and often do) spawn a thread (thus making themselves "asynchronous").
'Hope that helps
====================================================================
Hi, Again:
Q: @paulsm4 can you please elaborate with an example how the callback
and asynchronous call works in the execution flow? That will be
greatly helpful
First we need to agree on a definition for "callback". Here's a good one:
https://en.wikipedia.org/wiki/Callback_%28computer_programming%29
In computer programming, a callback is a piece of executable code that
is passed as an argument to other code, which is expected to call back
(execute) the argument at some convenient time. The invocation may be
immediate as in a synchronous callback, or it might happen at a later
time as in an asynchronous callback.
We must also define "synchronous" and "asynchronous". Basically - if a callback does all its work before returning to the caller, it's "synchronous". If it can return to the caller immediately after it's invoked - and the caller and the callback can work in parallel - then it's "asynchronous".
The problem with synchronous callbacks is they can appear to "hang". The problem with asynchronous callbacks is you can lose control of "ordering" - you can't necessarily guarantee that "A" will occur before "B".
Common examples of callbacks include:
a) a button press handler (each different "button" will have a different "response"). These are usually invoked "asynchronously" (by the GUI's main event loop).
b) a sort "compare" function (so a common "sort()" function can handle different data types). These are usually invoked "synchronously" (called directly by your program) - a minimal sketch follows below.
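As a small, hedged illustration of case b) in plain C (the data and names here are my own, not from any of the posts above): the compare callback passed to the standard qsort() runs synchronously, on the caller's thread, before qsort() returns.
#include <stdio.h>
#include <stdlib.h>

/* Synchronous callback: qsort() calls compare_ints() directly and
   only returns once every comparison has been made. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    int values[] = { 3, 1, 2 };
    qsort(values, 3, sizeof values[0], compare_ints);
    printf("%d %d %d\n", values[0], values[1], values[2]);   /* prints: 1 2 3 */
    return 0;
}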
A CONCRETE EXAMPLE:
a) I have a "C" language program with a "print()" function.
b) "print()" is designed to use one of three callbacks: "PrintHP()", "PrintCanon()" and "PrintPDF()".
c) "PrintPDF()" calls a library to render my data in PDF. It's synchronous - the program doesn't return back from "print()" until the .pdf rendering is complete. It usually goes pretty quickly, so there's no problem.
d) I've coded "PrintHP()" and "PrintCanon()" to spawn threads to do the I/O to the physical printer. "print()" exits as soon as the thread is created; the actual "printing" goes on in parallel with program execution. These two callbacks are "asynchronous".
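Here is a rough, hedged sketch of that setup in C. The callback names come from the description above; everything else (POSIX threads, the sleep at the end) is my own scaffolding for illustration only.
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* print() takes a callback; it neither knows nor cares which printer it drives. */
typedef void (*print_fn)(const char *data);

static void PrintPDF(const char *data)            /* synchronous callback */
{
    printf("rendering PDF: %s\n", data);          /* returns only when rendering is done */
}

static void *hp_io_thread(void *arg)              /* does the slow printer I/O */
{
    printf("sending to HP printer: %s\n", (const char *)arg);
    return NULL;
}

static void PrintHP(const char *data)             /* asynchronous callback */
{
    pthread_t t;
    pthread_create(&t, NULL, hp_io_thread, (void *)data);
    pthread_detach(t);                            /* print() returns immediately; I/O continues in parallel */
}

static void print(const char *data, print_fn cb)
{
    cb(data);                                     /* invoke whichever callback was passed in */
}

int main(void)
{
    print("report.txt", PrintPDF);   /* blocks until the PDF is rendered */
    print("report.txt", PrintHP);    /* returns right away */
    sleep(1);                        /* crude: give the detached thread time to finish in this toy example */
    return 0;
}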
Q: Make sense? Does that help?
They are quite similar, but this is just my humble opinion.
When you use callbacks you specify which method you should be called back on, and you rely on the methods you call to actually call you back. Your callback could end up being invoked from anywhere, and you are not guaranteed to be called back.
In Asynchronous programming, the call stack should unwind to the starting position, just as in normal synchronous programming.
Caveat: I am specifically thinking of the C# await functionality as there are other async techniques.
I want to comment on paulsm4's answer above, but I don't have enough reputation, so I have to give another new answer.
According to Wikipedia, "a callback is a piece of executable code that is passed as an argument to other code, which is expected to call back (execute) the argument at some convenient time. The invocation may be immediate as in a synchronous callback, or it might happen at a later time as in an asynchronous callback." So the qualifying words "synchronous" and "asynchronous" attach to the callback, which is the key point. We often confuse them with an "asynchronous function", which is really the caller function. For example,
const xhr = new XMLHttpRequest();
xhr.addEventListener('loadend', () => {
  log.textContent = `${log.textContent}Finished with status: ${xhr.status}`;
});
xhr.open('GET', 'https://raw.githubusercontent.com/mdn/content/main/files/en-us/_wikihistory.json');
xhr.send();
Here, xhr.send() is the asynchronous caller function, while the anonymous function passed to xhr.addEventListener is the asynchronous callback function.
For clarification, the following is a synchronous callback example:
function doOperation(callback) {
  const name = "world";
  callback(name);
}

function doStep(name) {
  console.log(`hello, ${name}`);
}

doOperation(doStep);
Then, let's answer the specific questions:
Are callbacks = lambda expressions?
A: Nope, a callback is just a normal function; it can be named or anonymous (a lambda expression).
Are callbacks running in a different thread?
A: If callbacks are synchronous, they run within the same thread as the caller function. If callbacks are asynchronous, they run later - in another thread, or on the caller's event loop - so that they don't block the execution of the caller.
A call is synchronous if it returns control to the caller only when it is done.
A call is asynchronous otherwise: it returns control immediately and the work completes later.
When I have code like this:
QNetworkReply* reply = netWorkMansger.post(...);
connect(reply,&QNetworkReply::finished,[=](){//Handler code});
What if the reply finishes between the first and second line? Then the connect comes too late, does it not?
The connect() would indeed be too late if the reply ever emitted finished() during creation, before the reply pointer is returned to the caller of post().
QNetworkReply/QNetworkAccessManager don't do that though: the start of the reply's network operation is queued in the event loop, i.e. the operation won't start before your code returns to the event loop or events are otherwise processed ("otherwise" meaning one explicitly calls one of a few nasty methods like QDialog::exec()/QMessageBox::critical(), QEventLoop::exec(), or processEvents()). Thus your code is safe if you connect to the reply right away.
Not emitting signals like finished() synchronously at the time an asynchronous operation is started is in my opinion one of the most important rules when implementing such operations.
I know that MPI_SENDRECV allows one to overcome the problem of deadlocks (which we can run into when we use the classic MPI_SEND and MPI_RECV functions).
I would like to know if MPI_SENDRECV(sent_to_process_1, receive_from_process_0) is equivalent to:
MPI_ISEND(sent_to_process_1, request1)
MPI_IRECV(receive_from_process_0, request2)
MPI_WAIT(request1)
MPI_WAIT(request2)
with the asynchronous MPI_ISEND and MPI_IRECV functions?
From what I have seen, MPI_ISEND and MPI_IRECV create a fork (i.e. 2 processes). So if I follow this logic, the first call of MPI_ISEND generates 2 processes. One does the communication and the other calls MPI_IRECV, which forks itself into 2 processes.
But once the communication of the first MPI_ISEND is finished, does the second process call MPI_IRECV again? With this logic, the above equivalence doesn't seem to be valid...
Maybe I should change to this:
MPI_ISEND(sent_to_process_1, request1)
MPI_WAIT(request1)
MPI_IRECV(receive_from_process_0, request2)
MPI_WAIT(request2)
But I think that this could also create deadlocks.
Could anyone give me another solution using MPI_ISEND, MPI_IRECV and MPI_WAIT to get the same behaviour as MPI_SENDRECV?
There are some dangerous lines of thought in the question and other answers. When you start a non-blocking MPI operation, the MPI library doesn't create a new process/thread/etc. You're thinking of something more like a parallel region of OpenMP, I believe, where new threads/tasks are created to do some work.
In MPI, starting a non-blocking operation is like telling the MPI library that you have some things that you'd like to get done whenever MPI gets a chance to do them. There are lots of equally valid options for when they are actually completed:
It could be that they all get done later when you call a blocking completion function (like MPI_WAIT or MPI_WAITALL). These functions guarantee that when the blocking completion call is done, all of the requests that you passed in as arguments are finished (in your case, the MPI_ISEND and the MPI_IRECV). Regardless of when the operations actually take place (see next few bullets), you as an application can't consider them done until they are actually marked as completed by a function like MPI_WAIT or MPI_TEST.
The operations could get done "in the background" during another MPI operation. For instance, if you do something like the code below:
MPI_Isend(..., MPI_COMM_WORLD, &req[0]);
MPI_Irecv(..., MPI_COMM_WORLD, &req[1]);
MPI_Barrier(MPI_COMM_WORLD);
MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
The MPI_ISEND and the MPI_IRECV would probably actually do the data transfers in the background during the MPI_BARRIER. This is because, as an application, you are transferring "control" of your application to the MPI library during the MPI_BARRIER call. This lets the library make progress on any ongoing MPI operations that it wants. Most likely, by the time the MPI_BARRIER is complete, so are most of the other operations that were started earlier.
Some MPI libraries allow you to specify that you want a "progress thread". This tells the MPI library to start up another thread (note that a thread != a process) in the background that will actually do the MPI operations for you while your application continues in the main thread.
Remember that all of these in the end require that you actually call MPI_WAIT or MPI_TEST or some other function like it to ensure that your operation is actually complete, but none of these spawn new threads or processes to do the work for you when you call your nonblocking functions. Those really just act like you stick them on a list of things to do (which in reality, is how most MPI libraries implement them).
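A hedged sketch of that idea (do_some_unrelated_work() is a hypothetical placeholder, and the count/rank/tag are made up): polling with MPI_Test both checks for completion and gives the library a chance to make progress.
int buf[1024];
MPI_Request req;
int done = 0;

MPI_Isend(buf, 1024, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
while (!done) {
    do_some_unrelated_work();                   /* hypothetical: anything that doesn't touch buf */
    MPI_Test(&req, &done, MPI_STATUS_IGNORE);   /* checks completion and lets MPI progress the send */
}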
The best way to think of how MPI_SENDRECV is implemented is as two non-blocking calls with one completion function:
MPI_Isend(..., MPI_COMM_WORLD, &req[0]);
MPI_Irecv(..., MPI_COMM_WORLD, &req[1]);
MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
How I usually do this on node i communicating with node i+1:
mpi_isend(send_to_process_iPlus1, requests(1))
mpi_irecv(recv_from_process_iPlus1, requests(2))
...
mpi_waitall(2, requests)
You can see how ordering your commands this way with non-blocking communication allows you (during the ... above) to perform any computation that does not rely on the send/recv buffers while the communication proceeds. Overlapping computation with communication is often crucial for maximizing performance.
mpi_sendrecv, on the other hand (while avoiding any deadlock issues), is still a blocking operation. Thus, your program must remain in that routine during the entire send/recv process.
Final points: you can initialize more than 2 requests and wait on all of them the same way using the above structure as when dealing with 2 requests. For instance, it's quite easy to start communication with node i-1 as well and wait on all 4 of the requests, as in the sketch below. Using mpi_sendrecv you must always have a paired send and receive; what if you only want to send?
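For illustration, a hedged C version of that four-request exchange (buffer names, counts and tags are placeholders; handling of the boundary ranks, e.g. via MPI_PROC_NULL, is omitted):
MPI_Request requests[4];

MPI_Isend(send_right, n, MPI_DOUBLE, rank + 1, 0, MPI_COMM_WORLD, &requests[0]);
MPI_Irecv(recv_right, n, MPI_DOUBLE, rank + 1, 1, MPI_COMM_WORLD, &requests[1]);
MPI_Isend(send_left,  n, MPI_DOUBLE, rank - 1, 1, MPI_COMM_WORLD, &requests[2]);
MPI_Irecv(recv_left,  n, MPI_DOUBLE, rank - 1, 0, MPI_COMM_WORLD, &requests[3]);

compute_interior();   /* hypothetical: work that doesn't use the send/recv buffers */

MPI_Waitall(4, requests, MPI_STATUSES_IGNORE);   /* only now touch the buffers again */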
As we know, there is a thing called an MPI send buffer used during a send action.
And for following code:
MPI_Isend(data, ..., &req);
...
MPI_Wait(&req, &status);
Is it safe to use data between MPI_Isend and MPI_Wait ?
That means, will MPI_Isend use data as the internal send buffer?
And moreover, if I don't use data anymore, can I tell MPI to use data as the send buffer rather than waste time copying the data?
BTW, I've heard of MPI_Bsend, but I don't think it could save memory and time in this case.
MPI provides two kinds of operations: blocking and non-blocking. The difference between the two is when it is safe to reuse the data buffer passed to the MPI function.
When a blocking call like MPI_Send returns, the buffer is no longer needed by the MPI library and can be safely reused. Non-blocking calls, on the other hand, only initiate the corresponding operation and let it continue asynchronously. Only after a successful call to a routine like MPI_Wait, or after a positive test result from MPI_Test, can one safely reuse the buffer.
As for how the library utilises the user buffer, that is very implementation-specific. Shorter messages are usually copied to internal (for the MPI library) buffers for performance reasons. Longer messages are usually directly read from the user buffer and sent to the network, therefore the buffer will be in use by MPI until the whole message has been sent.
It is absolutely not safe to use data between MPI_Isend and MPI_Wait.
Between MPI_Isend and MPI_Wait you don't actually know when data can be reused. Only after MPI_Wait can you be sure that data has been sent and that you can reuse it.
If you don't use data anymore, you should still call MPI_Wait at the end of your program.
In a spatially decomposed 2D domain, I need to send particles to the 8 neighbors. I know how many I'm sending but not how many I'll receive from these neighbors.
I had implemented code with MPI_Send(), MPI_Probe() and MPI_Recv(), but I realized that it caused deadlocks whenever the message was too big.
I decided to go for non-blocking communications, but then I can't figure out in what order MPI_Isend, MPI_Irecv and MPI_Iprobe should be called. I definitely need to know the size my receiving buffer should be allocated to before actually calling MPI_Irecv, so I'm tempted by the order MPI_Isend(), then MPI_Iprobe(), then MPI_Irecv(); but the problem is that MPI_Iprobe() always returns a flag equal to false and I get stuck in the while loop. As far as I understand, there is no obligation for MPI to actually complete the send before the call to MPI_Wait(), therefore I understand that MPI_Iprobe might never return true. But if so, how does one receive a message of unknown size with non-blocking MPI point-to-point communications?
You don't have to make all 3 operations non-blocking. You can use an MPI_ISEND with a regular MPI_PROBE and/or MPI_RECV. It sounds like that might be a better option for you.
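A hedged sketch of that combination for the particle exchange with one neighbour (the element type, counts and tag are placeholders, and malloc needs <stdlib.h>): post the non-blocking send, then use a blocking probe to learn the incoming size before allocating and receiving.
MPI_Request send_req;
MPI_Status  status;
int         incoming;

/* Non-blocking send of our own particles. */
MPI_Isend(send_buf, nsend, MPI_DOUBLE, neighbour, 0, MPI_COMM_WORLD, &send_req);

/* Blocking probe: tells us how large the matching incoming message is... */
MPI_Probe(neighbour, 0, MPI_COMM_WORLD, &status);
MPI_Get_count(&status, MPI_DOUBLE, &incoming);

/* ...so the receive buffer can be allocated to exactly the right size. */
double *recv_buf = malloc(incoming * sizeof *recv_buf);
MPI_Recv(recv_buf, incoming, MPI_DOUBLE, neighbour, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

MPI_Wait(&send_req, MPI_STATUS_IGNORE);   /* our own send must still be completed */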