MPI reuse MPI_Request

Is it safe to re-use a finished MPI_Request for another request? I have been using a pool of MPI_Request objects to improve performance and have seen no errors, but it would be good to know for sure.

Variables of type MPI_Request are not request objects themselves but rather just opaque handles (something like an abstract pointer) to the real MPI request objects. Assigning a new value to such a variable in no way affects the MPI object and only breaks the association to it. Therefore the object might become inaccessible in the sense that if no handle to it exists in your program, it can no longer be passed to MPI calls. This is the same as losing a pointer to a dynamically allocated memory block, thus leaking it.
When it comes to asynchronous request handles, once the operation is completed, MPI destroys the request object and MPI_Wait* / MPI_Test* set the passed handle variable to MPI_REQUEST_NULL on return. Also, a call to MPI_Request_free will mark the request for deletion and set the handle to MPI_REQUEST_NULL on return. At that point you can reuse the variable and store a different request handle in it.
The same applies to handles to communicators (of type MPI_Comm), handles to datatypes (of type MPI_Datatype), handles to reduce operations (of type MPI_Op), and so on.

It's just fine to reuse your MPI_Request handles as long as the previous request has completed before you use them again (either by waiting on or testing the request until it completes, or by freeing the request object manually with MPI_Request_free).
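For concreteness, here is a minimal sketch (error handling omitted; it assumes at least two ranks, and the message counts and tags are arbitrary) in which the same MPI_Request variable is reused across iterations. MPI_Wait both completes each operation and resets the handle to MPI_REQUEST_NULL, after which the variable is free to hold the next request:

#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0, buf = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Request req;  /* one handle variable, reused on every iteration */
    for (int i = 0; i < 3; ++i) {
        if (rank == 0) {
            buf = i;
            MPI_Isend(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        } else if (rank == 1) {
            MPI_Irecv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
        } else {
            continue;  /* other ranks do nothing in this sketch */
        }
        /* completion destroys the request object and sets req to
           MPI_REQUEST_NULL, so the variable can be reused next time */
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}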

Related

Rust Global.dealloc vs ptr::drop_in_place vs ManuallyDrop

I'm relatively new to Rust. I was working on some lock-free algorithms and started playing around with manually managing memory, something similar to C++ new/delete. I noticed a couple of different ways to do this throughout the standard library components, but I want to really understand the differences and use cases of each. Here's what it seems like to me:
ManuallyDrop<Box<T>> will prevent Box's destructor from running. I can save a raw pointer to the ManuallyDrop element, and have the actual element go out of scope (what would normally be dropped in Rust) without being dropped. I can later call ManuallyDrop::drop(&mut *ptr) to drop this value manually.
I can also dereference the ManuallyDrop<Box<T>> element, save a raw pointer to just the Box<T>, and later call std::ptr::drop_in_place(box_ptr). This is supposed to destroy the Box itself and drop the heap-allocated T.
Looking at the ManuallyDrop::drop implementation, it looks like those are literally doing the exact same thing. Since ManuallyDrop is zero-cost and just stores a value in its struct, is there any difference between the above two approaches?
I can also call std::alloc::Global.dealloc(...), which looks like it will deallocate the memory block without calling drop. So if I call this on a pointer to Box<T>, it'll deallocate the heap pointer, but won't call drop, so T will still be lying around on the heap. I could call it on a pointer to T itself, which will remove T.
From exploring the standard library, it looks like Global.dealloc gets called in the raw_vec implementation to actually remove the heap-allocated array that Vec points to. This makes sense, since it's literally trying to remove a block of memory.
Rc has a drop implementation that looks roughly like this:
// destroy the contained object
ptr::drop_in_place(self.ptr.as_mut());
// remove the implicit "strong weak" pointer now that we've
// destroyed the contents.
self.dec_weak();
if self.weak() == 0 {
    Global.dealloc(self.ptr.cast(), Layout::for_value(self.ptr.as_ref()));
}
I don't really understand why it needs both the dealloc and the drop_in_place. What does the dealloc add that the drop_in_place doesn't do?
Also, if I just save a raw pointer to a heap-allocated value by doing something like Box::into_raw(Box::new(5)), does my pointer now control that memory allocation? As in, will it remain alive until I explicitly call ptr::drop_in_place()?
Finally, when I was playing with all this, I ran into a strange issue. After running ManuallyDrop::drop or ptr::drop_in_place on my raw pointer, I then tried running println! on the pointer's dereferenced value. Sometimes I get a scary heap error and my test fails, which is what I would expect. Other times, it just prints the same value, as if no drops happened. I also tried running ManuallyDrop::drop multiple times on the exact same value, and same thing. Sometimes a heap error, sometimes totally fine, and the same value prints out.
What is happening here?
If you come from C++, you can think of drop_in_place as calling the destructor manually, and dealloc as calling C's free.
They serve different purposes:
drop_in_place just calls Drop::drop, which releases the resources held by your type.
dealloc frees the memory pointed to by a pointer, previously allocated with alloc.
You seem to think that drop_in_place also frees the memory, but that is not the case. I think your confusion arises because Box<T> contains a dynamically allocated object, so its Box::drop implementation does release the memory used by that object, after calling its drop_in_place, of course.
That is what you see in the Rc implementation: first it calls drop_in_place (the destructor) on the inner object, then it releases the memory.
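If it helps, here is a tiny C++ analogy (the Widget type is made up for illustration) where allocation, in-place construction, destruction, and deallocation are all separate steps, roughly mirroring alloc, in-place construction, drop_in_place, and dealloc on the Rust side:

#include <cstdio>
#include <cstdlib>
#include <new>

struct Widget {
    int value;
    ~Widget() { std::puts("Widget destructor ran"); }
};

int main() {
    // "alloc": obtain a raw block of memory; no constructor has run yet
    void* raw = std::malloc(sizeof(Widget));
    Widget* w = new (raw) Widget{42};   // construct the object in place

    w->~Widget();    // like drop_in_place: run the destructor, block still allocated
    std::free(raw);  // like dealloc / free: return the memory block to the allocator
    return 0;
}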
About what happens if you call drop_in_place several times in a row... well, the function is unsafe for a reason: you most likely get undefined behavior. From the docs:
...if T is not Copy, using the pointed-to value after calling drop_in_place can cause undefined behavior.
Note the can cause. I think it is perfectly possible to write a type that allows calling drop several times, but it doesn't sound like such a good idea.

When exactly is "Component.completed" fired?

The docs say this:
Emitted after the object has been instantiated.
And if this was C++, I'd know that, since the object has been instantiated, I can rely on the constructor to have been executed, with all the guarantees that come from that.
But in QML I don't know what guarantees I have about an object that "has been instantiated". That memory has been allocated for it? That its properties have been evaluated and received their initial values? That the whole descendant subtree has been loaded?
The guarantee is that it will be fired after the object has been completed. That includes the allocation of memory, the construction of the object, the rigging of property bindings, initial evaluations, and such.
What is not guaranteed is the order in which completed signals are handled when objects are nested in a tree. You should not rely on that. An object will not be completed before its entire object tree is completed, but for some inexplicable reason, you can't expect notifications to arrive in the tree-defined order.

In Windows API application-defined callback functions, what is the lifetime of pointer parameter data? Does it persist after the callback returns?

As an example, let's look at EnumWindowStations(), which requires the caller to pass an EnumWindowStationProc() callback function. The callback function will be invoked once for every window station in the current terminal session. Let's look at the signature of the callback function:
BOOL CALLBACK EnumWindowStationProc(
    _In_ LPTSTR lpszWindowStation,
    _In_ LPARAM lParam
);
The first parameter is a pointer to string data. Was that string buffer allocated explicitly for the callback invocation, and will it be freed immediately after the callback returns, or perhaps immediately before the enumeration function returns? Or, does the pointer point to some kind of persistent memory, such that the string buffer will remain allocated and usable afterward?
This is an important point, because if it is not persistent, then it would be incorrect to, for example, store the raw pointer in a global container to be accessed after the callback and full enumeration process have finished. Instead, it would be necessary to copy the underlying string data to a buffer controlled by the application, before the callback returns.
The official documentation does not seem to make clear what the lifetime of the string data is. There's only one line in the description of the parameter:
lpszWindowStation [in]
The name of the window station.
And nowhere on the documentation page does it talk about the lifetime of the string data. Nor can I recall ever finding an MSDN page that answers this question "once and for all", i.e. for all uses of the callback idiom in the Windows API.
At the moment I am most interested in the EnumWindowStations()/EnumWindowStationProc() case, but it would be best if answerers would address the general case, i.e. what to assume for all Windows API callback functions.
Generally, if the memory is allocated by the system, you can rely on it being valid for the duration of the callback and no longer. That is the case for lpszWindowStation in your example. You would need to access and copy the string inside your callback function and must not refer to the system-supplied data outside your callback.
You can deduce this with a little thought experiment. If this string could be accessed after the callback returned, when would it become invalid? When would it be deallocated? You'd have to tell the system that you were done with it. Since there is no API for doing that, the only conclusion is the one stated above.
Things are different for lParam. That value was supplied by you when you called EnumWindowStations and so you manage the lifetime of whatever it points to, if indeed you are using it as a pointer.
In general, pointers passed to callback functions by the Windows API are only guaranteed to be valid for the duration of the callback (unless otherwise specified in the documentation of the specific callback). Once the callback returns, the pointer should be assumed to be invalid (it may be freed, or it may point to a temporary buffer that will be overwritten in the next iteration of an enumeration, etc.).
The callback function is responsible for creating a copy of whatever data it may need to persist after it returns. In the case of the above example, you should allocate your own buffer and copy the string from lpszWindowStation if you need to use it after the callback returns. Of course, you should also manage the lifetime of the allocated buffer(s).
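As a sketch of that pattern (error handling omitted; CollectStationName and the vector passed through lParam are just illustrative names), the callback below copies each station name into application-owned storage before returning, so the copies stay valid after the enumeration has finished:

#include <windows.h>
#include <string>
#include <vector>

// Invoked once per window station; lpszWindowStation is only guaranteed to
// be valid until this function returns, so copy it immediately.
BOOL CALLBACK CollectStationName(LPWSTR lpszWindowStation, LPARAM lParam)
{
    auto* names = reinterpret_cast<std::vector<std::wstring>*>(lParam);
    names->emplace_back(lpszWindowStation);  // deep copy into our own buffer
    return TRUE;                             // continue the enumeration
}

int main()
{
    std::vector<std::wstring> names;
    EnumWindowStationsW(CollectStationName, reinterpret_cast<LPARAM>(&names));
    // names now owns copies that remain valid after the enumeration is done
    return 0;
}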

Winsock asynchronous multiple WSASend with one single buffer

MSDN states "For a Winsock application, once the WSASend function is called, the system owns these buffers and the application may not access them."
In a server application, does that mean that if I want to broadcast a message to multiple clients I cannot use a single buffer that holds the data and invoke WSASend on each socket with that one buffer?
I don't have a documentation reference that confirms this is possible but I've been doing it for years and it hasn't failed yet, YMMV.
You CAN use a single data buffer as long as you have a unique OVERLAPPED structure per send. Since the WSABUF array is duplicated by the WSASend() call and can be stack-based, I would expect that you COULD also share a single WSABUF array, but I've never done that.
What you DO need to make sure of is that you keep that single data buffer "alive" until all of the data writes complete.
Broadcasting like this can complicate a design if you tend to structure your extended OVERLAPPED so that it includes the data buffer, but it does avoid memory allocation and memory copying.
Note: I have a system whereby my extended OVERLAPPED structures include the data buffer and operation code; these are reference counted, pooled, and used for sends and recvs. When broadcasting a buffer I use a separate "buffer handle" per send; this handle is just an OVERLAPPED structure extended in a different way, it holds a reference to the original data buffer and has its own reference count. When all of the broadcast sends have completed, all of the buffer handles will have been released and these will, in turn, have released the underlying data buffer for reuse.
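A rough sketch of that idea (hypothetical names, no error handling; it assumes the sockets were created for overlapped I/O and that completions are handled elsewhere, e.g. on an I/O completion port where each dequeued PerSend is deleted): one shared payload, one OVERLAPPED and WSABUF per outstanding send, and a reference count, here a shared_ptr, keeping the payload alive until the last completion:

#include <winsock2.h>
#include <windows.h>
#include <memory>
#include <vector>

struct PerSend {
    OVERLAPPED ov{};                              // must be unique per send
    WSABUF wsabuf{};                              // also per send, for safety
    std::shared_ptr<std::vector<char>> payload;   // shared, reference-counted buffer
};

void BroadcastTo(const std::vector<SOCKET>& clients,
                 std::shared_ptr<std::vector<char>> payload)
{
    for (SOCKET s : clients) {
        auto* ctx = new PerSend{};                // freed by the completion handler
        ctx->payload = payload;                   // bumps the reference count
        ctx->wsabuf.buf = ctx->payload->data();
        ctx->wsabuf.len = static_cast<ULONG>(ctx->payload->size());
        WSASend(s, &ctx->wsabuf, 1, nullptr, 0, &ctx->ov, nullptr);
        // When the completion for this send is dequeued, delete ctx; the last
        // delete drops the shared_ptr count to zero and frees the payload.
    }
}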

Garbage collection vs. shared pointers

What are the differences between shared pointers (such as boost::shared_ptr or the new std::shared_ptr) and garbage collection methods (such as those implemented in Java or C#)? The way I understand it, shared pointers keep track of how many variables point to the resource and will automatically destruct the resource when the count reaches zero. However, my understanding is that the garbage collector also manages memory resources, but requires additional resources to determine if an object is still being referred to and doesn't necessarily destruct the resource immediately.
Am I correct in my assumptions, and are there any other differences between using garbage collectors and shared pointers? Also, why would anyone ever use a garbage collector over a shared pointer if they perform similar tasks but with varying performance figures?
The main difference lies, as you noted, in when the resource is released/destroyed.
One advantage where a GC might come in handy is if you have resources that take a long time to be released. For a short program lifetime, it might be nice to leave the resources lingering and have them cleaned up at the end. If resource limits are reached, the GC can act to release some of them. Shared pointers, on the other hand, release their resources as soon as the reference count hits zero. This could be costly for frequent acquisition-release cycles of a resource whose release is expensive.
On the other hand, in some garbage collection implementations, garbage collection requires that the whole program pause its execution while memory is examined, moved around, and freed. There are smarter implementations, but none are perfect.
Shared pointers (usually called reference counting) run the risk of cycles.
Garbage collection (Mark and Sweep) does not have this problem.
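A minimal illustration of the cycle problem (the Node type is only for demonstration): two objects holding shared_ptrs to each other never reach a reference count of zero, so neither destructor ever runs, whereas a tracing collector would reclaim both:

#include <memory>
#include <iostream>

struct Node {
    std::shared_ptr<Node> other;   // a std::weak_ptr here would break the cycle
    ~Node() { std::cout << "destroyed\n"; }
};

int main() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->other = b;
    b->other = a;
    // a and b go out of scope here, but each object still keeps the other
    // alive: "destroyed" is never printed and the memory is leaked.
    return 0;
}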
In a simple garbage-collected system, nobody will hold a direct pointer to any object; instead, code will hold references to table entries which point to objects on the heap. Each object on the heap will store its size (meaning all heap objects effectively form a singly-linked list) and a back-reference to the object table entry which refers to it (or at least used to).
When either the heap or the object table gets full, the system will set a "delete me" flag on every entry in the table. It will then examine every object it knows to be reachable and, if that object's "delete me" flag is set, clear the flag and add every object that object references to the list of objects to be examined. Once that is done, any object whose "delete me" flag is still set can be deleted.
Once that is done, the system will start at the beginning of the heap, take each object stored there, and see if its object table entry still points to it. If so, it will copy that object to the beginning of the heap, or just past the end of the last copied object; otherwise the object will be skipped (and will likely be overwritten when other objects are copied).
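Here is a miniature sketch of the reachability (mark) and reclamation (sweep) phases, leaving out the object table and the compaction step described above; all of the types are made up for illustration:

#include <algorithm>
#include <iostream>
#include <memory>
#include <vector>

struct Object {
    bool marked = false;
    std::vector<Object*> refs;   // outgoing references to other heap objects
};

struct Heap {
    std::vector<std::unique_ptr<Object>> objects;  // everything ever allocated
    std::vector<Object*> roots;                    // objects directly reachable from the program

    Object* alloc() {
        objects.push_back(std::make_unique<Object>());
        return objects.back().get();
    }

    void collect() {
        // mark phase: flood-fill reachability starting from the roots
        std::vector<Object*> work(roots);
        while (!work.empty()) {
            Object* o = work.back();
            work.pop_back();
            if (o->marked) continue;
            o->marked = true;
            work.insert(work.end(), o->refs.begin(), o->refs.end());
        }
        // sweep phase: anything unmarked is garbage, even if it sits in a cycle
        objects.erase(std::remove_if(objects.begin(), objects.end(),
                          [](const std::unique_ptr<Object>& o) { return !o->marked; }),
                      objects.end());
        for (auto& o : objects) o->marked = false;   // reset flags for the next collection
    }
};

int main() {
    Heap heap;
    Object* a = heap.alloc();
    Object* b = heap.alloc();
    a->refs.push_back(b);
    b->refs.push_back(a);   // a reference cycle, but unreachable from any root
    heap.collect();
    std::cout << heap.objects.size() << " live objects\n";   // prints "0 live objects"
    return 0;
}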
In languages with a garbage collector (GC), the GC keeps track of and cleans up memory that isn’t being used anymore, and we don’t need to think about it. In most languages without a GC, it’s our responsibility to identify when memory is no longer being used and to call code to explicitly free it, just as we did to request it.
