Rust Global.dealloc vs ptr::drop_in_place vs ManuallyDrop - pointers

I'm relatively new to Rust. I was working on some lock-free algorithms, and started playing around with manually managing memory, something similar to C++ new/delete. I noticed a couple different ways that do this throughout the standard library components, but I want to really understand the differences and use cases of each. Here's what it seems like to me:
ManuallyDrop<Box<T>> will prevent Box's destructor from running. I can save a raw pointer to the ManuallyDrop element, and have the actual element go out of scope (what would normally be dropped in Rust) without being dropped. I can later call ManuallyDrop::drop(&mut *ptr) to drop this value manually.
I can also dereference the ManuallyDrop<Box<T>> element, save a raw pointer to just the Box<T>, and later call std::ptr::drop_in_place(box_ptr). This is supposed to destroy the Boxitself and drop the heap-allocated T.
Looking at the ManuallyDrop::drop implementation, it looks those are literally doing the exact same thing. Since ManuallyDrop is zero cost and just stores a value in it's struct, is there any difference in the above two approaches?
I can also call std::alloc::Global.dealloc(...), which looks like it will deallocate the memory block without calling drop. So if I call this on a pointer to Box<T>, it'll deallocate the heap pointer, but won't call drop, so T will still be lying around on the heap. I could call it on a pointer to T itself, which will remove T.
From exploring the standard library, it looks like Global.dealloc gets called in the raw_vec implementation to actually remove the heap-allocated array that Vec points to. This makes sense, since it's literally trying to remove a block of memory.
Rc has a drop implementation that looks roughly like this:
// destroy the contained object
ptr::drop_in_place(self.ptr.as_mut());
// remove the implicit "strong weak" pointer now that we've
// destroyed the contents.
self.dec_weak();
if self.weak() == 0 {
Global.dealloc(self.ptr.cast(), Layout::for_value(self.ptr.as_ref()));
}
I don't really understand why it needs both the dealloc and the drop_in_place. What does the dealloc add that the drop_in_place doesn't do?
Also, if I just save a raw pointer to a heap-allocated value by doing something like Box::new(5).into_raw(), does my pointer now control that memory allocation. As in, will it remain alive until I explicitly call ptr::drop_in_place()?
Finally, when I was playing with all this, I ran into a strange issue. After running ManuallyDrop::drop or ptr::drop_in_place on my raw pointer, I then tried running println! on the pointer's dereferenced value. Sometimes I get a scary heap error and my test fails, which is what I would expect. Other times, it just prints the same value, as if no drops happened. I also tried running ManuallyDrop::drop multiple times on the exact same value, and same thing. Sometimes a heap error, sometimes totally fine, and the same value prints out.
What is happening here?

If you come from C++, you can think of drop_in_place as calling the destructor manually, and dealloc as calling old C free.
They serve different purposes:
drop_in_place just calls Drop::drop, that releases the resources held by your type.
dealloc frees the memory pointed to by a pointer, previously allocated with alloc.
You seem to think that drop_in_place also frees the memory, but that is not the case. I think your confusion arises because Box<T> contains a dynamically allocated object, so its Box::drop implementation does release the memory used by that object, after calling its drop_in_place, of course.
That is what you see in the Rc implementation, first it calls the drop_in_place (destructor) of the inner object, then it releases the memory.
About what happens if you call drop_in_place several times in a row... well, the function is unsafe for a reason: you most likely get Uundefined Behavior. From the docs:
...if T is not Copy, using the pointed-to value after calling drop_in_place can cause undefined behavior.
Note the can cause. I think it is perfectly possible to write a type that allows calling drop several times, but it doesn't sound like such a good idea.

Related

How can the processor discern a far return from a near return?

Reading Intel's big manual, I see that if you want to return from a far call, that is, a call to a procedure in another code segment, you simply issue a return instruction (possibly with an immediate argument that moves the stack pointer up n bytes after the pointer popping).
This, apparently, if I'm interpreting things correctly, is enough for the hardware to pop both the segment selector and offset into the correct registers.
But, how does the system know that the return should be a far return and that both an offset AND a selector need to be popped?
If the hardware just pops the offset pointer and not the selector after it, then you'll be pointing to the right offset but wrong segment.
There is nothing special about the far return command compared to the near return version.
They both look identical as far as I can tell.
I assume then that the processor, perhaps at the micro-architecture level, keeps track of which calls are far and which are close so that when they're returned from, the system knows how many bytes to pop and where to pop them (pointer registers and segment selector registers).
Is my assumption correct?
What do you guys know about this mechanism?
The processor doesn't track whether or not a call should be far or near; the compiler decides how to encode the function call and return using either far or near opcodes.
As it is, FAR calls have no use on modern processors because you don't need to change any segment register values; that's the point of a flat memory model. Segment registers still exist, but the OS sets them up with base=0 and limit=0xffffffff so just a plain 32-bit pointer can access all memory. Everything is NEAR, if you need to put a name on it.
Normally you just don't even think about segmentation so you don't actually call it either. But the manual still describes the call/ret opcodes we use for normal code as the NEAR versions.
FAR and NEAR were used on old 86 processors, which used a segmented memory model. Programs at that time needed to choose what kind of architecture they wished to support, ranging from "tiny" to "large". If your program was small enough to fit in a single segment, then it could be compiled using NEAR calls and returns exclusively. If it was "large", the opposite was true. For anything in between, you had power to choose whether local functions needed to be able to be either callable/returnable from code in another segment.
Most modern programs (besides bootloaders and the like) run on a different construct: they expect a flat memory model. Behind the scenes the OS will swap out memory as needed (with paging not segmentation), but as far as the program is concerned, it has its virtual address space all to itself.
But, to answer your question, the difference in the call/return is the opcode used; the processor obeys the command given to it. If you mistake (say, give it a FAR return opcode when in flat mode), it'll fail.

In terms of design and when writing a library, when should I use a pointer as an argument, and when should I not?

Sorry if my question seems stupid. My background is in PHP, Ruby, Python, Lua and similar languages, and I have no understanding of pointers in real-life scenarios.
From what I've read on the Internet and what I've got as responses in a question I asked (When is a pointer idiomatic?), I have understood that:
Pointers should be used when copying large data. Instead of getting the whole object hierarchy, receive its address and access it.
Pointers have to be used when you have a function on a struct that modifies it.
So, pointers seem like a great thing: I should just always get them as function arguments because they are so lightweight, and it's okay if I somehow end up not needing to modify anything on the struct.
However, looking at that statement intuitively, I can feel that it sounds very creepy, and yet I don't know why.
So, as someone who is designing a struct and its related functions, or just functions, when should I receive a pointer? When should I receive a value, and why?
In other words, when should my NewAuthor method return &Author{ ... }, and when should it return Author{ ... }? When should my function get a pointer to an author as an argument, and when should it just get the value (a copy) of type Author?
There's tradeoffs for both pointers and values.
Generally speaking, pointers will point to some other region of memory in the system. Be it the stack of the function that wants to pass a pointer to a local variable or some place on the heap.
func A() {
i := 25
B(&i) // A sets up stack frame to call B,
// it copies the address of i so B can look it up later.
// At this point, i is equal to 30
}
func B(i *int){
// Here, i points to A's stack frame.
// For this to execute, I look at my variable "i",
// see the memory address it points to, then look at that to get the value of 25.
// That address may be on another page of memory,
// causing me to have to look it up from main memory (which is slow).
println(10 + (*i))
// Since I have the address to A's local variable, I can modify it.
*i = 30
}
Pointers require me to de-reference them constantly whenever I was to see the data it points to. Sometimes you don't care. Other times it matters a lot. It really depends on the application.
If that pointer has to be de-referenced a lot (ie: you pass in a number to use in a bunch of different calcs), then you keep paying the cost.
Compared to using values:
func A() {
i := 25
B(i) // A sets up the stack frame to call B, copying in the value 25
// i is still 25, because A gave B a copy of the value, and not the address.
}
func B(i int){
// Here, i is simply on the stack. I don't have to do anything to use it.
println(10 + i)
// Since i here is a value on B's stack, modifications are not visible outside B's scpe
i = 30
}
Since there's nothing to dereference, it's basically free to use the local variable.
The down side of passing values happens if those values are large because copying data to the stack isn't free.
For an int it's a wash because pointers are "int" sized. For a struct, or an array, you are copying all the data.
Also, large objects on the stack can make the stack extra big. Go handles this well with stack re-allocation, but in high performance scenarios, it may be too much of an impact to performance.
There's a data safety aspect as well (can't modify something I pass by value), but I don't feel that is usually an issue in most code bases.
Basically, if your problem was already solvable by ruby, python or other language without value types, then these performance nuances don't super-matter.
In general, passing structs as pointers will usually do "the right thing" while learning the language.
For all other types, or things that you want to keep as read-only, pass values.
There are exceptions to that rule, but it's best that you learn those as needs arise rather than try to redefine your world all at once. If that makes sense.
Simply you can use pointers anywhere you want, sometimes you don't want to change your data. It may stand for abstract data, and you don't want to explicitly copy the data. Just pass by value and let compiler do its job.

Garbage collection vs. shared pointers

What are the differences between shared pointers (such as boost::shared_ptr or the new std::shared_ptr) and garbage collection methods (such as those implemented in Java or C#)? The way I understand it, shared pointers keep track of how many times variables points to the resource and will automatically destruct the resource when the count reaches zero. However, my understanding is that the garbage collector also manages memory resources, but requires additional resources to determine if an object is still being referred to and doesn't necessarily destruct the resource immediately.
Am I correct in my assumptions, and are there any other differences between using garbage collectors and shared pointers? Also, why would anyone ever used a garbage collector over a shared pointer if they perform similar tasks but with varying performance figures?
The main difference lies, as you noted, in when the resource is released/destroyed.
One advantage where a GC might come in handy is if you have resources that take a long time to be released. For a short program lifetime, it might be nice to leave the resources dangling and have them cleaned up in the end. If resource limits are reached, then the GC can act to release some of them. Shared pointers, on the other hand, release their resources as soon as the reference count hits zero. This could be costly for frequent acquisition-release cycles of a resource with costly time requirements.
On the other hand, in some garbage collection implementations, garbage collection requires that the whole program pause its execution while memory is examined, moved around, and freed. There are smarter implementations, but none are perfect.
Those Shared Pointers (usually called reference counting) run the risk of cycles.
Garbage collection (Mark and Sweep) does not have this problem.
In a simple garbage-collected system, nobody will hold a direct pointer to any object; instead, code will hold references to table entries which point to objects on the heap. Each object on the heap will store its size (meaning all heap objects will form a singly-linked list) and a back-reference to the object in the object table which holds it (or at least used to).
When either the heap or the object table gets full, the system will set a "delete me" flag on every object in the table. It will examine every object it knows about and, if its "delete flag" was set, unset it and add all the objects it knows about to the list of objects to be examined. Once that is done, any object whose "delete me" flag is still set can be deleted.
Once that is done, the system will start at the beginning of the heap, take each object stored there, and see if its object reference still points to it. If so, it will copy that object to the beginning of the heap, or just past the end of the last copied object; otherwise the object will be skipped (and will likely be overwritten when other objects are copied).
In languages with a garbage collector (GC), the GC keeps track of and cleans up memory that isn’t being used anymore, and we don’t need to think about it. In most languages without a GC, it’s our responsibility to identify when memory is no longer being used and to call code to explicitly free it, just as we did to request it.
more details: HERE

Memory (sbrk) 16-byte aligned shifting on pointer access

I wrote a reasonably basic memory allocator using sbrk. I ask for a chunk of memory, say 65k and carve it up as needed for variables requesting dynamic memory. I free the memory by adding it back to the 65k block. The 65k block is derived from a union sizeof(16-bytes). Then I align the block along an even 16-byte boundary. But I'm getting unusual behavior.
Accessing the memory appears fine as I allocate and begin to populate my data structures accept that on one of my function calls, I pass a pointer to a member variable in a global structure but the address of the pointer argument doesn't map directly to the address of that member.
For example, the real address of this particular member happens to be: 0x100313d50 but when executing a particular function (nothing special) the address of the member is being represented as 0x100313d70. Inside the debugger I can query the real address and it appears correct when inside the function where this manifests. This isn't the first member being accessed either, it's the third so two prior memory accesses are fine, but during the third access I'm seeing this unusual shifting.
Is it possible that I'm accessing this memory via a misaligned block? It's possible but I'd expect the get a SIGBUS exception thrown (SPARC chip). I'm compiling using -memalign=16s so it ought to SIGBUS instead of trapping and fixing the misalignment.
All of my structures are padded on a multiple of 16-bytes: sizeof(structure)%16 = 0. Has anyone had experience with this type of behavior? Generally speaking, what type of things/stuff/etc. might cause a pointer to misrepresent a memory address?
Cheers,
Tracy.
Solaris 10, SunStudio-12, C language on modern SPARC processor (in case this helps).
I figure I should answer my own question in the event someone else out there has a similar problem.
The reason why the memory address was shifting is because a prior call to a utility function accidentally overwrote the meta-address of the global structure thusly rewriting the meta-address of that block so lookups on that block were shifted even though the actual data still resided in the original block.
In simple words, I wrote past my buffer. Since I hand out memory from the tail, overwriting would blow away my much needed meta-address for my global structure (or whatever). Now I know what undefined behavior looks like.

Should I care about thread safe of static int (4 bytes) variable in ASP .NET

I have the feeling that I should not care about thread safe accessing / writing to an
public static int MyVar = 12;
in ASP .NET.
I read/write to this variable from various user threads. Let's suppose this variable will store the numbers of clicks on a certain button/link.
My theory is that no thread can read/write to this variable at the same time. It's just a simple variable of 4 bytes.
I do care about thread safe, but only for refference objects and List instances or other types that take more cycles to read/update.
I am wrong with my presumption ?
EDIT
I understand this depend of my scenario, but wasn't that the point of the question. The question is: it is right that can be written thread safe code with an (static int) variable without using lock keyword ?
It is my problem to write correct code. The answer seems to be: Yes, if you write correct and simple code, and not to much complicated, you can create thread safe functions without the need of lock keyword.
If one thread simply sets the value and another thread reads the value, then a lock is not necessary; the read and write are atomic. But if multiple threads might be updating it and are also reading it to do the update (e.g., increment), then you definitely do need some kind of synchronization. If only one thread is ever going to update it even for an increment, then I would argue that no synchronization is necessary.
Edit (three years later) It might also be desirable to add the volatile keyword to the declaration to ensure that reads of the value always get the latest value (assuming that matters in the application).
The concept of thread 'safety' is too vague to be meaningful unfortunately. If you're asking whether you can read and write to it from multiple threads without the program crashing during the operation, the answer is almost certainly yes. If you're also asking if the variable is guaranteed to either be the old value or the new value without ever storing any broken intermediate values, the answer for this data type is again almost certainly yes.
But if your question is "will my program work correctly if I access this from multiple threads", then the answer depends entirely on what your program is doing. For example, if you run the following pseudo code in 2 threads repeatedly in most programming languages, eventually you'll hit the assertion.
if MyVar >= 1:
MyVar = MyVar - 1
assert MyVar >= 0
Primitives like int are thread-safe in the sense that reads/writes are atomic. But as with most any type, it's left to you to do proper checking with more complex operations. For example, if (x > 0) x--; would be problematic in a multi-threaded scenario because x might change in between the if condition check and decrement.
A simple read or write on a field of 32 bits or less is always atomic. But you should provide your read/write code to make sure that it is thread safe.
Check out this post: http://msdn.microsoft.com/en-us/magazine/cc163929.aspx
It explains why you need to synchronize access to the integers in this scenario
Try Interlocked.Increment() or Interlocked.Add() and you'll be right. Your code complexity will be the same but you truly won't have to worry. If you're not worried about losing a few clicks in your counter, you can continue as you are.
Reading or writing integers is atomic. However, reading and then writing is not atomic. So, if you have one thread that writes and many that read, you may be able to get away without locks.
However, even though the operations are atomic, there are still potential multi-threading issues. In order for one thread to be guaranteed that another thread can see values it writes, you need a memory barrier. Otherwise, the compiler can optimize the code so that the variable stays in a register (or even optimize the operation away completely), so changes would be invisible from one thread to another.
You can establish a memory barrier explicitly (volatile or Thread.MemoryBarrier), or with the Interlocked class -- or with the lock statement (Monitor).

Resources