I'm actually trying to understand how data are stored and loaded using pointers with rust, but when I run this code:
#[cfg(test)]
mod tests{
fn get_pointer<T>(a:T) -> *const i32{
ptr::addr_of!(a)
}
#[test]
fn f(){
unsafe {
let a = 5;
let pointer = get_pointer(a);
let encoded = bincode::serialize(&(pointer as usize)).unwrap();
let decoded = bincode::deserialize::<usize>(&encoded[..]).unwrap() as *const i32;
let b = std::ptr::read(decoded);
assert_eq!(a, b);
}
}
}
The value stored in b become 0 instead of 5, and I cannot figure out why this happens and how to solve this.
I think that the problem occurs because the value of a is dropped after the function returns the pointer but I'm not sure if that's right
I think that the problem occurs because the value of a is dropped after the function returns the pointer but I'm not sure if that's right
Well yes, pretty much. The locals of a function only live for the extent of that function (which is why rustc will refuse compiling if you try to return a reference to function-local data), so get_pointer returns a dangling pointer, serializing and deserializing the pointer does quite literally nothing, and the ptr::read is UB:
Safety
Behavior is undefined if any of the following conditions are violated:
src must be valid for reads.
src must be properly aligned. Use read_unaligned if this is not the case.
src must point to a properly initialized value of type T.
Running the program using miri unambiguously flags the issue:
error: Undefined Behavior: pointer to alloc1416 was dereferenced after this allocation got freed
--> /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:703:9
|
703 | copy_nonoverlapping(src, tmp.as_mut_ptr(), 1);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ pointer to alloc1416 was dereferenced after this allocation got freed
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: inside `std::ptr::read::<i32>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:703:9
note: inside `main` at src/main.rs:12:17
--> src/main.rs:12:17
|
12 | let b = ptr::read(decoded);
| ^^^^^^^^^^^^^^^^^^
I'm actually trying to understand how data are stored and loaded using pointers with rust
Doing it that way is a terrible idea, you're stepping off way into UB land and that means all bets are off, what you observe (to the extent that you can observe anything) has no relations to defined semantics.
Related
Let's begin with a canonical example of Arc
use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
let msg = Arc::new(Mutex::new(String::new()));
let mut handles = Vec::new();
for _ in 1..10 {
let local_msg = Arc::clone(&msg);
handles.push(thread::spawn(move || {
let mut locked = local_msg.lock().unwrap();
locked.push_str("hello, world\n");
}));
}
for handle in handles {
handle.join().unwrap();
}
println!("{}", msg.lock().unwrap());
}
This compiles and runs as expected. Then I realized maybe the Mutex doesn't have to live on the heap and started wondering if I can get rid of Arc and just use a shared reference to a Mutex allocated on the stack. Here is my attempt
use std::sync::Mutex;
use std::thread;
fn main() {
let msg = Mutex::new(String::new());
let mut handles = Vec::new();
for _ in 1..10 {
let local_msg = &msg;
handles.push(thread::spawn(move || {
let mut locked = local_msg.lock().unwrap();
locked.push_str("hello, world\n");
}));
}
for handle in handles {
handle.join().unwrap();
}
println!("{}", msg.lock().unwrap());
}
This one doesn't compile, though
error[E0597]: `msg` does not live long enough
--> src/main.rs:8:25
|
8 | let local_msg = &msg;
| ^^^^ borrowed value does not live long enough
9 | handles.push(thread::spawn(move || {
| ______________________-
10 | | let mut locked = local_msg.lock().unwrap();
11 | | locked.push_str("hello, world\n");
12 | | }));
| |__________- argument requires that `msg` is borrowed for `'static`
...
20 | }
| - `msg` dropped here while still borrowed
error: aborting due to previous error
For more information about this error, try `rustc --explain E0597`.
error: could not compile `hello`
To learn more, run the command again with --verbose.
The compiler complains that local_msg doesn't have a 'static lifetime. Well, it doesn't, so the error makes sense. However, this implies the variable let local_msg = Arc::clone(&msg); in the first snippet has 'static lifetime, otherwise I should get a similar error.
Questions:
How could Arc::clone(&msg) get a 'static lifetime? The value it points to isn't known at compile-time, and could die before the whole program exits.
As a bonus, what about other heap-backed smart pointers like Box and Rc? Do they all have a 'static lifetime because the borrow checker ensures that as long as these pointers are visible, then the addresses they point to are always valid?
The thing the compiler is looking for is a lifetime bound. A lifetime bound of 'a doesn't mean “this type is a reference with lifetime 'a”, but rather “all of the references this type contains have lifetimes of at least 'a”.
(When a lifetime bound is written explicitly, it looks like where T: 'a.)
Thus, any type which does not contain any references (or rather, has no lifetime parameters) automatically satisfies the 'static lifetime bound. If T: 'static, then Arc<T>: 'static (and the same for Box and Rc).
How could Arc::clone(&msg) get a 'static lifetime? The value it points to isn't known at compile-time, and could die before the whole program exits.
It does not point to the value using a reference, so it's fine. The type of your value is Arc<Mutex<String>>; there are no lifetime parameters here because there are no references. If it were, hypothetically, Arc<'a, Mutex<String>> (a lifetime parameter which Arc doesn't actually have), then that type would not satisfy the bound.
The job of Arc (or Rc or Box) is to own the value it points to. Ownership is not a reference and thus not subject to lifetimes.
However, if you had the type Arc<Mutex<&'a str>> then that would not satisfy the bound, because it contains a reference which is not 'static.
I'm trying to accomplish an exercise "left to the reader" in the 2018 Rust book. The example they have, 10-15, uses the Copy trait. However, they recommend implementing the same without Copy and I've been really struggling with it.
Without Copy, I cannot use largest = list[0]. The compiler recommends using a reference instead. I do so, making largest into a &T. The compiler then complains that the largest used in the comparison is a &T, not T, so I change it to *largest to dereference the pointer. This goes fine, but then stumbles on largest = item, with complaints about T instead of &T. I switch to largest = &item. Then I get an error I cannot deal with:
error[E0597]: `item` does not live long enough
--> src/main.rs:6:24
|
6 | largest = &item;
| ^^^^ borrowed value does not live long enough
7 | }
8 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the anonymous lifetime #1 defined on the function body at 1:1...
I do not understand how to lengthen the life of this value. It lives and dies in the list.iter(). How can I extend it while still only using references?
Here is my code for reference:
fn largest<T: PartialOrd>(list: &[T]) -> &T {
let mut largest = &list[0];
for &item in list.iter() {
if item > *largest {
largest = &item;
}
}
largest
}
When you write for &item, this destructures each reference returned by the iterator, making the type of item T. You don't want to destructure these references, you want to keep them! Otherwise, when you take a reference to item, you are taking a reference to a local variable, which you can't return because local variables don't live long enough.
fn largest<T: PartialOrd>(list: &[T]) -> &T {
let mut largest = &list[0];
for item in list.iter() {
if item > largest {
largest = item;
}
}
largest
}
Note also how we can compare references directly, because references to types implementing PartialOrd also implement PartialOrd, deferring the comparison to their referents (i.e. it's not a pointer comparison, unlike for raw pointers).
first timer here,
The first NOTE in SliceTricks suggests that there is a potential memory leak problem when cutting or deleting elements in a slice of pointers.
Is the same true for a map? For example: https://play.golang.org/p/67cN0JggWY
Should we nil the entry before deleting from map? Like so:
m["foo"] = nil
What if we simply clear the map?
m = make(map[string]*myStruct)
Will the garbage collector still pick it up?
Thanks in advance
Checking the sources
Although this is not documented anywhere, checking the sources: runtime/hashmap.go, mapdelete() function:
558 func mapdelete(t *maptype, h *hmap, key unsafe.Pointer) {
// ...
600 memclr(k, uintptr(t.keysize))
601 v := unsafe.Pointer(uintptr(unsafe.Pointer(b)) + dataOffset + bucketCnt*uintptr(t.keysize) + i*uintptr(t.valuesize))
602 memclr(v, uintptr(t.valuesize))
// ...
618 }
As you can see, storage for both the key (line #600) and the value (line #602) are cleared / zeroed.
This means if any of the key or value was a pointer, or if they were values of complex types containing pointers, they are zeroed and therefore the pointed objects are no longer referenced by the internal data structures of the map, so there is no memory leak here.
When there is no more reference to a complete map value, then the complete memory area of the map will be garbage collected, and all the pointers included in keys and values are also not held anymore by the map; and if no one else has reference to the pointed objects, they will be garbage collected properly.
Constructing an example to prove this
We can also construct a test code which proves this without examining the sources:
type point struct {
X, Y int
}
var m = map[int]*point{}
func main() {
fillMap()
delete(m, 1)
runtime.GC()
time.Sleep(time.Second)
fmt.Println(m)
}
func fillMap() {
p := &point{1, 2}
runtime.SetFinalizer(p, func(p *point) {
fmt.Printf("Finalized: %p %+v\n", p, p)
})
m[1] = p
fmt.Printf("Put in map: %p %+v\n", p, p)
}
Output (try it on the Go Playground):
Put in map: 0x1040a128 &{X:1 Y:2}
Finalized: 0x1040a128 &{X:1 Y:2}
map[]
What does this do? It creates a *Point value (pointer to a struct), puts it in the map, and registers a function that should be called when this pointer becomes unreachable (using runtime.SetFinalizer()), and then deletes the entry containing this pointer. Then we call runtime.GC() to "force" an immediate garbage collection. I also print the map at the end just to make sure the whole map is not garbage collected due to some optimization.
The result? We see the registered function gets called, which proves the pointer was removed from the map as the result of the delete() call, because (since we had no other references to it) it was eligible for garbage collection.
No, there will not be any memory leaks when deleting from a map.
In case of slices, since a slice actually uses an underlying array, as long as the slice exists - even if it uses just one slot in that array - the pointer items inside the array can not get garbage collected.
"A slice describes a piece of an array" which implies the array needs to be there for the slice to exist and can not get collected by GC; as long as some code is pointing at the slice.
I try to run this code:
impl FibHeap {
fn insert(&mut self, key: int) -> () {
let new_node = Some(box create_node(key, None, None));
match self.min{
Some(ref mut t) => t.right = new_node,
None => (),
};
println!("{}",get_right(self.min));
}
}
fn get_right(e: Option<Box<Node>>) -> Option<Box<Node>> {
match e {
Some(t) => t.right,
None => None,
}
}
And get error
error: cannot move out of dereference of `&mut`-pointer
println!("{}",get_right(self.min));
^
I dont understand why I get this problem, and what I must use to avoid problem.
Your problem is that get_right() accepts Option<Box<Node>>, while it should really accept Option<&Node> and return Option<&Node> as well. The call site should be also changed appropriately.
Here is the explanation. Box<T> is a heap-allocated box. It obeys value semantics (that is, it behaves like plain T except that it has associated destructor so it is always moved, never copied). Hence passing just Box<T> into a function means giving up ownership of the value and moving it into the function. However, it is not what you really want and neither can do here. get_right() function only queries the existing structure, so it does not need ownership. And if ownership is not needed, then references are the answer. Moreover, it is just impossible to move the self.min into a function, because self.min is accessed through self, which is a borrowed pointer. However, you can't move out from a borrowed data, it is one of the basic safety guarantees provided by the compiler.
Change your get_right() definition to something like this:
fn get_right(e: Option<&Node>) -> Option<&Node> {
e.and_then(|n| n.right.as_ref().map(|r| &**r))
}
Then println!() call should be changed to this:
println!("{}", get_right(self.min.map(|r| &**r))
Here is what happens here. In order to obtain Option<&Node> from Option<Box<Node>> you need to apply the "conversion" to insides of the original Option. There is a method exactly for that, called map(). However, map() takes its target by value, which would mean moving Box<Node> into the closure. However, we only want to borrow Node, so first we need to go from Option<Box<Node>> to Option<&Box<Node>> in order for map() to work.
Option<T> has a method, as_ref(), which takes its target by reference and returns Option<&T>, a possible reference to the internals of the option. In our case it would be Option<&Box<Node>>. Now this value can be safely map()ped over since it contains a reference and a reference can be freely moved without affecting the original value.
So, next, map(|r| &**r) is a conversion from Option<&Box<Node>> to Option<&Node>. The closure argument is applied to the internals of the option if they are present, otherwise None is just passed through. &**r should be read inside out: &(*(*r)), that is, first we dereference &Box<Node>, obtaining Box<Node>, then we dereference the latter, obtaining just Node, and then we take a reference to it, finally getting &Node. Because these reference/dereference operations are juxtaposed, there is no movement/copying involved. So, we got an optional reference to a Node, Option<&Node>.
You can see that similar thing happens in get_right() function. However, there is also a new method, and_then() is called. It is equivalent to what you have written in get_right() initially: if its target is None, it returns None, otherwise it returns the result of Option-returning closure passed as its argument:
fn and_then<U>(self, f: |T| -> Option<U>) -> Option<U> {
match self {
Some(e) => f(e),
None => None
}
}
I strongly suggest reading the official guide which explains what ownership and borrowing are and how to use them, because these are the very foundation of Rust language and it is very important to grasp them in order to be productive with Rust.
I don't understand why rustc gives me this error error: use of moved value: 'f' at compile time, with the following code:
fn inner(f: &fn(&mut int)) {
let mut a = ~1;
f(a);
}
fn borrow(b: &mut int, f: &fn(&mut int)) {
f(b);
f(b); // can reuse borrowed variable
inner(f); // shouldn't f be borrowed?
// Why can't I reuse the borrowed reference to a function?
// ** error: use of moved value: `f` **
//f(b);
}
fn main() {
let mut a = ~1;
print!("{}", (*a));
borrow(a, |x: &mut int| *x+=1);
print!("{}", (*a));
}
I want to reuse the closure after I pass it as argument to another function. I am not sure if it is a copyable or a stack closure, is there a way to tell?
That snippet was for rustc 0.8. I managed to compile a different version of the code with the latest rustc (master: g67aca9c), changing the &fn(&mut int) to a plain fn(&mut int) and using normal functions instead of a closure, but how can I get this to work with a closure?
The fact of the matter is that &fn is not actually a borrowed pointer in the normal sense. It's a closure type. In master, the function types have been fixed up a lot and the syntax for such things has changed to |&mut int|—if you wanted a borrowed pointer to a function, for the present you need to type it &(fn (...)) (&fn is marked obsolete syntax for now, to help people migrating away from it, because it's a completely distinct type).
But for closures, you can then go passing them around by reference: &|&mut int|.