After reading several articles about the heap and the stack (Rust-lang), I learned that non-primitive types / data structures are usually located on the heap, leaving a pointer on the stack that holds the address where the specific object is located on the heap.
Heap values are referenced by a variable on the stack, which contains the memory address of the object on the heap. [Rust Essentials, Ivo Balbaert]
Considering the following example:
struct Point {
x: u32,
y: u32,
}
fn main() {
let point = Point { x: 8, y: 16 };
// Is this address the value of the pointer at the stack, which points to
// the point-struct allocated in the heap, or is the printed address the
// heap-object's one?
println!("The struct is located at the address {:p}", &point);
}
In my case, the output was:
The struct is located at the address 0x24fc58
So, is 0x24fc58 the value (address) the stack-reference points to, or is it the direct memory-address where the struct-instance is allocated in the heap?
Some additional little questions:
Is this a "raw-address", or the address relative to the program's address-space?
Is it possible to initialize a pointer by directly passing a hex address?
Is it possible to access memory-addresses which don't lay in the program's address-space?
Your Point actually resides on the stack – there is no Box or other structure to put it on the heap.
Yes, it is possible to turn an address into a raw (bare) pointer (*const T or *mut T) and then convert that into a reference (&T). The conversion to a reference is unsafe, because references are guaranteed to be non-null and to point to valid data.
As such, it is of course possible (though wildly unsafe) to access off-heap memory, as long as the underlying system lets you do it (most current systems will probably just kill your process with a Segmentation Fault).
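As a rough illustration of the points above, here is a minimal sketch (not part of the original answer). It prints the address of a plain struct, which lives on the stack, next to the addresses involved when the same struct is put in a Box, and it round-trips a reference through an integer address. The helper read_x_via_address and the variable names are made up for this example.

```rust
struct Point {
    x: u32,
    y: u32,
}

// Hypothetical helper: round-trips a reference through a plain integer address.
fn read_x_via_address(p: &Point) -> u32 {
    let raw: *const Point = p; // reference -> raw pointer (safe)
    let addr = raw as usize; // raw pointer -> integer address (safe)
    let rebuilt = addr as *const Point; // integer -> raw pointer (safe)
    // Dereferencing is the unsafe step: we must know the address is valid.
    unsafe { (*rebuilt).x }
}

fn main() {
    let on_stack = Point { x: 8, y: 16 };
    let on_heap = Box::new(Point { x: 1, y: 2 });

    // Address of the struct itself; no heap allocation is involved.
    println!("stack Point lives at    {:p}", &on_stack);
    // Address of the Box variable (a pointer stored on the stack).
    println!("the Box itself lives at {:p}", &on_heap);
    // Address the Box points to (the heap allocation).
    println!("the boxed Point is at   {:p}", &*on_heap);

    println!("x read through an address: {}", read_x_via_address(&on_stack));
    println!("y values: {} and {}", on_stack.y, on_heap.y);
}
```

The exact addresses differ between runs and platforms; typically the first two are close together (both on the stack) while the third is in a different region.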
Related
I'm going through this Rust tutorial - https://doc.rust-lang.org/book/ch02-00-guessing-game-tutorial.html - and came across this block of code:
let mut guess = String::new();
io::stdin()
.read_line(&mut guess)
.expect("Failed to read line");
My confusion is why we need to pass a reference to the guess variable, as opposed to just the variable itself. Is there a reason it was designed this way?
In my understanding, guess is a pointer which holds a memory address. Then, if guess is dereferenced like so *guess, this will return the value at the memory address where the String is held.
So, it seems like the read_line function would only need the address of the String to read to. Ie, called like: read_line(guess) (or read_line(mut guess)).
I'm confused why this isn't possible, and why read_line is defined to take the reference to a String, which is the address of a 'pointer' (?) as opposed to just the String (pointer) itself.
Values of type String own the memory holding the characters — they do contain a pointer to heap memory, and when they are dropped, they deallocate that memory.
If you pass a String to a function, you're moving the String and thereby transferring ownership of that memory. Then, at the end of the function, the String and its memory will be discarded unless the function returns the String value back to the caller:
fn moving_read_line(self, string: String) -> std::io::Result<(String, usize)> { ... }
This is less convenient and less flexible (for the caller) than accepting a mutable reference, which does not transfer ownership, only “borrows” it.
The variable guess is not itself a pointer but a struct that contains a pointer to some heap memory, along with the length of the string and the capacity of that allocation. If you dereference a String, you get a string slice (str), which is described by a pointer to the underlying memory and the size of the window into that memory, but that pointer and size cannot be modified through the slice. The slice does not own the memory it refers to. To change the length or allocate new underlying memory for the String, you need a mutable reference to the String itself, hence the &mut.
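To make the difference between borrowing and moving concrete, here is a small sketch. The functions append_borrowed and append_moved are hypothetical stand-ins for read_line (they append a fixed string instead of reading from stdin), so this only shows the ownership pattern, not the real I/O API.

```rust
// Borrowing: the caller keeps ownership of the String.
fn append_borrowed(buf: &mut String) -> usize {
    let before = buf.len();
    buf.push_str("42\n"); // may reallocate; the caller's String is updated in place
    buf.len() - before
}

// Moving: ownership goes in, so the String must be handed back to stay usable.
fn append_moved(mut buf: String) -> (String, usize) {
    let before = buf.len();
    buf.push_str("42\n");
    let added = buf.len() - before;
    (buf, added)
}

fn main() {
    let mut guess = String::new();

    // `guess` is still ours after the call.
    let n = append_borrowed(&mut guess);
    println!("borrowed: {:?} ({} bytes added)", guess, n);

    // Here `guess` is moved in and must be rebound from the return value.
    let (guess, n) = append_moved(guess);
    println!("moved:    {:?} ({} bytes added)", guess, n);
}
```

With the borrowed version the caller can reuse the same buffer across many calls; with the moved version every call has to thread the String back out through the return value.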
I've taken this picture and code from The Rust Book.
Why does s point to s1 rather than just the data on the heap itself?
If that is how it works, how does s point to s1? Is it allocated memory with a ptr field that contains the memory address of s1? And does s1, in turn, point to the data?
In s1, I appear to be looking at a variable with a pointer, length, and capacity. Is only the ptr field the actual pointer here?
This is my first systems level language, so I don't think comparisons to C/C++ will help me grok this. I think part of the problem is that I don't quite understand what exactly pointers are and how the OS allocates/deallocates memory.
fn main() {
let s1 = String::from("hello");
let len = calculate_length(&s1);
println!("The length of '{}' is {}.", s1, len);
}
fn calculate_length(s: &String) -> usize {
s.len()
}
Memory is just a huge array that can be indexed by any offset (e.g. a u64). This offset is called an address, and a variable that stores an address is called a pointer.
However, usually only a small part of memory is allocated, so not every address is meaningful (or valid).
Allocation is a request to make a (contiguous) range of addresses meaningful to the program (so that it can access and modify them).
Every object (and by object I mean a value of any type) is located in allocated memory (because non-allocated memory is meaningless to the program).
A reference is a pointer that the compiler guarantees to be valid (i.e. derived from the address of some object known to the compiler). Take a look at the std documentation as well.
Here is an example of these concepts (playground):
use std::mem::MaybeUninit;

fn main() {
    // In a real program the memory is defined implicitly; here it is made
    // explicit for the sake of the example. A real address space would span
    // `usize::MAX` bytes; a small array is used so that this compiles and runs.
    let mut memory = [MaybeUninit::<u8>::uninit(); 4096];
    // Every value of type `usize` can be treated as an address.
    const SOME_ADDR: usize = 1234;
    // Any address can be safely bound to a raw pointer,
    // which *may* point to either valid or invalid memory.
    let ptr = SOME_ADDR as *const u8;
    // Knowing an offset, you can find the matching location in our memory.
    let other_ptr: *mut u8 = unsafe { memory.as_mut_ptr().add(SOME_ADDR) }.cast();
    // Oversimplified allocation; in real life the OS hands out a block of memory.
    unsafe { other_ptr.write(15) };
    // Now it is *meaningful* (i.e. there is no undefined behavior) to make a reference.
    let refr: &u8 = unsafe { &*other_ptr };
    println!("ptr = {:p}, other_ptr = {:p}, value = {}", ptr, other_ptr, refr);
}
I hope that clarifies most things, but let's cover the questions explicitly.
Why does s point to s1 rather than just the data on the heap itself?
s is a reference (i.e. a valid pointer), so it holds the address of s1. A compiler may (and probably will) optimize it away so that s1 is accessed directly, but logically s remains a separate object that points to s1.
How does the s point to s1. Is it allocated memory with a ptr field that contains the memory address of s1.
The chain of "pointing" still persists, so calling s.len() is conceptually converted to s.deref().len, and accessing some byte of the string is converted to s.deref().ptr.add(index).deref().
There are 3 blocks of memory that are displayed on the picture: &s, &s1, s1.ptr are different (unless optimized) memory addresses. And all of them are stored in the allocated memory. The first two are actually stored at pre-allocated (i.e. before calling main function) memory called stack and usually it is not called an allocated memory (the practice I ignored in this answer though). The s1.ptr pointer, in contrast, points to the memory that was allocated explicitly by a user program (i.e. after entering main).
In s1, I appear to be looking at a variable with a pointer, length, and capacity. Is only the ptr field the actual pointer here?
Yes, exactly. Length and capacity are just common unsigned integers.
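The chain described above can be observed with a short sketch based on the question's code. The helper points_at is made up for this example; the rest uses only standard String methods.

```rust
fn calculate_length(s: &String) -> usize {
    s.len()
}

// Hypothetical helper: does the reference hold exactly the address of `original`?
fn points_at(s: &String, original: *const String) -> bool {
    std::ptr::eq(s, original)
}

fn main() {
    let s1 = String::from("hello");
    let s = &s1;

    // Where the (ptr, len, capacity) triple is stored.
    println!("s1 lives at         {:p}", &s1);
    // The address stored inside the reference `s`.
    println!("s holds the address {:p}", s);
    // The value of s1's internal ptr field: where the bytes are.
    println!("s1's bytes live at  {:p}", s1.as_ptr());
    // Length and capacity are plain unsigned integers.
    println!("len = {}, capacity = {}", s1.len(), s1.capacity());

    println!("s points at s1: {}", points_at(s, &s1));
    println!("length via s: {}", calculate_length(s));
}
```

The first two printed addresses are the same (s holds the address of s1), while the third is a different address, the one of the heap buffer.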
The tutorial says:
The type *T is a pointer to a T value. The & operator generates
a pointer to its operand.
I am just playing around with pointers in Go and have following:
example := 42
p:=&example
fmt.Println(reflect.TypeOf(&p)) // **int
fmt.Println(reflect.TypeOf(*p)) // int
So if I got it correctly, &p is a pointer to a pointer to an int value.
What is use of **Type in the Go language?
Here's a simple demonstration of the concept of a chain of pointers:
package main
import "fmt"
func main() {
i := 42
fmt.Printf("i: %[1]T %[1]d\n", i)
p := &i
fmt.Printf("p: %[1]T %[1]p\n", p)
j := *p
fmt.Printf("j: %[1]T %[1]d\n", j)
q := &p
fmt.Printf("q: %[1]T %[1]p\n", q)
k := **q
fmt.Printf("k: %[1]T %[1]d\n", k)
}
Playground: https://play.golang.org/p/WL2M1jp1T3
Output:
i: int 42
p: *int 0x10410020
j: int 42
q: **int 0x1040c130
k: int 42
A pointer allows you to pass around a memory address, so that multiple scopes can use the same address, and you can change the value at that address without changing the address; effectively allowing you to share memory. A pointer to a pointer allows you to pass around the address to a memory address, so that multiple scopes can use it and you can change the address pointed to by the shared reference. With a normal pointer, if you change the address of the pointer, any other copies of that pointer held elsewhere will become "disconnected" - they will no longer point to the same value.
For example, you might have two variables being operated on in separate workers, and a central reference you want to be able to switch back and forth between them. A pointer to a pointer is one way to achieve this; the central reference can be changed to point to the pointer used by any of the workers. Each worker would hold a pointer to a value which it would operate on normally, without needing to know if the central reference points to its pointer or not.
Or, as @Volker noted, the canonical example of the linked list. Here is an example in C, but the pointer logic is the same in Go.
Yes, you got it correctly.
As to "what use", it's the same as everywhere else: you use a pointer in these cases:
A variable has to be changed in some other code —
typically another function, — and so you pass a pointer to the memory
occupied by that variable to that function so it's able to update
that memory via that address.
A value is too large to be passed around fast enough by copying it.
A pointer to a pointer is a bit of a pathological case for Go,
but still this can be used in the first case: when you want some function
to change the value of a pointer variable your code controls.
If I want to share something like a char **keys array between fork()'d processes using shm_open and mmap can I just stick a pointer to keys into a shared memory segment or do I have to copy all the data in keys into the shared memory segment?
All data you want to share has to be in the shared segment. This means that both the pointers and the strings have to be in the shared memory.
Sharing something which includes pointers can be cumbersome. This is because mmap doesn't guarantee that a given mapping will end up in the required address.
You can still do this, by two methods. First, you can try your luck with mmap and hope that the dynamic linker doesn't load something at your preferred address.
Second method is to use relative pointers. Inside a pointer, instead of storing a pointer to a string, you store the difference between the address of the pointer and the address of the string. Like so:
char **keys= mmap(NULL, ...);
char *keydata= (char*) keys + npointers * sizeof(char*);
strcpy(keydata, firststring);
keys[0]= (char*) (keydata - (char*) &keys[0]);
keydata+= strlen(firststring)+1;
When you want to access the string from the other process, you do the reverse:
char **keys= mmap(NULL, ...);
char *str= (char*) (&keys[0]) + (ptrdiff_t) keys[0];
It's a little cumbersome but it works regardless of what mmap returns.
I'm having some trouble copying pointers' contents. I'm simply trying this:
char* vigia1;
char* vigia2;
And..
char* aux = (char*) malloc (strlen (vigia1)+1);
aux=vigia1;
vigia1=vigia2;
vigia2=aux;
free (aux);
vigia1, vigia2 are pointers to a char pointer. They both have a malloc greater than their maximum possible size, that's OK.
Since I'm trying to make an order for a list, I need to make this change to order the nodes' content. But I'm getting confused: after the free(aux) , vigia2 doesn't have any value. I think I must be pointing vigia2 to the memory region where aux is, region that 'disappear' after the free. So what should I do?
Thanks!
Pointers, pointers, bad with them, worse without them
A pointer is a number that stores where in memory something is stored. With that in mind, let's delve into what you've done there:
char* aux = (char*) malloc (strlen (vigia1)+1);
Good, you've created space somewhere in a part of the memory called heap, and stored the address of the newly created memory space at aux.
aux=vigia1;
Oops, now you've overwritten the address of the memory space you've "created" with the number stored in vigia1, which happens to be the address of another memory space.
vigia1=vigia2;
Now you're assigning to vigia1 the value of vigia2, another address of some memory space out there.
vigia2=aux;
And, by the end of it, you make vigia2 point to the memory region previously pointed by vigia1.
free (aux);
Now, you're freeing the memory pointed by aux. Wait a second, on the line above this one you've just made vigia2 point to this same address. No wonder it holds nothing useful :)
Trying to help you with what you want to do:
As long as you don't have any constraint that obliges you to maintain your list nodes ordered in memory, you don't need to copy the content of the node; just make the pointer of the first node point to the memory region of the second node.
A perfect swap would be:
char *aux; // you'll need an aux to make the swap, the normal stuff
aux = vigia1; // now aux points to the same address as vigia1
vigia1 = vigia2; // vigia1 now points to the contents of vigia2
vigia2 = aux; // and now vigia2 points to the content pointed previously by vigia1
/* and tada! the swap is done :D */
Assigning one pointer to another simply copies the value of one pointer to another, i.e., an address. It doesn't copy what the pointer refers to into another location at a different address. So yes, you can have N pointers all pointing to the same chunk of memory, but once free() is called on one of them, they are all invalid.
So that means that this:
char* aux = (char*) malloc (strlen (vigia1)+1);
aux=vigia1;
Is a memory leak. You malloc'd some memory for aux and then immediately discarded the address. There is no way to get it back, no way to free() it any longer.
What you are doing is just pointer assignment. The malloc'ed memory is simply wasted, causing a leak.
aux=vigia1; // Makes aux point to the location where vigia1 is pointing to
// Doesn't copy the contents of vigia1 to malloc'ed memory for aux
You need to make a deep copy using strcpy.
strcpy(aux, vigia1);
Hope this gives you the hint.