Memory management basics of Rust lang - pointers

I need some help understanding Rust basics, and memory basics in general. I think this will be helpful for those who come across this after me.
Memory address
(1) Here I create a function that returns a pointer to a vector. Why are there two different addresses if it must be the same pointer?
fn get_pointer(vec: &[i32]) -> &[i32] {
vec
}
fn main() {
let vector = vec![1, 2, 3, 4, 5];
let first_element = get_pointer(&vector);
let vector: Vec<String> = Vec::new();
println!(
"Vector: {:p}, and func result: {:p}",
&vector, first_element
);
}
Output: Vector: 0x5181aff7e8, and func result: 0x17e5bad99b0
Different addresses for one element?
(2) Now I'm creating a function that returns a pointer to the first element of a vector:
fn get_pointer_to_first_element(vec: &[i32]) -> &i32 {
&vec[0]
}
fn main() {
let vector = vec![1, 2, 3, 4, 5];
let first_element = get_pointer_to_first_element(&vector);
println!(
"from variable: {:p}, from function: {:p}",
&vector[0], &first_element
);
}
from variable: 0x1f9a887da80, from function: 0x15fe2ffb10
Variables defined in functions
(3) Okay, I understand that this question is even more stupid, but I really can't get it.
Here I create a pointer variable inside a function, and it must be deleted along with its "value" when the program leaves the function's scope, so why can I use that value after the function call?
Also, a bonus question about a value being deleted after shadowing: is that true? Why?
fn get_pointer_to_first_element(vec: &[i32]) -> &i32 {
let pointer = &vec[0];
pointer
} // Here the variable "pointer" and its value must be deleted, mustn't they? How can I then use it after calling the function?
fn main() {
let vector = vec![1, 2, 3, 4, 5];
let first_element = get_pointer_to_first_element(&vector);
let vector: Vec<String> = Vec::new(); // Bonus question: "vector" variable is new now, so previous must be deleted?
println!("first element is available and its {first_element}. ");
}
Vector: 0x5181aff7e8, and func result: 0x17e5bad99b0
I tried the same things with arrays, but the result is the same.

In order:
You allocated a new vector, so of course it has a different address. first_element points to the original vector; it was initialized with that value and keeps it. Even if you had not shadowed the variable, your function takes a slice as an argument, not a Vec. Those are different types. The Vec is itself a pointer (plus a length and a capacity) to data on the heap. The slice is a separate reference to, as the name suggests, a slice of that heap data.
fn main() {
let vector = vec![1, 2, 3];
println!(
"Normal: {:p}. Calculated /w a function: {:p}. Slice: {:p}.",
&vector,
get_vec_address(&vector),
get_slice_address(&vector)
);
// This outputs:
// Normal: 0x7fff91db1550. Calculated /w a function: 0x7fff91db1550. Slice: 0x55ddbc54a9d0.
}
fn get_vec_address(vec: &Vec<i32>) -> &Vec<i32> {
vec
}
/// `&vector` is coerced from `&Vec` into `&[i32]`.
/// To learn more about Deref coercions, go to:
/// https://doc.rust-lang.org/book/ch15-02-deref.html#implicit-deref-coercions-with-functions-and-methods
fn get_slice_address(slice: &[i32]) -> &[i32] {
slice
}
See this cheat sheet on containers for a visual explanation.
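As the comment says, you're taking a reference to a reference. first_element is already a &i32, so &first_element is the address of the local variable that holds the pointer, not the address of the element.
pointer isn't dropped. It's moved as the return value. It is only a reference; the data it points to belongs to the vector in main, which is still alive after the function returns. To get a better understanding of scope, you might want to review the chapters on scope and lifetimes.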
As for your bonus question, shadowing a variable doesn't drop it. It'll still live to the end of the scope. If there are existing references to it, it can still be used. For example, this is valid code:
fn main() {
let vector = vec![1, 2, 3];
let first_element = &vector[0];
let vector = vec![4, 5, 6];
println!("{first_element}");
println!("{vector:?}");
// This will print:
// 1
// [4, 5, 6]
}
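To make the three kinds of address from points (1) and (2) visible side by side, here is a minimal sketch. The helper names handle_address, buffer_address and first are hypothetical and exist only for this illustration; the concrete addresses printed will differ on every run.

```rust
/// Address of the `Vec` handle itself (pointer, length, capacity).
/// (`handle_address` is a hypothetical helper name.)
fn handle_address(vector: &Vec<i32>) -> usize {
    vector as *const Vec<i32> as usize
}

/// Address of the first element in the heap buffer, reached through a slice.
/// (`buffer_address` is a hypothetical helper name.)
fn buffer_address(slice: &[i32]) -> usize {
    slice.as_ptr() as usize
}

/// Returns a reference to the first element, like the function in question (2).
fn first(slice: &[i32]) -> &i32 {
    &slice[0]
}

fn main() {
    let vector = vec![1, 2, 3, 4, 5];
    let first_element = first(&vector);

    println!("handle:  {:#x}", handle_address(&vector));
    println!("buffer:  {:#x}", buffer_address(&vector));
    // `first_element` is already a pointer into the heap buffer.
    println!("element: {:p}", first_element);
    // `&first_element` is a `&&i32`: the address of the local variable.
    println!("local:   {:p}", &first_element);

    // The reference returned by the function points at the start of the buffer.
    assert_eq!(first_element as *const i32 as usize, buffer_address(&vector));
    // The handle and the buffer live in different places.
    assert_ne!(handle_address(&vector), buffer_address(&vector));
}
```

Printing first_element with {:p} and printing &vector[0] with {:p} should therefore give the same address, while &first_element gives the address of a stack variable.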

Related

How to perform a `flat_map` (or similar operation) on an iterator N times without runtime polymorphism?

I want to be able to repeat a process in which the collection we are iterating over is altered n times. n is only known at runtime and can be specified by the user, so we cannot hard-code it into the type.
An approach that uses intermediate data structures by collect-ing between iterations is possible, like so:
let n = 10;
let mut vec1 = vec![1, 2, 3];
{
for _index in 0..n {
let temp_vec = vec1.into_iter().flat_map(|x| vec![x, x * 2]).collect();
vec1 = temp_vec;
}
}
However, this seems wasteful, because we are creating intermediate data structures, so I went on looking for a solution that chains iterators directly.
At first I thought one could just do something like:
let mut iter = vec![1, 2, 3].into_iter();
for index in 0..n {
iter = iter.flat_map(|x| vec![x, x * 2].into_iter());
}
However, this does not work because in Rust, all functions on iterators return their own kind of 'compound iterator' struct. (In for instance Haskell, functions on iterators return the appropriate kind of result iterator, which does not become a 'bigger and bigger compound type'.)
Rewriting this as a recursive function had similar problems because (a) I was returning 'some kind of Iterator' whose type was (near?)-impossible to write out by hand because of the recursion, and (b) this type was different in the base case from the recursive case.
I found this question about conditionally returning either one or the other iterator type, as well as using impl Iterator to indicate that we return some concrete type that implements the Iterator trait, but we do not care about its exact nature.
A similar example to the code in the linked answer has been implemented in the code below as maybe_flatmap. This works.
However, I don't want to run flat_map zero or one time, but rather N times on the incoming iterator. Therefore, I adapted the code to call itself recursively up to a depth of N.
Attempting to do that makes the Rust compiler complain with error[E0720]: opaque type expands to a recursive type:
use either::Either; // 1.5.3
/// Later we want to work with any appropriate items,
/// but for simplicity's sake, just use plain integers for now.
type I = u64;
/// Works, but limited to single level.
fn maybe_flatmap<T: Iterator<Item = I>>(iter: T, flag: bool) -> impl Iterator<Item = I> {
match flag {
false => Either::Left(iter),
true => Either::Right(iter.flat_map(move |x| vec![x, x * 2].into_iter())),
}
}
/// Does not work: opaque type expands to a recursive type!
fn rec_flatmap<T: Iterator<Item = I>>(iter: T, depth: usize) -> impl Iterator<Item = I> {
match depth {
0 => Either::Left(iter),
_ => {
let iter2 = iter.flat_map(move |x| vec![x, x * 2]).into_iter();
Either::Right(rec_flatmap(iter2, depth - 1))
}
}
}
fn main() {
let xs = vec![1, 2, 3, 4];
let xs2 = xs.into_iter();
let xs3 = maybe_flatmap(xs2, true);
let xs4: Vec<_> = xs3.collect();
println!("{:?}", xs4);
let ys = vec![1, 2, 3, 4];
let ys2 = ys.into_iter();
let ys3 = rec_flatmap(ys2, 5);
let ys4: Vec<_> = ys3.collect();
println!("{:?}", ys4);
}
error[E0720]: opaque type expands to a recursive type
--> src/main.rs:16:65
|
16 | fn rec_flatmap<T: Iterator<Item = I>>(iter: T, depth: usize) -> impl Iterator<Item = I> {
| ^^^^^^^^^^^^^^^^^^^^^^^ expands to a recursive type
|
= note: expanded type is `either::Either<T, impl std::iter::Iterator>`
I am stuck.
Since, regardless of how often you flat_map, the final answer is going to be an (iterator over a) vector of integers, it seems like there ought to be a way of writing this function using only a single concrete return type.
Is this possible? Is there a way out of this situation without resorting to runtime polymorphism?
I believe/hope that a solution without dynamic polymorphism (trait objects or the like) is possible because regardless of how often you call flat_map the end result should (at least morally) have the same type. I hope there is a way to shoehorn the (non-matching) nested FlatMap struct into a matching single static type somehow.
Is there a way to resolve this without runtime polymorphism?
No.
To solve it using a trait object:
let mut iter: Box<dyn Iterator<Item = i32>> = Box::new(vec![1, 2, 3].into_iter());
for _ in 0..n {
iter = Box::new(iter.flat_map(|x| vec![x, x * 2].into_iter()));
}
regardless of how often you call flat_map the end result should (at least morally) have the same type
I don't know which morality to apply to type systems, but the literal size in memory is (very likely to be) different for FlatMap<...> and FlatMap<FlatMap<...>>. They are different types.
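The trait-object loop above can be wrapped in a function so that callers only ever see one return type. This is a minimal sketch, not a definitive implementation: the name flat_map_n and the fixed u64 item type are assumptions chosen for illustration.

```rust
/// Applies the same `flat_map` step `n` times, boxing after every step so that
/// the static type stays `Box<dyn Iterator<Item = u64>>` no matter how large `n` is.
/// (`flat_map_n` is a hypothetical helper name.)
fn flat_map_n(
    iter: impl Iterator<Item = u64> + 'static,
    n: usize,
) -> Box<dyn Iterator<Item = u64>> {
    let mut boxed: Box<dyn Iterator<Item = u64>> = Box::new(iter);
    for _ in 0..n {
        // Each round wraps the previous iterator; the new `FlatMap<...>` type
        // is erased again as soon as it is boxed.
        boxed = Box::new(boxed.flat_map(|x| vec![x, x * 2]));
    }
    boxed
}

fn main() {
    // `n` would normally come from user input at runtime.
    let n = 2;
    let result: Vec<u64> = flat_map_n(vec![1, 2, 3].into_iter(), n).collect();
    println!("{:?}", result);
}
```

The price is one heap allocation and one layer of dynamic dispatch per round, which is the runtime polymorphism the question hoped to avoid.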
See also:
Conditionally iterate over one of several possible iterators
Creating Diesel.rs queries with a dynamic number of .and()'s
How do I iterate over a Vec of functions returning Futures in Rust?
How can I extend the lifetime of a temporary variable inside of an iterator adaptor in Rust?
Why does Iterator::take_while take ownership of the iterator?

Not understanding how to access the elements of a vector in Rust [duplicate]

This is my first encounter with Rust, and I am reading the chapter on vectors in the current version of the Rust Book. I do have previous experience with other languages (mostly functional ones, where the following issues are hidden).
Running the following code snippet (from the book) returns 3:
fn main() {
let v = vec![1, 2, 3, 4, 5];
let third: &i32 = &v[2];
println!("{}", third);
}
The first thing that I do not understand is why the third inside the println! macro isn't dereferenced. I would have expected the above code to print the memory address of the 3rd element of v (as in C and C++), not its content.
Consider now the code (notice the dereference this time inside println!):
fn main() {
let v = vec![1, 2, 3, 4, 5];
let third: &i32 = &v[2];
println!("{}", *third);
}
Why does the code above produce exactly the same output as the one above it, as if the * made no difference?
Finally, let us rewrite the above code snippets eliminating references completely:
fn main() {
let v = vec![1, 2, 3, 4, 5];
let third: i32 = v[2];
println!("{}", third);
}
Why does this last version produce the same output as the previous two? And what type does v[2] really have: is it an &i32 or an i32?
Are all of the above a manifestation of the automatic dereferencing that is only once alluded to in a previous chapter? (If so, then the book should be rewritten, because it is more confusing than clarifying.)
Disclaimer: I'm learning Rust too, so please take this with a grain of salt.
To understand what happens, it might be easier with cargo-expand.
For the code
fn main() {
let v = vec![1, 2, 3, 4, 5];
let third: &i32 = &v[2];
println!("{}", third);
}
we get (I've removed the irrelevant code)
fn main() {
...
let third: ...
{
::io::_print(::std::fmt::Arguments::new_v1_formatted(
...
&match (&third,) {
(arg0,) => [::std::fmt::ArgumentV1::new(arg0, ::std::fmt::Display::fmt)],
},
...
));
};
}
for the first/last case, and
fn main() {
...
let third: ...
{
::io::_print(::std::fmt::Arguments::new_v1_formatted(
...
&match (&*third,) {
(arg0,) => [::std::fmt::ArgumentV1::new(arg0, ::std::fmt::Display::fmt)],
},
...
));
};
}
for the second case.
Which, roughly, means that, for the {} formatter, the function fmt (of the trait Display) will be called with a reference to third or to *third, respectively.
Let's apply this logic to each case:
second case: third: &i32, so *third: i32; this is where impl Display for i32 applies.
first case: third: &i32; this also works because of impl<'_, T> Display for &'_ T (where T is i32), which forwards to the Display impl of T.
last case: third: i32; the expansion is the same as in the first case, but since third is already an i32, impl Display for i32 applies directly. Moreover, v[2] (which is of type i32) works because of impl Index for Vec (note that let third = v[2] works because of impl Copy for i32, i.e., copy semantics are applied for = instead of the default move semantics).
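The three cases can be checked without cargo-expand. The following is a small sketch under the assumption that a generic helper is an acceptable stand-in for println!; the name show is hypothetical.

```rust
use std::fmt::Display;

/// Formats any `Display` value with `{}`. (`show` is a hypothetical helper.)
fn show<T: Display>(value: T) -> String {
    format!("{}", value)
}

fn main() {
    let v = vec![1, 2, 3, 4, 5];
    let third: &i32 = &v[2];
    let copied: i32 = v[2]; // `i32` is `Copy`, so this copies the element out

    // All of these go through `Display` and end up printing the integer.
    assert_eq!(show(third), "3"); // T = &i32, forwards to the i32 impl
    assert_eq!(show(*third), "3"); // T = i32
    assert_eq!(show(copied), "3"); // T = i32
    assert_eq!(show(&&third), "3"); // extra layers of `&` are forwarded too

    // To see the address instead, ask for pointer formatting explicitly.
    println!("address of the third element: {:p}", third);
}
```

Unlike in C or C++, formatting a reference with {} never prints an address; {:p} has to be requested explicitly.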

What is the best way to repeat the elements in a vector in Rust?

I found this way, but it seems too verbose for such a common action:
fn double_vec(vec: Vec<i32>) -> Vec<i32> {
let mut vec1 = vec.clone();
let vec2 = vec.clone();
vec1.extend(vec2);
vec1
}
I know that in JavaScript it could be just arr2 = [...arr1, ...arr1].
"Doubling a vector" isn't something that's really done very often so there's no shortcut for it. In addition, it matters what is inside the Vec because that changes what operations can be performed on it. In this specific example, the following code works:
let x = vec![1, 2, 3];
let y: Vec<_> = x.iter().cycle().take(x.len() * 2).collect();
println!("{:?}", y); //[1, 2, 3, 1, 2, 3]
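The cycle() method requires that the iterator itself implement the Clone trait so that it can be restarted once it is exhausted. x.iter() returns a slice iterator over references, which is Clone regardless of the item type, so this works for any Vec; note that the result is a Vec of references (here Vec<&i32>). Whether the items are Clone matters when you iterate by value: the iterator returned by into_iter() on a Vec is only Clone if the items are. Since immutable references (&) implement Clone, a Vec<&Something> can be cycled by value, but mutable references (&mut) do not implement Clone, and thus a Vec<&mut Something> cannot.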
Note that even if a type does not implement Clone, you can still clone references to that type:
struct Test;
fn test_if_clone<T: Clone>(_x: T) {}
fn main() {
let x = Test;
test_if_clone(x); //error[E0277]: the trait bound `Test: std::clone::Clone` is not satisfied
let y = &x;
test_if_clone(y); //ok
}
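If owned values are wanted instead of references, the cycle approach can be followed by cloned(). This is a minimal sketch assuming the element type implements Clone; the function name double_owned is hypothetical.

```rust
/// Repeats `items` twice, cloning each element so the result owns its data.
/// (`double_owned` is a hypothetical name.)
fn double_owned<T: Clone>(items: &[T]) -> Vec<T> {
    items
        .iter()
        .cycle()
        .take(items.len() * 2)
        .cloned() // turns `&T` items into owned `T` values
        .collect()
}

fn main() {
    let x = vec![String::from("a"), String::from("b")];
    let y = double_owned(&x);
    println!("{:?}", y);
    // `x` is still usable because it was only borrowed.
    println!("{:?}", x);
}
```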
You can use the concat method for this; it's simple:
fn double_vec(v: Vec<i32>) -> Vec<i32> {
[&v[..], &v[..]].concat()
}
Unfortunately we have to make the vectors slices explicitly (here &v[..]); but otherwise this method is good because it allocates the result to the needed size directly and then does the copies.
Building on Wesley's answer, you can also use chain to glue two iterables together, one after the other. In the below example I use the same Vec's iter() method twice:
let x = vec![1, 2, 3];
let y: Vec<_> = x.iter().chain(x.iter()).collect();
println!("{:?}", y); //[1, 2, 3, 1, 2, 3]
The iterator methods are likely to be a lot less efficient than the straight memcpy that vector extension is.
Your own code does one clone too many; you can just reuse the by-value input:
fn double_vec(mut vec: Vec<i32>) -> Vec<i32> {
let clone = vec.clone();
vec.extend(clone);
vec
}
However, the nature of a Vec means this is likely to require a copy even if you managed to remove that clone, so you're not generally gaining much over just using concat.
Using concat on slices is fairly efficient, as it will preallocate the Vec in advance and then perform an efficient extend_from_slice. However, this does mean it's no longer particularly sensible to take a Vec as input; writing the following is strictly more flexible.
fn double_slice(slice: &[i32]) -> Vec<i32> {
[slice, slice].concat()
}
Since Rust 1.53, Vec::extend_from_within makes it possible to more efficiently double a vector:
fn double_vec(vec: &mut Vec<i32>) {
vec.extend_from_within(..);
}
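As a quick sanity check that the approaches discussed above agree, the sketch below puts three of them next to each other. The function names are illustrative only, and it says nothing about relative performance.

```rust
/// Doubling by concatenating two views of the same slice.
fn double_concat(slice: &[i32]) -> Vec<i32> {
    [slice, slice].concat()
}

/// Doubling in place (requires Rust 1.53 or later).
fn double_in_place(vec: &mut Vec<i32>) {
    vec.extend_from_within(..);
}

/// Doubling via iterator chaining, copying each element.
fn double_chain(slice: &[i32]) -> Vec<i32> {
    slice.iter().chain(slice.iter()).copied().collect()
}

fn main() {
    let original = vec![1, 2, 3];

    let mut in_place = original.clone();
    double_in_place(&mut in_place);

    // All three approaches produce the same contents.
    assert_eq!(double_concat(&original), in_place);
    assert_eq!(double_chain(&original), in_place);
    println!("{:?}", in_place);
}
```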

Build HashSet from a vector in Rust

I want to build a HashSet<u8> from a Vec<u8>. I'd like to do this
in one line of code,
copying the data only once,
using only 2n memory,
but the only thing I can get to compile is this piece of .. junk, which I think copies the data twice and uses 3n memory.
fn vec_to_set(vec: Vec<u8>) -> HashSet<u8> {
let mut victim = vec.clone();
let x: HashSet<u8> = victim.drain(..).collect();
return x;
}
I was hoping to write something simple, like this:
fn vec_to_set(vec: Vec<u8>) -> HashSet<u8> {
return HashSet::from_iter(vec.iter());
}
but that won't compile:
error[E0308]: mismatched types
--> <anon>:5:12
|
5 | return HashSet::from_iter(vec.iter());
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected u8, found &u8
|
= note: expected type `std::collections::HashSet<u8>`
= note: found type `std::collections::HashSet<&u8, _>`
.. and I don't really understand the error message, probably because I need to RTFM.
Because the operation does not need to consume the vector¹, I think it should not consume it. That only leads to extra copying somewhere else in the program:
use std::collections::HashSet;
use std::iter::FromIterator;
fn hashset(data: &[u8]) -> HashSet<u8> {
HashSet::from_iter(data.iter().cloned())
}
Call it like hashset(&v) where v is a Vec<u8> or other thing that coerces to a slice.
There are of course more ways to write this, to be generic and all that, but this answer sticks to just introducing the thing I wanted to focus on.
¹This relies on the element type u8 being Copy, i.e., it does not have ownership semantics.
The following should work nicely; it fulfills your requirements:
use std::collections::HashSet;
use std::iter::FromIterator;
fn vec_to_set(vec: Vec<u8>) -> HashSet<u8> {
HashSet::from_iter(vec)
}
from_iter() works on types implementing IntoIterator, so a Vec argument is sufficient.
Additional remarks:
you don't need to explicitly return function results; you only need to omit the semi-colon in the last expression in its body
I'm not sure which version of Rust you are using, but on current stable (1.12) to_iter() doesn't exist
Converting Vec to HashSet
Moving data ownership
let vec: Vec<u8> = vec![1, 2, 3, 4];
let hash_set: HashSet<u8> = vec.into_iter().collect();
Cloning data
let vec: Vec<u8> = vec![1, 2, 3, 4];
let hash_set: HashSet<u8> = vec.iter().cloned().collect();
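Put together as a complete program, the two variants might look like the following sketch. The function names set_from_vec and set_from_slice are hypothetical, and copied() is used in place of cloned(), which is equivalent for a Copy type such as u8.

```rust
use std::collections::HashSet;

/// Consumes the vector; each element is moved into the set.
fn set_from_vec(vec: Vec<u8>) -> HashSet<u8> {
    vec.into_iter().collect()
}

/// Borrows the data; each `u8` is copied into the set.
fn set_from_slice(data: &[u8]) -> HashSet<u8> {
    data.iter().copied().collect()
}

fn main() {
    let v: Vec<u8> = vec![1, 2, 2, 3, 3, 3];

    let borrowed = set_from_slice(&v);
    let owned = set_from_vec(v); // `v` is moved here and cannot be used afterwards

    // Both sets hold the same distinct values.
    assert_eq!(borrowed, owned);
    assert_eq!(owned.len(), 3);
    println!("{} distinct values", owned.len());
}
```

Neither variant needs the intermediate clone or drain from the question.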

Reference to a vector still prints as a vector?

Silly n00b trying to learn a bit about Rust. Here is my program:
fn main() {
let v = vec![1, 2, 3];
println!("{:?}", v);
println!("{:?}", &v);
}
Produced the output:
[1, 2, 3]
[1, 2, 3]
What is the point of the &? I was half expecting it to print a memory address.
I was originally thrown by this in the intro where it looks like they are looping through a reference. My guess is that Rust does some magic and detects it is a memory address of a vector?
What is the point of the &?
The & takes the reference of an object, as you surmised. However, there's a Debug implementation for references to Debug types that just prints out the referred-to object. This is done because Rust tends to prefer value equality over reference equality:
impl<'a, T: ?Sized + $tr> $tr for &'a T {
fn fmt(&self, f: &mut Formatter) -> Result { $tr::fmt(&**self, f) }
}
If you'd like to print the memory address, you can use {:p}:
let v = vec![1,2,3];
println!("{:p}", &v);
it looks like they are looping through a reference
The for i in foo syntax sugar calls into_iter on foo, and there's an implementation of IntoIterator for &Vec that returns an iterator of references to the items in the vector:
fn into_iter(self) -> slice::Iter<'a, T> {
self.iter()
}
The magic is AFAIK in the formatter rather than the compiler. See for example:
fn take_val<T>(a:Vec<T> ) {}
fn take_ref<T>(b:&Vec<T>) {}
fn main() {
let v = vec![1, 2, 3];
take_val(&v);
take_ref(&v);
}
Fails with following error:
<anon>:6:14: 6:16 error: mismatched types:
expected `collections::vec::Vec<_>`,
found `&collections::vec::Vec<_>`
(expected struct `collections::vec::Vec`,
found &-ptr) [E0308]
<anon>:6 take_val(&v);
Which suggests this is due to the formatter not wanting to show a difference between a reference and a value. In older versions of Rust, &v would have been shown as &[1, 2, 3], if my memory serves me correctly.
& has special meaning in Rust. It's not just a reference; it's a note that the value is borrowed by one or more functions/methods.
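Both points from the answers, the forwarding Debug implementation and looping over a reference, can be seen in a few lines. This is only a sketch; sum_by_ref is a hypothetical helper, and it takes &Vec<i32> rather than a slice purely to mirror the question.

```rust
/// Sums the elements while only borrowing the vector.
fn sum_by_ref(v: &Vec<i32>) -> i32 {
    let mut total = 0;
    // `for x in v` uses `IntoIterator for &Vec<i32>`, so `x` is a `&i32`.
    for x in v {
        total += *x;
    }
    total
}

fn main() {
    let v = vec![1, 2, 3];

    // `Debug` for `&T` forwards to `T`, so both strings are identical.
    assert_eq!(format!("{:?}", v), format!("{:?}", &v));

    // The vector is still usable after the loop because it was only borrowed.
    assert_eq!(sum_by_ref(&v), 6);
    println!("{:?} lives at {:p}", v, &v);
}
```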

Resources