I want to write an initialiser for the following struct.
struct Foo {
bar: &Bar
}
It's recommended to use &T over Box<T> for flexibility and that's what I'm going for here. Without an initialiser you'd use the struct like this.
{
let bar = ...;
let foo = Foo { bar: bar };
// use foo
// dealloc bar and foo
}
This works. But I want to allocate &Bar in the initialiser. Now obviously allocating bar on the stack will not work because it goes out of scope once the initialiser returns. So I thought I could use Box.
fn new() -> Foo {
let bar = Box::new(...);
Foo { bar: &*bar }
}
This does not work either. I guess that's because we're only borrowing the value instead of transferring ownership, so bar is still deallocated once new returns.
Am I forced to use a Box in the struct in this case?
EDIT
Note: The reason the reference is needed is because Bar is actually a generic trait in my case and thus the size can vary which means allocation on the stack won't work.
Your question doesn't really make sense. If you are constructing the object in your new method, then by definition you know what the type is (because you are calling that constructor), and you don't need to treat it as a trait object. You should just use the type!
The reason the reference is needed is because Bar is actually a generic trait in my case and thus the size can vary which means allocation on the stack won't work.
This isn't completely true! If you wanted to accept a parameter, and you want to transfer ownership, then you can simply restrict the type to the trait you wish:
trait Talker { fn talk(&self); }
struct Dog;
impl Talker for Dog { fn talk(&self) { println!("Woof") }}
struct Cat;
impl Talker for Cat { fn talk(&self) { println!("Meow") }}
struct OwnAGeneric<T: Talker> {
t: T
}
impl<T: Talker> OwnAGeneric<T> {
fn new(t: T) -> OwnAGeneric<T> { OwnAGeneric { t: t } }
fn talk(&self) { println!("I own this:"); self.t.talk(); }
}
fn main() {
let owned_cat = OwnAGeneric::new(Cat);
owned_cat.talk();
}
This should be monomorphized by the compiler and basically as fast as if you had written the code out by hand. This also allows everything to be allocated on the stack.
It's difficult to say for sure without knowing what Bar is. If it's a trait, then yeah it needs to be a &Bar or Box<Bar>. If it's just a regular type, then the normal thing to do is to store it directly:
struct Foo {
bar: Bar
}
When you hear that &Bar is preferred for flexibility, that's usually with respect to function parameters, e.g. fn func(bar: &Bar), and even then it really depends on what you're actually doing. However, when defining a field on a struct, storing the value directly is usually what you want, unless you know what you're doing. This conveys clearly that the Foo owns the Bar.
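If Bar really is a trait and the concrete type is picked inside the initialiser, one way to keep new() self-contained is to let Foo own a boxed trait object. This is only a minimal sketch: the name method and the Concrete type are hypothetical stand-ins, since the question doesn't show what Bar looks like.

```rust
// Hypothetical trait standing in for the question's `Bar`.
trait Bar {
    fn name(&self) -> String;
}

// Hypothetical implementor, chosen inside `new`.
struct Concrete;

impl Bar for Concrete {
    fn name(&self) -> String {
        "concrete".to_string()
    }
}

// Foo owns its Bar, so nothing is left dangling when `new` returns.
struct Foo {
    bar: Box<dyn Bar>,
}

impl Foo {
    fn new() -> Foo {
        // The Box<Concrete> is coerced to Box<dyn Bar> here.
        Foo { bar: Box::new(Concrete) }
    }

    fn describe(&self) -> String {
        format!("Foo owns a {} bar", self.bar.name())
    }
}

fn main() {
    let foo = Foo::new();
    println!("{}", foo.describe());
}
```

The Box moves into the struct, so ownership is transferred rather than borrowed, which is exactly what the &*bar attempt could not do.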
Related
I want to implement a stack using pointers or something. How can I check if a Box is a null pointer? I have seen some code with Option<Box<T>> and Box<Option<T>>, but I don't understand it. This is as far as I went:
struct Node {
value: i32,
next: Box<Node>,
}
struct Stack {
top: Box<Node>,
}
Box<T> can never be NULL, therefore there is nothing to check.
Box<T> values will always be fully aligned, non-null pointers
(from the std::boxed module documentation)
You most likely wish to use Option to denote the absence / presence of a value:
struct Node {
value: i32,
next: Option<Box<Node>>,
}
struct Stack {
top: Option<Box<Node>>,
}
See also:
Should we use Option or ptr::null to represent a null pointer in Rust?
How to set a field in a struct with an empty value?
What is the null pointer optimization in Rust?
You don't want null. null is an unsafe antipattern even in languages where you have to use it, and thankfully Rust rids us of the atrocity. Box<T> always contains a T, never null. Rust has no concept of null.
As you've correctly pointed out, if you want a value to be optional, you use Option<T>. Whether you do Box<Option<T>> or Option<Box<T>> really doesn't matter that much, and someone who knows a bit more about the lower-level side of things can chime in on which is more efficient.
struct Node {
value: i32,
next: Option<Box<Node>>,
}
struct Stack {
top: Option<Box<Node>>,
}
The Option says "this may or may not exist" and the Box says "this value is on the heap". Now, the nice thing about Option that makes it infinitely better than null is that you have to check it. You can't forget or the compiler will complain. The typical way to do so is with match:
match my_stack.top {
None => {
// Top of stack is not present
}
Some(x) => {
// Top of stack exists, and its value is x of type Box<Node>
}
}
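To make the match concrete, here is a rough sketch of how push, pop and peek could look for the Stack above. The method names are my own choice and not part of the question; Option::take is a standard method that moves the value out and leaves None in its place.

```rust
struct Node {
    value: i32,
    next: Option<Box<Node>>,
}

struct Stack {
    top: Option<Box<Node>>,
}

impl Stack {
    fn new() -> Stack {
        Stack { top: None }
    }

    fn push(&mut self, value: i32) {
        // take() leaves None in self.top and hands us the old top.
        let old_top = self.top.take();
        self.top = Some(Box::new(Node { value, next: old_top }));
    }

    fn pop(&mut self) -> Option<i32> {
        match self.top.take() {
            // Empty stack: nothing to pop.
            None => None,
            Some(node) => {
                // The node below becomes the new top.
                self.top = node.next;
                Some(node.value)
            }
        }
    }

    fn peek(&self) -> Option<i32> {
        match &self.top {
            None => None,
            Some(node) => Some(node.value),
        }
    }
}

fn main() {
    let mut stack = Stack::new();
    stack.push(1);
    stack.push(2);
    println!("peek: {:?}", stack.peek());
    println!("pop: {:?}", stack.pop());
}
```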
There are tons of helper methods on the Option type itself to deal with common patterns. Below are just a few of the most common ones I use. Note that all of these can be implemented in terms of match and are just convenience functions.
The equivalent of the following Java code
if (value == null) {
result = null;
} else {
result = ...;
}
is
let result = value.map(|v| ...)
Or, if the inner computation can feasibly produce None as well,
let result = value.and_then(|v| ...)
If you want to provide a default value, say zero, like
if (value == null) {
result = 0;
} else {
result = value;
}
Then you want
let result = value.unwrap_or(0);
It's probably best to stop thinking in terms of how you would handle null and start learning Option<T> from scratch. Once you get the hang of it, it'll feel ten times safer and more ergonomic than null checks.
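For reference, here is a small runnable sketch of the three helpers mentioned above. The function names and the arithmetic are made up for illustration; map, and_then and unwrap_or are the real Option methods.

```rust
// map: transform the value if present, keep None otherwise.
fn double(value: Option<i32>) -> Option<i32> {
    value.map(|v| v * 2)
}

// and_then: the inner computation may itself produce None.
fn half_if_even(value: Option<i32>) -> Option<i32> {
    value.and_then(|v| if v % 2 == 0 { Some(v / 2) } else { None })
}

// unwrap_or: fall back to a default when there is no value.
fn or_zero(value: Option<i32>) -> i32 {
    value.unwrap_or(0)
}

fn main() {
    println!("{:?}", double(Some(21)));
    println!("{:?}", half_if_even(Some(3)));
    println!("{}", or_zero(None));
}
```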
A Box<T> is a pointer to some location on the heap that contains some data of type T. Rust guarantees that Box<T> will never be a null pointer, i.e. the address is always valid as long as you aren't doing anything weird and unsafe.
If you need to represent a value that might not be there (e.g. this node is the last node, so there is no next node), you can use the Option type like so:
struct Node {
value: i32,
next: Option<Box<Node>>,
}
struct Stack {
top: Option<Box<Node>>,
}
Now, with Option<Box<Node>>, a Node can either have a next Node or no next node. We can check whether the Option is None like so:
fn print_next_node_value(node: &Node) {
match &node.next {
Some(next) => println!("the next value is {}", next.value),
None => println!("there is no next node")
}
}
Because a Box is just a pointer to some location on the heap, it is usually better to use Option<Box<T>> than Box<Option<T>>. The second one always allocates an Option<T> on the heap, even for None, while the first one allocates only when there is a value. Additionally, Option<Box<T>> and Box<T> are the same size (8 bytes on a 64-bit target, for a sized T). This is because Rust knows that Box<T> can never be all zeros (i.e. can never be the null pointer), so it can use the all-zeros state to represent the None case of Option<Box<T>>.
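The size claim is easy to check with std::mem::size_of. A quick sketch, reusing the Node type from above:

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Node {
    value: i32,
    next: Option<Box<Node>>,
}

fn main() {
    // The None case reuses the null bit pattern, so no extra space is needed.
    assert_eq!(size_of::<Option<Box<Node>>>(), size_of::<Box<Node>>());
    // A Box to a sized type is a single pointer.
    assert_eq!(size_of::<Box<Node>>(), size_of::<usize>());
    println!("Option<Box<Node>> takes {} bytes", size_of::<Option<Box<Node>>>());
}
```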
I have the following:
enum SomeType {
VariantA(String),
VariantB(String, i32),
}
fn transform(x: SomeType) -> SomeType {
// very complicated transformation, reusing parts of x in order to produce result:
match x {
SomeType::VariantA(s) => SomeType::VariantB(s, 0),
SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
}
}
fn main() {
let mut data = vec![
SomeType::VariantA("hello".to_string()),
SomeType::VariantA("bye".to_string()),
SomeType::VariantB("asdf".to_string(), 34),
];
}
I would now like to call transform on each element of data and store the resulting value back in data. I could do something like data.into_iter().map(transform).collect(), but this will allocate a new Vec. Is there a way to do this in-place, reusing the allocated memory of data? There once was a Vec::map_in_place in Rust, but it was removed some time ago.
As a work-around, I've added a Dummy variant to SomeType and then do the following:
for x in &mut data {
let original = ::std::mem::replace(x, SomeType::Dummy);
*x = transform(original);
}
This does not feel right, and I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop. Is there a better way of doing this?
Your first problem is not map, it's transform.
transform takes ownership of its argument, while the Vec has ownership of its elements. One of the two has to give, and poking a hole in the Vec would be a bad idea: what if transform panics?
The best fix, thus, is to change the signature of transform to:
fn transform(x: &mut SomeType) { ... }
then you can just do:
for x in &mut data { transform(x) }
Other solutions will be clunky, as they will need to deal with the fact that transform might panic.
No, it is not possible in general because the size of each element might change as the mapping is performed (fn transform(u8) -> u32).
Even when the sizes are the same, it's non-trivial.
In this case, you don't need to create a Dummy variant because creating an empty String is cheap; only 3 pointer-sized values and no heap allocation:
impl SomeType {
fn transform(&mut self) {
use SomeType::*;
let old = std::mem::replace(self, VariantA(String::new()));
// Note this line for the detailed explanation
*self = match old {
VariantA(s) => VariantB(s, 0),
VariantB(s, i) => VariantB(s, 2 * i),
};
}
}
for x in &mut data {
x.transform();
}
An alternate implementation that just replaces the String:
impl SomeType {
fn transform(&mut self) {
use SomeType::*;
*self = match self {
VariantA(s) => {
let s = std::mem::replace(s, String::new());
VariantB(s, 0)
}
VariantB(s, i) => {
let s = std::mem::replace(s, String::new());
VariantB(s, 2 * *i)
}
};
}
}
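Since String implements Default, the same idea can be written a little shorter with std::mem::take, which swaps in Default::default() (an empty String here) and returns the old value. A sketch of that variant; the derives are only there so the result can be compared and printed:

```rust
#[derive(Debug, PartialEq)]
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}

impl SomeType {
    fn transform(&mut self) {
        use SomeType::*;

        *self = match self {
            // mem::take leaves an empty String behind and gives us the old one.
            VariantA(s) => VariantB(std::mem::take(s), 0),
            VariantB(s, i) => VariantB(std::mem::take(s), 2 * *i),
        };
    }
}

fn main() {
    let mut data = vec![
        SomeType::VariantA("hello".to_string()),
        SomeType::VariantB("asdf".to_string(), 34),
    ];
    for x in &mut data {
        x.transform();
    }
    println!("{:?}", data);
}
```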
In general, yes, you have to create some dummy value to do this generically and with safe code. Many times, you can wrap your whole element in Option and call Option::take to achieve the same effect.
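A sketch of the Option::take approach, assuming you are free to change the element type of the Vec to Option<SomeType>. The transform_all helper is a name I made up; transform is the by-value function from the question.

```rust
#[derive(Debug, PartialEq)]
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}

fn transform(x: SomeType) -> SomeType {
    match x {
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}

// Hypothetical helper: None plays the role of the dummy value.
fn transform_all(data: &mut [Option<SomeType>]) {
    for slot in data.iter_mut() {
        // take() moves the value out and leaves None behind, so the slot
        // holds a valid value even if transform panics.
        if let Some(original) = slot.take() {
            *slot = Some(transform(original));
        }
    }
}

fn main() {
    let mut data = vec![
        Some(SomeType::VariantA("hello".to_string())),
        Some(SomeType::VariantB("asdf".to_string(), 34)),
    ];
    transform_all(&mut data);
    println!("{:?}", data);
}
```

The cost is that every other use of the Vec has to deal with the Option, much like the Dummy variant, so this mostly helps when an empty state is meaningful anyway.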
See also:
Change enum variant while moving the field to the new variant
Why is it so complicated?
See this proposed and now-closed RFC for lots of related discussion. My understanding of that RFC (and the complexities behind it) is that there's a period of time where your value would be in an undefined state, which is not safe. If a panic were to happen at that exact moment, then when your value is dropped, you might trigger undefined behavior, a bad thing.
If your code were to panic at the commented line, then the value of self is a concrete, known value. If it were some unknown value, dropping that string would try to drop that unknown value, and we are back in C. This is the purpose of the Dummy value - to always have a known-good value stored.
You even hinted at this (emphasis mine):
I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop
That "should" is the problem. During a panic, that dummy value is visible.
See also:
How can I swap in a new value for a field in a mutable reference to a structure?
Temporarily move out of borrowed content
How do I move out of a struct field that is an Option?
The now-removed implementation of Vec::map_in_place spans almost 175 lines of code, most of it having to deal with unsafe code and reasoning why it is actually safe! Some crates have re-implemented this concept and attempted to make it safe; you can see an example in Sebastian Redl's answer.
You can write a map_in_place in terms of the take_mut or replace_with crates:
fn map_in_place<T, F>(v: &mut [T], f: F)
where
F: Fn(T) -> T,
{
for e in v {
// Pass `&f` so the closure is borrowed instead of being moved in the first iteration.
take_mut::take(e, &f);
}
}
However, if this panics in the supplied function, the program aborts completely; you cannot recover from the panic.
Alternatively, you could supply a placeholder element that sits in the empty spot while the inner function executes:
use std::mem;
fn map_in_place_with_placeholder<T, F>(v: &mut [T], f: F, mut placeholder: T)
where
F: Fn(T) -> T,
{
for e in v {
let mut tmp = mem::replace(e, placeholder);
tmp = f(tmp);
placeholder = mem::replace(e, tmp);
}
}
If this panics, the placeholder you supplied will sit in the panicked slot.
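To see what that looks like in practice, here is a small sketch that repeats the placeholder version and deliberately panics on the second element. The demo function, the sample strings and the use of catch_unwind are my own additions for illustration.

```rust
use std::mem;
use std::panic::{self, AssertUnwindSafe};

fn map_in_place_with_placeholder<T, F>(v: &mut [T], f: F, mut placeholder: T)
where
    F: Fn(T) -> T,
{
    for e in v {
        let mut tmp = mem::replace(e, placeholder);
        tmp = f(tmp);
        placeholder = mem::replace(e, tmp);
    }
}

// Hypothetical demo: the closure panics on "b", and we catch the panic
// to inspect what is left in the vector afterwards.
fn demo() -> Vec<String> {
    let mut data = vec!["a".to_string(), "b".to_string(), "c".to_string()];
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        map_in_place_with_placeholder(
            &mut data[..],
            |s| {
                if s == "b" {
                    panic!("boom");
                }
                s + "!"
            },
            "<placeholder>".to_string(),
        );
    }));
    assert!(result.is_err());
    data
}

fn main() {
    // The first element was mapped, the second slot holds the placeholder,
    // and the third was never reached.
    println!("{:?}", demo());
}
```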
Finally, you could produce the placeholder on-demand; basically replace take_mut::take with take_mut::take_or_recover in the first version.
I'm attempting to write Rust bindings for a C collection library (Judy Arrays [1]) which only provides itself room to store a pointer-width value. My company has a fair amount of existing code which uses this space to directly store non-pointer values such as pointer-width integers and small structs. I'd like my Rust bindings to allow type-safe access to such collections using generics, but am having trouble getting the pointer-stashing semantics working correctly.
The mem::transmute() function seems like one potential tool for implementing the desired behavior, but attempting to use it on an instance of a parameterized type yields a confusing-to-me compilation error.
Example code:
use std::marker::PhantomData;
use std::mem;
pub struct Example<T> {
v: usize,
t: PhantomData<T>,
}
impl<T> Example<T> {
pub fn new() -> Example<T> {
Example { v: 0, t: PhantomData }
}
pub fn insert(&mut self, val: T) {
unsafe {
self.v = mem::transmute(val);
}
}
}
Resulting error:
src/lib.rs:95:22: 95:36 error: cannot transmute to or from a type that contains type parameters in its interior [E0139]
src/lib.rs:95 self.v = mem::transmute(val);
^~~~~~~~~~~~~~
Does this mean a type consisting only of a parameter "contains type parameters in its interior" and thus transmute() just won't work here? Any suggestions of the right way to do this?
(Related question, attempting to achieve the same result, but not necessarily via mem::transmute().)
[1] I'm aware of the existing rust-judy project, but it doesn't support the pointer-stashing I want, and I'm writing these new bindings largely as a learning exercise anyway.
Instead of transmuting T to usize directly, you can transmute a &T to &usize:
pub fn insert(&mut self, val: T) {
unsafe {
let usize_ref: &usize = mem::transmute(&val);
self.v = *usize_ref;
}
}
Beware that this may read from an invalid memory location if the size of T is smaller than the size of usize or if the alignment requirements differ. This could cause a segfault. You can add an assertion to prevent this:
assert_eq!(mem::size_of::<T>(), mem::size_of::<usize>());
assert!(mem::align_of::<usize>() <= mem::align_of::<T>());
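Another option that may be worth considering is mem::transmute_copy, which reads the bits through a reference and does not require the destination alignment to match. The sketch below is not a definitive implementation. It assumes T is Copy (which sidesteps questions about Drop), is exactly pointer-sized, has no padding bytes and is not itself a pointer. The filled flag and the get method are hypothetical additions so the value can be read back safely.

```rust
use std::marker::PhantomData;
use std::mem;

pub struct Example<T> {
    v: usize,
    filled: bool, // hypothetical: tracks whether `v` holds a real T
    t: PhantomData<T>,
}

impl<T: Copy> Example<T> {
    pub fn new() -> Example<T> {
        // Refuse types that do not exactly fill the pointer-width slot.
        assert_eq!(mem::size_of::<T>(), mem::size_of::<usize>());
        Example { v: 0, filled: false, t: PhantomData }
    }

    pub fn insert(&mut self, val: T) {
        // SAFETY (assumed): the sizes match (checked in `new`) and T is
        // plain data without padding, so every byte read is initialized.
        self.v = unsafe { mem::transmute_copy::<T, usize>(&val) };
        self.filled = true;
    }

    pub fn get(&self) -> Option<T> {
        if self.filled {
            // SAFETY (assumed): `v` was produced from a valid T in `insert`.
            Some(unsafe { mem::transmute_copy::<usize, T>(&self.v) })
        } else {
            None
        }
    }
}

fn main() {
    let mut e: Example<isize> = Example::new();
    e.insert(-5);
    println!("{:?}", e.get());
}
```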
I have a program that more or less looks like this
struct Test<T> {
vec: Vec<T>
}
impl<T> Test<T> {
fn get_first(&self) -> &T {
&self.vec[0]
}
fn do_something_with_x(&self, x: T) {
// Irrelevant
}
}
fn main() {
let t = Test { vec: vec![1i32, 2, 3] };
let x = t.get_first();
t.do_something_with_x(*x);
}
Basically, we call a method on the struct Test that borrows some value. Then we call another method on the same struct, passing the previously obtained value.
This example works perfectly fine. Now, when we make the content of main generic, it doesn't work anymore.
fn generic_main<T>(t: Test<T>) {
let x = t.get_first();
t.do_something_with_x(*x);
}
Then I get the following error:
error: cannot move out of borrowed content
src/main.rs:14 let raw_x = *x;
I'm not completely sure why this is happening. Can someone explain to me why Test<i32> isn't borrowed when calling get_first while Test<T> is?
The short answer is that i32 implements the Copy trait, but T does not. If you use fn generic_main<T: Copy>(t: Test<T>), then your immediate problem is fixed.
The longer answer is that Copy is a special trait which means values can be copied by simply copying bits. Types like i32 implement Copy. Types like String do not implement Copy because they own a heap allocation. If you copied a String just by copying bits, you'd end up with two String values pointing to the same chunk of memory. That would not be good (it's unsafe!).
Therefore, giving your T a Copy bound is quite restrictive. A less restrictive bound would be T: Clone. The Clone trait is similar to Copy (in that it copies values), but it's usually done by more than just "copying bits." For example, the String type will implement Clone by creating a new heap allocation for the underlying memory.
This requires you to change how your generic_main is written:
fn generic_main<T: Clone>(t: Test<T>) {
let x = t.get_first();
t.do_something_with_x(x.clone());
}
Alternatively, if you don't want to have either the Clone or Copy bounds, then you could change your do_something_with_x method to take a reference to T rather than an owned T:
impl<T> Test<T> {
// other methods elided
fn do_something_with_x(&self, x: &T) {
// Irrelevant
}
}
And your generic_main stays mostly the same, except you don't dereference x:
fn generic_main<T>(t: Test<T>) {
let x = t.get_first();
t.do_something_with_x(x);
}
You can read more about Copy in the docs. There are some nice examples, including how to implement Copy for your own types.
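As a rough illustration of the difference, here is a sketch with a small user-defined Copy type next to a String, which is only Clone. Point, first_by_copy and first_by_clone are names invented for this example.

```rust
struct Test<T> {
    vec: Vec<T>,
}

impl<T> Test<T> {
    fn get_first(&self) -> &T {
        &self.vec[0]
    }
}

// Deriving Copy requires Clone as well; every field must itself be Copy.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Dereferencing works because T: Copy lets the value be duplicated bitwise.
fn first_by_copy<T: Copy>(t: &Test<T>) -> T {
    *t.get_first()
}

// With only T: Clone, the duplication has to be requested explicitly.
fn first_by_clone<T: Clone>(t: &Test<T>) -> T {
    t.get_first().clone()
}

fn main() {
    let points = Test { vec: vec![Point { x: 1, y: 2 }] };
    println!("{:?}", first_by_copy(&points));

    let names = Test { vec: vec!["hello".to_string()] };
    println!("{}", first_by_clone(&names));
}
```

Calling first_by_copy(&names) would be rejected by the compiler, because String does not implement Copy.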
I'm reading the return pointers part of Rust Guide.
Here is a sample code of it:
struct BigStruct {
one: int,
two: int,
// etc
one_hundred: int,
}
fn foo(x: Box<BigStruct>) -> BigStruct {
return *x;
}
fn main() {
let x = box BigStruct {
one: 1,
two: 2,
one_hundred: 100,
};
let y = box foo(x);
}
The strong part of the following explanation confuses me:
There is no copy in this code. main allocates enough room for the box, passes a pointer to that memory into foo as x, and then foo writes the value straight into that pointer. This writes the return value directly into the allocated box.
Having read a related question, I still don't get the no-copy point here.
Does function foo return a copy of *x?
If it does, how to understand the explanation?
If it does not, is it related to ownership and borrowing?
I understand the concepts of ownership and borrowing, I just don't know when they happen.
The guide is trying to tell you that the code is behaving as if it was written this way:
struct BigStruct {
one: int,
two: int,
// etc
one_hundred: int,
}
fn foo(x: Box<BigStruct>, result: &mut BigStruct) {
*result = *x;
}
fn main() {
let x = box BigStruct {
one: 1,
two: 2,
one_hundred: 100,
};
unsafe {
let mut y = box std::mem::uninitialized();
foo(x, &mut *y);
}
}
main creates a Box and passes a pointer to the box's interior to foo as an input argument. This way, foo can store the result value there directly rather than returning it and having main copy it into the box.
There is a copy happening in foo (from the first box to the second box), but if foo wasn't writing directly into the box, there would be two copies (possibly one from the first box to the stack in foo, then from the stack to the second box in main).
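The same "as if" picture can be written in present-day Rust, where the box syntax and the int type are no longer available and std::mem::uninitialized is deprecated. This is only a sketch of the out-pointer idea using MaybeUninit; the build function is a name I made up, and it says nothing about what the optimizer actually does with the original code.

```rust
use std::mem::MaybeUninit;

struct BigStruct {
    one: i64,
    two: i64,
    // etc
    one_hundred: i64,
}

// The caller supplies the destination; foo writes the result into it.
fn foo(x: Box<BigStruct>, result: &mut MaybeUninit<BigStruct>) {
    result.write(*x);
}

fn build() -> Box<BigStruct> {
    let x = Box::new(BigStruct { one: 1, two: 2, one_hundred: 100 });

    // Allocate room for the result without initializing it.
    let mut slot: Box<MaybeUninit<BigStruct>> = Box::new(MaybeUninit::uninit());
    foo(x, &mut slot);

    // SAFETY: foo initialized the slot, and MaybeUninit<BigStruct> has the
    // same layout as BigStruct, so the pointer cast is valid.
    unsafe { Box::from_raw(Box::into_raw(slot) as *mut BigStruct) }
}

fn main() {
    let y = build();
    println!("{} {} {}", y.one, y.two, y.one_hundred);
}
```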
P.S.: I think there's an error in the guide. It says:
passes a pointer to that memory into foo as x
but x is the box from which we're trying to copy, not the new box... Rather, it passes a pointer as a hidden argument.