How to check if a Box is a null pointer? - pointers

I want to implement a stack using pointers or something similar. How can I check if a Box is a null pointer? I have seen some code with Option<Box<T>> and Box<Option<T>>, but I don't understand it. This is as far as I got:
struct Node {
    value: i32,
    next: Box<Node>,
}
struct Stack {
    top: Box<Node>,
}

Box<T> can never be NULL, therefore there is nothing to check.
"Box<T> values will always be fully aligned, non-null pointers" (std::boxed module documentation)
You most likely wish to use Option to denote the absence / presence of a value:
struct Node {
    value: i32,
    next: Option<Box<Node>>,
}
struct Stack {
    top: Option<Box<Node>>,
}
See also:
Should we use Option or ptr::null to represent a null pointer in Rust?
How to set a field in a struct with an empty value?
What is the null pointer optimization in Rust?

You don't want null. null is an unsafe antipattern even in languages where you have to use it, and thankfully Rust rids us of the atrocity. Box<T> always contains a T, never null. Safe Rust has no null references; null exists only for raw pointers, which you rarely need.
As you've correctly pointed out, if you want a value to be optional, you use Option<T>. Whether you do Box<Option<T>> or Option<Box<T>> really doesn't matter that much, and someone who knows a bit more about the lower-level side of things can chime in on which is more efficient.
struct Node {
    value: i32,
    next: Option<Box<Node>>,
}
struct Stack {
    top: Option<Box<Node>>,
}
The Option says "this may or may not exist" and the Box says "this value is on the heap". Now, the nice thing about Option that makes it infinitely better than null is that you have to check it. You can't forget, or the compiler will complain. The typical way to do so is with match:
match &my_stack.top {
    None => {
        // Top of stack is not present
    }
    Some(x) => {
        // Top of stack exists, and x is a reference to the Box<Node>
    }
}
There are tons of helper methods on the Option type itself to deal with common patterns. Below are just a few of the most common ones I use. Note that all of these can be implemented in terms of match and are just convenience functions.
The equivalent of the following Java code
if (value == null) {
    result = null;
} else {
    result = ...;
}
is
let result = value.map(|v| ...)
Or, if the inner computation can feasibly produce None as well,
let result = value.and_then(|v| ...)
If you want to provide a default value, say zero, like
if (value == null) {
    result = 0;
} else {
    result = value;
}
Then you want
let result = value.unwrap_or(0);
It's probably best to stop thinking in terms of how you would handle null and start learning Option<T> from scratch. Once you get the hang of it, it'll feel ten times safer and more ergonomic than null checks.
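To make the three helpers above concrete, here is a minimal sketch. The function names (doubled, half_if_even, or_zero) and the arithmetic inside the closures are made up for illustration; only map, and_then and unwrap_or come from the standard library.

```rust
/// `map`: transform the value if it is present, otherwise stay `None`.
/// This mirrors the first Java snippet (null in, null out).
fn doubled(value: Option<i32>) -> Option<i32> {
    value.map(|v| v * 2)
}

/// `and_then`: like `map`, but the closure itself may also produce `None`.
/// Here odd numbers are rejected (an arbitrary rule for the example).
fn half_if_even(value: Option<i32>) -> Option<i32> {
    value.and_then(|v| if v % 2 == 0 { Some(v / 2) } else { None })
}

/// `unwrap_or`: fall back to a default (zero) when there is no value.
fn or_zero(value: Option<i32>) -> i32 {
    value.unwrap_or(0)
}
```

For instance, doubled(Some(21)) gives Some(42), half_if_even(Some(7)) gives None, and or_zero(None) gives 0.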

A Box<T> is a pointer to some location on the heap that contains some data of type T. Rust guarantees that Box<T> will never be a null pointer, i.e. the address should always be valid as long as you aren't doing anything weird and unsafe.
If you need to represent a value that might not be there (e.g. this node is the last node, so there is no next node), you can use the Option type like so:
struct Node {
    value: i32,
    next: Option<Box<Node>>,
}
struct Stack {
    top: Option<Box<Node>>,
}
Now, with Option<Box<Node>>, Node can either have a next Node or no next node. We can check if the Option is not None like so
fn print_next_node_value(node: &Node) {
    match &node.next {
        Some(next) => println!("the next value is {}", next.value),
        None => println!("there is no next node"),
    }
}
Because a Box is just a pointer to some location on the heap, it is usually better to use Option<Box<T>> instead of Box<Option<T>>. The second one allocates an Option<T> on the heap even when it is None, while the first one allocates nothing in the None case. Additionally, Option<Box<T>> and Box<T> are the same size (8 bytes each on a 64-bit target, for a sized T). This is because Rust knows that Box<T> can never be all zeros (i.e. can never be the null pointer), so it can use the all-zeros state to represent the None case of Option<Box<T>>.
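Putting this together for the stack from the question, a minimal sketch might look like the following. The methods (new, is_empty, push, pop, peek) and the helper option_box_is_pointer_sized are my own additions for illustration and are not part of the question's code.

```rust
use std::mem::size_of;

struct Node {
    value: i32,
    next: Option<Box<Node>>,
}

struct Stack {
    top: Option<Box<Node>>,
}

impl Stack {
    fn new() -> Self {
        Stack { top: None }
    }

    /// The "null check": an empty stack is one whose top is `None`.
    fn is_empty(&self) -> bool {
        self.top.is_none()
    }

    fn push(&mut self, value: i32) {
        // `take()` moves the old top out and leaves `None` behind,
        // so the old top can become the `next` of the new node.
        let old_top = self.top.take();
        self.top = Some(Box::new(Node { value, next: old_top }));
    }

    fn pop(&mut self) -> Option<i32> {
        // If there is a top node, unlink it and return its value.
        self.top.take().map(|node| {
            self.top = node.next;
            node.value
        })
    }

    fn peek(&self) -> Option<i32> {
        self.top.as_ref().map(|node| node.value)
    }
}

/// Checks the size claim: `None` is stored as the null pointer,
/// so wrapping the Box in an Option costs no extra space.
fn option_box_is_pointer_sized() -> bool {
    size_of::<Option<Box<Node>>>() == size_of::<Box<Node>>()
}
```

Note that pop returns an Option<i32>, so the caller is again forced to handle the empty case instead of dereferencing a null pointer.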

Related

How to avoid cloning parts when changing a mutable struct while recursion over that struct

I am trying to recurse through a rose tree. The following code works as intended, but I still have the problem that I need to clone the value due to issues with the borrow checker. It would be nice if there were a way to replace the clone with something better.
Without the clone(), Rust complains (rightfully) that I borrow self mutably twice: once by looking at the child nodes and a second time in the closure.
The whole structure and code is more complicated and bigger than shown below, but these are the core elements. Do I have to change the data structure, or am I missing something obvious? If the data structure is the issue, how would you change it?
Also, the NType enum seems kind of useless here, but I have some additional kinds that I have to consider. Here, Inner nodes always have children and Outer nodes never do.
#[derive(Eq, PartialEq, Clone, Debug)]
enum NType {
    Inner,
    Outer,
}
#[derive(Eq, PartialEq, Clone, Debug)]
struct Node {
    // isn't an i32 actually. In my real program it's another struct
    count: i32,
    n_type: NType,
    children: Option<Vec<usize>>,
}
#[derive(Eq, PartialEq, Clone, Debug)]
struct Tree {
    nodes: Vec<Node>,
}
impl Tree {
    pub fn calc(&mut self, features: &Vec<i32>) -> i32 {
        // root is the last node
        self.calc_h(self.nodes.len() - 1, features);
        self.nodes[self.nodes.len() - 1].count.clone()
    }
    fn calc_h(&mut self, current: usize, features: &Vec<i32>) {
        // do some other things to decide where to go into recursion and where not to
        // also use the features
        if self.nodes[current].n_type == NType::Inner {
            // cloning is very expensive and destroys the performance
            self.nodes[current].children.as_ref().unwrap().clone().iter().for_each(|&n| self.calc_h(n, features));
            self.do_smt(current)
        }
        self.do_smt(current)
    }
}
Edit:
Lagerbaer suggested using as_mut, but that results in current being a &mut usize and doesn't really solve the problem.
changed childs into children
The correct plural of child is children so this is what I will refer to in this answer. Presumably this is what childs means in your code.
Since node.children is already an Option, the best solution would be to .take() the vector out of the node at the start of the iteration and put it in at the end. This way we avoid holding a reference to tree.nodes during the iteration.
if self.nodes[current].n_type == NType::Inner {
    let children = self.nodes[current].children.take().unwrap();
    for &child in children.iter() {
        self.calc_h(child, features);
    }
    self.nodes[current].children = Some(children);
}
Note that the behavior is different from your original code in case of cycles, but this is not something you need to worry about if the rest of the tree is implemented correctly.
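For reference, here is a self-contained sketch of the take-and-put-back pattern. The per-node work (summing the children's counts) and the sample_tree helper are invented stand-ins for the real do_smt logic, and the features parameter is left out to keep it short.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum NType {
    Inner,
    Outer,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Node {
    count: i32,
    n_type: NType,
    children: Option<Vec<usize>>,
}

struct Tree {
    nodes: Vec<Node>,
}

impl Tree {
    /// Stand-in computation (an assumption): an inner node's count
    /// becomes the sum of its children's counts.
    fn calc_h(&mut self, current: usize) {
        if self.nodes[current].n_type == NType::Inner {
            // Move the child list out of the node; no borrow of
            // `self.nodes` is held while we recurse.
            let children = self.nodes[current].children.take().unwrap();
            let mut sum = 0;
            for &child in &children {
                self.calc_h(child);
                sum += self.nodes[child].count;
            }
            self.nodes[current].count = sum;
            // Put the child list back where it came from.
            self.nodes[current].children = Some(children);
        }
    }

    fn calc(&mut self) -> i32 {
        let root = self.nodes.len() - 1; // root is the last node
        self.calc_h(root);
        self.nodes[root].count
    }
}

/// Builds a small tree: node 4 is the root with children 2 and 3,
/// and node 2 has the leaves 0 and 1 as children.
fn sample_tree() -> Tree {
    let leaf = |count| Node { count, n_type: NType::Outer, children: None };
    let inner = |kids: Vec<usize>| Node { count: 0, n_type: NType::Inner, children: Some(kids) };
    Tree {
        nodes: vec![leaf(1), leaf(2), inner(vec![0, 1]), leaf(4), inner(vec![2, 3])],
    }
}
```

No Vec is cloned here: take() only moves the three-word Vec handle out of the node and Some(children) moves it back in.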

Techniques for turning recursive functions into iterators in Rust?

I'm struggling to turn a simple recursive function into a simple iterator. The problem is that the recursive function maintains state in its local variables and call stack, and turning this into a Rust iterator basically means externalizing all the function state into mutable properties on some custom iterator struct. It's quite a messy endeavor.
In a language like JavaScript or Python, yield comes to the rescue. Are there any techniques in Rust to help manage this complexity?
Simple example using yield (pseudocode):
function one_level(state, depth, max_depth) {
    if depth == max_depth {
        return
    }
    for s in next_states_from(state) {
        yield state_to_value(s);
        yield one_level(s, depth+1, max_depth);
    }
}
To make something similar work in Rust, I'm basically creating a Vec<Vec<State>> on my iterator struct, to reflect the data returned by next_states_from at each level of the call stack. Then for each next() invocation, carefully popping pieces off of this to restore state. I feel like I may be missing something.
You are performing a (depth-limited) depth-first search on your state graph. You can do it iteratively by using a single stack of unprocessed subtrees (depending on your state graph structure).
struct Iter {
    stack: Vec<(State, u32)>,
    max_depth: u32,
}
impl Iter {
    fn new(root: State, max_depth: u32) -> Self {
        Self {
            stack: vec![(root, 0)],
            max_depth,
        }
    }
}
impl Iterator for Iter {
    type Item = u32; // return type of state_to_value
    fn next(&mut self) -> Option<Self::Item> {
        let (state, depth) = self.stack.pop()?;
        if depth < self.max_depth {
            for s in next_states_from(state) {
                self.stack.push((s, depth + 1));
            }
        }
        return Some(state_to_value(state));
    }
}
There are some slight differences to your code:
The iterator yields the value of the root element, while your version does not. This can be easily fixed using .skip(1).
Children are processed in right-to-left order (reversed from the result of next_states_from). To keep the original order, push the next states in reverse (depending on the result type of next_states_from you can just use .rev(), otherwise you will need a temporary).
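Here is a runnable sketch with both adjustments applied. Since the question does not show State, next_states_from or state_to_value, the definitions below are placeholders I made up (states are numbered like the nodes of a binary heap); values_below is likewise a hypothetical helper.

```rust
// Placeholder state type and helpers (assumptions, not from the question).
type State = u32;

fn next_states_from(state: State) -> Vec<State> {
    vec![2 * state, 2 * state + 1]
}

fn state_to_value(state: State) -> u32 {
    state
}

struct Iter {
    stack: Vec<(State, u32)>,
    max_depth: u32,
}

impl Iter {
    fn new(root: State, max_depth: u32) -> Self {
        Iter { stack: vec![(root, 0)], max_depth }
    }
}

impl Iterator for Iter {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        let (state, depth) = self.stack.pop()?;
        if depth < self.max_depth {
            // Push in reverse so the first child is popped first,
            // matching the order of the recursive version.
            for s in next_states_from(state).into_iter().rev() {
                self.stack.push((s, depth + 1));
            }
        }
        Some(state_to_value(state))
    }
}

/// Same sequence as the recursive pseudocode: the root itself is skipped.
fn values_below(root: State, max_depth: u32) -> Vec<u32> {
    Iter::new(root, max_depth).skip(1).collect()
}
```

With these placeholder definitions, values_below(1, 2) yields 2, 4, 5, 3, 6, 7, which is the order the recursive pseudocode would produce.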

How to convert a vector of vectors into a vector of slices without creating a new object? [duplicate]

I have the following:
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}
fn transform(x: SomeType) -> SomeType {
    // very complicated transformation, reusing parts of x in order to produce result:
    match x {
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}
fn main() {
    let mut data = vec![
        SomeType::VariantA("hello".to_string()),
        SomeType::VariantA("bye".to_string()),
        SomeType::VariantB("asdf".to_string(), 34),
    ];
}
I would now like to call transform on each element of data and store the resulting value back in data. I could do something like data.into_iter().map(transform).collect(), but this will allocate a new Vec. Is there a way to do this in-place, reusing the allocated memory of data? There once was Vec::map_in_place in Rust but it has been removed some time ago.
As a work-around, I've added a Dummy variant to SomeType and then do the following:
for x in &mut data {
    let original = ::std::mem::replace(x, SomeType::Dummy);
    *x = transform(original);
}
This does not feel right, and I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop. Is there a better way of doing this?
Your first problem is not map, it's transform.
transform takes ownership of its argument, while the Vec has ownership of its elements. One of the two has to give, and poking a hole in the Vec would be a bad idea: what if transform panics?
The best fix, thus, is to change the signature of transform to:
fn transform(x: &mut SomeType) { ... }
then you can just do:
for x in &mut data { transform(x) }
Other solutions will be clunky, as they will need to deal with the fact that transform might panic.
No, it is not possible in general because the size of each element might change as the mapping is performed (fn transform(u8) -> u32).
Even when the sizes are the same, it's non-trivial.
In this case, you don't need to create a Dummy variant because creating an empty String is cheap; only 3 pointer-sized values and no heap allocation:
impl SomeType {
    fn transform(&mut self) {
        use SomeType::*;
        let old = std::mem::replace(self, VariantA(String::new()));
        // Note this line for the detailed explanation
        *self = match old {
            VariantA(s) => VariantB(s, 0),
            VariantB(s, i) => VariantB(s, 2 * i),
        };
    }
}
for x in &mut data {
    x.transform();
}
An alternate implementation that just replaces the String:
impl SomeType {
    fn transform(&mut self) {
        use SomeType::*;
        *self = match self {
            VariantA(s) => {
                let s = std::mem::replace(s, String::new());
                VariantB(s, 0)
            }
            VariantB(s, i) => {
                let s = std::mem::replace(s, String::new());
                VariantB(s, 2 * *i)
            }
        };
    }
}
In general, yes, you have to create some dummy value to do this generically and with safe code. Many times, you can wrap your whole element in Option and call Option::take to achieve the same effect.
See also:
Change enum variant while moving the field to the new variant
Why is it so complicated?
See this proposed and now-closed RFC for lots of related discussion. My understanding of that RFC (and the complexities behind it) is that there's a time period where your value would be in an undefined state, which is not safe. If a panic were to happen at that exact moment, then when your value is dropped, you might trigger undefined behavior, a bad thing.
If your code were to panic at the commented line, then the value of self is a concrete, known value. If it were some unknown value, dropping that string would try to drop that unknown value, and we are back in C. This is the purpose of the Dummy value - to always have a known-good value stored.
You even hinted at this (emphasis mine):
I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop
That "should" is the problem. During a panic, that dummy value is visible.
See also:
How can I swap in a new value for a field in a mutable reference to a structure?
Temporarily move out of borrowed content
How do I move out of a struct field that is an Option?
The now-removed implementation of Vec::map_in_place spans almost 175 lines of code, most of it dealing with unsafe code and reasoning about why it is actually safe! Some crates have re-implemented this concept and attempted to make it safe; you can see an example in Sebastian Redl's answer.
You can write a map_in_place in terms of the take_mut or replace_with crates:
fn map_in_place<T, F>(v: &mut [T], f: F)
where
    F: Fn(T) -> T,
{
    for e in v {
        // pass f by reference so it can be reused on every iteration
        take_mut::take(e, &f);
    }
}
However, if this panics in the supplied function, the program aborts completely; you cannot recover from the panic.
Alternatively, you could supply a placeholder element that sits in the empty spot while the inner function executes:
use std::mem;
fn map_in_place_with_placeholder<T, F>(v: &mut [T], f: F, mut placeholder: T)
where
    F: Fn(T) -> T,
{
    for e in v {
        let mut tmp = mem::replace(e, placeholder);
        tmp = f(tmp);
        placeholder = mem::replace(e, tmp);
    }
}
If this panics, the placeholder you supplied will sit in the panicked slot.
Finally, you could produce the placeholder on-demand; basically replace take_mut::take with take_mut::take_or_recover in the first version.
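As a usage sketch, the placeholder version can be applied to the data from the question using only the standard library. The choice of VariantA(String::new()) as the placeholder and the helper name transformed_sample are my own assumptions; an empty String does not allocate, so the placeholder is cheap to build.

```rust
use std::mem;

#[derive(Debug, PartialEq)]
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}

fn transform(x: SomeType) -> SomeType {
    match x {
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}

fn map_in_place_with_placeholder<T, F>(v: &mut [T], f: F, mut placeholder: T)
where
    F: Fn(T) -> T,
{
    for e in v {
        // The slot holds the placeholder while `f` runs, so it always
        // contains a valid value even if `f` panics.
        let mut tmp = mem::replace(e, placeholder);
        tmp = f(tmp);
        // Store the result and get the placeholder back for the next slot.
        placeholder = mem::replace(e, tmp);
    }
}

/// Hypothetical driver: transforms the question's sample data in place.
fn transformed_sample() -> Vec<SomeType> {
    let mut data = vec![
        SomeType::VariantA("hello".to_string()),
        SomeType::VariantA("bye".to_string()),
        SomeType::VariantB("asdf".to_string(), 34),
    ];
    map_in_place_with_placeholder(&mut data, transform, SomeType::VariantA(String::new()));
    data
}
```

The Vec's buffer is reused throughout; the only extra value that ever exists is the single placeholder.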

golang: what's the difference between return []*TreeNode{} and return []*TreeNode{nil}

I'm new to golang and confused by the following,
type TreeNode struct {
	Val   int
	Left  *TreeNode
	Right *TreeNode
}
func test() []*TreeNode {
	return []*TreeNode{}
}
func test1() []*TreeNode {
	return []*TreeNode{nil}
}
I'm trying to write a recursive func on TreeNode. However, if I use the test style to represent the leaf node, I get an empty TreeNode slice in the caller func.
If I use test1 to represent the leaf node, then the behaviour is what I want.
I feel that test1 is giving me a pointer to an empty TreeNode, whereas test is giving me a pointer to nil... I'm not sure if I'm getting this right or not. If you can point me to the correct terms or concepts to dig into, that would be great.
In addition, if you can tell me more about the underlying logic, it would be great.
Thanks in advance.
This returns an empty slice:
return []*TreeNode{}
This returns a slice containing one element, and that element is a nil pointer:
return []*TreeNode{nil}
Neither of these gives you a TreeNode, though. The second one gives you a TreeNode pointer that is nil. How you interpret these depends on the rest of the code, but I doubt either is really what you want, since neither can have the Val field.

Rust cannot move out of dereference pointer

I try to run this code:
impl FibHeap {
    fn insert(&mut self, key: int) -> () {
        let new_node = Some(box create_node(key, None, None));
        match self.min {
            Some(ref mut t) => t.right = new_node,
            None => (),
        };
        println!("{}", get_right(self.min));
    }
}
fn get_right(e: Option<Box<Node>>) -> Option<Box<Node>> {
    match e {
        Some(t) => t.right,
        None => None,
    }
}
And get error
error: cannot move out of dereference of `&mut`-pointer
println!("{}",get_right(self.min));
^
I don't understand why I get this problem, or what I must do to avoid it.
Your problem is that get_right() accepts Option<Box<Node>>, while it should really accept Option<&Node> and return Option<&Node> as well. The call site should be also changed appropriately.
Here is the explanation. Box<T> is a heap-allocated box. It obeys value semantics (that is, it behaves like a plain T, except that it has an associated destructor, so it is always moved, never copied). Hence passing just Box<T> into a function means giving up ownership of the value and moving it into the function. However, that is neither what you want nor what you can do here. The get_right() function only queries the existing structure, so it does not need ownership. And if ownership is not needed, then references are the answer. Moreover, it is simply impossible to move self.min into a function, because self.min is accessed through self, which is a borrowed pointer, and you can't move out of borrowed data. This is one of the basic safety guarantees provided by the compiler.
Change your get_right() definition to something like this:
fn get_right(e: Option<&Node>) -> Option<&Node> {
    e.and_then(|n| n.right.as_ref().map(|r| &**r))
}
Then the println!() call should be changed to this:
println!("{}", get_right(self.min.as_ref().map(|r| &**r)));
Here is what happens here. In order to obtain Option<&Node> from Option<Box<Node>> you need to apply the "conversion" to insides of the original Option. There is a method exactly for that, called map(). However, map() takes its target by value, which would mean moving Box<Node> into the closure. However, we only want to borrow Node, so first we need to go from Option<Box<Node>> to Option<&Box<Node>> in order for map() to work.
Option<T> has a method, as_ref(), which takes its target by reference and returns Option<&T>, a possible reference to the internals of the option. In our case it would be Option<&Box<Node>>. Now this value can be safely map()ped over since it contains a reference and a reference can be freely moved without affecting the original value.
So, next, map(|r| &**r) is a conversion from Option<&Box<Node>> to Option<&Node>. The closure argument is applied to the internals of the option if they are present, otherwise None is just passed through. &**r should be read inside out: &(*(*r)), that is, first we dereference &Box<Node>, obtaining Box<Node>, then we dereference the latter, obtaining just Node, and then we take a reference to it, finally getting &Node. Because these reference/dereference operations are juxtaposed, there is no movement/copying involved. So, we got an optional reference to a Node, Option<&Node>.
You can see that a similar thing happens in the get_right() function. However, a new method, and_then(), is also called there. It is equivalent to what you initially wrote in get_right(): if its target is None, it returns None, otherwise it returns the result of the Option-returning closure passed as its argument:
fn and_then<U, F: FnOnce(T) -> Option<U>>(self, f: F) -> Option<U> {
    match self {
        Some(e) => f(e),
        None => None,
    }
}
I strongly suggest reading the official guide, which explains what ownership and borrowing are and how to use them. They are the very foundation of the Rust language, and it is very important to grasp them in order to be productive with Rust.
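For readers on current Rust, the same idea can be sketched with Option::as_deref, which performs the Option<Box<Node>> to Option<&Node> conversion described above in one call. The Node and FibHeap definitions below are simplified stand-ins, since the question does not show them; unlike the question's code, insert here also stores the first node when the heap is empty, and right_of_min_key is a hypothetical helper.

```rust
#[derive(Debug)]
struct Node {
    key: i32,
    right: Option<Box<Node>>,
}

struct FibHeap {
    min: Option<Box<Node>>,
}

/// Borrows the right neighbour instead of taking ownership of anything.
/// `as_deref` turns `&Option<Box<Node>>` into `Option<&Node>`.
fn get_right(e: Option<&Node>) -> Option<&Node> {
    e.and_then(|n| n.right.as_deref())
}

impl FibHeap {
    fn insert(&mut self, key: i32) {
        let new_node = Some(Box::new(Node { key, right: None }));
        match self.min.as_mut() {
            // Simplified: just hang the new node to the right of `min`.
            Some(t) => t.right = new_node,
            // Assumption: an empty heap takes the new node as `min`.
            None => self.min = new_node,
        }
    }

    /// Key of the node to the right of `min`, if there is one.
    fn right_of_min_key(&self) -> Option<i32> {
        get_right(self.min.as_deref()).map(|n| n.key)
    }
}
```

Because get_right only borrows, self.min stays in place and nothing is moved out of self.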
