I have a recursive Item structure that I am using to implement lists:
#[derive(Debug)]
pub enum Item<T> {
Cons(T, Box<Item<T>>),
Nil,
}
When implementing a function that inserts an element after another one, I found out that the Rust compiler wasn't that happy about my code:
pub fn add_after<T>(it: Box<Item<T>>, val: T) -> Box<Item<T>> {
match *it {
Item::Nil => return it,
Item::Cons(a, b) => {
let itm = Box::new(Item::Cons(val, b));
return Box::new(Item::Cons(a, itm));
}
}
}
The errors that I get are pretty obscure for a newbie:
error[E0382]: use of collaterally moved value: `(it as Item::Cons).1`
--> src/main.rs:12:23
|
12 | Item::Cons(a, b) => {
| - ^ value used here after move
| |
| value moved here
|
= note: move occurs because the value has type `T`, which does not implement the `Copy` trait
Another, similar question suggested doing the unwrapping in two steps, but that does not apply here: we need to destructure a two-field Cons(..) item directly, not a nested type like Option<Box<Whatever>> where the two-phase trick works. Here is what I tried:
pub fn add_after<T>(it: Box<Item<T>>, val: T) -> Box<Item<T>> {
match *it {
Item::Nil => return it,
Item::Cons(..) => {
let Item::Cons(a, b) = *it;
let itm = Box::new(Item::Cons(val, b));
return Box::new(Item::Cons(a, itm));
}
}
}
But I get another error:
error[E0005]: refutable pattern in local binding: `Nil` not covered
--> src/main.rs:13:17
|
13 | let Item::Cons(a, b) = *it;
| ^^^^^^^^^^^^^^^^ pattern `Nil` not covered
I am pretty sure the pattern is exhaustive at this point, though, because we already matched a Cons in the enclosing arm.
You may be suffering from issue 16223 (see also 22205 which has a closer reproduction), although today's non-lexical lifetimes don't solve this problem. This seems to preclude destructuring multiple things through a Box.
Here's one way to work around it, although it's not the most efficient way as it deallocates and reallocates unnecessarily:
#[derive(Debug)]
pub enum Item<T> {
Cons(T, Box<Item<T>>),
Nil,
}
pub fn add_after<T>(it: Box<Item<T>>, val: T) -> Box<Item<T>> {
match { *it } {
Item::Nil => Box::new(Item::Nil),
Item::Cons(a, b) => {
let itm = Box::new(Item::Cons(val, b));
Box::new(Item::Cons(a, itm))
}
}
}
fn main() {}
A more verbose way pulls the value out of the Box, manipulates it, and then puts the manipulated value back into the same Box. This should need fewer allocations, because the outer Box is reused instead of being freed and recreated:
use std::mem;
pub fn add_after<T>(mut item: Box<Item<T>>, val: T) -> Box<Item<T>> {
let unboxed_value = mem::replace(&mut *item, Item::Nil);
match unboxed_value {
Item::Nil => item,
Item::Cons(a, b) => {
let itm = Box::new(Item::Cons(val, b));
*item = Item::Cons(a, itm);
item
}
}
}
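To check that the mem::replace version behaves as intended, here is a minimal self-contained sketch. The PartialEq derive and the to_vec helper are assumptions added only so the result can be inspected; they are not part of the original answer.

```rust
use std::mem;

#[derive(Debug, PartialEq)]
pub enum Item<T> {
    Cons(T, Box<Item<T>>),
    Nil,
}

/// Inserts `val` right after the head element, reusing the outer Box.
pub fn add_after<T>(mut item: Box<Item<T>>, val: T) -> Box<Item<T>> {
    // Temporarily leave `Nil` in the Box so we own the old contents.
    match mem::replace(&mut *item, Item::Nil) {
        // Empty list: the Box already holds `Nil`, so hand it back unchanged.
        Item::Nil => item,
        Item::Cons(a, b) => {
            // Only the new node is allocated; the outer Box is written in place.
            *item = Item::Cons(a, Box::new(Item::Cons(val, b)));
            item
        }
    }
}

/// Hypothetical helper: copies the list elements into a Vec for inspection.
pub fn to_vec<T: Clone>(mut cur: &Item<T>) -> Vec<T> {
    let mut out = Vec::new();
    while let Item::Cons(v, next) = cur {
        out.push(v.clone());
        cur = &**next; // step from `&Box<Item<T>>` to `&Item<T>`
    }
    out
}
```

Calling add_after on the list 1, 3 with the value 2 should give 1, 2, 3, while an empty list comes back unchanged.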
See also:
Collaterally moved error when deconstructing a Box of pairs
Suppose we have a tree:
#[derive(Default, Debug, PartialEq, Eq)]
struct Tree {
children: Vec<Tree>,
}
And we want to build it from a list of bools where true is like an open tag and false is like a close tag in XML. I.e. [true, true, false, true, false, false] is
root -> node -> node
             `-> node
We can easily parse this recursively like this:
fn read_tree_recursive<'a>(tags: &mut impl Iterator<Item=&'a bool>) -> Tree {
let mut tree = Tree::default();
while let Some(&tag) = tags.next() {
if tag {
tree.children.push(read_tree_recursive(tags));
} else {
break;
}
}
tree
}
Here is some test code:
fn main() {
assert_eq!(
read_tree_recursive(&mut [true, true, false, true, false, false].iter()),
Tree {
children: vec![
Tree {
children: vec![
Tree::default(),
Tree::default(),
],
}
],
},
);
}
But how do you do this iteratively? In any other language you'd make a stack of pointers on the heap something like this:
fn read_tree_iterative(tags: &[bool]) -> Tree {
let mut root = Tree::default();
let tree_stack: Vec<&mut Tree> = vec![&mut root];
for &tag in tags {
if tag {
tree_stack.last().unwrap().children.push(Tree::default());
tree_stack.push(tree_stack.last().unwrap().children.last_mut().unwrap());
} else {
tree_stack.pop();
}
}
root
}
Unfortunately this doesn't work in Rust because Rust can't know that we're only ever mutating tree_stack.last(). The recursive version uses function calls to enforce that, but obviously it has the downsides that come with recursion.
Is there a good way around this other than resorting to RefCell? Unfortunately Rust doesn't have become so TCO isn't a good option either.
You can simply store owned values in your stack and add them to the children when you pop them from the stack:
fn read_tree_iterative<'a> (tags: &mut impl Iterator<Item=&'a bool>) -> Tree {
let mut stack = Vec::new();
let mut last = Tree::default();
for &tag in tags {
if tag {
stack.push (last);
last = Tree::default();
} else {
let mut parent = stack.pop().unwrap();
parent.children.push (last);
last = parent;
}
}
last
}
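Note that the owned-stack version above panics on a stray close tag, because stack.pop().unwrap() has nothing to pop. The following is a sketch of a checked variant, assuming that unbalanced input should be reported as None; the name read_tree_checked and the slice parameter are choices made for this illustration only.

```rust
#[derive(Default, Debug, PartialEq, Eq)]
pub struct Tree {
    pub children: Vec<Tree>,
}

/// Builds a tree from open (`true`) and close (`false`) tags without recursion.
/// Returns `None` if a close tag has no matching open tag or a tag stays open.
pub fn read_tree_checked(tags: &[bool]) -> Option<Tree> {
    // Ancestors of the node currently being built, owned by value.
    let mut stack: Vec<Tree> = Vec::new();
    let mut current = Tree::default();
    for &tag in tags {
        if tag {
            // Open tag: park the current node and start a fresh child.
            stack.push(current);
            current = Tree::default();
        } else {
            // Close tag: attach the finished child to its parent.
            let mut parent = stack.pop()?;
            parent.children.push(current);
            current = parent;
        }
    }
    // Anything left on the stack means some tag was never closed.
    if stack.is_empty() {
        Some(current)
    } else {
        None
    }
}
```

Because every node on the stack is owned rather than borrowed, no two mutable references into the same tree ever exist, which is why the borrow checker accepts this.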
TL/DR
I have a recursive algorithm where every call wants to modify entries in an iterable data structure (a Vec). How do I properly model this in Rust using iterators?
Disclaimer
The problem arose in a different, more complex context and I tried to build a rather minimal example which highlights the issue. I am mentioning this to explain that it won't help me if you recommend an entirely different solution to the particular problem I am solving in the below code, which is subset sum.
Example
Consider the following code, and remove a single slash in front of either option to comment it out. Now, option 1 compiles and works as expected, but option 2 does not compile.
#[derive(Clone, Copy)]
struct Item {
in_subset: bool,
value: u32,
}
fn try_find_subset_with_sum(items: &mut Vec<Item>, goal: u32) -> bool {
let current = items
.iter()
.filter(|x| x.in_subset)
.fold(0, |sum, x| sum + x.value);
if current == goal {
true
} else {
if current > goal {
false
} else {
//* OPTION 1: ********************************
for k in 0..items.len() {
if !items[k].in_subset {
items[k].in_subset = true;
if try_find_subset_with_sum(items, goal) {
return true;
}
items[k].in_subset = false;
}
}
// *******************************************/
//* OPTION 2: ********************************
for x in items.iter_mut() {
if !x.in_subset {
x.in_subset = true;
if try_find_subset_with_sum(items, goal) {
return true;
} else {
x.in_subset = false
}
}
}
// *******************************************/
false
}
}
}
fn main() {
let tmp = vec![12, 9, 2, 1, 13, 46, 5, 5, 9];
let mut set: Vec<Item> = tmp
.into_iter()
.map(|x| Item {
in_subset: false,
value: x,
})
.collect();
if try_find_subset_with_sum(&mut set, 16) {
println!("found the following subset:");
for x in set {
if x.in_subset {
println!("{}", x.value);
}
}
}
}
The error message for option 2 states:
error[E0499]: cannot borrow `*items` as mutable more than once at a time
--> src/main.rs:33:49
|
30 | for x in items.iter_mut() {
| ----------------
| |
| first mutable borrow occurs here
| first borrow used here, in later iteration of loop
...
33 | if try_find_subset_with_sum(items, goal) {
| ^^^^^ second mutable borrow occurs here
I understand what the compiler is telling me. If it were human, I would yell back at it to "just borrow the darn iterable whenever I modify the data and get it back before the recursive call for crying out loud!" -- but I do understand that it probably can't do that.
The Question
Is there a more elegant way to solve this than option 1? Is there any way to do this with iterators?
I want to build a tree using exactly two structs: Node and Tree and then recursively search for a target node from the tree. If a target is found, return true, else return false.
The challenge for me here is how to recursively call the find function, since it is only defined on Tree not Node.
pub struct Node<T> {
value: T,
left: Option<Box<Node<T>>>,
right: Option<Box<Node<T>>>,
}
pub struct Tree<T> {
root: Option<Box<Node<T>>>,
}
impl<T: Ord> Tree<T> {
/// Creates an empty tree
pub fn new() -> Self {
Tree { root: None }
}
// search the tree
pub fn find(&self, key: &T) -> bool {
let root_node = &self.root; // root is Option
match *root_node {
Some(ref node) => {
if node.value == *key {
return true;
}
let target_node = if *key < node.value {
&node.left
} else {
&node.right
};
match *target_node {
Some(sub_node) => sub_node.find(key),
None => {
return false;
}
}
}
None => return false,
}
}
}
fn main() {
let mut mytree: Tree<i32> = Tree::new();
let node1 = Node {
value: 100,
left: None,
right: None,
};
let boxed_node1 = Some(Box::new(node1));
let root = Node {
value: 200,
left: boxed_node1,
right: None,
};
let boxed_root = Some(Box::new(root));
let mytree = Tree { root: boxed_root };
let res = mytree.find(&100);
}
The current code reports the error:
error: no method named `find` found for type `Box<Node<T>>` in the current scope
--> src/main.rs:36:48
|
36 | Some(sub_node) => sub_node.find(key),
| ^^^^
|
= note: the method `find` exists but the following trait bounds were not satisfied: `Node<T> : std::iter::Iterator`
= help: items from traits can only be used if the trait is implemented and in scope; the following traits define an item `find`, perhaps you need to implement one of them:
= help: candidate #1: `std::iter::Iterator`
= help: candidate #2: `core::str::StrExt`
I understand that find is only implemented on Tree, so there is an error, but I don't think it is efficient to implement find on both Tree and Node. Any hint to solve this?
You need to move the majority of the implementation to the Node type, then leave only a small shim in Tree:
impl<T: Ord> Tree<T> {
pub fn find(&self, key: &T) -> bool {
self.root.as_ref().map(|n| n.find(key)).unwrap_or(false)
}
}
impl<T: Ord> Node<T> {
// search the tree
pub fn find(&self, key: &T) -> bool {
if self.value == *key {
return true;
}
let target_node = if *key < self.value {
&self.left
} else {
&self.right
};
target_node.as_ref().map(|n| n.find(key)).unwrap_or(false)
}
}
However, I might avoid multiple comparisons by just matching on the result:
pub fn find(&self, key: &T) -> bool {
    use ::std::cmp::Ordering::*;
    // Compare the key against this node's value: smaller keys live on the left.
    match key.cmp(&self.value) {
        Equal => true,
        Less => self.left.as_ref().map(|n| n.find(key)).unwrap_or(false),
        Greater => self.right.as_ref().map(|n| n.find(key)).unwrap_or(false),
    }
}
Or
pub fn find(&self, key: &T) -> bool {
    use ::std::cmp::Ordering::*;
    // Compare the key against this node's value: smaller keys live on the left.
    let child = match key.cmp(&self.value) {
        Equal => return true,
        Less => self.left.as_ref(),
        Greater => self.right.as_ref(),
    };
    child.map(|n| n.find(key)).unwrap_or(false)
}
I find it hard to understand target_node.as_ref().map(|n| n.find(key)).unwrap_or(false). I have only just started learning about iterators. Could you explain this long expression step by step?
Just follow the type signatures of each function:
self is a &Node<T>
&self.left / &self.right / target_node are a &Option<Box<Node<T>>>
Option::as_ref converts an &Option<T> to Option<&T>. Now we have Option<&Box<Node<T>>>.
Option::map applies a function (which may change the contained type) to the option if it is Some, otherwise it leaves it None.
The function we apply is Node::find, which takes a &Node<T> and returns a bool.
Box<T> implements Deref so any methods on T appear on Box<T>.
Automatic dereferencing allows us to treat &Box<T> as Box<T>.
Now we have Option<bool>
Option::unwrap_or returns the contained value if there is one, otherwise the fallback value provided. The final type is bool.
There is no usage of the Iterator trait. Both Iterator and Option have a map method. If you are interested in the fact that they have the same name and do similar things, that is what people refer to as a monad. Understanding monads is interesting but not required to actually use them.
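The steps listed above can be made visible by giving every intermediate value its own binding with an explicit type annotation. This is only a sketch of the same find logic spelled out the long way; the leaf helper is a hypothetical convenience for building test trees.

```rust
pub struct Node<T> {
    pub value: T,
    pub left: Option<Box<Node<T>>>,
    pub right: Option<Box<Node<T>>>,
}

impl<T: Ord> Node<T> {
    /// Same logic as the one-line version, split into typed steps.
    pub fn find(&self, key: &T) -> bool {
        if self.value == *key {
            return true;
        }
        // Step 1: pick a child slot. Nothing is moved, we only borrow it.
        let target_node: &Option<Box<Node<T>>> = if *key < self.value {
            &self.left
        } else {
            &self.right
        };
        // Step 2: `as_ref` turns `&Option<X>` into `Option<&X>`.
        let borrowed: Option<&Box<Node<T>>> = target_node.as_ref();
        // Step 3: `map` runs the closure only for `Some`; the method call
        // auto-dereferences `&Box<Node<T>>` down to `Node<T>`.
        let found: Option<bool> = borrowed.map(|n| n.find(key));
        // Step 4: a missing child means the key is not in the tree.
        found.unwrap_or(false)
    }
}

/// Hypothetical helper: builds a node without children.
pub fn leaf<T>(value: T) -> Node<T> {
    Node { value, left: None, right: None }
}
```

The annotations are redundant for the compiler, but removing them one at a time is a quick way to see which type each call produces.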
Implement the find method on Node and create a stub find method for Tree which could look like this:
impl<T: Ord> Tree<T> {
pub fn find(&self, key: &T) -> bool {
match self.root.as_ref() {
None => false,
Some(x) => x.find(key)
}
}
}
I understand that the preferred way to iterate in Rust is through the for var in (range) syntax, but sometimes I'd like to work on more than one of the elements in that range at a time.
From a Ruby perspective, I'm trying to find a way of doing (1..100).each_slice(5) do |this_slice| in Rust.
I'm trying things like
for mut segment_start in (segment_size..max_val).step_by(segment_size) {
let this_segment = segment_start..(segment_start + segment_size).iter().take(segment_size);
}
but I keep getting errors that suggest I'm barking up the wrong type tree. The docs aren't helpful either--they just don't contain this use case.
What's the Rust way to do this?
Use chunks (or chunks_mut if you need mutability):
fn main() {
let things = [5, 4, 3, 2, 1];
for slice in things.chunks(2) {
println!("{:?}", slice);
}
}
Outputs:
[5, 4]
[3, 2]
[1]
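For the mutable case mentioned above, chunks_mut hands out each chunk as a mutable sub-slice. A minimal sketch, where the function name label_chunks and the choice to overwrite each element with its chunk index are assumptions made purely for illustration:

```rust
/// Overwrites every element with the index of the chunk it belongs to.
/// `size` must be non-zero, because `chunks_mut` panics on a chunk size of 0.
pub fn label_chunks(values: &mut [u32], size: usize) {
    for (index, chunk) in values.chunks_mut(size).enumerate() {
        // `chunk` is a `&mut [u32]` covering up to `size` elements.
        for value in chunk.iter_mut() {
            *value = index as u32;
        }
    }
}
```

As with chunks, the last chunk is shorter when the slice length is not a multiple of the chunk size.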
The easiest way to combine this with a Range would be to collect the range to a Vec first (which dereferences to a slice):
fn main() {
let things: Vec<_> = (1..100).collect();
for slice in things.chunks(5) {
println!("{:?}", slice);
}
}
Another solution that is pure-iterator would be to use Itertools::chunks (called chunks_lazy in old releases of the itertools crate):
extern crate itertools;
use itertools::Itertools;
fn main() {
    // Iterate over a reference to the adaptor to obtain the chunks.
    for chunk in &(1..100).chunks(5) {
        for val in chunk {
            print!("{}, ", val);
        }
        println!();
    }
}
Which suggests a similar solution that only requires the standard library:
fn main() {
let mut range = (1..100).peekable();
while range.peek().is_some() {
for value in range.by_ref().take(5) {
print!("{}, ", value);
}
println!("");
}
}
One catch is that Ruby and Rust handle this differently, mostly for reasons of efficiency.
In Ruby, Enumerable can create new arrays to stuff values in without worrying about ownership and return a new array each time (check with this_slice.object_id).
In Rust, allocating a new vector each time would be pretty unusual. Additionally, you can't easily return a reference to a vector that the iterator holds due to complicated lifetime concerns.
A solution that's very similar to Ruby's is:
fn main() {
let mut range = (1..100).peekable();
while range.peek().is_some() {
let chunk: Vec<_> = range.by_ref().take(5).collect();
println!("{:?}", chunk);
}
}
Which could be wrapped up in a new iterator that hides the details:
use std::iter::Peekable;
struct InefficientChunks<I>
where I: Iterator
{
iter: Peekable<I>,
size: usize,
}
impl<I> Iterator for InefficientChunks<I>
where I: Iterator
{
type Item = Vec<I::Item>;
fn next(&mut self) -> Option<Self::Item> {
if self.iter.peek().is_some() {
Some(self.iter.by_ref().take(self.size).collect())
} else {
None
}
}
}
trait Awesome: Iterator + Sized {
fn inefficient_chunks(self, size: usize) -> InefficientChunks<Self> {
InefficientChunks {
iter: self.peekable(),
size: size,
}
}
}
impl<I> Awesome for I where I: Iterator {}
fn main() {
for chunk in (1..100).inefficient_chunks(5) {
println!("{:?}", chunk);
}
}
Collecting into a Vec can easily kill your performance. An approach similar to the one in the question is perfectly fine:
use std::ops::Range;
fn chunk_range(range: Range<usize>, chunk_size: usize) -> impl Iterator<Item=Range<usize>> {
    range.clone().step_by(chunk_size).map(move |block_start| {
        // Clamp the last block so it never runs past the end of the range.
        let block_end = (block_start + chunk_size).min(range.end);
        block_start..block_end
    })
}
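To illustrate what the sub-range approach yields, here is a self-contained sketch that repeats chunk_range and adds a hypothetical helper, chunk_sums, which processes every block without allocating per chunk.

```rust
use std::ops::Range;

/// Splits `range` into consecutive sub-ranges of at most `chunk_size` items.
/// `chunk_size` must be non-zero, because `step_by` panics on a step of 0.
pub fn chunk_range(range: Range<usize>, chunk_size: usize) -> impl Iterator<Item = Range<usize>> {
    range.clone().step_by(chunk_size).map(move |block_start| {
        // Clamp the last block so it never runs past the end of the range.
        let block_end = (block_start + chunk_size).min(range.end);
        block_start..block_end
    })
}

/// Hypothetical helper: sums each block; only the result Vec is allocated.
pub fn chunk_sums(range: Range<usize>, chunk_size: usize) -> Vec<usize> {
    chunk_range(range, chunk_size)
        .map(|block| block.sum())
        .collect()
}
```

For example, chunk_range(1..12, 5) produces the blocks 1..6, 6..11 and 11..12, each of which is itself an iterator over its values.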
I am attempting to implement a function that returns a recursive closure, though I am not sure how to express that in the function signature. Here is a working implementation in Python:
def counter(state):
    def handler(msg):
        if msg == 'inc':
            print(state)
            return counter(state + 1)
        if msg == 'dec':
            print(state)
            return counter(state - 1)
    return handler

c = counter(1)
for x in range(1000000):
    c = c('inc')
and pseudo code for Rust.
enum Msg {
Inc,
Dec
}
fn counter(state: Int) -> ? {
move |msg| match msg {
Msg::Inc => counter(state + 1),
Msg::Dec => counter(state - 1),
}
}
Because Rust supports recursive types, you just need to encode the recursion in a separate structure:
enum Msg {
Inc,
Dec,
}
// in this particular example Fn(Msg) -> F should work as well
// (`dyn` is required for trait objects in current Rust editions)
struct F(Box<dyn FnMut(Msg) -> F>);
fn counter(state: i32) -> F {
F(Box::new(move |msg| match msg {
Msg::Inc => {
println!("{}", state);
counter(state + 1)
}
Msg::Dec => {
println!("{}", state);
counter(state - 1)
}
}))
}
fn main() {
let mut c = counter(1);
for _ in 0..1000 {
c = c.0(Msg::Inc);
}
}
We cannot do away with boxing here, unfortunately - since unboxed closures have unnameable types, we need to box them into a trait object to be able to name them inside the structure declaration.
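As a sketch of the Fn variant mentioned in the code comment above, the following version returns the observed state together with the next handler instead of printing it, which makes the behaviour easy to check. The names Counter, send and run, and the tuple return type, are assumptions introduced for this illustration only.

```rust
pub enum Msg {
    Inc,
    Dec,
}

/// Newtype that gives the recursive closure type a name.
/// `Fn` is enough here because the closure never mutates its captured state.
pub struct Counter(Box<dyn Fn(Msg) -> (i32, Counter)>);

impl Counter {
    /// Consumes the handler and returns the old state plus the next handler.
    pub fn send(self, msg: Msg) -> (i32, Counter) {
        (self.0)(msg)
    }
}

pub fn counter(state: i32) -> Counter {
    Counter(Box::new(move |msg: Msg| match msg {
        Msg::Inc => (state, counter(state + 1)),
        Msg::Dec => (state, counter(state - 1)),
    }))
}

/// Hypothetical driver: feeds the messages in order and records each state.
pub fn run(start: i32, msgs: Vec<Msg>) -> Vec<i32> {
    let mut current = counter(start);
    let mut seen = Vec::new();
    for msg in msgs {
        let (state, next) = current.send(msg);
        seen.push(state);
        current = next;
    }
    seen
}
```

Each call allocates a fresh Box for the next handler, so this mirrors the Python original in structure rather than in cost.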