How do I flatten a recursive structure using recursive iterators?

I'm trying to flatten a recursive structure but I'm having trouble with recursive iterators.
Here's what the struct looks like:
#[derive(Debug, Clone)]
pub struct C {
name: String,
vb: Option<Vec<B>>,
}
#[derive(Debug, Clone)]
pub struct B {
c: Option<C>,
}
#[derive(Debug, Clone)]
pub struct A {
vb: Option<Vec<B>>,
flat_c: Option<Vec<C>>,
}
My plan is to traverse the vb vector and flatten it into flat_c. I want it to look like this, or at least, be a Vec<String>:
Some([
C {
name: "foo",
vb: None,
},
C {
name: "bar",
vb: None,
},
C {
name: "fizz",
vb: None,
},
C {
name: "buzz",
vb: None,
},
])
Here's what I managed to do. It somewhat flattens the struct, but only for the last element, as the recursion is not implemented.
impl A {
fn flat_c(self) -> Self {
let fc: Vec<C> = self
.vb
.clone()
.unwrap()
.iter()
.flat_map(|x| x.c.as_ref().unwrap().vb.as_ref().unwrap().iter())
.cloned()
.map(|x| x.c.unwrap())
.collect();
Self {
flat_c: Some(fc),
..self
}
}
}
fn main() {
let a = A {
vb: Some(vec![
B {
c: Some(C {
name: "foo".to_string(),
vb: Some(vec![B {
c: Some(C {
name: "bar".to_string(),
vb: None,
}),
}]),
}),
},
B {
c: Some(C {
name: "fizz".to_string(),
vb: Some(vec![B {
c: Some(C {
name: "buzz".to_string(),
vb: None,
}),
}]),
}),
},
]),
flat_c: None,
};
let a = a.flat_c();
println!("a: {:#?}", a);
}
The output for flat_c:
Some([
C {
name: "bar",
vb: None,
},
C {
name: "buzz",
vb: None,
},
])
I haven't dived into the Iterator trait implementation that might be required for this problem.
How would I tackle this problem? Maybe using a fold? Perhaps a recursive approach is not even needed? I'm at a loss.

It's a good idea to be familiar with common data structures. You have a tree, and there are several ways to traverse a tree. You haven't precisely specified which method to use, so I chose one arbitrarily that's easy to implement.
The key here is to implement an iterator that keeps track of some state: all of the nodes yet to be visited. On each call to Iterator::next, we take the next value, save aside any new nodes to visit, and return the value.
Once you have the iterator, you can collect it into a Vec.
use std::collections::VecDeque;
impl IntoIterator for A {
type IntoIter = IntoIter;
type Item = String;
fn into_iter(self) -> Self::IntoIter {
IntoIter {
remaining: self.vb.into_iter().flatten().collect(),
}
}
}
struct IntoIter {
remaining: VecDeque<B>,
}
impl Iterator for IntoIter {
type Item = String;
fn next(&mut self) -> Option<Self::Item> {
self.remaining.pop_front().and_then(|b| {
b.c.map(|C { name, vb }| {
self.remaining.extend(vb.into_iter().flatten());
name
})
})
}
}
fn to_strings(a: A) -> Vec<String> {
a.into_iter().collect()
}
#[derive(Debug, Clone)]
struct A {
vb: Option<Vec<B>>,
}
#[derive(Debug, Clone)]
struct B {
c: Option<C>,
}
#[derive(Debug, Clone)]
struct C {
name: String,
vb: Option<Vec<B>>,
}
fn main() {
let example: A = A {
vb: Some(vec![
B {
c: Some(C {
name: "Hello ".to_string(),
vb: None,
}),
},
B {
c: Some(C {
name: "World!".to_string(),
vb: None,
}),
},
]),
};
println!("The example struct: {:?}", example);
//clone a copy for a second example, because to_strings() takes ownership of the example A struct
let receipt: A = example.clone();
println!("Iterated: {:?}", to_strings(example));
// another example of using to_strings()
println!(
"As a string: {:?}",
to_strings(receipt).into_iter().collect::<String>()
);
}
From here, it should be straightforward to create an iterator of B if that's what you need. Having all of the None values seemed silly, so I left them out and directly returned Strings.
I also made this a by-value iterator. You could follow the same pattern to create an iterator that returned references to the B / String and only clone them as needed.
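As a rough sketch of what such a by-reference iterator might look like (the `Names` type and the `names` method are hypothetical names chosen for illustration, not part of the code above): it keeps a queue of `&B` instead of owned `B` values and yields `&str` slices borrowed from the tree. One deliberate difference in this sketch is that a `B` whose `c` is `None` is skipped rather than ending the iteration.

```rust
use std::collections::VecDeque;

struct A {
    vb: Option<Vec<B>>,
}

struct B {
    c: Option<C>,
}

struct C {
    name: String,
    vb: Option<Vec<B>>,
}

// Hypothetical borrowing iterator: holds references to the nodes
// that still have to be visited.
struct Names<'a> {
    remaining: VecDeque<&'a B>,
}

impl A {
    // `names` is an illustrative method name (an assumption of this sketch).
    fn names(&self) -> Names<'_> {
        Names {
            // Option<Vec<B>>::iter() yields at most one &Vec<B>;
            // flatten() turns that into a stream of &B.
            remaining: self.vb.iter().flatten().collect(),
        }
    }
}

impl<'a> Iterator for Names<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<Self::Item> {
        // Pop nodes until one actually contains a C.
        while let Some(b) = self.remaining.pop_front() {
            if let Some(c) = b.c.as_ref() {
                // Queue the children for later, then yield this name.
                self.remaining.extend(c.vb.iter().flatten());
                return Some(c.name.as_str());
            }
        }
        None
    }
}
```

Because the queue is first-in, first-out, the names come out level by level (breadth-first). Callers that need owned values can still write `a.names().map(str::to_owned).collect::<Vec<String>>()` and clone only at that point.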
See also:
How to implement Iterator and IntoIterator for a simple struct?
Implement IntoIterator for binary tree
Cannot obtain a mutable reference when iterating a recursive structure: cannot borrow as mutable more than once at a time
Recursive inorder traversal of a binary search tree

Here is my solution:
impl C {
fn flat(&self) -> Vec<C> {
let mut result = Vec::new();
result.push(C {
name: self.name.clone(),
vb: None,
});
if self.vb.is_some() {
result.extend(
(self.vb.as_ref().unwrap().iter())
.flat_map(|b| b.c.as_ref().map(|c| c.flat()).unwrap_or(Vec::new())),
);
}
return result;
}
}
impl A {
fn flat_c(self) -> Self {
let fc = (self.vb.as_ref().unwrap().iter())
.flat_map(|b| b.c.as_ref().unwrap().flat())
.collect();
Self {
flat_c: Some(fc),
..self
}
}
}
It adds a flat function to C, because C is the source of the recursion and only this struct can handle it properly.
Because of all those Options the code looks scary, and the cryptic error messages are hard to deal with. This solution assumes that none of the b.c values of the initial a is None; otherwise, it will panic. My advice is to avoid Option<Vec> and use an empty vector instead of None.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=09ea11342cdd733b03172c0fc13c85fd
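To illustrate that last piece of advice, here is a minimal sketch of how the flattening might look if `vb` were a plain `Vec<B>`. The reshaped structs, `B::flat` and the free function `flatten_all` are assumptions made for this example, not the code from the playground link.

```rust
#[derive(Debug, Clone, PartialEq)]
struct C {
    name: String,
    vb: Vec<B>, // an empty vector plays the role of None
}

#[derive(Debug, Clone, PartialEq)]
struct B {
    c: Option<C>,
}

impl C {
    // Depth-first: this node first, then everything below it.
    fn flat(&self) -> Vec<C> {
        let mut result = vec![C {
            name: self.name.clone(),
            vb: Vec::new(),
        }];
        result.extend(self.vb.iter().flat_map(B::flat));
        result
    }
}

impl B {
    // A B without a C contributes nothing instead of panicking.
    fn flat(&self) -> Vec<C> {
        self.c.as_ref().map(C::flat).unwrap_or_default()
    }
}

// Hypothetical helper standing in for A::flat_c.
fn flatten_all(vb: &[B]) -> Vec<C> {
    vb.iter().flat_map(B::flat).collect()
}
```

With no `Option` around the vectors there is nothing left to `unwrap`, so this version has no panic path for missing children.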

I'm not sure what exactly you want the result of "traverse the vb vector and flatten it into flat_c" to be, but here's a slightly simpler example of flattening a recursive structure, using once for the value that corresponds to the current node, chain to concatenate it with its children and flat_map to flatten everything:
use std::iter::once;
#[derive(Debug)]
struct S {
name: String,
children: Vec<S>,
}
impl S {
fn flat(self) -> Vec<String> {
once(self.name)
.chain(self.children.into_iter().flat_map(|c| c.flat()))
.collect()
}
}
fn main() {
let s = S {
name: "parent".into(),
children: vec![
S {
name: "child 1".into(),
children: vec![],
},
S {
name: "child 2".into(),
children: vec![],
},
],
};
println!("s: {:?}", s);
println!("flat: {:?}", s.flat());
}

Related

Iterative tree parsing in Rust

Suppose we have a tree:
#[derive(Default, Debug, PartialEq, Eq)]
struct Tree {
children: Vec<Tree>,
}
And we want to build it from a list of bools where true is like an open tag and false is like a close tag in XML. I.e. [true, true, false, true, false, false] is
root -> node -> node
             `-> node
We can easily parse this recursively like this:
fn read_tree_recursive<'a>(tags: &mut impl Iterator<Item=&'a bool>) -> Tree {
let mut tree = Tree::default();
while let Some(&tag) = tags.next() {
if tag {
tree.children.push(read_tree_recursive(tags));
} else {
break;
}
}
tree
}
Here is some test code:
fn main() {
assert_eq!(
read_tree_recursive(&mut [true, true, false, true, false, false].iter()),
Tree {
children: vec![
Tree {
children: vec![
Tree::default(),
Tree::default(),
],
}
],
},
);
}
But how do you do this iteratively? In any other language you'd make a stack of pointers on the heap something like this:
fn read_tree_iterative(tags: &[bool]) -> Tree {
let mut root = Tree::default();
let tree_stack: Vec<&mut Tree> = vec![&mut root];
for &tag in tags {
if tag {
tree_stack.last().unwrap().children.push(Tree::default());
tree_stack.push(tree_stack.last().unwrap().children.last_mut().unwrap());
} else {
tree_stack.pop();
}
}
root
}
Unfortunately this doesn't work in Rust because Rust can't know that we're only ever mutating tree_stack.last(). The recursive version uses function calls to enforce that, but obviously it has the downsides that come with recursion.
Is there a good way around this other than resorting to RefCell? Unfortunately Rust doesn't have become so TCO isn't a good option either.
You can simply store owned values in your stack and add them to the children when you pop them from the stack:
fn read_tree_iterative<'a> (tags: &mut impl Iterator<Item=&'a bool>) -> Tree {
let mut stack = Vec::new();
let mut last = Tree::default();
for &tag in tags {
if tag {
stack.push (last);
last = Tree::default();
} else {
let mut parent = stack.pop().unwrap();
parent.children.push (last);
last = parent;
}
}
last
}
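One thing to be aware of with the owned-stack approach is that `stack.pop().unwrap()` panics if a close tag arrives with no matching open tag. A possible variant, sketched here under the assumption that unbalanced input should be reported rather than tolerated (`read_tree_checked` is a hypothetical name), returns an `Option` instead:

```rust
#[derive(Default, Debug, PartialEq, Eq)]
struct Tree {
    children: Vec<Tree>,
}

// Hypothetical variant of the owned-stack parser: returns None when a close
// tag has no matching open tag, or when open tags remain unclosed at the end.
fn read_tree_checked(tags: &[bool]) -> Option<Tree> {
    let mut stack: Vec<Tree> = Vec::new();
    let mut last = Tree::default();
    for &tag in tags {
        if tag {
            // Open tag: park the current node and start a fresh child.
            stack.push(last);
            last = Tree::default();
        } else {
            // Close tag: attach the finished child to its parent.
            let mut parent = stack.pop()?;
            parent.children.push(last);
            last = parent;
        }
    }
    // Anything still on the stack was opened but never closed.
    if stack.is_empty() {
        Some(last)
    } else {
        None
    }
}
```

Whether unclosed tags should be an error is a design choice; the recursive version in the question quietly accepts them.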

Appending an element to recursive type in Rust

The code I have:
#[derive(Debug, Clone, Eq, PartialEq)]
struct BugColony {
pub first: Link,
}
type Link = Option<Box<Bug>>;
#[derive(Debug, Clone, Eq, PartialEq)]
struct Bug {
bug_type: String,
next_bug: Link,
}
Now I'd like to create a function that appends a new Bug to the end of the recursive bug 'list'. What is the Rust way of doing that?
ex:
fn main() {
let mut list = BugColony { first: None };
list.add_bug(String::from("Bee"));
list.add_bug(String::from("Bedbug"));
println!("{:?}", list);
}
impl BugColony {
fn add_bug(&mut self, type: String) {
...
}
}
So the result would be:
BugColony {
first: Some(
Bug {
bug_type: "Bee",
next_bug: Some(
Bug {
bug_type: "Bedbug",
next_bug: None
})
})
}
This just sounds like appending to a Linked List to me; I don't think there's anything particularly Rusty about it. If what you're asking is how one would recommend performing the whole "loop to the end of a Linked List" thing in Rust, I'd use while let to unwrap your Option enums.
Below is an example. I also took the liberty to rename type (since it's a keyword), and to swap String for impl Into<String> so that you could pass in a &'static str and the like.
type Link<T> = Option<Box<T>>;
#[derive(Debug, Clone, Eq, PartialEq)]
struct BugColony {
pub first: Link<Bug>,
}
#[derive(Debug, Clone, Eq, PartialEq)]
struct Bug {
bug_type: String,
next_bug: Link<Bug>,
}
impl BugColony {
fn add_bug(&mut self, bug_type: impl Into<String>) {
let mut cur = &mut self.first;
// As long as the current bug has one after it, advance down the list.
// Once the loop is done running, `cur` will be a reference to the `None`
// at the end of the list.
while let Some(bug) = cur {
cur = &mut bug.next_bug;
}
let new = Bug { bug_type: bug_type.into(), next_bug: None };
*cur = Some(Box::new(new));
}
}
fn main() {
let mut col = BugColony { first: None };
col.add_bug("Grasshopper");
col.add_bug("Ladybug");
col.add_bug("Fire Ant");
println!("{col:#?}");
}
Output:
BugColony {
first: Some(
Bug {
bug_type: "Grasshopper",
next_bug: Some(
Bug {
bug_type: "Ladybug",
next_bug: Some(
Bug {
bug_type: "Fire Ant",
next_bug: None,
},
),
},
),
},
),
}
If you really want a linked list, there is nothing specific to Rust here.
In your example, add_bug() is expected to append to the end of the list.
Just follow the Links and append a new Bug when you find None (see append_bug() below).
It would be much simpler to prepend: you just have to change the first Link (see prepend_bug() below).
I cannot think of a real situation where a linked list is better than a vector; it's much more complicated and much less efficient (explaining why in detail would take too long here: cache-line usage, prefetching...).
I suggest you simply use a vector; see VBugColony and VBug below.
#[derive(Debug, Clone, Eq, PartialEq)]
struct BugColony {
first: Link,
}
type Link = Option<Box<Bug>>;
#[derive(Debug, Clone, Eq, PartialEq)]
struct Bug {
bug_type: String,
next_bug: Link,
}
impl BugColony {
fn append_bug(
&mut self,
bug_type: String,
) {
let mut lnk = &mut self.first;
loop {
if let Some(bug) = lnk {
lnk = &mut bug.next_bug;
} else {
*lnk = Some(Box::new(Bug {
bug_type,
next_bug: None,
}));
break;
}
}
}
fn prepend_bug(
&mut self,
bug_type: String,
) {
self.first = Some(Box::new(Bug {
bug_type,
next_bug: self.first.take(),
}));
}
}
#[derive(Debug, Clone, Eq, PartialEq)]
struct VBugColony {
bugs: Vec<VBug>,
}
#[derive(Debug, Clone, Eq, PartialEq)]
struct VBug {
bug_type: String,
}
impl VBugColony {
fn add_bug(
&mut self,
bug_type: String,
) {
self.bugs.push(VBug { bug_type });
}
}
fn main() {
let mut list = BugColony { first: None };
list.append_bug(String::from("Bee"));
list.append_bug(String::from("Bedbug"));
println!("{:#?}", list);
//
let mut list = BugColony { first: None };
list.prepend_bug(String::from("Bedbug"));
list.prepend_bug(String::from("Bee"));
println!("{:#?}", list);
//
let mut list = VBugColony {
bugs: Default::default(),
};
list.add_bug(String::from("Bedbug"));
list.add_bug(String::from("Bee"));
println!("{:#?}", list);
}
/*
BugColony {
first: Some(
Bug {
bug_type: "Bee",
next_bug: Some(
Bug {
bug_type: "Bedbug",
next_bug: None,
},
),
},
),
}
BugColony {
first: Some(
Bug {
bug_type: "Bee",
next_bug: Some(
Bug {
bug_type: "Bedbug",
next_bug: None,
},
),
},
),
}
VBugColony {
bugs: [
VBug {
bug_type: "Bedbug",
},
VBug {
bug_type: "Bee",
},
],
}
*/

How to get Array of Field Values from Array of Structs in Rust?

I'd like to map an array of structs to an array of field values. How would I do this?
pub struct Person {
name: String
}
fn main() {
let my_people = vec![
Person {
name: "Bob".to_string(),
},
Person {
name: "Jill".to_string(),
},
Person {
name: "Rakim".to_string(),
},
];
//map my_people to ["Bob", "Jill", "Rakim"]
}
You have two possible solutions, depending on whether you want to clone the names or borrow them. Both are shown below:
pub struct Person {
name: String,
}
fn people_names_owned(people: &[Person]) -> Vec<String> {
people.iter().map(|p| p.name.clone()).collect()
}
fn people_names_borrowed(people: &[Person]) -> Vec<&str> {
people.iter().map(|p| p.name.as_ref()).collect()
}
fn main() {
let my_people = vec![
Person {
name: "Bob".to_string(),
},
Person {
name: "Jill".to_string(),
},
Person {
name: "Rakim".to_string(),
},
];
println!("{:?}", people_names_owned(&my_people));
println!("{:?}", people_names_borrowed(&my_people));
}
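If the original vector is not needed afterwards, a third possibility is to consume it and move each name out, which avoids both the clone and the borrow. This is only a sketch, and `people_names_moved` is a hypothetical name following the pattern of the two functions above:

```rust
pub struct Person {
    name: String,
}

// Takes the vector by value, so each String is moved out of its Person
// rather than cloned; the vector itself is gone after the call.
fn people_names_moved(people: Vec<Person>) -> Vec<String> {
    people.into_iter().map(|p| p.name).collect()
}
```

The trade-off is that `my_people` can no longer be used after being passed to this function.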

Recursive search of node in tree

I want to build a tree using exactly two structs: Node and Tree and then recursively search for a target node from the tree. If a target is found, return true, else return false.
The challenge for me here is how to recursively call the find function, since it is only defined on Tree not Node.
pub struct Node<T> {
value: T,
left: Option<Box<Node<T>>>,
right: Option<Box<Node<T>>>,
}
pub struct Tree<T> {
root: Option<Box<Node<T>>>,
}
impl<T: Ord> Tree<T> {
/// Creates an empty tree
pub fn new() -> Self {
Tree { root: None }
}
// search the tree
pub fn find(&self, key: &T) -> bool {
let root_node = &self.root; // root is Option
match *root_node {
Some(ref node) => {
if node.value == *key {
return true;
}
let target_node = if *key < node.value {
&node.left
} else {
&node.right
};
match *target_node {
Some(sub_node) => sub_node.find(key),
None => {
return false;
}
}
}
None => return false,
}
}
}
fn main() {
let mut mytree: Tree<i32> = Tree::new();
let node1 = Node {
value: 100,
left: None,
right: None,
};
let boxed_node1 = Some(Box::new(node1));
let root = Node {
value: 200,
left: boxed_node1,
right: None,
};
let boxed_root = Some(Box::new(root));
let mytree = Tree { root: boxed_root };
let res = mytree.find(&100);
}
The current code reports the error:
error: no method named `find` found for type `Box<Node<T>>` in the current scope
--> src/main.rs:36:48
|
36 | Some(sub_node) => sub_node.find(key),
| ^^^^
|
= note: the method `find` exists but the following trait bounds were not satisfied: `Node<T> : std::iter::Iterator`
= help: items from traits can only be used if the trait is implemented and in scope; the following traits define an item `find`, perhaps you need to implement one of them:
= help: candidate #1: `std::iter::Iterator`
= help: candidate #2: `core::str::StrExt`
I understand that find is only implemented on Tree, so there is an error, but I don't think it is efficient to implement find on both Tree and Node. Any hint to solve this?
You need to move the majority of the implementation to the Node type, then leave only a small shim in Tree:
impl<T: Ord> Tree<T> {
pub fn find(&self, key: &T) -> bool {
self.root.as_ref().map(|n| n.find(key)).unwrap_or(false)
}
}
impl<T: Ord> Node<T> {
// search the tree
pub fn find(&self, key: &T) -> bool {
if self.value == *key {
return true;
}
let target_node = if *key < self.value {
&self.left
} else {
&self.right
};
target_node.as_ref().map(|n| n.find(key)).unwrap_or(false)
}
}
However, I might avoid multiple comparisons by just matching on the result:
pub fn find(&self, key: &T) -> bool {
use ::std::cmp::Ordering::*;
match self.value.cmp(key) {
Equal => true,
Greater => self.left.as_ref().map(|n| n.find(key)).unwrap_or(false),
Less => self.right.as_ref().map(|n| n.find(key)).unwrap_or(false),
}
}
Or
pub fn find(&self, key: &T) -> bool {
use ::std::cmp::Ordering::*;
let child = match self.value.cmp(key) {
Equal => return true,
Greater => self.left.as_ref(),
Less => self.right.as_ref(),
};
child.map(|n| n.find(key)).unwrap_or(false)
}
I find target_node.as_ref().map(|n| n.find(key)).unwrap_or(false) hard to understand. I have just started learning about iterators. Is it possible to explain the long expression step by step?
Just follow the type signatures of each function:
self is a &Node<T>
&self.left / &self.right / target_node are a &Option<Box<Node<T>>>
Option::as_ref converts an &Option<T> to Option<&T>. Now we have Option<&Box<Node<T>>>.
Option::map applies a function (which may change the contained type) to the option if it is Some, otherwise it leaves it None.
The function we apply is Node::find, which takes a &Node<T> and returns a bool.
Box<T> implements Deref so any methods on T appear on Box<T>.
Automatic dereferencing allows us to treat &Box<T> as Box<T>.
Now we have Option<bool>
Option::unwrap_or returns the contained value if there is one, otherwise the fallback value provided. The final type is bool.
There is no usage of the Iterator trait. Both Iterator and Option have a map method. If you are interested in the fact that they have the same name and do similar things, that is what people refer to as a monad. Understanding monads is interesting but not required to actually use them.
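To make the steps above concrete, the same expression can be split into one `let` binding per step with the intermediate types written out. This is only an expanded restatement of the one-liner; the binding names `borrowed` and `found` are made up for the example.

```rust
struct Node<T> {
    value: T,
    left: Option<Box<Node<T>>>,
    right: Option<Box<Node<T>>>,
}

impl<T: Ord> Node<T> {
    fn find(&self, key: &T) -> bool {
        if self.value == *key {
            return true;
        }

        let target_node: &Option<Box<Node<T>>> = if *key < self.value {
            &self.left
        } else {
            &self.right
        };

        // Step 1: as_ref turns &Option<Box<Node<T>>> into Option<&Box<Node<T>>>.
        let borrowed: Option<&Box<Node<T>>> = target_node.as_ref();

        // Step 2: map runs the closure only for Some; auto-deref lets us
        // call Node::find through the &Box.
        let found: Option<bool> = borrowed.map(|n| n.find(key));

        // Step 3: unwrap_or supplies false when there was no child.
        found.unwrap_or(false)
    }
}
```

The annotations are redundant for the compiler, but they show how each call changes the type until only a `bool` is left.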
Implement the find method on Node and create a stub find method for Tree which could look like this:
impl<T: Ord> Tree<T> {
pub fn find(&self, key: &T) -> bool {
match self.root.as_ref() {
None => false,
Some(x) => x.find(key)
}
}
}

How do I return a vector element from a Rust function?

I would like to return an element of a vector:
struct EntryOne {
pub name: String,
pub value: Option<String>,
}
struct TestVec {}
impl TestVec {
pub fn new() -> TestVec {
TestVec {}
}
pub fn findAll(&self) -> Vec<EntryOne> {
let mut ret = Vec::new();
ret.push(EntryOne {
name: "foo".to_string(),
value: Some("FooVal".to_string()),
});
ret.push(EntryOne {
name: "foo2".to_string(),
value: Some("FooVal2".to_string()),
});
ret.push(EntryOne {
name: "foo3".to_string(),
value: None,
});
ret.push(EntryOne {
name: "foo4".to_string(),
value: Some("FooVal4".to_string()),
});
ret
}
pub fn findOne(&self) -> Option<EntryOne> {
let mut list = &self.findAll();
if list.len() > 0 {
println!("{} elements found", list.len());
list.first()
} else {
None
}
}
}
fn main() {
let test = TestVec::new();
test.findAll();
test.findOne();
}
I always get this error:
error[E0308]: mismatched types
--> src/main.rs:40:13
|
35 | pub fn findOne(&self) -> Option<EntryOne> {
| ---------------- expected `std::option::Option<EntryOne>` because of return type
...
40 | list.first()
| ^^^^^^^^^^^^ expected struct `EntryOne`, found &EntryOne
|
= note: expected type `std::option::Option<EntryOne>`
found type `std::option::Option<&EntryOne>`
How do I return an element?
Look at the signature for Vec::first:
fn first(&self) -> Option<&T>
Given a reference to a vector, it will return a reference to the first item if there is one, and None otherwise. That means that the vector containing the values must outlive the return value, otherwise the reference would point to undefined memory.
There are two main avenues:
If you cannot change the vector, then you will need to make a copy of your data structure. The easiest way to do this is to annotate the structure with #[derive(Clone)]. Then you can call Option::cloned on the result of first.
If you can change the vector, then you can remove the first value from it and return it. There are many ways of doing this, but the shortest code-wise is to use the drain iterator.
#[derive(Debug, Clone)]
struct EntryOne {
name: String,
value: Option<String>,
}
fn find_all() -> Vec<EntryOne> {
vec![
EntryOne {
name: "foo".to_string(),
value: Some("FooVal".to_string()),
},
EntryOne {
name: "foo2".to_string(),
value: Some("FooVal2".to_string()),
},
EntryOne {
name: "foo3".to_string(),
value: None,
},
EntryOne {
name: "foo4".to_string(),
value: Some("FooVal4".to_string()),
},
]
}
fn find_one_by_clone() -> Option<EntryOne> {
find_all().first().cloned()
}
fn find_one_by_drain() -> Option<EntryOne> {
let mut all = find_all();
let mut i = all.drain(0..1);
i.next()
}
fn main() {
println!("{:?}", find_one_by_clone());
println!("{:?}", find_one_by_drain());
}
Additional changes:
There's no need for TestVec if there's no state; just make functions.
Rust style is snake_case for method and variable names.
Use vec! to construct a vector when providing all the elements.
Derive Debug so you can print the value.
If you wanted to always get the last element, you can use pop:
fn find_one_by_pop() -> Option<EntryOne> {
find_all().pop()
}
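As one more of the "many ways" to take the first value out of a vector you own, a sketch using `into_iter` is below. The function name `find_one_by_into_iter` is hypothetical, and `find_all` is shortened to two entries to keep the example small.

```rust
#[derive(Debug, Clone, PartialEq)]
struct EntryOne {
    name: String,
    value: Option<String>,
}

// Shortened stand-in for the find_all function shown earlier.
fn find_all() -> Vec<EntryOne> {
    vec![
        EntryOne {
            name: "foo".to_string(),
            value: Some("FooVal".to_string()),
        },
        EntryOne {
            name: "foo2".to_string(),
            value: Some("FooVal2".to_string()),
        },
    ]
}

// Consumes the vector and moves the first element out without cloning;
// the remaining elements are dropped along with the iterator.
fn find_one_by_into_iter() -> Option<EntryOne> {
    find_all().into_iter().next()
}
```

Like the `drain` version, this needs ownership of (or mutable access to) the vector, so it does not help when you only hold a shared reference.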
