How to create self-referential AST in Rust? - recursion

Let's imagine we have a very simple AST for programming language with only functions and calls
use std::sync::Arc;
struct Function {
pub body: Vec<Call>
}
struct Call {
pub function: Arc<Function>
}
Now we want to recursively call functions:
let f = Arc::new(Function { body: vec![Call {function: f.clone()}] })
// Doesn't work, as we haven't created Arc yet
let mut f = Arc::new(Function { body: vec![] });
f.body.push(Call { function: f.clone() })
// Doesn't work, because Arc has immutable data
The problem can be solved using Arc<Mutex<_>> or Arc<RwLock<_>> in Call:
use std::sync::{Arc, RwLock}
struct Call {
pub function: Arc<RwLock<Function>>
}
And then creating with:
let f = Arc::new(RwLock::new(Function { body: vec![] }));
f.write().unwrap().body.push(Call { function: f.clone() })
But now you need to always write f.write().unwrap() or f.read().unwrap() everywhere, even though function is mutable only on construction.
Can you do something better than this?

Use Weak for the parent pointer. You need to do that anyway to allow for proper destruction. Then use Arc::new_cyclic():
use std::sync::{Arc, Weak};
struct Function {
pub body: Vec<Call>,
}
struct Call {
pub function: Weak<Function>,
}
let f = Arc::new_cyclic(|f| Function {
body: vec![Call {
function: Weak::clone(f),
}],
});

Related

How do I flatten a recursive structure using recursive iterators?

I'm trying to flatten a recursive structure but I'm having trouble with recursive iterators.
Here's what the struct looks like:
#[derive(Debug, Clone)]
pub struct C {
name: String,
vb: Option<Vec<B>>,
}
#[derive(Debug, Clone)]
pub struct B {
c: Option<C>,
}
#[derive(Debug, Clone)]
pub struct A {
vb: Option<Vec<B>>,
flat_c: Option<Vec<C>>,
}
My plan is to traverse the vb vector and flatten it into flat_c. I want it to look like this, or at least, be a Vec<String>:
Some([
C {
name: "foo",
vb: None,
},
C {
name: "bar",
vb: None,
},
C {
name: "fizz",
vb: None,
},
C {
name: "buzz",
vb: None,
},
])
Here what I managed to do, somewhat flattening the struct, but only for the last element, as the recursion is not implemented.
impl A {
fn flat_c(self) -> Self {
let fc: Vec<C> = self
.vb
.clone()
.unwrap()
.iter()
.flat_map(|x| x.c.as_ref().unwrap().vb.as_ref().unwrap().iter())
.cloned()
.map(|x| x.c.unwrap())
.collect();
Self {
flat_c: Some(fc),
..self
}
}
}
fn main() {
let a = A {
vb: Some(vec![
B {
c: Some(C {
name: "foo".to_string(),
vb: Some(vec![B {
c: Some(C {
name: "bar".to_string(),
vb: None,
}),
}]),
}),
},
B {
c: Some(C {
name: "fiz".to_string(),
vb: Some(vec![B {
c: Some(C {
name: "buzz".to_string(),
vb: None,
}),
}]),
}),
},
]),
flat_c: None,
};
let a = a.flat_c();
println!("a: {:#?}", a);
}
playground
The output for flat_c:
Some([
C {
name: "bar",
vb: None,
},
C {
name: "buzz",
vb: None,
},
])
I haven't dived into the Iterator trait implementation that might be required for this problem.
How would I tackle this problem? Maybe using a fold? Perhaps a recursive approach is not even needed? I'm at loss.
It's a good idea to be familiar with common data structures. You have a tree, and there are several ways to traverse a tree. You haven't precisely specified which method to use, so I chose one arbitrarily that's easy to implement.
The key here is to implement an iterator that keeps track of some state: all of the nodes yet to be visited. On each call to Iterator::next, we take the next value, save aside any new nodes to visit, and return the value.
Once you have the iterator, you can collect it into a Vec.
use std::collections::VecDeque;
impl IntoIterator for A {
type IntoIter = IntoIter;
type Item = String;
fn into_iter(self) -> Self::IntoIter {
IntoIter {
remaining: self.vb.into_iter().flatten().collect(),
}
}
}
struct IntoIter {
remaining: VecDeque<B>,
}
impl Iterator for IntoIter {
type Item = String;
fn next(&mut self) -> Option<Self::Item> {
self.remaining.pop_front().and_then(|b| {
b.c.map(|C { name, vb }| {
self.remaining.extend(vb.into_iter().flatten());
name
})
})
}
}
fn to_strings(a: A) -> Vec<String> {
a.into_iter().collect()
}
#[derive(Debug, Clone)]
struct A {
vb: Option<Vec<B>>,
}
#[derive(Debug, Clone)]
struct B {
c: Option<C>,
}
#[derive(Debug, Clone)]
struct C {
name: String,
vb: Option<Vec<B>>,
}
fn main() {
let example: A = A {
vb: Some(vec![
B {
c: Some(C {
name: "Hello ".to_string(),
vb: None,
}),
},
B {
c: Some(C {
name: "World!".to_string(),
vb: None,
}),
},
]),
};
println!("The example struct: {:?}", example);
//clone a copy for a second example, because to_strings() takes ownership of the example A struct
let receipt: A = example.clone();
println!("Iterated: {:?}", to_strings(example));
// another example of using to_strings()
println!(
"As a string: {:?}",
to_strings(receipt).into_iter().collect::<String>()
);
}
From here, it should be straight-forward to create an iterator of B if that's what you need. Having all of the None values seemed silly, so I left them out and directly returned Strings.
I also made this a by-value iterator. You could follow the same pattern to create an iterator that returned references to the B / String and only clone them as needed.
See also:
How to implement Iterator and IntoIterator for a simple struct?
Implement IntoIterator for binary tree
Cannot obtain a mutable reference when iterating a recursive structure: cannot borrow as mutable more than once at a time
Recursive inorder traversal of a binary search tree
There is my solution:
impl C {
fn flat(&self) -> Vec<C> {
let mut result = Vec::new();
result.push(C {
name: self.name.clone(),
vb: None,
});
if self.vb.is_some() {
result.extend(
(self.vb.as_ref().unwrap().iter())
.flat_map(|b| b.c.as_ref().map(|c| c.flat()).unwrap_or(Vec::new())),
);
}
return result;
}
}
impl A {
fn flat_c(self) -> Self {
let fc = (self.vb.as_ref().unwrap().iter())
.flat_map(|b| b.c.as_ref().unwrap().flat())
.collect();
Self {
flat_c: Some(fc),
..self
}
}
}
It adds flat function for C because the C is the source of the recursion and only this struct may properly handle it.
Because of those Options it looks scary and there is hard to deal with cryptic error messages. This solution supposes that all b.cs of initial a is not a None. Otherwise, it will panic. My advice is to avoid using Option<Vec> and use just empty vector instead of None.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=09ea11342cdd733b03172c0fc13c85fd
I'm not sure what exactly you want the result of "traverse the vb vector and flatten it into flat_c" to be, but here's a slightly simpler example of flattening a recursive structure, using once for the value that corresponds to the current node, chain to concatenate it with its children and flat_map to flatten everything:
use std::iter::once;
#[derive(Debug)]
struct S {
name: String,
children: Vec<S>,
}
impl S {
fn flat(self) -> Vec<String> {
once(self.name)
.chain(self.children.into_iter().flat_map(|c| c.flat()))
.collect()
}
}
fn main() {
let s = S {
name: "parent".into(),
children: vec![
S {
name: "child 1".into(),
children: vec![],
},
S {
name: "child 2".into(),
children: vec![],
},
],
};
println!("s: {:?}", s);
println!("flat: {:?}", s.flat());
}
playground

How to recursively call a closure that is stored in an Arc<Mutex<_>>?

I’m trying to transpile a dynamic language into Rust and closures are the most difficult part to implement.
I've tried using a Arc<Mutex<dyn FnMut>>, but it doesn't support recursion.
use std::sync::{Arc, Mutex};
type Data = Arc<DataUnpack>;
enum DataUnpack {
Number(f64),
Function(Box<Mutex<FnMut(Vec<Data>) -> Data>>),
}
fn call(f: Data, args: Vec<Data>) -> Data {
if let DataUnpack::Function(v) = &*f {
let f = &mut *v.lock().unwrap();
f(args)
} else {
panic!("TYPE ERR")
}
}
fn lambda(f: Box<FnMut(Vec<Data>) -> Data>) -> Data {
Arc::new(DataUnpack::Function(Box::new(Mutex::new(Box::leak(f)))))
}
fn main() {
let f: Arc<Mutex<Data>> = Arc::new(Mutex::new(Arc::new(DataUnpack::Number(0.0))));
*f.lock().unwrap() = {
let f = f.clone();
lambda(Box::new(move |xs| {
println!("Ha");
call(f.lock().unwrap().clone(), xs.clone())
}))
};
call(f.lock().unwrap().clone(), vec![]);
}
playground
It shows one Ha and then stops. Where am I wrong?

How to replace combinators with Future?

I have a function which returns Future. It accepts another function which accepts one argument and returns Future. Second function can be implemented as combinators chain passed into first function. It looks like this:
use bb8::{Pool, RunError};
use bb8_postgres::PostgresConnectionManager;
use tokio_postgres::{error::Error, Client, NoTls};
#[derive(Clone)]
pub struct DataManager(Pool<PostgresConnectionManager<NoTls>>);
impl DataManager {
pub fn new(pool: Pool<PostgresConnectionManager<NoTls>>) -> Self {
Self(pool)
}
pub fn create_user(
&self,
reg_req: UserRequest,
) -> impl Future<Item = User, Error = RunError<Error>> {
let sql = "long and awesome sql";
let query = move |mut conn: Client| { // function which accepts one argument and returns Future
conn.prepare(sql).then(move |r| match r {
Ok(select) => {
let f = conn
.query(&select, &[&reg_req.email, &reg_req.password])
.collect()
.map(|mut rows| {
let row = rows.remove(0);
row.into()
})
.then(move |r| match r {
Ok(v) => Ok((v, conn)),
Err(e) => Err((e, conn)),
});
Either::A(f)
}
Err(e) => Either::B(future::err((e, conn))),
})
};
self.0.run(query) // function which returns Future and accepts another function
}
}
But I want to write code of create_user as a struct implementing Future.
struct UserCreator(Pool<PostgresConnectionManager<NoTls>>, UserRequest);
impl UserCreator {
fn new(pool: Pool<PostgresConnectionManager<NoTls>>, reg_req: UserRequest) -> Self {
Self(pool, reg_req)
}
}
How to implement Future for this struct that works as first function? Please help me with an example.
Now I tried to make it like this, but nothing is computed and execution always blocks.
impl Future for UserCreator {
type Item = User;
type Error = RunError<Error>;
fn poll(&mut self) -> Poll<Self::Item, Self::Error> {
// Code which which works like `DataManager.create_user`
let sql = "long and awesome sql";
let reg_req = &self.1;
let query = move |mut conn: Client| {
conn.prepare(sql).then(move |r| match r {
Ok(select) => {
let f = conn
.query(&select, &[&reg_req.email, &reg_req.password])
.collect()
.map(|mut rows| {
let row = rows.remove(0);
row.into()
})
.then(move |r| match r {
Ok(v) => Ok((v, conn)),
Err(e) => Err((e, conn)),
});
Either::A(f)
}
Err(e) => Either::B(future::err((e, conn))),
})
};
self.0.run(query).poll()
}
}

Recursive search of node in tree

I want to build a tree using exactly two structs: Node and Tree and then recursively search for a target node from the tree. If a target is found, return true, else return false.
The challenge for me here is how to recursively call the find function, since it is only defined on Tree not Node.
pub struct Node<T> {
value: T,
left: Option<Box<Node<T>>>,
right: Option<Box<Node<T>>>,
}
pub struct Tree<T> {
root: Option<Box<Node<T>>>,
}
impl<T: Ord> Tree<T> {
/// Creates an empty tree
pub fn new() -> Self {
Tree { root: None }
}
// search the tree
pub fn find(&self, key: &T) -> bool {
let root_node = &self.root; // root is Option
match *root_node {
Some(ref node) => {
if node.value == *key {
return true;
}
let target_node = if *key < node.value {
&node.left
} else {
&node.right
};
match *target_node {
Some(sub_node) => sub_node.find(key),
None => {
return false;
}
}
}
None => return false,
}
}
}
fn main() {
let mut mytree: Tree<i32> = Tree::new();
let node1 = Node {
value: 100,
left: None,
right: None,
};
let boxed_node1 = Some(Box::new(node1));
let root = Node {
value: 200,
left: boxed_node1,
right: None,
};
let boxed_root = Some(Box::new(root));
let mytree = Tree { root: boxed_root };
let res = mytree.find(&100);
}
The current code reports the error:
error: no method named `find` found for type `Box<Node<T>>` in the current scope
--> src/main.rs:36:48
|
36 | Some(sub_node) => sub_node.find(key),
| ^^^^
|
= note: the method `find` exists but the following trait bounds were not satisfied: `Node<T> : std::iter::Iterator`
= help: items from traits can only be used if the trait is implemented and in scope; the following traits define an item `find`, perhaps you need to implement one of them:
= help: candidate #1: `std::iter::Iterator`
= help: candidate #2: `core::str::StrExt`
I understand that find is only implemented on Tree, so there is an error, but I don't think it is efficient to implement find on both Tree and Node. Any hint to solve this?
You need to move the majority of the implementation to the Node type, then leave only a small shim in Tree:
impl<T: Ord> Tree<T> {
pub fn find(&self, key: &T) -> bool {
self.root.as_ref().map(|n| n.find(key)).unwrap_or(false)
}
}
impl<T: Ord> Node<T> {
// search the tree
pub fn find(&self, key: &T) -> bool {
if self.value == *key {
return true;
}
let target_node = if *key < self.value {
&self.left
} else {
&self.right
};
target_node.as_ref().map(|n| n.find(key)).unwrap_or(false)
}
}
However, I might avoid multiple comparisons by just matching on the result:
pub fn find(&self, key: &T) -> bool {
use ::std::cmp::Ordering::*;
match self.value.cmp(key) {
Equal => true,
Less => self.left.as_ref().map(|n| n.find(key)).unwrap_or(false),
Greater => self.right.as_ref().map(|n| n.find(key)).unwrap_or(false),
}
}
Or
pub fn find(&self, key: &T) -> bool {
use ::std::cmp::Ordering::*;
let child = match self.value.cmp(key) {
Equal => return true,
Less => self.left.as_ref(),
Greater => self.right.as_ref(),
};
child.map(|n| n.find(key)).unwrap_or(false)
}
I found it is hard to understand target_node.as_ref().map(|n| n.find(key)).unwrap_or(false). I just started to learn the iterator. Is that possible to explain the long expression step by step?
Just follow the type signatures of each function:
self is a &Node<T>
&self.left / &self.right / target_node are a &Option<Box<Node<T>>>
Option::as_ref converts an &Option<T> to Option<&T>. Now we have Option<&Box<Node<T>>>.
Option::map applies a function (which may change the contained type) to the option if it is Some, otherwise it leaves it None.
The function we apply is Node::find, which takes a &Node<T> and returns a bool.
Box<T> implements Deref so any methods on T appear on Box<T>.
Automatic dereferencing allows us to treat &Box<T> as Box<T>.
Now we have Option<bool>
Option::unwrap_or returns the contained value if there is one, otherwise the fallback value provided. The final type is bool.
There is no usage of the Iterator trait. Both Iterator and Option have a map method. If you are interested in the fact that they have the same name and do similar things, that [is what people refer to as a monad. Understanding monads is interesting but not required to actually use them.
Implement the find method on Node and create a stub find method for Tree which could look like this:
impl<T: Ord> Tree<T> {
pub fn find(&self, key: &T) -> bool {
match self.root.as_ref() {
None => false,
Some(x) => x.find(key)
}
}
}

How do I process a range in slices in Rust?

I understand that the preferred way to iterate in Rust is through the for var in (range) syntax, but sometimes I'd like to work on more than one of the elements in that range at a time.
From a Ruby perspective, I'm trying to find a way of doing (1..100).each_slice(5) do |this_slice| in Rust.
I'm trying things like
for mut segment_start in (segment_size..max_val).step_by(segment_size) {
let this_segment = segment_start..(segment_start + segment_size).iter().take(segment_size);
}
but I keep getting errors that suggest I'm barking up the wrong type tree. The docs aren't helpful either--they just don't contain this use case.
What's the Rust way to do this?
Use chunks (or chunks_mut if you need mutability):
fn main() {
let things = [5, 4, 3, 2, 1];
for slice in things.chunks(2) {
println!("{:?}", slice);
}
}
Outputs:
[5, 4]
[3, 2]
[1]
The easiest way to combine this with a Range would be to collect the range to a Vec first (which dereferences to a slice):
fn main() {
let things: Vec<_> = (1..100).collect();
for slice in things.chunks(5) {
println!("{:?}", slice);
}
}
Another solution that is pure-iterator would be to use Itertools::chunks_lazy:
extern crate itertools;
use itertools::Itertools;
fn main() {
for chunk in &(1..100).chunks_lazy(5) {
for val in chunk {
print!("{}, ", val);
}
println!("");
}
}
Which suggests a similar solution that only requires the standard library:
fn main() {
let mut range = (1..100).peekable();
while range.peek().is_some() {
for value in range.by_ref().take(5) {
print!("{}, ", value);
}
println!("");
}
}
One trick is that Ruby and Rust have different handling here, mostly centered around efficiency.
In Ruby Enumerable can create new arrays to stuff values in without worrying about ownership and return a new array each time (check with this_slice.object_id).
In Rust, allocating a new vector each time would be pretty unusual. Additionally, you can't easily return a reference to a vector that the iterator holds due to complicated lifetime concerns.
A solution that's very similar to Ruby's is:
fn main() {
let mut range = (1..100).peekable();
while range.peek().is_some() {
let chunk: Vec<_> = range.by_ref().take(5).collect();
println!("{:?}", chunk);
}
}
Which could be wrapped up in a new iterator that hides the details:
use std::iter::Peekable;
struct InefficientChunks<I>
where I: Iterator
{
iter: Peekable<I>,
size: usize,
}
impl<I> Iterator for InefficientChunks<I>
where I: Iterator
{
type Item = Vec<I::Item>;
fn next(&mut self) -> Option<Self::Item> {
if self.iter.peek().is_some() {
Some(self.iter.by_ref().take(self.size).collect())
} else {
None
}
}
}
trait Awesome: Iterator + Sized {
fn inefficient_chunks(self, size: usize) -> InefficientChunks<Self> {
InefficientChunks {
iter: self.peekable(),
size: size,
}
}
}
impl<I> Awesome for I where I: Iterator {}
fn main() {
for chunk in (1..100).inefficient_chunks(5) {
println!("{:?}", chunk);
}
}
Collecting into a vec can easily kill your performance. An approach similar to in the question is perfectly fine.
fn chunk_range(range: Range<usize>, chunk_size: usize) -> impl Iterator<Item=Range<usize>> {
range.clone().step_by(chunk_size).map(move |block_start| {
let block_end = (block_start + chunk_size).min(range.end);
block_start..block_end
})
}

Resources