Is it possible to have a closure call itself in Rust? [duplicate] - recursion

This is a very simple example, but how would I do something similar to:
let fact = |x: u32| {
match x {
0 => 1,
_ => x * fact(x - 1),
}
};
I know that this specific example can be easily done with iteration, but I'm wondering if it's possible to make a recursive function in Rust for more complicated things (such as traversing trees) or if I'm required to use my own stack instead.

There are a few ways to do this.
You can put closures into a struct and pass this struct to the closure. You can even define structs inline in a function:
fn main() {
struct Fact<'s> { f: &'s dyn Fn(&Fact, u32) -> u32 }
let fact = Fact {
f: &|fact, x| if x == 0 {1} else {x * (fact.f)(fact, x - 1)}
};
println!("{}", (fact.f)(&fact, 5));
}
This gets around the problem of having an infinite type (a function that takes itself as an argument) and the problem that fact isn't yet defined inside the closure itself when one writes let fact = |x| {...} and so one can't refer to it there.
Another option is to just write a recursive function as a fn item, which can also be defined inline in a function:
fn main() {
fn fact(x: u32) -> u32 { if x == 0 {1} else {x * fact(x - 1)} }
println!("{}", fact(5));
}
This works fine if you don't need to capture anything from the environment.
One more option is to use the fn item solution but explicitly pass the args/environment you want.
fn main() {
struct FactEnv { base_case: u32 }
fn fact(env: &FactEnv, x: u32) -> u32 {
if x == 0 {env.base_case} else {x * fact(env, x - 1)}
}
let env = FactEnv { base_case: 1 };
println!("{}", fact(&env, 5));
}
All of these work with Rust 1.17 and have probably worked since version 0.6. The fn's defined inside fns are no different to those defined at the top level, except they are only accessible within the fn they are defined inside.

As of Rust 1.62 (July 2022), there's still no direct way to recurse in a closure. As the other answers have pointed out, you need at least a bit of indirection, like passing the closure to itself as an argument, or moving it into a cell after creating it. These things can work, but in my opinion they're kind of gross, and they're definitely hard for Rust beginners to follow. If you want to use recursion but you have to have a closure, for example because you need something that implements FnOnce() to use with thread::spawn, then I think the cleanest approach is to use a regular fn function for the recursive part and to wrap it in a non-recursive closure that captures the environment. Here's an example:
let x = 5;
let fact = || {
fn helper(arg: u64) -> u64 {
match arg {
0 => 1,
_ => arg * helper(arg - 1),
}
}
helper(x)
};
assert_eq!(120, fact());

Here's a really ugly and verbose solution I came up with:
use std::{
cell::RefCell,
rc::{Rc, Weak},
};
fn main() {
let weak_holder: Rc<RefCell<Weak<dyn Fn(u32) -> u32>>> =
Rc::new(RefCell::new(Weak::<fn(u32) -> u32>::new()));
let weak_holder2 = weak_holder.clone();
let fact: Rc<dyn Fn(u32) -> u32> = Rc::new(move |x| {
let fact = weak_holder2.borrow().upgrade().unwrap();
if x == 0 {
1
} else {
x * fact(x - 1)
}
});
weak_holder.replace(Rc::downgrade(&fact));
println!("{}", fact(5)); // prints "120"
println!("{}", fact(6)); // prints "720"
}
The advantages of this are that you call the function with the expected signature (no extra arguments needed), it's a closure that can capture variables (by move), it doesn't require defining any new structs, and the closure can be returned from the function or otherwise stored in a place that outlives the scope where it was created (as an Rc<Fn...>) and it still works.

Closure is just a struct with additional contexts. Therefore, you can do this to achieve recursion (suppose you want to do factorial with recursive mutable sum):
#[derive(Default)]
struct Fact {
ans: i32,
}
impl Fact {
fn call(&mut self, n: i32) -> i32 {
if n == 0 {
self.ans = 1;
return 1;
}
self.call(n - 1);
self.ans *= n;
self.ans
}
}
To use this struct, just:
let mut fact = Fact::default();
let ans = fact.call(5);

Related

Rust: Joining and iterating over futures' results

I have some code that iterates over objects and uses an async method on each of them sequentially before doing something with the results. I'd like to change it so that the async method calls are joined into a single future before being executed. The important bit below is in HolderStruct::add_squares. My current code looks like this:
use anyhow::Result;
struct AsyncMethodStruct {
value: u64
}
impl AsyncMethodStruct {
fn new(value: u64) -> Self {
AsyncMethodStruct {
value
}
}
async fn get_square(&self) -> Result<u64> {
Ok(self.value * self.value)
}
}
struct HolderStruct {
async_structs: Vec<AsyncMethodStruct>
}
impl HolderStruct {
fn new(async_structs: Vec<AsyncMethodStruct>) -> Self {
HolderStruct {
async_structs
}
}
async fn add_squares(&self) -> Result<u64> {
let mut squares = Vec::with_capacity(self.async_structs.len());
for async_struct in self.async_structs.iter() {
squares.push(async_struct.get_square().await?);
}
let mut sum = 0;
for square in squares.iter() {
sum += square;
}
return Ok(sum);
}
}
I'd like to change HolderStruct::add_squares to something like this:
use futures::future::join_all;
// [...]
impl HolderStruct {
async fn add_squares(&self) -> Result<u64> {
let mut square_futures = Vec::with_capacity(self.async_structs.len());
for async_struct in self.async_structs.iter() {
square_futures.push(async_struct.get_square());
}
let square_results = join_all(square_futures).await;
let mut sum = 0;
for square_result in square_results.iter() {
sum += square_result?;
}
return Ok(sum);
}
}
However, the compiler gives me this error using the above:
error[E0277]: the `?` operator can only be applied to values that implement `std::ops::Try`
--> src/main.rs:46:20
|
46 | sum += square_result?;
| ^^^^^^^^^^^^^^ the `?` operator cannot be applied to type `&std::result::Result<u64, anyhow::Error>`
|
= help: the trait `std::ops::Try` is not implemented for `&std::result::Result<u64, anyhow::Error>`
= note: required by `std::ops::Try::into_result`
How would I change the code to not have this error?
for square_result in square_results.iter()
Lose the iter() call here.
for square_result in square_results
You seem to be under impression that calling iter() is mandatory to iterate over a collection. Actually, anything that implements IntoIterator can be used in a for loop.
Calling iter() on a Vec<T> derefs to slice (&[T]) and yields an iterator over references to the vectors elements. The ? operator tries to take the value out of the Result, but that is only possible if you own the Result rather than just have a reference to it.
However, if you simply use a vector itself in a for statement, it will use the IntoIterator implementation for Vec<T> which will yield items of type T rather than &T.
square_results.into_iter() does the same thing, albeit more verbosely. It is mostly useful when using iterators in a functional style, a la vector.into_iter().map(|x| x + 1).collect().

Is it safe to use a raw pointer to access the &T of a RefCell<HashMap<T>>?

I have a cache-like structure which internally uses a HashMap:
impl Cache {
fn insert(&mut self, k: u32, v: String) {
self.map.insert(k, v);
}
fn borrow(&self, k: u32) -> Option<&String> {
self.map.get(&k)
}
}
Playground with external mutability
Now I need internal mutability. Since HashMap does not implement Copy, my guess is that RefCell is the path to follow. Writing the insert method is straight forward but I encountered problems with the borrow-function. I could return a Ref<String>, but since I'd like to cache the result, I wrote a small Ref-wrapper:
struct CacheRef<'a> {
borrow: Ref<'a, HashMap<u32, String>>,
value: &'a String,
}
This won't work since value references borrow, so the struct can't be constructed. I know that the reference is always valid: The map can't be mutated, because Ref locks the map. Is it safe to use a raw pointer instead of a reference?
struct CacheRef<'a> {
borrow: Ref<'a, HashMap<u32, String>>,
value: *const String,
}
Am I overlooking something here? Are there better (or faster) options? I'm trying to avoid RefCell due to the runtime overhead.
Playground with internal mutability
I'll complement #Shepmaster's safe but not quite as efficient answer with the unsafe version. For this, we'll pack some unsafe code in a utility function.
fn map_option<'a, T, F, U>(r: Ref<'a, T>, f: F) -> Option<Ref<'a, U>>
where
F: FnOnce(&'a T) -> Option<&'a U>
{
let stolen = r.deref() as *const T;
let ur = f(unsafe { &*stolen }).map(|sr| sr as *const U);
match ur {
Some(u) => Some(Ref::map(r, |_| unsafe { &*u })),
None => None
}
}
I'm pretty sure this code is correct. Although the compiler is rather unhappy with the lifetimes, they work out. We just have to inject some raw pointers to make the compiler shut up.
With this, the implementation of borrow becomes trivial:
fn borrow<'a>(&'a self, k: u32) -> Option<Ref<'a, String>> {
map_option(self.map.borrow(), |m| m.get(&k))
}
Updated playground link
The utility function only works for Option<&T>. Other containers (such as Result) would require their own modified copy, or else GATs or HKTs to implement generically.
I'm going to ignore your direct question in favor of a definitely safe alternative:
impl Cache {
fn insert(&self, k: u32, v: String) {
self.map.borrow_mut().insert(k, v);
}
fn borrow<'a>(&'a self, k: u32) -> Option<Ref<'a, String>> {
let borrow = self.map.borrow();
if borrow.contains_key(&k) {
Some(Ref::map(borrow, |hm| {
hm.get(&k).unwrap()
}))
} else {
None
}
}
}
Ref::map allows you to take a Ref<'a, T> and convert it into a Ref<'a, U>. The ugly part of this solution is that we have to lookup in the hashmap twice because I can't figure out how to make the ideal solution work:
Ref::map(borrow, |hm| {
hm.get(&k) // Returns an `Option`, not a `&...`
})
This might require Generic Associated Types (GATs) and even then the return type might be a Ref<Option<T>>.
As mentioned by Shepmaster, it is better to avoid unsafe when possible.
There are multiple possibilities:
Ref::map, with double look-up (as illustrated by Shepmaster's answer),
Ref::map with sentinel value,
Cloning the return value.
Personally, I'd consider the latter first. Store Rc<String> into your map and your method can easily return a Option<Rc<String>> which completely sidesteps the issues:
fn get(&self, k: u32) -> Option<Rc<String>> {
self.map.borrow().get(&k).cloned()
}
As a bonus, your cache is not "locked" any longer while you use the result.
Or, alternatively, you can work-around the fact that Ref::map does not like Option by using a sentinel value:
fn borrow<'a>(&'a self, k: u32) -> Ref<'a, str> {
let borrow = self.map.borrow();
Ref::map(borrow, |map| map.get(&k).map(|s| &s[..]).unwrap_or(""))
}

How to get the index of an element in a vector using pointer arithmetic?

In C you can use the pointer offset to get index of an element within an array, e.g.:
index = element_pointer - &vector[0];
Given a reference to an element in an array, this should be possible in Rust too.
While Rust has the ability to get the memory address from vector elements, convert them to usize, then subtract them - is there a more convenient/idiomatic way to do this in Rust?
There isn't a simpler way. I think the reason is that it would be hard to guarantee that any operation or method that gave you that answer only allowed you to use it with the a Vec (or more likely slice) and something inside that collection; Rust wouldn't want to allow you to call it with a reference into a different vector.
More idiomatic would be to avoid needing to do it in the first place. You can't store references into Vec anywhere very permanent away from the Vec anyway due to lifetimes, so you're likely to have the index handy when you've got the reference anyway.
In particular, when iterating, for example, you'd use the enumerate to iterate over pairs (index, &item).
So, given the issues which people have brought up with the pointers and stuff; the best way, imho, to do this is:
fn index_of_unchecked<T>(slice: &[T], item: &T) -> usize {
if ::std::mem::size_of::<T>() == 0 {
return 0; // do what you will with this case
}
(item as *const _ as usize - slice.as_ptr() as usize)
/ std::mem::size_of::<T>()
}
// note: for zero sized types here, you
// return Some(0) if item as *const T == slice.as_ptr()
// and None otherwise
fn index_of<T>(slice: &[T], item: &T) -> Option<usize> {
let ptr = item as *const T;
if
slice.as_ptr() < ptr &&
slice.as_ptr().offset(slice.len()) > ptr
{
Some(index_of_unchecked(slice, item))
} else {
None
}
}
although, if you want methods:
trait IndexOfExt<T> {
fn index_of_unchecked(&self, item: &T) -> usize;
fn index_of(&self, item: &T) -> Option<usize>;
}
impl<T> IndexOfExt<T> for [T] {
fn index_of_unchecked(&self, item: &T) -> usize {
// ...
}
fn index_of(&self, item: &T) -> Option<usize> {
// ...
}
}
and then you'll be able to use this method for any type that Derefs to [T]

Is it possible to make a recursive closure in Rust?

This is a very simple example, but how would I do something similar to:
let fact = |x: u32| {
match x {
0 => 1,
_ => x * fact(x - 1),
}
};
I know that this specific example can be easily done with iteration, but I'm wondering if it's possible to make a recursive function in Rust for more complicated things (such as traversing trees) or if I'm required to use my own stack instead.
There are a few ways to do this.
You can put closures into a struct and pass this struct to the closure. You can even define structs inline in a function:
fn main() {
struct Fact<'s> { f: &'s dyn Fn(&Fact, u32) -> u32 }
let fact = Fact {
f: &|fact, x| if x == 0 {1} else {x * (fact.f)(fact, x - 1)}
};
println!("{}", (fact.f)(&fact, 5));
}
This gets around the problem of having an infinite type (a function that takes itself as an argument) and the problem that fact isn't yet defined inside the closure itself when one writes let fact = |x| {...} and so one can't refer to it there.
Another option is to just write a recursive function as a fn item, which can also be defined inline in a function:
fn main() {
fn fact(x: u32) -> u32 { if x == 0 {1} else {x * fact(x - 1)} }
println!("{}", fact(5));
}
This works fine if you don't need to capture anything from the environment.
One more option is to use the fn item solution but explicitly pass the args/environment you want.
fn main() {
struct FactEnv { base_case: u32 }
fn fact(env: &FactEnv, x: u32) -> u32 {
if x == 0 {env.base_case} else {x * fact(env, x - 1)}
}
let env = FactEnv { base_case: 1 };
println!("{}", fact(&env, 5));
}
All of these work with Rust 1.17 and have probably worked since version 0.6. The fn's defined inside fns are no different to those defined at the top level, except they are only accessible within the fn they are defined inside.
As of Rust 1.62 (July 2022), there's still no direct way to recurse in a closure. As the other answers have pointed out, you need at least a bit of indirection, like passing the closure to itself as an argument, or moving it into a cell after creating it. These things can work, but in my opinion they're kind of gross, and they're definitely hard for Rust beginners to follow. If you want to use recursion but you have to have a closure, for example because you need something that implements FnOnce() to use with thread::spawn, then I think the cleanest approach is to use a regular fn function for the recursive part and to wrap it in a non-recursive closure that captures the environment. Here's an example:
let x = 5;
let fact = || {
fn helper(arg: u64) -> u64 {
match arg {
0 => 1,
_ => arg * helper(arg - 1),
}
}
helper(x)
};
assert_eq!(120, fact());
Here's a really ugly and verbose solution I came up with:
use std::{
cell::RefCell,
rc::{Rc, Weak},
};
fn main() {
let weak_holder: Rc<RefCell<Weak<dyn Fn(u32) -> u32>>> =
Rc::new(RefCell::new(Weak::<fn(u32) -> u32>::new()));
let weak_holder2 = weak_holder.clone();
let fact: Rc<dyn Fn(u32) -> u32> = Rc::new(move |x| {
let fact = weak_holder2.borrow().upgrade().unwrap();
if x == 0 {
1
} else {
x * fact(x - 1)
}
});
weak_holder.replace(Rc::downgrade(&fact));
println!("{}", fact(5)); // prints "120"
println!("{}", fact(6)); // prints "720"
}
The advantages of this are that you call the function with the expected signature (no extra arguments needed), it's a closure that can capture variables (by move), it doesn't require defining any new structs, and the closure can be returned from the function or otherwise stored in a place that outlives the scope where it was created (as an Rc<Fn...>) and it still works.
Closure is just a struct with additional contexts. Therefore, you can do this to achieve recursion (suppose you want to do factorial with recursive mutable sum):
#[derive(Default)]
struct Fact {
ans: i32,
}
impl Fact {
fn call(&mut self, n: i32) -> i32 {
if n == 0 {
self.ans = 1;
return 1;
}
self.call(n - 1);
self.ans *= n;
self.ans
}
}
To use this struct, just:
let mut fact = Fact::default();
let ans = fact.call(5);

Owned pointer in graph structure

With the generous help of the rust community I managed to get the base of a topological data structure assembled using managed pointers. This came together rather nicely and I was pretty excited about Rust in general. Then I read this post (which seems like a reasonable plan) and it inspired me to back track and try to re-assemble it using only owned pointers if possible.
This is the working version using managed pointers:
struct Dart<T> {
alpha: ~[#mut Dart<T>],
embed: ~[#mut T],
tagged: bool
}
impl<T> Dart<T> {
pub fn new(dim: uint) -> #mut Dart<T> {
let mut dart = #mut Dart{alpha: ~[], embed: ~[], tagged: false};
dart.alpha = vec::from_elem(dim, dart);
return dart;
}
pub fn get_dim(&self) -> uint {
return self.alpha.len();
}
pub fn traverse(#mut self, invs: &[uint], f: &fn(&Dart<T>)) {
let dim = self.get_dim();
for invs.each |i| {if *i >= dim {return}}; //test bounds on invs vec
if invs.len() == 2 {
let spread:int = int::abs(invs[1] as int - invs[0] as int);
if spread == 1 { //simple loop
let mut dart = self;
let mut i = invs[0];
while !dart.tagged {
dart.tagged = true;
f(dart);
dart = dart.alpha[i];
if i == invs[0] {i = invs[1];}
else {i == invs[0];}
} }
// else if spread == 2 { // max 4 cells traversed
// }
}
else {
let mut stack = ~[self];
self.tagged = true;
while !stack.is_empty() {
let mut dart = stack.pop();
f(dart);
for invs.each |i| {
if !dart.alpha[*i].tagged {
dart.alpha[*i].tagged = true;
stack.push(dart);
} } } } } }
After a few hours of chasing lifetime errors I have come to the conclusion that this may not even be possible with owned pointers due to the cyclic nature (without tying the knot as I was warned). My feeble attempt at this is below. My question, is this structure possible to implement without resorting to managed pointers? And if not, is the code above considered reasonably "rusty"? (idiomatic rust). Thanks.
struct GMap<'self,T> {
dim: uint,
darts: ~[~Dart<'self,T>]
}
struct Dart<'self,T> {
alpha: ~[&'self mut Dart<'self, T>],
embed: ~[&'self mut T],
tagged: bool
}
impl<'self, T> GMap<'self, T> {
pub fn new_dart(&'self mut self) {
let mut dart = ~Dart{alpha: ~[], embed: ~[], tagged: false};
let dartRef: &'self mut Dart<'self, T> = dart;
dartRef.alpha = vec::from_elem(self.dim, copy dartRef);
self.darts.push(dart);
}
}
I'm pretty sure that using &mut pointers is impossible, since one can only have one such pointer in existence at a time, e.g.:
fn main() {
let mut i = 0;
let a = &mut i;
let b = &mut i;
}
and-mut.rs:4:12: 4:18 error: cannot borrow `i` as mutable more than once at at a time
and-mut.rs:4 let b = &mut i;
^~~~~~
and-mut.rs:3:12: 3:18 note: second borrow of `i` as mutable occurs here
and-mut.rs:3 let a = &mut i;
^~~~~~
error: aborting due to previous error
One could get around the borrow checker unsafely, by either storing unsafe pointer to the memory (ptr::to_mut_unsafe_ptr), or indices into the darts member of GMap. Essentially, storing a single reference to the memory (in self.darts) and all operations have to go through it.
This might look like:
impl<'self, T> GMap<'self, T> {
pub fn new_dart(&'self mut self) {
let ind = self.darts.len();
self.darts.push(~Dart{alpha: vec::from_elem(self.dim, ind), embed: ~[], tagged: false});
}
}
traverse would need to change to either be a method on GMap (e.g. fn(&mut self, node_ind: uint, invs: &[uint], f: &fn(&Dart<T>))), or at least take a GMap type.
(On an entirely different note, there is library support for external iterators, which are far more composable than the internal iterators (the ones that take a closure). So defining one of these for traverse may (or may not) make using it nicer.)

Resources