I continue to struggle with the concept of recursion. I have a function that takes a u64 and returns a Vec<u64> of factors of that integer. I would like to recursively call this function on each item in the Vec, returning a flattened Vec until the function returns Vec<self> for each item, i.e., each item is prime.
fn prime_factors(x: u64) -> Vec<u64> {
let factors = factoring_method(x);
factors.iter().flat_map(|&i| factoring_method(i)).collect()
}
(The complete code)
This returns only the Vec of factors of the final iteration and also has no conditional that allows it to keep going until items are all prime.
The factoring_method is a congruence of squares that I'm pretty happy with. I'm certain there's lots of room for optimization, but I'm hoping to get a working version complete before refactoring. I think the recursion should in the congruence_of_squares — calling itself upon each member of the Vec it returns, but I'm not sure how to frame the conditional to keep it from doing so infinitely.
Useful recursion requires two things:
That a function call itself, either directly or indirectly.
That there be some terminating case.
One definition of the prime factorization of a number is:
if the number is prime, that is the only prime factor
otherwise, combine the prime factors of a pair of factors of the number
From that, we can identify a termination condition ("if it's prime") and the recursive call ("prime factors of factors").
Note that we haven't written any code yet — everything to this point is conceptual.
We can then transcribe the idea to Rust:
fn prime_factors(x: u64) -> Vec<u64> {
if is_prime(x) {
vec![x]
} else {
factors(x).into_iter().flat_map(prime_factors).collect()
}
}
Interesting pieces here:
We use into_iter to avoid the need to dereference the iterated value.
We can pass the function name directly as the closure because the types align.
Some (inefficient) helper functions round out the implementation:
fn is_prime(x: u64) -> bool {
!(2..x).any(|i| x % i == 0)
}
fn factors(x: u64) -> Vec<u64> {
match (2..x).filter(|i| x % i == 0).next() {
Some(v) => vec![v, x / v],
None => vec![],
}
}
Related
I'm looking for some way to shorten an iterator by some condition. A bit like an inverse filter but it stops iterating at the first true value. Let's call it until(f). Where:
iterator.until(f)
Would return an iterator that only runs until f is true once.
Let's use an example of finding the next prime number.
We have some structure containing known primes and a function to extend it.
// Structure for caching known prime numbers
struct PrimeGenerator {
primes:Vec<i64>
}
impl PrimeGenerator {
// Create a new prime generator
fn new()->Self{
let primes = vec![2,3];
Self {
primes,
}
}
// Extend the list of known primes by 1
fn extend_by_one(&mut self){
let mut next_option = self.primes.last().unwrap()+2;
while self.primes.iter().any(|x| next_option%x == 0) { // This is the relevant line
next_option += 2;
}
self.primes.push(next_option);
}
}
Now this snippet is a bit too exhaustive as we should only have to check until the square root of next_option, so I was looking for a some method that would shorten the iterator based on some condition, so I could write something like:
self.iter().until(|x| x*x > next_option).any(|x| next_option%x == 0)
Is there any similar pattern available?
Looks like your until is similar to inverted take_while.
self.iter().take_while(|x| x*x <= next_option).all(|x| next_option%x != 0)
I have the following:
enum SomeType {
VariantA(String),
VariantB(String, i32),
}
fn transform(x: SomeType) -> SomeType {
// very complicated transformation, reusing parts of x in order to produce result:
match x {
SomeType::VariantA(s) => SomeType::VariantB(s, 0),
SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
}
}
fn main() {
let mut data = vec![
SomeType::VariantA("hello".to_string()),
SomeType::VariantA("bye".to_string()),
SomeType::VariantB("asdf".to_string(), 34),
];
}
I would now like to call transform on each element of data and store the resulting value back in data. I could do something like data.into_iter().map(transform).collect(), but this will allocate a new Vec. Is there a way to do this in-place, reusing the allocated memory of data? There once was Vec::map_in_place in Rust but it has been removed some time ago.
As a work-around, I've added a Dummy variant to SomeType and then do the following:
for x in &mut data {
let original = ::std::mem::replace(x, SomeType::Dummy);
*x = transform(original);
}
This does not feel right, and I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop. Is there a better way of doing this?
Your first problem is not map, it's transform.
transform takes ownership of its argument, while Vec has ownership of its arguments. Either one has to give, and poking a hole in the Vec would be a bad idea: what if transform panics?
The best fix, thus, is to change the signature of transform to:
fn transform(x: &mut SomeType) { ... }
then you can just do:
for x in &mut data { transform(x) }
Other solutions will be clunky, as they will need to deal with the fact that transform might panic.
No, it is not possible in general because the size of each element might change as the mapping is performed (fn transform(u8) -> u32).
Even when the sizes are the same, it's non-trivial.
In this case, you don't need to create a Dummy variant because creating an empty String is cheap; only 3 pointer-sized values and no heap allocation:
impl SomeType {
fn transform(&mut self) {
use SomeType::*;
let old = std::mem::replace(self, VariantA(String::new()));
// Note this line for the detailed explanation
*self = match old {
VariantA(s) => VariantB(s, 0),
VariantB(s, i) => VariantB(s, 2 * i),
};
}
}
for x in &mut data {
x.transform();
}
An alternate implementation that just replaces the String:
impl SomeType {
fn transform(&mut self) {
use SomeType::*;
*self = match self {
VariantA(s) => {
let s = std::mem::replace(s, String::new());
VariantB(s, 0)
}
VariantB(s, i) => {
let s = std::mem::replace(s, String::new());
VariantB(s, 2 * *i)
}
};
}
}
In general, yes, you have to create some dummy value to do this generically and with safe code. Many times, you can wrap your whole element in Option and call Option::take to achieve the same effect .
See also:
Change enum variant while moving the field to the new variant
Why is it so complicated?
See this proposed and now-closed RFC for lots of related discussion. My understanding of that RFC (and the complexities behind it) is that there's an time period where your value would have an undefined value, which is not safe. If a panic were to happen at that exact second, then when your value is dropped, you might trigger undefined behavior, a bad thing.
If your code were to panic at the commented line, then the value of self is a concrete, known value. If it were some unknown value, dropping that string would try to drop that unknown value, and we are back in C. This is the purpose of the Dummy value - to always have a known-good value stored.
You even hinted at this (emphasis mine):
I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop
That "should" is the problem. During a panic, that dummy value is visible.
See also:
How can I swap in a new value for a field in a mutable reference to a structure?
Temporarily move out of borrowed content
How do I move out of a struct field that is an Option?
The now-removed implementation of Vec::map_in_place spans almost 175 lines of code, most of having to deal with unsafe code and reasoning why it is actually safe! Some crates have re-implemented this concept and attempted to make it safe; you can see an example in Sebastian Redl's answer.
You can write a map_in_place in terms of the take_mut or replace_with crates:
fn map_in_place<T, F>(v: &mut [T], f: F)
where
F: Fn(T) -> T,
{
for e in v {
take_mut::take(e, f);
}
}
However, if this panics in the supplied function, the program aborts completely; you cannot recover from the panic.
Alternatively, you could supply a placeholder element that sits in the empty spot while the inner function executes:
use std::mem;
fn map_in_place_with_placeholder<T, F>(v: &mut [T], f: F, mut placeholder: T)
where
F: Fn(T) -> T,
{
for e in v {
let mut tmp = mem::replace(e, placeholder);
tmp = f(tmp);
placeholder = mem::replace(e, tmp);
}
}
If this panics, the placeholder you supplied will sit in the panicked slot.
Finally, you could produce the placeholder on-demand; basically replace take_mut::take with take_mut::take_or_recover in the first version.
I encountered this competitive programming problem:
nums is a vector of integers (length n)
ops is a vector of strings containing + and - (length n-1)
It can be solved with the reduce operation in Kotlin like this:
val op_iter = ops.iterator();
nums.reduce {a, b ->
when (op_iter.next()) {
"+" -> a+b
"-" -> a-b
else -> throw Exception()
}
}
reduce is described as:
Accumulates value starting with the first element and applying operation from left to right to current accumulator value and each element.
It looks like Rust vectors do not have a reduce method. How would you achieve this task?
Edited: since Rust version 1.51.0, this function is called reduce
Be aware of similar function which is called fold. The difference is that reduce will produce None if iterator is empty while fold accepts accumulator and will produce accumulator's value if iterator is empty.
Outdated answer is left to capture the history of this function debating how to name it:
There is no reduce in Rust 1.48. In many cases you can simulate it with fold but be aware that the semantics of the these functions are different. If the iterator is empty, fold will return the initial value whereas reduce returns None. If you want to perform multiplication operation on all elements, for example, getting result 1 for empty set is not too logical.
Rust does have a fold_first function which is equivalent to Kotlin's reduce, but it is not stable yet. The main discussion is about naming it. It is a safe bet to use it if you are ok with nightly Rust because there is little chance the function will be removed. In the worst case, the name will be changed. If you need stable Rust, then use fold if you are Ok with an illogical result for empty sets. If not, then you'll have to implement it, or find a crate such as reduce.
Kotlin's reduce takes the first item of the iterator for the starting point while Rust's fold and try_fold let you specify a custom starting point.
Here is an equivalent of the Kotlin code:
let mut it = nums.iter().cloned();
let start = it.next().unwrap();
it.zip(ops.iter()).try_fold(start, |a, (b, op)| match op {
'+' => Ok(a + b),
'-' => Ok(a - b),
_ => Err(()),
})
Playground
Or since we're starting from a vector, which can be indexed:
nums[1..]
.iter()
.zip(ops.iter())
.try_fold(nums[0], |a, (b, op)| match op {
'+' => Ok(a + b),
'-' => Ok(a - b),
_ => Err(()),
});
Playground
I came across this while doing the 2018 Advent of Code (Day 2, Part 1) solution in Rust.
The problem to solve:
Take the count of strings that have exactly two of the same letter, multiplied by the count of strings that have exactly three of the same letter.
INPUT
abcdega
hihklmh
abqasbb
aaaabcd
The first string abcdega has a repeated twice.
The second string hihklmh has h repeated three times.
The third string abqasbb has a repeated twice, and b repeated three times, so it counts for both.
The fourth string aaaabcd contains a letter repeated 4 times (not 2, or 3) so it does not count.
So the result should be:
2 strings that contained a double letter (first and third) multiplied by 2 strings that contained a triple letter (second and third) = 4
The Question:
const PUZZLE_INPUT: &str =
"
abcdega
hihklmh
abqasbb
aaaabcd
";
fn letter_counts(id: &str) -> [u8;26] {
id.chars().map(|c| c as u8).fold([0;26], |mut counts, c| {
counts[usize::from(c - b'a')] += 1;
counts
})
}
fn has_repeated_letter(n: u8, letter_counts: &[u8;26]) -> bool {
letter_counts.iter().any(|&count| count == n)
}
fn main() {
let ids_iter = PUZZLE_INPUT.lines().map(letter_counts);
let num_ids_with_double = ids_iter.clone().filter(|id| has_repeated_letter(2, id)).count();
let num_ids_with_triple = ids_iter.filter(|id| has_repeated_letter(3, id)).count();
println!("{}", num_ids_with_double * num_ids_with_triple);
}
Rust Playground
Consider line 21. The function letter_counts takes only one argument, so I can use the syntax: .map(letter_counts) on elements that match the type of the expected argument. This is really nice to me! I love that I don't have to create a closure: .map(|id| letter_counts(id)). I find both to be readable, but the former version without the closure is much cleaner to me.
Now consider lines 22 and 23. Here, I have to use the syntax: .filter(|id| has_repeated_letter(3, id)) because the has_repeated_letter function takes two arguments. I would really like to do .filter(has_repeated_letter(3)) instead.
Sure, I could make the function take a tuple instead, map to a tuple and consume only a single argument... but that seems like a terrible solution. I'd rather just create the closure.
Leaving out the only argument is something that Rust lets you do. Why would it be any harder for the compiler to let you leave out the last argument, provided that it has all of the other n-1 arguments for a function that takes n arguments.
I feel like this would make the syntax a lot cleaner, and it would fit in a lot better with the idiomatic functional style that Rust prefers.
I am certainly no expert in compilers, but implementing this behavior seems like it would be straightforward. If my thinking is incorrect, I would love to know more about why that is so.
No, you cannot pass a function with multiple arguments as an implicit closure.
In certain cases, you can choose to use currying to reduce the arity of a function. For example, here we reduce the add function from 2 arguments to one:
fn add(a: i32, b: i32) -> i32 {
a + b
}
fn curry<A1, A2, R>(f: impl FnOnce(A1, A2) -> R, a1: A1) -> impl FnOnce(A2) -> R {
move |a2| f(a1, a2)
}
fn main() {
let a = Some(1);
a.map(curry(add, 2));
}
However, I agree with the comments that this isn't a benefit:
It's not any less typing:
a.map(curry(add, 2));
a.map(|v| add(v, 2));
The curry function is extremely limited: it chooses to use FnOnce, but Fn and FnMut also have use cases. It only applies to a function with two arguments.
However, I have used this higher-order function trick in other projects, where the amount of code that is added is much greater.
I'm trying to do some operations that would cause a StackOverflow in Kotlin just now.
Knowing that, I remembered that Kotlin has support for tailrec functions, so I tried to do:
private tailrec fun Turn.debuffPhase(): List<Turn> {
val turns = listOf(this)
if (facts.debuff == 0 || knight.damage == 0) {
return turns
}
// Recursively find all possible thresholds of debuffing
return turns + debuff(debuffsForNextThreshold()).debuffPhase()
}
Upon my surprise that IDEA didn't recognize it as a tailrec, I tried to unmake it a extension function and make it a normal function:
private tailrec fun debuffPhase(turn: Turn): List<Turn> {
val turns = listOf(turn)
if (turn.facts.debuff == 0 || turn.knight.damage == 0) {
return turns
}
// Recursively find all possible thresholds of debuffing
val newTurn = turn.debuff(turn.debuffsForNextThreshold())
return turns + debuffPhase(newTurn)
}
Even so it isn't accepted. The important isn't that the last function call is to the same function? I know that the + is a sign to the List plus function, but should it make a difference? All the examples I see on the internet for tail call for another languages allow those kind of actions.
I tried to do that with Int too, that seemed to be something more commonly used than addition to lists, but had the same result:
private tailrec fun discoverBuffsNeeded(dragon: RPGChar): Int {
val buffedDragon = dragon.buff(buff)
if (dragon.turnsToKill(initKnight) < 1 + buffedDragon.turnsToKill(initKnight)) {
return 0
}
return 1 + discoverBuffsNeeded(buffedDragon)
}
Shouldn't all those implementations allow for tail call? I thought of some other ways to solve that(Like passing the list as a MutableList on the parameters too), but when possible I try to avoid sending collections to be changed inside the function and this seems a case that this should be possible.
PS: About the question program, I'm implementing a solution to this problem.
None of your examples are tail-recursive.
A tail call is the last call in a subroutine. A recursive call is a call of a subroutine to itself. A tail-recursive call is a tail call of a subroutine to itself.
In all of your examples, the tail call is to +, not to the subroutine. So, all of those are recursive (because they call themselves), and all of those have tail calls (because every subroutine always has a "last call"), but none of them is tail-recursive (because the recursive call isn't the last call).
Infix notation can sometimes obscure what the tail call is, it is easier to see when you write every operation in prefix form or as a method call:
return plus(turns, debuff(debuffsForNextThreshold()).debuffPhase())
// or
return turns.plus(debuff(debuffsForNextThreshold()).debuffPhase())
Now it becomes much easier to see that the call to debuffPhase is not in tail position, but rather it is the call to plus (i.e. +) which is in tail position. If Kotlin had general tail calls, then that call to plus would indeed be eliminated, but AFAIK, Kotlin only has tail-recursion (like Scala), so it won't.
Without giving away an answer to your puzzle, here's a non-tail-recursive function.
fun fac(n: Int): Int =
if (n <= 1) 1 else n * fac(n - 1)
It is not tail recursive because the recursive call is not in a tail position, as noted by Jörg's answer.
It can be transformed into a tail-recursive function using CPS,
tailrec fun fac2(n: Int, k: Int = 1): Int =
if (n <= 1) k else fac2(n - 1, n * k)
although a better interface would likely hide the continuation in a private helper function.
fun fac3(n: Int): Int {
tailrec fun fac_continue(n: Int, k: Int): Int =
if (n <= 1) k else fac_continue(n - 1, n * k)
return fac_continue(n, 1)
}