Shorten iterator by condition in Rust

I'm looking for some way to shorten an iterator by some condition. A bit like an inverse filter but it stops iterating at the first true value. Let's call it until(f). Where:
iterator.until(f)
Would return an iterator that only runs until f is true once.
Let's use an example of finding the next prime number.
We have some structure containing known primes and a function to extend it.
// Structure for caching known prime numbers
struct PrimeGenerator {
    primes: Vec<i64>
}
impl PrimeGenerator {
    // Create a new prime generator
    fn new() -> Self {
        let primes = vec![2, 3];
        Self {
            primes,
        }
    }
    // Extend the list of known primes by 1
    fn extend_by_one(&mut self) {
        let mut next_option = self.primes.last().unwrap() + 2;
        while self.primes.iter().any(|x| next_option % x == 0) { // This is the relevant line
            next_option += 2;
        }
        self.primes.push(next_option);
    }
}
Now this snippet is a bit too exhaustive, as we should only have to check up to the square root of next_option, so I was looking for some method that would shorten the iterator based on a condition, so I could write something like:
self.primes.iter().until(|x| x*x > next_option).any(|x| next_option%x == 0)
Is there any similar pattern available?

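Looks like your until is similar to an inverted take_while. Note that take_while passes its closure a reference to each item, hence the &&x pattern, and that this expression is true when next_option is prime, so it is the negation of your any check:
self.primes.iter().take_while(|&&x| x*x <= next_option).all(|x| next_option%x != 0)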
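To make that concrete, here is a minimal sketch of the generator from the question with take_while dropped into extend_by_one. The variable name candidate is mine; everything else follows the question's structure. It keeps any with == 0 so the while condition does not need to be negated.

```rust
// Structure for caching known prime numbers (same shape as in the question).
struct PrimeGenerator {
    primes: Vec<i64>,
}

impl PrimeGenerator {
    fn new() -> Self {
        Self { primes: vec![2, 3] }
    }

    // Extend the list of known primes by one, testing only cached primes p
    // with p * p <= candidate.
    fn extend_by_one(&mut self) {
        let mut candidate = self.primes.last().unwrap() + 2;
        // `take_while` hands its closure a reference to the item, and the item
        // is itself an `&i64`, hence the `&&p` pattern.
        while self
            .primes
            .iter()
            .take_while(|&&p| p * p <= candidate)
            .any(|&p| candidate % p == 0)
        {
            candidate += 2;
        }
        self.primes.push(candidate);
    }
}
```

Because take_while is lazy, the scan stops at the first cached prime whose square exceeds the candidate, so the rest of the list is never visited.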

Related

How to split a Vec<u8> by a sequence of chars?

I want to extract the payload of a HTTP request as a Vec<u8>. In the request, the payload is separated from the rest by the sequence \r\n\r\n, that's why I want to split my Vec at this position, and take the second element.
My current solution is to use the following function I wrote.
fn find_payload_index(buffer: &Vec<u8>) -> usize {
    for (pos, e) in buffer.iter().enumerate() {
        if pos < 3 {
            continue
        }
        if buffer[pos - 3] == 13 && buffer[pos - 2] == 10 && buffer[pos - 1] == 13 && buffer[pos] == 10 {
            return pos + 1;
        }
    }
    0
}
13 is the ASCII value of \r and 10 the value of \n. I then split by the returned index. While this solution is technically working, it feels very unclean, and I was wondering how to do this in a more elegant way.
First off:
A function should almost never have a &Vec<_> parameter.
See Why is it discouraged to accept a reference to a String (&String), Vec (&Vec), or Box (&Box) as a function argument?.
Don't use the magic values 10 and 13, Rust supports byte literals: b'\r' and b'\n'.
As for your question: I believe you can make it a bit simpler using windows and matches! with a byte string literal pattern:
fn find_payload_index(buffer: &[u8]) -> Option<usize> {
    buffer
        .windows(4)
        .enumerate()
        .find(|(_, w)| matches!(*w, b"\r\n\r\n"))
        .map(|(i, _)| i)
}
Permalink to the playground with test cases.
Note that slice has a starts_with method which will more easily do what you want:
fn find_payload_index(buffer: &[u8]) -> usize {
    for i in 0..buffer.len() {
        if buffer[i..].starts_with(b"\r\n\r\n") {
            return i
        }
    }
    panic!("malformed buffer without the sequence")
}
I see no reason to use enumerate if the element itself is never used; simply looping over 0..buffer.len() seems the easiest solution to me.
I have also elected to make the function panic, rather than return 0, when the sequence is missing, which I believe is more proper. In the end you should probably return some kind of Result value and handle the error case cleanly if the input is malformed, but you should never return 0 in this case.
A shorter alternative to @mccarton's answer would be to use position:
fn find_payload_index(buffer: &[u8]) -> Option<usize> {
    buffer
        .windows(4)
        .position(|arr| arr == b"\r\n\r\n")
}
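Note that the windows-based versions return the index where the separator starts, while the original function returned the index just past it. A small sketch of going from that index to the payload itself could look like this. The function name payload and the SEPARATOR constant are my own, not from the question.

```rust
// Returns the bytes after the first "\r\n\r\n", or None if the separator is absent.
fn payload(buffer: &[u8]) -> Option<&[u8]> {
    const SEPARATOR: &[u8] = b"\r\n\r\n";
    buffer
        .windows(SEPARATOR.len())
        // Index of the first window equal to the separator, if any.
        .position(|window| window == SEPARATOR)
        // Skip past the separator to get the payload slice.
        .map(|start| &buffer[start + SEPARATOR.len()..])
}
```

Returning a borrowed slice avoids copying; call .to_vec() on the result if an owned Vec<u8> is really needed.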

How to change the index of a vector element?

In JavaScript I would do this:
function move(arr, old_index, new_index) {
    while (old_index < 0) {
        old_index += arr.length;
    }
    while (new_index < 0) {
        new_index += arr.length;
    }
    if (new_index >= arr.length) {
        var k = new_index - arr.length;
        while ((k--) + 1) {
            arr.push(undefined);
        }
    }
    arr.splice(new_index, 0, arr.splice(old_index, 1)[0]);
    return arr;
}
How can I accomplish the same thing in Rust?
I don't want to use insert and remove because my vector is a std::vec::Vec<std::string::String> and I want to literally move them to a different location in the vector, not remove them and then insert a copy.
I don't want to swap 2 elements. I want to change the index of an element to an arbitrary other index, like a person cutting to some arbitrary other position in a queue.
When you do insert + remove (or the double splice in JavaScript) you move all of the items between the larger of the two indices and the end of the array twice: first you move them back one slot for the remove, and then you move them forward one slot for the insert. But this is unnecessary. Instead you can simply take a slice of the Vec and rotate it:
fn move_me(arr: &mut [String], old_index: usize, new_index: usize) {
    if old_index < new_index {
        arr[old_index..=new_index].rotate_left(1);
    } else {
        arr[new_index..=old_index].rotate_right(1);
    }
}
Note that this change allows move_me to take &mut [String] instead of &mut Vec<String>, which makes this code more general as well as more efficient. It is better to accept &[T] instead of &Vec<T>, and in this case the same logic applies to &mut Vec<T> because move_me does not need to grow or shrink the vector.
Also, as in the other answer, I have left out the part that makes negative indices count from the back of the slice, and the part that grows the vector when the index is too large, because neither of those conventions is common in idiomatic Rust.
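As a quick illustration of the rotation, here is move_me again together with a small helper for building test data. The queue helper is hypothetical and only exists for this example.

```rust
fn move_me(arr: &mut [String], old_index: usize, new_index: usize) {
    if old_index < new_index {
        // Moving towards the back: everything in between shifts one slot forward.
        arr[old_index..=new_index].rotate_left(1);
    } else {
        // Moving towards the front: everything in between shifts one slot back.
        arr[new_index..=old_index].rotate_right(1);
    }
}

// Hypothetical helper for the example: builds an owned queue from string literals.
fn queue(names: &[&str]) -> Vec<String> {
    names.iter().map(|s| s.to_string()).collect()
}
```

Starting from a, b, c, d, e, calling move_me(&mut q, 1, 3) rotates the sub-slice b, c, d left by one and gives a, c, d, b, e. Only the elements between the two indices are touched, and no String is cloned or reallocated.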
Thanks to SCappella for telling me that JavaScript Array.splice() does the same thing as Rust Vec.insert() and Vec.remove(). So I just went ahead and ported the function as literally as I could.
Thanks to John Kugelman for letting me know I can delete everything but the last 2 lines.
/* move is a reserved keyword */
fn move_(arr: &mut Vec<String>, old_index: usize, new_index: usize) {
    let removed = arr.remove(old_index);
    arr.insert(new_index, removed);
}

Recursive function that returns a Vec

I continue to struggle with the concept of recursion. I have a function that takes a u64 and returns a Vec<u64> of factors of that integer. I would like to recursively call this function on each item in the Vec, returning a flattened Vec until the function returns Vec<self> for each item, i.e., each item is prime.
fn prime_factors(x: u64) -> Vec<u64> {
    let factors = factoring_method(x);
    factors.iter().flat_map(|&i| factoring_method(i)).collect()
}
(The complete code)
This returns only the Vec of factors of the final iteration and also has no conditional that allows it to keep going until items are all prime.
The factoring_method is a congruence of squares that I'm pretty happy with. I'm certain there's lots of room for optimization, but I'm hoping to get a working version complete before refactoring. I think the recursion should be in congruence_of_squares, calling itself upon each member of the Vec it returns, but I'm not sure how to frame the conditional to keep it from doing so infinitely.
Useful recursion requires two things:
That a function call itself, either directly or indirectly.
That there be some terminating case.
One definition of the prime factorization of a number is:
if the number is prime, that is the only prime factor
otherwise, combine the prime factors of a pair of factors of the number
From that, we can identify a termination condition ("if it's prime") and the recursive call ("prime factors of factors").
Note that we haven't written any code yet — everything to this point is conceptual.
We can then transcribe the idea to Rust:
fn prime_factors(x: u64) -> Vec<u64> {
    if is_prime(x) {
        vec![x]
    } else {
        factors(x).into_iter().flat_map(prime_factors).collect()
    }
}
Interesting pieces here:
We use into_iter to avoid the need to dereference the iterated value.
We can pass the function name directly as the closure because the types align.
Some (inefficient) helper functions round out the implementation:
fn is_prime(x: u64) -> bool {
    !(2..x).any(|i| x % i == 0)
}
fn factors(x: u64) -> Vec<u64> {
    match (2..x).find(|i| x % i == 0) {
        Some(v) => vec![v, x / v],
        None => vec![],
    }
}
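Putting the pieces together, a self-contained sketch might look like the following. The x >= 2 guard in is_prime is my addition so that 0 and 1 yield an empty factor list instead of being reported as prime; the rest follows the answer above.

```rust
// Trial division; deliberately simple rather than fast.
fn is_prime(x: u64) -> bool {
    x >= 2 && !(2..x).any(|i| x % i == 0)
}

// Splits a composite x into its smallest divisor and the matching cofactor.
fn factors(x: u64) -> Vec<u64> {
    match (2..x).find(|i| x % i == 0) {
        Some(v) => vec![v, x / v],
        None => vec![],
    }
}

fn prime_factors(x: u64) -> Vec<u64> {
    if is_prime(x) {
        // Terminating case: a prime is its own only prime factor.
        vec![x]
    } else {
        // Recursive case: flatten the prime factors of each factor.
        factors(x).into_iter().flat_map(prime_factors).collect()
    }
}
```

For example, prime_factors(12) first splits 12 into 2 and 6, keeps 2 as is, and recurses on 6 to get 2 and 3, which flattens to 2, 2, 3.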

Why is the second solution faster than the first?

First:
boolean isPrime(int n) {
    if (n < 2)
        return false;
    return isPrime(n, 2);
}
private boolean isPrime(int n, int i) {
    if (i == n)
        return true;
    return (n % i == 0) ? false : isPrime(n, i + 1);
}
Second:
boolean isPrime(int n) {
    if (n < 0) n = -n;
    if (n < 2) return false;
    if (n == 2) return true;
    if (n % 2 == 0) return false;
    return rec_isPrime(n, 3);
}
boolean rec_isPrime(int n, int div) {
    if (div * div > n) return true;
    if (n % div == 0) return false;
    return rec_isPrime(n, div + 2);
}
Please explain why the second solution is better than the first. I offered the first solution in an exam and my points were reduced under the claim that the solution is not efficient. I want to know what the big difference is.
So this is a test question, and I always keep in mind that some professors have a colorful prerogative, but I could see a few reasons one might claim the first is slower:
When calculating primes you really only need to test whether other primes are factors. The second solution starts with an odd divisor, 3, then adds 2 on every recursive call, which skips checking even factors and halves the number of calls needed.
As @uselpa pointed out, the second code snippet stops when the square of the testing factor is greater than n, which effectively means that all odd numbers between 1 and n have been accounted for. This allows deducing that n is prime faster than the first, which counts all the way up to n before declaring it prime.
One may also argue that, since the first tests for even divisors inside the recursive function instead of in the outer method as the second does, it adds unnecessary calls to the call stack.
I have also seen some claims that ternary operations are slower than if-else checks, so your professor may hold this belief. [Personally, I am not convinced there is a performance difference.]
Hope this helps. Was fun to think about some primes!
The first solution has a complexity of O(n) as it takes linear time; the second solution takes O(sqrt(n)) due to this line of code: if (div * div > n) return true;, because looking for a divisor past the square root is not required. For more details, see: Why do we check up to the square root of a prime number to determine if it is prime?
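To put rough numbers on the difference, here is a small sketch that counts how many divisibility checks each strategy performs. It is written iteratively in Rust rather than recursively in Java, the function names are made up for this illustration, and both functions assume n >= 2.

```rust
// First approach: try every i in 2..n. Returns (is_prime, checks performed).
fn linear_checks(n: u64) -> (bool, u64) {
    let mut checks = 0;
    for i in 2..n {
        checks += 1;
        if n % i == 0 {
            return (false, checks);
        }
    }
    (true, checks)
}

// Second approach: handle 2 separately, then try odd divisors while div * div <= n.
fn sqrt_checks(n: u64) -> (bool, u64) {
    if n == 2 {
        return (true, 0);
    }
    let mut checks = 1; // the n % 2 test
    if n % 2 == 0 {
        return (false, checks);
    }
    let mut div = 3;
    while div * div <= n {
        checks += 1;
        if n % div == 0 {
            return (false, checks);
        }
        div += 2;
    }
    (true, checks)
}
```

For the prime 101, the first approach performs 99 checks (every i from 2 to 100), while the second performs 5 (the test against 2, then 3, 5, 7 and 9). In the recursive Java versions each check is also a stack frame, so the first solution's recursion depth grows linearly with n.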

Understanding recursion

I am struggling to understand this recursion used in the dynamic programming example. Can anyone explain the working of this. The objective is to find the least number of coins for a value.
//f(n) = 1 + min f(n-d) for all denominations d
Pseudocode:
int memo[128]; //initialized to -1
int min_coin(int n)
{
    if(n < 0) return INF;
    if(n == 0) return 0;
    if(memo[n] != -1)
    int ans = INF;
    for(int i = 0; i < num_denomination; ++i)
    {
        ans = min(ans, min_coin(n - denominations[i]));
    }
    return memo[n] = ans+1; //when does this get called?
}
This particular example is explained very well in this article at Topcoder.
Basically this recursion is using the solutions to smaller problems (least number of coins for a smaller n) to find the solution for the overall problem. The dynamic programming aspect of this is the memoization of the solutions to the sub-problems so they don't have to be recalculated every time.
And yes - there are {} missing as ring0 mentioned in his comment - the recursion should only be executed if the sub-problem has not been solved before.
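For reference, a minimal sketch of the repaired function, translated to Rust, could look like this. The memo check returns the cached value, which is what the pseudocode is missing. The wrapper least_coins, the Option-based memo and the clamp that keeps unreachable amounts at INF are my own choices, and the sketch assumes all denominations are positive.

```rust
// "Unreachable" marker, halved so that INF + 1 cannot overflow.
const INF: u32 = u32::MAX / 2;

// f(n) = 1 + min f(n - d) over all denominations d, with memoization.
fn min_coin(n: i64, denominations: &[i64], memo: &mut [Option<u32>]) -> u32 {
    if n < 0 {
        return INF;
    }
    if n == 0 {
        return 0;
    }
    // The guard the pseudocode lacks: reuse an already solved sub-problem.
    if let Some(cached) = memo[n as usize] {
        return cached;
    }
    let mut ans = INF;
    for &d in denominations {
        ans = ans.min(min_coin(n - d, denominations, memo));
    }
    // Reached only after every recursive call above has returned.
    let result = if ans >= INF { INF } else { ans + 1 };
    memo[n as usize] = Some(result);
    result
}

// Hypothetical convenience wrapper: allocates the memo table and maps INF to None.
fn least_coins(n: usize, denominations: &[i64]) -> Option<u32> {
    let mut memo: Vec<Option<u32>> = vec![None; n + 1];
    let best = min_coin(n as i64, denominations, &mut memo);
    if best >= INF { None } else { Some(best) }
}
```

With denominations 1, 5, 6 and 9, least_coins(11, ...) gives Some(2), using 5 + 6. The assignment to memo runs once per amount, on the way back up from the recursive calls for that amount.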
To answer the owner's question, "when does this get called?": in a solution based on a recursive program, the function calls itself but eventually returns. When does it return? As soon as it stops calling itself.
f(a) {
    if (a > 0) f(a-1);
    display "x"
}
f(5);
f(5) calls f(4), which in turn calls f(3), which calls f(2), which calls f(1), which calls f(0).
In f(0), a is 0, so it does not call f() again; it displays "x" and returns. Control returns to f(1), which, now that its call to f(0) is done, also displays "x". f(1) ends, f(2) displays "x", and so on up to f(5). You get 6 "x".
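The same trace can be reproduced with a short Rust sketch. Instead of displaying "x" directly, each call records which invocation produced it; the out parameter is an assumption of this example so the order can be inspected afterwards.

```rust
// Mirrors f(a) from the pseudocode: recurse first, then record an "x" on the way back.
fn f(a: u32, out: &mut Vec<String>) {
    if a > 0 {
        f(a - 1, out);
    }
    // Runs only after the nested call (if any) has returned.
    out.push(format!("x from f({})", a));
}
```

Calling f(5, &mut out) leaves six entries in out, starting with "x from f(0)" and ending with "x from f(5)": the deepest call finishes first and the outermost call finishes last.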
In other words, building on what ring0 has already mentioned: it gets called when the program reaches the base case and starts to unwind by going back up the stack (call frames). For a similar case using a factorial example, see this.
#!/usr/bin/env perl
use strict;
use IO::Handle;
use Carp qw(cluck);
STDOUT->autoflush(1);
STDERR->autoflush(1);
sub factorial {
    my $v = shift;
    dummy_func();
    return 1 if $v == 1;
    print "Variable v value: $v and its address:", \$v, "\ncurrent sub factorial addr:", \&factorial, "\n","-"x40;
    return $v * factorial($v - 1);
}
sub dummy_func {
    cluck;
}
factorial(5);
