How to rewrite Vec.append "in favor of" extend? - vector

When the following is submitted to the compiler
fn main()
{
let abc = vec![10u, 20u, 30u];
let bcd = vec![20u, 30u, 40u];
let cde = abc.append(bcd.as_slice());
println!("{}", cde);
}
the compiler emits the following warning:
this function has been deprecated in favor of extend()
How would the equivalent look using extend?

Take a look at the signature for extend:
fn extend<I: Iterator<T>>(&mut self, iterator: I)
Note that it takes self by a mutable reference, and that it doesn’t take a slice but rather an iterator (which is more general-purpose).
The end result would look like this, then:
abc.extend(bcd.into_iter());
Or this:
abc.extend(bcd.iter().map(|&i| i))
(Bearing in mind that Vec.iter() produces something that iterates over references rather than values, hence the need for .map(|&i| i).)
I am a little surprised that it is recommending extend, as push_all is a much more direct replacement, taking a slice rather than an iterator:
abc.push_all(bcd.as_slice());

Related

Error: Reached the recursion limit while instantiating `<std::iter::Filter<std::iter::Sk...]>::{closure#0}]>::{closure#0}]>`

I have a function with following declaration:
fn find_solutions_internal<'a, I>(&self, words: I, iterate_from: usize, chain: &mut Vec<u32>)
where I: Iterator<Item=&'a u32> + Clone;
Because the function recursively iterates over the same vector (max-depth of recursion is 5) with different filters I decided that it would be more effecient to pass the iterator as an argument.
Following part of code causes error:
let iterator = words.skip(iterate_from).filter(|&mask| mask & last_mask == 0);
let filtered_words = iterator.clone();
iterator.enumerate().for_each(|(i, &mask)| {
chain.push(mask);
self.find_solutions_internal(filtered_words.clone(), i, chain); // filtered_words.clone() is what indirectly causes the error
chain.pop();
});
The error:
error: reached the recursion limit while instantiating `<std::iter::Filter<std::iter::Sk...]>::{closure#0}]>::{closure#0}]>`
115 | self.iter.fold(init, fold)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
|
note: `<std::iter::Filter<I, P> as Iterator>::fold` defined here
...
Self-contained example of the problematic code: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=ada8578f166ad2a34373d82cc376921f
Working example with collected iteration to vector: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=4c13d2d0ac17038dac61c9e2d59bbab5
As #Jmb commented, the compiler does not try figuring out how many times the function recurses, at least not for the purpose of allowing this kind of code to compile. The compiler assumes each call to find_solutions_internal() can potentially recurse, and so the compiler gets stuck repeatedly instantiating the function, as each recursive call has a unique iterator parameter type.
We can fix this problem by passing the iterator as a trait object when making the recursive call, though the fact that we're cloning the iterator complicates things, as the Clone trait doesn't work with trait objects. We can work around that with the dyn_clone crate, at the cost of some boilerplate.
First we define a clonable iterator trait:
use dyn_clone::DynClone;
trait ClonableIterator<T>: Iterator<Item = T> + DynClone {}
impl<I, T> ClonableIterator<T> for I where I: Iterator<Item = T> + DynClone {}
dyn_clone::clone_trait_object!(<T> ClonableIterator<T>);
Then in the recursive call we construct the trait object:
self.find_solutions_internal(
Box::new(filtered_words.clone()) as Box<dyn ClonableIterator<_>>,
i,
chain,
);
While the above solution works, I think it'll likely end up slower than doing the simple thing of just collecting into a Vec. If the vector is really big, using a datastructure like Vector from the im crate, which supports O(1) cloning and O(log n) remove might be faster.

How to invoke a multi-argument function without creating a closure?

I came across this while doing the 2018 Advent of Code (Day 2, Part 1) solution in Rust.
The problem to solve:
Take the count of strings that have exactly two of the same letter, multiplied by the count of strings that have exactly three of the same letter.
INPUT
abcdega
hihklmh
abqasbb
aaaabcd
The first string abcdega has a repeated twice.
The second string hihklmh has h repeated three times.
The third string abqasbb has a repeated twice, and b repeated three times, so it counts for both.
The fourth string aaaabcd contains a letter repeated 4 times (not 2, or 3) so it does not count.
So the result should be:
2 strings that contained a double letter (first and third) multiplied by 2 strings that contained a triple letter (second and third) = 4
The Question:
const PUZZLE_INPUT: &str =
"
abcdega
hihklmh
abqasbb
aaaabcd
";
fn letter_counts(id: &str) -> [u8;26] {
id.chars().map(|c| c as u8).fold([0;26], |mut counts, c| {
counts[usize::from(c - b'a')] += 1;
counts
})
}
fn has_repeated_letter(n: u8, letter_counts: &[u8;26]) -> bool {
letter_counts.iter().any(|&count| count == n)
}
fn main() {
let ids_iter = PUZZLE_INPUT.lines().map(letter_counts);
let num_ids_with_double = ids_iter.clone().filter(|id| has_repeated_letter(2, id)).count();
let num_ids_with_triple = ids_iter.filter(|id| has_repeated_letter(3, id)).count();
println!("{}", num_ids_with_double * num_ids_with_triple);
}
Rust Playground
Consider line 21. The function letter_counts takes only one argument, so I can use the syntax: .map(letter_counts) on elements that match the type of the expected argument. This is really nice to me! I love that I don't have to create a closure: .map(|id| letter_counts(id)). I find both to be readable, but the former version without the closure is much cleaner to me.
Now consider lines 22 and 23. Here, I have to use the syntax: .filter(|id| has_repeated_letter(3, id)) because the has_repeated_letter function takes two arguments. I would really like to do .filter(has_repeated_letter(3)) instead.
Sure, I could make the function take a tuple instead, map to a tuple and consume only a single argument... but that seems like a terrible solution. I'd rather just create the closure.
Leaving out the only argument is something that Rust lets you do. Why would it be any harder for the compiler to let you leave out the last argument, provided that it has all of the other n-1 arguments for a function that takes n arguments.
I feel like this would make the syntax a lot cleaner, and it would fit in a lot better with the idiomatic functional style that Rust prefers.
I am certainly no expert in compilers, but implementing this behavior seems like it would be straightforward. If my thinking is incorrect, I would love to know more about why that is so.
No, you cannot pass a function with multiple arguments as an implicit closure.
In certain cases, you can choose to use currying to reduce the arity of a function. For example, here we reduce the add function from 2 arguments to one:
fn add(a: i32, b: i32) -> i32 {
a + b
}
fn curry<A1, A2, R>(f: impl FnOnce(A1, A2) -> R, a1: A1) -> impl FnOnce(A2) -> R {
move |a2| f(a1, a2)
}
fn main() {
let a = Some(1);
a.map(curry(add, 2));
}
However, I agree with the comments that this isn't a benefit:
It's not any less typing:
a.map(curry(add, 2));
a.map(|v| add(v, 2));
The curry function is extremely limited: it chooses to use FnOnce, but Fn and FnMut also have use cases. It only applies to a function with two arguments.
However, I have used this higher-order function trick in other projects, where the amount of code that is added is much greater.

How to chain tokio read functions?

Is there a way to chain the read_* functions in tokio::io in a "recursive" way ?
I'm essentially looking to do something like:
read_until x then read_exact y then write response then go back to the top.
In case you are confused what functions i'm talking about: https://docs.rs/tokio/0.1.11/tokio/io/index.html
Yes, there is a way.
read_until is returns a struct ReadUntil, which implements the Future-trait, which iteself provides a lot of useful functions, e.g. and_then which can be used to chain futures.
A simple (and silly) example looks like this:
extern crate futures;
extern crate tokio_io; // 0.1.8 // 0.1.24
use futures::future::Future;
use std::io::Cursor;
use tokio_io::io::{read_exact, read_until};
fn main() {
let cursor = Cursor::new(b"abcdef\ngh");
let mut buf = vec![0u8; 2];
println!(
"{:?}",
String::from_utf8_lossy(
read_until(cursor, b'\n', vec![])
.and_then(|r| read_exact(r.0, &mut buf))
.wait()
.unwrap()
.1
)
);
}
Here I use a Cursor, which happens to implement the AsyncRead-trait and use the read_until function to read until a newline occurs (between 'f' and 'g').
Afterwards to chain those I use and_then to use read_exact in case of an success, use wait to get the Result unwrap it (don't do this in production kids!) and take the second argument from the tuple (the first one is the cursor).
Last I convert the Vec into a String to display "gh" with println!.

Concatenate a vector of vectors of strings

I'm trying to write a function that receives a vector of vectors of strings and returns all vectors concatenated together, i.e. it returns a vector of strings.
The best I could do so far has been the following:
fn concat_vecs(vecs: Vec<Vec<String>>) -> Vec<String> {
let vals : Vec<&String> = vecs.iter().flat_map(|x| x.into_iter()).collect();
vals.into_iter().map(|v: &String| v.to_owned()).collect()
}
However, I'm not happy with this result, because it seems I should be able to get Vec<String> from the first collect call, but somehow I am not able to figure out how to do it.
I am even more interested to figure out why exactly the return type of collect is Vec<&String>. I tried to deduce this from the API documentation and the source code, but despite my best efforts, I couldn't even understand the signatures of functions.
So let me try and trace the types of each expression:
- vecs.iter(): Iter<T=Vec<String>, Item=Vec<String>>
- vecs.iter().flat_map(): FlatMap<I=Iter<Vec<String>>, U=???, F=FnMut(Vec<String>) -> U, Item=U>
- vecs.iter().flat_map().collect(): (B=??? : FromIterator<U>)
- vals was declared as Vec<&String>, therefore
vals == vecs.iter().flat_map().collect(): (B=Vec<&String> : FromIterator<U>). Therefore U=&String.
I'm assuming above that the type inferencer is able to figure out that U=&String based on the type of vals. But if I give the expression the explicit types in the code, this compiles without error:
fn concat_vecs(vecs: Vec<Vec<String>>) -> Vec<String> {
let a: Iter<Vec<String>> = vecs.iter();
let b: FlatMap<Iter<Vec<String>>, Iter<String>, _> = a.flat_map(|x| x.into_iter());
let c = b.collect();
print_type_of(&c);
let vals : Vec<&String> = c;
vals.into_iter().map(|v: &String| v.to_owned()).collect()
}
Clearly, U=Iter<String>... Please help me clear up this mess.
EDIT: thanks to bluss' hint, I was able to achieve one collect as follows:
fn concat_vecs(vecs: Vec<Vec<String>>) -> Vec<String> {
vecs.into_iter().flat_map(|x| x.into_iter()).collect()
}
My understanding is that by using into_iter I transfer ownership of vecs to IntoIter and further down the call chain, which allows me to avoid copying the data inside the lambda call and therefore - magically - the type system gives me Vec<String> where it used to always give me Vec<&String> before. While it is certainly very cool to see how the high-level concept is reflected in the workings of the library, I wish I had any idea how this is achieved.
EDIT 2: After a laborious process of guesswork, looking at API docs and using this method to decipher the types, I got them fully annotated (disregarding the lifetimes):
fn concat_vecs(vecs: Vec<Vec<String>>) -> Vec<String> {
let a: Iter<Vec<String>> = vecs.iter();
let f : &Fn(&Vec<String>) -> Iter<String> = &|x: &Vec<String>| x.into_iter();
let b: FlatMap<Iter<Vec<String>>, Iter<String>, &Fn(&Vec<String>) -> Iter<String>> = a.flat_map(f);
let vals : Vec<&String> = b.collect();
vals.into_iter().map(|v: &String| v.to_owned()).collect()
}
I'd think about: why do you use iter() on the outer vec but into_iter() on the inner vecs? Using into_iter() is actually crucial, so that we don't have to copy first the inner vectors, then the strings inside, we just receive ownership of them.
We can actually write this just like a summation: concatenate the vectors two by two. Since we always reuse the allocation & contents of the same accumulation vector, this operation is linear time.
To minimize time spent growing and reallocating the vector, calculate the space needed up front.
fn concat_vecs(vecs: Vec<Vec<String>>) -> Vec<String> {
let size = vecs.iter().fold(0, |a, b| a + b.len());
vecs.into_iter().fold(Vec::with_capacity(size), |mut acc, v| {
acc.extend(v); acc
})
}
If you do want to clone all the contents, there's already a method for that, and you'd just use vecs.concat() /* -> Vec<String> */
The approach with .flat_map is fine, but if you don't want to clone the strings again you have to use .into_iter() on all levels: (x is Vec<String>).
vecs.into_iter().flat_map(|x| x.into_iter()).collect()
If instead you want to clone each string you can use this: (Changed .into_iter() to .iter() since x here is a &Vec<String> and both methods actually result in the same thing!)
vecs.iter().flat_map(|x| x.iter().map(Clone::clone)).collect()

Is there a way of providing a final transform method when chaining operations (like map reduce) in underscore.js?

(Really strugging to title this question, so if anyone has suggestions feel free.)
Say I wanted to do an operation like:
take an array [1,2,3]
multiply each element by 2 (map): [2,4,6]
add the elements together (reduce): 12
multiply the result by 10: 120
I can do this pretty cleanly in underscore using chaining, like so:
arr = [1,2,3]
map = (el) -> 2*el
reduce = (s,n) -> s+n
out = (r) -> 10*r
reduced = _.chain(arr).map(map).reduce(reduce).value()
result = out(reduced)
However, it would be even nicer if I could chain the 'out' method too, like this:
result = _.chain(arr).map(map).reduce(reduce).out(out).value()
Now this would be a fairly simple addition to a library like underscore. But my questions are:
Does this 'out' method have a name in functional programming?
Does this already exist in underscore (tap comes close, but not quite).
This question got me quite hooked. Here are some of my thoughts.
It feels like using underscore.js in 'chain() mode' breaks away from functional programming paradigm. Basically, instead of calling functions on functions, you're calling methods of an instance of a wrapper object in an OOP way.
I am using underscore's chain() myself here and there, but this question made me think. What if it's better to simply create more meaningful functions that can then be called in a sequence without having to use chain() at all. Your example would then look something like this:
arr = [1,2,3]
double = (arr) -> _.map(arr, (el) -> 2 * el)
sum = (arr) -> _.reduce(arr, (s, n) -> s + n)
out = (r) -> 10 * r
result = out sum double arr
# probably a less ambiguous way to do it would be
result = out(sum(double arr))
Looking at real functional programming languages (as in .. much more functional than JavaScript), it seems you could do exactly the same thing there in an even simpler manner. Here is the same program written in Standard ML. Notice how calling map with only one argument returns another function. There is no need to wrap this map in another function like we did in JavaScript.
val arr = [1,2,3];
val double = map (fn x => 2*x);
val sum = foldl (fn (a,b) => a+b) 0;
val out = fn r => 10*r;
val result = out(sum(double arr))
Standard ML also lets you create operators which means we can make a little 'chain' operator that can be used to call those functions in a more intuitive order.
infix 1 |>;
fun x |> f = f x;
val result = arr |> double |> sum |> out
I also think that this underscore.js chaining has something similar to monads in functional programming, but I don't know much about those. Though, I have feeling that this kind of data manipulation pipeline is not something you would typically use monads for.
I hope someone with more functional programming experience can chip in and correct me if I'm wrong on any of the points above.
UPDATE
Getting slightly off topic, but one way to creating partial functions could be the following:
// extend underscore with partialr function
_.mixin({
partialr: function (fn, context) {
var args = Array.prototype.slice.call(arguments, 2);
return function () {
return fn.apply(context, Array.prototype.slice.call(arguments).concat(args));
};
}
});
This function can now be used to create a partial function from any underscore function, because most of them take the input data as the first argument. For example, the sum function can now be created like
var sum = _.partialr(_.reduce, this, function (s, n) { return s + n; });
sum([1,2,3]);
I still prefer arr |> double |> sum |> out over out(sum(double(arr))) though. Underscore's chain() is nice in that it reads in a more natural order.
In terms of the name you are looking for, I think what you are trying to do is just a form of function application: you have an underscore object and you want to apply a function to its value. In underscore, you can define it like this:
_.mixin({
app: function(v, f) { return f (v); }
});
then you can pretty much do what you asked for:
var arr = [1,2,3];
function m(el) { return 2*el; };
function r(s,n) { return s+n; };
function out(r) { return 10*r; };
console.log("result: " + _.chain(arr).map(m).reduce(r).app(out).value()));
Having said all that, I think using traditional typed functional languages like SML make this kind of think a lot slicker and give much lighter weight syntax for function composition. Underscore is doing a kind of jquery twist on functional programming that I'm not sure what I think of; but without static-type checking it is frustratingly easy to make errors!

Resources