fn get_variable_info (route_path: &str) -> HashMap<String, uint> {
let mut map = HashMap::new();
let mut i = 0;
for matched in REGEX_VAR_SEQ.captures_iter(route_path) {
map.insert(matched.at(1).to_string(), i);
i = i + 1;
}
map
}
I have this function that takes a &str and loops through an Iterator of captures to produce a HashMap<String, uint>. I don't like the imperative fashion of this and wonder if this could be rewritten in a more functional way in Rust?
In pseudo code, something like this would be more what I'm after.
let mut i = 0;
REGEX_VAR_SEQ
.captures_iter(route_path)
.map(| matched | {
KeyValuePair{
key: matched.at(1).to_string(),
value: i
}
i = i + 1;
KeyValuePair
})
.toHashMap()
Well, this is still not perfect because I don't like the i variable but my first goal would be to get rid of the imperative loop :)
You’re pretty close! Your KeyValuePair and toHashMap is actually Iterator.collect, which works on the FromIterator trait, which HashMap implements for (K, V) pairs.
Thus, it’s something like [(k, v), (k, v), (k, v)].move_iter().collect::<HashMap<K, V>>(). (move_iter is the pre-1.0 name; in current Rust the method is into_iter.)
For the i part, there is Iterator.enumerate, which turns [a, b, c] into [(0, a), (1, b), (2, c)].
And so this is the end result:
REGEX_VAR_SEQ.captures_iter(route_path)
.enumerate()
.map(|(i, matched)| (matched.at(1).to_string(), i))
.collect()
(You can either leave HashMap<String, uint> to be inferred where that is possible, e.g. from a function's return type, or specify it on the collect call: .collect::<HashMap<_, _>>().)
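As a self-contained sketch of the enumerate/collect pattern in current Rust, the snippet below swaps the regex for plain string splitting so that it needs no external crate. The function name variable_positions and the ":name" route syntax are assumptions made for illustration; they are not part of the original code.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the regex-based function: every path segment
// that starts with ':' is treated as a variable name.
fn variable_positions(route_path: &str) -> HashMap<String, usize> {
    route_path
        .split('/')
        .filter_map(|segment| segment.strip_prefix(':'))
        .enumerate() // pairs each variable with its running index
        .map(|(i, name)| (name.to_string(), i))
        .collect() // HashMap implements FromIterator for (K, V) pairs
}

fn main() {
    let map = variable_positions("/users/:user_id/posts/:post_id");
    println!("{:?}", map.get("post_id"));
}
```

The return type drives inference here, so collect needs no turbofish.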
I want to be able to repeat a process in which the collection we are iterating over is altered n times. n is only known at runtime, and can be specified by the user, so we cannot hard-code it into the type.
An approach that uses intermediate data structures by collect-ing between iterations is possible, like so:
let n = 10;
let mut vec1 = vec![1, 2, 3];
{
for _index in 0..n {
let temp_vec = vec1.into_iter().flat_map(|x| vec![x, x * 2]).collect();
vec1 = temp_vec;
}
}
However, this seems wasteful, because we are creating intermediate data structures, so I went looking for a solution that chains iterators directly.
At first I thought one could just do something like:
let mut iter = vec![1, 2, 3].into_iter();
for index in 0..n {
iter = iter.flat_map(|x| vec![x, x * 2].into_iter());
}
However, this does not work because in Rust, all functions on iterators return their own kind of 'compound iterator' struct. (In for instance Haskell, functions on iterators return the appropriate kind of result iterator, which does not become a 'bigger and bigger compound type'.)
Rewriting this as a recursive function had similar problems because (a) I was returning 'some kind of Iterator' whose type was (near?)-impossible to write out by hand because of the recursion, and (b) this type was different in the base case from the recursive case.
I found this question about conditionally returning either one or the other iterator type, as well as using impl Iterator to indicate that we return some concrete type that implements the Iterator trait, but we do not care about its exact nature.
A similar example to the code in the linked answer has been implemented in the code below as maybe_flatmap. This works.
However, I don't want to run flat_map zero or one time, but rather N times on the incoming iterator. Therefore, I adapted the code to call itself recursively up to a depth of N.
Attempting that makes the Rust compiler complain with error[E0720]: opaque type expands to a recursive type:
use either::Either; // 1.5.3
/// Later we want to work with any appropriate items,
/// but for simplicity's sake, just use plain integers for now.
type I = u64;
/// Works, but limited to single level.
fn maybe_flatmap<T: Iterator<Item = I>>(iter: T, flag: bool) -> impl Iterator<Item = I> {
match flag {
false => Either::Left(iter),
true => Either::Right(iter.flat_map(move |x| vec![x, x * 2].into_iter())),
}
}
/// Does not work: opaque type expands to a recursive type!
fn rec_flatmap<T: Iterator<Item = I>>(iter: T, depth: usize) -> impl Iterator<Item = I> {
match depth {
0 => Either::Left(iter),
_ => {
let iter2 = iter.flat_map(move |x| vec![x, x * 2]).into_iter();
Either::Right(rec_flatmap(iter2, depth - 1))
}
}
}
fn main() {
let xs = vec![1, 2, 3, 4];
let xs2 = xs.into_iter();
let xs3 = maybe_flatmap(xs2, true);
let xs4: Vec<_> = xs3.collect();
println!("{:?}", xs4);
let ys = vec![1, 2, 3, 4];
let ys2 = ys.into_iter();
let ys3 = rec_flatmap(ys2, 5);
let ys4: Vec<_> = ys3.collect();
println!("{:?}", ys4);
}
error[E0720]: opaque type expands to a recursive type
--> src/main.rs:16:65
|
16 | fn rec_flatmap<T: Iterator<Item = I>>(iter: T, depth: usize) -> impl Iterator<Item = I> {
| ^^^^^^^^^^^^^^^^^^^^^^^ expands to a recursive type
|
= note: expanded type is `either::Either<T, impl std::iter::Iterator>`
I am stuck.
Since, regardless of how often you flat_map, the final answer is going to be (an iterator over) a vector of integers, it seems like there ought to be a way of writing this function using only a single concrete return type.
Is this possible? Is there a way out of this situation without resorting to runtime polymorphism?
I believe/hope that a solution without dynamic polymorphism (trait objects or the like) is possible because, regardless of how often you call flat_map, the end result should (at least morally) have the same type. I hope there is a way to shoehorn the (non-matching) nested FlatMap struct into a single matching static type somehow.
Is there a way to resolve this without runtime polymorphism?
No.
To solve it using a trait object:
let mut iter: Box<dyn Iterator<Item = i32>> = Box::new(vec![1, 2, 3].into_iter());
for _ in 0..n {
iter = Box::new(iter.flat_map(|x| vec![x, x * 2].into_iter()));
}
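For completeness, here is the trait-object approach wrapped into a small runnable sketch. The function name repeat_flat_map and the u64 item type are choices made for this illustration only.

```rust
// Applies the same flat_map step `n` times, re-boxing after each step so
// that the variable keeps a single type no matter how deep the chain gets.
fn repeat_flat_map(items: Vec<u64>, n: usize) -> Box<dyn Iterator<Item = u64>> {
    let mut iter: Box<dyn Iterator<Item = u64>> = Box::new(items.into_iter());
    for _ in 0..n {
        // Each pass wraps the previous boxed iterator in a new FlatMap.
        iter = Box::new(iter.flat_map(|x| vec![x, x * 2]));
    }
    iter
}

fn main() {
    let n = 2; // could just as well come from user input at runtime
    let result: Vec<u64> = repeat_flat_map(vec![1, 2, 3], n).collect();
    println!("{:?}", result);
}
```

The cost is one heap allocation and one level of dynamic dispatch per iteration of the loop, but no intermediate Vec of results is materialized between steps.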
regardless of how often you call flat_map, the end result should (at least morally) have the same type
I don't know which morality to apply to type systems, but the literal size in memory is (very likely to be) different for FlatMap<...> and FlatMap<FlatMap<...>>. They are different types.
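The size difference can be observed directly. The following is a small sketch using std::mem::size_of_val; the helper name adapter_sizes is made up for this example, and the exact byte counts depend on the platform and compiler version, so none are quoted here.

```rust
use std::mem::size_of_val;

// Returns the in-memory sizes of a one-level and a two-level flat_map chain.
fn adapter_sizes() -> (usize, usize) {
    let once = vec![1u64, 2, 3].into_iter().flat_map(|x| vec![x, x * 2]);
    let twice = vec![1u64, 2, 3]
        .into_iter()
        .flat_map(|x| vec![x, x * 2])
        .flat_map(|x| vec![x, x * 2]);
    // The outer FlatMap stores the inner one by value, so it must be larger.
    (size_of_val(&once), size_of_val(&twice))
}

fn main() {
    let (once, twice) = adapter_sizes();
    println!("one level: {} bytes, two levels: {} bytes", once, twice);
}
```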
See also:
Conditionally iterate over one of several possible iterators
Creating Diesel.rs queries with a dynamic number of .and()'s
How do I iterate over a Vec of functions returning Futures in Rust?
How can I extend the lifetime of a temporary variable inside of an iterator adaptor in Rust?
Why does Iterator::take_while take ownership of the iterator?
I have a program in which I fold over a vector of strings, and increment a value and build another vector over that. Something like this:
struct Widget { foo: isize, text: &'static str }
let foos = vec!["a", "b", "c"];
let (final, ws) = foos.iter().fold((0, widgets), |(x, ws), text| {
(x+1, Widget { foo: x, text: text })
});
The problem with the above code is that I need to append the new widget to ws, not return the new widget in place of ws. To do this, I need a way to non-destructively append a value to a vector. Here's the signature, roughly, of the function I'm looking for:
fn append<T>(vec: Vec<T>, item: T) -> Vec<T>
I can't seem to find a function like this. It seems like this (or this as a method on Vec) would be very idiomatic for Rust, but it's just not there. Is there another way to implement this more idiomatically, or is there a function like the above?
non-destructively append a value to a vector
This is impossible. Modifying a vector... modifies it. You can't get around that. You can choose to clone the vector, which avoids mutating the original vector. You still need to mutate the clone, however:
let things = vec![1];
let things2 = {
let mut tmp = things.clone();
tmp.push(2);
tmp
};
If you are done with the original vector, you can reuse it:
let things = vec![1];
let things2 = {
let mut tmp = things;
tmp.push(2);
tmp
};
This can be extracted into your desired function signature:
fn append(vec: Vec<i32>, item: i32) -> Vec<i32> {
let mut vec = vec;
vec.push(item);
vec
}
Which would idiomatically but equivalently be written as
fn append(mut vec: Vec<i32>, item: i32) -> Vec<i32> {
vec.push(item);
vec
}
let things = vec![1];
let things2 = append(things.clone(), 2);
let things2 = append(things, 2);
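The helper works just as well in generic form, and it can then be threaded through the fold from the question. This is only a sketch of that shape: build_widgets is a name invented here, and the derive on Widget was added so the result can be printed and compared.

```rust
#[derive(Debug, PartialEq)]
struct Widget {
    foo: isize,
    text: &'static str,
}

// Generic form of the helper: takes ownership, pushes, and returns the vector.
fn append<T>(mut vec: Vec<T>, item: T) -> Vec<T> {
    vec.push(item);
    vec
}

// The fold from the question, passing the vector through the accumulator.
fn build_widgets(texts: &[&'static str]) -> (isize, Vec<Widget>) {
    texts.iter().fold((0, Vec::new()), |(x, ws), &text| {
        (x + 1, append(ws, Widget { foo: x, text }))
    })
}

fn main() {
    let (count, ws) = build_widgets(&["a", "b", "c"]);
    println!("{} widgets: {:?}", count, ws);
}
```

Because the accumulator is moved into append and moved back out, no cloning takes place.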
It seems like this (or this as a method on Vec) would be very idiomatic for Rust
Not really - to have this method signature, you have to own the Vec. It's more common to only have a mutable reference to the Vec, which offers a different set of capabilities.
There was some talk about implementing Add for Vec, which might have allowed something like let things2 = things + 2, but I don't know whether that went anywhere.
For what it's worth, I'd write your code as:
let foos = ["a", "b", "c"];
let ws: Vec<_> = foos.iter()
.enumerate()
.map(|(i, text)| Widget { foo: i as isize, text })
.collect();
I tried the following code:
fn main() {
let v2 = vec![1; 10];
println!("{}", v2);
}
But the compiler complains:
error[E0277]: `std::vec::Vec<{integer}>` doesn't implement `std::fmt::Display`
--> src/main.rs:3:20
|
3 | println!("{}", v2);
| ^^ `std::vec::Vec<{integer}>` cannot be formatted with the default formatter
|
= help: the trait `std::fmt::Display` is not implemented for `std::vec::Vec<{integer}>`
= note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
= note: required by `std::fmt::Display::fmt`
Does anyone implement this trait for Vec<T>?
let v2 = vec![1; 10];
println!("{:?}", v2);
{} is for strings and other values which can be displayed directly to the user. There's no single way to show a vector to a user.
The {:?} formatter can be used to debug it, and it will look like:
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Display is the trait that provides the method behind {}, and Debug is for {:?}
Does anyone implement this trait for Vec<T> ?
No.
And surprisingly, this is a demonstrably correct answer; which is rare since proving the absence of things is usually hard or impossible. So how can we be so certain?
Rust has very strict coherence rules; an impl Trait for Struct can only be written:
either in the same crate as Trait
or in the same crate as Struct
and nowhere else; let's try it:
impl<T> std::fmt::Display for Vec<T> {
fn fmt(&self, _: &mut std::fmt::Formatter) -> Result<(), std::fmt::Error> {
Ok(())
}
}
yields:
error[E0210]: type parameter `T` must be used as the type parameter for some local type (e.g., `MyStruct<T>`)
--> src/main.rs:1:1
|
1 | impl<T> std::fmt::Display for Vec<T> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ type parameter `T` must be used as the type parameter for some local type
|
= note: only traits defined in the current crate can be implemented for a type parameter
Furthermore, to use a trait, it needs to be in scope (and therefore, you need to be linked to its crate), which means that:
you are linked both with the crate of Display and the crate of Vec
neither implements Display for Vec
and therefore leads us to conclude that no one implements Display for Vec.
As a workaround, as indicated by Manishearth, you can use the Debug trait, which is invoked with "{:?}" as the format specifier.
If you know the type of the elements that the vector contains, you could make a struct that wraps the vector and implement Display for that struct.
use std::fmt::{Display, Formatter, Error};

struct NumVec(Vec<u32>);

impl Display for NumVec {
    fn fmt(&self, f: &mut Formatter) -> Result<(), Error> {
        // Build the comma-separated text first. Iterating with an index
        // avoids the `len() - 1` underflow that panics on an empty vector.
        let mut comma_separated = String::new();
        for (i, num) in self.0.iter().enumerate() {
            if i > 0 {
                comma_separated.push_str(", ");
            }
            comma_separated.push_str(&num.to_string());
        }
        write!(f, "{}", comma_separated)
    }
}

fn main() {
    let numbers = NumVec(vec![1; 10]);
    println!("{}", numbers);
}
Here is a one-liner which should also work for you, although it leaves a trailing ", " after the last element:
println!("[{}]", v2.iter().fold(String::new(), |acc, &num| acc + &num.to_string() + ", "));
In my own case, I was receiving a Vec<&str> from a function call. I did not want to change the function signature to a custom type (for which I could implement the Display trait).
For my one-off case, I was able to turn the display of my Vec into a one-liner which I used with println!() directly as follows:
println!("{}", myStrVec.iter().fold(String::new(), |acc, &arg| acc + arg));
(The lambda can be adapted for use with different data types, or for more concise Display trait implementations.)
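If a separator is wanted between elements but not after the last one, one common alternative to the fold-based one-liners is to map each element to a String and call join on the resulting Vec. The helper name join_display below is made up for this sketch.

```rust
use std::fmt::Display;

// Renders each element with its Display impl and joins the pieces with ", ".
fn join_display<T: Display>(items: &[T]) -> String {
    items
        .iter()
        .map(|item| item.to_string())
        .collect::<Vec<_>>()
        .join(", ")
}

fn main() {
    let v2 = vec![1; 3];
    println!("[{}]", join_display(&v2));
}
```

This allocates one String per element plus the joined result, which is usually fine for logging and debugging output.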
Starting with Rust 1.58, there is a slightly more concise way to print a vector (or any other variable). This lets you put the variable you want to print inside the curly braces, instead of needing to put it at the end. For the debug formatting needed to print a vector, you add :? in the braces, like this:
fn main() {
let v2 = vec![1; 10];
println!("{v2:?}");
}
Sometimes you don't want to use something like the accepted answer
let v2 = vec![1; 10];
println!("{:?}", v2);
because you want each element to be displayed using its Display trait, not its Debug trait; however, as noted, you can't implement Display on Vec because of Rust's coherence rules. Instead of implementing a wrapper struct with the Display trait, you can implement a more general solution with a function like this:
use std::fmt;
pub fn iterable_to_str<I, D>(iterable: I) -> String
where
I: IntoIterator<Item = D>,
D: fmt::Display,
{
let mut iterator = iterable.into_iter();
let head = match iterator.next() {
None => return String::from("[]"),
Some(x) => format!("[{}", x),
};
let body = iterator.fold(head, |a, v| format!("{}, {}", a, v));
format!("{}]", body)
}
which doesn't require wrapping your vector in a struct. As long as it implements IntoIterator and the element type implements Display, you can then call:
println!("{}", iterable_to_str(it));
Is there any reason not to write the vector's contents item by item, without first collecting them into a single String? *)
use std::fmt::{Display, Formatter, Error};
struct NumVec(Vec<u32>);
impl Display for NumVec {
fn fmt(&self, f: &mut Formatter) -> Result<(), Error> {
let v = &self.0;
if v.is_empty() {
return Ok(());
}
for num in &v[0..v.len() - 1] {
write!(f, "{}, ", &num.to_string())?;
}
write!(f, "{}", &v[v.len() - 1])
}
}
fn main() {
let numbers = NumVec(vec![1; 10]);
println!("{}", numbers);
}
*) No, there isn't.
Because we want to display each element, its type must implement Display, so calling to_string() on it is valid Rust. The documentation says about the ToString trait:
"This trait is automatically implemented for any type which implements the Display trait. As such, ToString shouldn’t be implemented directly: Display should be implemented instead, and you get the ToString implementation for free."
In particular on microcontrollers where space is limited I definitely would go with this solution and write immediately.
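The same write-as-you-go idea can be made generic over the element type, and it can skip the per-element to_string() call by forwarding straight to each element's Display impl. The following is a sketch under those assumptions; the wrapper name CommaSeparated is hypothetical.

```rust
use std::fmt::{self, Display, Formatter};

// Hypothetical borrowed wrapper: works for any element type implementing Display.
struct CommaSeparated<'a, T>(&'a [T]);

impl<'a, T: Display> Display for CommaSeparated<'a, T> {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        for (i, item) in self.0.iter().enumerate() {
            if i > 0 {
                f.write_str(", ")?;
            }
            // Forward to the element's own Display impl; no intermediate String.
            write!(f, "{}", item)?;
        }
        Ok(())
    }
}

fn main() {
    let numbers = vec![1u32; 10];
    println!("{}", CommaSeparated(&numbers));
}
```

Borrowing a slice instead of owning a Vec means the caller keeps its data, and an empty slice simply prints nothing.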
To create a new vector with the contents of other vectors, I'm currently doing this:
fn func(a: &Vec<i32>, b: &Vec<i32>, c: &Vec<i32>) {
let abc: Vec<i32> = {
let mut tmp = Vec::with_capacity(a.len() + b.len() + c.len());
tmp.extend(a);
tmp.extend(b);
tmp.extend(c);
tmp
};
// ...
}
Is there a more straightforward / elegant way to do this?
There is a concat method that can be used for this, however the values need to be slices, or borrowable to slices, not &Vec<_> as given in the question.
An example, similar to the question:
fn func(a: &Vec<i32>, b: &Vec<i32>, c: &Vec<i32>) {
let abc: Vec<i32> = [a.as_slice(), b.as_slice(), c.as_slice()].concat();
// ...
}
However, as @mindTree notes, using &[i32] type arguments is more idiomatic and removes the need for conversion, e.g.:
fn func(a: &[i32], b: &[i32], c: &[i32]) {
let abc: Vec<i32> = [a, b, c].concat();
// ...
}
SliceConcatExt::concat is a more general version of your function and can join multiple slices into a Vec. It sums the sizes of the slices to pre-allocate a Vec of the right capacity, then extends it repeatedly.
fn concat(&self) -> Vec<T> {
let size = self.iter().fold(0, |acc, v| acc + v.borrow().len());
let mut result = Vec::with_capacity(size);
for v in self {
result.extend_from_slice(v.borrow())
}
result
}
One possible solution might be to use the Chain iterator:
let abc: Vec<_> = a.iter().chain(b).chain(c).collect();
However, in your example you are borrowing the slices, so we'll need to either deref each borrowed element or use the Cloned iterator to copy each integer. Cloned is probably a bit easier and as efficient as we are working with small Copy data (i32):
let abc: Vec<_> = a.iter().cloned()
.chain(b.iter().cloned())
.chain(c.iter().cloned())
.collect();
Seeing as each of these iterators is an ExactSizeIterator, it should be possible to allocate the exact size for the target Vec up front; however, I'm unaware whether this is actually the case in the std implementation (they might be waiting on specialization to land before adding this optimisation).
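To compare the approaches from this thread side by side, here is a small sketch with all three written as functions over slices. The function names are made up for the illustration, and copied() is used in place of cloned() since i32 is Copy.

```rust
// Variant 1: let the slice `concat` method size the result and copy the data.
fn concat_all(a: &[i32], b: &[i32], c: &[i32]) -> Vec<i32> {
    [a, b, c].concat()
}

// Variant 2: reserve the combined length explicitly, then extend.
fn extend_all(a: &[i32], b: &[i32], c: &[i32]) -> Vec<i32> {
    let mut out = Vec::with_capacity(a.len() + b.len() + c.len());
    out.extend_from_slice(a);
    out.extend_from_slice(b);
    out.extend_from_slice(c);
    out
}

// Variant 3: chain the iterators and copy each element out of its reference.
fn chain_all(a: &[i32], b: &[i32], c: &[i32]) -> Vec<i32> {
    a.iter().chain(b).chain(c).copied().collect()
}

fn main() {
    let (a, b, c) = (vec![1, 2], vec![3], vec![4, 5]);
    println!("{:?}", concat_all(&a, &b, &c));
    println!("{:?}", extend_all(&a, &b, &c));
    println!("{:?}", chain_all(&a, &b, &c));
}
```

Passing &Vec<i32> where &[i32] is expected works through deref coercion, so callers do not need as_slice().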
I wrote a method:
fn foo(input: HashMap<String, Vec<String>>) {...}
I then realized that for the purpose of writing tests, I'd like to have control of the iteration order (maybe a BTreeMap or LinkedHashMap). This led to two questions:
Is there some trait or combination of traits I could use that would essentially express "a map of string to string-vector"? I didn't see anything promising in the docs for HashMap.
It turns out that in this method, I just want to iterate over the map entries, and then the items in each string vector, but couldn't figure out the right syntax for specifying this. What's the correct way to write this?
fn foo(input: IntoIterator<(String, IntoIterator<String>)>) {...}
There's no such trait to describe an abstract HashMap. I believe there's no plan to make one. The best answer so far is your #2 suggestion: for a read-only HashMap you probably just want something to iterate on.
To answer at the syntax level, you tried to write:
fn foo(input: IntoIterator<(String, IntoIterator<String>)>)
But this is not valid because IntoIterator takes no type parameters:
pub trait IntoIterator {
type Item;
type IntoIter: Iterator<Item = Self::Item>;
fn into_iter(self) -> Self::IntoIter;
}
It takes two associated types, however, so what you really wanted to express is probably the following (internally I changed the nested IntoIterator to a concrete type like Vec for simplicity):
fn foo<I>(input: I)
where I: IntoIterator<
Item=(String, Vec<String>),
IntoIter=IntoIter<String, Vec<String>>>
However, the choice of IntoIterator is not always suitable because it implies a transfer of ownership. If you just want to borrow the HashMap for read-only purposes, you'd probably be better off with the item type of a HashMap's standard iterator, Iterator<Item=(&'a String, &'a Vec<String>)>.
fn foo_iter<'a, I>(input: I)
where I: Iterator<Item=(&'a String, &'a Vec<String>)>
Which you can use several times by asking for a new iterator, unlike the first version.
let mut h = HashMap::new();
h.insert("The Beatles".to_string(),
vec!["Come Together".to_string(),
"Twist And Shout".to_string()]);
h.insert("The Rolling Stones".to_string(),
vec!["Paint It Black".to_string(),
"Satisfaction".to_string()]);
foo_iter(h.iter());
foo_iter(h.iter());
foo(h);
//foo(h); <-- error: use of moved value: `h`
EDIT
As asked in comments, here is the version of foo for nested IntoIterators instead of the simpler Vec:
fn foo<I, IVecString>(input: I)
where
I: IntoIterator<
Item=(String, IVecString),
IntoIter=std::collections::hash_map::IntoIter<String, IVecString>>,
IVecString: IntoIterator<
Item=String,
IntoIter=std::vec::IntoIter<String>>
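If the goal is to accept a BTreeMap in tests and a HashMap elsewhere, one option is to leave the IntoIter associated type unconstrained and bind only Item. The sketch below does that; count_titles is a hypothetical function name, and the body just counts entries to show that both levels can be iterated.

```rust
use std::collections::{BTreeMap, HashMap};

// Accepts anything that yields (String, V) pairs where V in turn yields Strings.
fn count_titles<I, V>(input: I) -> usize
where
    I: IntoIterator<Item = (String, V)>,
    V: IntoIterator<Item = String>,
{
    let mut total = 0;
    for (_artist, titles) in input {
        for _title in titles {
            total += 1;
        }
    }
    total
}

fn main() {
    let mut hash = HashMap::new();
    hash.insert("The Beatles".to_string(), vec!["Come Together".to_string()]);
    let mut tree = BTreeMap::new();
    tree.insert("The Rolling Stones".to_string(), vec!["Paint It Black".to_string()]);
    // Both map types satisfy the same bounds; each call consumes its argument.
    println!("{} {}", count_titles(hash), count_titles(tree));
}
```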
There are no traits that define a common interface for containers. The only trait that may suit you is the Index trait.
See below for a working example of the correct syntax for the IntoIterator and Index traits. You need to use references if you don't want to consume the input, so be careful with lifetime parameters.
use std::ops::Index;
use std::iter::IntoIterator;
use std::collections::HashMap;
// this consumes the input
fn foo<I: IntoIterator<Item = (String, String)>>(input: I) {
let mut c = 0;
for _ in input {
c += 1;
}
println!("{}", c);
}
// maybe you want this
fn foo_ref<'a, I: IntoIterator<Item = (&'a String, &'a String)>>(input: I) {
let mut c = 0;
for _ in input {
c += 1;
}
println!("{}", c);
}
fn get<'a, I: Index<&'a String, Output = String>>(table: &I, k: &'a String) {
println!("{}", table[k]);
}
fn main() {
let mut h = HashMap::<String, String>::new();
h.insert("one".to_owned(), "1".to_owned());
h.insert("two".to_owned(), "2".to_owned());
h.insert("three".to_owned(), "3".to_owned());
foo_ref(&h);
get(&h, &"two".to_owned());
}
Edit
I changed the value type to anything that implements the IntoIterator trait:
use std::ops::Index;
use std::iter::IntoIterator;
use std::collections::HashMap;
use std::collections::LinkedList;
fn foo_ref<'a, B, I, >(input: I)
where B : IntoIterator<Item = String>, I: IntoIterator<Item = (&'a String, &'a B)> {
//
}
fn get<'a, B, I>(table: &I, k: &'a String)
where B : IntoIterator<Item = String>, I: Index<&'a String, Output = B>
{
// do something with table[k];
}
fn main() {
let mut h1 = HashMap::<String, Vec<String>>::new();
let mut h2 = HashMap::<String, LinkedList<String>>::new();
foo_ref(&h1);
get(&h1, &"two".to_owned());
foo_ref(&h2);
get(&h2, &"two".to_owned());
}
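To actually walk the inner collections through a borrowed map, the bound can be placed on &'a V rather than on V. The following sketch assumes that is what is wanted; flatten_entries is a made-up name, and a BTreeMap is used in main because its iteration order is sorted by key, which helps with the deterministic tests mentioned in the question.

```rust
use std::collections::{BTreeMap, HashMap, LinkedList};

// Borrows the map: yields "key: value" strings without consuming the input.
fn flatten_entries<'a, I, V>(input: I) -> Vec<String>
where
    I: IntoIterator<Item = (&'a String, &'a V)>,
    V: 'a,
    &'a V: IntoIterator<Item = &'a String>,
{
    let mut out = Vec::new();
    for (key, values) in input {
        for value in values {
            out.push(format!("{}: {}", key, value));
        }
    }
    out
}

fn main() {
    let mut tree = BTreeMap::new();
    tree.insert("b".to_string(), vec!["2".to_string()]);
    tree.insert("a".to_string(), vec!["1".to_string(), "3".to_string()]);
    println!("{:?}", flatten_entries(&tree));

    let mut hash: HashMap<String, LinkedList<String>> = HashMap::new();
    hash.insert("x".to_string(), LinkedList::from(["y".to_string()]));
    println!("{:?}", flatten_entries(&hash));
    // Both maps are still usable here because they were only borrowed.
    println!("{} {}", tree.len(), hash.len());
}
```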