I am looking for an efficient way to iterate over a permutation of the rows in a two-dimensional array in ndarray. I do not need to mutate or keep the permuted array around, so I want to avoid a copy.
That is, I want to do the following, except select allocates an unnecessary array:
use ndarray::{Axis, Array}; // 0.13.1
use rand::seq::SliceRandom; // 0.7.3
use std::iter::FromIterator;
fn main() {
let array = Array::from_iter(0..15).into_shape((5, 3)).unwrap();
println!("Before shuffling rows:\n{}", array);
let mut permutation: Vec<usize> = (0..array.nrows()).collect();
permutation.shuffle(&mut rand::thread_rng());
let permuted = array.select(Axis(0), &permutation);
for (i, row) in permuted.axis_iter(Axis(0)).enumerate() {
println!("Row number {} is {}!", i, row);
}
}
I am aware that the ndarray GitHub page includes an example of something similar, but it involves a block of unsafe code that I do not understand and therefore prefer not to adapt to my own use case.
One obvious answer that I missed, using index_axis:
use ndarray::{Axis, Array}; // 0.13.1
use rand::seq::SliceRandom; // 0.7.3
use std::iter::FromIterator;
fn main() {
let array = Array::from_iter(0..15).into_shape((5, 3)).unwrap();
println!("Before shuffling rows:\n{}", array);
let mut permutation: Vec<usize> = (0..array.nrows()).collect();
permutation.shuffle(&mut rand::thread_rng());
for i in permutation.iter() {
let row = array.index_axis(Axis(0), *i);
println!("Row number {} is {}!", i, row);
}
}
I'm sure there are better ways of doing this, however, and I'm still interested to see them.
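For comparison, the same borrow-instead-of-copy idea can be sketched without ndarray at all. This is only an illustration under assumptions: the matrix is stored as a flat row-major slice, the helper name permuted_rows is made up for this sketch, and the caller supplies the permutation (for example the shuffled Vec<usize> from above). Each row is handed out as a borrowed slice, much like index_axis hands out a view.

```rust
/// Iterate over the rows of a row-major matrix in the order given by
/// `permutation`, borrowing each row instead of copying it.
/// `permuted_rows` is a hypothetical helper, not part of ndarray.
/// Panics while iterating if an index in `permutation` is out of range.
fn permuted_rows<'a>(
    data: &'a [i32],
    ncols: usize,
    permutation: &'a [usize],
) -> impl Iterator<Item = &'a [i32]> + 'a {
    permutation
        .iter()
        // Each item is a sub-slice of `data`; nothing is allocated.
        .map(move |&i| &data[i * ncols..(i + 1) * ncols])
}
```

Calling permuted_rows(&data, 3, &permutation) on a 5x3 matrix yields five borrowed rows of length 3, in permutation order.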
I'm trying to change an enum's named property but getting this error.
cannot move out of a mutable reference rustc(E0507)
parser.rs(323, 36): data moved here
parser.rs(323, 36): move occurs because `statements` has type `std::vec::Vec<std::boxed::Box<ast::ast::StatementType>>`, which does not implement the `Copy` trait
I saw that we can change enum's named props with match statements. But I couldn't understand why there's a move occurrence, since I'm borrowing the enum itself. Here's the code:
match &mut block {
StatementType::Block { mut statements, .. } => {
statements = block_statements;
},
_ => panic!()
};
return block;
I've tried mem::swap too but still it's the same error:
match &mut block {
StatementType::Block { mut statements, .. } => {
// statements = block_statements;
std::mem::swap(&mut statements, &mut block_statements);
},
_ => panic!()
};
return block;
BUT when I do this:
std::mem::swap(&mut *statements, &mut *block_statements);
The error changes to:
the size for values of type `[std::boxed::Box<ast::ast::StatementType>]` cannot be known at compilation time
doesn't have a size known at compile-time
Types are:
StatementType is an enum that derives Clone
block is a mutable variable of type StatementType
block's statements field is a Vec<Box<StatementType>>
block_statements is another variable of type Vec<Box<StatementType>>
Please do not just say that it happens because the type of statements is Vec. I can read error messages too, so please come with a solution.
You have to think what the type of statements is and what you would like it to be.
With the code as you wrote it, statements is of type Vec<_> (sorry, I said it), but since the match inspects block through a reference, the pattern cannot take the contents by value, hence the error. Note that the error is not in the assignment but in the match itself:
error[E0507]: cannot move out of a mutable reference
--> src/main.rs:15:11
|
15 | match &mut block {
| ^^^^^^^^^^
16 | StatementType::Block { mut statements, .. } => {
| --------------
| |
| data moved here
| move occurs because `statements` has type `std::vec::Vec<std::boxed::Box<StatementType>>`, which does not implement the `Copy` trait
You would like statements to be of type &mut Vec<_> of course. And you get that by using the ref mut capture mode:
match block {
StatementType::Block { ref mut statements, .. } => {
*statements = block_statements;
},
_ => panic!()
};
And remember to use *statements when assigning, as it is now a reference. You could also use a mem::swap if you want, of course:
std::mem::swap(statements, &mut block_statements);
But note that you do not need to match &mut block but you can do match block directly.
There is also a feature called match ergonomics that lets you match against a reference and omit the ref mut capture mode, which makes your code easier to write and understand:
match &mut block {
StatementType::Block { statements, .. } => {
*statements = block_statements;
},
_ => panic!()
};
The problem in your original code is that if you specify any capture mode explicitly (here, mut) then match ergonomics is disabled.
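To put the two working forms side by side, here is a minimal self-contained sketch. The enum is a cut-down stand-in for the asker's type: the Expression variant and both function names are assumptions made only for this example.

```rust
#[derive(Debug, PartialEq)]
enum StatementType {
    // Hypothetical stand-in for the asker's other variants.
    Expression(String),
    Block { statements: Vec<Box<StatementType>> },
}

/// Explicit `ref mut` binding: `statements` is a `&mut Vec<_>`.
fn set_statements_ref_mut(
    mut block: StatementType,
    new_statements: Vec<Box<StatementType>>,
) -> StatementType {
    match block {
        StatementType::Block { ref mut statements, .. } => {
            // Assign through the reference; the old Vec is dropped.
            *statements = new_statements;
        }
        _ => panic!("expected a block"),
    }
    block
}

/// Match ergonomics: matching on `&mut block` with no explicit binding
/// mode also makes `statements` a `&mut Vec<_>`.
fn set_statements_ergonomic(
    mut block: StatementType,
    mut new_statements: Vec<Box<StatementType>>,
) -> StatementType {
    match &mut block {
        StatementType::Block { statements, .. } => {
            // Swapping works too; the old statements end up in `new_statements`.
            std::mem::swap(statements, &mut new_statements);
        }
        _ => panic!("expected a block"),
    }
    block
}
```

Both functions return the block with its statements replaced; the only difference is how the mutable reference to the field is obtained.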
I'm learning Rust and would like to know how I can improve the code below.
I have a vector of tuples of form (u32, String). The u32 values represent line numbers and the Strings are the text on the corresponding lines. As long as all the String values can be successfully parsed as integers, I want to return an Ok<Vec<i32>> containing the just parsed String values, but if not I want to return an error of some form (just an Err<String> in the example below).
I'm trying to learn to avoid mutability and use functional styles where appropriate, and the above is straightforward to do functionally if that was all that was needed. Here's what I came up with in this case:
fn data_vals(sv: &Vec<(u32, String)>) -> Result<Vec<i32>, String> {
sv.iter()
.map(|s| s.1.parse::<i32>()
.map_err(|_e| "*** Invalid data.".to_string()))
.collect()
}
However, the small catch is that I want to print an error message for every invalid value (and not just the first one), and the error messages should contain both the line number and the string values in the offending tuple.
I've managed to do it with the following code:
fn data_vals(sv: &Vec<(u32, String)>) -> Result<Vec<i32>, String> {
sv.iter()
.map(|s| (s.0, s.1.parse::<i32>()
.or_else(|e| {
eprintln!("ERROR: Invalid data value at line {}: '{}'",
s.0, s.1);
Err(e)
})))
.collect::<Vec<(u32, Result<i32, _>)>>() // Collect here to avoid short-circuit
.iter()
.map(|i| i.1
.clone()
.map_err(|_e| "*** Invalid data.".to_string()))
.collect()
}
This works, but seems rather messy and cumbersome - especially the typed collect() in the middle to avoid short-circuiting so all the errors are printed. The clone() call is also annoying, and I'm not really sure why it's needed - the compiler says I'm moving out of borrowed content otherwise, but I'm not really sure what's being moved. Is there a way it can be done more cleanly? Or should I go back to a more procedural style? When I tried, I ended up with mutable variables and a flag to indicate success and failure, which seems less elegant:
fn data_vals(sv: &Vec<(u32, String)>) -> Result<Vec<i32>, String> {
let mut datavals = Vec::new();
let mut success = true;
for s in sv {
match s.1.parse::<i32>() {
Ok(v) => datavals.push(v),
Err(_e) => {
eprintln!("ERROR: Invalid data value at line {}: '{}'",
s.0, s.1);
success = false;
},
}
}
if success {
return Ok(datavals);
} else {
return Err("*** Invalid data.".to_string());
}
}
Can someone advise me on the best way to do this? Should I stick to the procedural style here, and if so can that be improved? Or is there a cleaner functional way to do it? Or a blend of the two? Any advice appreciated.
I think that's what partition_map() from itertools is for:
use itertools::{Either, Itertools};
fn data_vals<'a>(sv: &[&'a str]) -> Result<Vec<i32>, Vec<(&'a str, std::num::ParseIntError)>> {
let (successes, failures): (Vec<_>, Vec<_>) =
sv.iter().partition_map(|s| match s.parse::<i32>() {
Ok(v) => Either::Left(v),
Err(e) => Either::Right((*s, e)),
});
if failures.len() != 0 {
Err(failures)
} else {
Ok(successes)
}
}
fn main() {
let numbers = vec!["42", "aaaezrgggtht", "..4rez41eza", "55"];
println!("{:#?}", data_vals(&numbers));
}
In a purely functional style, you have to avoid side-effects.
Printing errors is a side-effect. The preferred style would be to return an object of the style:
Result<Vec<i32>, Vec<String>>
and print the list after the data_vals function returns.
So, essentially, you want your processing to collect a list of integers, and a list of strings:
fn data_vals(sv: &Vec<(u32, String)>) -> Result<Vec<i32>, Vec<String>> {
let (ok, err): (Vec<_>, Vec<_>) = sv
.iter()
.map(|(i, s)| {
s.parse()
.map_err(|_e| format!("ERROR: Invalid data value at line {}: '{}'", i, s))
})
.partition(|e| e.is_ok());
if err.len() > 0 {
Err(err.iter().filter_map(|e| e.clone().err()).collect())
} else {
Ok(ok.iter().filter_map(|e| e.clone().ok()).collect())
}
}
fn main() {
let input = vec![(1, "0".to_string())];
let r = data_vals(&input);
assert_eq!(r, Ok(vec![0]));
let input = vec![(1, "zzz".to_string())];
let r = data_vals(&input);
assert_eq!(r, Err(vec!["ERROR: Invalid data value at line 1: 'zzz'".to_string()]));
}
This uses partition which does not depend on an external crate.
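The clone() calls in the partition version can probably be avoided by consuming the two vectors instead of iterating over them by reference. The following is a sketch of that variation, not the original answer's code; it also takes a slice instead of &Vec, which is my own choice here.

```rust
fn data_vals(sv: &[(u32, String)]) -> Result<Vec<i32>, Vec<String>> {
    // Split the parse results into all the `Ok`s and all the `Err`s.
    let (ok, err): (Vec<_>, Vec<_>) = sv
        .iter()
        .map(|(i, s)| {
            s.parse::<i32>()
                .map_err(|_e| format!("ERROR: Invalid data value at line {}: '{}'", i, s))
        })
        .partition(|r| r.is_ok());

    if err.is_empty() {
        // Every element of `ok` is an `Ok`, so `unwrap` cannot panic here.
        Ok(ok.into_iter().map(Result::unwrap).collect())
    } else {
        // Every element of `err` is an `Err`, so `unwrap_err` cannot panic here.
        Err(err.into_iter().map(Result::unwrap_err).collect())
    }
}
```

Because into_iter() takes ownership of each Result, the values and messages are moved out rather than cloned.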
Side effects (eprintln!) in an iterator adapter are definitely not "functional". You should accumulate and return the errors and let the caller deal with them.
I would use fold here. The goal of fold is to reduce a list to a single value, starting from an initial value and augmenting the result with every item. This "single value" can very well be a list, though. In this case, there are two possible lists we might want to return: a list of i32 if all values are valid, or a list of errors if there are any errors (I've chosen to return Strings for errors here, for simplicity.)
fn data_vals(sv: &[(u32, String)]) -> Result<Vec<i32>, Vec<String>> {
sv.iter().fold(
Ok(Vec::with_capacity(sv.len())),
|acc, (line_number, data)| {
let data = data
.parse::<i32>()
.map_err(|_| format!("Invalid data value at line {}: '{}'", line_number, data));
match (acc, data) {
(Ok(mut acc_data), Ok(this_data)) => {
// No errors yet; push the parsed value to the values vector.
acc_data.push(this_data);
Ok(acc_data)
}
(Ok(..), Err(this_error)) => {
// First error: replace the accumulator with an `Err` containing the first error.
Err(vec![this_error])
}
(Err(acc_errors), Ok(..)) => {
// There have been errors, but this item is valid; ignore it.
Err(acc_errors)
}
(Err(mut acc_errors), Err(this_error)) => {
// One more error: push it to the error vector.
acc_errors.push(this_error);
Err(acc_errors)
}
}
},
)
}
fn main() {
println!("{:?}", data_vals(&[]));
println!("{:?}", data_vals(&[(1, "123".into())]));
println!("{:?}", data_vals(&[(1, "123a".into())]));
println!("{:?}", data_vals(&[(1, "123".into()), (2, "123a".into())]));
println!("{:?}", data_vals(&[(1, "123a".into()), (2, "123".into())]));
println!("{:?}", data_vals(&[(1, "123a".into()), (2, "123b".into())]));
}
The initial value is Ok(Vec::with_capacity(sv.len())) (this is an optimization to avoid reallocating the vector as we push items to it; a simpler version would be Ok(vec![])). If the slice is empty, this will be fold's result; the closure will never be called.
For each item, the closure checks 1) whether there were any errors so far (indicated by the accumulator value being an Err) or not and 2) whether the current item is valid or not. I'm matching on two Result values simultaneously (by combining them in a tuple) to handle all 4 cases. The closure then returns an Ok if there are no errors so far (with all the parsed values so far) or an Err if there are any errors so far (with every invalid value found so far).
You'll notice I used the push method to add an item to a Vec. This is, strictly speaking, mutation, which is not considered "functional", but because we are moving the Vecs here, we know there are no other references to them, so we know we aren't affecting any other use of these Vecs.
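A flatter variation on the same idea, sketched here as an assumption rather than as the version above: fold into a pair of vectors and decide between Ok and Err only at the end. Unlike the four-case match, this keeps collecting parsed values even after the first error, which wastes a little work. The caller sum_or_report is a hypothetical name; it shows the printing moved out of the iterator chain and into the caller.

```rust
/// Parse every value, folding into a pair of (parsed values, error messages).
fn data_vals(sv: &[(u32, String)]) -> Result<Vec<i32>, Vec<String>> {
    let (values, errors) = sv.iter().fold(
        (Vec::new(), Vec::new()),
        |(mut values, mut errors), (line_number, data)| {
            match data.parse::<i32>() {
                Ok(v) => values.push(v),
                Err(_) => errors.push(format!(
                    "Invalid data value at line {}: '{}'",
                    line_number, data
                )),
            }
            (values, errors)
        },
    );
    if errors.is_empty() {
        Ok(values)
    } else {
        Err(errors)
    }
}

/// Hypothetical caller: the side effect (printing) lives here.
/// Returns the sum of the values on success, `None` otherwise.
fn sum_or_report(sv: &[(u32, String)]) -> Option<i32> {
    match data_vals(sv) {
        Ok(values) => Some(values.iter().sum()),
        Err(errors) => {
            for e in &errors {
                eprintln!("ERROR: {}", e);
            }
            None
        }
    }
}
```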
I'm working on a code challenge which will detect case-insensitive anagrams of a given word from a list of words.
My first cut is to use something like this:
pub fn anagrams_for(s: &'static str, v: &[&'static str]) -> Vec<&'static str> {
let mut outputs: Vec<&str> = vec![];
// Find the case-insensitive, sorted word to check
let mut s_sorted: Vec<_> = s.to_string().to_lowercase().chars().collect();
s_sorted.sort();
for word in v {
// Case-desensitize and sort each word in the slice
let mut word_sorted: Vec<_> = word.to_string().to_lowercase().chars().collect();
word_sorted.sort();
// if the case-insensitive words are the same post sort and not presort (to avoid self-anagrams), add it to the vector
if word_sorted == s_sorted && s.to_string().to_lowercase() != word.to_string().to_lowercase() {
outputs.push(word)
}
}
outputs
}
This works as expected, but is not very idiomatic. I'm now trying a second iteration which uses more functional features of Rust:
pub fn anagrams_for(s: &'static str, v: &[&'static str]) -> Vec<&'static str> {
let mut s_sorted: Vec<_> = s.to_string().to_lowercase().chars().collect();
s_sorted.sort();
v.iter().map(&|word: &str| {
let mut word_sorted: Vec<_> = word.to_string().to_lowercase().chars().collect();
word_sorted.sort();
if word_sorted == s_sorted && s.to_string().to_lowercase() != word.to_string().to_lowercase() {
word
}
}).collect()
}
I'm currently getting a few errors (most of which I could likely resolve), but the one I'm interested in solving is
if may be missing an else clause:
expected `()`,
found `&str`
(expected (),
found &-ptr) [E0308]
This is because in the case of a non-anagram, map attempts to push something into the vector (seemingly ()).
How can I handle this? It's possible that map isn't the best idiom because it requires some operation to be performed on each element in a list, not a subset (maybe filter?).
As you noticed, the problem is that in the non-anagram-case your closure (the || { ... } block) doesn't return a value.
You can solve this by using filter_map instead of map. That function takes a closure that returns Option<U> instead of U, so the last expression of your closure looks something like:
if /* ... */ {
Some(word)
} else {
None
}
Unrelated to the main question, some notes on your code:
You can remove the .to_string() calls before the .to_lowercase() calls. The latter method belongs to the type str, so it works fine without them. Calling to_string() adds unnecessary allocations.
The & in front of the closure (&|...|) can most probably be removed...
... as can the : &str type annotation in the closure's argument list.
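Putting the filter_map suggestion and the notes together, a complete version might look like the sketch below. The helper sorted_lowercase and the generic lifetime 'a (in place of 'static) are my own choices for this example, not something the question requires.

```rust
/// Lowercase a word and return its characters in sorted order.
fn sorted_lowercase(word: &str) -> Vec<char> {
    let mut chars: Vec<char> = word.to_lowercase().chars().collect();
    chars.sort_unstable();
    chars
}

pub fn anagrams_for<'a>(s: &str, v: &[&'a str]) -> Vec<&'a str> {
    let s_lower = s.to_lowercase();
    let s_sorted = sorted_lowercase(s);
    v.iter()
        .filter_map(|&word| {
            // Same letters after sorting, but not the same word (no self-anagrams).
            let is_anagram =
                sorted_lowercase(word) == s_sorted && word.to_lowercase() != s_lower;
            if is_anagram {
                Some(word)
            } else {
                None
            }
        })
        .collect()
}
```

Since the closure only keeps or drops each word without transforming it, a plain filter over v.iter().copied() would work just as well.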
I am trying to parse a string into a list of floating-point values in Rust. I would assume there is a clever way to do this using iterators and Options; however, I cannot get it to ignore the Err values that result from failed parse() calls. The Rust tutorial doesn't seem to cover any examples like this, and the documentation is confusing at best.
How would I go about implementing this using the functional-style vector/list operations? Pointers on better Rust style for my iterative implementation would be appreciated as well!
"Functional"-style
Panics when it encounters an Err
input.split(" ").map(|s| s.parse::<f32>().unwrap()).collect::<Vec<_>>()
Iterative-style
Ignores non-float values as intended
fn parse_string(input: &str) -> Vec<f32> {
let mut vals = Vec::new();
for val in input.split_whitespace() {
match val.parse() {
Ok(v) => vals.push(v),
Err(_) => (),
}
}
vals
}
fn main() {
let params = parse_string("1 -5.2 3.8 abc");
for &param in params.iter() {
println!("{:.2}", param);
}
}
filter_map does what you want, transforming the values and filtering out Nones:
input.split(" ").filter_map(|s| s.parse::<f32>().ok()).collect::<Vec<_>>();
Note the ok method to convert the Result to an Option.
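As a drop-in replacement for the iterative parse_string from the question, the same one-liner can be wrapped in a function. This is a small sketch; it uses split_whitespace as the iterative version does, so runs of several spaces do not produce empty pieces.

```rust
fn parse_string(input: &str) -> Vec<f32> {
    input
        .split_whitespace()
        // `ok()` turns `Err(_)` into `None`, which `filter_map` drops.
        .filter_map(|s| s.parse::<f32>().ok())
        .collect()
}
```

With split(" ") the result is the same here, because the empty strings produced by consecutive spaces fail to parse and are filtered out as well.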