ndarray: Iterate over shuffled rows - multidimensional-array

I am looking for an efficient way to iterate over a permutation of the rows in a two-dimensional array in ndarray. I do not need to mutate or keep the permuted array around, so I want to avoid a copy.
That is, I want to do the following, except select allocates an unnecessary array:
use ndarray::{Axis, Array}; // 0.13.1
use rand::seq::SliceRandom; // 0.7.3
use std::iter::FromIterator;
fn main() {
    let array = Array::from_iter(0..15).into_shape((5, 3)).unwrap();
    println!("Before shuffling rows:\n{}", array);
    let mut permutation: Vec<usize> = (0..array.nrows()).collect();
    permutation.shuffle(&mut rand::thread_rng());
    let permuted = array.select(Axis(0), &permutation);
    for (i, row) in permuted.axis_iter(Axis(0)).enumerate() {
        println!("Row number {} is {}!", i, row);
    }
}
Playground.
I am aware that the ndarray GitHub page includes an example of something similar, but it involves a block of unsafe code that I do not understand, so I would prefer not to adapt it to my own use case.

One obvious answer that I missed, using index_axis:
use ndarray::{Axis, Array}; // 0.13.1
use rand::seq::SliceRandom; // 0.7.3
use std::iter::FromIterator;
fn main() {
    let array = Array::from_iter(0..15).into_shape((5, 3)).unwrap();
    println!("Before shuffling rows:\n{}", array);
    let mut permutation: Vec<usize> = (0..array.nrows()).collect();
    permutation.shuffle(&mut rand::thread_rng());
    for i in permutation.iter() {
        let row = array.index_axis(Axis(0), *i);
        println!("Row number {} is {}!", i, row);
    }
}
I'm sure there are better ways of doing this, however, and I'm still interested to see them.

Related

How to convert a vector of vectors into a vector of slices without creating a new object? [duplicate]

I have the following:
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}
fn transform(x: SomeType) -> SomeType {
    // very complicated transformation, reusing parts of x in order to produce result:
    match x {
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}
fn main() {
    let mut data = vec![
        SomeType::VariantA("hello".to_string()),
        SomeType::VariantA("bye".to_string()),
        SomeType::VariantB("asdf".to_string(), 34),
    ];
}
I would now like to call transform on each element of data and store the resulting value back in data. I could do something like data.into_iter().map(transform).collect(), but this will allocate a new Vec. Is there a way to do this in-place, reusing the allocated memory of data? There once was Vec::map_in_place in Rust but it has been removed some time ago.
As a work-around, I've added a Dummy variant to SomeType and then do the following:
for x in &mut data {
    let original = ::std::mem::replace(x, SomeType::Dummy);
    *x = transform(original);
}
This does not feel right, and I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop. Is there a better way of doing this?
Your first problem is not map, it's transform.
transform takes ownership of its argument, while the Vec has ownership of its elements. One of the two has to give, and poking a hole in the Vec would be a bad idea: what if transform panics?
The best fix, thus, is to change the signature of transform to:
fn transform(x: &mut SomeType) { ... }
then you can just do:
for x in &mut data { transform(x) }
Other solutions will be clunky, as they will need to deal with the fact that transform might panic.
No, it is not possible in general because the size of each element might change as the mapping is performed (fn transform(u8) -> u32).
Even when the sizes are the same, it's non-trivial.
In this case, you don't need to create a Dummy variant because creating an empty String is cheap; only 3 pointer-sized values and no heap allocation:
impl SomeType {
    fn transform(&mut self) {
        use SomeType::*;
        let old = std::mem::replace(self, VariantA(String::new()));
        // Note this line for the detailed explanation
        *self = match old {
            VariantA(s) => VariantB(s, 0),
            VariantB(s, i) => VariantB(s, 2 * i),
        };
    }
}
for x in &mut data {
    x.transform();
}
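The claim that an empty String is cheap can be checked directly. This is only a sketch: the helper name empty_string_footprint is made up for illustration, and the three-word size is what current implementations use rather than something the language guarantees.

```rust
use std::mem::size_of;

// Hypothetical helper: returns the size of a `String` handle measured in
// pointer-sized words, and the capacity of a freshly created empty `String`.
fn empty_string_footprint() -> (usize, usize) {
    let words = size_of::<String>() / size_of::<usize>();
    // `String::new()` does not allocate, so the capacity starts at zero.
    let capacity = String::new().capacity();
    (words, capacity)
}

fn main() {
    let (words, capacity) = empty_string_footprint();
    println!("String handle: {} words, empty capacity: {}", words, capacity);
}
```

A capacity of zero means nothing was allocated on the heap, which is why using String::new() as the temporary value costs next to nothing.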
An alternate implementation that just replaces the String:
impl SomeType {
    fn transform(&mut self) {
        use SomeType::*;
        *self = match self {
            VariantA(s) => {
                let s = std::mem::replace(s, String::new());
                VariantB(s, 0)
            }
            VariantB(s, i) => {
                let s = std::mem::replace(s, String::new());
                VariantB(s, 2 * *i)
            }
        };
    }
}
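On newer compilers (Rust 1.40 and later), std::mem::take replaces a value with its Default and returns the old one, so the alternate implementation can be written a little more compactly. A sketch under that assumption, with the derives added only so the result can be printed and compared:

```rust
#[derive(Debug, PartialEq)]
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}

impl SomeType {
    fn transform(&mut self) {
        // `mem::take` leaves an empty `String` behind and hands back the
        // original, so the enum is never left in an invalid state.
        *self = match self {
            SomeType::VariantA(s) => SomeType::VariantB(std::mem::take(s), 0),
            SomeType::VariantB(s, i) => SomeType::VariantB(std::mem::take(s), 2 * *i),
        };
    }
}

fn main() {
    let mut data = vec![
        SomeType::VariantA("hello".to_string()),
        SomeType::VariantB("asdf".to_string(), 34),
    ];
    for x in &mut data {
        x.transform();
    }
    println!("{:?}", data);
}
```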
In general, yes, you have to create some dummy value to do this generically and with safe code. Many times, you can wrap your whole element in Option and call Option::take to achieve the same effect.
See also:
Change enum variant while moving the field to the new variant
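To make the Option suggestion concrete, here is a minimal sketch that stores Option<SomeType> in the Vec and uses None as the temporary value. The helper name transform_all is an assumption for illustration; transform is the by-value function from the question.

```rust
#[derive(Debug, PartialEq)]
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}

fn transform(x: SomeType) -> SomeType {
    match x {
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}

// Hypothetical helper: applies `transform` to every occupied slot in place.
fn transform_all(data: &mut [Option<SomeType>]) {
    for slot in data.iter_mut() {
        // `take` moves the value out and leaves `None` behind, so the slot
        // always holds a valid value, even if `transform` panics.
        if let Some(old) = slot.take() {
            *slot = Some(transform(old));
        }
    }
}

fn main() {
    let mut data = vec![
        Some(SomeType::VariantA("hello".to_string())),
        Some(SomeType::VariantB("asdf".to_string(), 34)),
    ];
    transform_all(&mut data);
    println!("{:?}", data);
}
```

The trade-off is that every other user of the Vec now has to handle the None case, which is the same burden as a Dummy variant in a different shape.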
Why is it so complicated?
See this proposed and now-closed RFC for lots of related discussion. My understanding of that RFC (and the complexities behind it) is that there's a time period where your value would be in an undefined state, which is not safe. If a panic were to happen at that exact moment, then when your value is dropped, you might trigger undefined behavior, a bad thing.
If your code were to panic at the commented line, then the value of self is a concrete, known value. If it were some unknown value, dropping that string would try to drop that unknown value, and we are back in C. This is the purpose of the Dummy value - to always have a known-good value stored.
You even hinted at this (emphasis mine):
I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop
That "should" is the problem. During a panic, that dummy value is visible.
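A small demonstration of that visibility, as a sketch only: the transform below is a stand-in that panics on one particular input, and catch_unwind with AssertUnwindSafe is used purely so the Vec can be inspected afterwards. All names other than SomeType and Dummy are assumptions for illustration.

```rust
use std::panic::{self, AssertUnwindSafe};

#[derive(Debug, PartialEq)]
enum SomeType {
    VariantA(String),
    Dummy,
}

// Stand-in for the real transformation; panics when it sees "boom".
fn transform(x: SomeType) -> SomeType {
    match x {
        SomeType::VariantA(s) => {
            if s == "boom" {
                panic!("transform failed");
            }
            SomeType::VariantA(s)
        }
        SomeType::Dummy => SomeType::Dummy,
    }
}

// Runs the workaround loop from the question and returns the Vec as it
// looks after the panic has been caught.
fn run_demo() -> Vec<SomeType> {
    let mut data = vec![
        SomeType::VariantA("ok".to_string()),
        SomeType::VariantA("boom".to_string()),
        SomeType::VariantA("never reached".to_string()),
    ];
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        for x in &mut data {
            let original = std::mem::replace(x, SomeType::Dummy);
            *x = transform(original);
        }
    }));
    assert!(result.is_err());
    data
}

fn main() {
    // The second slot still holds `Dummy` because the panic interrupted
    // the loop before the slot could be overwritten.
    println!("{:?}", run_demo());
}
```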
See also:
How can I swap in a new value for a field in a mutable reference to a structure?
Temporarily move out of borrowed content
How do I move out of a struct field that is an Option?
The now-removed implementation of Vec::map_in_place spans almost 175 lines of code, most of it dealing with unsafe code and reasoning about why it is actually safe! Some crates have re-implemented this concept and attempted to make it safe; you can see an example in Sebastian Redl's answer.
You can write a map_in_place in terms of the take_mut or replace_with crates:
fn map_in_place<T, F>(v: &mut [T], f: F)
where
    F: Fn(T) -> T,
{
    for e in v {
        // Pass `f` by reference so it is not moved on the first iteration.
        take_mut::take(e, &f);
    }
}
However, if this panics in the supplied function, the program aborts completely; you cannot recover from the panic.
Alternatively, you could supply a placeholder element that sits in the empty spot while the inner function executes:
use std::mem;
fn map_in_place_with_placeholder<T, F>(v: &mut [T], f: F, mut placeholder: T)
where
    F: Fn(T) -> T,
{
    for e in v {
        let mut tmp = mem::replace(e, placeholder);
        tmp = f(tmp);
        placeholder = mem::replace(e, tmp);
    }
}
If this panics, the placeholder you supplied will sit in the panicked slot.
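For illustration, here is how the placeholder version might be called. The function is repeated so the snippet stands alone; the wrapper shout_all and the choice of an empty String as the placeholder are assumptions, not part of the answer above.

```rust
use std::mem;

fn map_in_place_with_placeholder<T, F>(v: &mut [T], f: F, mut placeholder: T)
where
    F: Fn(T) -> T,
{
    for e in v {
        // Swap the placeholder in, transform the real value, swap it back.
        let mut tmp = mem::replace(e, placeholder);
        tmp = f(tmp);
        placeholder = mem::replace(e, tmp);
    }
}

// Hypothetical caller: upper-cases every string without allocating a new Vec.
fn shout_all(words: &mut [String]) {
    map_in_place_with_placeholder(words, |s| s.to_uppercase(), String::new());
}

fn main() {
    let mut words = vec!["hello".to_string(), "bye".to_string()];
    shout_all(&mut words);
    println!("{:?}", words);
}
```

Only one placeholder value is needed for the whole slice, because it is moved back out of each slot before the loop advances.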
Finally, you could produce the placeholder on-demand; basically replace take_mut::take with take_mut::take_or_recover in the first version.

What is the right way to have multiple linked lists and move data between them in Rust?

What is the right way to have multiple std::collections::LinkedLists where the number of those lists is unknown at compile time?
I'm filling them with data as well as merging them (e.g. using append()).
I thought it would be good to have a vector that contains those lists, or contains references to those lists.
I have tried the following:
use std::collections::LinkedList;
fn listtest() {
    let mut v: Vec<LinkedList<i32>> = Vec::new();
    v.push(LinkedList::new()); // first list
    v.push(LinkedList::new()); // second list
    v[0].push_back(1); // fill with data
    v[1].push_back(3); // fill with data
    v[0].append(&mut v[1]); // merge lists
}
fn main() {
listtest();
}
This fails to compile because I have two mutable references of v when using append(). I also tried using Vec<&mut LinkedList<i32>>, but did not succeed.
What would be the right approach to this problem?
There is no right approach. One possibility is to use split_at_mut. This creates two separate slices, each of which can be mutated separately from the other:
use std::collections::LinkedList;
fn main() {
    let mut v = vec![LinkedList::new(), LinkedList::new()];
    v[0].push_back(1);
    v[1].push_back(3);
    {
        let (head, tail) = v.split_at_mut(1);
        head[0].append(&mut tail[0]);
    }
    println!("{:?}", v);
}
See:
How to get mutable references to two array elements at the same time?
How can I write data from a slice to the same slice?
How to operate on 2 mutable slices of a Rust array
etc.
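The split_at_mut approach generalizes to any two distinct indices. A minimal sketch, where the helper two_mut and the function merge_lists are hypothetical names chosen for illustration:

```rust
use std::collections::LinkedList;

// Hypothetical helper: returns mutable references to the elements at two
// distinct indices, in the order (v[i], v[j]).
fn two_mut<T>(v: &mut [T], i: usize, j: usize) -> (&mut T, &mut T) {
    assert!(i != j, "indices must differ");
    if i < j {
        // `head` covers ..j, `tail` starts at j.
        let (head, tail) = v.split_at_mut(j);
        (&mut head[i], &mut tail[0])
    } else {
        // `head` covers ..i, `tail` starts at i.
        let (head, tail) = v.split_at_mut(i);
        (&mut tail[0], &mut head[j])
    }
}

fn merge_lists() -> Vec<LinkedList<i32>> {
    let mut v = vec![LinkedList::new(), LinkedList::new(), LinkedList::new()];
    v[0].push_back(1);
    v[2].push_back(3);
    let (dst, src) = two_mut(&mut v, 0, 2);
    dst.append(src); // moves all elements of v[2] into v[0]
    v
}

fn main() {
    println!("{:?}", merge_lists());
}
```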
Most collections have an iter_mut method that returns an iterator that yields mutable references to each item in the collection. And these references can all be used at the same time! (But the references must come from the same iterator; you can't use references coming from separate calls to iter_mut concurrently.)
use std::collections::LinkedList;
fn listtest() {
    let mut v: Vec<LinkedList<i32>> = Vec::new();
    v.push(LinkedList::new()); // first list
    v.push(LinkedList::new()); // second list
    v[0].push_back(1); // fill with data
    v[1].push_back(3); // fill with data
    let mut vi = v.iter_mut();
    let first = vi.next().unwrap();
    let second = vi.next().unwrap();
    first.append(second); // merge lists
}
fn main() {
    listtest();
}
Also remember that iterators have the nth method for doing the equivalent of next in a loop.
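As a sketch of the nth variant: the function name merge_with_nth is made up here, and it assumes i < j. Note that nth counts from the iterator's current position, so the second call has to skip only the elements between the two indices.

```rust
use std::collections::LinkedList;

// Hypothetical helper: appends list `j` onto list `i`, assuming i < j.
fn merge_with_nth(v: &mut [LinkedList<i32>], i: usize, j: usize) {
    assert!(i < j, "this sketch assumes i < j");
    let mut it = v.iter_mut();
    // Consumes elements 0..=i and yields the one at index i.
    let first = it.nth(i).unwrap();
    // The iterator now sits at index i + 1, so skip j - i - 1 more.
    let second = it.nth(j - i - 1).unwrap();
    first.append(second);
}

fn main() {
    let mut v: Vec<LinkedList<i32>> = (0..4).map(|_| LinkedList::new()).collect();
    v[1].push_back(10);
    v[3].push_back(30);
    merge_with_nth(&mut v, 1, 3);
    println!("{:?}", v);
}
```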

What's the best way to join many vectors into a new vector?

To create a new vector with the contents of other vectors, I'm currently doing this:
fn func(a: &Vec<i32>, b: &Vec<i32>, c: &Vec<i32>) {
    let abc: Vec<i32> = {
        let mut tmp = Vec::with_capacity(a.len() + b.len() + c.len());
        tmp.extend(a);
        tmp.extend(b);
        tmp.extend(c);
        tmp
    };
    // ...
}
Is there a more straightforward / elegant way to do this?
There is a concat method that can be used for this, however the values need to be slices, or borrowable to slices, not &Vec<_> as given in the question.
An example, similar to the question:
fn func(a: &Vec<i32>, b: &Vec<i32>, c: &Vec<i32>) {
    let abc: Vec<i32> = [a.as_slice(), b.as_slice(), c.as_slice()].concat();
    // ...
}
However, as @mindTree notes, using &[i32] type arguments is more idiomatic and removes the need for conversion, e.g.:
fn func(a: &[i32], b: &[i32], c: &[i32]) {
    let abc: Vec<i32> = [a, b, c].concat();
    // ...
}
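A runnable version of the slice-based form, for reference. The name join_all is an assumption; the callers pass &Vec<i32>, which coerces to &[i32] automatically.

```rust
// Hypothetical wrapper around `concat` that returns the joined vector.
fn join_all(a: &[i32], b: &[i32], c: &[i32]) -> Vec<i32> {
    // An array of slices can be concatenated into a single new Vec.
    [a, b, c].concat()
}

fn main() {
    let a = vec![1, 2];
    let b = vec![3];
    let c = vec![4, 5, 6];
    // `&Vec<i32>` derefs to `&[i32]`, so no explicit conversion is needed.
    let abc = join_all(&a, &b, &c);
    println!("{:?}", abc);
}
```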
SliceConcatExt::concat is a more general version of your function and can join multiple slices into a Vec. It sums the sizes of the slices to pre-allocate a Vec of the right capacity, then extends it repeatedly.
fn concat(&self) -> Vec<T> {
    let size = self.iter().fold(0, |acc, v| acc + v.borrow().len());
    let mut result = Vec::with_capacity(size);
    for v in self {
        result.extend_from_slice(v.borrow())
    }
    result
}
One possible solution might be to use the Chain iterator:
let abc: Vec<_> = a.iter().chain(b).chain(c).collect();
However, in your example you are borrowing the slices, so we'll need to either deref each borrowed element or use the Cloned iterator to copy each integer. Cloned is probably a bit easier and just as efficient, since we are working with small Copy data (i32):
let abc: Vec<_> = a.iter().cloned()
    .chain(b.iter().cloned())
    .chain(c.iter().cloned())
    .collect();
Seeing as each of these iterators is an ExactSizeIterator, it should be possible to allocate the exact size for the target Vec up front; however, I'm unaware whether this is actually the case in the std implementation (they might be waiting on specialization to land before adding this optimisation).
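What can be checked from the outside is the size_hint of the chained iterator, which for slice iterators reports matching lower and upper bounds. A sketch with a made-up helper name, chain_all; whether collect actually uses the hint to allocate once is an implementation detail this does not prove.

```rust
// Hypothetical helper: joins three slices with `chain` and `cloned`.
fn chain_all(a: &[i32], b: &[i32], c: &[i32]) -> Vec<i32> {
    let iter = a
        .iter()
        .cloned()
        .chain(b.iter().cloned())
        .chain(c.iter().cloned());
    // For chained slice iterators both bounds equal the total length.
    let (lower, upper) = iter.size_hint();
    assert_eq!(upper, Some(lower));
    assert_eq!(lower, a.len() + b.len() + c.len());
    iter.collect()
}

fn main() {
    let abc = chain_all(&[1, 2], &[3], &[4, 5, 6]);
    println!("{:?}", abc);
}
```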

The type placeholder `_` is not allowed within types on item signatures

Beginner question; and a search couldn't find anything similar.
Background: I'm just practising functions in Rust by making a shuffling function. Program takes in any arguments and shuffles them and stores them in 'result'
Question: I guess I can't use V<_> in a function header so what would I use in this situation?
MCVE:
use std::io;
use std::cmp::Ordering;
use std::env;
fn main()
{
let mut result = shuffle(env::args().collect());
}//End of main
fn shuffle(args: Vec<_>) -> Vec<_>
{
let mut temp = Vec::with_capacity((args.capacity()));
while args.len() > 1
{
//LET N REPRESENT A RANDOM NUMBER GENERATED ON EACH ITERATION
let mut n = 2;
temp.push(args.swap_remove(n));
}
return temp;
}//End of shuffle function
Playground link
You would convert your function to a generic function:
fn shuffle<T>(args: Vec<T>) -> Vec<T> {
See it in the playpen: http://is.gd/MCCxal
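For completeness, a fuller sketch of what the generic function could look like. The XorShift type is a hypothetical stand-in for a real random number generator (for example one from the rand crate) and is not suitable for serious use; the extra rng parameter is likewise an assumption added for this example.

```rust
// Hypothetical, minimal pseudo-random generator for illustration only.
// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    // Returns a pseudo-random index in 0..bound (bound must be non-zero).
    fn next_below(&mut self, bound: usize) -> usize {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 % bound as u64) as usize
    }
}

// Generic over the element type `T`, so it works for Vec<String>,
// Vec<i32>, and so on.
fn shuffle<T>(mut args: Vec<T>, rng: &mut XorShift) -> Vec<T> {
    let mut temp = Vec::with_capacity(args.len());
    while !args.is_empty() {
        // Pick one of the remaining elements and move it to the output.
        let n = rng.next_below(args.len());
        temp.push(args.swap_remove(n));
    }
    temp
}

fn main() {
    let mut rng = XorShift(0x9E37_79B9_7F4A_7C15);
    let result = shuffle(std::env::args().collect(), &mut rng);
    println!("{:?}", result);
}
```

Because T is inferred from the return type of shuffle, the call env::args().collect() resolves to Vec<String> without any annotation.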

How do I parse a string to a list of floats using functional style?

I am trying to parse a string into a list of floating-point values in Rust. I would assume there is a clever way to do this using iterators and Options; however, I cannot get it to ignore the Err values that result from failed parse() calls. The Rust tutorial doesn't seem to cover any examples like this, and the documentation is confusing at best.
How would I go about implementing this using the functional-style vector/list operations? Pointers on better Rust style for my iterative implementation would be appreciated as well!
"Functional"-style
Panics when it encounters an Err
input.split(" ").map(|s| s.parse::<f32>().unwrap()).collect::<Vec<_>>()
Iterative-style
Ignores non-float values as intended
fn parse_string(input: &str) -> Vec<f32> {
    let mut vals = Vec::new();
    for val in input.split_whitespace() {
        match val.parse() {
            Ok(v) => vals.push(v),
            Err(_) => (),
        }
    }
    vals
}
fn main() {
    let params = parse_string("1 -5.2 3.8 abc");
    for &param in params.iter() {
        println!("{:.2}", param);
    }
}
filter_map does what you want, transforming the values and filtering out Nones:
input.split(" ").filter_map(|s| s.parse::<f32>().ok()).collect::<Vec<_>>();
Note the ok method to convert the Result to an Option.
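Dropped into the asker's parse_string signature, this could look like the sketch below. It uses split_whitespace as in the iterative version, which is my choice here rather than part of the one-liner above.

```rust
fn parse_string(input: &str) -> Vec<f32> {
    input
        .split_whitespace()
        // `ok()` turns Err into None, which `filter_map` then discards.
        .filter_map(|s| s.parse::<f32>().ok())
        .collect()
}

fn main() {
    let params = parse_string("1 -5.2 3.8 abc");
    for param in &params {
        println!("{:.2}", param);
    }
}
```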
