How do I process a range in slices in Rust?

I understand that the preferred way to iterate in Rust is through the for var in (range) syntax, but sometimes I'd like to work on more than one of the elements in that range at a time.
From a Ruby perspective, I'm trying to find a way of doing (1..100).each_slice(5) do |this_slice| in Rust.
I'm trying things like
for mut segment_start in (segment_size..max_val).step_by(segment_size) {
let this_segment = segment_start..(segment_start + segment_size).iter().take(segment_size);
}
but I keep getting errors that suggest I'm barking up the wrong type tree. The docs aren't helpful either--they just don't contain this use case.
What's the Rust way to do this?

Use chunks (or chunks_mut if you need mutability):
fn main() {
let things = [5, 4, 3, 2, 1];
for slice in things.chunks(2) {
println!("{:?}", slice);
}
}
Outputs:
[5, 4]
[3, 2]
[1]
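For the chunks_mut variant mentioned above, a minimal sketch might look like the following. The helper name add_chunk_index is made up for illustration and is not part of the standard library:

```rust
// Hypothetical helper: adds each chunk's index to every element of that chunk.
fn add_chunk_index(values: &mut [i32], chunk_size: usize) {
    // Note: chunks_mut panics if chunk_size is 0.
    for (index, chunk) in values.chunks_mut(chunk_size).enumerate() {
        for value in chunk.iter_mut() {
            *value += index as i32;
        }
    }
}

fn main() {
    let mut things = [5, 4, 3, 2, 1];
    add_chunk_index(&mut things, 2);
    println!("{:?}", things); // [5, 4, 4, 3, 3]
}
```

Each chunk is a mutable sub-slice of the original array, so the changes are made in place without copying.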
The easiest way to combine this with a Range would be to collect the range to a Vec first (which dereferences to a slice):
fn main() {
let things: Vec<_> = (1..100).collect();
for slice in things.chunks(5) {
println!("{:?}", slice);
}
}
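Another solution that is pure-iterator would be to use Itertools::chunks from the itertools crate (older releases of the crate named this method chunks_lazy):
use itertools::Itertools;
fn main() {
    // The value returned by chunks is iterated by reference, hence the leading `&`.
    for chunk in &(1..100).chunks(5) {
        for val in chunk {
            print!("{}, ", val);
        }
        println!("");
    }
}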
Which suggests a similar solution that only requires the standard library:
fn main() {
let mut range = (1..100).peekable();
while range.peek().is_some() {
for value in range.by_ref().take(5) {
print!("{}, ", value);
}
println!("");
}
}
One trick is that Ruby and Rust have different handling here, mostly centered around efficiency.
In Ruby, Enumerable can create new arrays to hold the values without worrying about ownership, and it returns a new array each time (check with this_slice.object_id).
In Rust, allocating a new vector each time would be pretty unusual. Additionally, you can't easily return a reference to a vector that the iterator holds due to complicated lifetime concerns.
A solution that's very similar to Ruby's is:
fn main() {
let mut range = (1..100).peekable();
while range.peek().is_some() {
let chunk: Vec<_> = range.by_ref().take(5).collect();
println!("{:?}", chunk);
}
}
Which could be wrapped up in a new iterator that hides the details:
use std::iter::Peekable;
struct InefficientChunks<I>
where I: Iterator
{
iter: Peekable<I>,
size: usize,
}
impl<I> Iterator for InefficientChunks<I>
where I: Iterator
{
type Item = Vec<I::Item>;
fn next(&mut self) -> Option<Self::Item> {
if self.iter.peek().is_some() {
Some(self.iter.by_ref().take(self.size).collect())
} else {
None
}
}
}
trait Awesome: Iterator + Sized {
fn inefficient_chunks(self, size: usize) -> InefficientChunks<Self> {
InefficientChunks {
iter: self.peekable(),
size: size,
}
}
}
impl<I> Awesome for I where I: Iterator {}
fn main() {
for chunk in (1..100).inefficient_chunks(5) {
println!("{:?}", chunk);
}
}
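If the per-chunk allocation is the main concern, one possible compromise is to reuse a single buffer and hand each chunk to a callback instead of returning it from an iterator. This is only a sketch; for_each_chunk is a hypothetical helper, not a standard library function:

```rust
// Hypothetical helper: calls `f` once per chunk, reusing a single buffer.
fn for_each_chunk<I, F>(mut iter: I, size: usize, mut f: F)
where
    I: Iterator,
    F: FnMut(&[I::Item]),
{
    let mut buffer = Vec::with_capacity(size);
    loop {
        buffer.clear(); // drops the old items but keeps the allocation
        buffer.extend(iter.by_ref().take(size));
        if buffer.is_empty() {
            break; // source exhausted (or size == 0)
        }
        f(&buffer);
    }
}

fn main() {
    for_each_chunk(1..100, 5, |chunk| println!("{:?}", chunk));
}
```

Because the callback only borrows the buffer for the duration of one call, this sidesteps the lifetime problem of an iterator returning references into storage it owns.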

Collecting into a Vec can easily kill your performance. An approach similar to the one in the question is perfectly fine:
use std::ops::Range;
fn chunk_range(range: Range<usize>, chunk_size: usize) -> impl Iterator<Item=Range<usize>> {
    // step_by yields the start of each chunk; the last chunk is clamped to range.end.
    range.clone().step_by(chunk_size).map(move |block_start| {
        let block_end = (block_start + chunk_size).min(range.end);
        block_start..block_end
    })
}
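As a usage sketch, the sub-ranges produced by chunk_range can index a slice directly. The data vector below is invented for the example, and the function is repeated so the snippet stands alone:

```rust
use std::ops::Range;

fn chunk_range(range: Range<usize>, chunk_size: usize) -> impl Iterator<Item = Range<usize>> {
    // Note: step_by panics if chunk_size is 0.
    range.clone().step_by(chunk_size).map(move |block_start| {
        let block_end = (block_start + chunk_size).min(range.end);
        block_start..block_end
    })
}

fn main() {
    // Example data, made up for this sketch.
    let data: Vec<u32> = (1..=12).collect();
    for block in chunk_range(0..data.len(), 5) {
        // Each sub-range borrows a slice of the backing storage; nothing is copied.
        println!("{:?}", &data[block]);
    }
}
```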

Related

Is it possible to have a uniform Iterator interface for different data structures in Rust? [duplicate]

I want to have something like the following:
use std::collections::HashMap;
pub enum DiffStruct {
V(Vec<i32>),
M(HashMap<i32,i32>),
}
impl DiffStruct {
fn to_iter(self) -> impl IntoIterator<Item = i32> {
match self {
DiffStruct::V(vec) => vec.iter().into_iter(),
DiffStruct::M(map) => map.values().into_iter(),
}
}
}
fn main() {
let v: Vec<_> = DiffStruct::V(vec![1,2,3]).to_iter().collect();
}
I want this so that I can minimize the collect calls in my code for best performance, but it does not compile. Is there any workaround to achieve this?
Assuming you want to take ownership of DiffStruct and not just borrow it when you collect its values:
use std::collections::HashMap;
pub enum DiffStruct {
V(Vec<i32>),
M(HashMap<i32,i32>),
}
impl DiffStruct {
fn to_iter(self) -> Box<dyn Iterator<Item = i32>> {
match self {
DiffStruct::V(vec) => Box::new(vec.into_iter()),
DiffStruct::M(map) => Box::new(map.into_iter().map(|(_, v)| v)),
}
}
}
fn main() {
let vec_values = vec![1, 2, 3];
let mut map_values = HashMap::new();
map_values.insert(1, 1);
map_values.insert(2, 2);
map_values.insert(3, 3);
let ds_vec = DiffStruct::V(vec_values);
let ds_map = DiffStruct::M(map_values);
let collected_from_vec: Vec<_> = ds_vec.to_iter().collect();
let collected_from_map: Vec<_> = ds_map.to_iter().collect();
}
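If the heap allocation and dynamic dispatch of Box<dyn Iterator> matter, another option is a small enum that wraps both concrete iterator types. This is a sketch only; DiffIter and into_values are names invented here, and HashMap::into_values requires Rust 1.54 or later:

```rust
use std::collections::{hash_map, HashMap};
use std::vec;

pub enum DiffStruct {
    V(Vec<i32>),
    M(HashMap<i32, i32>),
}

// Hypothetical wrapper: one variant per underlying iterator type.
pub enum DiffIter {
    V(vec::IntoIter<i32>),
    M(hash_map::IntoValues<i32, i32>),
}

impl Iterator for DiffIter {
    type Item = i32;
    fn next(&mut self) -> Option<i32> {
        // Forward to whichever iterator this value holds.
        match self {
            DiffIter::V(it) => it.next(),
            DiffIter::M(it) => it.next(),
        }
    }
}

impl DiffStruct {
    fn into_values(self) -> DiffIter {
        match self {
            DiffStruct::V(vec) => DiffIter::V(vec.into_iter()),
            DiffStruct::M(map) => DiffIter::M(map.into_values()),
        }
    }
}

fn main() {
    let from_vec: Vec<i32> = DiffStruct::V(vec![1, 2, 3]).into_values().collect();
    println!("{:?}", from_vec);

    let mut map = HashMap::new();
    map.insert(1, 10);
    map.insert(2, 20);
    let mut from_map: Vec<i32> = DiffStruct::M(map).into_values().collect();
    from_map.sort(); // HashMap iteration order is unspecified
    println!("{:?}", from_map);
}
```

The trade-off is more boilerplate in exchange for static dispatch; the boxed version is shorter and usually fast enough.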
See also
Conditionally iterate over one of several possible iterators
What is the correct way to return an Iterator?

Fill a vector of struct elements by iteration rather than using .push() one by one

I am trying to find an elegant way to fill a vector of struct elements with a loop or logic instead of writing one .push() for every element I create.
The struct element is a question with many more fields than in the following example, and the instances need to be mutable because they are modified by user input:
struct Question {
id: usize,
question: String,
}
fn main() {
//A large and growing list of questions
let mut q0 = Question {
id: 0,
question: String::from("A field I fill in manually"),
};
// .
// .
// .
let mut q100 = Question {
id: 100,
question: String::from("Another field, each one is different"),
};
let total_questions: usize = 100;
let mut w: Vec<String> = Vec::new();
for a in 0..total_questions {
let s = format!("q{}", a);
w.push(s);
}
//w contains ["q0", "q1", ..., "q100"] but is of type std::string::String
let mut v: Vec<&mut Question> = Vec::new();
//Expects type struct `main::Question`
//I would like to avoid :
v.push(&mut q0);
v.push(&mut q1);
// .
// .
// .
v.push(&mut q100);
}
I am not sure that in my example the w: Vec<String> is of any use.
I have looked into .collect() but could not understand how to utilize it in my case.
I'd be happy to be pointed towards a similar question if this is a duplicate I have not found one.
Edit : I have changed the structs string content as it was misleading. They each contain Strings that are unique and cannot be generated. I also realized that Stack Overflow automatically included this in a some_fn() function when we are actually inside main()
The problem is that you don't have any data structure that contains the Questions. You just have 100+ independent local variables, so it's not possible to iterate over them to fill the Vec. You can fix this by putting all the Questions in a Vec<Question> as you create them. Here's an example:
let mut v: Vec<Question> = vec![
Question {
id: 0,
question: String::from("Q0"),
},
// ...
Question {
id: 100,
question: String::from("Q100"),
},
];
In fact, once you do this you probably don't need the Vec<&mut Question> at all, since you can mutate the questions directly by indexing v. However, if you do need the vector of references for some reason, you can create it by collecting an iterator:
let v_refs: Vec<&mut Question> = v.iter_mut().collect();
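To illustrate mutating the questions directly, here is a small sketch. update_question is a hypothetical helper and the question texts are placeholders:

```rust
struct Question {
    id: usize,
    question: String,
}

// Hypothetical helper: overwrites the text of the question with the given id.
// Returns false when no question has that id.
fn update_question(questions: &mut [Question], id: usize, new_text: &str) -> bool {
    match questions.iter_mut().find(|q| q.id == id) {
        Some(q) => {
            q.question = new_text.to_string();
            true
        }
        None => false,
    }
}

fn main() {
    let mut v = vec![
        Question { id: 0, question: String::from("Q0") },
        Question { id: 100, question: String::from("Q100") },
    ];
    v[0].question.push_str(" (edited)"); // direct indexing into the Vec
    update_question(&mut v, 100, "Rewritten by the user");
    for q in &v {
        println!("{}: {}", q.id, q.question);
    }
}
```

Since the Vec owns the questions, only one mutable borrow is needed at a time and no separate vector of references has to be kept alive.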
If you can generate your Question object with a function you can use an iterator. Here is an example which just generates numbered Question objects out of a numeric range:
struct Question {
id: usize,
question: String,
}
fn main() {
let v: Vec<Question> = (0..10)
.map(|x| Question {
id: x,
question: "Q".to_string() + &x.to_string(),
})
.collect();
for x in &v {
println!("{}: {}", x.id, x.question);
}
}
Here is an example where you get the strings from an array of strings:
struct Question<'a> {
    id: usize,
    question: &'a str,
}
const QUESTIONS: [&str; 3] = ["A", "B", "C"];
fn main() {
    // Build one Question per entry of the QUESTIONS constant.
    let v: Vec<Question> = (0..QUESTIONS.len())
        .map(|x| Question {
            id: x,
            question: QUESTIONS[x],
        })
        .collect();
    for x in &v {
        println!("{}: {}", x.id, x.question);
    }
}

How to recursively call a closure that is stored in an Arc<Mutex<_>>?

I’m trying to transpile a dynamic language into Rust and closures are the most difficult part to implement.
I've tried using an Arc<Mutex<dyn FnMut>>, but it doesn't support recursion.
use std::sync::{Arc, Mutex};
type Data = Arc<DataUnpack>;
enum DataUnpack {
Number(f64),
Function(Box<Mutex<FnMut(Vec<Data>) -> Data>>),
}
fn call(f: Data, args: Vec<Data>) -> Data {
if let DataUnpack::Function(v) = &*f {
let f = &mut *v.lock().unwrap();
f(args)
} else {
panic!("TYPE ERR")
}
}
fn lambda(f: Box<FnMut(Vec<Data>) -> Data>) -> Data {
Arc::new(DataUnpack::Function(Box::new(Mutex::new(Box::leak(f)))))
}
fn main() {
let f: Arc<Mutex<Data>> = Arc::new(Mutex::new(Arc::new(DataUnpack::Number(0.0))));
*f.lock().unwrap() = {
let f = f.clone();
lambda(Box::new(move |xs| {
println!("Ha");
call(f.lock().unwrap().clone(), xs.clone())
}))
};
call(f.lock().unwrap().clone(), vec![]);
}
It shows one Ha and then stops. Where am I wrong?
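One plausible explanation, offered here as an assumption rather than a confirmed diagnosis, is that call holds the lock on the closure's Mutex while the closure runs, and the recursive call then tries to lock the same Mutex again. std::sync::Mutex is not reentrant, which the following sketch demonstrates with try_lock (relock_fails is a name invented for the example):

```rust
use std::sync::Mutex;

// Returns true if a second lock attempt is refused while the first guard is alive.
fn relock_fails(m: &Mutex<i32>) -> bool {
    let _guard = m.lock().unwrap();
    // try_lock does not block; it reports an error because the lock is already held.
    m.try_lock().is_err()
}

fn main() {
    let m = Mutex::new(0);
    println!("second lock refused while first is held: {}", relock_fails(&m));
    // Once the guard has been dropped, the mutex can be locked again.
    println!("lock available afterwards: {}", m.try_lock().is_ok());
}
```

A blocking lock() in the same situation would not return normally, which would match the program printing once and then stopping.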

What is an efficient way to reset all values of a Vec<T> without resizing it?

I can use resize, but it seems like overkill because I do not need to resize the vector, just modify its values. Using a new variable is not an option, since this vector is actually a field in a struct.
I guess that resize is efficient, and probably the answer to my question, but its name does not carry the meaning of resetting the values without modifying the size.
In C, I would use memset (in opposition to realloc).
Illustration of my question:
let my_vec_size = 42;
let mut my_vec = Vec::new(); // 'my_vec' will always have a size of 42
my_vec.resize(my_vec_size, false); // Set the size to 42, and all values to false
// [ ... ] piece of code where the values in 'my_vec' will be modified, checked, etc ...
// now I need to reuse my_vec.
// Possibility A -> use resize again
my_vec.resize(my_vec_size, false);
// Possibility B -> iterate on the vector to modify its values (long and laborious)
for item in my_vec.iter_mut() {
*item = false;
}
// Possibility C ?
The most efficient way in general is to reset the values themselves (aka B):
for item in &mut my_vec { *item = false; }
For booleans the benefit is not immediately obvious; for a String, however, it is important to preserve the allocated buffer of each element:
for item in &mut my_vec { item.clear(); }
If discarding and recreating the elements of the Vec is cheap, such as the case of the boolean or if the elements will be overwritten anyway, then a combination of clear and resize is easier:
my_vec.clear();
my_vec.resize(my_vec_size, false);
resize by itself will not work to "reset" values:
const LEN: usize = 3;
fn main() {
let mut values = vec![false; LEN];
values[0] = true;
values.resize(LEN, false);
println!("{:?}", values); // [true, false, false]
}
Just use a for loop:
for v in &mut values {
*v = false;
}
println!("{:?}", values); // [false, false, false]
If that sight offends you, write an extension trait:
trait ResetExt<T: Copy> {
fn reset(&mut self, val: T);
}
impl<T: Copy> ResetExt<T> for [T] {
fn reset(&mut self, value: T) {
for v in self {
*v = value;
}
}
}
values.reset(false);
println!("{:?}", values); // [false, false, false]
The trait idea can be extended so that each value knows how to reset itself, if that makes sense for your situation:
trait ResetExt {
fn reset(&mut self);
}
impl<T: ResetExt> ResetExt for [T] {
fn reset(&mut self) {
for v in self {
v.reset();
}
}
}
impl ResetExt for bool {
fn reset(&mut self) {
*self = false;
}
}
impl ResetExt for String {
fn reset(&mut self) {
self.clear();
}
}
values.reset();
println!("{:?}", values); // [false, false, false]
Regarding "In C, I would use memset":
std::ptr::write_bytes uses memset internally, so you can (almost) precisely translate this code. An example from the Rust documentation:
use std::ptr;
let mut vec = vec![0u32; 4];
unsafe {
    let vec_ptr = vec.as_mut_ptr();
    // Sets every byte of the first two u32 elements to 0xfe.
    ptr::write_bytes(vec_ptr, 0xfe, 2);
}
assert_eq!(vec, [0xfefefefe, 0xfefefefe, 0, 0]);
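On newer toolchains there is also a safe alternative: slices gained a fill method (stable since Rust 1.50), which overwrites every element without changing the length. A minimal sketch, with reset_flags as an illustrative wrapper name:

```rust
// Illustrative wrapper: resets every flag to false, keeping the length unchanged.
fn reset_flags(flags: &mut [bool]) {
    flags.fill(false);
}

fn main() {
    let mut my_vec = vec![false; 42];
    my_vec[3] = true;
    reset_flags(&mut my_vec);
    println!("len = {}, all false = {}", my_vec.len(), my_vec.iter().all(|&b| !b));
}
```

For element types that own heap buffers, such as String, fill would replace each element with a clone of the given value, so the per-element clear() loop shown earlier remains the better fit when the buffers should be kept.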

Implementing 2D vector syntax for accessing a 1D vector?

I'm making a toy roguelike and have a Level structure for storing the game map, for which the most naive implementation is a 2D vector.
I'm following this tutorial which uses a Vector of Vectors, but states that for performance gains it's also possible to use a single Vector of size MAP_HEIGHT * MAP_WIDTH, and to access a tile at (x, y) one can simply access map[y * MAP_WIDTH + x].
I'm trying to implement this faster method but using getters and setters is clunky, and public fields aren't that great either. I'd much prefer it to feel like a 2D vector.
In order to do that I need to implement the Index trait for my struct, but I'm not sure how to get the result I want. Maybe by nesting the impls? I really have no idea.
Here is my code with a terrible attempt at implementing Index for my structure, which obviously won't work for my purposes because it's one dimensional:
const MAP_WIDTH: i32 = 80;
const MAP_HEIGHT: i32 = 45;
pub struct Level {
map: Vec<Tile>,
}
impl Level {
pub fn new() -> Self {
Level { map: vec![Tile::empty(); (MAP_HEIGHT * MAP_WIDTH) as usize] }
}
}
impl std::ops::Index<i32> for Level {
type Output = Tile;
fn index(&self, x: i32) -> &Self::Output {
self[MAP_WIDTH + x]; // We have x and y values; how do we make this work?
}
}
Make your struct indexable over objects of type (i32, i32).
type Pos = (i32, i32);
impl std::ops::Index<Pos> for Level {
type Output = Tile;
fn index(&self, (x, y): Pos) -> &Self::Output {
&self.map[(y * MAP_WIDTH + x) as usize]
}
}
Which you can then access with, for example:
let tile = &level[(3, 4)]; // borrow the tile; moving it out would require Tile: Copy
Since you are using i32, you need to make sure that the values are within range, and can be coerced to usize, which is what Vecs are indexed over. Probably you should just stick with u32 or usize values from the start. Otherwise, you'll need to keep track of the minimum x and y values, and subtract them, to keep the position in range. It's definitely simpler to deal with positive coordinates and make the assumption that the corner of your map is (0, 0).
It is possible, though not obvious.
First of all, I suggest having the MAP_WIDTH and MAP_HEIGHT in usize, as they are positive integers:
const MAP_WIDTH: usize = 80;
const MAP_HEIGHT: usize = 45;
Then you need to implement Index (and possibly IndexMut) to return a slice; in this case I'm assuming that you want the first coordinate to be the row:
impl std::ops::Index<usize> for Level {
type Output = [Tile];
fn index(&self, row: usize) -> &[Tile] {
let start = MAP_WIDTH * row;
&self.map[start .. start + MAP_WIDTH]
}
}
impl std::ops::IndexMut<usize> for Level {
fn index_mut(&mut self, row: usize) -> &mut [Tile] {
let start = MAP_WIDTH * row;
&mut self.map[start .. start + MAP_WIDTH]
}
}
Then, when you index a Level, it first returns a slice with the applicable row; then you can index that slice with the column number.
Below is an example implementation with a substitute Tile:
const MAP_WIDTH: usize = 80;
const MAP_HEIGHT: usize = 45;
#[derive(Clone, Debug)]
pub struct Tile {
x: u32,
y: u32
}
pub struct Level {
map: Vec<Tile>,
}
impl Level {
pub fn new() -> Self {
Level { map: vec![Tile { x: 0, y: 0 }; (MAP_HEIGHT * MAP_WIDTH) as usize] }
}
}
impl std::ops::Index<usize> for Level {
type Output = [Tile];
fn index(&self, row: usize) -> &[Tile] {
let start = MAP_WIDTH * row;
&self.map[start .. start + MAP_WIDTH]
}
}
impl std::ops::IndexMut<usize> for Level {
fn index_mut(&mut self, row: usize) -> &mut [Tile] {
let start = MAP_WIDTH * row;
&mut self.map[start .. start + MAP_WIDTH]
}
}
fn main() {
let mut lvl = Level::new();
lvl[5][2] = Tile { x: 5, y: 2 };
println!("{:?}", lvl[5][2]); // Tile { x: 5, y: 2 }
}
You cannot do this without exposing internal details about your implementation. Index is defined as:
pub trait Index<Idx>
where
Idx: ?Sized,
{
type Output: ?Sized;
fn index(&self, index: Idx) -> &Self::Output;
}
In order to support game[x][y], the return value of game[x] would need to:
Be a reference to something. (&Self::Output)
Implement Index itself.
There's no value to return a reference to other than self, and self would already implement Index for a usize so you can't reuse it.
Instead, you can implement indexing for a tuple:
impl std::ops::Index<(usize, usize)> for Level {
type Output = Tile;
fn index(&self, (x, y): (usize, usize)) -> &Self::Output {
&self.map[MAP_WIDTH as usize * y + x]
}
}
This can be used as level[(43, 12)].
If you implement Index to return a slice, you should be aware that you are forever requiring that your internal data structure be something that is based on slices. For example, you cannot use a "sparse" structure like a HashMap because it cannot return a &[Tile]. The ability to return a &[Tile] is now a part of the public API of the struct. It's certainly a possibility that the representation will change, especially since it's already changed once.
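Putting the tuple-based approach together, a self-contained sketch with both Index and IndexMut could look like this. The Tile enum is a stand-in invented for the example, and the explicit bounds check is an addition: without it, an x value past MAP_WIDTH would silently address a tile in a later row.

```rust
use std::ops::{Index, IndexMut};

const MAP_WIDTH: usize = 80;
const MAP_HEIGHT: usize = 45;

// Stand-in tile type, assumed for this sketch only.
#[derive(Clone, Debug, PartialEq)]
pub enum Tile {
    Empty,
    Wall,
}

pub struct Level {
    map: Vec<Tile>,
}

impl Level {
    pub fn new() -> Self {
        Level { map: vec![Tile::Empty; MAP_WIDTH * MAP_HEIGHT] }
    }

    // Converts (x, y) into an offset in the flat vector, panicking when out of bounds.
    fn offset((x, y): (usize, usize)) -> usize {
        assert!(x < MAP_WIDTH && y < MAP_HEIGHT, "position out of bounds");
        y * MAP_WIDTH + x
    }
}

impl Index<(usize, usize)> for Level {
    type Output = Tile;
    fn index(&self, pos: (usize, usize)) -> &Tile {
        &self.map[Level::offset(pos)]
    }
}

impl IndexMut<(usize, usize)> for Level {
    fn index_mut(&mut self, pos: (usize, usize)) -> &mut Tile {
        &mut self.map[Level::offset(pos)]
    }
}

fn main() {
    let mut level = Level::new();
    level[(43, 12)] = Tile::Wall;
    println!("{:?}", level[(43, 12)]); // Wall
}
```

Because callers only ever see a position and a Tile, the flat Vec could later be replaced by another representation without changing this interface.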

Resources