I have the following function, which takes a vector as argument and returns a vector of its pairs of elements:
fn to_pairs(flat: Vec<u64>) -> Vec<(u64, u64)> {
assert!(flat.len() % 2 == 0);
let mut pairs = Vec::new();
pairs.reserve(flat.len() / 2);
for pair in flat.chunks(2) {
assert!(pair.len() == 2);
pairs.push((pair.get(0).unwrap().clone(), pair.get(1).unwrap().clone()));
}
pairs
}
I want to consume the vector flat so that I don't have to clone its elements when constructing the pairs.
Is it possible to do so without reimplementing a variation of Vec::chunks() myself?
Convert the input Vec into an iterator, then take two things from the iterator at a time. Essentially, you want the same thing as processing a Range (an iterator) in chunks:
fn to_pairs<T>(flat: Vec<T>) -> Vec<(T, T)> {
let len = flat.len();
assert!(len % 2 == 0);
let mut pairs = Vec::with_capacity(len / 2);
let mut input = flat.into_iter().peekable();
while input.peek().is_some() {
match (input.next(), input.next()) {
(Some(a), Some(b)) => pairs.push((a, b)),
_ => unreachable!("Cannot have an odd number of values"),
}
}
pairs
}
fn main() {
assert_eq!(vec![(1,2), (3,4)], to_pairs(vec![1,2,3,4]));
assert_eq!(vec![(true,true), (false,false)], to_pairs(vec![true,true,false,false]));
}
The assert!(len % 2 == 0); is quite important here, as Iterator makes no guarantees about what happens after the first time next returns None. Since we call next twice without checking the first value, we could be triggering that case. In other cases, you'd want to use fuse.
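As a minimal sketch of the fuse variant mentioned above: the function name to_pairs_fused and the choice to silently drop a trailing unpaired element are assumptions made for illustration, not part of the original answer. Iterator::fuse guarantees that next keeps returning None once it has returned None, so calling next twice per step is safe for any input iterator:

```rust
/// Pairs up consecutive items from any iterator.
/// Hypothetical helper: a trailing unpaired item is dropped (an assumption of this sketch).
fn to_pairs_fused<T>(items: impl IntoIterator<Item = T>) -> Vec<(T, T)> {
    // `fuse` makes it safe to call `next` again after a `None`.
    let mut input = items.into_iter().fuse();
    let mut pairs = Vec::new();
    while let (Some(a), Some(b)) = (input.next(), input.next()) {
        pairs.push((a, b));
    }
    pairs
}
```

Calling to_pairs_fused(vec![1, 2, 3]) yields a single pair, (1, 2), instead of panicking on the odd length.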
As pointed out by Kha, you could simplify the while loop a bit:
let mut input = flat.into_iter();
while let (Some(a), Some(b)) = (input.next(), input.next()) {
pairs.push((a, b));
}
I am trying to iterate over vector slices, but it doesn't work as I expected.
My code is as follows:
pub fn three_sum(nums: Vec<i32>) -> Vec<Vec<i32>> {
let mut res: Vec<Vec<i32>> = Vec::new();
for (n1, &i1) in nums.iter().enumerate() {
for (n2, &i2) in nums[(n1 + 1)..].iter().enumerate() {
for (n3, &i3) in nums[(n2 + 1)..].iter().enumerate() {
if i1 + i2 + i3 == 0 {
res.push(Vec::from([i1, i2, i3]));
}
}
}
}
return res;
}
I expected the n2 loop to range over nums from n1 to the end, but it just loops from the beginning, regardless of what n1 is.
The same happens with n3.
Did I use iterators and slices correctly?
As you can see in the docs, enumerate just counts the number of iterations. If you want to skip the first n elements of an iterator, you should use the skip function instead, which also more clearly expresses your intent.
nums.iter().enumerate().skip(n)
Note the order of enumerate and skip, however: this way you first construct an enumerated iterator and then skip some elements, so your index will also "start from" n. The other way around, you skip the elements first and then count them, starting from 0.
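To make the difference between the two orders concrete, here is a small sketch. The helper names enumerate_then_skip and skip_then_enumerate are made up for this illustration:

```rust
/// Counts first, then skips: indices refer to positions in the original slice.
fn enumerate_then_skip(items: &[char], n: usize) -> Vec<(usize, char)> {
    items.iter().copied().enumerate().skip(n).collect()
}

/// Skips first, then counts: indices restart at 0 for the remaining elements.
fn skip_then_enumerate(items: &[char], n: usize) -> Vec<(usize, char)> {
    items.iter().copied().skip(n).enumerate().collect()
}
```

For the input ['a', 'b', 'c', 'd'] with n = 2, the first function returns [(2, 'c'), (3, 'd')] and the second returns [(0, 'c'), (1, 'd')].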
There is no way for enumerate to know where you want to start counting from so it always starts at 0.
You can use zip with a range if you want to start counting from a different number.
pub fn three_sum(nums: Vec<i32>) -> Vec<[i32;3]> {
let mut res = Vec::new();
for (n1, &i1) in nums.iter().enumerate() {
for (n2, &i2) in (n1+1..).zip(nums[(n1 + 1)..].iter()) {
for (n3, &i3) in (n2+1..).zip(nums[(n2 + 1)..].iter()) {
if i1 + i2 + i3 == 0 {
res.push([i1, i2, i3]);
}
}
}
}
res
}
Please help me understand how the following code always returns the smallest value in the array. I tried moving the 3 to different positions, but the function always returns it, irrespective of its position in the array.
let myA = [12,3,8,5]
let myN = 4
function F4(A,N)
{
if(N==1){
return A[0]
}
if(F4(A,N-1) < A[N-1]){
return F4(A,N-1)
}
return A[N-1]
}
console.log(F4(myA,myN))
This is quite tricky to get an intuition for. It's also quite important that you learn the process for tackling this type of problem rather than simply be told the answer.
If we take a first view of the code with a few comments and named variables it looks like this:
let myA = [12,3,8,5];
let myN = myA.length;
function F4(A, N) {
// if (once) there is only one element in the array "A", then it must be the minimum, do not recurse
if (N === 1){
return A[0]
}
const valueFromArrayLessLastEl = F4(A,N-1); // Goes 'into' array
const valueOfLastElement = A[N-1];
console.log(valueFromArrayLessLastEl, valueOfLastElement);
// note that the recursion happens before min(a, b) is evaluated so array is evaluated from the start
if (valueFromArrayLessLastEl < valueOfLastElement) {
return valueFromArrayLessLastEl;
}
return valueOfLastElement;
}
console.log(F4(myA, myN))
and produces
12 3 // recursed all the way down
3 8 // stepping back up with result from most inner/lowest recursion
3 5
3
But in order to gain insight, it is vital that you approach the problem by considering the simplest cases and expanding from there. What happens if we write the code for the cases N = 1 and N = 2?
// trivially take N=1
function F1(A) {
return A[0];
}
// take N=2
function F2(A) {
const f1Val = F1(A); // N-1 = 1
const lastVal = A[1];
// return the minimum of the first element and the 2nd or last element
if (f1Val < lastVal) {
return f1Val;
}
return lastVal;
}
Please note that the array is not being modified; I speak as though it is because the value of N is decremented on each recursion.
With myA = [12, 3, 8, 5], F1 will always return 12. F2 will compare this value, 12, with 3 (the value at index N-1, i.e. A[1]) and return the minimum.
If you can build on this to work out what F3 would do then you can extrapolate from there.
Play around with this, reordering the values in myA, but crucially look at the output as you increase N from 1 to 4.
As a side note: by moving the recursive call F4(A,N-1) to a local constant I've prevented it being called twice with the same values.
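The effect of that side note can be measured by counting calls. The sketch below is a transliteration into Rust rather than JavaScript, and the function names and the calls counter are additions made for illustration only. The first function mirrors the original F4, which recurses a second time whenever the comparison succeeds; the second stores the result in a local binding first:

```rust
/// Mirrors the original F4: the recursive call appears twice.
fn min_recursing_twice(a: &[i32], n: usize, calls: &mut u32) -> i32 {
    *calls += 1;
    if n == 1 {
        return a[0];
    }
    if min_recursing_twice(a, n - 1, calls) < a[n - 1] {
        // Second recursion with exactly the same arguments.
        return min_recursing_twice(a, n - 1, calls);
    }
    a[n - 1]
}

/// Mirrors the rewritten version: the recursive result is computed once.
fn min_recursing_once(a: &[i32], n: usize, calls: &mut u32) -> i32 {
    *calls += 1;
    if n == 1 {
        return a[0];
    }
    let min_of_prefix = min_recursing_once(a, n - 1, calls);
    let last = a[n - 1];
    if min_of_prefix < last { min_of_prefix } else { last }
}
```

For [12, 3, 8, 5] both return 3, but the first version makes 11 calls while the second makes 4.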
How to split a vector
let v: Vec<u8>; // vector with size x
into a vector of vectors of maxsize n? Pseudocode:
let n: usize = 1024;
let chunks_list: Vec<Vec<u8>> = chunks(v, n);
or using slices (to avoid copying):
let v: &[u8];
let chunks_list: Vec<&[u8]> = chunks(v, n);
Rust slices already contain the necessary method for that: chunks.
Starting from this:
let src: Vec<u8> = vec![1, 2, 3, 4, 5];
you can get a vector of slices (no copy):
let dst: Vec<&[u8]> = src.chunks(3).collect();
or a vector of vectors (slower, heavier):
let dst: Vec<Vec<u8>> = src.chunks(3).map(|s| s.into()).collect();
There is a method already existing for slices:
pub fn chunks(&self, chunk_size: usize) -> Chunks<'_, T>
Returns an iterator over chunk_size elements of the slice at a time, starting at the beginning of the slice.
The chunks are slices and do not overlap. If chunk_size does not divide the length of the slice, then the last chunk will not have length chunk_size.
There is also chunks_mut for mutable access, as well as chunks_exact and chunks_exact_mut if every chunk has to have exactly size n (any remainder is left out of the iteration), along with the unsafe as_chunks_unchecked for when we assume there is no remainder. See the example below:
fn main() {
let v: [u8; 5] = *b"lorem";
let n = 2;
let chunks = v.chunks(n);
let chunks_list: Vec<&[u8]> = chunks.collect();
println!("{:?}", chunks_list);
}
Using a slice instead of vectors has some benefits, notably avoiding the overhead of copying.
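For comparison, here is a minimal sketch of chunks_exact, which only yields full-size chunks and exposes the leftover elements through ChunksExact::remainder. The helper name split_exact is hypothetical:

```rust
/// Splits `data` into full chunks of length `n` plus the leftover tail.
/// Panics if `n` is 0, as `chunks_exact` itself does.
fn split_exact(data: &[u8], n: usize) -> (Vec<&[u8]>, &[u8]) {
    let exact = data.chunks_exact(n);
    // Elements that did not fit into a full chunk (fewer than `n` of them).
    let remainder = exact.remainder();
    (exact.collect(), remainder)
}
```

With the input b"lorem" and n = 2, the full chunks are "lo" and "re", and the remainder is "m". No bytes are copied; every returned slice borrows from data.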
If it's required to take a Vec and split it into multiple Vecs, I'd use Itertools::chunks. This takes an iterator and returns an iterator of iterators. You can then choose to collect both the inner and outer iterators into Vecs:
use itertools::Itertools; // 0.10.0
fn main() {
let v = vec![String::from("A"), String::from("B"), String::from("C")];
let x: Vec<Vec<String>> = v.into_iter().chunks(2).into_iter().map(|c| c.collect()).collect();
eprintln!("{:?}", x);
}
[["A", "B"], ["C"]]
This has the benefit of taking ownership of each value in the original vector. No data needs to be copied, but it does need to be moved. If you can use slices instead, it's much better to use slice::chunks.
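If adding a dependency is not an option, the same ownership-taking behavior can be sketched with the standard library alone. This is an illustrative alternative, not the itertools implementation, and the name into_chunks is made up here:

```rust
/// Moves the elements of `items` into consecutive Vecs of at most `n` elements.
fn into_chunks<T>(items: Vec<T>, n: usize) -> Vec<Vec<T>> {
    assert!(n > 0, "chunk size must be non-zero");
    let mut iter = items.into_iter().peekable();
    let mut out = Vec::new();
    // Keep taking up to `n` items while anything is left.
    while iter.peek().is_some() {
        out.push(iter.by_ref().take(n).collect());
    }
    out
}
```

As with the itertools version, each value is moved rather than cloned, so this also works for element types that do not implement Clone.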
Here is one approach:
fn chunks(data: Vec<u8>, chunk_size: usize) -> Vec<Vec<u8>> {
    let mut results = vec![];
    let mut current = Vec::with_capacity(chunk_size);
    for i in data {
        // Start a new chunk once the current one is full.
        if current.len() >= chunk_size {
            results.push(current);
            current = Vec::with_capacity(chunk_size);
        }
        current.push(i);
    }
    // Push the final (possibly shorter) chunk, unless the input was empty.
    if !current.is_empty() {
        results.push(current);
    }
    results
}
fn main() {
    let v: Vec<u8> = (1..100).collect();
    let n: usize = 24;
    let chunks_list = chunks(v, n);
    println!("{:#?}", chunks_list);
}
I need to find out how many integers are present in both of two given sets, fast. The sets are written to only once but this operation will be performed many times with different pairs of sets. The sets contain 5-30 integers and the largest of these integers is 840000.
I initially tried iterating over one Vec and, for each element, checking whether it's present in the other Vec. I then decided to use BTreeSet instead, since it should be significantly faster at checking whether an integer is present in the set, but that does not seem to be the case. The Vec implementation takes ~72ms and the BTreeSet ~96ms when called on a couple of thousand sets in release mode under stable Rust 1.34, with the same performance when using nightly.
This is the Vec implementation:
use std::cmp;
fn main() {
let mut sets = Vec::with_capacity(1000);
for i in 1..1000 {
let mut set = Vec::new();
for j in 1..i % 30 {
set.push(i * j % 50000);
}
sets.push(set);
}
for left_set in sets.iter() {
for right_set in sets.iter() {
calculate_waste(left_set, right_set);
}
}
}
fn calculate_waste(left_nums: &Vec<usize>, right_nums: &Vec<usize>) -> usize {
let common_nums = left_nums.iter().fold(0, |intersection_count, num| {
intersection_count + right_nums.contains(num) as usize
});
let left_side = left_nums.len() - common_nums;
let right_side = right_nums.len() - common_nums;
let score = cmp::min(common_nums, cmp::min(left_side, right_side));
left_side - score + right_side - score + common_nums - score
}
And this is the BTreeSet implementation:
use std::cmp;
use std::collections::BTreeSet;
fn main() {
let mut sets = Vec::with_capacity(1000);
for i in 1..1000 {
let mut set = BTreeSet::new();
for j in 1..i % 30 {
set.insert(i * j % 50000);
}
sets.push(set);
}
for left_set in sets.iter() {
for right_set in sets.iter() {
calculate_waste(left_set, right_set);
}
}
}
fn calculate_waste(left_nums: &BTreeSet<usize>, right_nums: &BTreeSet<usize>) -> usize {
let common_nums = left_nums.intersection(&right_nums).count();
let left_side = left_nums.len() - common_nums;
let right_side = right_nums.len() - common_nums;
let score = cmp::min(common_nums, cmp::min(left_side, right_side));
left_side - score + right_side - score + common_nums - score
}
It was run with the following command (-w 50 makes it ignore the first 50 runs):
hyperfine "cargo run --release" -w 50 -m 100
Full code of the program available here.
Is the BTreeSet implementation slower because there are too few integers in the set to allow its O(log n) access time to shine? If so, is there anything else I can do to speed up this function?
Since your sets don't change over time, I think your best option is to use sorted vectors. Sorting the vectors will be required only once, at initialization time. The intersection of two sorted vectors can be computed in linear time by iterating over them simultaneously, always advancing the iterator that currently points to the lower number. Here is an attempt at an implementation:
fn intersection_count_sorted_vec(a: &[u32], b: &[u32]) -> usize {
let mut count = 0;
let mut b_iter = b.iter();
if let Some(mut current_b) = b_iter.next() {
for current_a in a {
while current_b < current_a {
current_b = match b_iter.next() {
Some(current_b) => current_b,
None => return count,
};
}
if current_a == current_b {
count += 1;
}
}
}
count
}
This probably isn't particularly well optimised; regardless, benchmarking with Criterion-based code indicates this version is more than three times as fast as your solution using vectors.
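To show how the one-time sorting step fits together with the linear merge, here is an index-based sketch of the same idea. The names prepare and intersection_count are hypothetical, and the dedup call assumes each set should contain a value at most once:

```rust
use std::cmp::Ordering;

/// One-time preparation: sort the values and drop duplicates.
fn prepare(mut set: Vec<u32>) -> Vec<u32> {
    set.sort_unstable();
    set.dedup();
    set
}

/// Counts common elements of two sorted, duplicate-free slices in linear time.
fn intersection_count(a: &[u32], b: &[u32]) -> usize {
    let (mut i, mut j, mut count) = (0, 0, 0);
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            // Always advance the side that currently points at the lower number.
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                count += 1;
                i += 1;
                j += 1;
            }
        }
    }
    count
}
```

Each set is passed through prepare once when it is created; after that, every pairwise comparison only needs intersection_count. No performance claim is made for this variant; it would have to be benchmarked like the others.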
I'm trying to complete the activity at the bottom of this page, where I need to print the index of each element as well as the value. I'm starting from the code
use std::fmt; // Import the `fmt` module.
// Define a structure named `List` containing a `Vec`.
struct List(Vec<i32>);
impl fmt::Display for List {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
// Extract the value using tuple indexing
// and create a reference to `vec`.
let vec = &self.0;
write!(f, "[")?;
// Iterate over `vec` in `v` while enumerating the iteration
// count in `count`.
for (count, v) in vec.iter().enumerate() {
// For every element except the first, add a comma.
// Use the ? operator, or try!, to return on errors.
if count != 0 { write!(f, ", ")?; }
write!(f, "{}", v)?;
}
// Close the opened bracket and return a fmt::Result value
write!(f, "]")
}
}
fn main() {
let v = List(vec![1, 2, 3]);
println!("{}", v);
}
I'm brand new to coding and I'm learning Rust by working my way through the Rust docs and Rust by Example. I'm totally stuck on this.
In the book you can see this line:
for (count, v) in vec.iter().enumerate()
If you look at the documentation, you can see a lot of useful functions for Iterator and enumerate's description states:
Creates an iterator which gives the current iteration count as well as the next value.
The iterator returned yields pairs (i, val), where i is the current index of iteration and val is the value returned by the iterator.
enumerate() keeps its count as a usize. If you want to count by a different sized integer, the zip function provides similar functionality.
With this, you have the index of each element in your vector. The simple way to do what you want is to use count:
write!(f, "{}: {}", count, v)?;
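Put together with the starting code from the question, the Display implementation might look like the sketch below. Only the inner write! line differs from the original:

```rust
use std::fmt;

struct List(Vec<i32>);

impl fmt::Display for List {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "[")?;
        for (count, v) in self.0.iter().enumerate() {
            // For every element except the first, add a comma.
            if count != 0 {
                write!(f, ", ")?;
            }
            // Print the index, a colon, then the value.
            write!(f, "{}: {}", count, v)?;
        }
        write!(f, "]")
    }
}
```

Formatting List(vec![1, 2, 3]) with {} then produces [0: 1, 1: 2, 2: 3].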
This is a simple example to print the index and value of a vector:
fn main() {
let vec1 = vec![1, 2, 3, 4, 5];
println!("length is {}", vec1.len());
for x in 0..vec1.len() {
println!("{} {}", x, vec1[x]);
}
}
The program's output is:
length is 5
0 1
1 2
2 3
3 4
4 5
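The same lines can also be produced without indexing into the vector, by using enumerate as in the other answer. In this sketch the helper name index_value_lines is made up, and it returns the lines instead of printing them so that the result is easy to inspect:

```rust
/// Builds one "index value" line per element, without manual indexing.
fn index_value_lines(values: &[i32]) -> Vec<String> {
    values
        .iter()
        .enumerate()
        .map(|(i, v)| format!("{} {}", i, v))
        .collect()
}
```

Printing each returned line with println! gives the same index and value pairs as the loop over 0..vec1.len().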