How to bulk insert into a vector in Rust? [duplicate]

Is there any straightforward way to insert or replace multiple elements from &[T] and/or Vec<T> in the middle or at the beginning of a Vec in linear time?
I could only find std::vec::Vec::insert, but that's only for inserting a single element in O(n) time, so I obviously cannot call that in a loop.
I could do a split_off at that index, extend the new elements into the left half of the split, and then extend the second half into the first, but is there a better way?
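The split_off approach described above can be sketched as follows. This is only an illustration of the idea in the question; insert_slice_at is a hypothetical helper name, and the sketch assumes T: Clone so that the slice elements can be copied in.

```rust
/// Hypothetical helper: inserts `items` into `v` starting at `index`.
fn insert_slice_at<T: Clone>(v: &mut Vec<T>, index: usize, items: &[T]) {
    // Detach everything from `index` onward into its own vector.
    let mut tail = v.split_off(index);
    // Append the new elements to the left half...
    v.extend_from_slice(items);
    // ...then move the detached tail back onto the end.
    v.append(&mut tail);
}

fn main() {
    let mut v = vec![1, 5];
    insert_slice_at(&mut v, 1, &[2, 3, 4]);
    println!("{:?}", v); // [1, 2, 3, 4, 5]
}
```

Each step is linear, but split_off allocates a second vector for the tail, which is the cost the question is hoping to avoid.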

As of Rust 1.21.0, Vec::splice is available and allows inserting at any point, including fully prepending:
let mut vec = vec![1, 5];
let slice = &[2, 3, 4];
vec.splice(1..1, slice.iter().cloned());
println!("{:?}", vec); // [1, 2, 3, 4, 5]
The docs state:
Note 4: This is optimal if:
The tail (elements in the vector after range) is empty
or replace_with yields fewer elements than range’s length
or the lower bound of its size_hint() is exact.
In this case, the lower bound of the slice's iterator should be exact, so it should perform one memory move.
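To check whether an iterator reports an exact length, you can inspect size_hint() directly. The sketch below is only illustrative; has_exact_hint is a hypothetical helper, not part of the standard library.

```rust
/// Hypothetical helper: true when the lower and upper bounds agree.
fn has_exact_hint<I: Iterator>(iter: &I) -> bool {
    let (lower, upper) = iter.size_hint();
    upper == Some(lower)
}

fn main() {
    let slice = &[2, 3, 4];
    // A cloned slice iterator knows exactly how many items remain.
    println!("{:?}", slice.iter().cloned().size_hint()); // (3, Some(3))
    // A filter cannot know in advance how many items will pass.
    let filtered = slice.iter().cloned().filter(|x| x % 2 == 0);
    println!("{:?}", filtered.size_hint()); // (0, Some(3))
    println!("{}", has_exact_hint(&slice.iter().cloned())); // true
}
```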
splice is a bit more powerful in that it allows you to remove a range of values (the first argument), insert new values (the second argument), and optionally get the old values (the result of the call).
Replacing a set of items
let mut vec = vec![0, 1, 5];
let slice = &[2, 3, 4];
vec.splice(..2, slice.iter().cloned());
println!("{:?}", vec); // [2, 3, 4, 5]
Getting the previous values
let mut vec = vec![0, 1, 2, 3, 4];
let slice = &[9, 8, 7];
let old: Vec<_> = vec.splice(3.., slice.iter().cloned()).collect();
println!("{:?}", vec); // [0, 1, 2, 9, 8, 7]
println!("{:?}", old); // [3, 4]

Okay, as far as I can see there is no appropriate method in the Vec interface, but we can always implement the same thing ourselves.
memmove
When T is Copy, probably the most obvious way is to move the memory, like this:
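fn push_all_at<T>(v: &mut Vec<T>, offset: usize, s: &[T]) where T: Copy {
    // An out-of-range offset would make the pointer arithmetic below unsound.
    assert!(offset <= v.len());
    match (v.len(), s.len()) {
        (_, 0) => (),
        (current_len, _) => {
            v.reserve_exact(s.len());
            unsafe {
                let to_move = current_len - offset;
                let src = v.as_mut_ptr().add(offset);
                if to_move > 0 {
                    let dst = src.add(s.len());
                    // Shift the tail to the right; the ranges may overlap (memmove).
                    std::ptr::copy(src, dst, to_move);
                }
                // Fill the gap from the slice (memcpy).
                std::ptr::copy_nonoverlapping(s.as_ptr(), src, s.len());
                // Only expose the new length once every element is initialized.
                v.set_len(current_len + s.len());
            }
        },
    }
}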
shuffle
If T is not Copy but implements Clone, we can append the given slice to the end of the Vec and move it to the required position using swaps in linear time:
fn push_all_at<T>(v: &mut Vec<T>, mut offset: usize, s: &[T]) where T: Clone + Default {
    match (v.len(), s.len()) {
        (_, 0) => (),
        (0, _) => { v.extend_from_slice(s); },
        (_, _) => {
            assert!(offset <= v.len());
            // Pad so that the distance from offset to the end is a multiple of s.len().
            let pad = s.len() - ((v.len() - offset) % s.len());
            v.extend(std::iter::repeat(T::default()).take(pad));
            v.extend_from_slice(s);
            let total = v.len();
            // Each pass swaps the block at offset with the last block,
            // which holds the elements displaced by the previous pass.
            while total - offset >= s.len() {
                for i in 0 .. s.len() { v.swap(offset + i, total - s.len() + i); }
                offset += s.len();
            }
            // The padding ends up at the tail; drop it.
            v.truncate(total - pad);
        },
    }
}
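On current Rust, the same append-then-move idea can probably be written without padding or a Default bound by using the standard slice method rotate_right. Note that this is a swapped-in technique rather than the swap loop above, and insert_by_rotation is a hypothetical name.

```rust
/// Hypothetical helper: appends `s`, then rotates it into place at `offset`.
fn insert_by_rotation<T: Clone>(v: &mut Vec<T>, offset: usize, s: &[T]) {
    assert!(offset <= v.len());
    // Put the new elements at the end first.
    v.extend_from_slice(s);
    // Rotate everything from `offset` onward so the new elements come first.
    v[offset..].rotate_right(s.len());
}

fn main() {
    let mut v = vec!["a", "b", "c", "d", "e"];
    insert_by_rotation(&mut v, 2, &["x", "y"]);
    println!("{:?}", v); // ["a", "b", "x", "y", "c", "d", "e"]
}
```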
iterators concat
Maybe the best choice is not to modify the Vec at all. For example, if you are going to access the result through an iterator, you can just build a chain of iterators from the chunks:
let v: &[usize] = &[0, 1, 2];
let s: &[usize] = &[3, 4, 5, 6];
let offset = 2;
let chain = v.iter().take(offset).chain(s.iter()).chain(v.iter().skip(offset));
let result: Vec<_> = chain.collect();
println!("Result: {:?}", result);
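The chain above yields references, so the collected result is a Vec<&usize>. If an owned vector is wanted, one possible variant is to slice instead of take/skip and clone at the end. The function name with_inserted is hypothetical and the sketch assumes T: Clone.

```rust
/// Hypothetical helper: returns a new Vec with `s` inserted at `offset`.
fn with_inserted<T: Clone>(v: &[T], offset: usize, s: &[T]) -> Vec<T> {
    v[..offset]
        .iter()
        .chain(s.iter())
        .chain(v[offset..].iter())
        .cloned() // turn the borrowed items into owned ones
        .collect()
}

fn main() {
    let v: &[usize] = &[0, 1, 2];
    let s: &[usize] = &[3, 4, 5, 6];
    println!("{:?}", with_inserted(v, 2, s)); // [0, 1, 3, 4, 5, 6, 2]
}
```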

I was trying to prepend to a vector in Rust and found a closed question that linked here, even though this question covers prepending, inserting, and efficiency. My answer would fit that other, more precise question better, because I can't attest to the efficiency, but the following code helped me prepend (and do the opposite). I'm sure the other two answers are more efficient, but I learn best from answers that can be cut and pasted, with examples that demonstrate an application of the answer.
pub trait Unshift<T> { fn unshift(&mut self, s: &[T]) -> (); }
pub trait UnshiftVec<T> { fn unshift_vec(&mut self, s: Vec<T>) -> (); }
pub trait UnshiftMemoryHog<T> { fn unshift_memory_hog(&mut self, s: Vec<T>) -> (); }
pub trait Shift<T> { fn shift(&mut self) -> (); }
pub trait ShiftN<T> { fn shift_n(&mut self, s: usize) -> (); }
impl<T: std::clone::Clone> ShiftN<T> for Vec<T> {
fn shift_n(&mut self, s: usize) -> ()
// where
// T: std::clone::Clone,
{
self.drain(0..s);
}
}
impl<T: std::clone::Clone> Shift<T> for Vec<T> {
fn shift(&mut self) -> ()
// where
// T: std::clone::Clone,
{
self.drain(0..1);
}
}
impl<T: std::clone::Clone> Unshift<T> for Vec<T> {
fn unshift(&mut self, s: &[T]) -> ()
// where
// T: std::clone::Clone,
{
self.splice(0..0, s.to_vec());
}
}
impl<T: std::clone::Clone> UnshiftVec<T> for Vec<T> {
fn unshift_vec(&mut self, s: Vec<T>) -> ()
where
T: std::clone::Clone,
{
self.splice(0..0, s);
}
}
impl<T: std::clone::Clone> UnshiftMemoryHog<T> for Vec<T> {
fn unshift_memory_hog(&mut self, s: Vec<T>) -> ()
where
T: std::clone::Clone,
{
let mut tmp: Vec<_> = s.to_owned();
//let mut tmp: Vec<_> = s.clone(); // this also works for some data types
/*
let local_s: Vec<_> = self.clone(); // explicit clone()
tmp.extend(local_s); // to vec is possible
*/
tmp.extend(self.clone());
*self = tmp;
//*self = (*tmp).to_vec(); // Just because it compiles, doesn't make it right.
}
}
// this works for: v = unshift(v, &vec![8]);
// (If you don't want to impl Unshift for Vec<T>)
#[allow(dead_code)]
fn unshift_fn<T>(v: Vec<T>, s: &[T]) -> Vec<T>
where
T: Clone,
{
// create a mutable vec and fill it
// with a clone of the array that we want
// at the start of the vec.
let mut tmp: Vec<_> = s.to_owned();
// then we add the existing vector to the end
// of the temporary vector.
tmp.extend(v);
// return the tmp vec, which is identical
// to unshift-ing the original vec.
tmp
}
/*
N.B. It is sometimes (often?) more memory efficient to reverse
the vector and use push/pop, rather than splice/drain;
Especially if you create your vectors in "stack order" to begin with.
*/
fn main() {
let mut v: Vec<usize> = vec![1, 2, 3];
println!("Before push:\t {:?}", v);
v.push(0);
println!("After push:\t {:?}", v);
v.pop();
println!("popped:\t\t {:?}", v);
v.drain(0..1);
println!("drain(0..1)\t {:?}", v);
/*
// We could use a function
let c = v.clone();
v = unshift_fn(c, &vec![0]);
*/
v.splice(0..0, vec![0]);
println!("splice(0..0, vec![0]) {:?}", v);
v.shift_n(1);
println!("shift\t\t {:?}", v);
v.unshift_memory_hog(vec![8, 16, 31, 1]);
println!("MEMORY guzzler unshift {:?}", v);
//v.drain(0..3);
v.drain(0..=2);
println!("back to the start: {:?}", v);
v.unshift_vec(vec![0]);
println!("zerothed with unshift: {:?}", v);
let mut w = vec![4, 5, 6];
/*
let prepend_this = &[1, 2, 3];
w.unshift_vec(prepend_this.to_vec());
*/
w.unshift(&[1, 2, 3]);
assert_eq!(&w, &[1, 2, 3, 4, 5, 6]);
println!("{:?} == {:?}", &w, &[1, 2, 3, 4, 5, 6]);
}

Related

Borrow error when attempting recursion on HashMap, where each value needs a reference to the map

I currently have an issue with Rust's borrowing rules.
I have a HashMap of Value structs, each of which contains a list of keys to other Values in the HashMap. I am attempting to recursively call a function on these Values, which requires a reference to the HashMap.
use std::collections::HashMap;
struct Value {
val: f64,
prevs: Vec<usize>,
sum: f64,
}
impl Value {
pub fn new(i: usize) -> Value {
let mut res = Value {
val: 0.1,
prevs: Vec::new(),
sum: 0.0,
};
for j in 0..i {
res.prevs.push(j);
}
res
}
pub fn evaluate(&mut self, pool: &mut HashMap<usize, Value>) -> f64 {
self.sum = self.val;
for i in &self.prevs {
let prev = pool.get_mut(i).unwrap();
self.sum += prev.evaluate(pool);
}
self.sum
}
}
fn main() {
let mut hm: HashMap<usize, Value> = HashMap::new();
for i in 0..10 {
hm.insert(i, Value::new(i));
}
println!("{}", hm.get(&9).unwrap().evaluate(&mut hm));
}
Error:
error[E0499]: cannot borrow `*pool` as mutable more than once at a time
--> src/lib.rs:25:39
|
24 | let prev = pool.get_mut(i).unwrap();
| --------------- first mutable borrow occurs here
25 | self.sum += prev.evaluate(pool);
| -------- ^^^^ second mutable borrow occurs here
| |
| first borrow later used by call
Context
I'm attempting to calculate the output of a neural network (usually done via feedforward) by starting from the output, and recursively evaluating each node, as a weighted sum of the nodes connected to it, with an unpredictable topology. This requires each node having a list of input_nodes, which are keys to a node pool HashMap.
Below is a sample with a few variants:
Non-performant and probably deadlock-prone but compiling version using Arc<Mutex>
High-performance version using Vec and split_at_mut
Highly unsafe, UB and "against-all-good-practices" version using Vec and pointers. At least evaluates to the same number, wanted to add for performance comparison.
#![feature(test)]
extern crate test;
use std::{collections::HashMap, sync::{Arc, Mutex}};
#[derive(Debug)]
struct Value {
val: f64,
prevs: Vec<usize>,
sum: f64,
}
impl Value {
pub fn new(i: usize) -> Value {
let mut res = Value {
val: 0.1,
prevs: Vec::new(),
sum: 0.0,
};
for j in 0..i {
res.prevs.push(j);
}
res
}
pub fn evaluate(&mut self, pool: &mut HashMap<usize, Arc<Mutex<Value>>>) -> f64 {
self.sum = self.val;
for i in &self.prevs {
let val = pool.get_mut(i).unwrap().clone();
self.sum += val.lock().unwrap().evaluate(pool);
}
self.sum
}
pub fn evaluate_split(&mut self, pool: &mut [Value]) -> f64 {
self.sum = self.val;
for i in &self.prevs {
let (hm, val) = pool.split_at_mut(*i);
self.sum += val[0].evaluate_split(hm);
}
self.sum
}
// OBS! Don't do this, horribly unsafe and wrong
pub unsafe fn evaluate_unsafe(&mut self, pool: *const Value, pool_len: usize) -> f64 {
let pool = std::slice::from_raw_parts_mut(pool as *mut Value, pool_len);
self.sum = self.val;
for i in &self.prevs {
let (pool_ptr, pool_len) = (pool.as_ptr(), pool.len());
self.sum += pool[*i].evaluate_unsafe(pool_ptr, pool_len);
}
self.sum
}
}
fn main() {
// arcmutex
let mut hm: HashMap<usize, Arc<Mutex<Value>>> = HashMap::new();
for i in 0..10 {
hm.insert(i, Arc::new(Mutex::new(Value::new(i))));
}
let val = hm.get(&9).unwrap().clone();
assert_eq!(val.lock().unwrap().evaluate(&mut hm), 51.2);
// split vec
let mut hm = (0..10).map(|v| {
Value::new(v)
}).collect::<Vec<_>>();
let (hm, val) = hm.split_at_mut(9);
assert_eq!((hm.len(), val.len()), (9, 1));
assert_eq!(val[0].evaluate_split(hm), 51.2);
}
#[cfg(test)]
mod tests {
use test::bench;
use super::*;
#[bench]
fn bench_arc_mutex(b: &mut bench::Bencher) {
let mut hm: HashMap<usize, Arc<Mutex<Value>>> = HashMap::new();
for i in 0..10 {
hm.insert(i, Arc::new(Mutex::new(Value::new(i))));
}
b.iter(|| {
let val = hm.get(&9).unwrap().clone();
assert_eq!(val.lock().unwrap().evaluate(&mut hm), 51.2);
});
}
#[bench]
fn bench_split(b: &mut bench::Bencher) {
let mut hm = (0..10).map(|v| {
Value::new(v)
}).collect::<Vec<_>>();
b.iter(|| {
let (hm, val) = hm.split_at_mut(9);
assert_eq!(val[0].evaluate_split(hm), 51.2);
});
}
#[bench]
fn bench_unsafe(b: &mut bench::Bencher) {
let mut hm = (0..10).map(|v| {
Value::new(v)
}).collect::<Vec<_>>();
b.iter(|| {
// OBS! Don't do this, horribly unsafe and wrong
let (hm_ptr, hm_len) = (hm.as_ptr(), hm.len());
let val = &mut hm[9];
assert_eq!(unsafe { val.evaluate_unsafe(hm_ptr, hm_len) }, 51.2);
});
}
}
cargo bench results in:
running 3 tests
test tests::bench_arc_mutex ... bench: 13,249 ns/iter (+/- 367)
test tests::bench_split ... bench: 1,974 ns/iter (+/- 70)
test tests::bench_unsafe ... bench: 1,989 ns/iter (+/- 62)
Also, have a look at https://rust-unofficial.github.io/too-many-lists/index.html
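The split_at_mut variant works because the two returned slices are guaranteed not to overlap, so the borrow checker accepts mutating one while reading the other. A minimal sketch of that idea on its own, using a hypothetical helper and plain f64 values instead of the Value struct:

```rust
/// Hypothetical helper: adds every earlier element into the last one.
fn accumulate_into_last(values: &mut [f64]) -> f64 {
    assert!(!values.is_empty());
    let last = values.len() - 1;
    // `earlier` and `rest` are disjoint, so both may be borrowed at once.
    let (earlier, rest) = values.split_at_mut(last);
    for v in earlier.iter() {
        rest[0] += *v;
    }
    rest[0]
}

fn main() {
    let mut values = [1.0, 2.0, 3.0];
    println!("{}", accumulate_into_last(&mut values)); // 6
}
```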

Flatten a Map<Vec<u8>, Vec<u8>> into a Vec<u8> and then return it to a Map<Vec<u8>, Vec<u8>>

I have data in a HashMap<Vec<u8>, Vec<u8>> and I want to write that data to a file as a byte buffer (a single Vec<u8>) and then read it back from the file and reconstruct the HashMap structure.
Is there an established algorithm for flattening and recovering maps like this? I could write metadata into the file to mark where the data partitions are. I can't use structured serialization because of the nature of this project: I am encrypting the data and the file.
You may store this with the following format:
key1_len | key1_bytes | value1_len | value1_bytes | key2_len | key2_bytes | value2_len | value2_bytes | ...
which can be done fairly easily with the standard library:
use std::collections::HashMap;
use std::convert::TryInto;
fn serialize(map: &HashMap<Vec<u8>, Vec<u8>>) -> Vec<u8> {
map.iter().fold(Vec::new(), |mut acc, (k, v)| {
acc.extend(&k.len().to_le_bytes());
acc.extend(k.as_slice());
acc.extend(&v.len().to_le_bytes());
acc.extend(v.as_slice());
acc
})
}
fn read_vec(input: &mut &[u8]) -> Vec<u8> {
let (len, rest) = input.split_at(std::mem::size_of::<usize>());
let len = usize::from_le_bytes(len.try_into().unwrap());
let (v, rest) = rest.split_at(len);
*input = rest;
v.to_vec()
}
fn deserialize(bytes: &Vec<u8>) -> HashMap<Vec<u8>, Vec<u8>> {
let mut map = HashMap::new();
let mut left = &bytes[..];
while left.len() > 0 {
let k = read_vec(&mut left);
let v = read_vec(&mut left);
map.insert(k, v);
}
map
}
fn main() {
let mut map = HashMap::new();
map.insert(vec![1, 2, 3], vec![4, 5, 6]);
map.insert(vec![4, 5, 6], vec![1, 2, 3]);
map.insert(vec![1, 5, 3], vec![4, 2, 6]);
let array = serialize(&map);
let recovered_map = deserialize(&array);
assert_eq!(map, recovered_map);
}
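Two caveats with the code above: usize has a platform-dependent width, so a file written on a 64-bit machine may not read back on a 32-bit one, and split_at panics on truncated input. A hedged variant, assuming a fixed 8-byte little-endian length prefix is acceptable, could look like this. The names write_chunk and read_chunk are hypothetical.

```rust
use std::collections::HashMap;
use std::convert::{TryFrom, TryInto};

// Assumption: every length is stored as a little-endian u64.
const LEN_BYTES: usize = std::mem::size_of::<u64>();

fn write_chunk(out: &mut Vec<u8>, chunk: &[u8]) {
    out.extend_from_slice(&(chunk.len() as u64).to_le_bytes());
    out.extend_from_slice(chunk);
}

/// Reads one length-prefixed chunk, or returns None if the input is too short.
fn read_chunk(input: &mut &[u8]) -> Option<Vec<u8>> {
    let data: &[u8] = *input;
    if data.len() < LEN_BYTES {
        return None;
    }
    let (len_bytes, rest) = data.split_at(LEN_BYTES);
    let len = usize::try_from(u64::from_le_bytes(len_bytes.try_into().ok()?)).ok()?;
    if rest.len() < len {
        return None;
    }
    let (chunk, rest) = rest.split_at(len);
    *input = rest; // advance the caller's cursor past this chunk
    Some(chunk.to_vec())
}

fn serialize(map: &HashMap<Vec<u8>, Vec<u8>>) -> Vec<u8> {
    let mut out = Vec::new();
    for (k, v) in map {
        write_chunk(&mut out, k);
        write_chunk(&mut out, v);
    }
    out
}

fn deserialize(bytes: &[u8]) -> Option<HashMap<Vec<u8>, Vec<u8>>> {
    let mut map = HashMap::new();
    let mut left = bytes;
    while !left.is_empty() {
        let k = read_chunk(&mut left)?;
        let v = read_chunk(&mut left)?;
        map.insert(k, v);
    }
    Some(map)
}

fn main() {
    let mut map = HashMap::new();
    map.insert(vec![1, 2, 3], vec![4, 5, 6]);
    let bytes = serialize(&map);
    assert_eq!(deserialize(&bytes), Some(map));
    // Truncated input is reported instead of panicking.
    assert_eq!(deserialize(&bytes[..bytes.len() - 1]), None);
}
```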

What is the correct pattern for a vector of structs where each struct contains a subset of an array of structs?

I've got code very similar to the following (my filter function is more complex though):
struct MyStruct {
a: i32,
b: i32,
count: i32,
}
impl MyStruct {
fn filter(&self) -> bool {
return self.a > self.b + self.count;
}
}
struct ContainerStruct<'a> {
x: i32,
v: Vec<&'a MyStruct>,
}
fn main() {
let mut list_of_items = vec![
MyStruct {
a: 1,
b: 2,
count: 0,
},
MyStruct {
a: 2,
b: 1,
count: 0,
},
MyStruct {
a: 5,
b: 2,
count: 0,
},
];
let mut count = 0;
let mut list_of_containers: Vec<ContainerStruct> = Vec::new();
while count < 10 {
let mut c = ContainerStruct {
x: 1,
v: Vec::new(),
};
for i in list_of_items.iter_mut() {
i.count = count;
if i.filter() {
c.v.push(i);
}
}
count += 1;
list_of_containers.push(c)
}
}
Which does not compile, due to the following error:
error[E0499]: cannot borrow `list_of_items` as mutable more than once at a time
--> src/main.rs:43:18
|
43 | for i in list_of_items.iter_mut() {
| ^^^^^^^^^^^^^ mutable borrow starts here in previous iteration of loop
I know this is a borrow-checking issue, and I can see the potential problems with references etc. What I don't know is the correct pattern to use to achieve what I'm looking for, which is essentially a vector of structs, where each struct contains a subset of an array of structs.
I need to be able to mutate the structs, so I'm forced into using iter_mut().
However that moves the vector into that scope which then gets released next time I go through the external while loop.
Is there any way to force the vector to live long enough to complete the outer loop? I thought about copying the structs but I don't want to do that. I only need references to each one and copying would introduce an unacceptable overhead due to the size of the vector in question.
This compiles, because Cell lets count be updated through a shared reference, so iter() can replace iter_mut():
use std::cell::Cell;
struct MyStruct {
a: i32,
b: i32,
count: Cell<i32>,
}
impl MyStruct {
fn filter(&self) -> bool {
return self.a > self.b + self.count.get();
}
}
struct ContainerStruct<'a> {
x: i32,
v: Vec<&'a MyStruct>,
}
fn main() {
let mut list_of_items = vec![
MyStruct {
a: 1,
b: 2,
count: Cell::new(0),
},
MyStruct {
a: 2,
b: 1,
count: Cell::new(0),
},
MyStruct {
a: 5,
b: 2,
count: Cell::new(0),
},
];
let mut count = 0;
let mut list_of_containers: Vec<ContainerStruct> = Vec::new();
while count < 10 {
let mut c = ContainerStruct {
x: 1,
v: Vec::new(),
};
for i in list_of_items.iter() {
i.count.set(count);
if i.filter() {
c.v.push(i);
}
}
count += 1;
list_of_containers.push(c)
}
}
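The key property used here is that a Cell can be updated through a shared reference. A stripped-down sketch of just that property, with a hypothetical Item type and bump helper:

```rust
use std::cell::Cell;

struct Item {
    count: Cell<i32>,
}

/// Hypothetical helper: updates `count` through a shared reference.
fn bump(item: &Item) -> i32 {
    item.count.set(item.count.get() + 1);
    item.count.get()
}

fn main() {
    let item = Item { count: Cell::new(0) };
    // Two shared references may coexist, and both can update the Cell.
    let first = &item;
    let second = &item;
    bump(first);
    bump(second);
    println!("{}", item.count.get()); // 2
}
```

One consequence worth keeping in mind: every container holds references to the same items, so all of them observe the most recently set count.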

How would I write this C function in Rust?

How would I write the function below in Rust? Is there a way to write replace() safely or is the operation inherently unsafe? list does not have to be an array, a vector would work as well. It's the replacement operation that I'm interested in.
void replace(int *list[], int a, int b) {
*list[a] = *list[b];
}
I would like the following behavior:
int a = 1;
int b = 2;
int *list[] = { &a, &a, &b, &b };
*list[0] = 3; // list has pointers to values: [3, 3, 2, 2]
replace(list, 2, 0); // list has pointers to values: [3, 3, 3, 3]
*list[0] = 4; // list has pointers to values: [4, 4, 4, 4]
Answer for modified question
Rust does not allow you to have multiple mutable references (aliasing) to the same item. This means you'd never be able to run the equivalent of your third line:
fn main() {
let mut a = 1;
let vals = &[&mut a, &mut a];
}
This fails with:
cannot borrow `a` as mutable more than once at a time
What about using Rc and RefCell?
Rc doesn't let us mutate the value:
A reference-counted pointer type over an immutable value.
(Emphasis mine)
RefCell::borrow_mut won't allow multiple concurrent borrows:
Panics if the value is currently borrowed.
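The RefCell rule quoted above can be observed without a panic by using try_borrow_mut, the non-panicking counterpart of borrow_mut. This is only a sketch; the helper name is hypothetical.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical helper: holds one mutable borrow and attempts a second.
fn second_mutable_borrow_fails(shared: &Rc<RefCell<i32>>) -> bool {
    let _first = shared.borrow_mut();
    // While `_first` is alive, another mutable borrow is refused.
    let failed = shared.try_borrow_mut().is_err();
    failed
}

fn main() {
    let a = Rc::new(RefCell::new(1));
    let alias = Rc::clone(&a);
    println!("{}", second_mutable_borrow_fails(&alias)); // true
    // Once the first guard is dropped, borrowing mutably works again.
    *a.borrow_mut() = 3;
    println!("{}", alias.borrow()); // 3
}
```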
Answer for original question
It's basically the same. I picked a u8 cause it's easier to type. :-)
fn replace(v: &mut [&mut u8], a: usize, b: usize) {
*v[a] = *v[b]
}
fn main() {
let mut vals = vec![1,2,3,4];
{
let mut val_refs: Vec<&mut u8> = vals.iter_mut().collect();
replace(&mut val_refs, 0, 3);
}
println!("{:?}", vals);
}
Rust does bounds checking, so if you call replace with an index beyond the end of the slice, the program will panic rather than corrupt memory.
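If a panic is not wanted, the bounds can be checked explicitly with get and get_mut. The following is a sketch under the assumption that operating on a plain slice of values (rather than a slice of references) is acceptable; try_replace is a hypothetical name.

```rust
/// Hypothetical helper: copies v[b] into v[a], or returns false
/// if either index is out of bounds.
fn try_replace(v: &mut [u8], a: usize, b: usize) -> bool {
    // Read the source value first so the shared borrow ends here.
    let value = match v.get(b) {
        Some(x) => *x,
        None => return false,
    };
    match v.get_mut(a) {
        Some(slot) => {
            *slot = value;
            true
        }
        None => false,
    }
}

fn main() {
    let mut vals = vec![1u8, 2, 3, 4];
    println!("{}", try_replace(&mut vals, 0, 3)); // true
    println!("{:?}", vals); // [4, 2, 3, 4]
    println!("{}", try_replace(&mut vals, 0, 10)); // false
}
```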

Two dimensional vectors in Rust

Editor's note: This question predates Rust 0.1 (tagged 2013-07-03) and is not syntactically valid Rust 1.0 code. Answers may still contain valuable information.
Does anyone know how to create mutable two-dimensional vectors in Rust and pass them to a function to be manipulated?
This is what I tried so far:
extern crate std;
fn promeni(rec: &[u8]) {
rec[0][1] = 0x01u8;
}
fn main() {
let mut rec = ~[[0x00u8,0x00u8],
[0x00u8,0x00u8]
];
io::println(u8::str(rec[0][1]));
promeni(rec);
io::println(u8::str(rec[0][1]));
}
You could use the macro vec! to create 2d vectors.
fn test(vec: &mut Vec<Vec<char>>){
vec[0][0] = 'd';
// ...
vec[23][79] = 'd';
}
fn main() {
let mut vec = vec![vec!['#'; 80]; 24];
test(&mut vec);
}
Did you intend that all of the subarrays will have the length 2, as in this example? In that case, the type of the parameter should not be &[u8], which is a borrowed array of u8's, but rather &[[u8; 2]].
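For reference, a modern-syntax sketch of the question's program with fixed-width rows might look like this. Because the function mutates the data, the parameter is taken as &mut [[u8; 2]] here; that is an adaptation of the suggestion above, not code from the original answer.

```rust
// Takes a mutable slice of two-element rows.
fn promeni(rec: &mut [[u8; 2]]) {
    rec[0][1] = 0x01;
}

fn main() {
    let mut rec = [[0x00u8; 2]; 2];
    println!("{}", rec[0][1]); // 0
    promeni(&mut rec);
    println!("{}", rec[0][1]); // 1
}
```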
If the functions that will manipulate the data are your own, you can create a custom struct with helper methods that treat a flat vector as 2D:
use std::fmt;
#[derive(Debug)]
pub struct Vec2d<T> {
vec: Vec<T>,
row: usize,
col: usize,
}
impl<T> Vec2d<T> {
pub fn new(vec: Vec<T>, row: usize, col: usize) -> Self {
assert!(vec.len() == row * col);
Self { vec, row, col }
}
pub fn row(&self, row: usize) -> &[T] {
let i = self.col * row;
&self.vec[i..(i + self.col)]
}
pub fn index(&self, row: usize, col: usize) -> &T {
let i = self.col * row;
&self.vec[i + col]
}
pub fn index_mut(&mut self, row: usize, col: usize) -> &mut T {
let i = self.col * row;
&mut self.vec[i + col]
}
}
impl<T: std::fmt::Debug> std::fmt::Display for Vec2d<T> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
let mut str = String::new();
for i in 0..self.row {
if i != 0 {
str.push_str(", ");
}
str.push_str(&format!("{:?}", &self.row(i)));
}
write!(f, "[{}]", str)
}
}
fn main() {
let mut mv = Vec2d::new(vec![1, 2, 3, 4, 5, 6], 2, 3);
*mv.index_mut(1, 2) = 10;
println!("Display: {}", mv);
println!("Debug: {:?}", mv);
}
The associated function new creates the Vec2d. It has two main methods, index and index_mut, for borrowing an element immutably or mutably, and a Display implementation that prints it row by row (although the data is stored as a flat Vec<T>).
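If bracket syntax is preferred over method calls, the standard Index and IndexMut traits can be implemented for a (row, col) tuple. The sketch below uses a separate, hypothetical Grid type so that it stands alone; it is not part of the answer's Vec2d.

```rust
use std::ops::{Index, IndexMut};

/// Hypothetical minimal grid, storing rows back to back in one Vec.
struct Grid<T> {
    cells: Vec<T>,
    cols: usize,
}

impl<T> Index<(usize, usize)> for Grid<T> {
    type Output = T;
    fn index(&self, (row, col): (usize, usize)) -> &T {
        // Reject columns that would silently spill into the next row.
        assert!(col < self.cols);
        &self.cells[row * self.cols + col]
    }
}

impl<T> IndexMut<(usize, usize)> for Grid<T> {
    fn index_mut(&mut self, (row, col): (usize, usize)) -> &mut T {
        assert!(col < self.cols);
        &mut self.cells[row * self.cols + col]
    }
}

fn main() {
    let mut grid = Grid { cells: vec![1, 2, 3, 4, 5, 6], cols: 3 };
    grid[(1, 2)] = 10;
    println!("{}", grid[(1, 2)]); // 10
    println!("{:?}", grid.cells); // [1, 2, 3, 4, 5, 10]
}
```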
