Mutating a travelling window in a Rust ndarray

I am attempting to implement one iteration of Conway's Game of Life in Rust using the ndarray library.
I thought a 3x3 window looping over the array would be a simple way to count the living neighbours, however I am having trouble doing the actual update.
The array signifies life with # and the absence of life with a space:
let mut world = Array2::<String>::from_elem((10, 10), " ".to_string());
for mut window in world.windows((3, 3)) {
let count_all = window.fold(0, |count, cell| if cell == "#" { count + 1 } else { count });
let count_neighbours = count_all - if window[(1, 1)] == "#" { 1 } else { 0 };
match count_neighbours {
0 | 1 => window[(1, 1)] = " ".to_string(), // Under-population
2 => {}, // Live if alive
3 => window[(1, 1)] = "#".to_string(), // Re-produce
_ => window[(1, 1)] = " ".to_string(), // Over-population
}
}
This code does not compile! The error is within the match block with "error: cannot borrow as mutable" and "error: cannot assign to immutable index". I attempted for &mut window... but the library does not implement this(?)
I'm relatively new to Rust and I believe this may be an issue with the implementation of windows by the library. However, I'm not sure, and I don't know if there is perhaps some variation or fix that allows me to continue with this approach. Do I need to scrap this approach entirely? I'm not sure what the best approach would be here.
Any other suggestions or improvements to the code would be greatly appreciated.
(This code doesn't implement proper rules as I am mutating as I loop and I am ignoring the outer edge, however that is okay in this case. Also, any variations which do these things are also okay - the details are not important.)

Your general approach using ndarray and windows is ok, but the problem is that the values that you get from the windows iterator will always be immutable. You can work around that by wrapping the values in Cell or RefCell, which gives you interior mutability. That is, they wrap a value as if it was immutable, but provide an API to let you mutate it anyway.
Here is your code, fairly brutally adapted to use RefCell:
use ndarray::Array2;
use std::cell::RefCell;
fn main() {
// creating variables for convenience, so they can be &-referenced
let alive = String::from("#");
let dead = String::from(" ");
let world = Array2::<String>::from_elem((10, 10), " ".to_string());
let world = world.map(RefCell::new);
for window in world.windows((3, 3)) {
let count_all = window.fold(0, |count, cell| if *cell.borrow() == &alive { count + 1 } else { count });
let count_neighbours = count_all - if *window[(1, 1)].borrow() == &alive { 1 } else { 0 };
match count_neighbours {
0 | 1 => *window[(1, 1)].borrow_mut() = &dead, // Under-population
2 => {}, // Live if alive
3 => *window[(1, 1)].borrow_mut() = &alive, // Re-produce
_ => *window[(1, 1)].borrow_mut() = &dead, // Over-population
}
}
}
What I've done above is really just to get your code working, pretty much as-is. But, as E_net4 pointed out, your solution has a major bug because it is mutating as it reads. Also, in terms of best practices, your usage of String isn't ideal. An enum is much better because it's smaller, can be stack-allocated and better captures the invariants of your model. With an enum you would derive Copy as below, which would let you use a Cell instead of a RefCell. That is likely to perform better, because a Cell simply copies the value in and out, whereas a RefCell has to track outstanding borrows at runtime.
#[derive(Debug, PartialEq, Clone, Copy)]
enum CellState {
Alive,
Dead
}
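One way to avoid the mutate-while-reading bug is to write the next generation into a separate buffer. The following is a minimal sketch of that idea using only the standard library (nested Vecs rather than ndarray), so it is not the code from the answer above. The `step` function and the decision to leave border cells untouched are assumptions made for illustration.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum CellState {
    Alive,
    Dead,
}

/// Hypothetical helper: computes one generation into a fresh grid, so every
/// neighbour count is taken from the old state only.
/// Border cells are copied unchanged, mirroring the 3x3 window approach.
fn step(world: &[Vec<CellState>]) -> Vec<Vec<CellState>> {
    let mut next = world.to_vec();
    let rows = world.len();
    let cols = world.first().map_or(0, |row| row.len());
    for r in 1..rows.saturating_sub(1) {
        for c in 1..cols.saturating_sub(1) {
            // Count the living cells in the 3x3 block, skipping the centre.
            let mut neighbours = 0;
            for dr in 0..3 {
                for dc in 0..3 {
                    if (dr, dc) != (1, 1) && world[r + dr - 1][c + dc - 1] == CellState::Alive {
                        neighbours += 1;
                    }
                }
            }
            next[r][c] = match (world[r][c], neighbours) {
                // Survive with two neighbours, live or be born with three.
                (CellState::Alive, 2) | (_, 3) => CellState::Alive,
                // Under-population, over-population, or still dead.
                _ => CellState::Dead,
            };
        }
    }
    next
}
```

Calling `step(&world)` leaves `world` untouched and returns the new grid, so the caller can swap the two buffers between generations.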

Related

Why is returning a cloned ndarray throwing an overflow error(exceeded max recursion limit)?

I'm currently trying to write a function that is generally equivalent to numpy's tile. Currently, each time I try to return an (altered or unaltered) clone of the input array, I get a warning about an overflow, and cargo prompts me to increase the recursion limit. However, this function isn't recursive, so I'm assuming it's happening somewhere in the implementation.
Here is the stripped-down function (full version):
pub fn tile<A, D1, D2>(arr: &Array<A, D1>, reps: Vec<usize>) -> Array<A, D2>
where
A: Clone,
D1: Dimension,
D2: Dimension,
{
let num_of_reps = reps.len();
//just clone the array if reps is all ones
let mut res = arr.clone();
let bail_flag = true;
for &x in reps.iter() {
if x != 1 {
bail_flag = false;
}
}
if bail_flag {
let mut res_dim = res.shape().to_owned();
_new_shape(num_of_reps, res.ndim(), &mut res_dim);
res.to_shape(res_dim);
return res;
}
...
//otherwise do extra work
...
return res.reshape(shape_out);
}
this is the actual error I'm getting on returning res:
overflow evaluating the requirement `&ArrayBase<_, _>: Neg`
consider increasing the recursion limit by adding a `#![recursion_limit = "1024"]` attribute to your crate (`mfcc_2`)
required because of the requirements on the impl of `Neg` for `&ArrayBase<_, _>`
511 redundant requirements hidden
required because of the requirements on the impl of `Neg` for `&ArrayBase<OwnedRepr<A>, D1>` rustc E0275
I looked at the implementation of Neg in ndarray, it doesn't seem to be recursive, so I'm a little confused as to what is going on.
P.S. I'm aware there are other errors in this code, as those appeared after I switched from A to f64 (the actual type I plan on using the function with), but those are mostly trivial to fix. Still, if you have suggestions on any error you see, I'd appreciate them nonetheless.

How to split a Vec<u8> by a sequence of chars?

I want to extract the payload of a HTTP request as a Vec<u8>. In the request, the payload is separated from the rest by the sequence \r\n\r\n, that's why I want to split my Vec at this position, and take the second element.
My current solution is to use the following function I wrote.
fn find_payload_index(buffer: &Vec<u8>) -> usize {
for (pos, e) in buffer.iter().enumerate() {
if pos < 3 {
continue
}
if buffer[pos - 3] == 13 && buffer[pos - 2] == 10 && buffer[pos - 1] == 13 && buffer[pos] == 10 {
return pos + 1;
}
}
0
}
13 is the ASCII value of \r and 10 the value of \n. I then split by the returned index. While this solution is technically working, it feels very unclean, and I was wondering how to do this in a more elegant way.
First off:
A function should almost never have a &Vec<_> parameter.
See Why is it discouraged to accept a reference to a String (&String), Vec (&Vec), or Box (&Box) as a function argument?.
Don't use the magic values 10 and 13, Rust supports byte literals: b'\r' and b'\n'.
As for your question: I believe you can make it a bit simpler using windows and matches! with a byte string literal pattern:
fn find_payload_index(buffer: &[u8]) -> Option<usize> {
buffer
.windows(4)
.enumerate()
.find(|(_, w)| matches!(*w, b"\r\n\r\n"))
.map(|(i, _)| i)
}
Permalink to the playground with test cases.
Note that slice has a starts_with method which will more easily do what you want:
fn find_payload_index(buffer: &[u8]) -> usize {
for i in 0..buffer.len() {
if buffer[i..].starts_with(b"\r\n\r\n") {
return i
}
}
panic!("malformed buffer without the sequence")
}
I see no reason to use enumerate if the element itself is never used; simply looping over 0..buffer.len() seems the easiest solution to me.
I have also elected to make the function panic, rather than return 0, when the buffer is malformed, which I believe is more proper. In the end you should probably return some kind of Result value and handle the error case cleanly, but you should never return 0 in this case.
A shorter alternative to @mccarton's answer would be to use position:
fn find_payload_index(buffer: &[u8]) -> Option<usize> {
buffer
.windows(4)
.position(|arr| arr == b"\r\n\r\n")
}
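Note that the windows-based versions above return the index where the \r\n\r\n separator starts, whereas the question's function returned the index just past it. As a rough sketch of the final step, the hypothetical helper below (`split_payload` is not from any of the answers) uses the same windows and position calls and returns both halves, or None if the separator is missing:

```rust
/// Hypothetical helper: splits a raw HTTP request into (head, payload)
/// at the first blank line, or returns None if there is no separator.
fn split_payload(buffer: &[u8]) -> Option<(&[u8], &[u8])> {
    const SEP: &[u8] = b"\r\n\r\n";
    // Index of the first byte of the separator.
    let start = buffer.windows(SEP.len()).position(|w| w == SEP)?;
    // Skip the separator itself so the payload starts right after it.
    Some((&buffer[..start], &buffer[start + SEP.len()..]))
}
```

Returning an Option leaves it to the caller to decide what a missing separator means, instead of panicking or returning 0.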

How to convert a vector of vectors into a vector of slices without creating a new object? [duplicate]

I have the following:
enum SomeType {
VariantA(String),
VariantB(String, i32),
}
fn transform(x: SomeType) -> SomeType {
// very complicated transformation, reusing parts of x in order to produce result:
match x {
SomeType::VariantA(s) => SomeType::VariantB(s, 0),
SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
}
}
fn main() {
let mut data = vec![
SomeType::VariantA("hello".to_string()),
SomeType::VariantA("bye".to_string()),
SomeType::VariantB("asdf".to_string(), 34),
];
}
I would now like to call transform on each element of data and store the resulting value back in data. I could do something like data.into_iter().map(transform).collect(), but this will allocate a new Vec. Is there a way to do this in-place, reusing the allocated memory of data? There once was Vec::map_in_place in Rust but it has been removed some time ago.
As a work-around, I've added a Dummy variant to SomeType and then do the following:
for x in &mut data {
let original = ::std::mem::replace(x, SomeType::Dummy);
*x = transform(original);
}
This does not feel right, and I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop. Is there a better way of doing this?
Your first problem is not map, it's transform.
transform takes ownership of its argument, while the Vec has ownership of its elements. One of the two has to give, and poking a hole in the Vec would be a bad idea: what if transform panics?
The best fix, thus, is to change the signature of transform to:
fn transform(x: &mut SomeType) { ... }
then you can just do:
for x in &mut data { transform(x) }
Other solutions will be clunky, as they will need to deal with the fact that transform might panic.
No, it is not possible in general because the size of each element might change as the mapping is performed (fn transform(u8) -> u32).
Even when the sizes are the same, it's non-trivial.
In this case, you don't need to create a Dummy variant because creating an empty String is cheap; only 3 pointer-sized values and no heap allocation:
impl SomeType {
fn transform(&mut self) {
use SomeType::*;
let old = std::mem::replace(self, VariantA(String::new()));
// Note this line for the detailed explanation
*self = match old {
VariantA(s) => VariantB(s, 0),
VariantB(s, i) => VariantB(s, 2 * i),
};
}
}
for x in &mut data {
x.transform();
}
An alternate implementation that just replaces the String:
impl SomeType {
fn transform(&mut self) {
use SomeType::*;
*self = match self {
VariantA(s) => {
let s = std::mem::replace(s, String::new());
VariantB(s, 0)
}
VariantB(s, i) => {
let s = std::mem::replace(s, String::new());
VariantB(s, 2 * *i)
}
};
}
}
In general, yes, you have to create some dummy value to do this generically and with safe code. Many times, you can wrap your whole element in Option and call Option::take to achieve the same effect.
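As a minimal sketch of the Option::take idea, assuming the elements are stored as Option<SomeType> (the `map_in_place` helper here is hypothetical and unrelated to the removed Vec::map_in_place):

```rust
#[derive(Debug, PartialEq)]
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}

fn transform(x: SomeType) -> SomeType {
    match x {
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}

/// Hypothetical helper: maps every occupied slot in place.
/// `take` leaves `None` in the slot while `f` runs, so if `f` panics
/// the slot still holds a valid, known value.
fn map_in_place(data: &mut [Option<SomeType>], f: impl Fn(SomeType) -> SomeType) {
    for slot in data.iter_mut() {
        if let Some(old) = slot.take() {
            *slot = Some(f(old));
        }
    }
}
```

Here None plays the role of the Dummy variant, at the cost of every other user of the data having to handle the Option.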
See also:
Change enum variant while moving the field to the new variant
Why is it so complicated?
See this proposed and now-closed RFC for lots of related discussion. My understanding of that RFC (and the complexities behind it) is that there's a time period where your value would be in an undefined state, which is not safe. If a panic were to happen at that exact moment, then when your value is dropped, you might trigger undefined behavior, a bad thing.
If your code were to panic at the commented line, then the value of self is a concrete, known value. If it were some unknown value, dropping that string would try to drop that unknown value, and we are back in C. This is the purpose of the Dummy value - to always have a known-good value stored.
You even hinted at this (emphasis mine):
I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop
That "should" is the problem. During a panic, that dummy value is visible.
See also:
How can I swap in a new value for a field in a mutable reference to a structure?
Temporarily move out of borrowed content
How do I move out of a struct field that is an Option?
The now-removed implementation of Vec::map_in_place spans almost 175 lines of code, most of it dealing with unsafe code and reasoning about why it is actually safe! Some crates have re-implemented this concept and attempted to make it safe; you can see an example in Sebastian Redl's answer.
You can write a map_in_place in terms of the take_mut or replace_with crates:
fn map_in_place<T, F>(v: &mut [T], f: F)
where
F: Fn(T) -> T,
{
for e in v {
take_mut::take(e, &f); // pass by reference so `f` is not moved on the first iteration
}
}
However, if this panics in the supplied function, the program aborts completely; you cannot recover from the panic.
Alternatively, you could supply a placeholder element that sits in the empty spot while the inner function executes:
use std::mem;
fn map_in_place_with_placeholder<T, F>(v: &mut [T], f: F, mut placeholder: T)
where
F: Fn(T) -> T,
{
for e in v {
let mut tmp = mem::replace(e, placeholder);
tmp = f(tmp);
placeholder = mem::replace(e, tmp);
}
}
If this panics, the placeholder you supplied will sit in the panicked slot.
Finally, you could produce the placeholder on-demand; basically replace take_mut::take with take_mut::take_or_recover in the first version.

"exponentially large number of cases" errors in latest Flow with common spread pattern

I frequently use the following pattern to create objects with null/undefined properties omitted:
const whatever = {
something: true,
...(a ? { a } : null),
...(b ? { b } : null),
};
As of flow release v0.112, this leads to the error message:
Computing object literal [1] may lead to an exponentially large number of cases to reason about because conditional [2] and conditional [3] are both unions. Please use at most one union type per spread to simplify reasoning about the spread result. You may be able to get rid of a union by specifying a more general type that captures all of the branches of the union.
It sounds to me like this isn't really a type error, just that Flow is trying to avoid some heavier computation. This has led to dozens of flow errors in my project that I need to address somehow. Is there some elegant way to provide better type information for these? I'd prefer not to modify the logic of the code, I believe that it works the way that I need it to (unless someone has a more elegant solution here as well). Asking here before I resort to // $FlowFixMe for all of these.
Complete example on Try Flow
It's not as elegant to write, and I think Flow should handle the case that you've shown, but if you still want Flow to type check it you could try rewriting it like this:
/* @flow */
type A = {
cat: number,
};
type B = {
dog: string,
}
type Built = {
something: boolean,
a?: A,
b?: B,
};
function buildObj(a?: ?A, b?: ?B): Built {
const ret: Built = {
something: true
};
if(a) ret.a = a
if(b) ret.b = b
return ret;
}
Try Flow

Is there a better functional way to process a vector with error checking?

I'm learning Rust and would like to know how I can improve the code below.
I have a vector of tuples of form (u32, String). The u32 values represent line numbers and the Strings are the text on the corresponding lines. As long as all the String values can be successfully parsed as integers, I want to return an Ok<Vec<i32>> containing just the parsed values, but if not I want to return an error of some form (just an Err<String> in the example below).
I'm trying to learn to avoid mutability and use functional styles where appropriate, and the above is straightforward to do functionally if that was all that was needed. Here's what I came up with in this case:
fn data_vals(sv: &Vec<(u32, String)>) -> Result<Vec<i32>, String> {
sv.iter()
.map(|s| s.1.parse::<i32>()
.map_err(|_e| "*** Invalid data.".to_string()))
.collect()
}
However, the small catch is that I want to print an error message for every invalid value (and not just the first one), and the error messages should contain both the line number and the string values in the offending tuple.
I've managed to do it with the following code:
fn data_vals(sv: &Vec<(u32, String)>) -> Result<Vec<i32>, String> {
sv.iter()
.map(|s| (s.0, s.1.parse::<i32>()
.or_else(|e| {
eprintln!("ERROR: Invalid data value at line {}: '{}'",
s.0, s.1);
Err(e)
})))
.collect::<Vec<(u32, Result<i32, _>)>>() // Collect here to avoid short-circuit
.iter()
.map(|i| i.1
.clone()
.map_err(|_e| "*** Invalid data.".to_string()))
.collect()
}
This works, but seems rather messy and cumbersome - especially the typed collect() in the middle to avoid short-circuiting so all the errors are printed. The clone() call is also annoying, and I'm not really sure why it's needed - the compiler says I'm moving out of borrowed content otherwise, but I'm not really sure what's being moved. Is there a way it can be done more cleanly? Or should I go back to a more procedural style? When I tried, I ended up with mutable variables and a flag to indicate success and failure, which seems less elegant:
fn data_vals(sv: &Vec<(u32, String)>) -> Result<Vec<i32>, String> {
let mut datavals = Vec::new();
let mut success = true;
for s in sv {
match s.1.parse::<i32>() {
Ok(v) => datavals.push(v),
Err(_e) => {
eprintln!("ERROR: Invalid data value at line {}: '{}'",
s.0, s.1);
success = false;
},
}
}
if success {
return Ok(datavals);
} else {
return Err("*** Invalid data.".to_string());
}
}
Can someone advise me on the best way to do this? Should I stick to the procedural style here, and if so can that be improved? Or is there a cleaner functional way to do it? Or a blend of the two? Any advice appreciated.
I think that's what partition_map() from itertools is for:
use itertools::{Either, Itertools};
fn data_vals<'a>(sv: &[&'a str]) -> Result<Vec<i32>, Vec<(&'a str, std::num::ParseIntError)>> {
let (successes, failures): (Vec<_>, Vec<_>) =
sv.iter().partition_map(|s| match s.parse::<i32>() {
Ok(v) => Either::Left(v),
Err(e) => Either::Right((*s, e)),
});
if !failures.is_empty() {
Err(failures)
} else {
Ok(successes)
}
}
fn main() {
let numbers = vec!["42", "aaaezrgggtht", "..4rez41eza", "55"];
println!("{:#?}", data_vals(&numbers));
}
In a purely functional style, you have to avoid side-effects.
Printing errors is a side-effect. The preferred style would be to return an object of the style:
Result<Vec<i32>, Vec<String>>
and print the list after the data_vals function returns.
So, essentially, you want your processing to collect a list of integers, and a list of strings:
fn data_vals(sv: &Vec<(u32, String)>) -> Result<Vec<i32>, Vec<String>> {
let (ok, err): (Vec<_>, Vec<_>) = sv
.iter()
.map(|(i, s)| {
s.parse()
.map_err(|_e| format!("ERROR: Invalid data value at line {}: '{}'", i, s))
})
.partition(|e| e.is_ok());
if err.len() > 0 {
Err(err.iter().filter_map(|e| e.clone().err()).collect())
} else {
Ok(ok.iter().filter_map(|e| e.clone().ok()).collect())
}
}
fn main() {
let input = vec![(1, "0".to_string())];
let r = data_vals(&input);
assert_eq!(r, Ok(vec![0]));
let input = vec![(1, "zzz".to_string())];
let r = data_vals(&input);
assert_eq!(r, Err(vec!["ERROR: Invalid data value at line 1: 'zzz'".to_string()]));
}
Playground Link
This uses partition which does not depend on an external crate.
Side effects (eprintln!) in an iterator adapter are definitely not "functional". You should accumulate and return the errors and let the caller deal with them.
I would use fold here. The goal of fold is to reduce a list to a single value, starting from an initial value and augmenting the result with every item. This "single value" can very well be a list, though. Here, though, there are two possible lists we might want to return: a list of i32 if all values are valid, or a list of errors if there are any errors (I've chosen to return Strings for errors here, for simplicity.)
fn data_vals(sv: &[(u32, String)]) -> Result<Vec<i32>, Vec<String>> {
sv.iter().fold(
Ok(Vec::with_capacity(sv.len())),
|acc, (line_number, data)| {
let data = data
.parse::<i32>()
.map_err(|_| format!("Invalid data value at line {}: '{}'", line_number, data));
match (acc, data) {
(Ok(mut acc_data), Ok(this_data)) => {
// No errors yet; push the parsed value to the values vector.
acc_data.push(this_data);
Ok(acc_data)
}
(Ok(..), Err(this_error)) => {
// First error: replace the accumulator with an `Err` containing the first error.
Err(vec![this_error])
}
(Err(acc_errors), Ok(..)) => {
// There have been errors, but this item is valid; ignore it.
Err(acc_errors)
}
(Err(mut acc_errors), Err(this_error)) => {
// One more error: push it to the error vector.
acc_errors.push(this_error);
Err(acc_errors)
}
}
},
)
}
fn main() {
println!("{:?}", data_vals(&[]));
println!("{:?}", data_vals(&[(1, "123".into())]));
println!("{:?}", data_vals(&[(1, "123a".into())]));
println!("{:?}", data_vals(&[(1, "123".into()), (2, "123a".into())]));
println!("{:?}", data_vals(&[(1, "123a".into()), (2, "123".into())]));
println!("{:?}", data_vals(&[(1, "123a".into()), (2, "123b".into())]));
}
The initial value is Ok(Vec::with_capacity(sv.len())) (this is an optimization to avoid reallocating the vector as we push items to it; a simpler version would be Ok(vec![])). If the slice is empty, this will be fold's result; the closure will never be called.
For each item, the closure checks 1) whether there were any errors so far (indicated by the accumulator value being an Err) or not and 2) whether the current item is valid or not. I'm matching on two Result values simultaneously (by combining them in a tuple) to handle all 4 cases. The closure then returns an Ok if there are no errors so far (with all the parsed values so far) or an Err if there are any errors so far (with every invalid value found so far).
You'll notice I used the push method to add an item to a Vec. This is, strictly speaking, mutation, which is not considered "functional", but because we are moving the Vecs here, we know there are no other references to them, so we know we aren't affecting any other use of these Vecs.
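For comparison, the procedural version from the question can also return every error without the success flag. The following is only a sketch under the same assumptions as the fold answer (errors are returned as Strings); the name `data_vals_loop` is made up for illustration:

```rust
/// Hypothetical loop-based variant: collects parsed values and error
/// messages side by side, then decides which one to return.
fn data_vals_loop(sv: &[(u32, String)]) -> Result<Vec<i32>, Vec<String>> {
    let mut values = Vec::with_capacity(sv.len());
    let mut errors = Vec::new();
    for (line_number, data) in sv {
        match data.parse::<i32>() {
            Ok(v) => values.push(v),
            Err(_) => errors.push(format!(
                "Invalid data value at line {}: '{}'",
                line_number, data
            )),
        }
    }
    // Any error at all makes the whole result an Err carrying all messages.
    if errors.is_empty() {
        Ok(values)
    } else {
        Err(errors)
    }
}
```

The mutation stays local to the function, and the caller decides whether and how to print the collected messages.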
