I have a Board (a.k.a. &mut Vec<Vec<Cell>>) which I would like to update while iterating over it. The new value I want to update with is derived from a function that requires a &Vec<Vec<Cell>> to the very collection I'm updating.
I have tried several things:
Use board.iter_mut().enumerate() and row.iter_mut().enumerate() so that I could update the cell in the innermost loop. Rust does not allow calling the next_gen function there because it requires a &Vec<Vec<Cell>>, and you cannot take an immutable reference while you already hold a mutable one.
Change the next_gen function signature to accept a &mut Vec<Vec<Cell>>. Rust does not allow multiple mutable references to an object.
I'm currently deferring all the updates to a HashMap and then applying them after I've performed my iteration:
fn step(board: &mut Board) {
    let mut cells_to_update: HashMap<(usize, usize), Cell> = HashMap::new();
    for (row_index, row) in board.iter().enumerate() {
        for (column_index, cell) in row.iter().enumerate() {
            let cell_next = next_gen((row_index, column_index), &board);
            if *cell != cell_next {
                cells_to_update.insert((row_index, column_index), cell_next);
            }
        }
    }
    println!("To Update: {:?}", cells_to_update);
    for ((row_index, column_index), cell) in cells_to_update {
        board[row_index][column_index] = cell;
    }
}
Is there a way that I could make this code update the board "in place", that is, inside the innermost loop while still being able to call next_gen inside the innermost loop?
Disclaimer:
I'm learning Rust and I know this is not the best way to do this. I'm playing around to see what I can and cannot do. I'm also trying to limit any copying to restrict myself a little bit. As oli_obk - ker mentions, this implementation for Conway's Game of Life is flawed.
This code was intended to gauge a couple of things:
if this is even possible
if it is idiomatic Rust
From what I have gathered in the comments, it is possible with std::cell::Cell. However, using std::cell::Cell circumvents some of the core Rust principles, which I described as my "dilemma" in the original question.
Is there a way that I could make this code update the board "in place"?
There exists a type specially made for situations such as these. It's coincidentally called std::cell::Cell. You're allowed to mutate the contents of a Cell even when it has been immutably borrowed multiple times. Cell is limited to types that implement Copy (for others you have to use RefCell, and if multiple threads are involved then you must use an Arc in combination with something like a Mutex).
use std::cell::Cell;

fn main() {
    let board = vec![Cell::new(0), Cell::new(1), Cell::new(2)];
    for a in board.iter() {
        for b in board.iter() {
            a.set(a.get() + b.get());
        }
    }
    println!("{:?}", board);
}
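For the non-Copy case mentioned above, here is a minimal sketch of the RefCell variant (the example values are made up); the borrow rules are checked at runtime instead of compile time:
use std::cell::RefCell;

fn main() {
    let words = vec![RefCell::new(String::from("a")), RefCell::new(String::from("b"))];
    for w in &words {
        // Mutate through a shared reference; panics at runtime if
        // a conflicting borrow were active.
        w.borrow_mut().push('!');
    }
    println!("{:?}", words);
}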
It entirely depends on your next_gen function. Assuming we know nothing about the function except its signature, the easiest way is to use indices:
fn step(board: &mut Board) {
    for row_index in 0..board.len() {
        for column_index in 0..board[row_index].len() {
            let cell_next = next_gen((row_index, column_index), &board);
            if board[row_index][column_index] != cell_next {
                board[row_index][column_index] = cell_next;
            }
        }
    }
}
With more information about next_gen a different solution might be possible, but it sounds a lot like a cellular automaton to me, and to the best of my knowledge this cannot be done in an iterator-based way in Rust without changing the type of Board.
You might fear that the indexing solution will be less efficient than an iterator solution, but you should trust LLVM on this. In case your next_gen function is in another crate, you should mark it #[inline] so LLVM can optimize it too (not necessary if everything is in one crate).
Not an answer to your question, but to your problem:
Since you are implementing Conway's Game of Life, you cannot do the modification in-place. Imagine the following pattern:
00000
00100
00100
00100
00000
If you update row 2 first, its 1 becomes a 0, because it has only one live neighbor. When you then update the middle cell, it sees only one live neighbor (the 1 below it) instead of the two that were there to begin with, so it incorrectly dies as well. Therefore you always need to either make a copy of the entire Board, or, as you did in your code, write all the changes to some other location, and splice them in after going through the entire board.
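A minimal double-buffer sketch of that first option, assuming the question's Board, Cell, and next_gen definitions, and additionally assuming that Cell implements Clone:
fn step_buffered(board: &mut Board) {
    // Snapshot the current generation so next_gen always reads old values.
    let snapshot: Board = board.clone();
    for row_index in 0..board.len() {
        for column_index in 0..board[row_index].len() {
            // Write the new generation in place while reading the snapshot.
            board[row_index][column_index] = next_gen((row_index, column_index), &snapshot);
        }
    }
}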
I'm looking for a method that consumes a Vec and returns one element, without the overhead of restoring Vec's invariants the way remove and swap_remove do:
fn take<T>(vec: Vec<T>, index: usize) -> Option<T>
However, I can't find such a method. Am I missing something? Is this actually unsafe or impossible?
This is a different question from Built in *safe* way to move out of Vec<T>?
There the goal was a remove method that didn't panic on out of bounds access and returned a Result. I'm looking for a method that consumes a Vec and returns one of the elements. None of the answers to the above question address my question.
You can write your function like this:
fn take<T>(mut vec: Vec<T>, index: usize) -> Option<T> {
    if vec.get(index).is_none() {
        None
    } else {
        Some(vec.swap_remove(index))
    }
}
The code you see here (get and swap_remove) is guaranteed O(1).
However, kind of hidden, vec is dropped at the end of the function and this drop operation is likely not O(1), but O(n) (where n is vec.len()). If T implements Drop, then drop() is called for every element still inside the vector, meaning dropping the vector is guaranteed O(n). If T does not implement Drop, then the Vec only needs to deallocate the memory. The time complexity of the dealloc operation depends on the allocator and is not specified, so we cannot assume it is O(1).
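To see that drop cost in action, here is a small demonstration using the take defined above and a hypothetical Noisy type that reports its own destruction:
struct Noisy(u32);

impl Drop for Noisy {
    fn drop(&mut self) {
        println!("dropping Noisy({})", self.0);
    }
}

fn main() {
    let v = vec![Noisy(0), Noisy(1), Noisy(2)];
    let taken = take(v, 1);
    // "dropping Noisy(0)" and "dropping Noisy(2)" have already been
    // printed: the rest of the vector died inside `take`.
    println!("took index 1");
    drop(taken); // prints "dropping Noisy(1)"
}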
To mention another solution using iterators:
fn take<T>(vec: Vec<T>, index: usize) -> Option<T> {
    vec.into_iter().nth(index)
}
I was about to write this:
While Iterator::nth() usually is a linear time operation, the iterator over a vector overrides this method to make it a O(1) operation.
But then I noticed that this is only true for the iterator which iterates over slices. The std::vec::IntoIter iterator that would be used in the code above doesn't override nth(). It has been attempted here, but it doesn't seem to be that easy.
So, as of right now, the iterator solution above is an O(n) operation! Not to mention the time needed to drop the vector, as explained above.
The reason fn take<T>(vec: Vec<T>, index: usize) -> Option<T> does not exist in the standard library is that it is not very useful in general. For example, supposing that you have a Vec<String> of length 10, it means throwing away 9 strings and only using 1. This seems wasteful.
In general, the standard library tries to provide an API that is useful in as many scenarios as possible, and in this instance it would be more logical to have a fn take<T>(vec: &mut Vec<T>, index: usize) -> Option<T>.
The only question is how to preserve the invariant, of course:
it can be preserved by exchanging with the last element, which is what Vec::swap_remove does,
it can be preserved by shifting the successor elements in, which is what Vec::drain does.
Those are very flexible, and can be adapted to fill more specific scenarios, such as yours.
Adapting swap_remove:
fn take<T>(mut vec: Vec<T>, index: usize) -> Option<T> {
    if index < vec.len() {
        Some(vec.swap_remove(index))
    } else {
        None
    }
}
Adapting drain:
fn take<T>(mut vec: Vec<T>, index: usize) -> Option<T> {
    if index < vec.len() {
        vec.drain(index..index + 1).next()
    } else {
        None
    }
}
Note that the former is more efficient: it's O(1).
I'm looking for a method that consumes the Vec and returns one element, without the overhead of restoring Vec's invariants the way remove and swap_remove do.
This reeks of premature micro-optimization to me.
First of all, note that it is necessary to destroy the elements of the vector; you can accomplish this in two ways:
swap_remove, then iterate over each element to destroy them,
Iterate over each element to destroy them, skipping the specific index.
It is not clear to me that the latter would be faster than the former; if anything it looks more complicated, with more branches (I advise two loops), which may throw off the predictor and may be less amenable to vectorization.
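For illustration only, here is a safe sketch of that second strategy with the two loops made explicit (the name take_skip is made up, and this is not claimed to be faster than swap_remove):
fn take_skip<T>(vec: Vec<T>, index: usize) -> Option<T> {
    if index >= vec.len() {
        return None;
    }
    let mut iter = vec.into_iter();
    // First loop: drop the elements before `index`.
    for _ in 0..index {
        iter.next();
    }
    let result = iter.next();
    // Second loop: drop the elements after `index`.
    for _ in iter {}
    result
}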
Secondly, before complaining about the overhead of restoring the Vec's invariant, have you properly profiled the solution?
If we look at the swap_remove variant, there are 3 steps:
swap_remove (O(1)),
destroy each remaining element (O(N)),
free the backing memory.
Step 2 may be optimized out if the element has no Drop implementation, but otherwise I would bet it's a toss-up whether (2) or (3) dominates the cost.
TL;DR: I am afraid that you are fighting ghost issues, profile before trying to optimize.
I've been reading about functional programming and its concepts. It's clear to me that in big projects you always need to mix (at some adequate level) multiple paradigms such as OO and functional. In theory, concepts such as function purity are quite strict, for example:
The function always evaluates the same result value given the same argument value(s). The function result value cannot depend on any hidden information or state that may change while program execution proceeds or between different executions of the program, nor can it depend on any external input from I/O devices. (https://en.wikipedia.org/wiki/Pure_function)
That said, is this code a pure function (or can it be considered one)?
const externalVar = 10;

function timesTen(value) {
    return externalVar * value;
}
I'm asking this because, in this case, the timesTen function will always return the same value for a given input, and no one can change the value of externalVar, as it is a constant. However, this code breaks the rule of accessing external function's scope.
Yes. It is guaranteed to be pure.
The reason is that it only depends on bound and immutable free variables.
However, this code breaks the rule of accessing external function's scope.
There is nothing in your quote that says you cannot access free variables. It says external input, as in reading from a file, the network, etc., not a free variable from an enclosing scope.
Even Haskell uses global function names like foldr; it is a free variable in every function where it is used, and of course the result is pure.
Remember that named functions are just variables. parseInt is a variable that points to a function, so it would be hard to get anything done at all if every function you use inside another function had to be passed as a parameter.
If you redefine parseInt to something that is not pure, or redefine it during the run of your program so that it works differently, then no function calling it would be pure.
Function composition and partial evaluation work because they supply free variables. It's an essential method of abstraction in functional programming. E.g.:
function compose(f2, f1) {
    return (...args) => f2(f1(...args));
}

function makeAdder(initialValue) {
    return v => v + initialValue;
}

const add11 = compose(makeAdder(10), makeAdder(1));
add11(5); // ==> 16
This is pure. The closure variables / free variables f1, f2, and initialValue never change for the created functions. add11 is a pure function.
Now look at compose again. It looks pure, but it can be tainted: if the functions passed to it are not both pure, the result isn't either.
OO can be purely functional too!
They can easily be combined by not mutating the objects you create.
class FunctionalNumber {
    constructor(value) {
        this.value = value;
    }

    add(fn) {
        return new FunctionalNumber(this.value + fn.value);
    }

    sub(fn) {
        return new FunctionalNumber(this.value - fn.value);
    }
}
This class is purely functional.
In fact, you can think of a method call like obj.someMethod(arg1, arg2) as a function call with obj as the first argument: someFunction(obj, arg1, arg2). The difference is only syntactic, and if someFunction mutated obj you would have said it was not pure. The same holds for someMethod and obj.
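As an aside, Rust makes this "a method call is a function call with the receiver as first argument" equivalence explicit in its syntax; here is a sketch of the same FunctionalNumber idea in that language:
#[derive(Debug)]
struct FunctionalNumber {
    value: i32,
}

impl FunctionalNumber {
    // Returns a fresh value instead of mutating `self`: purely functional.
    fn add(&self, other: &FunctionalNumber) -> FunctionalNumber {
        FunctionalNumber { value: self.value + other.value }
    }
}

fn main() {
    let a = FunctionalNumber { value: 1 };
    let b = FunctionalNumber { value: 2 };
    let c1 = a.add(&b);                     // method-call syntax
    let c2 = FunctionalNumber::add(&a, &b); // the same call in function syntax
    println!("{:?} {:?}", c1, c2);
}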
You can make classes that work on large data structures in a functional way, which means you never have to copy the structure before changing it, say when writing a backtracking puzzle solver. A simple example is the pair in Haskell and Lisp. Here is one way to make it in JavaScript:
class Cons {
    constructor(car, cdr) {
        this.car = car;
        this.cdr = cdr;
    }
}

const lst = new Cons(1, new Cons(2, new Cons(3, null)));
const lst0 = new Cons(0, lst);
lst0 is lst but with a new element in front; lst0 reuses everything in lst. Everything from lists to binary trees can be made with this, and many sequential data structures can be built out of immutable binary trees. The technique has been around since the 50s.
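For comparison, here is a hedged Rust sketch of the same persistent pair, using Rc so that lst0 shares (rather than copies) the tail lst:
use std::rc::Rc;

#[derive(Debug)]
struct Cons {
    car: i32,
    cdr: Option<Rc<Cons>>,
}

fn cons(car: i32, cdr: Option<Rc<Cons>>) -> Rc<Cons> {
    Rc::new(Cons { car, cdr })
}

fn main() {
    let lst = cons(1, Some(cons(2, Some(cons(3, None)))));
    // lst0 is lst with a new element in front; the tail is shared, not copied.
    let lst0 = cons(0, Some(Rc::clone(&lst)));
    println!("{:?}", lst0);
    println!("tail shared: {}", Rc::strong_count(&lst) == 2);
}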
I understand your point and totally agree with @Sylwester, but there is one thing worth mentioning: with reflection, external constant values can be modified, breaking the purity of your function. We know that everything in IT can be hacked, and we should not put this above the concepts, but in practice we should keep it clearly in mind: functional purity of this kind cannot be fully enforced.
I can convert Vec<String> to Vec<&str> this way:
let mut items = Vec::<&str>::new();
for item in &another_items {
    items.push(item);
}
Are there better alternatives?
There are quite a few ways to do it, some have disadvantages, others simply are more readable to some people.
This dereferences s (which is of type &String) to a String "right hand side reference", which is then dereferenced through the Deref trait to a str "right hand side reference" and then turned back into a &str. This is something that is very commonly seen in the compiler, and I therefore consider it idiomatic.
let v2: Vec<&str> = v.iter().map(|s| &**s).collect();
Here the deref function of the Deref trait is passed to the map function. It's pretty neat, but requires using the trait or giving the full path.
let v3: Vec<&str> = v.iter().map(std::ops::Deref::deref).collect();
This uses coercion syntax.
let v4: Vec<&str> = v.iter().map(|s| s as &str).collect();
This takes a RangeFull slice of the String (just a slice into the entire String) and takes a reference to it. It's ugly in my opinion.
let v5: Vec<&str> = v.iter().map(|s| &s[..]).collect();
This uses a coercion to convert a &String into a &str. It can also be replaced by an s: &str expression in the future.
let v6: Vec<&str> = v.iter().map(|s| { let s: &str = s; s }).collect();
The following (thanks @huon-dbaupp) uses the AsRef trait, which solely exists to map from owned types to their respective borrowed types. There are two ways to use it, and again, the prettiness of either version is entirely subjective.
let v7: Vec<&str> = v.iter().map(|s| s.as_ref()).collect();
and
let v8: Vec<&str> = v.iter().map(AsRef::as_ref).collect();
My bottom line is: use the v8 solution, since it most explicitly expresses what you want.
The other answers simply work. I just want to point out that if you are trying to convert the Vec<String> into a Vec<&str> only to pass it to a function taking Vec<&str> as argument, consider revising the function signature as:
fn my_func<T: AsRef<str>>(list: &[T]) { ... }
instead of:
fn my_func(list: &Vec<&str>) { ... }
As pointed out by this question: Function taking both owned and non-owned string collections. This way, both vectors simply work without the need for conversions.
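A minimal sketch of how the generic signature accepts both kinds of vectors (the body of my_func is made up for illustration):
fn my_func<T: AsRef<str>>(list: &[T]) {
    for s in list {
        println!("{}", s.as_ref());
    }
}

fn main() {
    let owned: Vec<String> = vec!["a".into(), "b".into()];
    let borrowed: Vec<&str> = vec!["a", "b"];
    my_func(&owned);    // works with owned strings
    my_func(&borrowed); // and with string slices, no conversion needed
}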
All of the answers idiomatically use iterators and collecting instead of a loop, but do not explain why this is better.
In your loop, you first create an empty vector and then push into it. Rust makes no guarantees about the growth strategy it uses, but I believe the current strategy is that whenever the capacity is exceeded, the vector's capacity is doubled. For an original vector with a length of 20, that would be one allocation and 5 reallocations.
Iterating from a vector produces an iterator that has a "size hint". In this case, the iterator implements ExactSizeIterator so it knows exactly how many elements it will return. map retains this and collect takes advantage of this by allocating enough space in one go for an ExactSizeIterator.
You can also manually do this with:
let mut items = Vec::<&str>::with_capacity(another_items.len());
for item in &another_items {
    items.push(item);
}
Heap allocations and reallocations are probably the most expensive part of this entire thing by far; far more expensive than taking references or writing or pushing to a vector when no new heap allocation is involved. It wouldn't surprise me if pushing a thousand elements onto a vector allocated for that length in one go were faster than pushing 5 elements that required 2 reallocations and one allocation in the process.
Another unsung advantage is that the collect-based methods do not store into a mutable variable, which one should not use when it's unneeded.
another_items.iter().map(|item| item.deref()).collect::<Vec<&str>>()
To use deref(), you must import the trait with use std::ops::Deref;
This one uses collect:
let strs: Vec<&str> = another_items.iter().map(|s| s as &str).collect();
Here is another option:
use std::iter::FromIterator;
let v = Vec::from_iter(v.iter().map(String::as_str));
Note that String::as_str is stable since Rust 1.7.
I have a struct that contains a RefCell for storing mutable values within a vector, and I'd like to loop over its values.
Adding an element causes no problems, but when attempting to convert the borrowed vector into an iterator it throws:
error: cannot move out of borrowed content [E0507]
Why does the borrow even matter, if it's immutable? I don't understand why the compiler would mark this as a potential issue when the content of the variable doesn't even change.
I can get around the ownership issue by cloning it, but why do I need to do that in the first place? Cloning the structure I'm trying to loop over is probably going to have a high CPU cost and I'd prefer not to have to do it if possible.
Example of what I'm trying to achieve:
fn main() {
    use std::cell::RefCell;
    let c = RefCell::new(vec![1, 2, 3]);
    let arr = c.borrow();
    for i in arr.into_iter() {
        println!("{}", i);
    }
}
Is there something I'm missing here or is Rust being overly cautious about this?
Would appreciate it if someone could fill any gaps in my understanding of how this works.
It appears the issue was the difference between Vec::into_iter and Vec::iter: into_iter consumes (moves) the collection, while iter merely borrows it, and you cannot move the vector out of the borrow obtained from the RefCell. To solve, change:
for i in arr.into_iter() {
    println!("{}", i);
}
to:
for i in arr.iter() {
    println!("{}", i);
}
As described in Effectively Using Iterators In Rust.
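For completeness, here is the fixed example in full, iterating by reference through the Ref guard instead of trying to move out of it:
use std::cell::RefCell;

fn main() {
    let c = RefCell::new(vec![1, 2, 3]);
    let arr = c.borrow(); // Ref<Vec<i32>>, a shared borrow of the contents
    for i in arr.iter() {
        println!("{}", i);
    }
}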
Generally, I have a headache because something is wrong with my reasoning:
For one set of arguments, a referentially transparent function will always return one set of output values.
That means that such a function could be represented as a truth table (a table where one set of output values is specified for each set of arguments).
That makes the logic behind such functions combinational (as opposed to sequential).
That means that with a pure functional language (one that has only referentially transparent functions) it is possible to describe only combinational logic.
The last statement is derived from this reasoning, but it's obviously false; that means there is an error in the reasoning. [Question: where is the error in this reasoning?]
UPD2. You guys are saying lots of interesting stuff, but not answering my question. I have now defined it more explicitly. Sorry for messing up the question definition!
Question: where is the error in this reasoning?
A referentially transparent function might require an infinite truth table to represent its behavior. You would be hard pressed to design an infinite circuit in combinational logic.
Another error: the behavior of sequential logic can be represented purely functionally as a function from states to states. The fact that in the implementation these states occur sequentially in time does not prevent one from defining a purely referentially transparent function which describes how state evolves over time.
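A tiny sketch of that state-to-state view, as a hypothetical counter in Rust: the transition function is referentially transparent, even though applying it repeatedly models behavior that unfolds over time.
#[derive(Clone, Copy, Debug, PartialEq)]
struct State {
    count: u32,
}

// Pure transition function: the same input state always yields
// the same output state.
fn step(s: State) -> State {
    State { count: s.count + 1 }
}

fn main() {
    let s0 = State { count: 0 };
    let s1 = step(s0);
    let s2 = step(s1);
    println!("{:?} -> {:?} -> {:?}", s0, s1, s2);
}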
Edit: Although I apparently missed the bullseye on the actual question, I think my answer is pretty good, so I'm keeping it :-) (see below).
I guess a more concise way to phrase the question might be: can a purely functional language compute anything an imperative one can?
First of all, suppose you took an imperative language like C and made it so you can't alter variables after defining them. E.g.:
int i;
for (i = 0;  // okay, that's one assignment
     i < 10; // just looking, that's all
     i++)    // BUZZZ! Sorry, can't do that!
Well, there goes your for loop. Do we get to keep our while loop?
while (i < 10)
Sure, but it's not very useful. i can't change, so it's either going to run forever or not run at all.
How about recursion? Yes, you get to keep recursion, and it's still plenty useful:
int sum(int *items, unsigned int count)
{
    if (count) {
        // count the first item and sum the rest
        return *items + sum(items + 1, count - 1);
    } else {
        // no items
        return 0;
    }
}
Now, with functions, we don't alter state, but variables can, well, vary. Once a variable passes into our function, it's locked in. However, we can call the function again (recursion), and it's like getting a brand new set of variables (the old ones stay the same). Although there are multiple instances of items and count, sum((int[]){1,2,3}, 3) will always evaluate to 6, so you can replace that expression with 6 if you like.
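For readers more at home in Rust, here is a sketch of the same recursion-instead-of-mutation idea:
fn sum(items: &[i32]) -> i32 {
    match items.split_first() {
        // count the first item and sum the rest
        Some((first, rest)) => first + sum(rest),
        // no items
        None => 0,
    }
}

fn main() {
    // Always evaluates to 6, so the call can be replaced by 6.
    println!("{}", sum(&[1, 2, 3]));
}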
Can we still do anything we want? I'm not 100% sure, but I think the answer is "yes". You certainly can if you have closures, though.
You have it right. The idea is, once a variable is defined, it can't be redefined. A referentially transparent expression, given the same variables, always yields the same result value.
I recommend looking into Haskell, a purely functional language. Haskell doesn't have an "assignment" operator, strictly speaking. For instance:
my_sum numbers = ??? where
    i = 0
    total = 0
Here, you can't write a "for loop" that increments i and total as it goes along. All is not lost, though. Just use recursion to keep getting new values of i and total:
my_sum numbers = f 0 0 where
    f i total =
        if i < length numbers
        then f i' total'
        else total
      where
        i' = i + 1
        total' = total + (numbers !! i)
(Note that this is a stupid way to sum a list in Haskell, but it demonstrates a method of coping with single assignment.)
Now, consider this highly imperative-looking code:
main = do
    a <- readLn
    b <- readLn
    print (a + b)
It's actually syntactic sugar for:
main =
    readLn >>= (\a ->
    readLn >>= (\b ->
    print (a + b)))
The idea is, instead of main being a function consisting of a list of statements, main is an IO action that Haskell executes, and actions are defined and chained together with bind operations. Also, an action that does nothing, yielding an arbitrary value, can be defined with the return function.
Note that bind and return aren't specific to actions. They can be used with any type that calls itself a Monad to do all sorts of funky things.
To clarify, consider readLn. readLn is an action that, if executed, would read a line from standard input and yield its parsed value. To do something with that value, we can't store it in a variable because that would violate referential transparency:
a = readLn
If this were allowed, a's value would depend on the world and would be different every time we called readLn, meaning readLn wouldn't be referentially transparent.
Instead, we bind the readLn action to a function that deals with the action, yielding a new action, like so:
readLn >>= (\x -> print (x + 1))
The result of this expression is an action value. If Haskell got off the couch and performed this action, it would read an integer, increment it, and print it. By binding the result of an action to a function that does something with the result, we get to keep referential transparency while playing around in the world of state.
As far as I understand it, referential transparency just means: A given function will always yield the same result when invoked with the same arguments. So, the mathematical functions you learned about in school are referentially transparent.
A language you could check out in order to learn how things are done in a purely functional language would be Haskell. There are ways to use "updateable storage possibilities" like the Reader Monad, and the State Monad for example. If you're interested in purely functional data structures, Okasaki might be a good read.
And yes, you're right: order of evaluation in a purely functional language like Haskell does not matter the way it does in non-functional languages, because if there are no side effects, there is no reason to do something before or after something else, unless the input of one depends on the output of the other, or means like monads come into play.
I don't really know about the truth-table question.
Here's my stab at answering the question:
Any system can be described as a combinational function, large or small.
There's nothing wrong with the reasoning that pure functions can only deal with combinational logic; it's true, it's just that functional languages hide that from you to some extent or another.
You could even describe, say, the workings of a game engine as a truth table or a combinational function.
You might have a deterministic function that takes in "the current state of the entire game" as the RAM occupied by the game engine and the keyboard input, and returns "the state of the game one frame later". The return value would be determined by the combinations of the bits in the input.
Of course, in any meaningful and sane function, the input is parsed down to blocks of integers, decimals and booleans, but the combinations of the bits in those values is still determining the output of your function.
Keep in mind also that basic digital logic can be described in truth tables. The only reason that's not done for anything more than, say, arithmetic on 4-bit integers is that the size of the truth table grows exponentially.
The error in your reasoning is the following:
"that means that such function could be represented as a truth table".
You conclude that from a functional language's property of referential transparency. So far the conclusion sounds plausible, but you overlook that a function is able to accept collections as input and process them, in contrast to the fixed inputs of a logic gate.
Therefore a function does not equal a logic gate, but rather a construction plan for such a logic gate that depends on the actual input (determined at runtime)!
To comment on your comment: functional languages can, although stateless, implement a state machine by constructing the states from scratch each time they are accessed.