Given the following code, I would expect an infinite loop, but the loop stops at a certain point.
m := make(map[int]string, 4)
m[0] = "Foo"
for k, v := range m {
    m[k+1] = v
}
I cannot figure out what happens under the hood, because different executions return different outputs. For example, these are a few outputs from different executions:
map[0:Foo 1:Foo 2:Foo 3:Foo 4:Foo 5:Foo 6:Foo 7:Foo]
map[0:Foo 1:Foo]
map[0:Foo 1:Foo 2:Foo]
How does range work such that the loop exits at a certain point, and what is the exit condition?
Spec: For statements with range clause says the behavior is unpredictable:
The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. If a map entry that has not yet been reached is removed during iteration, the corresponding iteration value will not be produced. If a map entry is created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next. If the map is nil, the number of iterations is 0.
When you add entries to the map you are ranging over, those entries may or may not be visited by the loop; you should not assume anything either way.
Based on the language spec:
If a map entry is created during iteration, that entry may be produced during the iteration or may be skipped.
So if the new elements are skipped, the for-loop eventually ends.
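If you need deterministic behavior, one option (my own sketch, not part of the original answer) is to snapshot the keys before mutating, so the loop never ranges over the live map:

package main

import "fmt"

func main() {
    m := map[int]string{0: "Foo"}

    // Snapshot the keys first; entries added below are never visited,
    // because we range over the slice, not the live map.
    keys := make([]int, 0, len(m))
    for k := range m {
        keys = append(keys, k)
    }
    for _, k := range keys {
        m[k+1] = m[k]
    }

    fmt.Println(m) // always prints map[0:Foo 1:Foo]
}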
The other answers have already explained the behavior you observe with your snippet.
Because your title is rather generic but your snippet only covers the addition of map entries while iterating over the map, here is a complementary example that should convince you that "cross-removing" map entries while iterating over the map is a bad idea (Playground):
package main

import "fmt"

func main() {
    m := map[string]int{"foo": 0, "bar": 1, "baz": 2}
    for k := range m {
        if k == "foo" {
            delete(m, "bar")
        }
        if k == "bar" {
            delete(m, "foo")
        }
    }
    fmt.Println(m)
}
The spec says:
The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. If a map entry that has not yet been reached is removed during iteration, the corresponding iteration value will not be produced.
As a result, the program outputs either map[bar:1 baz:2] or map[baz:2 foo:0], but there is no way to tell which.
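If you need a deterministic outcome, one option (my own sketch, not from the original answer) is to record the deletions during the loop and apply them afterwards; every entry is then visited, so the result no longer depends on iteration order (here both "foo" and "bar" end up removed):

package main

import "fmt"

func main() {
    m := map[string]int{"foo": 0, "bar": 1, "baz": 2}

    // Collect the keys to delete instead of deleting mid-iteration.
    var toDelete []string
    for k := range m {
        if k == "foo" {
            toDelete = append(toDelete, "bar")
        }
        if k == "bar" {
            toDelete = append(toDelete, "foo")
        }
    }
    for _, k := range toDelete {
        delete(m, k)
    }

    fmt.Println(m) // always prints map[baz:2]
}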
Related
I'm currently working on a problem relating to dictionaries where you write a function that deletes all key/value pairs in which a value is larger than a given number. Here is the code:
def remove_numbers_larger_than(number, dict1):
    for i, value in dict1.items():
        if value > number:
            del dict1[i]
            return dict1
        else:
            return dict1

dict1 = {'animals': 6, 'truck': 3, 'country': 2}
number = 2
print(remove_numbers_larger_than(number, dict1))
Normally I would expect to see the output {'country': 2}, given that it is the only value not larger than the given number, but instead I get {'truck': 3, 'country': 2}. It seems to apply the condition and remove the first matching value, but then the loop stops.
Only the first item is getting deleted because you immediately return from the function in the first iteration of the for loop. To loop through every value, you shouldn't return until after the for loop is over.
However, there's also another issue with the code. You are iterating over dict1.items(), a view of the dictionary that changes when you remove items from it. An easy fix is to iterate over a copy of the items, allowing the original dictionary to change without problems:
for i, value in list(dict1.items()):
I have a Player struct that contains a vec of Effect instances. I want to iterate over this vec, decrease the remaining time for each Effect, and then remove any effects whose remaining time reaches zero. So far so good. However, for any effect removed, I also want to pass it to Player's undo_effect() method, before destroying the effect instance.
This is part of a game loop, so I want to do this without any additional memory allocation if possible.
I've tried using a simple for loop and also iterators, drain, retain, and filter, but I keep running into issues where self (the Player) would be mutably borrowed more than once, because modifying self.effects requires a mutable borrow, as does the undo_effect() method. The drain_filter() in nightly looks useful here, but it was first proposed in 2017, so I'm not holding my breath on that one.
One approach that did compile (see below), was to use two vectors and alternate between them on each frame. Elements are pop()'ed from vec 1 and either push()'ed to vec 2 or passed to undo_effect() as appropriate. On the next game loop iteration, the direction is reversed. Since each vec will not shrink, the only allocations will be if they grow larger than before.
I started abstracting this as its own struct but want to check if there is a better (or easier) way.
This one won't compile. The self.undo_effect() call would borrow self as mutable twice.
struct Player {
    effects: Vec<Effect>
}

impl Player {
    fn update(&mut self, delta_time: f32) {
        for effect in &mut self.effects {
            effect.remaining -= delta_time;
            if effect.remaining <= 0.0 {
                effect.active = false;
            }
        }
        for effect in self.effects.iter_mut().filter(|e| !e.active) {
            self.undo_effect(effect);
        }
        self.effects.retain(|e| e.active);
    }
}
The below compiles ok - but is there a better way?
struct Player {
    effects: [Vec<Effect>; 2],
    index: usize
}

impl Player {
    fn update(&mut self, delta_time: f32) {
        let src_index = self.index;
        let target_index = if self.index == 0 { 1 } else { 0 };
        self.effects[target_index].clear(); // should be unnecessary.
        while !self.effects[src_index].is_empty() {
            if let Some(x) = self.effects[src_index].pop() {
                if x.active {
                    self.effects[target_index].push(x);
                } else {
                    self.undo_effect(&x);
                }
            }
        }
        self.index = target_index;
    }
}
Is there an iterator version that works without unnecessary memory allocations? I'd be ok with allocating memory only for the removed elements, since this will be much rarer.
Would an iterator be more efficient than the pop()/push() version?
EDIT 2020-02-23:
I ended up coming back to this and I found a slightly more robust solution, similar to the above but without the danger of requiring a target_index field.
std::mem::swap(&mut self.effects, &mut self.effects_cache);
self.effects.clear();
while !self.effects_cache.is_empty() {
    if let Some(x) = self.effects_cache.pop() {
        if x.active {
            self.effects.push(x);
        } else {
            self.undo_effect(&x);
        }
    }
}
Since self.effects_cache is unused outside this method, and the method does not require it to have any particular value beforehand, the rest of the code can simply use self.effects, which is always current.
The main issue is that you are borrowing a field (effects) of Player and trying to call undo_effect while this field is borrowed. As you noted, this does not work.
You already realized that you could juggle two vectors, but you could actually only juggle one (permanent) vector:
struct Player {
    effects: Vec<Effect>
}

impl Player {
    fn update(&mut self, delta_time: f32) {
        for effect in &mut self.effects {
            effect.remaining -= delta_time;
            if effect.remaining <= 0.0 {
                effect.active = false;
            }
        }

        // Temporarily remove effects from Player.
        let mut effects = std::mem::replace(&mut self.effects, vec!());

        // Call Player::undo_effect (no outstanding borrows).
        // `drain_filter` could also be used, for better efficiency.
        for effect in effects.iter_mut().filter(|e| !e.active) {
            self.undo_effect(effect);
        }

        // Restore effects.
        self.effects = effects;
        self.effects.retain(|e| e.active);
    }
}
This will not allocate because the default constructor of Vec does not allocate.
On the other hand, the double-vector solution might be more efficient as it allows a single pass over self.effects rather than two. YMMV.
If I understand you correctly, you have two questions:
How can I split a Vec into two Vecs (one with the elements that fulfill a predicate, the other with the elements that don't)?
Is it possible to do this without memory overhead?
There are multiple ways of splitting a Vec into two (or more).
You could use Iterator::partition, which will give you two distinct collections which can be used further.
There is the unstable Vec::drain_filter function which does the same but on a Vec itself
Use splitn (or splitn_mut) which will split your Vec/slice into n (2 in your case) Iterators
Depending on what you want to do, all solutions are applicable and good to use.
Is it possible without memory overhead? Not with the solutions above, because you need to create a second Vec which can hold the filtered items. But there is a solution: you can partition the Vec in place so that the first half contains all the items that fulfill the predicate (e.g. are not expired) and the second half contains the ones that fail it (are expired). You just need to count the number of items that fulfill the predicate.
Then you can use split_at (or split_at_mut) to split the Vec/slice into two distinct slices. Afterwards you can resize the Vec to the length of the good items and the other ones will be dropped.
The best answer is this one in C++.
[O]rder the indices vector, create two iterators into the data vector, one for reading and one for writing. Initialize the writing iterator to the first element to be removed, and the reading iterator to one beyond that one. Then in each step of the loop increment the iterators to the next value (writing) and next value not to be skipped (reading) and copy/move the elements. At the end of the loop call erase to discard the elements beyond the last written to position.
The Rust adaptation to your specific problem is to move the removed items out of the vector instead of just writing over them.
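The read/write-index idea itself is language-agnostic; purely as an illustration (a sketch of my own, with made-up effect/undoEffect names, written in Go since that is the language of most of this page), the same single-pass, allocation-free compaction looks like this:

package main

import "fmt"

type effect struct {
    name   string
    active bool
}

func undoEffect(e effect) { fmt.Println("undo", e.name) }

func main() {
    effects := []effect{{"haste", true}, {"poison", false}, {"shield", true}}

    // Filter in place: kept reuses effects' backing array, so this pass
    // allocates nothing; expired effects are handed to undoEffect before
    // being overwritten.
    kept := effects[:0]
    for _, e := range effects {
        if e.active {
            kept = append(kept, e)
        } else {
            undoEffect(e)
        }
    }
    effects = kept

    fmt.Println(effects) // [{haste true} {shield true}]
}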
An alternative is to use a linked list instead of a vector to hold your Effect instances.
/*
Given an array: [1,2] and a target: 4
Find the solution set that adds up to the target
in this case:
[1,1,1,1]
[1,1,2]
[2,2]
*/
import "sort"
func combinationSum(candidates []int, target int) [][]int {
sort.Ints(candidates)
return combine(0, target, []int{}, candidates)
}
func combine(sum int, target int, curComb []int, candidates []int) [][]int {
var tmp [][]int
var result [][]int
if sum == target {
fmt.Println(curComb)
return [][]int{curComb}
} else if sum < target {
for i,v := range candidates {
tmp = combine(sum+v, target, append(curComb, v), candidates[i:])
result = append(result,tmp...)
}
}
return result
}
This is a problem in Leetcode and I use recursion to solve it.
In combine, I print every combination when the sum is equal to the target.
The output is:
[1,1,1,1]
[1,1,2]
[2,2]
And that is the answer that I want!
But why is the final answer (two-dimensional):
[[1,1,1,2],[1,1,2],[2,2]]
The expected answer is: [[1,1,1,1],[1,1,2],[2,2]]
Please help me find the mistake in the code. Thanks for your time.
This happens because of the way slices work. A slice object is a reference to an underlying array, along with the length of the slice, a pointer to the start of the slice in the array, and the slice's capacity. The capacity of a slice is the number of elements from the beginning of the slice to the end of the array. When you append to a slice, if there is available capacity for the new element, it is added to the existing array. However, if there isn't sufficient capacity, append allocates a new array and copies the elements. The new array is allocated with extra capacity so that an allocation isn't required for every append.
In your for loop, when curComb is [1, 1, 1], its capacity is 4. On successive iterations of the loop, you append 1 and then 2, neither of which causes a reallocation because there's enough room in the array for the new element. When curComb is [1, 1, 1, 1], it is put on the results list, but in the next iteration of the for loop, the append changes the last element to 2 (remember that it's the same underlying array), so that's what you see when you print the results at the end.
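To see the sharing concretely, here is a small standalone demo of my own (not part of the original answer) where two appends to the same slice write to the same backing-array slot:

package main

import "fmt"

func main() {
    a := make([]int, 3, 4) // len 3, cap 4: one spare slot in the backing array
    b := append(a, 1)      // fits in place, no reallocation
    c := append(a, 2)      // also fits in place, overwriting the same slot
    fmt.Println(b, c)      // [0 0 0 2] [0 0 0 2]
}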
The solution to this is to return a copy of curComb when the sum equals the target:
if sum == target {
    fmt.Println(curComb)
    tmpCurComb := make([]int, len(curComb))
    copy(tmpCurComb, curComb)
    return [][]int{tmpCurComb}
}
This article gives a good explanation of how slices work.
I was hitting an issue in a project I'm working on. I found a way around it, but I wasn't sure why my solution worked. I'm hoping that someone more experience with how Go pointers work could help me.
I have a Model interface and a Region struct that implements the interface. The Model interface is implemented on the pointer of the Region struct. I also have a Regions collection which is a slice of Region objects. I have a method that can turn a Regions object into a []Model:
// Regions is the collection of the Region model
type Regions []Region

// Returns the model collection as a list of models
func (coll *Regions) ToModelList() []Model {
    output := make([]Model, len(*coll))
    for idx, item := range *coll {
        output[idx] = &item
    }
    return output
}
When I run this code, I end up with the same Region pointer outputted multiple times. So, if the Regions collection has two distinct items, I get the same address twice. When I print the variables before I set them in the slice, they contain the proper data.
I messed with it a little bit, thinking Go might be reusing the memory address between loops. This solution is currently working for me in my tests:
// Returns the model collection as a list of models
func (coll *Regions) ToModelList() []Model {
    output := make([]Model, len(*coll))
    for idx, _ := range *coll {
        i := (*coll)[idx]
        output[idx] = &i
    }
    return output
}
This gives the expected output of two distinct addresses in the output slice.
This honestly seems like a bug with the range function reusing the same memory address between runs, but I always assume I'm missing something in cases like this.
I hope I explained this well enough for you. I'm surprised that the original solution did not work.
Thanks!
In your first (non-working) example, item is the loop variable. Its address does not change, only its value. That's why you get the same address at every index of output.
Run this code to see the mechanics in action:
package main

import "fmt"

func main() {
    coll := []int{5, 10, 15}
    for i, v := range coll {
        fmt.Printf("This one is always the same: %v\n", &v)
        fmt.Printf("This one advances by the size of an int each iteration: %v\n", &coll[i])
    }
}
There is just one item variable for the entire loop, which is assigned the corresponding value during each iteration of the loop. You do not get a new item variable in each iteration. So you are just repeatedly taking the address of the same variable, which will of course be the same.
On the other hand, if you declared a local variable inside the loop, it will be a new variable in each iteration, and the addresses will be different:
for idx, item := range *coll {
    temp := item
    output[idx] = &temp
}
How can one remove selected keys from a map?
Is it safe to combine delete() with range, as in the code below?
package main

import "fmt"

type Info struct {
    value string
}

func main() {
    table := make(map[string]*Info)
    for i := 0; i < 10; i++ {
        str := fmt.Sprintf("%v", i)
        table[str] = &Info{str}
    }
    for key, value := range table {
        fmt.Printf("deleting %v=>%v\n", key, value.value)
        delete(table, key)
    }
}
https://play.golang.org/p/u1vufvEjSw
This is safe! You can also find a similar sample in Effective Go:
for key := range m {
    if key.expired() {
        delete(m, key)
    }
}
And the language specification:
The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. If a map entry that has not yet been reached is removed during iteration, the corresponding iteration value will not be produced. If a map entry is created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next. If the map is nil, the number of iterations is 0.
Sebastian's answer is accurate, but I wanted to know why it was safe, so I did some digging into the runtime's map source code. It looks like on a call to delete(m, k), it basically just sets a flag (as well as updating the count) instead of actually deleting the value:
b->tophash[i] = Empty;
(Empty is a constant for the value 0)
What the map actually does is allocate a set number of buckets depending on the size of the map, growing as you perform inserts at the rate of 2^B (from this source code):
byte *buckets; // array of 2^B Buckets. may be nil if count==0.
So there are almost always more buckets allocated than you're using, and when you do a range over the map, it checks the tophash value of each bucket in that 2^B to see if it can skip over it.
To summarize, the delete within a range is safe because the data is technically still there, but when it checks the tophash it sees that it can just skip over it and not include it in whatever range operation you're performing. The source code even includes a TODO:
// TODO: consolidate buckets if they are mostly empty
// can only consolidate if there are no live iterators at this size.
This explains why using delete(m, k) doesn't actually free up memory; it just removes the entry from the buckets you're allowed to access. If you want to free up the actual memory, you'll need to make the entire map unreachable so that garbage collection can step in. Using the table map from the example above, you can do this with a line like:
table = nil
I was wondering if a memory leak could happen. So I wrote a test program:
package main

import (
    log "github.com/Sirupsen/logrus"
    "math/rand"
    "os"
    "os/signal"
    "time"
)

func main() {
    log.Info("=== START ===")
    defer func() { log.Info("=== DONE ===") }()

    go func() {
        m := make(map[string]string)
        for {
            k := GenerateRandStr(1024)
            m[k] = GenerateRandStr(1024 * 1024)
            for k2, _ := range m {
                delete(m, k2)
                break
            }
        }
    }()

    osSignals := make(chan os.Signal, 1)
    signal.Notify(osSignals, os.Interrupt)
    for {
        select {
        case <-osSignals:
            log.Info("Received ^C command. Exit")
            return
        }
    }
}

func GenerateRandStr(n int) string {
    rand.Seed(time.Now().UnixNano())
    const letterBytes = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b := make([]byte, n)
    for i := range b {
        b[i] = letterBytes[rand.Int63() % int64(len(letterBytes))]
    }
    return string(b)
}
Looks like the GC does free the memory, so it's okay.
In short, yes. See previous answers.
And also this, from here:
ianlancetaylor commented on Feb 18, 2015
I think the key to understanding this is to realize that while executing the body of a for/range statement, there is no current iteration. There is a set of values that have been seen, and a set of values that have not been seen. While executing the body, one of the key/value pairs that has been seen--the most recent pair--was assigned to the variable(s) of the range statement. There is nothing special about that key/value pair, it's just one of the ones that has already been seen during the iteration.
The question he's answering is about modifying map elements in place during a range operation, which is why he mentions the "current iteration". But it's also relevant here: you can delete keys during a range, and that just means that you won't see them later on in the range (and if you already saw them, that's okay).
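As a small illustration of that point (my own sketch, not from the linked comment): overwriting the values of keys that already exist during a range neither adds nor removes entries, so every key is still visited exactly once and the result is deterministic:

package main

import "fmt"

func main() {
    m := map[string]int{"a": 1, "b": 2, "c": 3}

    // Updating existing entries in place is well-defined; no entries are
    // added or removed, so each key is visited exactly once.
    for k, v := range m {
        m[k] = v * 10
    }

    fmt.Println(m) // map[a:10 b:20 c:30]
}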