Golang: help understanding pointers, assignment, and unexpected behavior

So I am back with more beginner questions that I cannot seem to wrap my head around.
I was experimenting with the following code.
func main() {
start := time.Now()
var powers []*big.Int
for i := 1; i < 1000; i++ {
I := big.NewInt(int64(i))
I.Mul(I, I)
powers = append(powers, I)
}
fmt.Println(powers)
fmt.Println(time.Since(start))
start = time.Now()
var seqDiffs []*big.Int
diff := new(big.Int)
for i, v := range powers {
if i == len(powers)-2 {
break
}
diff = v.Sub(powers[i+1], v)
seqDiffs = append(seqDiffs, diff)
}
fmt.Println(seqDiffs)
fmt.Println(time.Since(start))
}
My intention was to assign the result of Sub() to diff in the following way:
diff.Sub(powers[i+1], v)
However, this results in seqDiffs being 1995 (the correct last value) repeated over and over. I know that this is likely because seqDiffs is just a list of pointers to the same memory address, but what I don't understand is why the following works just fine:
v.Sub(powers[i+1], v)
seqDiffs = append(seqDiffs, v)
This results in seqDiffs being a list of all the odd numbers from 3 to 1995, which is correct, but isn't this essentially still a list of pointers to the same memory address as well?
Also, why is the following correct when it should also result in seqDiffs being a list of pointers to the same memory address?
diff = v.Sub(powers[i+1], v)
seqDiffs = append(seqDiffs, diff)
I also tried to do it the following way:
diff := new(*big.Int)
for i, v := range powers {
if i == len(powers)-2 {
break
}
diff.Sub(powers[i+1], v)
seqDiffs = append(seqDiffs, diff)
}
but received these errors from the IDE:
./sequentialPowers.go:26: calling method Sub with receiver diff (type **big.Int) requires explicit dereference
./sequentialPowers.go:27: cannot use diff (type **big.Int) as type *big.Int in append
How would I make an "explicit" dereference?

When debugging issues with pointers in Go, one way to understand what is going on is to use fmt.Printf with %p to print the memory address of the variables of interest.
Regarding your first question, why appending the result of diff.Sub(powers[i+1], v) to your slice of *big.Int gives a slice where every element has the same value: each call updates the big.Int that diff points to, and you append a copy of that same pointer every time. Thus all values in the slice are pointers to the same value.
Printing the memory address of diff will show this to be the case. After populating your slice, do something like the following:
for _, val := range seqDiffs {
fmt.Printf("%p\n", val) // when i ran this - it printed 0xc4200b7d40 every iteration
}
In your second example, v is a pointer to a different big.Int on each iteration. v.Sub(..) stores the result in v and returns v, so assigning the result to diff makes diff point to that element instead of updating the value diff previously pointed to. When you append diff to your slice, you are therefore appending a copy of a pointer to a distinct address each time. (Note that this also overwrites the entries of powers with the differences.) Using fmt.Printf you can see this like so -
var seqDiffs []*big.Int
diff := new(big.Int)
for i, v := range powers {
if i == len(powers)-2 {
break
}
diff = v.Sub(powers[i+1], v)
fmt.Printf("%p\n", diff) // 1st iteration 0xc4200109e0, 2nd 0xc420010a00, 3rd 0xc420010a20, etc
seqDiffs = append(seqDiffs, diff)
}
Regarding your second question: new is a built-in function in Go that allocates zeroed storage for the specified type and returns a pointer to it (check the docs). In your case, new(*big.Int) allocates a pointer to a big.Int and returns a pointer to that, i.e. a **big.Int, hence the compiler error saying you cannot use this type in your call to append.
To explicitly dereference diff in order to call Sub on it, you would have to modify your code to the following:
(*diff).Sub(powers[i+1], v)
In Go, a selector expression dereferences pointers to structs for you, but in this case you are calling a method on a pointer to a pointer, thus you have to explicitly dereference it.
A very informative read on calling methods on structs (selector expressions) in Go can be found here
And to add it to the slice
seqDiffs = append(seqDiffs, *diff)
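One caveat, sketched below under my own naming: since new returns zeroed storage, the *big.Int inside a fresh **big.Int is nil, and it needs to be pointed at a real big.Int before Sub is called through it. The helper subViaDoublePointer is hypothetical and only illustrates the dereference.

```go
package main

import (
	"fmt"
	"math/big"
)

// subViaDoublePointer is a hypothetical helper showing the explicit
// dereference of a **big.Int.
func subViaDoublePointer(a, b int64) *big.Int {
	pp := new(*big.Int) // pp is a **big.Int; *pp starts out as nil
	*pp = new(big.Int)  // give the inner pointer something to point at
	(*pp).Sub(big.NewInt(a), big.NewInt(b))
	return *pp // *pp has type *big.Int, which is what append expects
}

func main() {
	fmt.Println(subViaDoublePointer(9, 4)) // 5
}
```

In practice the extra level of indirection buys nothing here; a plain new(big.Int) is usually what is wanted.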

Related

What's the difference between []struct{} and []*struct{}?

What's the difference between the two declarations below?
type Demo struct {s string}
func getDemo1()([]*Demo) // 1
func getDemo2()([]Demo) // 2
Is there any memory difference between getDemo1 and getDemo2?
I'm going to answer this, despite my better judgement to just send OP to the tour and documentation/specification. Mostly because of this:
Is there any memory difference between getDemo1 and getDemo2?
The answer to this specific question depends on how you use the slice. Go is pass-by-value, so passing struct values around copies them. For instance, consider the following example.
https://play.golang.org/p/VzjYXwUy0EI
d1 := getDemo1()
d2 := getDemo2()
for _, v := range d1 {
// v is of type *Demo, so this modifies the value in the slice
v.s = "same"
}
fmt.Println(d1)
for _, v := range d2 {
// v is of type Demo, and is a COPY of the struct in the slice, so the original is not modified
v.s = "same"
}
So as to the memory question: ranging over a []*Demo copies a pointer on each iteration (one machine word, 8 bytes on a 64-bit platform), whereas ranging over a []Demo copies a Demo (the entire struct and all its fields), so the pointer version copies less per element. BUT, you can still index directly into the slice to avoid copies, except when you pass individual items of the slice around.
That said, passing the slice itself around, the two types have no difference in overhead. A slice is an abstraction of an array, and the slice itself that gets passed around is merely a slice header, which would be the same memory footprint regardless what type the slice contains.
BTW, the paradigm for modifying the values in the case of []Demo is:
for i := range d2 {
d2[i].s = "same"
}
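The point about the slice header can be checked with unsafe.Sizeof. This is a small sketch, with the helper name headerSizes made up for illustration; the exact byte counts depend on the platform, so only the comparison is annotated.

```go
package main

import (
	"fmt"
	"unsafe"
)

type Demo struct{ s string }

// headerSizes reports the size of the slice value itself (the header),
// not of the elements it refers to.
func headerSizes() (uintptr, uintptr) {
	var values []Demo
	var pointers []*Demo
	return unsafe.Sizeof(values), unsafe.Sizeof(pointers)
}

func main() {
	v, p := headerSizes()
	fmt.Println(v == p) // true: both headers hold a pointer, a length and a capacity

	// Per-element copy cost differs: a whole struct versus one pointer.
	fmt.Println(unsafe.Sizeof(Demo{}), unsafe.Sizeof(&Demo{}))
}
```

So passing either slice to a function costs the same; the difference only shows up when individual elements are copied.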

Combination Sum in Go

/*
Given an array: [1,2] and a target: 4
Find the solution set that adds up to the target
in this case:
[1,1,1,1]
[1,1,2]
[2,2]
*/
import "sort"
func combinationSum(candidates []int, target int) [][]int {
sort.Ints(candidates)
return combine(0, target, []int{}, candidates)
}
func combine(sum int, target int, curComb []int, candidates []int) [][]int {
var tmp [][]int
var result [][]int
if sum == target {
fmt.Println(curComb)
return [][]int{curComb}
} else if sum < target {
for i,v := range candidates {
tmp = combine(sum+v, target, append(curComb, v), candidates[i:])
result = append(result,tmp...)
}
}
return result
}
This is a problem in Leetcode and I use recursion to solve it.
In line 18, I print every case when the sum is equal to the target.
The output is :
[1,1,1,1]
[1,1,2]
[2,2]
And that is the answer that I want!
But why is the final answer (two-dimensional):
[[1,1,1,2],[1,1,2],[2,2]]
Expected answer is : [[1,1,1,1],[1,1,2],[2,2]]
Please help me find the mistake in the code. Thanks for your time.
This happens because of the way slices work. A slice object is a reference to an underlying array, along with the length of the slice, a pointer to the start of the slice in the array, and the slice's capacity. The capacity of a slice is the number of elements from the beginning of the slice to the end of the array. When you append to a slice, if there is available capacity for the new element, it is added to the existing array. However, if there isn't sufficient capacity, append allocates a new array and copies the elements. The new array is allocated with extra capacity so that an allocation isn't required for every append.
In your for loop, when curComb is [1, 1, 1], its capacity is 4. On successive iterations of the loop, you append 1 and then 2, neither of which causes a reallocation because there's enough room in the array for the new element. When curComb is [1, 1, 1, 1], it is put on the results list, but in the next iteration of the for loop, the append changes the last element to 2 (remember that it's the same underlying array), so that's what you see when you print the results at the end.
The solution to this is to return a copy of curComb when the sum equals the target:
if sum == target {
fmt.Println(curComb)
// Copy curComb so later appends cannot overwrite the saved result.
tmpCurComb := make([]int, len(curComb))
copy(tmpCurComb, curComb)
return [][]int{tmpCurComb}
}
This article gives a good explanation of how slices work.
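The aliasing can be reproduced in isolation. The sketch below is not the Leetcode solution itself; sharedAppend and copiedAppend are hypothetical helpers that append twice to a slice with one free slot, first without and then with a copy.

```go
package main

import "fmt"

// sharedAppend reproduces the aliasing: two appends to the same slice
// with spare capacity write to the same slot of the backing array.
func sharedAppend() (first, second []int) {
	base := make([]int, 3, 4) // len 3, cap 4: one free slot
	first = append(base, 1)   // fits, so it reuses base's array
	second = append(base, 2)  // also fits, and overwrites the 1
	return first, second
}

// copiedAppend saves a private copy before the second append.
func copiedAppend() (first, second []int) {
	base := make([]int, 3, 4)
	tmp := append(base, 1)
	first = make([]int, len(tmp))
	copy(first, tmp) // first now has its own backing array
	second = append(base, 2)
	return first, second
}

func main() {
	a, b := sharedAppend()
	fmt.Println(a, b) // [0 0 0 2] [0 0 0 2]
	c, d := copiedAppend()
	fmt.Println(c, d) // [0 0 0 1] [0 0 0 2]
}
```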

Pointer problems

TL;DR Somehow, I am appending a pointer to a list instead of the object within a for loop of objects so at the end the entire slice is composed of the same object multiple times. I just don't know how to fix that.
The Long Way
I am still having a super hard time trying to figure out pointers in go. I posted a question yesterday and got some help but now I am stuck on a slightly different issue in the same piece of code.
I am working with the gocql and cqlr Go packages to try and build a small object mapper for my Cassandra queries. Essentially the problem I am having is that I am appending what appears to be a pointer to an object, not a new instance of the object, to the array. How do I fix that? I have tried adding & and * in front of value but that doesn't seem to work. The bind function needs an & according to their docs.
Code
type Query struct {
query string
values interface{}
attempts int
maxAttempts int
structType reflect.Type
}
func (query Query) RetryingQuery() (results []interface{}) {
var q *gocql.Query
if query.values != nil {
q = c.Session.Query(query.query, query.values)
} else {
q = c.Session.Query(query.query)
}
bindQuery := cqlr.BindQuery(q)
value := reflect.New(query.structType).Interface()
for bindQuery.Scan(value) {
fmt.Println(value)
results = append(results, value)
}
return
}
The docs ask for var value type then in bind you would pass &value. I quoted the docs below.
var t Tweet
var s []Tweet
for b.Scan(&t) {
// Application specific code goes here
append(s, t)
}
The issue is I cannot directly go var value query.structType to define its type then pass the reference of that to bindQuery.Scan().
What is printed
&{result1 x86_64 24 3.2.0-74-generic Linux}
&{result2 x86_64 24 3.19.0-25-generic Linux}
&{result3 x86_64 4 3.13.0-48-generic Linux}
&{result4 x86_64 2 3.13.0-62-generic Linux}
&{result5 x86_64 4 3.13.0-48-generic Linux}
What is in the slice
Spoiler, it is result5 repeated over and over. I understand that I am just appending the pointer to same object to the list and that every loop iteration the object is changed and that changes all the results in the slice to that new object. I just don't know how to fix it.
[{"hostname":"result5","machine":"x86_64","num_cpus":4,"release":"3.13.0-48-generic","sysname":"Linux"},{"hostname":"result5","machine":"x86_64","num_cpus":4,"release":"3.13.0-48-generic","sysname":"Linux"},{"hostname":"result5","machine":"x86_64","num_cpus":4,"release":"3.13.0-48-generic","sysname":"Linux"},{"hostname":"result5","machine":"x86_64","num_cpus":4,"release":"3.13.0-48-generic","sysname":"Linux"},{"hostname":"result5","machine":"x86_64","num_cpus":4,"release":"3.13.0-48-generic","sysname":"Linux"}]
Well, I can at least tell you what you're doing. bindQuery.Scan takes a pointer and changes the value stored at that address.
What you're essentially doing is this:
package main
import "fmt"
func main() {
var q int
myInts := make([]*int, 0, 5)
for i := 0; i < 5; i++ {
q = i
fmt.Printf("%d ", q)
myInts = append(myInts, &q)
}
fmt.Printf("\n")
for _, value := range myInts {
fmt.Printf("%d ", *value)
}
fmt.Printf("\n")
fmt.Println(myInts)
}
Which, as you can probably guess, gives you this:
0 1 2 3 4
4 4 4 4 4
[0x104382e0 0x104382e0 0x104382e0 0x104382e0 0x104382e0]
Things get a little more confusing with reflect. You can get your type as an interface, but that is it (unless you want to play with unsafe). An interface, in simple terms, contains a pointer to the original type underneath (and some other stuff). So in your function you are passing a pointer (and some other stuff). Then you're appending the pointer. It might be nice just to get concrete and type switch your interface. I assume you know what types it could be. In which case you'd have to have something along these lines:
package main
import (
"fmt"
"reflect"
)
type foo struct {
fooval string
}
type bar struct {
barval string
}
func main() {
f1 := foo{"hi"}
f2 := &foo{"hi"}
b1 := bar{"bye"}
b2 := &bar{"bye"}
doSomething(f1)
doSomething(f2)
doSomething(b1)
doSomething(b2)
}
func doSomething(i interface{}) {
n := reflect.TypeOf(i)
// get a new one
newn := reflect.New(n).Interface()
// find out what we got and handle each case
switch t := newn.(type) {
case **foo:
*t = &foo{"hi!"}
fmt.Printf("It was a **foo, here is the address %p and here is the value %v\n", *t, **t)
case **bar:
*t = &bar{"bye :("}
fmt.Printf("It was a **bar, here is the address %p and here is the value %v\n", *t, **t)
case *foo:
t = &foo{"hey!"}
fmt.Printf("It was a *foo, here is the address %p and here is the value %v\n", t, *t)
case *bar:
t = &bar{"ahh!"}
fmt.Printf("It was a *bar, here is the address %p and here is the value %v\n", t, *t)
default:
panic("AHHHH")
}
}
You could also call value = reflect.New(query.structType).Interface() inside the loop, reassigning value after every append, which gives you a new value for each row. The last pass through the loop would allocate one extra value that is never used, though.

Golang Reusing Memory Address Copying from slice?

I was hitting an issue in a project I'm working on. I found a way around it, but I wasn't sure why my solution worked. I'm hoping that someone more experienced with how Go pointers work could help me.
I have a Model interface and a Region struct that implements the interface. The Model interface is implemented on the pointer of the Region struct. I also have a Regions collection which is a slice of Region objects. I have a method that can turn a Regions object into a []Model:
// Regions is the collection of the Region model
type Regions []Region
// Returns the model collection as a list of models
func (coll *Regions) ToModelList() []Model {
output := make([]Model, len(*coll))
for idx, item := range *coll {
output[idx] = &item
}
return output
}
When I run this code, I end up with the first pointer to the Region outputted multiple times. So, if the Regions collection has two distinct items, I will get the same address duplicated twice. When I print the variables before I set them in the slice, they have the proper data.
I messed with it a little bit, thinking Go might be reusing the memory address between loops. This solution is currently working for me in my tests:
// Returns the model collection as a list of models
func (coll *Regions) ToModelList() []Model {
output := make([]Model, len(*coll))
for idx, _ := range *coll {
i := (*coll)[idx]
output[idx] = &i
}
return output
}
This gives the expected output of two distinct addresses in the output slice.
This honestly seems like a bug with the range function reusing the same memory address between runs, but I always assume I'm missing something in cases like this.
I hope I explained this well enough for you. I'm surprised that the original solution did not work.
Thanks!
In your first (non working) example item is the loop variable. Its address is not changing, only its value. That's why you get the same address in output idx times.
Run this code to see the mechanics in action;
func main() {
coll := []int{5, 10, 15}
for i, v := range coll {
fmt.Printf("This one is always the same (before Go 1.22): %p\n", &v)
fmt.Printf("This one advances by the size of an int each iteration: %p\n", &coll[i])
}
}
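In Go versions before 1.22, there is just one item variable for the entire loop, which is assigned the corresponding value during each iteration of the loop. You do not get a new item variable in each iteration. So you are just repeatedly taking the address of the same variable, which will of course be the same. (Since Go 1.22, each iteration gets its own loop variable, so this pitfall applies to code built with older language versions.)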
On the other hand, if you declared a local variable inside the loop, it will be a new variable in each iteration, and the addresses will be different:
for idx, item := range *coll {
temp := item
output[idx] = &temp
}
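A variant that behaves the same regardless of how the loop variable is scoped is to take the address of the slice element itself. This is only a sketch; elementPointers is a hypothetical helper, and it differs from the temp-copy approach in that the pointers refer to the original elements rather than to copies.

```go
package main

import "fmt"

type Region struct{ Name string }

// elementPointers returns pointers to the slice's own elements,
// not to per-iteration copies.
func elementPointers(coll []Region) []*Region {
	out := make([]*Region, len(coll))
	for i := range coll {
		out[i] = &coll[i] // address of the element in the backing array
	}
	return out
}

func main() {
	regions := []Region{{"east"}, {"west"}}
	ptrs := elementPointers(regions)
	ptrs[0].Name = "north"          // writes through to regions[0]
	fmt.Println(regions[0].Name)    // north
	fmt.Println(ptrs[0] != ptrs[1]) // true
}
```

The trade-off is that writes through these pointers change the original slice, and the pointers stop tracking the slice if a later append moves it to a new backing array.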

Is it safe to remove selected keys from map within a range loop?

How can one remove selected keys from a map?
Is it safe to combine delete() with range, as in the code below?
package main
import "fmt"
type Info struct {
value string
}
func main() {
table := make(map[string]*Info)
for i := 0; i < 10; i++ {
str := fmt.Sprintf("%v", i)
table[str] = &Info{str}
}
for key, value := range table {
fmt.Printf("deleting %v=>%v\n", key, value.value)
delete(table, key)
}
}
https://play.golang.org/p/u1vufvEjSw
This is safe! You can also find a similar sample in Effective Go:
for key := range m {
if key.expired() {
delete(m, key)
}
}
And the language specification:
The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next. If the map is nil, the number of iterations is 0.
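As a small self-contained check of the behaviour the specification describes, the sketch below deletes selected keys while ranging. The helper name deleteEven is made up for illustration.

```go
package main

import "fmt"

// deleteEven removes every even key while ranging over the map.
func deleteEven(m map[int]string) {
	for k := range m {
		if k%2 == 0 {
			delete(m, k)
		}
	}
}

func main() {
	m := map[int]string{1: "a", 2: "b", 3: "c", 4: "d"}
	deleteEven(m)
	fmt.Println(len(m)) // 2
	fmt.Println(m)      // map[1:a 3:c]
}
```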
Sebastian's answer is accurate, but I wanted to know why it was safe, so I did some digging into the map source code. It looks like on a call to delete(m, k), it basically just sets a flag (as well as changing the count value) instead of actually deleting the value:
b->tophash[i] = Empty;
(Empty is a constant for the value 0)
What the map appears to actually be doing is allocating a set number of buckets depending on the size of the map, which grows as you perform inserts at the rate of 2^B (from this source code):
byte *buckets; // array of 2^B Buckets. may be nil if count==0.
So there are almost always more buckets allocated than you're using, and when you do a range over the map, it checks that tophash value of each bucket in that 2^B to see if it can skip over it.
To summarize, the delete within a range is safe because the data is technically still there, but when it checks the tophash it sees that it can just skip over it and not include it in whatever range operation you're performing. The source code even includes a TODO:
// TODO: consolidate buckets if they are mostly empty
// can only consolidate if there are no live iterators at this size.
This explains why using the delete(m, k) function doesn't actually free up memory; it just removes the entry from the list of buckets you're allowed to access. If you want to free up the actual memory you'll need to make the entire map unreachable so that garbage collection will step in. You can do this using a line like
m = nil // m is your map variable; map itself is a keyword
I was wondering if a memory leak could happen. So I wrote a test program:
package main
import (
log "github.com/Sirupsen/logrus"
"os/signal"
"os"
"math/rand"
"time"
)
func main() {
log.Info("=== START ===")
defer func() { log.Info("=== DONE ===") }()
go func() {
m := make(map[string]string)
for {
k := GenerateRandStr(1024)
m[k] = GenerateRandStr(1024*1024)
for k2, _ := range m {
delete(m, k2)
break
}
}
}()
osSignals := make(chan os.Signal, 1)
signal.Notify(osSignals, os.Interrupt)
for {
select {
case <-osSignals:
log.Info("Received ^C command. Exit")
return
}
}
}
func GenerateRandStr(n int) string {
rand.Seed(time.Now().UnixNano())
const letterBytes = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
b := make([]byte, n)
for i := range b {
b[i] = letterBytes[rand.Int63() % int64(len(letterBytes))]
}
return string(b)
}
It looks like the GC does free the memory, so it's okay.
In short, yes. See previous answers.
And also this, from here:
ianlancetaylor commented on Feb 18, 2015
I think the key to understanding this is to realize that while executing the body of a for/range statement, there is no current iteration. There is a set of values that have been seen, and a set of values that have not been seen. While executing the body, one of the key/value pairs that has been seen--the most recent pair--was assigned to the variable(s) of the range statement. There is nothing special about that key/value pair, it's just one of the ones that has already been seen during the iteration.
The question he's answering is about modifying map elements in place during a range operation, which is why he mentions the "current iteration". But it's also relevant here: you can delete keys during a range, and that just means that you won't see them later on in the range (and if you already saw them, that's okay).

Resources