Related
I Go, I assumed slices were passed by reference, but this seems to work for values
but not for the array itself. For example, If I have this struct:
l := Line{
Points: []Point{
Point{3, 4},
},
}
I can define a variable, which gets passed a reference to the struct's slice
slice := l.Points
And then if I modify it, the original struct referenced by the variable
is going to reflect those modifications.
slice[0].X = 1000
fmt.Printf(
"This value %d is the same as this %d",
slice[0].X,
l.Points[0].X,
)
This differs from the behavior of arrays which, I assume, are passed by value.
So, for example, if I had defined the previous code using an array:
l := Line{
Points: [1]Point{
Point{3, 4},
},
}
arr := l.Points
arr[0].X = 1000
fmt.Println(arr.[0].X != s.Points[0].X) // equals true, original struct is untouched
Then, the l struct wouldn't have been modified.
Now, if I want to modify the slice itself I obviously cannot do this:
slice = append(slice, Point{99, 100})
Since that would only redefine the slice variable, losing the original reference.
I know I can simply do this:
l.Points = append(l.Points, Point{99, 100})
But, in some cases, it is more convenient to have another variable instead of having
to type the whole thing.
I tried this:
*slice = append(*slice, Point{99, 100})
But it doesn't work as I am trying to dereference something that apparently is not a pointer.
I finally tried this:
slice := &l.Points
*slice = append(l.Points, Point{99, 100})
And it works, but I am not sure what is happening. Why is the value of slice not overwritten? How does append works here?
Let's dispense first with a terminology issue. The Go language specification does not use the word reference the way you are using it. Go does however have pointers, and pointers are a form of reference. In addition, slices and maps are kind of special as there's some underlying data—the array underneath a slice, or the storage for a map—that may or may not already exist or be created by declaring or defining a variable whose type is slice of T or map[T1]T2 for some type T or type-pair T1 and T2.1
We can take your usage of the word reference to mean explicit pointer when talking about, e.g.:
func f1(p *int) {
// code ...
}
and the implied pointer when talking about:
func f2(m map[T1]T2) { ... }
func f3(s []T) { ... }
In f1, p really is a pointer: it thus refers to some actual int, or is nil. In f2, m refers to some underlying map, or is nil. In f3, s refers to some underlying array, or is nil.
But if you write:
l := Line{
Points: []Point{
Point{3, 4},
},
}
then you must have written:
type Line struct {
// ... maybe some fields here ...
Points []Point
// ... maybe more fields here ...
}
This Line is a struct type. It is not a slice type; it is not a map type. It contains a slice type but it is not itself one.
You now talk about passing these slices. If you pass l, you're passing the entire struct by value. It's pretty important to distinguish between that, and passing the value of l.Points. The function that receives one of these arguments must declare it with the right type.
For the most part, then, talking about references is just a red herring—a distraction from what's really going on. What we need to know is: What variables are you assigning what values, using what source code?
With all of that out of the way, let's talk about your actual code samples:
l.Points = append(l.Points, Point{99, 100})
This does just what it says:
Pass l.Points to append, which is a built-in as it is somewhat magically type-flexible (vs the rest of Go, where types are pretty rigid). It takes any value of type []T (slice of T, for any valid type T) plus one or more values of type T, and produces a new value of the same type, []T.
Assigns the result to l.Points.
When append does its work, it may:
receive nil (of the given type): in this case, it creates the underlying array, or
receive a non-nil slice: in this case, it writes into the underlying array or discards that array in favor of a new larger-capacity array as needed.2
So in all cases, the underlying array may have, in effect, just been created or replaced. It's therefore important that any other use of the same underlying array be updated appropriately. Assigning the result back to l.Points updates the—presumably one-and-only—slice variable that refers to the underlying array.
We can, however, break these assumptions:
s2 := l.Points
Now l.Points and s2 both refer to the (single) underlying array. Operations that modify that underlying array will, at least potentially, affect both s2 and l.Points.
Your second example is itself OK:
*slice = append(*slice, Point{99, 100})
but you haven't shown how slice itself was declared and/or assigned-to.
Your third example is fine as well:
slice := &l.Points
*slice = append(l.Points, Point{99, 100})
The first of these lines declares-and-initializes slice to point to l.Points. The variable slice therefore has type *[]Point. Its value—the value in slice, that is, rather than that in *slice—is the address of l.Points, which has type []Point.
The value in *slice is the value in l.Points. So you could write:
*slice = append(*slice, Point{99, 100})
here. Since *slice is just another name for l.Points, you can also write:
l.Points = append(*slice, Point{99, 100})
You only need to use *slice if there's some reason that l.Points is not available,3 but you may use *slice if that's more convenient. Reading *slice reads l.Points and updating *slice updates l.Points.
1To see what I mean by may or may not be created here, consider:
var s []int
vs:
var s = []int{42}
The first leaves s == nil while the second creates an underlying array with the capacity to hold the one int value 42, holding the one int value 42, so that s != nil.
2It's not clear to me whether there is a promise never to write on an existing slice-array whose capacity is greater than its current length, but not sufficient to hold the final result. That is, can append first append 10 objects to the existing underlying array, then discover that it needs a bigger array and expand the underlying array? The difference is observable if there are other slice values referring to the existing underlying array.
3Here, a classic example would occur if you have reason to pass l.Points or &l.Points to some existing (pre-written) function:
If you need pass l.Points—the slice value—to some existing function, that existing function cannot change the slice value, but could change the underlying array. That's probably a bad plan, so if it does do this, make sure that this is OK! If it only reads the slice and underlying array, that's a lot safer.
If you need to pass &l.Points—a value that points to the slice value—to some existing function, that existing function can change both the slice, and the underlying array.
If you're writing a new function, it's up to you to write it in whatever manner is most appropriate. If you're only going to read the slice and underlying array, you can take a value of type []Point. If you intend to update the slice in place, you should take a value of type *[]Point—pointer to slice of Point.
Append returns a new slice that may modify the original backing array of the initial slice. The original slice will still point to the original backing array, not the new one (which may or may not be in the same place in memory)
For example (playground)
slice := []int{1,2,3}
fmt.Println(len(slice))
// Output: 3
newSlice := append(slice, 4)
fmt.Println(len(newSlice))
// Output: 4
fmt.Println(len(slice))
// Output: 3
While a slice can be described as a "fat pointer to an array", it is not a pointer and therefore you can't dereference it, which is why you get an error.
By creating a pointer to a slice, and using append as you did above, you are setting the slice the pointer points to to the "new" slice returned by append.
For more information, check out Go Slice Usage And Internals
Your first attempt didn't work because slices are not pointers, they can be considered reference types. Append will modify the underlying array if it has enough capacity, otherwise it returns a new slice.
You can achieve what you want with a combination of your two attempts.
playground
l := Line{
Points: []Point{
Point{3, 4},
},
}
slice := &l.Points
for i := 0; i < 100; i++ {
*slice = append(*slice, Point{99 + i, 100 + i})
}
fmt.Println(l.Points)
I know that this might be sacrilegious, but, for me, it is useful to think of slices
as structs.
type Slice struct {
len int
cap int
Array *[n]T // Pointer to array of type T
}
Since in languages like C, the [] operator is also a dereferencing operator, we can think that every time we are accessing a slice, we are actually dereferencing the underlying array and assigning some value to it. That is:
var s []int
s[0] = 1
Might be thought of as equivalent to (in pseudo-code):
var s Slice
*s.Array[0] = 1
That is why we can say that slices are "pointers". For that reason, it can modify its underlying array like this:
myArray := [3]int{1,1,1}
mySlice := myArray[0:1]
mySlice = append(mySlice, 2, 3) // myArray == mySlice
Modifying mySlice also modifies myArray, since the slice stores a pointer to the array and, on appending, we are dereferencing that pointer.
This behavior, nonetheless, is not always like this. If we exceed the capacity of the original array, a new array is created and the original array is left untouched.
myArray := [3]int{1,1,1}
mySlice := myArray[0:1]
mySlice = append(mySlice, 2, 3, 4, 5) // myArray != mySlice
The confusion arises when we try to treat the slice itself as an actual pointer. Since we can modify an underlying array by appending to it, we are led to believe that in this case:
sliceCopy := mySlice
sliceCopy = append(sliceCopy, 6)
both slices, slice and sliceCopy are the same, but they are not. We have to explicitly pass a reference to the memory address of the slice (using the & operator) in order to modify it. That is:
sliceAddress := &mySlice
*sliceAddress = append(mySlice, 6) // or append(*sliceAddress, 6)
See also
https://forum.golangbridge.org/t/slice-pass-as-value-or-pointer/2866/4
https://blog.golang.org/go-slices-usage-and-internals
https://appliedgo.net/slices/
When the formal parameter is map, assigning a value directly to a formal parameter cannot change the actual argument, but if you add a new key and value to the formal parameter, the actual argument outside the function can also be seen. Why is that?
I don't understand the output value of the following code, and the formal parameters are different from the actual parameters.
unc main() {
t := map[int]int{
1: 1,
}
fmt.Println(unsafe.Pointer(&t))
copysss(t)
fmt.Println(t)
}
func copysss(m map[int]int) {
//pointer := unsafe.Pointer(&m)
//fmt.Println(pointer)
m = map[int]int{
1: 2,
}
}
stdout :0xc000086010
map[1:1]
func main() {
t := map[int]int{
1: 1,
}
fmt.Println(unsafe.Pointer(&t))
copysss(t)
fmt.Println(t)
}
func copysss(m map[int]int) {
//pointer := unsafe.Pointer(&m)
//fmt.Println(pointer)
m[1] = 2
}
stdout :0xc00007a010
map[1:2]
func main() {
t := map[int]int{
1: 1,
}
fmt.Println(unsafe.Pointer(&t))
copysss(t)
fmt.Println(t)
}
func copysss(m map[int]int) {
pointer := unsafe.Pointer(&m)
fmt.Println(pointer)
m[1] = 2
}
stdout:0xc00008a008
0xc00008a018
map[1:2]
I want to know if the parameter is a value or a pointer.
The parameter is both a value and a pointer.
Wait.. whut?
Yes, a map (and slices, for that matter) are types, pretty similar to what you would implement. Think of a map like this:
type map struct {
// meta information on the map
meta struct{
keyT type
valueT type
len int
}
value *hashTable // pointer to the underlying data structure
}
So in your first function, where you reassign m, you're passing a copy of the struct above (pass by value), and you're assigning a new map to it, creating a new hashtable pointer in the process. The variable in the function scope is updated, but the one you passed still holds a reference to the original map, and with it, the pointer to the original map is preserved.
In the second snippet, you're accessing the underlying hash table (a copy of the pointer, but the pointer points to the same memory). You're directly manipulating the original map, because you're just changing the contents of the memory.
So TL;DR
A map is a value, containing meta information of what the map looks like, and a pointer to the actual data stored inside. The pointer is passed by value, like anything else (same way pointers are passed by value in C/C++), but of course, dereferencing a pointer means you're changing the values in memory directly.
Careful...
Like I said, slices work pretty much in the same way:
type slice struct {
meta struct {
type T
len, cap int
}
value *array // yes, it's a pointer to an underlying array
}
The underlying array is of say, a slice of ints will be [10]int if the cap of the slice is 10, regardless of the length. A slice is managed by the go runtime, so if you exceed the capacity, a new array is allocated (twice the cap of the previous one), the existing data is copied over, and the slice value field is set to point to the new array. That's the reason why append returns the slice that you're appending to, the underlying pointer may have changed etc.. you can find more in-depth information on this.
The thing you have to be careful with is that a function like this:
func update(s []int) {
for i, v := range s {
s[i] = v*2
}
}
will behave much in the same way as the function you have were you're assigning m[1] = 2, but once you start appending, the runtime is free to move the underlying array around, and point to a new memory address. So bottom line: maps and slices have an internal pointer, which can produce side-effects, but you're better off avoiding bugs/ambiguities. Go supports multiple return values, so just return a slice if you set about changing it.
Notes:
In your attempt to figure out what a map is (reference, value, pointer...), I noticed you tried this:
pointer := unsafe.Pointer(&m)
fmt.Println(pointer)
What you're doing there, is actually printing the address of the argument variable, not any address that actually corresponds to the map itself. the argument passed to unsafe.Pointer isn't of the type map[int]int, but rather it's of type *map[int]int.
Personally, I think there's too much confusion around passing by value vs passing by . Go works exactly like C in this regard, just like C, absolutely everything is passed by value. It just so happens that this value can sometimes be a memory address (pointer).
More details (references)
Slices: usage & internals
Maps Note: there's some confusion caused by this one, as pointers, slices, and maps are referred to as *reference types*, but as explained by others, and elsewhere, this is not to be confused with C++ references
In Go, map is a reference type. This means that the map actually resides in the heap and variable is just a pointer to that.
The map is passed by copy. You can change the local copy in your function, but this will not be reflected in caller's scope.
But, since the map variable is a pointer to the unique map residing in the heap, every change can be seen by any variable that points to the same map.
This article can clarify the concept: https://www.ardanlabs.com/blog/2014/12/using-pointers-in-go.html.
So I am back with more beginner questions that I can not seem to wrap my head around.
I was experimenting with the following code.
func main() {
start := time.Now()
var powers []*big.Int
for i := 1; i < 1000; i++ {
I := big.NewInt(int64(i))
I.Mul(I, I)
powers = append(powers, I)
}
fmt.Println(powers)
fmt.Println(time.Since(start))
start = time.Now()
var seqDiffs []*big.Int
diff := new(big.Int)
for i, v := range powers {
if i == len(powers)-2 {
break
}
diff = v.Sub(powers[i+1], v)
seqDiffs = append(seqDiffs, diff)
}
fmt.Println(seqDiffs)
fmt.Println(time.Since(start))
}
my intention was to assign the result of Sub() to diff in the following way
diff.Sub(powers[i+1], v)
however this results in seqDiffs's value being 1995 (the correct last value) repeated over and over. I know that this is likely because seqDiffs is just a list of pointers to the same memory address but what I dont understand is why the following works just fine
v.Sub(powers[i+1], v)
seqDiffs = append(seqDiffs, v)
this results in seqDiffs being a list of all the odd numbers from 3 to 1995 which is correct but isn't this essentially still a list of pointers to the same memory address as well?
Also why is the following correct when it should also result in seqDiffs being a list of pointers to the same memory address as well?
diff = v.Sub(powers[i+1], v)
seqDiffs = append(seqDiffs, diff)
also I tried to do it the following way
diff := new(*big.Int)
for i, v := range powers {
if i == len(powers)-2 {
break
}
diff.Sub(powers[i+1], v)
seqDiffs = append(seqDiffs, diff)
}
but received these errors from the ide:
*./sequentialPowers.go:26: calling method Sub with receiver diff (type **big.Int) requires explicit dereference
./sequentialPowers.go:27: cannot use diff (type **big.Int) as type *big.Int in append*
How would I make an "explicit" dereference?
When debugging issues with pointers in Go, one way to understand what is going on is use fmt.Printf using %p to print the memory address of variables of interest.
In regards to your first question as to why when appending the results of diff.Sub(powers[i+1], v) to your slice of *big.Int results in a slice where every index is the same value - you are updating the value at the memory address diff is assigned to and appending a copy of that pointer to the slice. Thus all values in the slice are pointers to the same value.
Printing the memory address of diff will show this to be the case. After populating your slice - doing something like the following:
for _, val := range seqDiffs {
fmt.Printf("%p\n", val) // when i ran this - it printed 0xc4200b7d40 every iteration
}
In your second example, the the value v is pointer to a big.Int at a different address. You are assigning the the result of v.Sub(..) to diff, which updates the underlying address diff is pointing to. So when you append diff to your slice, you are appending a copy of of a pointer at a unique address. Using fmt.Printf you can see this like so -
var seqDiffs []*big.Int
diff := new(big.Int)
for i, v := range powers {
if i == len(powers)-2 {
break
}
diff = v.Sub(powers[i+1], v)
fmt.Printf("%p\n", diff) // 1st iteration 0xc4200109e0, 2nd 0xc420010a00, 3rd 0xc420010a20, etc
seqDiffs = append(seqDiffs, diff)
}
Regarding your second question - using the new keyword in Go allocates memory of the specified type but does not initialize it (check the docs). A call to new in your case allocates a type of pointer to a pointer to a big.Int (**big.Int), thus the compiler error saying you cannot use this type in your call to append.
To explicitly dereference diff in order to call the Sub on it, you would have to modify your code to the following:
(*diff).Sub(powers[i+1], v)
In Go, a selector expression dereferences pointers to structs for you, but in this case you are calling a method on a pointer to a pointer, thus you have to explicitly dereference it.
A very informative read on calling methods on structs (selector expressions) in Go can be found here
And to add it to the slice
seqDiffs = append(seqDiffs, *diff)
I was hitting an issue in a project I'm working on. I found a way around it, but I wasn't sure why my solution worked. I'm hoping that someone more experience with how Go pointers work could help me.
I have a Model interface and a Region struct that implements the interface. The Model interface is implemented on the pointer of the Region struct. I also have a Regions collection which is a slice of Region objects. I have a method that can turn a Regions object into a []Model:
// Regions is the collection of the Region model
type Regions []Region
// Returns the model collection as a list of models
func (coll *Regions) ToModelList() []Model {
output := make([]Model, len(*coll))
for idx, item := range *coll {
output[idx] = &item
}
return output
}
When I run this code, I end up with the first pointer to the Region outputted multiple times. So, if the Regions collection has two distinct items, I will get the same address duplicated twice. When I print the variables before I set them in the slice, they have the proper data.
I messed with it a little bit, thinking Go might be reusing the memory address between loops. This solution is currently working for me in my tests:
// Returns the model collection as a list of models
func (coll *Regions) ToModelList() []Model {
output := make([]Model, len(*coll))
for idx, _ := range *coll {
i := (*coll)[idx]
output[idx] = &i
}
return output
}
This gives the expected output of two distinct addresses in the output slice.
This honestly seems like a bug with the range function reusing the same memory address between runs, but I always assume I'm missing something in cases like this.
I hope I explained this well enough for you. I'm surprised that the original solution did not work.
Thanks!
In your first (non working) example item is the loop variable. Its address is not changing, only its value. That's why you get the same address in output idx times.
Run this code to see the mechanics in action;
func main() {
coll := []int{5, 10, 15}
for i, v := range coll {
fmt.Printf("This one is always the same; %v\n", &v)
fmt.Println("This one is 4 bytes larger each iteration; %v\n", &coll[i])
}
}
There is just one item variable for the entire loop, which is assigned the corresponding value during each iteration of the loop. You do not get a new item variable in each iteration. So you are just repeatedly taking the address of the same variable, which will of course be the same.
On the other hand, if you declared a local variable inside the loop, it will be a new variable in each iteration, and the addresses will be different:
for idx, item := range *coll {
temp := item
output[idx] = &temp
}
How can one remove selected keys from a map?
Is it safe to combine delete() with range, as in the code below?
package main
import "fmt"
type Info struct {
value string
}
func main() {
table := make(map[string]*Info)
for i := 0; i < 10; i++ {
str := fmt.Sprintf("%v", i)
table[str] = &Info{str}
}
for key, value := range table {
fmt.Printf("deleting %v=>%v\n", key, value.value)
delete(table, key)
}
}
https://play.golang.org/p/u1vufvEjSw
This is safe! You can also find a similar sample in Effective Go:
for key := range m {
if key.expired() {
delete(m, key)
}
}
And the language specification:
The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next. If the map is nil, the number of iterations is 0.
Sebastian's answer is accurate, but I wanted to know why it was safe, so I did some digging into the Map source code. It looks like on a call to delete(k, v), it basically just sets a flag (as well as changing the count value) instead of actually deleting the value:
b->tophash[i] = Empty;
(Empty is a constant for the value 0)
What the map appears to actually be doing is allocating a set number of buckets depending on the size of the map, which grows as you perform inserts at the rate of 2^B (from this source code):
byte *buckets; // array of 2^B Buckets. may be nil if count==0.
So there are almost always more buckets allocated than you're using, and when you do a range over the map, it checks that tophash value of each bucket in that 2^B to see if it can skip over it.
To summarize, the delete within a range is safe because the data is technically still there, but when it checks the tophash it sees that it can just skip over it and not include it in whatever range operation you're performing. The source code even includes a TODO:
// TODO: consolidate buckets if they are mostly empty
// can only consolidate if there are no live iterators at this size.
This explains why using the delete(k,v) function doesn't actually free up memory, just removes it from the list of buckets you're allowed to access. If you want to free up the actual memory you'll need to make the entire map unreachable so that garbage collection will step in. You can do this using a line like
map = nil
I was wondering if a memory leak could happen. So I wrote a test program:
package main
import (
log "github.com/Sirupsen/logrus"
"os/signal"
"os"
"math/rand"
"time"
)
func main() {
log.Info("=== START ===")
defer func() { log.Info("=== DONE ===") }()
go func() {
m := make(map[string]string)
for {
k := GenerateRandStr(1024)
m[k] = GenerateRandStr(1024*1024)
for k2, _ := range m {
delete(m, k2)
break
}
}
}()
osSignals := make(chan os.Signal, 1)
signal.Notify(osSignals, os.Interrupt)
for {
select {
case <-osSignals:
log.Info("Recieved ^C command. Exit")
return
}
}
}
func GenerateRandStr(n int) string {
rand.Seed(time.Now().UnixNano())
const letterBytes = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
b := make([]byte, n)
for i := range b {
b[i] = letterBytes[rand.Int63() % int64(len(letterBytes))]
}
return string(b)
}
Looks like GC do frees the memory. So it's okay.
In short, yes. See previous answers.
And also this, from here:
ianlancetaylor commented on Feb 18, 2015
I think the key to understanding this is to realize that while executing the body of a for/range statement, there is no current iteration. There is a set of values that have been seen, and a set of values that have not been seen. While executing the body, one of the key/value pairs that has been seen--the most recent pair--was assigned to the variable(s) of the range statement. There is nothing special about that key/value pair, it's just one of the ones that has already been seen during the iteration.
The question he's answering is about modifying map elements in place during a range operation, which is why he mentions the "current iteration". But it's also relevant here: you can delete keys during a range, and that just means that you won't see them later on in the range (and if you already saw them, that's okay).