I was hitting an issue in a project I'm working on. I found a way around it, but I wasn't sure why my solution worked. I'm hoping that someone more experience with how Go pointers work could help me.
I have a Model interface and a Region struct that implements the interface. The Model interface is implemented on the pointer of the Region struct. I also have a Regions collection which is a slice of Region objects. I have a method that can turn a Regions object into a []Model:
// Regions is the collection of the Region model
type Regions []Region
// Returns the model collection as a list of models
func (coll *Regions) ToModelList() []Model {
output := make([]Model, len(*coll))
for idx, item := range *coll {
output[idx] = &item
}
return output
}
When I run this code, I end up with the first pointer to the Region outputted multiple times. So, if the Regions collection has two distinct items, I will get the same address duplicated twice. When I print the variables before I set them in the slice, they have the proper data.
I messed with it a little bit, thinking Go might be reusing the memory address between loops. This solution is currently working for me in my tests:
// Returns the model collection as a list of models
func (coll *Regions) ToModelList() []Model {
output := make([]Model, len(*coll))
for idx, _ := range *coll {
i := (*coll)[idx]
output[idx] = &i
}
return output
}
This gives the expected output of two distinct addresses in the output slice.
This honestly seems like a bug with the range function reusing the same memory address between runs, but I always assume I'm missing something in cases like this.
I hope I explained this well enough for you. I'm surprised that the original solution did not work.
Thanks!
In your first (non working) example item is the loop variable. Its address is not changing, only its value. That's why you get the same address in output idx times.
Run this code to see the mechanics in action;
func main() {
coll := []int{5, 10, 15}
for i, v := range coll {
fmt.Printf("This one is always the same; %v\n", &v)
fmt.Println("This one is 4 bytes larger each iteration; %v\n", &coll[i])
}
}
There is just one item variable for the entire loop, which is assigned the corresponding value during each iteration of the loop. You do not get a new item variable in each iteration. So you are just repeatedly taking the address of the same variable, which will of course be the same.
On the other hand, if you declared a local variable inside the loop, it will be a new variable in each iteration, and the addresses will be different:
for idx, item := range *coll {
temp := item
output[idx] = &temp
}
Related
I Go, I assumed slices were passed by reference, but this seems to work for values
but not for the array itself. For example, If I have this struct:
l := Line{
Points: []Point{
Point{3, 4},
},
}
I can define a variable, which gets passed a reference to the struct's slice
slice := l.Points
And then if I modify it, the original struct referenced by the variable
is going to reflect those modifications.
slice[0].X = 1000
fmt.Printf(
"This value %d is the same as this %d",
slice[0].X,
l.Points[0].X,
)
This differs from the behavior of arrays which, I assume, are passed by value.
So, for example, if I had defined the previous code using an array:
l := Line{
Points: [1]Point{
Point{3, 4},
},
}
arr := l.Points
arr[0].X = 1000
fmt.Println(arr.[0].X != s.Points[0].X) // equals true, original struct is untouched
Then, the l struct wouldn't have been modified.
Now, if I want to modify the slice itself I obviously cannot do this:
slice = append(slice, Point{99, 100})
Since that would only redefine the slice variable, losing the original reference.
I know I can simply do this:
l.Points = append(l.Points, Point{99, 100})
But, in some cases, it is more convenient to have another variable instead of having
to type the whole thing.
I tried this:
*slice = append(*slice, Point{99, 100})
But it doesn't work as I am trying to dereference something that apparently is not a pointer.
I finally tried this:
slice := &l.Points
*slice = append(l.Points, Point{99, 100})
And it works, but I am not sure what is happening. Why is the value of slice not overwritten? How does append works here?
Let's dispense first with a terminology issue. The Go language specification does not use the word reference the way you are using it. Go does however have pointers, and pointers are a form of reference. In addition, slices and maps are kind of special as there's some underlying data—the array underneath a slice, or the storage for a map—that may or may not already exist or be created by declaring or defining a variable whose type is slice of T or map[T1]T2 for some type T or type-pair T1 and T2.1
We can take your usage of the word reference to mean explicit pointer when talking about, e.g.:
func f1(p *int) {
// code ...
}
and the implied pointer when talking about:
func f2(m map[T1]T2) { ... }
func f3(s []T) { ... }
In f1, p really is a pointer: it thus refers to some actual int, or is nil. In f2, m refers to some underlying map, or is nil. In f3, s refers to some underlying array, or is nil.
But if you write:
l := Line{
Points: []Point{
Point{3, 4},
},
}
then you must have written:
type Line struct {
// ... maybe some fields here ...
Points []Point
// ... maybe more fields here ...
}
This Line is a struct type. It is not a slice type; it is not a map type. It contains a slice type but it is not itself one.
You now talk about passing these slices. If you pass l, you're passing the entire struct by value. It's pretty important to distinguish between that, and passing the value of l.Points. The function that receives one of these arguments must declare it with the right type.
For the most part, then, talking about references is just a red herring—a distraction from what's really going on. What we need to know is: What variables are you assigning what values, using what source code?
With all of that out of the way, let's talk about your actual code samples:
l.Points = append(l.Points, Point{99, 100})
This does just what it says:
Pass l.Points to append, which is a built-in as it is somewhat magically type-flexible (vs the rest of Go, where types are pretty rigid). It takes any value of type []T (slice of T, for any valid type T) plus one or more values of type T, and produces a new value of the same type, []T.
Assigns the result to l.Points.
When append does its work, it may:
receive nil (of the given type): in this case, it creates the underlying array, or
receive a non-nil slice: in this case, it writes into the underlying array or discards that array in favor of a new larger-capacity array as needed.2
So in all cases, the underlying array may have, in effect, just been created or replaced. It's therefore important that any other use of the same underlying array be updated appropriately. Assigning the result back to l.Points updates the—presumably one-and-only—slice variable that refers to the underlying array.
We can, however, break these assumptions:
s2 := l.Points
Now l.Points and s2 both refer to the (single) underlying array. Operations that modify that underlying array will, at least potentially, affect both s2 and l.Points.
Your second example is itself OK:
*slice = append(*slice, Point{99, 100})
but you haven't shown how slice itself was declared and/or assigned-to.
Your third example is fine as well:
slice := &l.Points
*slice = append(l.Points, Point{99, 100})
The first of these lines declares-and-initializes slice to point to l.Points. The variable slice therefore has type *[]Point. Its value—the value in slice, that is, rather than that in *slice—is the address of l.Points, which has type []Point.
The value in *slice is the value in l.Points. So you could write:
*slice = append(*slice, Point{99, 100})
here. Since *slice is just another name for l.Points, you can also write:
l.Points = append(*slice, Point{99, 100})
You only need to use *slice if there's some reason that l.Points is not available,3 but you may use *slice if that's more convenient. Reading *slice reads l.Points and updating *slice updates l.Points.
1To see what I mean by may or may not be created here, consider:
var s []int
vs:
var s = []int{42}
The first leaves s == nil while the second creates an underlying array with the capacity to hold the one int value 42, holding the one int value 42, so that s != nil.
2It's not clear to me whether there is a promise never to write on an existing slice-array whose capacity is greater than its current length, but not sufficient to hold the final result. That is, can append first append 10 objects to the existing underlying array, then discover that it needs a bigger array and expand the underlying array? The difference is observable if there are other slice values referring to the existing underlying array.
3Here, a classic example would occur if you have reason to pass l.Points or &l.Points to some existing (pre-written) function:
If you need pass l.Points—the slice value—to some existing function, that existing function cannot change the slice value, but could change the underlying array. That's probably a bad plan, so if it does do this, make sure that this is OK! If it only reads the slice and underlying array, that's a lot safer.
If you need to pass &l.Points—a value that points to the slice value—to some existing function, that existing function can change both the slice, and the underlying array.
If you're writing a new function, it's up to you to write it in whatever manner is most appropriate. If you're only going to read the slice and underlying array, you can take a value of type []Point. If you intend to update the slice in place, you should take a value of type *[]Point—pointer to slice of Point.
Append returns a new slice that may modify the original backing array of the initial slice. The original slice will still point to the original backing array, not the new one (which may or may not be in the same place in memory)
For example (playground)
slice := []int{1,2,3}
fmt.Println(len(slice))
// Output: 3
newSlice := append(slice, 4)
fmt.Println(len(newSlice))
// Output: 4
fmt.Println(len(slice))
// Output: 3
While a slice can be described as a "fat pointer to an array", it is not a pointer and therefore you can't dereference it, which is why you get an error.
By creating a pointer to a slice, and using append as you did above, you are setting the slice the pointer points to to the "new" slice returned by append.
For more information, check out Go Slice Usage And Internals
Your first attempt didn't work because slices are not pointers, they can be considered reference types. Append will modify the underlying array if it has enough capacity, otherwise it returns a new slice.
You can achieve what you want with a combination of your two attempts.
playground
l := Line{
Points: []Point{
Point{3, 4},
},
}
slice := &l.Points
for i := 0; i < 100; i++ {
*slice = append(*slice, Point{99 + i, 100 + i})
}
fmt.Println(l.Points)
I know that this might be sacrilegious, but, for me, it is useful to think of slices
as structs.
type Slice struct {
len int
cap int
Array *[n]T // Pointer to array of type T
}
Since in languages like C, the [] operator is also a dereferencing operator, we can think that every time we are accessing a slice, we are actually dereferencing the underlying array and assigning some value to it. That is:
var s []int
s[0] = 1
Might be thought of as equivalent to (in pseudo-code):
var s Slice
*s.Array[0] = 1
That is why we can say that slices are "pointers". For that reason, it can modify its underlying array like this:
myArray := [3]int{1,1,1}
mySlice := myArray[0:1]
mySlice = append(mySlice, 2, 3) // myArray == mySlice
Modifying mySlice also modifies myArray, since the slice stores a pointer to the array and, on appending, we are dereferencing that pointer.
This behavior, nonetheless, is not always like this. If we exceed the capacity of the original array, a new array is created and the original array is left untouched.
myArray := [3]int{1,1,1}
mySlice := myArray[0:1]
mySlice = append(mySlice, 2, 3, 4, 5) // myArray != mySlice
The confusion arises when we try to treat the slice itself as an actual pointer. Since we can modify an underlying array by appending to it, we are led to believe that in this case:
sliceCopy := mySlice
sliceCopy = append(sliceCopy, 6)
both slices, slice and sliceCopy are the same, but they are not. We have to explicitly pass a reference to the memory address of the slice (using the & operator) in order to modify it. That is:
sliceAddress := &mySlice
*sliceAddress = append(mySlice, 6) // or append(*sliceAddress, 6)
See also
https://forum.golangbridge.org/t/slice-pass-as-value-or-pointer/2866/4
https://blog.golang.org/go-slices-usage-and-internals
https://appliedgo.net/slices/
When the formal parameter is map, assigning a value directly to a formal parameter cannot change the actual argument, but if you add a new key and value to the formal parameter, the actual argument outside the function can also be seen. Why is that?
I don't understand the output value of the following code, and the formal parameters are different from the actual parameters.
unc main() {
t := map[int]int{
1: 1,
}
fmt.Println(unsafe.Pointer(&t))
copysss(t)
fmt.Println(t)
}
func copysss(m map[int]int) {
//pointer := unsafe.Pointer(&m)
//fmt.Println(pointer)
m = map[int]int{
1: 2,
}
}
stdout :0xc000086010
map[1:1]
func main() {
t := map[int]int{
1: 1,
}
fmt.Println(unsafe.Pointer(&t))
copysss(t)
fmt.Println(t)
}
func copysss(m map[int]int) {
//pointer := unsafe.Pointer(&m)
//fmt.Println(pointer)
m[1] = 2
}
stdout :0xc00007a010
map[1:2]
func main() {
t := map[int]int{
1: 1,
}
fmt.Println(unsafe.Pointer(&t))
copysss(t)
fmt.Println(t)
}
func copysss(m map[int]int) {
pointer := unsafe.Pointer(&m)
fmt.Println(pointer)
m[1] = 2
}
stdout:0xc00008a008
0xc00008a018
map[1:2]
I want to know if the parameter is a value or a pointer.
The parameter is both a value and a pointer.
Wait.. whut?
Yes, a map (and slices, for that matter) are types, pretty similar to what you would implement. Think of a map like this:
type map struct {
// meta information on the map
meta struct{
keyT type
valueT type
len int
}
value *hashTable // pointer to the underlying data structure
}
So in your first function, where you reassign m, you're passing a copy of the struct above (pass by value), and you're assigning a new map to it, creating a new hashtable pointer in the process. The variable in the function scope is updated, but the one you passed still holds a reference to the original map, and with it, the pointer to the original map is preserved.
In the second snippet, you're accessing the underlying hash table (a copy of the pointer, but the pointer points to the same memory). You're directly manipulating the original map, because you're just changing the contents of the memory.
So TL;DR
A map is a value, containing meta information of what the map looks like, and a pointer to the actual data stored inside. The pointer is passed by value, like anything else (same way pointers are passed by value in C/C++), but of course, dereferencing a pointer means you're changing the values in memory directly.
Careful...
Like I said, slices work pretty much in the same way:
type slice struct {
meta struct {
type T
len, cap int
}
value *array // yes, it's a pointer to an underlying array
}
The underlying array is of say, a slice of ints will be [10]int if the cap of the slice is 10, regardless of the length. A slice is managed by the go runtime, so if you exceed the capacity, a new array is allocated (twice the cap of the previous one), the existing data is copied over, and the slice value field is set to point to the new array. That's the reason why append returns the slice that you're appending to, the underlying pointer may have changed etc.. you can find more in-depth information on this.
The thing you have to be careful with is that a function like this:
func update(s []int) {
for i, v := range s {
s[i] = v*2
}
}
will behave much in the same way as the function you have were you're assigning m[1] = 2, but once you start appending, the runtime is free to move the underlying array around, and point to a new memory address. So bottom line: maps and slices have an internal pointer, which can produce side-effects, but you're better off avoiding bugs/ambiguities. Go supports multiple return values, so just return a slice if you set about changing it.
Notes:
In your attempt to figure out what a map is (reference, value, pointer...), I noticed you tried this:
pointer := unsafe.Pointer(&m)
fmt.Println(pointer)
What you're doing there, is actually printing the address of the argument variable, not any address that actually corresponds to the map itself. the argument passed to unsafe.Pointer isn't of the type map[int]int, but rather it's of type *map[int]int.
Personally, I think there's too much confusion around passing by value vs passing by . Go works exactly like C in this regard, just like C, absolutely everything is passed by value. It just so happens that this value can sometimes be a memory address (pointer).
More details (references)
Slices: usage & internals
Maps Note: there's some confusion caused by this one, as pointers, slices, and maps are referred to as *reference types*, but as explained by others, and elsewhere, this is not to be confused with C++ references
In Go, map is a reference type. This means that the map actually resides in the heap and variable is just a pointer to that.
The map is passed by copy. You can change the local copy in your function, but this will not be reflected in caller's scope.
But, since the map variable is a pointer to the unique map residing in the heap, every change can be seen by any variable that points to the same map.
This article can clarify the concept: https://www.ardanlabs.com/blog/2014/12/using-pointers-in-go.html.
What's the difference about below ?
type Demo struct {s string}
func getDemo1()([]*Demo) // 1
func getDemo2()([]Demo) // 2
Is there any memory difference between getDemo1 and getDemo2?
I'm going to answer this, despite my better judgement to just send OP to the tour and documentation/specification. Mostly because of this:
Is there any memory difference between getDemo1 and getDemo2?
The answer to this specific question depends on how you utilize the slice. Go is Pass by value, so passing struct values around copies them. For instance, consider the following example.
https://play.golang.org/p/VzjYXwUy0EI
d1 := getDemo1()
d2 := getDemo2()
for _, v := range d1 {
// v is of type *Demo, so this modifies the value in the slice
v.s = "same"
}
fmt.Println(d1)
for _, v := range d2 {
// v is of type Demo, and is a COPY of the struct in the slice, so the original is not modified
v.s = "same"
}
So as to the memory question, obviously using *Demo, which returns a copy of the pointer in the range (effectively a uint64) as opposed to returning a copy of a Demo (the entire struct and all it's fields) would use less memory. BUT, you can still index directly to the array to avoid copies, except when you pass individual items in the slice around.
That said, passing the slice itself around, the two types have no difference in overhead. A slice is an abstraction of an array, and the slice itself that gets passed around is merely a slice header, which would be the same memory footprint regardless what type the slice contains.
BTW, the paradigm for modifying the values in the case of []Demo is:
for i, _ := range d2 {
d2[i].s = "same"
}
Here's my code:
func test(v map[string]string) {
v["foo"] = "bar"
}
func main() {
v := make(map[string]string)
test(v)
fmt.Printf("%v\n", v) // prints map[foo:bar]
}
I'm pretty new to Go, but as far as I was aware, since I'm passing the map value to test() and not a pointer to the map, the test() function should modify a different variable of the map, and thus, not affect the value of the variable in main(). I would have expected it to print map[]. I tested a different scenario:
type myStruct struct {
foo int
}
func test2(v myStruct) {
v.foo = 5
}
func main() {
v := myStruct{1}
test2(v)
fmt.Printf("%v\n", v) // prints {1}
}
In this scenario, the code behaves as I would expect. The v variable in the main() function is not affected by the changes to the variable in test2(). So why is map different?
You are right in that when you pass something to a function, a copy will be made. But maps are some kind of descriptors to an underlying data structure. So when you pass a map value to a function, only the descriptor will be copied which will denote / point to the same data structures where the map data (entries) are stored.
This means if the function does any modification to the entries of the map (add, delete, modify entries), that is observed from the caller.
Read The Go Blog: Go maps in action for details.
Note that the same applies to slices and channels too; generally speaking the types that you can create using the built-in make() function. That's why the zero value of these types is nil, because a value of these types need some extra initialization which is done when calling make().
In your other example you are using a struct value, they are not descriptors. When you pass a struct value to another function, that creates a complete copy of the struct value (copying values of all its fields), which –when modified inside the function– will not have any effect on the original, as the memory of the copy will be modified – which is distinct.
How can one remove selected keys from a map?
Is it safe to combine delete() with range, as in the code below?
package main
import "fmt"
type Info struct {
value string
}
func main() {
table := make(map[string]*Info)
for i := 0; i < 10; i++ {
str := fmt.Sprintf("%v", i)
table[str] = &Info{str}
}
for key, value := range table {
fmt.Printf("deleting %v=>%v\n", key, value.value)
delete(table, key)
}
}
https://play.golang.org/p/u1vufvEjSw
This is safe! You can also find a similar sample in Effective Go:
for key := range m {
if key.expired() {
delete(m, key)
}
}
And the language specification:
The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next. If map entries that have not yet been reached are removed during iteration, the corresponding iteration values will not be produced. If map entries are created during iteration, that entry may be produced during the iteration or may be skipped. The choice may vary for each entry created and from one iteration to the next. If the map is nil, the number of iterations is 0.
Sebastian's answer is accurate, but I wanted to know why it was safe, so I did some digging into the Map source code. It looks like on a call to delete(k, v), it basically just sets a flag (as well as changing the count value) instead of actually deleting the value:
b->tophash[i] = Empty;
(Empty is a constant for the value 0)
What the map appears to actually be doing is allocating a set number of buckets depending on the size of the map, which grows as you perform inserts at the rate of 2^B (from this source code):
byte *buckets; // array of 2^B Buckets. may be nil if count==0.
So there are almost always more buckets allocated than you're using, and when you do a range over the map, it checks that tophash value of each bucket in that 2^B to see if it can skip over it.
To summarize, the delete within a range is safe because the data is technically still there, but when it checks the tophash it sees that it can just skip over it and not include it in whatever range operation you're performing. The source code even includes a TODO:
// TODO: consolidate buckets if they are mostly empty
// can only consolidate if there are no live iterators at this size.
This explains why using the delete(k,v) function doesn't actually free up memory, just removes it from the list of buckets you're allowed to access. If you want to free up the actual memory you'll need to make the entire map unreachable so that garbage collection will step in. You can do this using a line like
map = nil
I was wondering if a memory leak could happen. So I wrote a test program:
package main
import (
log "github.com/Sirupsen/logrus"
"os/signal"
"os"
"math/rand"
"time"
)
func main() {
log.Info("=== START ===")
defer func() { log.Info("=== DONE ===") }()
go func() {
m := make(map[string]string)
for {
k := GenerateRandStr(1024)
m[k] = GenerateRandStr(1024*1024)
for k2, _ := range m {
delete(m, k2)
break
}
}
}()
osSignals := make(chan os.Signal, 1)
signal.Notify(osSignals, os.Interrupt)
for {
select {
case <-osSignals:
log.Info("Recieved ^C command. Exit")
return
}
}
}
func GenerateRandStr(n int) string {
rand.Seed(time.Now().UnixNano())
const letterBytes = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
b := make([]byte, n)
for i := range b {
b[i] = letterBytes[rand.Int63() % int64(len(letterBytes))]
}
return string(b)
}
Looks like GC do frees the memory. So it's okay.
In short, yes. See previous answers.
And also this, from here:
ianlancetaylor commented on Feb 18, 2015
I think the key to understanding this is to realize that while executing the body of a for/range statement, there is no current iteration. There is a set of values that have been seen, and a set of values that have not been seen. While executing the body, one of the key/value pairs that has been seen--the most recent pair--was assigned to the variable(s) of the range statement. There is nothing special about that key/value pair, it's just one of the ones that has already been seen during the iteration.
The question he's answering is about modifying map elements in place during a range operation, which is why he mentions the "current iteration". But it's also relevant here: you can delete keys during a range, and that just means that you won't see them later on in the range (and if you already saw them, that's okay).