Does Go (deep) copy keys when inserting into a map? - dictionary

I have a map with complex keys - for example, 2D arrays:
m := make(map[[2][3]int]int)
When I insert a new key into the map, does Go make a deep copy of the key?
a := [2][3]int{{1, 2, 3}, {4, 5, 6}}
m[a] = 1
In other words, if I change the array a after using it as a map key, does the map still contain the old value of a?

Short answer, it is copied.
By specification, Arrays are value types.
Go's arrays are values. An array variable denotes the entire array; it is not a pointer to the first array element (as would be the case in C). This means that when you assign or pass around an array value you will make a copy of its contents. (To avoid the copy you could pass a pointer to the array, but then that's a pointer to an array, not an array.)
https://blog.golang.org/go-slices-usage-and-internals
See for yourself:
https://play.golang.org/p/fEUYWwN-pm
package main
import (
"fmt"
)
func main() {
m := make(map[[2][3]int]int)
a := [2][3]int{{1, 2, 3}, {4, 5, 6}}
fmt.Printf("Pointer to a: %p\n", &a)
m[a] = 1
for k, _ := range m {
fmt.Printf("Pointer to k: %p\n", &k)
}
}
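The pointers do not match. Note, however, that &k is the address of the loop variable k, so it would differ from &a regardless of how the map stores its keys.
EDIT: The real reason is that when inserting into a map, the key value is copied. Or, you can continue to just remember the rule above: arrays are value types, and assigning or passing them makes a copy. Either works here. :)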
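A more direct check than comparing addresses is to mutate the array after the insert and see which key the map still finds. The following is a minimal sketch; the function name keyIsCopied is made up for illustration.

```go
package main

import "fmt"

// keyIsCopied inserts an array key, mutates the original array,
// and reports which versions of the key the map can still find.
func keyIsCopied() (oldFound, newFound bool) {
	m := make(map[[2][3]int]int)
	a := [2][3]int{{1, 2, 3}, {4, 5, 6}}
	m[a] = 1

	old := a      // array assignment copies all six ints
	a[0][0] = 100 // mutate the original after insertion

	_, oldFound = m[old] // the map still holds the original contents
	_, newFound = m[a]   // the mutated array is a different key
	return oldFound, newFound
}

func main() {
	oldFound, newFound := keyIsCopied()
	fmt.Println(oldFound, newFound) // true false
}
```

The map keeps the contents the array had at insertion time, so later changes to a do not reach the stored key.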

Arrays are always passed by value, so, yes in this case Go will make a deep copy of the key.
From the language spec
The comparison operators == and != must be fully defined for operands of the key type; thus the key type must not be a function, map, or slice. If the key type is an interface type, these comparison operators must be defined for the dynamic key values; failure will cause a run-time panic.
The keys are copied into the map. Excluding functions, maps, and slices as valid key types means that the keys can't change after insertion. Note that Go doesn't follow pointers: if you define a map type with a pointer as the key (e.g. map[*int]int), it compares the pointers directly.
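The point about pointer keys can be sketched as follows. The function name pointerKeys and its return values are illustrative assumptions, not part of any API.

```go
package main

import "fmt"

// pointerKeys shows that a map keyed by *int distinguishes keys by address,
// not by the value the pointer refers to.
func pointerKeys() (sameAddr, otherAddr bool, viaPointer int) {
	x, y := 7, 7
	m := map[*int]int{&x: 1}

	_, sameAddr = m[&x]  // same address: found
	_, otherAddr = m[&y] // equal pointee, different address: not found

	x = 8 // changing the pointee does not disturb the entry
	viaPointer = m[&x]
	return sameAddr, otherAddr, viaPointer
}

func main() {
	fmt.Println(pointerKeys()) // true false 1
}
```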

Related

Why I cannot append a value to a structs' slice using a reference?

In Go, I assumed slices were passed by reference, but this seems to work for values
but not for the array itself. For example, If I have this struct:
l := Line{
Points: []Point{
Point{3, 4},
},
}
I can define a variable, which gets passed a reference to the struct's slice
slice := l.Points
And then if I modify it, the original struct referenced by the variable
is going to reflect those modifications.
slice[0].X = 1000
fmt.Printf(
"This value %d is the same as this %d",
slice[0].X,
l.Points[0].X,
)
This differs from the behavior of arrays which, I assume, are passed by value.
So, for example, if I had defined the previous code using an array:
l := Line{
Points: [1]Point{
Point{3, 4},
},
}
arr := l.Points
arr[0].X = 1000
fmt.Println(arr[0].X != l.Points[0].X) // prints true, original struct is untouched
Then, the l struct wouldn't have been modified.
Now, if I want to modify the slice itself I obviously cannot do this:
slice = append(slice, Point{99, 100})
Since that would only redefine the slice variable, losing the original reference.
I know I can simply do this:
l.Points = append(l.Points, Point{99, 100})
But, in some cases, it is more convenient to have another variable instead of having
to type the whole thing.
I tried this:
*slice = append(*slice, Point{99, 100})
But it doesn't work as I am trying to dereference something that apparently is not a pointer.
I finally tried this:
slice := &l.Points
*slice = append(l.Points, Point{99, 100})
And it works, but I am not sure what is happening. Why is the value of slice not overwritten? How does append work here?
Let's dispense first with a terminology issue. The Go language specification does not use the word reference the way you are using it. Go does however have pointers, and pointers are a form of reference. In addition, slices and maps are kind of special as there's some underlying data (the array underneath a slice, or the storage for a map) that may or may not already exist or be created by declaring or defining a variable whose type is slice of T or map[T1]T2 for some type T or type-pair T1 and T2. [1]
We can take your usage of the word reference to mean explicit pointer when talking about, e.g.:
func f1(p *int) {
// code ...
}
and the implied pointer when talking about:
func f2(m map[T1]T2) { ... }
func f3(s []T) { ... }
In f1, p really is a pointer: it thus refers to some actual int, or is nil. In f2, m refers to some underlying map, or is nil. In f3, s refers to some underlying array, or is nil.
But if you write:
l := Line{
Points: []Point{
Point{3, 4},
},
}
then you must have written:
type Line struct {
// ... maybe some fields here ...
Points []Point
// ... maybe more fields here ...
}
This Line is a struct type. It is not a slice type; it is not a map type. It contains a slice type but it is not itself one.
You now talk about passing these slices. If you pass l, you're passing the entire struct by value. It's pretty important to distinguish between that, and passing the value of l.Points. The function that receives one of these arguments must declare it with the right type.
For the most part, then, talking about references is just a red herring—a distraction from what's really going on. What we need to know is: What variables are you assigning what values, using what source code?
With all of that out of the way, let's talk about your actual code samples:
l.Points = append(l.Points, Point{99, 100})
This does just what it says:
Passes l.Points to append, which is a built-in because it is somewhat magically type-flexible (vs. the rest of Go, where types are pretty rigid). It takes any value of type []T (slice of T, for any valid type T) plus one or more values of type T, and produces a new value of the same type, []T.
Assigns the result to l.Points.
When append does its work, it may:
receive nil (of the given type): in this case, it creates the underlying array, or
receive a non-nil slice: in this case, it writes into the underlying array or discards that array in favor of a new larger-capacity array as needed. [2]
So in all cases, the underlying array may have, in effect, just been created or replaced. It's therefore important that any other use of the same underlying array be updated appropriately. Assigning the result back to l.Points updates the—presumably one-and-only—slice variable that refers to the underlying array.
We can, however, break these assumptions:
s2 := l.Points
Now l.Points and s2 both refer to the (single) underlying array. Operations that modify that underlying array will, at least potentially, affect both s2 and l.Points.
Your second example is itself OK:
*slice = append(*slice, Point{99, 100})
but you haven't shown how slice itself was declared and/or assigned-to.
Your third example is fine as well:
slice := &l.Points
*slice = append(l.Points, Point{99, 100})
The first of these lines declares-and-initializes slice to point to l.Points. The variable slice therefore has type *[]Point. Its value—the value in slice, that is, rather than that in *slice—is the address of l.Points, which has type []Point.
The value in *slice is the value in l.Points. So you could write:
*slice = append(*slice, Point{99, 100})
here. Since *slice is just another name for l.Points, you can also write:
l.Points = append(*slice, Point{99, 100})
You only need to use *slice if there's some reason that l.Points is not available, [3] but you may use *slice if that's more convenient. Reading *slice reads l.Points and updating *slice updates l.Points.
[1] To see what I mean by may or may not be created here, consider:
var s []int
vs:
var s = []int{42}
The first leaves s == nil while the second creates an underlying array with the capacity to hold the one int value 42, holding the one int value 42, so that s != nil.
[2] It's not clear to me whether there is a promise never to write on an existing slice-array whose capacity is greater than its current length, but not sufficient to hold the final result. That is, can append first append 10 objects to the existing underlying array, then discover that it needs a bigger array and expand the underlying array? The difference is observable if there are other slice values referring to the existing underlying array.
[3] Here, a classic example would occur if you have reason to pass l.Points or &l.Points to some existing (pre-written) function:
If you need to pass l.Points (the slice value) to some existing function, that existing function cannot change the slice value, but could change the underlying array. That's probably a bad plan, so if it does do this, make sure that this is OK! If it only reads the slice and underlying array, that's a lot safer.
If you need to pass &l.Points—a value that points to the slice value—to some existing function, that existing function can change both the slice, and the underlying array.
If you're writing a new function, it's up to you to write it in whatever manner is most appropriate. If you're only going to read the slice and underlying array, you can take a value of type []Point. If you intend to update the slice in place, you should take a value of type *[]Point—pointer to slice of Point.
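The two parameter styles described in this footnote could look like the sketch below. The function names sumX and appendPoint are hypothetical, and the Point and Line types are assumed to match the ones in the question.

```go
package main

import "fmt"

type Point struct{ X, Y int }

type Line struct{ Points []Point }

// sumX only reads the slice, so taking a []Point value is enough.
func sumX(ps []Point) int {
	total := 0
	for _, p := range ps {
		total += p.X
	}
	return total
}

// appendPoint updates the caller's slice variable, so it takes *[]Point.
func appendPoint(ps *[]Point, p Point) {
	*ps = append(*ps, p)
}

func main() {
	l := Line{Points: []Point{{3, 4}}}
	appendPoint(&l.Points, Point{99, 100})
	fmt.Println(len(l.Points), sumX(l.Points)) // 2 102
}
```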
append returns a new slice value. It may write into the original slice's backing array (when there is spare capacity), but the original slice variable is unchanged: it keeps its length and still points to the original backing array, which may or may not be the array that the new slice uses.
For example (playground)
slice := []int{1,2,3}
fmt.Println(len(slice))
// Output: 3
newSlice := append(slice, 4)
fmt.Println(len(newSlice))
// Output: 4
fmt.Println(len(slice))
// Output: 3
While a slice can be described as a "fat pointer to an array", it is not a pointer and therefore you can't dereference it, which is why you get an error.
By creating a pointer to a slice, and using append as you did above, you are setting the slice the pointer points to to the "new" slice returned by append.
For more information, check out Go Slice Usage And Internals
Your first attempt didn't work because slices are not pointers, although they can be considered reference types. append will write into the underlying array if it has enough capacity; otherwise it allocates a new array. Either way, it returns a new slice value.
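Both capacity cases can be observed directly. This is a small sketch with made-up function names (sharedAppend, grownAppend); it relies only on the built-in make and append.

```go
package main

import "fmt"

// sharedAppend appends twice to the same slice, which has spare capacity.
func sharedAppend() (a, b []int) {
	base := make([]int, 2, 3) // len 2, cap 3: room for one more element
	a = append(base, 10)      // fits: written into base's array
	b = append(base, 20)      // fits too: overwrites the same slot
	return a, b
}

// grownAppend appends to a full slice, which forces a new array.
func grownAppend() (base, grown []int) {
	base = []int{1, 2}      // len 2, cap 2
	grown = append(base, 3) // does not fit: data is copied to a new array
	grown[0] = 99           // does not affect base
	return base, grown
}

func main() {
	fmt.Println(sharedAppend()) // [0 0 20] [0 0 20]
	fmt.Println(grownAppend())  // [1 2] [99 2 3]
}
```

In the first case both results share one array, so the second append silently changes the first result; in the second case the two slices are independent.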
You can achieve what you want with a combination of your two attempts.
playground
l := Line{
Points: []Point{
Point{3, 4},
},
}
slice := &l.Points
for i := 0; i < 100; i++ {
*slice = append(*slice, Point{99 + i, 100 + i})
}
fmt.Println(l.Points)
I know that this might be sacrilegious, but, for me, it is useful to think of slices
as structs.
type Slice struct {
len int
cap int
Array *[n]T // Pointer to array of type T
}
Since in languages like C, the [] operator is also a dereferencing operator, we can think that every time we are accessing a slice, we are actually dereferencing the underlying array and assigning some value to it. That is:
s := make([]int, 1)
s[0] = 1
Might be thought of as equivalent to (in pseudo-code):
var s Slice
*s.Array[0] = 1
That is why we can say that slices are "pointers". For that reason, it can modify its underlying array like this:
myArray := [3]int{1,1,1}
mySlice := myArray[0:1]
mySlice = append(mySlice, 2, 3) // myArray is now [1 2 3], the same contents as mySlice
Modifying mySlice also modifies myArray, since the slice stores a pointer to the array and, on appending, we are dereferencing that pointer.
This behavior, nonetheless, is not always like this. If we exceed the capacity of the original array, a new array is created and the original array is left untouched.
myArray := [3]int{1,1,1}
mySlice := myArray[0:1]
mySlice = append(mySlice, 2, 3, 4, 5) // myArray is still [1 1 1]; mySlice is [1 2 3 4 5] in a new array
The confusion arises when we try to treat the slice itself as an actual pointer. Since we can modify an underlying array by appending to it, we are led to believe that in this case:
sliceCopy := mySlice
sliceCopy = append(sliceCopy, 6)
both slices, mySlice and sliceCopy, are the same, but they are not. To modify mySlice itself, we have to go through its memory address explicitly (obtained with the & operator). That is:
sliceAddress := &mySlice
*sliceAddress = append(mySlice, 6) // or append(*sliceAddress, 6)
See also
https://forum.golangbridge.org/t/slice-pass-as-value-or-pointer/2866/4
https://blog.golang.org/go-slices-usage-and-internals
https://appliedgo.net/slices/

When the formal parameter in Go is a map, what is passed in?

When the formal parameter is map, assigning a value directly to a formal parameter cannot change the actual argument, but if you add a new key and value to the formal parameter, the actual argument outside the function can also be seen. Why is that?
I don't understand the output value of the following code, and the formal parameters are different from the actual parameters.
func main() {
t := map[int]int{
1: 1,
}
fmt.Println(unsafe.Pointer(&t))
copysss(t)
fmt.Println(t)
}
func copysss(m map[int]int) {
//pointer := unsafe.Pointer(&m)
//fmt.Println(pointer)
m = map[int]int{
1: 2,
}
}
stdout :0xc000086010
map[1:1]
func main() {
t := map[int]int{
1: 1,
}
fmt.Println(unsafe.Pointer(&t))
copysss(t)
fmt.Println(t)
}
func copysss(m map[int]int) {
//pointer := unsafe.Pointer(&m)
//fmt.Println(pointer)
m[1] = 2
}
stdout :0xc00007a010
map[1:2]
func main() {
t := map[int]int{
1: 1,
}
fmt.Println(unsafe.Pointer(&t))
copysss(t)
fmt.Println(t)
}
func copysss(m map[int]int) {
pointer := unsafe.Pointer(&m)
fmt.Println(pointer)
m[1] = 2
}
stdout:0xc00008a008
0xc00008a018
map[1:2]
I want to know if the parameter is a value or a pointer.
The parameter is both a value and a pointer.
Wait.. whut?
Yes, a map (and slices, for that matter) are types, pretty similar to what you would implement. Think of a map like this:
type map struct {
// meta information on the map
meta struct{
keyT type
valueT type
len int
}
value *hashTable // pointer to the underlying data structure
}
So in your first function, where you reassign m, you're passing a copy of the struct above (pass by value), and you're assigning a new map to it, creating a new hashtable pointer in the process. The variable in the function scope is updated, but the one you passed still holds a reference to the original map, and with it, the pointer to the original map is preserved.
In the second snippet, you're accessing the underlying hash table (a copy of the pointer, but the pointer points to the same memory). You're directly manipulating the original map, because you're just changing the contents of the memory.
So TL;DR
A map is a value, containing meta information of what the map looks like, and a pointer to the actual data stored inside. The pointer is passed by value, like anything else (same way pointers are passed by value in C/C++), but of course, dereferencing a pointer means you're changing the values in memory directly.
Careful...
Like I said, slices work pretty much in the same way:
type slice struct {
meta struct {
type T
len, cap int
}
value *array // yes, it's a pointer to an underlying array
}
For a slice of ints with a capacity of 10, say, the underlying array is a [10]int, regardless of the slice's length. A slice is managed by the Go runtime, so if you exceed the capacity, a new, larger array is allocated (often about twice the previous capacity, though the exact growth factor is an implementation detail), the existing data is copied over, and the slice's value field is set to point to the new array. That's the reason why append returns the slice that you're appending to: the underlying pointer may have changed. You can find more in-depth information on this in the references below.
The thing you have to be careful with is that a function like this:
func update(s []int) {
for i, v := range s {
s[i] = v*2
}
}
will behave much in the same way as the function you have where you're assigning m[1] = 2, but once you start appending, the runtime is free to move the underlying array around, and point to a new memory address. So bottom line: maps and slices have an internal pointer, which can produce side-effects, but you're better off avoiding bugs/ambiguities. Go supports multiple return values, so just return a slice if you set about changing it.
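The advice to return the slice can be sketched like this. The three function names are hypothetical and only contrast the cases discussed above.

```go
package main

import "fmt"

// doubleInPlace writes through the shared backing array; the caller sees it.
func doubleInPlace(s []int) {
	for i, v := range s {
		s[i] = v * 2
	}
}

// appendLost appends to its own copy of the slice header only.
func appendLost(s []int) {
	s = append(s, 4) // the caller's length (and possibly array) is unchanged
}

// appendReturned hands the possibly reallocated slice back to the caller.
func appendReturned(s []int) []int {
	return append(s, 4)
}

func main() {
	s := []int{1, 2, 3}
	doubleInPlace(s)
	appendLost(s)
	fmt.Println(s) // [2 4 6]
	s = appendReturned(s)
	fmt.Println(s) // [2 4 6 4]
}
```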
Notes:
In your attempt to figure out what a map is (reference, value, pointer...), I noticed you tried this:
pointer := unsafe.Pointer(&m)
fmt.Println(pointer)
What you're doing there is actually printing the address of the argument variable, not any address that actually corresponds to the map itself. The argument passed to unsafe.Pointer isn't of the type map[int]int, but rather it's of type *map[int]int.
Personally, I think there's too much confusion around passing by value vs passing by reference. Go works exactly like C in this regard: just like C, absolutely everything is passed by value. It just so happens that this value can sometimes be a memory address (pointer).
More details (references)
Slices: usage & internals
Maps Note: there's some confusion caused by this one, as pointers, slices, and maps are referred to as *reference types*, but as explained by others, and elsewhere, this is not to be confused with C++ references
In Go, a map is a reference type. This means that the map data actually resides in the heap and the variable is just a pointer to it.
The map is passed by copy. You can change the local copy in your function, but this will not be reflected in caller's scope.
But, since the map variable is a pointer to the unique map residing in the heap, every change can be seen by any variable that points to the same map.
This article can clarify the concept: https://www.ardanlabs.com/blog/2014/12/using-pointers-in-go.html.

Map initialization in Go

As far as I understand, types slice and map are similar in many ways in Go. They are both reference (or container) types. In terms of abstract data types, they represent an array and an associative array, respectively.
However, their behaviour is quite different.
var s []int
var m map[int]int
While we can use a declared slice immediately (append new items or reslice it), we cannot do anything with a newly declared map. We have to call make function and initialize a map explicitly. Therefore, if some struct contains a map we have to write a constructor function for the struct.
So, the question is why it is not possible to add some syntactic sugar and both allocate and initialize the memory when declaring a map.
I did google the question and learnt a new word, "autovivification", but I am still failing to see the reason.
I am not talking about struct literal. Yes, you can explicitly initialize a map by providing values such as m := map[int]int{1: 1}. However, if you have some struct:
package main
import (
"fmt"
)
type SomeStruct struct {
someField map[int]int
someField2 []int
}
func main() {
s := SomeStruct{}
s.someField2 = append(s.someField2, -1) // OK
s.someField[0] = -1 // panic: assignment to entry in nil map
fmt.Println(s)
}
It is not possible to use a struct immediately (with default values for all fields). One has to create a constructor function for SomeStruct which has to initialize a map explicitly.
While we can use a declared slice immediately (append new items or reslice it), we cannot do anything with a newly declared map. We have to call make function and initialize a map explicitly. Therefore, if some struct contains a map we have to write a constructor function for the struct.
That's not true. Default value–or more precisely zero value–for both slices and maps is nil. You may do the "same" with a nil map as you can do with a nil slice. You can check length of a nil map, you can index a nil map (result will be the zero value of the value type of the map), e.g. the following are all working:
var m map[int]int
fmt.Println(m == nil) // Prints true
fmt.Println(len(m)) // Prints 0
fmt.Println(m[2]) // Prints 0
Try it on the Go Playground.
What you "feel" more about the zero-value slice is that you may add values to it. This is true, but under the hood a new slice will be allocated using the exact make() builtin function that you'd have to call for a map in order to add entries to it, and you have to (re)assign the returned slice. So a zero-value slice is "no more ready for use" than a zero-value map. append() just takes care of necessary (re)allocation and copying over. We could have an "equivalent" addEntry() function to which you could pass a map value and the key-value pairs, and if the passed map is nil, it could allocate a new map value and return it. If you don't call append(), you can't add values to a nil slice, just as you can't add entries to a nil map.
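The "equivalent" addEntry() function mentioned above does not exist in Go; a hypothetical version for map[int]int, written only to show the parallel with append(), might look like this:

```go
package main

import "fmt"

// addEntry is a hypothetical map counterpart to append: it allocates the
// map when it is nil and returns the map so the caller can reassign it.
func addEntry(m map[int]int, key, value int) map[int]int {
	if m == nil {
		m = make(map[int]int)
	}
	m[key] = value
	return m
}

func main() {
	var m map[int]int // nil map
	m = addEntry(m, 0, -1)
	fmt.Println(m) // map[0:-1]
}
```

As with append(), the caller has to assign the result back, because the nil map passed in cannot be initialized from inside the function.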
The primary reason that the zero value for slices and maps is nil (and not an initialized slice or map) is performance and efficiency. It is very often that a map or slice value (either variable or a struct field) will never get used, or not right away, and so if they would be allocated at declaration, that would be a waste of memory (and some CPU) resources, not to mention it gives more job to the garbage collector. Also if the zero value would be an initialized value, it would often be insufficient (e.g. a 0-size slice cannot hold any elements), and often it would be discarded as you add new elements to it (so the initial allocation would be a complete waste).
Yes, there are cases when you do want to use slices and maps right away, in which cases you may call make() yourself, or use a composite literal. You may also use the special form of make() where you supply the (initial) capacity for maps, avoiding future restructuring of the map internals (which usually requires non-negligible computation). An automatic non-nil default value could not guess what capacity you'd require.
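If the goal is a struct whose zero value is usable without a constructor, one common pattern (sketched here as an assumption about what the asker wants, with made-up method names Set and Get) is to allocate the map lazily on the first write:

```go
package main

import "fmt"

type SomeStruct struct {
	someField map[int]int
}

// Set allocates the map on first use, so the zero value of SomeStruct works.
func (s *SomeStruct) Set(key, value int) {
	if s.someField == nil {
		s.someField = make(map[int]int)
	}
	s.someField[key] = value
}

// Get needs no nil check: indexing a nil map yields the zero value.
func (s *SomeStruct) Get(key int) int {
	return s.someField[key]
}

func main() {
	var s SomeStruct
	fmt.Println(s.Get(0)) // 0
	s.Set(0, -1)
	fmt.Println(s.Get(0)) // -1
}
```

This keeps the allocation cost away from values whose map is never written, which is the efficiency argument made above.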
You can! What you're looking for is:
package main
import "fmt"
func main() {
v := map[int]int{}
v[1] = 1
v[2] = 2
fmt.Println(v)
}
:= declares and assigns, whereas var simply declares.

How to make composite key for a hash map in go

First, my definition of composite key: two or more values combined to make the key. Not to be confused with composite keys in databases.
My goal is to save computed values of pow(x, y) in a hash table, where x and y are integers. This is where I need ideas on how to make a key, so that given x and y, I can look it up in the hash table, to find pow(x,y).
For example:
pow(2, 3) => {key(2,3):8}
What I want to figure out is how to get the map key for the pair (2,3), i.e. the best way to generate a key which is a combination of multiple values, and use it in hash table.
The easiest and most flexible way is to use a struct as the key type, including all the data you want to be part of the key, so in your case:
type Key struct {
X, Y int
}
And that's all. Using it:
m := map[Key]int{}
m[Key{2, 2}] = 4
m[Key{2, 3}] = 8
fmt.Println("2^2 = ", m[Key{2, 2}])
fmt.Println("2^3 = ", m[Key{2, 3}])
Output (try it on the Go Playground):
2^2 = 4
2^3 = 8
Spec: Map types: You may use any types as the key where the comparison operators == and != are fully defined, and the above Key struct type fulfills this.
Spec: Comparison operators: Struct values are comparable if all their fields are comparable. Two struct values are equal if their corresponding non-blank fields are equal.
One important thing: you should not use a pointer as the key type (e.g. *Key), because comparing pointers only compares the memory address, and not the pointed values.
Also note that you could also use arrays (not slices) as key type, but arrays are not as flexible as structs. You can read more about this here: Why have arrays in Go?
This is what it would look like with arrays:
type Key [2]int
m := map[Key]int{}
m[Key{2, 2}] = 4
m[Key{2, 3}] = 8
fmt.Println("2^2 = ", m[Key{2, 2}])
fmt.Println("2^3 = ", m[Key{2, 3}])
Output is the same. Try it on the Go Playground.
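Applied to the original goal of caching pow(x, y), the struct key could be used as in the following sketch. PowCache, NewPowCache, and the Misses counter are names invented for this illustration, and the exponent is assumed to be non-negative.

```go
package main

import "fmt"

type Key struct {
	X, Y int
}

// PowCache memoizes integer powers keyed by the (x, y) pair.
type PowCache struct {
	values map[Key]int
	Misses int // number of results that had to be computed
}

func NewPowCache() *PowCache {
	return &PowCache{values: make(map[Key]int)}
}

// Pow returns x raised to y (y >= 0 assumed), computing it at most once.
func (c *PowCache) Pow(x, y int) int {
	k := Key{x, y}
	if v, ok := c.values[k]; ok {
		return v
	}
	c.Misses++
	result := 1
	for i := 0; i < y; i++ {
		result *= x
	}
	c.values[k] = result
	return result
}

func main() {
	c := NewPowCache()
	fmt.Println(c.Pow(2, 3)) // 8
	fmt.Println(c.Pow(2, 3)) // 8, served from the map
	fmt.Println(c.Pow(3, 2)) // 9, a different key
	fmt.Println(c.Misses)    // 2
}
```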
Go can't make a hash of a slice of ints.
Therefore the way I would approach this is mapping a struct to a number.
Here is an example of how that could be done:
package main
import (
"fmt"
)
type Nums struct {
num1 int
num2 int
}
func main() {
powers := make(map[Nums]int)
numbers := Nums{num1: 2, num2: 4}
powers[numbers] = 16 // 2^4
fmt.Printf("%v", powers[numbers])
}
I hope that helps
Your specific problem is nicely solved by the other answers. I want to add an additional trick that may be useful in some corner cases.
Given that map keys must be comparable, you can also use interfaces. Interfaces are comparable if their dynamic values are comparable.
This allows you to essentially partition the map, i.e. to use multiple types of keys within the same data structure. For example if you want to store in your map n-tuples (it wouldn't work with arrays, because the array length is part of the type).
The idea is to define an interface with a dummy method (but it can surely be not dummy at all), and use that as map key:
type CompKey interface {
isCompositeKey() bool
}
var m map[CompKey]string
At this point you can have arbitrary types implementing the interface, either explicitly or by just embedding it.
In this example, the idea is to make the interface method unexported so that other structs may just embed the interface without having to provide an actual implementation — the method can't be called from outside its package. It will just signal that the struct is usable as a composite map key.
type AbsoluteCoords struct {
CompKey
x, y int
}
type RelativeCoords struct {
CompKey
x, y int
}
func foo() {
p := AbsoluteCoords{x: 1, y: 2}
r := RelativeCoords{x: 10, y: 20}
m[p] = "foo"
m[r] = "bar"
fmt.Println(m[AbsoluteCoords{x: 10, y: 20}]) // "" (empty, types don't match)
fmt.Println(m[RelativeCoords{x: 10, y: 20}]) // "bar" (matches, key present)
}
Of course nothing stops you from declaring actual methods on the interface, that may be useful when ranging over the map keys.
The disadvantage of this interface key is that it is now your responsibility to make sure the implementing types are actually comparable. E.g. this map key will panic:
type BadKey struct {
CompKey
nonComparableSliceField []int
}
b := BadKey{nil, []int{1,2}}
m[b] = "bad!" // panic: runtime error: hash of unhashable type main.BadKey
All in all, this might be an interesting approach when you need to keep two sets of K/V pairs in the same map, e.g. to keep some sanity in function signatures or to avoid defining structs with N very similar map fields.
Playground https://play.golang.org/p/0t7fcvSWdy7

How does a pointer to a struct or array value in Go work?

Considering the following Go struct:
type Person struct {
Name string
Age int
Country string
}
I have encountered numerous times the following use:
p := &Person{"Adam", 33, "Argentina"}
Yet I can not see the point in pointing to a struct value, and I wonder, how does it differ from:
n := &999 // Error
My questions are:
How is it even possible to point to a value, even if it is a struct or array and not a primitive like a string or int? Strangely enough, the following doesn't contribute to my understanding:
fmt.Println(p, &p) // outputs: &{Adam 33 Argentina} 0xc042084018
Why would a programmer want to declare a struct instance by a pointer? What could you achieve doing so?
&Person{} is a language "construct", it's part of the spec: it allocates a new variable of Person type, and provides you the address of that anonymous variable.
Spec: Composite literals:
Taking the address of a composite literal generates a pointer to a unique variable initialized with the literal's value.
Also: Spec: Variables:
Calling the built-in function new or taking the address of a composite literal allocates storage for a variable at run time.
&999 is not allowed by the language spec. The possible operands of the address operators are listed in the Spec: Address operators:
The operand must be addressable, that is, either a variable, pointer indirection, or slice indexing operation; or a field selector of an addressable struct operand; or an array indexing operation of an addressable array. As an exception to the addressability requirement, x may also be a (possibly parenthesized) composite literal.
p := Person{} creates a new variable p whose type will be Person. p := &Person{} creates a new variable whose type will be *Person.
See possible duplicate: How do I do a literal *int64 in Go?
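Since &999 is rejected, a pointer to an int has to come from something addressable. A few ways to get one are sketched below; intPtr is a hypothetical helper, not a standard library function.

```go
package main

import "fmt"

// intPtr is a hypothetical helper: its parameter v is an addressable
// variable, so taking its address is allowed.
func intPtr(v int) *int {
	return &v
}

func main() {
	// n := &999 // does not compile: a constant is not addressable

	v := 999
	a := &v // address of an ordinary variable

	b := new(int) // new allocates a zeroed int and returns its address
	*b = 999

	c := intPtr(999) // each call yields a pointer to a fresh variable

	fmt.Println(*a, *b, *c)     // 999 999 999
	fmt.Println(a == b, b == c) // false false
}
```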
When you print the values with the fmt package, it has certain rules how to print values of different types:
For compound objects, the elements are printed using these rules, recursively, laid out like this:
struct: {field0 field1 ...}
array, slice: [elem0 elem1 ...]
maps: map[key1:value1 key2:value2]
pointer to above: &{}, &[], &map[]
When you use fmt.Println(), the default formatting rules will be applied, which for a value of type *int is the %p verb, which will print the memory address in hexadecimal format, but for a pointer to struct it prints the struct value prepended with an & sign (&{}). You can read more about it in related question: Difference between golang pointers
If you want to print the pointed value, dereference the pointer and pass the pointed value, e.g.:
var p = new(int)
*p = 12
fmt.Println(*p) // Prints 12
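The same distinction applies to the Person pointer from the question. In this sketch the helper name describe is made up; fmt.Sprint applies the same default formatting as fmt.Println.

```go
package main

import "fmt"

type Person struct {
	Name    string
	Age     int
	Country string
}

// describe formats a *Person and the Person it points to with default rules.
func describe() (ptr, val string) {
	p := &Person{"Adam", 33, "Argentina"}
	return fmt.Sprint(p), fmt.Sprint(*p)
}

func main() {
	ptr, val := describe()
	fmt.Println(ptr) // &{Adam 33 Argentina}
	fmt.Println(val) // {Adam 33 Argentina}
}
```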
As to why to create a pointer to a value (and not a value), see these related questions:
Pointers vs. values in parameters and return values
Why should constructor of Go return address?
Go, X does not implement Y (... method has a pointer receiver)
