Understanding pointers in Go

I am trying to understand how pointers work in Go. In general, I have little experience with pointers, as I mostly use JavaScript.
I wrote this dummy program:
package main
import "fmt"
func swap(a, b *int) {
	fmt.Println("3", &a, &b)
	*a, *b = *b, *a
	fmt.Println("4", a, b)
}
func main() {
	x := 15
	y := 2
	fmt.Println("1", x, y)
	fmt.Println("2", &x, &y)
	swap(&x, &y)
	fmt.Println("5", x, y)
}
Which prints the following results:
$ go run test.go
1 15 2
2 0x208178170 0x208178178
3 0x2081ac020 0x2081ac028
4 0x208178178 0x208178170
5 2 15
I have several questions:
From what I understand &x gives the address at which x is stored. To get the actual value of x, I need to use *x. Then, I don't understand why &x is of type *int. As *x and &x are both of type *int, their differences are not clear to me at all.
In my swap function, what is the difference between using *a, *b = *b, *a and using a, b = b, a? Both work but I can't explain why...
Why are the addresses different when printed between step 2 and 3?
Why can't I just modify the address directly assigning &b to &a for example?
Many thanks for your help

1) It's a confusion about the language that leads you to think *x and &x are the same type. As Not_a_Golfer pointed out, the uses of * in expressions and type names are different. *x in your main() does not compile, because * in an expression tries to get the value pointed to by the pointer that follows, but x is not a pointer (it's an int).
I think you were thinking of the fact that when you take a pointer to x using &x, the character you add to the type name is * to form *int. I can see how it's confusing that &var gets you *typ rather than &typ. On the other hand, if they'd put & on the type name instead, that would be confusing in other situations. Some trickiness is inevitable, and like human languages, it may be easier to learn by using than discussion alone.
2) Again, turns out the assumption is inaccurate: a, b = b, a swaps the pointers that the swap function looks at, but doesn't swap the values from main's perspective and the last line of output changes to 5 15 2: http://play.golang.org/p/rCXDgkZ9kG
3) swap is printing the address of the pointer variables, not of the underlying integers. You'd print a and b to see the addresses of the integers.
4) I'm going to assume, maybe wrongly, that you were hoping you could swap the locations that arbitrary variables point to with & syntax, as in &x, &y = &y, &x, without ever declaring pointer variables. There's some ambiguity, and if that's not what you were going for, I'm not sure if this part of the answer will help you.
As with many "why can't I..." questions the easy out is "because they defined the language that way". But going to why it's that way a little bit, I think you're required to declare a variable as a pointer at some point (or another type implemented using pointers, like maps or slices) because pointers come with booby traps: you can do something over in one piece of code that changes other code's local variables in unexpected ways, for example. So wherever you see *int appear, it's telling you you might have to worry about (or, that you're able to use) things like nil pointers, concurrent access from multiple pieces of code, etc.
Go is a bit more conservative about making pointer-hood explicit than other languages are: C++, for example, has the concept of "reference parameters" (int& i) where you could do your swap(x,y) with no & or pointers appearing in main. In other words, in languages with reference parameters, you might have to look at the declaration of a function to know whether it will change its arguments. That sort of behavior was a little too surprising/implicit/tricky for the Go folks to adopt.
No getting around that all the referencing and dereferencing takes some thinking, and you might just have to work with it a while to get it; hope all this helps, though.
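To make point 2 concrete, here is a minimal sketch that puts the two assignments side by side. The function names swapValues and swapPointers are made up for this illustration; they are not from the question.

```go
package main

import "fmt"

// swapValues exchanges the integers that a and b point to,
// so the caller's variables change.
func swapValues(a, b *int) {
	*a, *b = *b, *a
}

// swapPointers only exchanges the function's own copies of the
// two pointers; the caller's variables are left untouched.
func swapPointers(a, b *int) {
	a, b = b, a
}

func main() {
	x, y := 15, 2
	swapPointers(&x, &y)
	fmt.Println(x, y) // 15 2
	swapValues(&x, &y)
	fmt.Println(x, y) // 2 15
}
```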

From what I understand &x gives the address at which x is stored. To get the actual value of x, I need to use *x. Then, I don't understand why &x is of type *int. As *x and &x are both of type *int, their differences are not clear to me at all.
When *int appears in a function declaration, it means "pointer to int". But in a statement, *x is not of type *int; it means dereferencing the pointer. So:
*a, *b = *b, *a
This means swapping the values by dereferencing the pointers.
Function arguments are passed by copy. So if you want changes to be visible to the caller, you need to pass a pointer to the variable as the argument.
In my swap function, what is the difference between using *a, *b = *b, *a and using a, b = b, a? Both work but I can't explain why...
As I said, a, b = b, a means swapping the pointers, not swapping the values.
Why are the addresses different when printed between step 2 and 3?
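This is not a defined result: the addresses of variables are assigned by Go's runtime. Note also that step 3 prints &a and &b, the addresses of the pointer parameters themselves, rather than the addresses of x and y.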
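A rough sketch of why the printed addresses differ: a pointer parameter is a copy stored in its own memory location, so its own address is unrelated to the address it holds. The helper name pointsTo is hypothetical, and the addresses printed will vary from run to run.

```go
package main

import "fmt"

// pointsTo reports whether the pointer parameter p holds the address target.
// p is a copy of the caller's pointer and lives in its own memory location.
func pointsTo(p, target *int) bool {
	fmt.Println("value of p (address of the int):", p)
	fmt.Println("address of p itself:", &p)
	return p == target
}

func main() {
	x := 15
	fmt.Println("address of x:", &x)
	fmt.Println(pointsTo(&x, &x)) // true
}
```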
Why can't I just modify the address directly assigning &b to &a for example?
Not impossible. For example:
package main
import (
	"unsafe"
)
func main() {
	arr := []int{2, 3}
	pi := &arr[0] // take the address of the first element of arr
	println(*pi)  // prints 2
	// Advance the pointer by the size of one int (8 bytes on 64-bit platforms)
	pi = (*int)(unsafe.Pointer(uintptr(unsafe.Pointer(pi)) + unsafe.Sizeof(int(0))))
	println(*pi) // prints 3
}
Using the unsafe package, you can manipulate address values. But it's not recommended.
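For what it's worth, the same pointer step can be written a little more readably with unsafe.Add, assuming Go 1.17 or later. The helper nextInt is a made-up name for this sketch, and it is only valid when p points into an array or slice with at least one more element after it.

```go
package main

import (
	"fmt"
	"unsafe"
)

// nextInt returns a pointer to the int stored immediately after *p.
// Only valid when p points into an array or slice that has at least
// one more element after it.
func nextInt(p *int) *int {
	return (*int)(unsafe.Add(unsafe.Pointer(p), unsafe.Sizeof(*p)))
}

func main() {
	arr := []int{2, 3}
	p := &arr[0]
	fmt.Println(*p) // 2
	p = nextInt(p)
	fmt.Println(*p) // 3
	// The safe way to get the same pointer is simply &arr[1].
	fmt.Println(p == &arr[1]) // true
}
```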

Related

What is the core difference between t=&T{} and t=new(T)

It seems that both ways create a new object with all "0" member values, and both return a pointer:
type T struct{}
...
t1:=&T{}
t2:=new(T)
So what is the core difference between t1 and t2, or is there anything that "new" can do while &T{} cannot, or vice versa?
[…] is there anything that "new" can do while &T{} cannot, or vice versa?
I can think of three differences:
The "composite literal" syntax (the T{} part of &T{}) only works for "structs, arrays, slices, and maps" [link], whereas the new function works for any type [link].
For a struct or array type, the new function always generates zero values for its elements, whereas the composite literal syntax lets you initialize some of the elements to non-zero values if you like.
For a slice or map type, the new function always returns a pointer to a nil slice or map, whereas the composite literal syntax always returns an initialized slice or map. (For maps this is very significant, because you can't add elements to a nil map.) Furthermore, the composite literal syntax can even create a non-empty slice or map.
(The second and third bullet-points are actually two aspects of the same thing — that the new function always creates zero values — but I list them separately because the implications are a bit different for the different types.)
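A short sketch of the second and third differences. The Point struct and the two helper functions are hypothetical names used only for illustration.

```go
package main

import "fmt"

// Point is a hypothetical struct used only for illustration.
type Point struct{ X, Y int }

// newPointsAtNilSlice reports whether new([]int) points at a nil slice.
func newPointsAtNilSlice() bool {
	p := new([]int)
	return *p == nil
}

// literalPointsAtNilSlice reports whether &[]int{} points at a nil slice.
func literalPointsAtNilSlice() bool {
	p := &[]int{}
	return *p == nil
}

func main() {
	zeroed := new(Point)    // every field is zero
	partial := &Point{X: 3} // a literal may set some fields
	fmt.Println(*zeroed, *partial)         // {0 0} {3 0}
	fmt.Println(newPointsAtNilSlice())     // true
	fmt.Println(literalPointsAtNilSlice()) // false
	fmt.Println(*new(int))                 // 0 (there is no &int{} literal)
}
```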
For structs and other composites, both are the same.
t1:=&T{}
t2:=new(T)
// Both are the same
You cannot return the address of an unnamed variable initialised to the zero value of a basic type like int without using new. Otherwise you need to create a named variable and then take its address.
func newInt() *int {
	return new(int)
}
func newInt() *int {
	// return &int{} --> invalid
	var dummy int
	return &dummy
}
See ruakh's answer. I want to point out some of the internal implementation details, though. You should not make use of them in production code, but they help illuminate what really happens behind the scenes, in the Go runtime.
Essentially, a slice is represented by three values. The reflect package exports a type, SliceHeader:
SliceHeader is the runtime representation of a slice. It cannot be used safely or portably and its representation may change in a later release. Moreover, the Data field is not sufficient to guarantee the data it references will not be garbage collected, so programs must keep a separate, correctly typed pointer to the underlying data.
type SliceHeader struct {
	Data uintptr
	Len  int
	Cap  int
}
If we use this to inspect a variable of type []T (for any type T), we can see the three parts: the pointer to the underlying array, the length, and the capacity. Internally, a slice value v always has all three of these parts. There's a general condition that I think should hold, and if you don't use unsafe to break it, it seems by inspection that it will hold (based on limited testing anyway):
either the Data field is not zero (in which case Len and Cap can but need not be nonzero), or
the Data field is zero (in which case the Len and Cap should both be zero).
That slice value v is nil if the Data field is zero.
By using the unsafe package, we can break it deliberately (and then put it all back—and hopefully nothing goes wrong while we have it broken) and thus inspect the pieces. When this code on the Go Playground is run (there's a copy below as well), it prints:
via &literal: base of array is 0x1e52bc; len is 0; cap is 0.
Go calls this non-nil.
via new: base of array is 0x0; len is 0; cap is 0.
Go calls this nil even though we clobbered len() and cap()
Making it non-nil by unsafe hackery, we get [42] (with cap=1).
after setting *p1=nil: base of array is 0x0; len is 0; cap is 0.
Go calls this nil even though we clobbered len() and cap()
Making it non-nil by unsafe hackery, we get [42] (with cap=1).
The code itself is a bit long so I have left it to the end (or use the above link to the Playground). But it shows that the actual p == nil test in the source compiles to just an inspection of the Data field.
When you do:
p2 := new([]int)
the new function actually allocates only the slice header. It sets all three parts to zero and returns the pointer to the resulting header. So *p2 has three zero fields in it, which makes it a correct nil value.
On the other hand, when you do:
p1 := &[]int{}
the Go compiler builds an empty array (of size zero, holding zero ints) and then builds a slice header: the pointer part points to the empty array, and the length and capacity are set to zero. Then p1 points to this header, with the non-nil Data field. A later assignment, *p1 = nil, writes zeros into all three fields.
Let me repeat this with boldface: these are not promised by the language specification, they're just the actual implementation in action.
Maps work very similarly. A map variable is actually a pointer to a map header. The details of map headers are even less accessible than those of slice headers: there is no reflect type for them. The actual implementation is viewable here under type hmap (note that it is not exported).
What this means is that m2 := new(map[T1]T2) really only allocates one pointer, and sets that pointer itself to nil. There is no actual map! The new function returns the address of that nil pointer, so *m2 is nil (m2 itself, being the result of new, is not). Likewise var m1 map[T1]T2 just sets a simple pointer value in m1 to nil. But m3 := map[T1]T2{} allocates an actual hmap structure, fills it in, and makes m3 point to it. We can once again peek behind the curtain on the Go Playground, with code that is not guaranteed to work tomorrow, to see this in effect.
As someone writing Go programs, you don't need to know any of this. But if you have worked with lower-level languages (assembly and C for instance), these explain a lot. In particular, these explain why you cannot insert into a nil map: the map variable itself holds a pointer value, and until the map variable itself has a non-nil pointer to a (possibly empty) map-header, there is no way to do the insertion. An insertion could allocate a new map and insert the data, but the map variable wouldn't point to the correct hmap header object.
(The language authors could have made this work by using a second level of indirection: a map variable could be a pointer pointing to the variable that points to the map header. Or they could have made map variables always point to a header, and made new actually allocate a header, the way make does; then there would never be a nil map. But they didn't do either of these, and we get what we get, which is fine: you just need to know to initialize the map.)
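The practical consequence can be seen without any unsafe code. This is a minimal sketch, and the helper insertPanics is a made-up name: it tries an insertion and reports whether the runtime panicked.

```go
package main

import "fmt"

// insertPanics reports whether storing a key in m panics.
func insertPanics(m map[string]int) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	m["answer"] = 42
	return false
}

func main() {
	m2 := new(map[string]int)          // non-nil pointer to a nil map
	fmt.Println(m2 != nil, *m2 == nil) // true true
	fmt.Println(insertPanics(*m2))     // true

	*m2 = make(map[string]int)     // now there is a real map behind the pointer
	fmt.Println(insertPanics(*m2)) // false
	fmt.Println((*m2)["answer"])   // 42
}
```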
Here's the slice inspector. (Use the playground link to view the map inspector: given that I had to copy hmap's definition out of the runtime, I expect it to be particularly fragile and not worth showing. The slice header's structure seems far less likely to change over time.)
package main
import (
	"fmt"
	"reflect"
	"unsafe"
)
func main() {
	p1 := &[]int{}
	p2 := new([]int)
	show("via &literal", *p1)
	show("\nvia new", *p2)
	*p1 = nil
	show("\nafter setting *p1=nil", *p1)
}
// This demonstrates that given a slice (p), the test
//     if p == nil
// is really a test on p.Data. If it's zero (nil),
// the slice as a whole is nil. If it's nonzero, the
// slice as a whole is non-nil.
func show(what string, p []int) {
	pp := unsafe.Pointer(&p)
	sh := (*reflect.SliceHeader)(pp)
	fmt.Printf("%s: base of array is %#x; len is %d; cap is %d.\n",
		what, sh.Data, sh.Len, sh.Cap)
	olen, ocap := len(p), cap(p)
	sh.Len, sh.Cap = 1, 1 // evil
	if p == nil {
		fmt.Println(" Go calls this nil even though we clobbered len() and cap()")
		answer := 42
		sh.Data = uintptr(unsafe.Pointer(&answer))
		fmt.Printf(" Making it non-nil by unsafe hackery, we get %v (with cap=%d).\n",
			p, cap(p))
		sh.Data = 0 // restore nil-ness
	} else {
		fmt.Println("Go calls this non-nil.")
	}
	sh.Len, sh.Cap = olen, ocap // undo evil
}

In (Free) Pascal, can a function return a value that can be modified without dereference?

In Pascal, I understand that one could create a function returning a pointer which can be dereferenced and then assign a value to that, such as in the following (obnoxiously useless) example:
type ptr = ^integer;
var d: integer;
function f(x: integer): ptr;
begin
  f := @x;
end;
begin
  f(d)^ := 4;
end.
And now d is 4.
(The actual usage is to access part of a quite complicated array of records data structure. I know that a class would be better than an array of nested records, but it isn't my code (it's TeX: The Program) and was written before Pascal implementations supported object-orientation. The code was written using essentially a language built on top of Pascal that added macros which expand before the compiler sees them. Thus you could define some macro m that takes an argument x and expands into thearray[x + 1].f1.f2 instead of writing that every time; the usage would be m(x) := somevalue. I want to replicate this functionality with a function instead of a macro.)
However, is it possible to achieve this functionality without the ^ operator? Can a function f be written such that f(x) := y (no caret) assigns the value y to x? I know that this is stupid and the answer is probably no, but I just (a) don't really like the look of it and (b) am trying to mimic exactly the form of the macro I mentioned above.
References are not first class objects in Pascal, unlike languages such as C++ or D. So the simple answer is that you cannot directly achieve what you want.
Using a pointer as you illustrated is one way to achieve the same effect although in real code you'd need to return the address of an object whose lifetime extends beyond that of the function. In your code that is not the case because the argument x is only valid until the function returns.
You could use an enhanced record with operator overloading to encapsulate the pointer, and so encapsulate the pointer dereferencing code. That may be a good option, but it very much depends on your overall problem, of which we do not have sight.

Go functions with mixed return/signature types

I'm trying to figure out how pointers work in Go and I think I'm starting to get it, but this is confusing me and I don't really know what to search for. Let's say I have the following function:
func createNode(nodeInfo string) *TreeNode {
	return &TreeNode{info: nodeInfo}
}
I understand that the function is returning the memory address of the created struct instance, but how does the function signature say *TreeNode? According to my understanding, the * is used to dereference pointers to get the value itself, so what is happening here?
Also, here:
func zero(xPtr *int) {
	*xPtr = 0
}
func main() {
	x := 5
	zero(&x)
}
The opposite is happening where the function is accepting an argument with the * operator but the function itself is being called with & operator.
The * has 2 uses, one for variables and one for types:
For types, it signifies that the type is a pointer, not a value directly.
For variables, it dereferences a pointer, as you already know. (One might also split this use into two "sub-uses": * on the left side of an assignment, as in *ptr = val, sets the value the pointer is pointing to, while elsewhere * "retrieves" the value the pointer is pointing to.)
& on the other hand, can only be applied to values (variables and other addressable operands, plus composite literals such as &TreeNode{...}) and gets the address of an object in memory.
In your examples, the return type *TreeNode and the argument type *int signify that you are returning/expecting a pointer, according to the use for types. In contrast, *xPtr = 0 dereferences the variable xPtr.
To know which use is the correct one in your situation, you must make clear to yourself whether you are dealing with a type or a variable.
For a technical description, you can read the sections of the language specifications on Pointer types and Address operators.
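Putting both snippets from the question into one runnable sketch, with comments marking which use of * or & is which. The TreeNode definition is an assumption here, since the question only shows its info field.

```go
package main

import "fmt"

// TreeNode mirrors the struct from the question; only the info
// field is known, so that is all this sketch assumes.
type TreeNode struct {
	info string
}

// In the return type, * is part of a type: "pointer to TreeNode".
func createNode(nodeInfo string) *TreeNode {
	return &TreeNode{info: nodeInfo} // & takes the address of the new value
}

// In the parameter type, * is again part of a type: "pointer to int".
func zero(xPtr *int) {
	*xPtr = 0 // here * is an operator: store 0 where xPtr points
}

func main() {
	n := createNode("root")
	fmt.Println(n.info) // root
	x := 5
	zero(&x) // & produces the *int that zero expects
	fmt.Println(x) // 0
}
```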

Explaining C declarations in Rust

I need to rewrite these C declarations in Go and Rust for a set of practice problems I am working on. I figured out the Go part, but I am having trouble with the Rust part. Any ideas or help to write these in Rust?
double *a[n];
double (*b)[n];
double (*c[n])();
double (*d())[n];
Assuming n is a constant:
let a: [*mut f64; n]; // double *a[n];
let b: *mut [f64; n]; // double (*b)[n];
let c: [fn() -> f64; n]; // double (*c[n])();
fn d() -> *mut [f64; n]; // double (*d())[n];
These are rather awkward and unusual types in any language. Rust's syntax, however, makes these declarations a lot easier to read than C's syntax does.
Note that d in C is a function declaration. In Rust, external function declarations are only allowed in extern blocks (see the FFI guide).
The answer depends on what, exactly, the * is for. For example, is the first one being used as an array of pointers to doubles, or is it an array of arrays of doubles? Are the pointers nullable or not?
Also, is n a constant or not? If it is, then you want an array; if it's not, you want a Vec.
Also also, are these global or local declarations? Are they function arguments? There's different syntax involved for each.
Frankly, without more context, it's impossible to answer this question with any accuracy. Instead, I will give you the following:
The Rust documentation contains all the information you'll need, although it's spread out a bit. Check the reference and any appropriate-looking guides. The FFI Guide is probably worth looking at.
cdecl is a website that will unpick C declarations if that's the part you're having difficulty with. Just note that you'll have to remove the semicolon and the n or it won't parse.
The floating point types in Rust are f32 and f64, depending on whether you're using float or double. Also, don't get caught: Rust's pointer-sized integer types (isize and usize, formerly named int and uint) are not equivalent to int in C. Prefer explicitly-sized types like i32 or u64, or types from libc like c_int. isize and usize should only be used with explicitly pointer-sized values.
Normally, you'd write a reference to a T as &T or &mut T, depending on desired mutability (default in C is mutable, default in Rust is immutable).
If you want a nullable reference, use Option<&T>.
If you are trying to use these in a context where you start getting complaints about needing "lifetimes"... well, you're just going to have to learn the language. At that point, simple translation isn't going to work very well.
In Rust, array types are written as brackets around the element type. So an "array of doubles" would be [f64], and an array of size n would be [f64; n]. Typically, however, the actual equivalent to, say, double[] in C would be &[f64]; that is, a reference to an array, rather than the actual contents of the array.
Use of "raw pointers" is heavily discouraged in Rust, and you cannot use them meaningfully outside of unsafe code. In terms of syntax, a pointer to T is *const T or *mut T, depending on whether it's a pointer to constant or mutable data.
Function pointers are just written as fn (Args...) -> Result. So a function that takes nothing and returns a double would be fn () -> f64.
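Since the question also mentions a Go version of the exercise, here is one possible reading of the four C declarations in Go, for comparison. This is only a sketch: n = 3 is an arbitrary choice, and d is given a trivial body because Go has no body-less function declarations of this kind.

```go
package main

import "fmt"

const n = 3 // hypothetical array length

// double *a[n];      array of n pointers to double
var a [n]*float64

// double (*b)[n];    pointer to an array of n doubles
var b *[n]float64

// double (*c[n])();  array of n function values returning double
var c [n]func() float64

// double (*d())[n];  function returning a pointer to an array of n doubles
func d() *[n]float64 {
	return new([n]float64)
}

func main() {
	fmt.Println(len(a), len(b), len(c), len(d())) // 3 3 3 3
}
```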

Use pointer to a value as a slice

Is it possible to convert a pointer to a certain value to a slice?
For example, I want to read a single byte from an io.Reader into a uint8 variable. io.Reader.Read accepts a slice as its argument, so I cannot simply provide it a pointer to my variable as I'd do in C.
I think that creating a slice of length 1, capacity 1 from a pointer is a safe operation. Obviously, it should be the same as creating a slice from an array of length 1, which is an allowed operation. Is there an easy way to do this with a plain variable? Or maybe I do not understand something and there are reasons why this is prohibited?
A slice is not only a pointer, like an array in C. It also contains the length and capacity of the data, like this:
struct {
	ptr *uint8
	len int
	cap int
}
So, yes, you will need to create a slice. Simplest way to create a slice of the var a uint8 would be []uint8{a}
a := uint8(42)
fmt.Printf("%#v\n", []uint8{a})
(But after rereading your question, this is not a solution at all)
But if you wish to create the slice from the variable, pointing to the same space of memory, you could use the unsafe package. This is most likely to be discouraged.
fmt.Printf("%#v\n", (*[1]uint8)(unsafe.Pointer(&a))[:] )
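On Go 1.17 or later, unsafe.Slice expresses the same idea more directly. A minimal sketch, where asSlice is a hypothetical helper name; it is still unsafe code, and the slice aliases the variable rather than copying it.

```go
package main

import (
	"fmt"
	"unsafe"
)

// asSlice returns a length-1, capacity-1 slice that shares memory with *p.
func asSlice(p *uint8) []uint8 {
	return unsafe.Slice(p, 1)
}

func main() {
	a := uint8(42)
	s := asSlice(&a)
	s[0] = 7 // writes through to a
	fmt.Println(a, len(s), cap(s)) // 7 1 1
}
```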
Instead of (over)complicating this trivial task, why not use the simple solution? I.e. pass .Read a length-1 slice and then assign its zeroth element to your variable.
I found a way to overcome my case when I want to supply a variable to io.Reader. Go standard library is wonderful!
import (
	"encoding/binary"
	"io"
)
...
var x uint8
binary.Read(reader, binary.LittleEndian, &x)
As a side effect this works for any basic type and even for some non-basic.

Resources