Trying to grasp Go's pointers

I've written a small snippet to recursively walk a directory and return a slice of file paths ([]string) filtered by extension. It seems to work, but I don't fully get the idea behind pointers and how to use them properly.
package main
import (
"path/filepath"
"io/ioutil"
"fmt"
)
// aggregator slice should hold the result (a slice of filepaths)
// dir is the current directory being listed
// exts is a slice of extensions that should be included in the result
func recurse(aggregator *[]string, dir string, exts *[]string) []string {
files, _ := ioutil.ReadDir(dir)
for _, file := range files {
// current filepath
path := filepath.Join(dir, file.Name())
// if directory, recursively analyze it
if file.IsDir() {
*aggregator = recurse(aggregator, path, exts)
// else check if file is of proper extension and add it to aggregator
} else {
for _, ext := range *exts {
if (filepath.Ext(path) == ext) {
*aggregator = append(*aggregator, path)
break
}
}
}
}
return *aggregator
}
func getTemplates(templatesDir string, templatesExtensions ...string) {
templates := recurse(&[]string{}, templatesDir, &templatesExtensions)
// for testing purposes just print filepaths
for _, t := range templates {
fmt.Printf("%s\n", t)
}
}
func main() {
getTemplates("./templates", ".tmpl", ".html")
}
My main question is about the use of *aggregator (aggregator *[]string), &[]string{} and *exts (exts *[]string). I come from the JavaScript world, where every object is basically a pointer and you can only copy objects and arrays explicitly. Here, on the other hand, it seems that if I didn't use pointers (*aggregator, etc.), these objects would get copied on each recursive call.
Am I correct in this approach or am I missing something?

This visualization is not specifically about Go (it's about Java), yet it's a perfect way to picture the difference between pointers and values (1):
As you can see, a pointer does not actually contain any data; it points to the place where the data resides. So any modification made through a pointer is actually performed on the data itself. But the data does not necessarily reside where the pointer is being used.
There are different situations in which we might need pointers. For example, when you want to modify the actual values in one specific place (and not pass copies of those values around), or when your data is so big that copying it around would cost too much. You can hand out a pointer to this big data, and everybody (every function, for example) that has a pointer to that data can read or modify it. And as we just said, we can have as many pointers to the same data as needed. All of these pointers hold the same value, which is the address of the source data (object).
(1) Source

In Go, slices are often described as reference types, along with maps and channels (I'm pretty sure strings behave similarly, don't quote me on it though :) see here for a discussion on this). Strictly speaking, Go passes everything by value, but a slice value is just a small header (a pointer to the backing array, a length and a capacity), so passing a slice copies only that header while the elements stay shared.
So in your case it would be safe and probably preferable to change aggregator *[]string to aggregator []string; your data will not be copied, only the slice header. Because append may allocate a new backing array, keep returning the slice and assigning the result, so the recursive call becomes aggregator = recurse(aggregator, path, exts). Of course, along with this change you will need to change all code that was previously dereferencing aggregator, e.g.
// Change this line
*aggregator = append(*aggregator, path)
// To this
aggregator = append(aggregator, path)
The reasoning for this design likely stems from C-based languages, where an array is simply a pointer to the start of allocated memory.
Note: All other types, including structs, do not follow this pattern (interfaces are kind of another exception; interesting read). Also, this code looks like it could be greatly simplified with filepath.Walk().
For a bit more information although more targeted at maps see this blog post.

Related

Pointers sent to function

I have following code in main():
msgs, err := ch.Consume(
q.Name, // queue
//..
)
cache := ttlru.New(100, ttlru.WithTTL(5 * time.Minute)) //Cache type
//log.Println(reflect.TypeOf(msgs)) 'chan amqp.Delivery'
go func() {
//here I use `cache` and `msgs` as closures. And it works fine.
}()
I decided to create a separate function instead of the anonymous one.
I declared it as func hitCache(cache *ttlru.Cache, msgs *chan amqp.Delivery) {
I get a compile error:
./go_server.go:61: cannot use cache (type ttlru.Cache) as type *ttlru.Cache in argument to hitCache:
*ttlru.Cache is pointer to interface, not interface
./go_server.go:61: cannot use msgs (type <-chan amqp.Delivery) as type *chan amqp.Delivery in argument to hitCache
Question: How should I pass msgs and cache into the new function?
Well, if the receiving variable or a function parameter expects a value
of type *T — that is, "a pointer to T",
and you have a variable of type T, to get a pointer to it,
you have to get the address of that variable.
That's because "a pointer" is a value holding an address.
The address-taking operator in Go is &, so you need something like
hitCache(&cache, &msgs)
But note that some types have so-called "reference semantics".
That is, values of them keep references to some "hidden" data structure.
That means when you copy such values, you're copying references which all reference the same data structure.
In Go, the built-in types maps, slices and channels have reference semantics,
and hence you almost never need to pass around pointers to the values of such types (well, sometimes it can be useful but not now).
Interfaces can be thought of to have reference semantics, too (let's not for now digress into discussing this) because each value of any interface type contains two pointers.
So, in your case it's better to merely not declare the formal parameters of your function as pointers — declare them as "plain" types and be done with it.
All in all, you should definitely complete some basic resource on Go which explains these basic matters in more detail and more extensively.
You're using pointers in the function signature but not passing pointers. Passing the plain values is fine; as noted in the comments, there is no reason to use pointers for interface or channel values. Since the compiler reports msgs as a receive-only channel (<-chan amqp.Delivery), the parameter has to be receive-only as well. Just change the function signature to:
func hitCache(cache ttlru.Cache, msgs <-chan amqp.Delivery)
And it should work fine.
Pointers to interfaces are almost never used. You can simplify things and pass interfaces by value.

Why is fmt.Println not consistent when printing pointers?

I'm an experienced programmer but have never before touched Go in my life.
I just started playing around with it and I found that fmt.Println() will actually print the values of pointers prefixed by &, which is neat.
However, it doesn't do this with all types. I'm pretty sure it is because the types it does not work with are primitives (or at least, Java would call them that, does Go?).
Does anyone know why this inconsistent behaviour exists in the Go fmt library? I can easily retrieve the value by using *p, but for some reason Println doesn't do this.
Example:
package main
import "fmt"
type X struct {
S string
}
func main() {
x := X{"Hello World"}
fmt.Println(&x) // &{Hello World} <-- displays the pointed-to value prefixed with &
fmt.Println(*(&x)) // {Hello World}
i := int(1)
fmt.Println(&i) // 0x10410028 <-- instead of &1 ?
fmt.Println(*(&i)) // 1
}
The "technical" answer to your question can be found here:
https://golang.org/src/fmt/print.go?#L839
As you can see, when printing pointers to Array, Slice, Struct or Map types, the special rule of printing "&" + value applies, but in all other cases the address is printed.
As for why they decided to only apply the rule for those, it seems the authors considered that for "compound" objects you'd be interested in always seeing the values (even when using a pointer), but for other simple values this was not the case.
You can see that reasoning here, where they added the rule for the Map type which was not there before:
https://github.com/golang/go/commit/a0c5adc35cbfe071786b6115d63abc7ad90578a9#diff-ebda2980233a5fb8194307ce437dd60a
I would guess this has to do with the fact that it is very common to pass around pointers to structs, for example (so you'd often just forget to dereference the pointer when wanting to print the value), but not so common to pass around pointers to int or string (so if you were printing the pointer, you were probably interested in seeing the actual address).

Collection of Unique Functions in Go

I am trying to implement a set of functions in Go. The context is an event server; I would like to prevent (or at least warn about) adding the same handler more than once for an event.
I have read that maps are idiomatic to use as sets because of the ease of checking for membership:
if _, ok := set[item]; ok {
// don't add item
} else {
// do add item
}
I'm having some trouble with using this paradigm for functions though. Here is my first attempt:
// this is not the actual signature
type EventResponse func(args interface{})
type EventResponseSet map[*EventResponse]struct{}
func (ers EventResponseSet) Add(r EventResponse) {
if _, ok := ers[&r]; ok {
// warn here
return
}
ers[&r] = struct{}{}
}
func (ers EventResponseSet) Remove(r EventResponse) {
// if key is not there, doesn't matter
delete(ers, &r)
}
It is clear why this doesn't work: functions are not reference types in Go, though some people will tell you they are. I have proof, though we shouldn't need it since the language specification says that everything other than maps, slices, and pointers are passed by value.
Attempt 2:
func (ers EventResponseSet) Add(r *EventResponse) {
// ...
}
This has a couple of problems:
Any EventResponse has to be declared like fn := func(args interface{}){} because you can't address functions declared in the usual manner.
You can't pass a closure at all.
Using a wrapper is not an option because any function passed to the wrapper will get a new address from the wrapper - no function will be uniquely identifiable by address, and all this careful planning is for nought.
Is it silly of me to not accept defining functions as variables as a solution? Is there another (good) solution?
To be clear, I accept that there are cases that I can't catch (closures), and that's fine. The use case that I envision is defining a bunch of handlers and being relatively safe that I won't accidentally add one to the same event twice, if that makes sense.
You could use reflect.Value presented by Uvelichitel, or the function address as a string acquired by fmt.Sprint() or the address as uintptr acquired by reflect.Value.Pointer() (more in the answer How to compare 2 functions in Go?), but I recommend against it.
Since the language spec does not allow comparing function values (other than against nil), nor taking the address of a declared function, you have no guarantee that something that works at one point in your program will always work, whether within a single run or across different (future) Go compilers. I would not use it.
Since the spec is strict about this, this means compilers are allowed to generate code that would for example change the address of a function at runtime (e.g. unload an unused function, then load it again later if needed again). I don't know about such behavior currently, but this doesn't mean that a future Go compiler will not take advantage of such thing.
If you store a function address (in whatever format), that value does not count as keeping the function value anymore. And if no one else would "own" the function value anymore, the generated code (and the Go runtime) would be "free" to modify / relocate the function (and thus changing its address) – without violating the spec and Go's type safety. So you could not be rightfully angry at and blame the compiler, but only yourself.
If you want to check against reusing, you could work with interface values.
Let's say you need functions with signature:
func(p ParamType) RetType
Create an interface:
type EventResponse interface {
Do(p ParamType) RetType
}
For example, you could have an unexported struct type, and a pointer to it could implement your EventResponse interface. Make an exported function to return the single value, so no new values may be created.
E.g.:
type myEvtResp struct{}
func (m *myEvtResp) Do(p ParamType) RetType {
// Your logic comes here
}
var single = &myEvtResp{}
func Get() EventResponse { return single }
Is it really necessary to hide the implementation in a package, and only create and "publish" a single instance? Unfortunately yes, because otherwise you could create other values like &myEvtResp{}, which may be different pointers that still have the same Do() method, and the interface values wrapping them might not be equal:
Interface values are comparable. Two interface values are equal if they have identical dynamic types and equal dynamic values or if both have value nil.
[...and...]
Pointer values are comparable. Two pointer values are equal if they point to the same variable or if both have value nil. Pointers to distinct zero-size variables may or may not be equal.
The type *myEvtResp implements EventResponse and so you can register a value of it (the only value, accessible via Get()). You can have a map of type map[EventResponse]bool in which you may store your registered handlers, the interface values as keys, and true as values. Indexing a map with a key that is not in the map yields the zero value of the value type of the map. So if the value type of the map is bool, indexing it with a non-existing key will result in false – telling it's not in the map. Indexing with an already registered EventResponse (an existing key) will result in the stored value – true – telling it's in the map, it's already registered.
You can simply check whether one has already been registered:
type EventResponseSet map[EventResponse]bool
func (ers EventResponseSet) Add(r EventResponse) {
if ers[r] {
// warn here
return
}
ers[r] = true
}
Closing: This may seem a little too much hassle just to avoid duplicated use. I agree, and I wouldn't go for it. But if you want to...
Which functions do you consider equal? Comparability is not defined for function types in the language specification. reflect.Value gives you the desired behaviour, more or less:
type EventResponseSet map[reflect.Value]struct{}
set := make(EventResponseSet)
if _, ok := set[reflect.ValueOf(item)]; ok {
// don't add item
} else {
// do add item
set[reflect.ValueOf(item)] = struct{}{}
}
This check will treat as equal only items that were produced by assignment:
//for example
item1 := fmt.Println
item2 := fmt.Println
item3 := item1
//would have all same reflect.Value
but I don't think this behaviour is guaranteed by any documentation.

How to initialize a struct value fields using reflection?

I got a .ini configuration file that I want to use to initialize a Configuration struct.
I'd like to use the Configuration fields names and loop over them to populate my new instance with the corresponding value in the .ini file.
I thought the best way to achieve this might be reflection API (maybe I'm totally wrong, tell me...)
My problem here is that I cannot figure out how to access field's name (if it is at least possible)
Here is my code:
package test
import(
"reflect"
"gopkg.in/ini.v1"
)
type Config struct {
certPath string
keyPath string
caPath string
}
func InitConfig(iniConf *ini.File) *Config{
config:=new(Config)
var valuePtr reflect.Value = reflect.ValueOf(config)
var value reflect.Value = valuePtr.Elem()
for i := 0; i < value.NumField(); i++ {
field := value.Field(i)
if field.Type() == reflect.TypeOf("") {
//here is my problem, I can't get the field name, this method does not exist... :'(
value:=cfg.GetSection("section").GetKey(field.GetName())
field.SetString(value)
}
}
return config
}
Any help appreciated...
Use the type to get a StructField. The StructField has the name:
name := value.Type().Field(i).Name
Note that the ini package's File.MapTo and Section.MapTo methods implement this functionality.
While @MuffinTop solved your immediate issue, I'd say you may be solving the wrong problem. I personally know of at least two packages, github.com/Thomasdezeeuw/ini and gopkg.in/gcfg.v1, which are able to parse INI-style files (of various levels of "INI-ness", FWIW) and automatically populate your struct-typed values using reflection, so for you it merely amounts to properly setting tags on the fields of your struct (if needed at all).
I used both of these packages in production so am able to immediately recommend them. You might find more packages dedicated to parsing INI files on godoc.org.

Go Programming - bypassing access privileges using pointers

Let's say I have the following hierarchy for my project:
fragment/fragment.go
main.go
And in the fragment.go I have the following code, with one getter and no setter:
package fragment
type Fragment struct {
number int64 // private variable - lower case
}
func (f *Fragment) GetNumber() *int64 {
return &f.number
}
And in the main.go I create a Fragment and try to change Fragment.number without a setter:
package main
import (
"fmt"
"myproject/fragment"
)
func main() {
f := new(fragment.Fragment)
fmt.Println(*f.GetNumber()) // prints 0
//f.number = 8 // error - number is private
p := f.GetNumber()
*p = 4 // works. Now f.number is 4
fmt.Println(*f.GetNumber()) // prints 4
}
So by using the pointer, I changed the private variable outside of the fragment package. I understand that in for example C, pointers help to avoid copying large struct/arrays and they are supposed to enable you to change whatever they're pointing to. But I don't quite understand how they are supposed to work with private variables.
So my questions are:
Shouldn't the private variables stay private, no matter how they are accessed?
How is this compared to other languages such as C++/Java? Is it the case there too, that private variables can be changed using pointers outside of the class?
My Background: I know a bit C/C++, rather fluent in Python and new to Go. I learn programming as a hobby so don't know much about technical things happening behind the scenes.
You're not bypassing any access privileges. If you acquire a *T from any imported package, then you can always mutate *T, i.e. the pointee as a whole, as in an assignment. The designer of the imported package controls what you can get from the package, so the access control is theirs, not yours.
The one refinement to the above concerns structured types (structs): the previous still holds, but the finer-grained access to a particular field is controlled by the case of the field's name, even when the field is reached through a pointer to the whole structure. The field name must start with an uppercase letter to be visible outside its package.
Wrt C++: I believe you can achieve the same there, for instance with a public member function that returns a pointer or reference to a private member.
Wrt Java: No. Java has no pointers to fields, so there is nothing really comparable to pointers in Go (C, C++, ...).

Resources