Is there a better way to parse this map?

I'm fairly new to Go. In the actual code I'm writing, I plan to read environment variables from a file, e.g. API_KEY=XYZ, so I can keep them out of version control. The solution below "works", but I feel like there is probably a better way of doing it.
The end goal is to be able to access the elements from the file like so:
m["API_KEY"] would give XYZ. This may already exist and I'm reinventing the wheel; I saw Go has environment variable support, but it didn't seem to be what I was after specifically.
So any help is appreciated.
Code:
package main

import (
	"fmt"
	"strings"
)

var m = make(map[string]string)

func main() {
	text := `Var1=Value1
Var2=Value2
Var3=Value3`

	arr := strings.Split(text, "\n")
	for _, value := range arr {
		tmp := strings.Split(value, "=")
		m[strings.TrimSpace(tmp[0])] = strings.TrimSpace(tmp[1])
	}
	fmt.Println(m)
}

First, I would recommend reading this related question: How to handle configuration in Go
Next, I would seriously consider storing your configuration in another format, because what you propose isn't a standard. It's close to Java's property file format (.properties), but even property files may contain Unicode escape sequences, so your code is not a valid .properties parser: it doesn't handle Unicode sequences at all.
Instead I would recommend JSON: you can easily parse it with Go or with any other language, there are many tools for editing JSON text, and it is still human-friendly.
Going with the JSON format, decoding it into a map is just one function call: json.Unmarshal(). It could look like this:
text := `{"Var1":"Value1", "Var2":"Value2", "Var3":"Value3"}`

var m map[string]string
if err := json.Unmarshal([]byte(text), &m); err != nil {
	fmt.Println("Invalid config file:", err)
	return
}
fmt.Println(m)
Output (try it on the Go Playground):
map[Var1:Value1 Var2:Value2 Var3:Value3]
The json package handles formatting and escaping for you, so you don't have to worry about any of that, and it detects and reports errors for you. JSON is also more flexible: your config may contain numbers, texts, arrays, etc. All of that comes for "free" just because you chose the JSON format.
Another popular format for configuration is YAML, but the Go standard library does not include a YAML parser. See the Go implementation at github.com/go-yaml/yaml.
If you don't want to change your format, then I would just use the code you posted, because it does exactly what you want: it processes the input line by line and parses a name=value pair from each line, in a clear and obvious way. Using a CSV reader (or any other format's reader) for this purpose is a bad idea, because readers intentionally and rightfully hide format-specific details and transformations. A CSV reader is a CSV reader first: even if you change the comma symbol, it will still interpret certain escape sequences and may give you different data than what you see in a plain text editor. From your point of view that is unintended behavior, but your input is not CSV, and yet you asked a reader to interpret it as CSV!
One improvement I would add to your solution is the use of bufio.Scanner. It can be used to read an input line-by-line, and it handles different styles of newline sequences. It could look like this:
text := `Var1=Value1
Var2=Value2
Var3=Value3`

scanner := bufio.NewScanner(strings.NewReader(text))
m := map[string]string{}
for scanner.Scan() {
	parts := strings.Split(scanner.Text(), "=")
	if len(parts) == 2 {
		m[strings.TrimSpace(parts[0])] = strings.TrimSpace(parts[1])
	}
}
if err := scanner.Err(); err != nil {
	fmt.Println("Error encountered:", err)
}
fmt.Println(m)
Output is the same. Try it on the Go Playground.
Using bufio.Scanner has another advantage: bufio.NewScanner() accepts an io.Reader, the general interface for "anything that is a source of bytes". This means that if your config is stored in a file, you don't even have to read the whole config into memory: open the file with os.Open(), which returns an *os.File, and since *os.File implements io.Reader, you may pass it directly to bufio.NewScanner(). The scanner will then read from the file rather than from an in-memory buffer as in the example above.

1- You may read everything with just one function call, r.ReadAll(), using a csv.NewReader from encoding/csv configured with:
r.Comma = '='
r.TrimLeadingSpace = true
The result is [][]string, and input order is preserved. Try it on The Go Playground:
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

func main() {
	text := `Var1=Value1
Var2=Value2
Var3=Value3`

	r := csv.NewReader(strings.NewReader(text))
	r.Comma = '='
	r.TrimLeadingSpace = true

	all, err := r.ReadAll()
	if err != nil {
		panic(err)
	}
	fmt.Println(all)
}
Output:
[[Var1 Value1] [Var2 Value2] [Var3 Value3]]
2- You may write your own fine-tuned ReadAll() that converts the output to a map, though then the input order is not preserved. Try it on The Go Playground:
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

func main() {
	text := `Var1=Value1
Var2=Value2
Var3=Value3`

	r := csv.NewReader(strings.NewReader(text))
	r.Comma = '='
	r.TrimLeadingSpace = true

	all, err := ReadAll(r)
	if err != nil {
		panic(err)
	}
	fmt.Println(all)
}

func ReadAll(r *csv.Reader) (map[string]string, error) {
	m := make(map[string]string)
	for {
		tmp, err := r.Read()
		if err == io.EOF {
			return m, nil
		}
		if err != nil {
			return nil, err
		}
		m[tmp[0]] = tmp[1]
	}
}
Output:
map[Var2:Value2 Var3:Value3 Var1:Value1]

Related

Flexible date/time parsing in Go (Adding default values in parsing)

Further to this question, I want to parse a date/time passed on the command line to a Go program. At the moment, I use the flag package to populate a string variable ts and then the following code:
if ts == "" {
	config.Until = time.Now()
} else {
	const layout = "2006-01-02T15:04:05"
	if config.Until, err = time.Parse(layout, ts); err != nil {
		log.Errorf("Could not parse %s as a time string: %s. Using current date/time instead.", ts, err.Error())
		config.Until = time.Now()
	}
}
This works OK, provided the user passes exactly the right format - e.g. 2019-05-20T09:07:33 or some such.
However, what I would like, if possible, is the flexibility to pass e.g. 2019-05-20T09:07 or 2019-05-20T09 or maybe even 2019-05-20 and have the hours, minutes and seconds default to 0 where appropriate.
Is there a sane[1] way to do this?
[1] not requiring me to essentially write my own parser
UPDATE
I've got a solution of sorts; although it's not particularly elegant, it does appear to work for most of the cases I'm likely to encounter.
package main

import (
	"fmt"
	"time"
)

func main() {
	const layout = "2006-01-02T15:04:05"
	var l string
	var input string
	for _, input = range []string{"2019-05-30", "2019-05-30T16", "2019-05-30T16:04", "2019-05-30T16:04:34",
		"This won't work", "This is extravagantly long and won't work either"} {
		if len(input) < len(layout) {
			l = layout[:len(input)]
		} else {
			l = layout
		}
		if d, err := time.Parse(l, input); err != nil {
			fmt.Printf("Error %s\n", err.Error())
		} else {
			fmt.Printf("Layout %-20s gives time %v\n", l, d)
		}
	}
}
Just try each format until one works. If none work, return an error.
var formats = []string{"2006-01-02T15:04:05", "2006-01-02", ...}

func parseTime(input string) (time.Time, error) {
	for _, format := range formats {
		t, err := time.Parse(format, input)
		if err == nil {
			return t, nil
		}
	}
	return time.Time{}, errors.New("unrecognized time format")
}
I think this library is what you are looking for: https://github.com/araddon/dateparse. It parses many date strings without knowing the format in advance, using a scanner to read bytes and a state machine to detect the format.
t, err := dateparse.ParseAny("3/1/2014")
In the specific scenario you describe, you could check the length of the input datestamp string and append just enough "zero" padding to make it match your layout: append as much of the string "T00:00:00" (counted from its end) as the input is missing in length compared to the layout string.

Print a type without using reflect or creating a new object

In the code below, in order to show the expected type, I have to create a new object and call reflect.TypeOf on it.
package main

import (
	"fmt"
	"reflect"
)

type X struct {
	name string
}

func check(something interface{}) {
	if _, ok := something.(*X); !ok {
		fmt.Printf("Expecting type %v, got %v\n",
			reflect.TypeOf(X{}), reflect.TypeOf(something))
	}
}

func main() {
	check(struct{}{})
}
Perhaps that object creation is not much overhead, but I'm still curious to know a better way. Is there something like Java's X.getName() or X.getSimpleName()?
To obtain the reflect.Type descriptor of a type, you may use
reflect.TypeOf((*X)(nil)).Elem()
to avoid having to create a value of type X. See these questions for more details:
How to get the string representation of a type?
Golang TypeOf without an instance and passing result to a func
And to print the type of some value, you may use fmt.Printf("%T", something).
And actually, for what you want to do, you may put reflection aside completely and simply do:
fmt.Printf("Expecting type %T, got %T\n", (*X)(nil), something)
Output will be (try it on the Go Playground):
Expecting type *main.X, got struct {}
Using reflect is almost always a bad choice. Consider one of the following approaches instead.
Use a type switch
If you want to control the flow depending on the type, you can use a type switch:
func do(i interface{}) {
	switch v := i.(type) {
	case int:
		fmt.Printf("Twice %v is %v\n", v, v*2)
	case string:
		fmt.Printf("%q is %v bytes long\n", v, len(v))
	default:
		fmt.Printf("I don't know about type %T!\n", v)
	}
}
Use the fmt package
If you only want to display the type, you can always use the fmt package:
i := 1000
fmt.Printf("The type is %T", i)

r.PostForm and r.Form always empty

I have a very strange problem, and I'm either really blind or this is some kind of bug. I have the following http.Handler:
func ServeHTTP(w http.ResponseWriter, r *http.Request) {
	err := r.ParseForm()
	if err != nil {
		log.Println("Error while parsing form data")
		return
	}
	log.Println("Printing r.PostForm:")
	for key, values := range r.PostForm { // range over map
		for _, value := range values { // range over []string
			log.Println(key, value)
		}
	}
	b, _ := ioutil.ReadAll(r.Body)
	s := string(b)
	log.Println("Printing body: ", s)
}
Now, when sending a PUT request with the following form data to the URL bound to this handler:
Name=someName
Version=1.0.0
PLanguage=java
GitRepo=someRepo
This is ALWAYS the output:
Printing r.PostForm:
Printing body: Name=someName&Version=1.0.0&PLanguage=java&GitRepo=someRepo
I've been trying to find the cause for about two hours now and have no idea what the heck is wrong here. There is no error parsing the form data, but the r.PostForm map is always empty (I also tried r.Form, with the same result). For debugging, I added the part that prints the body, just to make sure there is actually some data in there, and there is. I would really appreciate any help. Thanks in advance!
You need to set the Content-Type header.
If no Content-Type header is set, "application/octet-stream" is assumed, according to RFC 2616.
Long story short: that is a binary format, so your body will not be parsed into the form. For URL-encoded form data the header should be "application/x-www-form-urlencoded".

Go: Passing pointers of an array to gob without copying?

I have a very, very large array (not slice) of maps that I am then trying to encode. I really need to avoid making a copy of the array but I can't figure out how to do this.
So far I have this:
func doSomething() {
	var mygiantvar [5]map[string]Searcher
	mygiantvar = Load()
	Save(`file.gob.gz`, &mygiantvar)
}

func Save(filename string, variable *[5]map[string]Searcher) error {
	// Open file for writing
	fi, err := os.Create(filename)
	if err != nil {
		return err
	}
	defer fi.Close()
	// Attach gzip writer
	fz := gzip.NewWriter(fi)
	defer fz.Close()
	// Push from the gob encoder
	encoder := gob.NewEncoder(fz)
	err = encoder.Encode(*variable)
	if err != nil {
		return err
	}
	return nil
}
From my understanding that will pass a pointer of mygiantvar to Save, which saves the first copy. But then the entire array will surely be copied into encoder.Encode which will then copy it around many more functions, right?
This mygiantvar variable will be something like 10GB in size. So it must avoid being copied ever.
Then again, perhaps only the actual [5] array part is copied, while the maps inside it behave like pointers inside an array, so an array of map pointers would be copied rather than the maps themselves? I have no idea; it's all very confusing.
Any ideas?
Note that Encoder.Encode will pass around an interface{}.
func (enc *Encoder) Encode(v interface{}) error {
That means a kind of pointer to whatever you pass to it, as I described in "what is the meaning of interface{} in golang?"
(see also "Why can't I assign a *Struct to an *Interface?")
An interface value isn't the value of the concrete struct (as it has a variable size, this wouldn't be possible), but it's a kind of pointer (to be more precise a pointer to the struct and a pointer to the type)
That means it won't copy the full content of your map (or here of your array).
Since an array is a value, you could slice it to avoid any copy during the call to Encode():
err = encoder.Encode((*variable)[:]) // slice through the pointer; no array copy
See "Go Slices: usage and internals"
This is also the syntax to create a slice given an array:
x := [3]string{"Лайка", "Белка", "Стрелка"}
s := x[:] // a slice referencing the storage of x
If that doesn't work, you can keep *variable (here an array: [5]map[string]Searcher), as map types are reference types, like pointers or slices: the copy won't be huge.
See "Go maps in action".
While the array will be copied when passed to interface{}, the map content won't be copied.
See this play.golang.org example:
package main

import "fmt"

func main() {
	var a [1]map[string]int
	a[0] = make(map[string]int)
	a[0]["test"] = 0
	modify(a)
	fmt.Println(a)
}

func modify(arr interface{}) {
	a := arr.([1]map[string]int)
	a[0]["test"] = -1
}
Output:
[map[test:-1]]

Using reflection with structs to build generic handler function

I'm having trouble building a function that can dynamically work with parametrized structs. As a result, my code has 20+ functions that are identical except for basically one type. Most of my experience is with Java, where I'd write a basic generic function or take a plain Object parameter (and use reflection from that point on). I need something similar in Go.
I have several types like:
// The List structs are mostly needed for JSON marshalling
type OrangeList struct {
	Oranges []Orange
}

type BananaList struct {
	Bananas []Banana
}

type Orange struct {
	Orange_id string
	Field_1   int
	// The fields differ between types; the example is simplified
}

type Banana struct {
	Banana_id string
	Field_1   int
	// The fields differ between types; the example is simplified
}
Then I have function, basically for each list type:
// In the end there are 20+ of these; the only difference is basically two types!
// This is very un-DRY!
func buildOranges(rows *sqlx.Rows) ([]byte, error) {
	oranges := OrangeList{} // This type changes
	for rows.Next() {
		orange := Orange{}              // This type changes
		err := rows.StructScan(&orange) // This can handle each case already; could also use reflect myself
		checkError(err, "rows.Scan")
		oranges.Oranges = append(oranges.Oranges, orange)
	}
	checkError(rows.Err(), "rows.Err")
	jsontext, err := json.Marshal(oranges)
	return jsontext, err
}
Yes, I could switch the SQL library for a more intelligent ORM or framework, but that's beside the point. I want to learn how to build a generic function that can handle the same job for all my different types.
I got this far, but it still doesn't work properly (target isn't the expected struct, I think):
func buildWhatever(rows *sqlx.Rows, tgt interface{}) ([]byte, error) {
	tgtValueOf := reflect.ValueOf(tgt)
	tgtType := tgtValueOf.Type()
	targets := reflect.SliceOf(tgtValueOf.Type())
	for rows.Next() {
		target := reflect.New(tgtType)
		// At this stage target still isn't a 1:1 similar struct, so StructScan
		// fails... it's a reflect.Value wrapper instead.
		err := rows.StructScan(&target)
		// Removed appending to the list because the solutions for that would be similar
		checkError(err, "rows.Scan")
	}
	checkError(rows.Err(), "rows.Err")
	jsontext, err := json.Marshal(targets)
	return jsontext, err
}
So I would need to pass the list type and the plain type as parameters, then build one of each, and the rest of my logic would probably be fixable quite easily.
Turns out there's an sqlx.StructScan(rows, &destSlice) function that will do your inner loop, given a slice of the appropriate type. The sqlx docs refer to caching results of reflection operations, so it may have some additional optimizations compared to writing one.
Sounds like the immediate question you're actually asking is "how do I get something out of my reflect.Value that rows.StructScan will accept?" The direct answer is target.Interface(); it returns an interface{} wrapping the *Orange, which you can pass directly to StructScan (no additional & operation needed). Then, I think targets = reflect.Append(targets, target.Elem()) will turn your target into a reflect.Value representing an Orange and append it to the slice, and targets.Interface() should get you an interface{} representing an []Orange that json.Marshal understands. I say all these "should"s and "I think"s because I haven't tried that route.
Reflection, in general, is verbose and slow. Sometimes it's the best or only way to get something done, but it's often worth looking for a way to get your task done without it when you can.
So, if it works in your app, you can also convert Rows straight to JSON, without going through intermediate structs. Here's a sample program (requires sqlite3 of course) that turns sql.Rows into map[string]string and then into JSON. (Note it doesn't try to handle NULL, represent numbers as JSON numbers, or generally handle anything that won't fit in a map[string]string.)
package main

import (
	_ "code.google.com/p/go-sqlite/go1/sqlite3"
	"database/sql"
	"encoding/json"
	"os"
)

func main() {
	db, err := sql.Open("sqlite3", "foo")
	if err != nil {
		panic(err)
	}
	tryQuery := func(query string, args ...interface{}) *sql.Rows {
		rows, err := db.Query(query, args...)
		if err != nil {
			panic(err)
		}
		return rows
	}
	tryQuery("drop table if exists t")
	tryQuery("create table t(i integer, j integer)")
	tryQuery("insert into t values(?, ?)", 1, 2)
	tryQuery("insert into t values(?, ?)", 3, 1)

	// now query and serialize
	rows := tryQuery("select * from t")
	names, err := rows.Columns()
	if err != nil {
		panic(err)
	}
	// vals stores the values from one row
	vals := make([]interface{}, 0, len(names))
	for range names {
		vals = append(vals, new(string))
	}
	// rowMaps stores all rows
	rowMaps := make([]map[string]string, 0)
	for rows.Next() {
		rows.Scan(vals...)
		// now make the value list into a name=>value map
		currRow := make(map[string]string)
		for i, name := range names {
			currRow[name] = *(vals[i].(*string))
		}
		// accumulating rowMaps is the easy way out
		rowMaps = append(rowMaps, currRow)
	}
	json, err := json.Marshal(rowMaps)
	if err != nil {
		panic(err)
	}
	os.Stdout.Write(json)
}
In theory, you could build this to do fewer allocations by reusing the same rowMap each time and using a json.Encoder to append each row's JSON to a buffer. You could go a step further and not use a rowMap at all, just the lists of names and values. I should say I haven't compared the speed against a reflect-based approach, though reflect is slow enough that it might be worth comparing them if you can put up with either strategy.
