Related
Okay it's hard to describe it in words but let's say I have a map that stores int pointers, and want to store the result of an operation as another key in my hash:
m := make(map[string]*int)
m["d"] = &(*m["x"] + *m["y"])
This doesn't work and gives me the error: cannot take the address of *m["x"] & *m["y"]
Thoughts?
A pointer is a memory address. For example a variable has an address in memory.
The result of an operation like 3 + 4 does not have an address because there is no specific memory allocated for it. The result may just live in processor registers.
You have to allocate memory whose address you can put into the map. The easiest and most straightforward is to create a local variable for it.
See this example:
x, y := 1, 2
m := map[string]*int{"x": &x, "y": &y}
d := *m["x"] + *m["y"]
m["d"] = &d
fmt.Println(m["d"], *m["d"])
Output (try it on the Go Playground):
0x10438300 3
Note: If the code above is in a function, the address of the local variable (d) that we just put into the map will continue to live even if we return from the function (that is if the map is returned or created outside - e.g. a global variable). In Go it is perfectly safe to take and return the address of a local variable. The compiler will analyze the code and if the address (pointer) escapes the function, it will automatically be allocated on the heap (and not on the stack). For details see FAQ: How do I know whether a variable is allocated on the heap or the stack?
Note #2: There are other ways to create a pointer to a value (as detailed in this answer: How do I do a literal *int64 in Go?), but they are just "tricks" and are not nicer or more efficient. Using a local variable is the cleanest and recommended way.
For example this also works without creating a local variable, but it's obviously not intuitive at all:
m["d"] = &[]int{*m["x"] + *m["y"]}[0]
Output is the same. Try it on the Go Playground.
The result of the addition is placed somewhere transient (on the stack) and it would therefore not be safe to take its address. You should be able to work around this by explicitly allocating an int on the heap to hold your result:
result := make(int)
*result = *m["x"] + *m["y"]
m["d"] = result
In Go, you can not take the reference of a literal value (formally known as an r-value). Try the following:
package main
import "fmt"
func main() {
x := 3;
y := 2;
m := make(map[string]*int)
m["x"] = &x
m["y"] = &y
f := *m["x"] + *m["y"]
m["d"] = &f
fmt.Printf("Result: %d\n",*m["d"])
}
Have a look at this tutorial.
I'm quite new to Erlang (Reading through "Software for a Concurrent World"). From what I've read, we link two processes together to form a reliable system.
But if we need more than two process, I think we should connect them in a ring. Although this is slightly tangential to my actual question, please let me know if this is incorrect.
Given a list of PIDs:
[1,2,3,4,5]
I want to form these in a ring of {My_Pid, Linked_Pid} tuples:
[{1,2},{2,3},{3,4},{4,5},{5,1}]
I have trouble creating an elegant solution that adds the final {5,1} tuple.
Here is my attempt:
% linkedPairs takes [1,2,3] and returns [{1,2},{2,3}]
linkedPairs([]) -> [];
linkedPairs([_]) -> [];
linkedPairs([X1,X2|Xs]) -> [{X1, X2} | linkedPairs([X2|Xs])].
% joinLinks takes [{1,2},{2,3}] and returns [{1,2},{2,3},{3,1}]
joinLinks([{A, _}|_]=P) ->
{X, Y} = lists:last(P)
P ++ [{Y, A}].
% makeRing takes [1,2,3] and returns [{1,2},{2,3},{3,1}]
makeRing(PIDs) -> joinLinks(linkedPairs(PIDs)).
I cringe when looking at my joinLinks function - list:last is slow (I think), and it doesn't look very "functional".
Is there a better, more idiomatic solution to this?
If other functional programmers (non-Erlang) stumble upon this, please post your solution - the concepts are the same.
Use lists:zip with the original list and its 'rotated' version:
1> L=[1,2,3].
[1,2,3]
2> lists:zip(L, tl(L) ++ [hd(L)]).
[{1,2},{2,3},{3,1}]
If you are manipulating long lists, you can avoid the creation of the intermediary list tl(L) ++ [hd(L)] using an helper function:
1> L = lists:seq(1,5).
[1,2,3,4,5]
2> Link = fun Link([Last],First,Acc) -> lists:reverse([{Last,First}|Acc]);
Link([X|T],First,Acc) -> Link(T,First,[{X,hd(T)}|Acc]) end.
#Fun<erl_eval.42.127694169>
3> Joinlinks = fun(List) -> Link(List,hd(List),[]) end.
#Fun<erl_eval.6.127694169>
4> Joinlinks(L).
[{1,2},{2,3},{3,4},{4,5},{5,1}]
5>
But if we need more than two process, I think we should connect them
in a ring.
No. For instance, suppose you want to download the text of 10 different web pages. Instead of sending a request, then waiting for the server to respond, then sending the next request, etc., you can spawn a separate process for each request. Each spawned process only needs the pid of the main process, and the main process collects the results as they come in. When a spawned process gets a reply from a server, the spawned process sends a message to the main process with the results, then terminates. The spawned processes have no reason to send messages to each other. No ring.
I would guess that it is unlikely that you will ever create a ring of processes in your erlang career.
I have trouble creating an elegant solution that adds the final {5,1} tuple.
You can create the four other processes passing them self(), which will be different for each spawned process. Then, you can create a separate branch of your create_ring() function that terminates the recursion and returns the pid of the last created process to the main process:
init(N) ->
LastPid = create_ring(....),
create_ring(0, PrevPid) -> PrevPid;
create_ring(N, PrevPid) when N > 0 ->
Pid = spawn(?MODULE, loop, [PrevPid]),
create_ring(.......).
Then, the main process can call (not spawn) the same function that is being spawned by the other processes, passing the function the last pid that was returned by the create_ring() function:
init(N) ->
LastPid = create_ring(...),
loop(LastPid).
As a result, the main process will enter into the same message loop as the other processes, and the main process will have the last pid stored in the loop parameter variable to send messages to.
In erlang, you will often find that while you are defining a function, you won't be able to do everything that you want in that function, so you need to call another function to do whatever it is that is giving you trouble, and if in the second function you find you can't do everything you need to do, then you need to call another function, etc. Applied to the ring problem above, I found that init() couldn't do everything I wanted in one function, so I defined the create_ring() function to handle part of the problem.
I want to get the data pointer of a string variable(like string::c_str() in c++) to pass to a c function and I found this doesn't work:
package main
/*
#include <stdio.h>
void Println(const char* str) {printf("%s\n", str);}
*/
import "C"
import (
"unsafe"
)
func main() {
s := "hello"
C.Println((*C.char)(unsafe.Pointer(&(s[0]))))
}
Compile error info is: 'cannot take the address of s[0]'.
This will be OK I but I doubt it will cause unneccesary memory apllying. Is there a better way to get the data pointer?
C.Println((*C.char)(unsafe.Pointer(&([]byte(s)[0]))))
There are ways to get the underlying data from a Go string to C without copying it. It will not work as a C string because it is not a C string. Your printf will not work even if you manage to extract the pointer even if it happens to work sometimes. Go strings are not C strings. They used to be for compatibility when Go used more libc, they aren't anymore.
Just follow the cgo manual and use C.CString. If you're fighting for efficiency you'll win much more by just not using cgo because the overhead of calling into C is much bigger than allocating some memory and copying a string.
(*reflect.StringHeader)(unsafe.Pointer(&sourceTail)).Data
Strings in go are not null terminated, therefore you should always pass the Data and the Len parameter to the corresponding C functions. There is a family of functions in the C standard library to deal with this type of strings, for example if you want to format them with printf, the format specifier is %.*s instead of %s and you have to pass both, the length and the pointer in the arguments list.
I'm a bit confused when I see code such as:
bigBox := &BigBox{}
bigBox.BubbleGumsCount = 4 // correct...
bigBox.SmallBox.AnyMagicItem = true // also correct
Why, or when, would I want to do bigBox := &BigBox{} instead of bigBox := BigBox{} ? Is it more efficient in some way?
Code sample was taken from here.
Sample no.2:
package main
import "fmt"
type Ints struct {
x int
y int
}
func build_struct() Ints {
return Ints{0,0}
}
func build_pstruct() *Ints {
return &Ints{0,0}
}
func main() {
fmt.Println(build_struct())
fmt.Println(build_pstruct())
}
Sample no. 3: ( why would I go with &BigBox in this example, and not with BigBox as a struct directly ? )
func main() {
bigBox := &BigBox{}
bigBox.BubbleGumsCount = 4
fmt.Println(bigBox.BubbleGumsCount)
}
Is there ever a reason to call build_pstruct instead of the the build_struct variant? Isn't that why we have the GC?
I figured out one motivation for this kind of code: avoidance of "struct copying by accident".
If you use a struct variable to hold the newly created struct:
bigBox := BigBox{}
you may copy the struct by accident like this
myBox := bigBox // Where you just want a refence of bigBox.
myBox.BubbleGumsCount = 4
or like this
changeBoxColorToRed(bigBox)
where changeBoxColorToRed is
// It makes a copy of entire struct as parameter.
func changeBoxColorToRed(box bigBox){
// !!!! This function is buggy. It won't work as expected !!!
// Please see the fix at the end.
box.Color=red
}
But if you use a struct pointer:
bigBox := &BigBox{}
there will be no copying in
myBox := bigBox
and
changeBoxColorToRed(bigBox)
will fail to compile, giving you a chance to rethink the design of changeBoxColorToRed. The fix is obvious:
func changeBoxColorToRed(box *bigBox){
box.Color=red
}
The new version of changeBoxColorToRed does not copy the entire struct and works correctly.
bb := &BigBox{} creates a struct, but sets the variable to be a pointer to it. It's the same as bb := new(BigBox). On the other hand, bb := BigBox{} makes bb a variable of type BigBox directly. If you want a pointer (because perhaps because you're going to use the data via a pointer), then it's better to make bb a pointer, otherwise you're going to be writing &bb a lot. If you're going to use the data as a struct directly, then you want bb to be a struct, otherwise you're going to be dereferencing with *bb.
It's off the point of the question, but it's usually better to create data in one go, rather than incrementally by creating the object and subsequently updating it.
bb := &BigBox{
BubbleGumsCount: 4,
SmallBox: {
AnyMagicItem: true,
},
}
The & takes an address of something. So it means "I want a pointer to" rather than "I want an instance of". The size of a variable containing a value depends on the size of the value, which could be large or small. The size of a variable containing a pointer is 8 bytes.
Here are examples and their meanings:
bigBox0 := &BigBox{} // bigBox0 is a pointer to an instance of BigBox{}
bigBox1 := BigBox{} // bigBox1 contains an instance of BigBox{}
bigBox2 := bigBox // bigBox2 is a copy of bigBox
bigBox3 := &bigBox // bigBox3 is a pointer to bigBox
bigBox4 := *bigBox3 // bigBox4 is a copy of bigBox, dereferenced from bigBox3 (a pointer)
Why would you want a pointer?
To prevent copying a large object when passing it as an argument to a function.
You want to modify the value by passing it as an argument.
To keep a slice, backed by an array, small. [10]BigBox would take up "the size of BigBox" * 10 bytes. [10]*BigBox would take up 8 bytes * 10. A slice when resized has to create a larger array when it reaches its capacity. This means the memory of the old array has to be copied to the new array.
Why do you not what to use a pointer?
If an object is small, it's better just to make a copy. Especially if it's <= 8 bytes.
Using pointers can create garbage. This garbage has to be collected by the garbage collector. The garbage collector is a mark-and-sweep stop-the-world implementation. This means that it has to freeze your application to collect the garbage. The more garbage it has to collect, the longer that pause is. This individual, for example. experienced a pause up to 10 seconds.
Copying an object uses the stack rather than the heap. The stack is usually always faster than the heap. You really don't have to think about stack vs heap in Go as it decides what should go where, but you shouldn't ignore it either. It really depends on the compiler implementation, but pointers can result in memory going on the heap, resulting in the need for garbage collection.
Direct memory access is faster. If you have a slice []BigBox and it doesn't change size it can be faster to access. []BigBox is faster to read, whereas []*BigBox is faster to resize.
My general advice is use pointers sparingly. Unless you're dealing with a very large object that needs to be passed around, it's often better to pass around a copy on the stack. Reducing garbage is a big deal. The garbage collector will get better, but you're better off by keeping it as low as possible.
As always test your application and profile it.
The difference is between creating a reference object (with the ampersand) vs. a value object (without the ampersand).
There's a nice explanation of the general concept of value vs. reference type passing here... What's the difference between passing by reference vs. passing by value?
There is some discussion of these concepts with regards to Go here... http://www.goinggo.net/2013/07/understanding-pointers-and-memory.html
In general there is no difference between a &BigBox{} and BigBox{}. The Go compiler is free to do whatever it likes as long as the semantics are correct.
func StructToStruct() {
s := Foo{}
StructFunction(&s)
}
func PointerToStruct() {
p := &Foo{}
StructFunction(p)
}
func StructToPointer() {
s := Foo{}
PointerFunction(&s)
}
func PointerToPointer() {
p := &Foo{}
PointerFunction(p)
}
//passed as a pointer, but used as struct
func StructFunction(f *Foo) {
fmt.Println(*f)
}
func PointerFunction(f *Foo) {
fmt.Println(f)
}
Summary of the assembly:
StructToStruct: 13 lines, no allocation
PointerToStruct: 16 lines, no allocation
StructToPointer: 20 lines, heap allocated
PointerToPointer: 12 lines, heap allocated
With a perfect compiler the *ToStruct functions would be the identical as would the *ToPointer functions. Go's escape analysis is good enough to tell if a pointer escapes even across module boundries. Which ever way is most efficient is the way the compiler will do it.
If you're really into micro-optimization note that Go is most efficient when the syntax lines up with the semantics (struct used as a struct, pointer used as a pointer). Or you can just forget about it and declare the variable the way it will be used and you will be right most of the time.
Note: if Foo is really big PointerToStruct will heap allocate it. The spec threatens to that even StructToStruct is allowed to do this but I couldn't make it happen. The lesson here is that the compiler will do whatever it wants. Just as the details of the registers is shielded from the code, so is the state of the heap/stack. Don't change your code because you think you know how the compiler is going to use the heap.
I have pasted a code below which is in Ada language.I need some clarification on some implementations.
C : character;
Char : character;
type Myarr_Type is array (character range 'A'..'K') of character;
Myarr : Myarr_Type := ('A','B','C','D','E','F','G','H','I','J','K');
Next_Address := Myarr'address --'
Last_Address := Next_Address + Storage_Offset'(40); --'
return P2 + Storage_Offset'(4); --'
Last_Address := Next_Address + Storage_Offset'(4); --'
Now my doubt is 1) what does P2 + Storage_Offset'(4) actually mean.Does that mean that its returning the address of the next element in the array which is 'B'.Storage_Offset'(4) in Ada --does this mean 4 bits or 4 bytes of memory. 2) If i assume that Last_Address points to last element of the array which is 'K', how does the arithmentic Storage_Offset'(40) satisfies the actual implementation?
Please get back to me if u need any more clarifications.
Please assume that the function does not exist.
As a matter of fact,i have some ada file and my job is to convert them to C files.Since i am a beginner in ada,i faced a lot of issues with that.Please pardon in case of any confusion
Thanks
Maddy
Storage_Offset is a special integeral type in package System.Storage_Elements that can be added to objects of type System.Address. What exactly the units of Address and Storage_Offset are is implementation defined, but probably just about every implementation in existence uses bytes. So Next_Address + Storage_Offset'(4) means "the address four bytes past whatever Next_Address refers to."
You talked a bit about Ada porting. In 99% of cases, that is a very stupid idea (the %1 being when you need to port to a platform that has no Ada compiler). I'd say the same thing no matter what language you are porting. It's a fool's game. The best outcome you can hope for when porting code is that after a ton of effort it works as well as it did before. With coding the best case never happens.
Ada can interface to C just fine, so it would be far smarter to keep unchanged code in Ada and only "port" stuff you need to change.
If you come across any tasking code, protected types, or custom streams you will be in a world of hurt. Those things don't really have easy C analogs.
If your bosses really have a boner for C or something, I'd suggest looking into Sofcheck's AdaMagic, which provides a service to transform Ada code to ANSI C. Back in the day (two owners prior) they used to claim that it produced maintainable C code. Either way, it will probably be far cheaper than having an inexperienced (in Ada) developer try to do it all by hand.
my_func(int P1,int P2)
{
return P2 + Storage_Offset'(4);
}
Well, this is a C function whose body is written in Ada. There is no "+" operator taking an Integer and a Storage_Offset; perhaps you're looking for
function "+"(Left : Address; Right : Storage_Offset)
return Address;
and perhaps you meant to call my_func(something, Next_Address)?
In that case, the expression will return the address of whatever is 4 Storage_Elements, ie bytes, after Myarr('A').
Myarr_Type is an array of Character, which with any normal compiler on any common architecture is going to be a standard 8-bit byte. So Myarr_Type objects will be 11 bytes long, not 44, and Myarr('A')'Address + 4 will be the address of Myarr('E').
If you want the address of the last element of Myarr, try
Myarr (Myarr'Last)'Address