I am a bit confused about how to transfer ownership without the overhead of an actual data copy.
I have the following code. I am referring to the underlying data copy by the OS as memcopy.
fn main() {
    let v1 = Vec::from([1; 1024]);
    take_ownership_but_memcopies(v1);

    let v2 = Vec::from([2; 1024]);
    dont_memcopy_but_dont_take_ownership(&v2);

    let v3 = Vec::from([3; 1024]);
    take_ownership_dont_memcopy(???);
}

// Moves but memcopies all elements
fn take_ownership_but_memcopies(my_vec1: Vec<i32>) {
    println!("{:?}", my_vec1);
}

// Doesn't memcopy but doesn't take ownership
fn dont_memcopy_but_dont_take_ownership(my_vec2: &Vec<i32>) {
    println!("{:?}", my_vec2);
}

// Take ownership without the overhead of memcopy
fn take_ownership_dont_memcopy(my_vec3: ???) {
    println!("{:?}", my_vec3);
}
As I understand it, if I use a reference like with v2, I don't get ownership. If I pass it like v1, there could be a memcopy.
How should I transfer v3 to guarantee that there is no underlying memcopy by the OS?
Your understanding of what happens when you move a Vec is incorrect - it does not copy every element within the Vec!
To understand why, we need to take a step back and look at how a Vec is represented internally:
// This is slightly simplified, look at the source for more details!
struct Vec<T> {
    pointer: *mut T,  // pointer to the data (on the heap)
    capacity: usize,  // the current capacity of the Vec
    len: usize,       // the current number of elements in the Vec
}
While the Vec conceptually 'owns' the elements, they are not stored within the Vec struct - it only holds a pointer to that data. So when you move a Vec, it is only the pointer (plus the capacity and length) that gets copied.
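You can observe this directly: the heap address of the elements is unchanged after a move. A minimal sketch (the printed address itself is of course run-dependent):

```rust
fn main() {
    let v1 = vec![1; 1024];
    let data_addr_before = v1.as_ptr();

    let v2 = v1; // move: only the (pointer, capacity, length) triple is copied

    // The elements themselves never moved; v2 points at the same heap block.
    assert_eq!(data_addr_before, v2.as_ptr());
    println!("heap data stayed in place at {:p}", v2.as_ptr());
}
```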
If you are attempting to avoid copying altogether, as opposed to avoiding copying the contents of the Vec, that isn't really possible - in the semantics of the compiler, a move is a copy (just one that prevents you from using the old data afterwards). However, the compiler can and will optimize trivial copies into something more efficient.
How should i need to transfer v3 to guarantee that there is no underlying memcopy by OS?
You can't, because those are Rust's semantics.
However, a Vec is just three words on the stack, and that is all that gets "memcopy"d; you're not going to get an actual memcpy function call in there, let alone a duplication of the entire vector. And that's assuming the call doesn't get inlined, and the compiler doesn't decide to pass the object by reference anyway. It could also pass all three words through registers, at which point there's nothing to memcpy.
Though it's not entirely clear why you care either way; if you only want to read from the collection, your function should be
// Take ownership without the overhead of memcopy
fn take_ownership_dont_memcopy(my_vec3: &[i32]) {
    println!("{:?}", my_vec3);
}
that is the most efficient and flexible signature: it's just two words, there's a single level of indirection (unlike with &Vec), and it allows for non-Vec sources.
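To illustrate the flexibility, the same &[i32] parameter accepts a Vec, a plain array, or a sub-slice (hypothetical caller code, not from the question):

```rust
// A sketch of the suggested signature: borrows any contiguous i32 data.
fn take_slice(my_vec3: &[i32]) {
    println!("{:?}", my_vec3);
}

fn main() {
    let v3 = vec![3; 4];
    take_slice(&v3);      // &Vec<i32> coerces to &[i32]
    take_slice(&[7, 8]);  // a plain array works too
    take_slice(&v3[..2]); // so does a sub-slice
}
```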
My erroneous code snippet and compiler error info:
// code snippet 1:
0 fn main() {
1 let mut x: Box<i32> = Box::new(4);
2 let r: &Box<i32> = &x;
3 *x = 8;
4 println!("{}", r);
5 }
// compiler error info:
error[E0506]: cannot assign to `*x` because it is borrowed
--> src/main.rs:3:4
|
2 | let r = &x;
| -- borrow of `*x` occurs here
3 | *x = 8;
| ^^^^^^ assignment to borrowed `*x` occurs here
4 | println!("{}", r);
| - borrow later used here
For more information about this error, try `rustc --explain E0506`.
The following code won't compile, which makes perfect sense to me, because we cannot invalidate the reference r.
// code snippet 2:
0 fn main() {
1 let mut x: i32 = 0;
2 let r: &i32 = &x;
3 x = 1;
4 println!("{}", r);
5 }
But the compiler error info for code snippet 1 doesn't make much sense to me.
x is a pointer on the stack pointing to a heap memory segment whose content is 4. The reference r only borrows x (the pointer, not the heap memory segment), and in line 3, *x = 8;, what we do is alter the memory on the heap (not the pointer on the stack). The change happens on the heap, while the reference is only relevant to the stack; they do not interrelate.
This question may look like nitpicking, but I do not mean to argue for the sake of argument.
If you find my question irregular, feel free to point it out :)
The change happens on the heap, while the reference is only relevant to the stack; they do not interrelate.
That does not matter, because the type system doesn't work with that "depth" of information.
As far as it's concerned, borrowing x is borrowing the entirety of x up to any depth, and so any change anywhere inside x is forbidden.
For type-checking purposes, this is no different than if x were a Box<Vec<_>> and r were being actively used for iteration, where any update to the inner vector could invalidate the iterator.
(Also, type-wise, *x = 8 does require first taking a unique reference to the box itself, before "upgrading" it to a unique reference to the box's content, as you can see from the trait implementation.)
Rust's entire borrowing model enforces one simple requirement: the contents of a memory location can only be mutated if there is only one pointer through which that location can be accessed.
In your case, the heap location that you're trying to mutate can be accessed both through x and through r—and therefore mutation is denied.
This model enables the compiler to perform aggressive optimisations that permit, for example, the storage of values reachable through either alias in registers and/or caches without needing to fetch again from memory when the value is read.
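In practice the fix is simply to end the shared borrow before mutating. Under non-lexical lifetimes, the borrow ends at the last use of r, so reordering the original snippet compiles (a sketch, not the asker's code):

```rust
fn main() {
    let mut x: Box<i32> = Box::new(4);

    let r: &Box<i32> = &x;
    println!("{}", r); // last use of `r`: the shared borrow ends here

    *x = 8; // now x is the only path to the heap value, so mutation is allowed
    println!("{}", x);
}
```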
The semantics of * are determined by two traits:
pub trait Deref {
    type Target: ?Sized;
    fn deref(&self) -> &Self::Target;
}
or
pub trait DerefMut: Deref {
    fn deref_mut(&mut self) -> &mut Self::Target;
}
In your case, when you write *x = 8, the Rust compiler expands the expression into the call DerefMut::deref_mut(&mut x), because Box<T> implements Deref<Target=T> and DerefMut. That is why the line *x = 8 performs a mutable borrow of x, and by the borrowing rules that is not allowed, because we've already borrowed x in let r: &Box<i32> = &x;.
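Desugared, the assignment looks like this (a sketch; the compiler performs this expansion implicitly):

```rust
use std::ops::DerefMut;

fn main() {
    let mut x: Box<i32> = Box::new(4);

    // `*x = 8;` is equivalent to the explicit call below,
    // which visibly takes `&mut x` first:
    *DerefMut::deref_mut(&mut x) = 8;

    assert_eq!(*x, 8);
}
```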
I just found a great diagram from Programming Rust (Version 2), which really answers my question quite well:
In the case of my question, when x is shared-referenced by r, everything in the ownership tree of x (the stack pointer and the heap memory segment) becomes read-only.
I know that the Stack Overflow community does not like pictures, but this diagram is really great and may help someone who finds this question in the future :)
I've written a wrapper for a camera library in Rust that commands and operates a camera, and also saves an image to file using bindgen. Once I command an exposure to start (basically telling the camera to take an image), I can grab the image using a function of the form:
pub fn GetQHYCCDSingleFrame(
    handle: *mut qhyccd_handle,
    w: *mut u32,
    ...,
    imgdata: &mut [u8],
) -> u32 // (u32 is a retval)
In C++, this function was:
uint32_t STDCALL GetQHYCCDSingleFrame(qhyccd_handle *handle, ..., uint8_t *imgdata)
In C++, I could pass in a buffer of the form imgdata = new unsigned char[length_buffer] and the function would fill the buffer with image data from the camera.
In Rust, similarly, I can pass in a buffer in the form of a Vec: let mut buffer: Vec<u8> = Vec::with_capacity(length_buffer).
Currently, the way I have structured the code is that there is a main struct, with settings such as the width and height of image, the camera handle, and others, including the image buffer. The struct has been initialized as a mut as:
let mut main_settings = MainSettings {
    width: 9600,
    ...,
    buffer: Vec::with_capacity(length_buffer),
};
There is a separate function I wrote that takes the main struct as a parameter and calls the GetQHYCCDSingleFrame function:
fn grab_image(main_settings: &mut MainSettings) {
    let retval = unsafe { GetQHYCCDSingleFrame(main_settings.cam_handle, ..., &mut main_settings.image_buffer) };
}
Immediately after calling this function, if I check the length and capacity of main_settings.image_buffer:
println!("Elements in buffer are {}, capacity of buffer is {}.", main_settings.image_buffer.len(), main_settings.image_buffer.capacity());
I get 0 for the length, and length_buffer as the capacity. Similarly, printing any index such as main_settings.image_buffer[0] or [1] leads to a panic saying the len is 0.
This would make me think that the GetQHYCCDSingleFrame code is not working properly, however, when I save the image_buffer to file using fitsio and hdu.write_region (fitsio docs linked here), I use:
let ranges = [&(x_start..(x_start + roi_width)), &(y_start..(y_start+roi_height))];
hdu.write_region(&mut fits_file, &ranges, &main_settings.image_buffer).expect("Could not write to fits file");
This saves an actual image to file with the right size, and it is a perfectly fine image (exactly what it would look like if I took it using the C++ program). However, when I try to print the buffer, it is for some reason empty, yet the hdu.write_region code is able to access the data somehow.
Currently, my (not good) workaround is to create another vector that reads data from the saved file and saves to a buffer, which then has the right number of elements:
main_settings.new_buffer = hdu.read_region(&mut fits_file, &ranges).expect("Couldn't read fits file");
Why can I not access the original buffer at all, and why does it report a length of 0, when the hdu.write_region function can access the data from somewhere? Where exactly is it accessing the data from, and how can I correctly access it as well? I am a bit new to borrowing and referencing, so I believe I might be doing something wrong in borrowing/referencing the buffer, or is it something else?
Sorry for the long story, but the details would probably be important for everything here. Thanks!
Well, first of all, you need to know that Vec<u8> and &mut [u8] are not quite the same as C or C++'s uint8_t *. The main difference is that Vec<u8> and &mut [u8] store the size of the array or slice within themselves, while uint8_t * doesn't. The Rust equivalent of a C/C++ pointer is a raw pointer, like *mut u8. Raw pointers are safe to build, but require unsafe to be used. However, even though they are different types, a slice reference like &mut [u8] can be cast to a raw pointer without issue.
Secondly, the capacity of a Vec is different from its length. To get good performance, a Vec allocates more memory than you currently use, to avoid reallocating on each new element added to the vector. The length, however, is the size of the used part. In your case, you ask the Vec to allocate a heap buffer of size length_buffer, but you never tell it to consider any of that allocated space as used, so the initial length is 0. Since the C++ side doesn't know about Vec and only uses a raw pointer, it can't update the length stored inside the Vec, which stays at 0. Thus the panicking.
To resolve it, I see multiple solutions:
Changing the Vec::with_capacity(length_buffer) into vec![0; length_buffer], explicitly asking for a length of length_buffer from the start
Using unsafe code to explicitly set the length of the Vec without touching its contents (using Vec::set_len). This might be faster than the first solution, but I'm not sure.
Using a Box<[u8]>, which is like a Vec but without reallocation, and whose length is its capacity
If length_buffer is constant at compile time, using a [u8; length_buffer] would be more efficient still, as no heap allocation is needed, but it comes with downsides, as you probably know
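The capacity/length distinction is easy to demonstrate. A minimal sketch, where a raw-pointer write stands in for the C function filling the buffer (set_len is sound here only because those bytes were written first):

```rust
fn main() {
    let n = 16;

    // with_capacity reserves space but marks none of it as used:
    let empty: Vec<u8> = Vec::with_capacity(n);
    assert_eq!(empty.len(), 0);
    assert!(empty.capacity() >= n);

    // vec![0; n] allocates AND initializes, so len == n from the start:
    let zeroed = vec![0u8; n];
    assert_eq!(zeroed.len(), n);

    // The unsafe route: write through the raw pointer (standing in for
    // the FFI call), then fix up the length.
    let mut buf: Vec<u8> = Vec::with_capacity(n);
    unsafe {
        std::ptr::write_bytes(buf.as_mut_ptr(), 0xAB, n); // "C" wrote n bytes
        buf.set_len(n); // tell the Vec those n bytes are now used
    }
    assert_eq!(buf.len(), n);
    assert_eq!(buf[0], 0xAB);
}
```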
I have a code block that queries AD, retrieves the results, and writes them to a channel.
func GetFromAD(connect *ldap.Conn, ADBaseDN, ADFilter string, ADAttribute []string, ADPage uint32) *[]ADElement {
    searchRequest := ldap.NewSearchRequest(ADBaseDN, ldap.ScopeWholeSubtree, ldap.NeverDerefAliases, 0, 0, false, ADFilter, ADAttribute, nil)
    sr, err := connect.SearchWithPaging(searchRequest, ADPage)
    CheckForError(err)
    fmt.Println(len(sr.Entries))

    ADElements := []ADElement{}
    for _, entry := range sr.Entries {
        NewADEntity := new(ADElement) // struct
        NewADEntity.DN = entry.DN
        for _, attrib := range entry.Attributes {
            NewADEntity.attributes = append(NewADEntity.attributes, keyvalue{attrib.Name: attrib.Values})
        }
        ADElements = append(ADElements, *NewADEntity)
    }
    return &ADElements
}
The above function returns a pointer to a []ADElement.
And in my initialrun function, I call this function like
ADElements := GetFromAD(connectAD, ADBaseDN, ADFilter, ADAttribute, uint32(ADPage))
fmt.Println(reflect.TypeOf(ADElements))
ADElementsChan <- ADElements
And the output says
*[]somemodules.ADElement
as the output of reflect.TypeOf.
My doubt here is,
since ADElements := []ADElement{} defined in GetFromAD() is a local variable, it must be allocated on the stack, and when GetFromAD() exits, the contents of the stack must be destroyed, so further references to the returned value should be pointing at invalid memory. Yet I still get the exact number of elements returned by GetFromAD(), without any segfault. How is this working? Is it safe to do it this way?
Yes, it is safe, because the Go compiler performs escape analysis and allocates such variables on the heap.
Check out FAQ - How do I know whether a variable is allocated on the heap or the stack?
The storage location does have an effect on writing efficient programs. When possible, the Go compilers will allocate variables that are local to a function in that function's stack frame. However, if the compiler cannot prove that the variable is not referenced after the function returns, then the compiler must allocate the variable on the garbage-collected heap to avoid dangling pointer errors. Also, if a local variable is very large, it might make more sense to store it on the heap rather than the stack.
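A minimal sketch of the same situation: the local slice escapes via the returned pointer, so the compiler moves it to the heap (you can confirm this with go build -gcflags=-m, which reports "moved to heap"). The names here are illustrative, not the asker's:

```go
package main

import "fmt"

// The slice header is a local variable, but returning its address
// makes it escape, so it is allocated on the garbage-collected heap.
func makeElements() *[]int {
	elements := []int{1, 2, 3}
	return &elements
}

func main() {
	p := makeElements()
	fmt.Println(len(*p)) // the data is still valid after makeElements returns
}
```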
Define "safe"...
You will not end up freeing the memory of ADElements, since there's at least one live reference to it.
In this case, you should be completely safe, since you only pass the slice once and then don't seem to modify it. In the general case, though, it might be better to pass it element by element across a chan ADElement, to avoid multiple unsynchronized accesses to the slice (or, more specifically, to the array backing the slice).
This also holds for maps, where you can get curious problems if you pass a map over a channel, then continue to access it.
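The element-by-element alternative looks like this (a sketch with an illustrative ADElement; the field and names are assumptions, not the asker's real types):

```go
package main

import "fmt"

type ADElement struct {
	DN string
}

func main() {
	results := []ADElement{{DN: "cn=a"}, {DN: "cn=b"}, {DN: "cn=c"}}

	ch := make(chan ADElement)
	go func() {
		for _, e := range results {
			ch <- e // each element is copied into the channel
		}
		close(ch) // lets the receiver's range loop terminate
	}()

	// The receiver never touches the slice, only its own copies.
	for e := range ch {
		fmt.Println(e.DN)
	}
}
```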
I'm attempting to write Rust bindings for a C collection library (Judy Arrays [1]) which only provides room to store a pointer-width value. My company has a fair amount of existing code which uses this space to directly store non-pointer values such as pointer-width integers and small structs. I'd like my Rust bindings to allow type-safe access to such collections using generics, but I'm having trouble getting the pointer-stashing semantics working correctly.
I have a basic interface working using std::mem::transmute_copy() to store the value, but that function explicitly does nothing to ensure the source and destination types are the same size. I'm able to verify that the collection's type parameter is of a compatible size at run-time via an assertion, but I'd really like the check to somehow happen at compile time.
Example code:
pub struct Example<T> {
    v: usize,
    t: PhantomData<T>,
}

impl<T> Example<T> {
    pub fn new() -> Example<T> {
        assert!(mem::size_of::<usize>() == mem::size_of::<T>());
        Example { v: 0, t: PhantomData }
    }

    pub fn insert(&mut self, val: T) {
        unsafe {
            self.v = mem::transmute_copy(&val);
            mem::forget(val);
        }
    }
}
Is there a better way to do this, or is this run-time check the best Rust 1.0 supports?
(Related question, explaining why I'm not using mem::transmute().)
[1] I'm aware of the existing rust-judy project, but it doesn't support the pointer-stashing I want, and I'm writing these new bindings largely as a learning exercise anyway.
Note: for Rust 1.57 and newer, see this answer.
Compile-time check?
Is there a better way to do this, or is this run-time check the best Rust 1.0 supports?
In general, there are some hacky solutions to do some kind of compile time testing of arbitrary conditions. For example, there is the static_assertions crate which offers some useful macros (including one macro similar to C++'s static_assert). However, this is hacky and very limited.
In your particular situation, I haven't found a way to perform the check at compile time. The root problem here is that you can't use mem::size_of or mem::transmute on a generic type. Related issues: #43408 and #47966. For this reason, the static_assertions crate doesn't work either.
If you think about it, this would also allow a kind of error very unfamiliar to Rust programmers: an error when instantiating a generic function with a specific type. This is well known to C++ programmers; Rust's trait bounds exist to avoid those often very bad and unhelpful error messages. In the Rust world, one would instead specify the requirement as a trait bound: something like where size_of::<T>() == size_of::<usize>().
However, this is currently not possible. There once was a fairly famous "const-dependent type system" RFC which would have allowed these kinds of bounds, but it was rejected for now. Support for these kinds of features is slowly but steadily progressing. "Miri" was merged into the compiler some time ago, allowing much more powerful constant evaluation. This is an enabler for many things, including the "Const Generics" RFC, which was actually merged. It is not yet implemented, but it is expected to land in 2018 or 2019.
Unfortunately, it still doesn't enable the kind of bound you need: comparing two const expressions for equality was purposefully left out of the main RFC, to be resolved in a future RFC.
So it is to be expected that a bound similar to where size_of::<T> == size_of::<usize>() will eventually be possible. But this shouldn't be expected in the near future!
Workaround
In your situation, I would probably introduce an unsafe trait AsBigAsUsize. To implement it, you could write a macro impl_as_big_as_usize which performs a size check and implements the trait. Maybe something like this:
unsafe trait AsBigAsUsize: Sized {
    const _DUMMY: [(); 0];
}

macro_rules! impl_as_big_as_usize {
    ($type:ty) => {
        unsafe impl AsBigAsUsize for $type {
            // The array length is 1 (a type error against `[(); 0]`)
            // exactly when the sizes differ.
            const _DUMMY: [(); 0] =
                [(); (mem::size_of::<$type>() != mem::size_of::<usize>()) as usize];
            // We should probably also check the alignment!
        }
    }
}
This uses basically the same trickery as static_assertions is using. This works, because we never use size_of on a generic type, but only on concrete types of the macro invocation.
So... this is obviously far from perfect. The user of your library has to invoke impl_as_big_as_usize once for every type they want to use in your data structure. But at least it's safe: as long as programmers only use the macro to impl the trait, the trait is in fact only implemented for types that have the same size as usize. Also, the error "trait bound AsBigAsUsize is not satisfied" is very understandable.
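Usage would look like the sketch below (assuming the != form of the dummy-array check, so the array length is nonzero exactly when the sizes differ; the store function is illustrative):

```rust
use std::mem;

unsafe trait AsBigAsUsize: Sized {
    const _DUMMY: [(); 0];
}

macro_rules! impl_as_big_as_usize {
    ($type:ty) => {
        unsafe impl AsBigAsUsize for $type {
            // Length is 1 (a type error against [(); 0]) iff the sizes differ.
            const _DUMMY: [(); 0] =
                [(); (mem::size_of::<$type>() != mem::size_of::<usize>()) as usize];
        }
    };
}

impl_as_big_as_usize!(usize);
impl_as_big_as_usize!(*const u8);
// impl_as_big_as_usize!(u8); // error: expected an array with a size of 0

// The bound now guarantees a size-correct transmute_copy target.
fn store<T: AsBigAsUsize>(_val: T) {
    // ... transmute_copy into the pointer-width slot here ...
}

fn main() {
    store(42usize);
    println!("ok");
}
```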
What about the run-time check?
As bluss said in the comments, with your assert! code there is no run-time check, because the optimizer constant-folds it away. Let's test that statement with this code:
#![feature(asm)]

fn main() {
    foo(3u64);
    foo(true);
}

#[inline(never)]
fn foo<T>(t: T) {
    use std::mem::size_of;
    unsafe { asm!("" : : "r"(&t)) }; // black box
    assert!(size_of::<usize>() == size_of::<T>());
    unsafe { asm!("" : : "r"(&t)) }; // black box
}
The crazy asm!() expressions serve two purposes:
“hiding” t from LLVM, such that LLVM can't perform optimizations we don't want (like removing the whole function)
marking specific spots in the resulting ASM code we'll be looking at
Compile it with a nightly compiler (in a 64 bit environment!):
rustc -O --emit=asm test.rs
As usual, the resulting assembly code is hard to read; here are the important spots (with some cleanup):
_ZN4test4main17he67e990f1745b02cE:      # main()
        subq    $40, %rsp
        callq   _ZN4test3foo17hc593d7aa7187abe3E
        callq   _ZN4test3foo17h40b6a7d0419c9482E
        ud2

_ZN4test3foo17h40b6a7d0419c9482E:       # foo<bool>()
        subq    $40, %rsp
        movb    $1, 39(%rsp)
        leaq    39(%rsp), %rax
        #APP
        #NO_APP
        callq   _ZN3std9panicking11begin_panic17h0914615a412ba184E
        ud2

_ZN4test3foo17hc593d7aa7187abe3E:       # foo<u64>()
        pushq   %rax
        movq    $3, (%rsp)
        leaq    (%rsp), %rax
        #APP
        #NO_APP
        #APP
        #NO_APP
        popq    %rax
        retq
The #APP-#NO_APP pair is our asm!() expression.
The foo<bool> case: you can see that our first asm!() instruction is compiled, then an unconditional call to panic!() is made, and afterwards comes nothing (ud2 just says "the program can never reach this spot, panic!() diverges").
The foo<u64> case: you can see both #APP-#NO_APP pairs (both asm!() expressions) without anything in between.
So yes: the compiler removes the check completely.
It would be better if the compiler simply refused to compile the code, but this way we at least know that there's no run-time overhead.
Since Rust 1.57, compile-time checks have been possible in safe code. As of this writing (Rust 1.67) they can be achieved using an intermediate compile-time constant outside the function. Here is how to do it:
use std::marker::PhantomData;
use std::mem::size_of;

pub struct Example<T> {
    pub v: usize,
    pub t: PhantomData<T>,
}

impl<T> Example<T> {
    const SIZE_OK: () = assert!(size_of::<T>() == size_of::<usize>());

    pub fn new() -> Example<T> {
        let _ = Self::SIZE_OK;
        Example {
            v: 0,
            t: PhantomData,
        }
    }
}
pub struct Good(usize);
pub struct Bad(u8);

fn main() {
    let _e1 = Example::<Good>::new(); // compiles
    //let _e2 = Example::<Bad>::new(); // doesn't compile
}
Playground
Contrary to the accepted answer, you can check at compile-time!
The trick is to insert, when compiling with optimizations, a call to an undefined C function in the dead-code path. You will get a linker error if your assertion would fail.
I'm a bit confused when I see code such as:
bigBox := &BigBox{}
bigBox.BubbleGumsCount = 4 // correct...
bigBox.SmallBox.AnyMagicItem = true // also correct
Why, or when, would I want to do bigBox := &BigBox{} instead of bigBox := BigBox{}? Is it more efficient in some way?
Code sample was taken from here.
Sample no.2:
package main

import "fmt"

type Ints struct {
    x int
    y int
}

func build_struct() Ints {
    return Ints{0, 0}
}

func build_pstruct() *Ints {
    return &Ints{0, 0}
}

func main() {
    fmt.Println(build_struct())
    fmt.Println(build_pstruct())
}
Sample no. 3 (why would I go with &BigBox in this example, and not with BigBox as a struct directly?):
func main() {
    bigBox := &BigBox{}
    bigBox.BubbleGumsCount = 4

    fmt.Println(bigBox.BubbleGumsCount)
}
Is there ever a reason to call build_pstruct instead of the build_struct variant? Isn't that why we have the GC?
I figured out one motivation for this kind of code: avoidance of "struct copying by accident".
If you use a struct variable to hold the newly created struct:
bigBox := BigBox{}
you may copy the struct by accident like this
myBox := bigBox // where you just wanted a reference to bigBox
myBox.BubbleGumsCount = 4
or like this
changeBoxColorToRed(bigBox)
where changeBoxColorToRed is
// It makes a copy of the entire struct as the parameter.
func changeBoxColorToRed(box BigBox) {
    // !!!! This function is buggy. It won't work as expected !!!
    // Please see the fix at the end.
    box.Color = red
}
But if you use a struct pointer:
bigBox := &BigBox{}
there will be no copying in
myBox := bigBox
and
changeBoxColorToRed(bigBox)
will fail to compile, giving you a chance to rethink the design of changeBoxColorToRed. The fix is obvious:
func changeBoxColorToRed(box *BigBox) {
    box.Color = red
}
The new version of changeBoxColorToRed does not copy the entire struct and works correctly.
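A runnable sketch of the difference, with an illustrative BigBox type (the field and function names are assumptions):

```go
package main

import "fmt"

type BigBox struct {
	Color string
}

func paintByValue(box BigBox) {
	box.Color = "red" // mutates the copy only
}

func paintByPointer(box *BigBox) {
	box.Color = "red" // mutates the caller's struct
}

func main() {
	b := BigBox{Color: "blue"}

	paintByValue(b)
	fmt.Println(b.Color) // still "blue": only the copy changed

	paintByPointer(&b)
	fmt.Println(b.Color) // now "red"
}
```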
bb := &BigBox{} creates a struct, but sets the variable to be a pointer to it; it's the same as bb := new(BigBox). On the other hand, bb := BigBox{} makes bb a variable of type BigBox directly. If you want a pointer (perhaps because you're going to use the data via a pointer), then it's better to make bb a pointer, otherwise you'll be writing &bb a lot. If you're going to use the data as a struct directly, then you want bb to be a struct, otherwise you'll be dereferencing with *bb.
It's off the point of the question, but it's usually better to create data in one go, rather than incrementally by creating the object and subsequently updating it.
bb := &BigBox{
    BubbleGumsCount: 4,
    SmallBox: SmallBox{
        AnyMagicItem: true,
    },
}
The & takes an address of something. So it means "I want a pointer to" rather than "I want an instance of". The size of a variable containing a value depends on the size of the value, which could be large or small. The size of a variable containing a pointer is 8 bytes.
Here are examples and their meanings:
bigBox0 := &BigBox{} // bigBox0 is a pointer to a new instance of BigBox
bigBox1 := BigBox{}  // bigBox1 contains an instance of BigBox directly
bigBox2 := bigBox1   // bigBox2 is a copy of bigBox1
bigBox3 := &bigBox1  // bigBox3 is a pointer to bigBox1
bigBox4 := *bigBox3  // bigBox4 is a copy of bigBox1, dereferenced from bigBox3 (a pointer)
Why would you want a pointer?
To prevent copying a large object when passing it as an argument to a function.
You want to modify the value by passing it as an argument.
To keep a slice, backed by an array, small. [10]BigBox takes up the size of BigBox times 10 bytes; [10]*BigBox takes up 8 * 10 bytes. A slice that grows past its capacity has to create a larger backing array, which means the memory of the old array has to be copied to the new one.
Why would you not want to use a pointer?
If an object is small, it's better just to make a copy. Especially if it's <= 8 bytes.
Using pointers can create garbage, and this garbage has to be collected by the garbage collector. The garbage collector is a mark-and-sweep, stop-the-world implementation: it has to freeze your application to collect the garbage, and the more garbage there is to collect, the longer that pause lasts. This individual, for example, experienced pauses of up to 10 seconds.
Copying an object uses the stack rather than the heap, and the stack is usually faster than the heap. You don't really have to think about stack vs. heap in Go, since the compiler decides what goes where, but you shouldn't ignore it either. It depends on the compiler implementation, but pointers can force memory onto the heap, resulting in the need for garbage collection.
Direct memory access is faster. If you have a slice []BigBox and it doesn't change size it can be faster to access. []BigBox is faster to read, whereas []*BigBox is faster to resize.
My general advice is use pointers sparingly. Unless you're dealing with a very large object that needs to be passed around, it's often better to pass around a copy on the stack. Reducing garbage is a big deal. The garbage collector will get better, but you're better off by keeping it as low as possible.
As always test your application and profile it.
The difference is between creating a reference object (with the ampersand) vs. a value object (without the ampersand).
There's a nice explanation of the general concept of value vs. reference type passing here... What's the difference between passing by reference vs. passing by value?
There is some discussion of these concepts with regards to Go here... http://www.goinggo.net/2013/07/understanding-pointers-and-memory.html
In general there is no difference between a &BigBox{} and BigBox{}. The Go compiler is free to do whatever it likes as long as the semantics are correct.
package main

import "fmt"

// Foo is a minimal stand-in type so the example is self-contained.
type Foo struct {
    x, y int
}

func StructToStruct() {
    s := Foo{}
    StructFunction(&s)
}

func PointerToStruct() {
    p := &Foo{}
    StructFunction(p)
}

func StructToPointer() {
    s := Foo{}
    PointerFunction(&s)
}

func PointerToPointer() {
    p := &Foo{}
    PointerFunction(p)
}

// passed as a pointer, but used as a struct
func StructFunction(f *Foo) {
    fmt.Println(*f)
}

func PointerFunction(f *Foo) {
    fmt.Println(f)
}
Summary of the assembly:
StructToStruct: 13 lines, no allocation
PointerToStruct: 16 lines, no allocation
StructToPointer: 20 lines, heap allocated
PointerToPointer: 12 lines, heap allocated
With a perfect compiler, the *ToStruct functions would be identical, as would the *ToPointer functions. Go's escape analysis is good enough to tell whether a pointer escapes, even across module boundaries. Whichever way is most efficient is the way the compiler will do it.
If you're really into micro-optimization, note that Go is most efficient when the syntax lines up with the semantics (a struct used as a struct, a pointer used as a pointer). Or you can just forget about it, declare the variable the way it will be used, and you will be right most of the time.
Note: if Foo is really big, PointerToStruct will heap-allocate it. The spec allows even StructToStruct to do this, but I couldn't make it happen. The lesson here is that the compiler will do whatever it wants: just as the details of the registers are shielded from the code, so is the state of the heap/stack. Don't change your code because you think you know how the compiler is going to use the heap.