This question already has answers here:
Pointers vs. values in parameters and return values
(5 answers)
Closed 7 years ago.
I'm new to the Go Language, and have only minimal background in C/C++, so naturally I'm struggling with the idea of when to use pointers and when not to use pointers. Although this question might be considered open-ended, I'm wondering what some guidelines on when to return structs and when to return pointers, (and equivalently when to accept structs / pointers as arguments).
From what I can guess, the following statements hold true:
structs are passed into functions by value. That is, a copy of a structure is made when passing it into a function.
if I want to pass a structure by reference, then I would instead use a pointer argument in the function definition, and use the addressof operator when calling the function.
The reason why I would want to pass in a structure by reference is because either the structure I'm passing in is large, and it would be taxing on memory to pass it by value (unlikely) or if I want to make changes to the copy that I'm passing in (more likely).
As a corollary to 3.), I should pass by value unless I have one of the reasons above to pass by reference.
Are my assumptions correct? Or am I missing the mark on pointers?
Your assumptions are correct. About #3, Go is concurrent language and passing by reference in goroutines make them all read same structure which is safe, but also make them modify same structure which is dangerous.
Related
Coming from Wolfram Mathematica, I like the idea that whenever I pass a variable to a function I am effectively creating a copy of that variable. On the other hand, I am learning that in Julia there are the notions of mutable and immutable types, with the former passed by reference and the latter passed by value. Can somebody explain me the advantage of such a distinction? why arrays are passed by reference? Naively I see this as a bad aspect, since it creates side effects and ruins the possibility to write purely functional code. Where I am wrong in my reasoning? is there a way to make immutable an array, such that when it is passed to a function it is effectively passed by value?
here an example of code
#x is an in INT and so is immutable: it is passed by value
x = 10
function change_value(x)
x = 17
end
change_value(x)
println(x)
#arrays are mutable: they are passed by reference
arr = [1, 2, 3]
function change_array!(A)
A[1] = 20
end
change_array!(arr)
println(arr)
which indeed modifies the array arr
There is a fair bit to respond to here.
First, Julia does not pass-by-reference or pass-by-value. Rather it employs a paradigm known as pass-by-sharing. Quoting the docs:
Function arguments themselves act as new variable bindings (new
locations that can refer to values), but the values they refer to are
identical to the passed values.
Second, you appear to be asking why Julia does not copy arrays when passing them into functions. This is a simple one to answer: Performance. Julia is a performance oriented language. Making a copy every time you pass an array into a function is bad for performance. Every copy operation takes time.
This has some interesting side-effects. For example, you'll notice that a lot of the mature Julia packages (as well as the Base code) consists of many short functions. This code structure is a direct consequence of near-zero overhead to function calls. Languages like Mathematica and MatLab on the other hand tend towards long functions. I have no desire to start a flame war here, so I'll merely state that personally I prefer the Julia style of many short functions.
Third, you are wondering about the potential negative implications of pass-by-sharing. In theory you are correct that this can result in problems when users are unsure whether a function will modify its inputs. There were long discussions about this in the early days of the language, and based on your question, you appear to have worked out that the convention is that functions that modify their arguments have a trailing ! in the function name. Interestingly, this standard is not compulsory so yes, it is in theory possible to end up with a wild-west type scenario where users live in a constant state of uncertainty. In practice this has never been a problem (to my knowledge). The convention of using ! is enforced in Base Julia, and in fact I have never encountered a package that does not adhere to this convention. In summary, yes, it is possible to run into issues when pass-by-sharing, but in practice it has never been a problem, and the performance benefits far outweigh the cost.
Fourth (and finally), you ask whether there is a way to make an array immutable. First things first, I would strongly recommend against hacks to attempt to make native arrays immutable. For example, you could attempt to disable the setindex! function for arrays... but please don't do this. It will break so many things.
As was mentioned in the comments on the question, you could use StaticArrays. However, as Simeon notes in the comments on this answer, there are performance penalties for using static arrays for really big datasets. More than 100 elements and you can run into compilation issues. The main benefit of static arrays really is the optimizations that can be implemented for smaller static arrays.
Another package-based options suggested by phipsgabler in the comments below is FunctionalCollections. This appears to do what you want, although it looks to be only sporadically maintained. Of course, that isn't always a bad thing.
A simpler approach is just to copy arrays in your own code whenever you want to implement pass-by-value. For example:
f!(copy(x))
Just be sure you understand the difference between copy and deepcopy, and when you may need to use the latter. If you're only working with arrays of numbers, you'll never need the latter, and in fact using it will probably drastically slow down your code.
If you wanted to do a bit of work then you could also build your own array type in the spirit of static arrays, but without all the bells and whistles that static arrays entails. For example:
struct MyImmutableArray{T,N}
x::Array{T,N}
end
Base.getindex(y::MyImmutableArray, inds...) = getindex(y.x, inds...)
and similarly you could add any other functions you wanted to this type, while excluding functions like setindex!.
I know that from here:
Julia function arguments follow a convention sometimes called "pass-by-sharing", which means that values are not copied when they are passed to functions. Function arguments themselves act as new variable bindings (new locations that can refer to values), but the values they refer to are identical to the passed values. Modifications to mutable values (such as Arrays) made within a function will be visible to the caller. This is the same behavior found in Scheme, most Lisps, Python, Ruby and Perl, among other dynamic languages.
Given this, it's clear to me that to pass by reference, all you need to do is have a mutable type that you pass into a function and edit.
My question then becomes, how can I clearly distinguish between pass by value and pass by reference? Does anyone have an example that shows a function being called twice; once with pass by reference, and once with pass by value?
I saw this post which alludes to some similar ideas, but it did not fully answer my question.
In Julia, functions always have pass-by-sharing argument-passing behavior:
https://docs.julialang.org/en/v1/manual/functions/
This argument-passing convention is also used in most general purpose dynamic programming languages, including various Lisps, Python, Perl and Ruby. A good and useful description can be found here:
https://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_sharing
In short, pass-by-sharing works like pass-by-reference but you cannot change which value a binding in the calling scope refers to by reassigning to an argument in the function being called—if you reassign an argument, the binding in the caller is unchanged. This means that in general you cannot use functions to change bindings, such as for example to swap to variables. (Macros can, however, modify bindings in the caller.) In particular, if a variable in the caller refers to an immutable value like an integer or a floating-point number, its value cannot be changed by a function call since which object the variable refers to cannot be changed by a function call and the value itself cannot be modified as it is immutable.
If you want to have something like R or Matlab pass by value behavior, you need to explicitly create a copy of the argument before modifying it. This is precisely what R and Matlab do when an argument is passed in a modified and an external reference to the argument remains. In Julia it must be done explicitly by the programmer rather than being done automatically by the system. A downside is that the system can sometimes know that no copy is required (no external references remain) when the programmer cannot generally know this. That ability, however, is deeply tied with the reference counting garbage collections technique, which is not used by Julia due to performance considerations.
By convention, functions which mutate the contents of an argument have a ! postfix (e.g., sort v/s sort!).
The Go docs say (emphasis added):
Programs using times should typically store and pass them as values, not pointers. That is, time variables and struct fields should be of type time.Time, not *time.Time. A Time value can be used by multiple goroutines simultaneously.
Is the last sentence (about using a Time value in multiple goroutines simultaneously) the only reason that they should "typically" be stored and passed as a value, rather than a pointer? Is this common to other structs as well? I tried looking for any logic that specifically enables this in the time.Time declaration and methods, but didn't notice anything special there.
Update: I often have to serve JSON representations of my structs, and I'd rather omit empty/uninitialized times. The json:",omitempty" tag doesn't work with time.Time values, which appears to be the expected behavior, but the best workaround seems to be to use a pointer, which goes against the advice in the docs quoted above.
It's common for many kind of simple values.
In Go, when some value isn't bigger than one or two words, it's common to simply use it as a value instead of using a pointer. Simply because there's no reason to use a pointer if the object is small and you don't pass it to be changed.
You might have to unlearn the practice of languages where everything structured couldn't be handled as values. It's probably natural for you to use integers or floating point numbers as values, not pointers. Why not do the same for times ?
Regarding your precise problem with JSON and assuming you don't want to write a specific Marshaller just for this, there's no problem in using a *time.Time. In fact this issue was already mentioned in the golang-nuts list.
Is it unwise to return a var bound using let?
(let [pipeline (Channels/pipeline)]
(.addLast pipeline "codec" (HttpClientCodec.))
;; several more lines like this
pipeline)
Is the binding here just about the lexical scope (as opposed to def) and not unsafe to pass around?
Update
In writing this question I realised the above was ugly. And if something is ugly in Clojure you are probably doing it wrong.
I think this is probably the more idiomatic way of handling the above (which makes the question moot, btw, but still handy knowledge).
(doto (Channels/pipeline)
(.addLast "codec" (HttpClientCodec.)))
let is purely lexically scoped and doesn't create a var. The locals created by let (or loop) behave exactly like function arguments. So yeah, it's safe to use as many let/loop-defined locals as you like, close over them, etc. Returning a local from the function simply returns its value, not the internal representation (which is actually on the stack, unless closed over). let/loop bindings are therefore also reentrancy/thread-safe.
By the way, for your specific code example with lots of java calls, you may want to consider using doto instead or additionally. http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/doto
Do you know if are there pointers in Haskell?
If yes: how do you use them? Are there any problems with them? And why aren't they popular?
If no: is there any reason for it?
Yes there are. Take a look at Foreign.Ptr or Data.IORef
I suspect this wasn't what you are asking for though. As Haskell is for the most part without state, it means pointers don't fit into the language design. Having a pointer to memory outside the function would mean that a function is no longer pure and only allowing pointers to values within the current function is useless.
Haskell does provide pointers, via the foreign function interface extension. Look at, for example, Foreign.Storable.
Pointers are used for interoperating with C code. Not for every day Haskell programming.
If you're looking for references -- pointers to objects you wish to mutate -- there are STRef and IORef, which serve many of the same uses as pointers. However, you should rarely -- if ever -- need Refs.
If you simply wish to avoid copying large values, as sepp2k supposes, then you need do nothing: in most implementation, all non-trivial values are allocated separately on a heap and refer to one another by machine-level addresses (i.e. pointers). But again, you need do nothing about any of this, it is taken care of for you.
To answer your question about how values are passed, they are passed in whatever way the implementation sees fit: since you can't mutate the values anyway, it doesn't impact the meaning of the code (as long as the strictness is respected); usually this works out to by-need unless you're passing in e.g. Int values that the compiler can see have already been evaluated...
Pass-by-need is like pass-by-reference, except that any given reference could refer either to an actual evaluated value (which cannot be changed), or to a "thunk" for a not-yet-evaluated value. Wikipedia has more.