Weird behaviour with struct constructors - pointers

I've written a basic Node struct in D, designed to be used as a part of a tree-like structure. The code is as follows:
import std.algorithm: min;
alias Number = size_t;
struct Node {
private {
Node* left, right, parent;
Number val;
}
this(Number n) {val = n;}
this(ref Node u, ref Node v) {
this.left = &u;
this.right = &v;
val = min(u.val, v.val);
u.parent = &this;
v.parent = &this;
}
}
Now, I wrote a simple function which is supposed to give me a Node (meaning a whole tree) with the argument array providing the leaves, as follows.
alias Number = size_t;
Node make_tree (Number[] nums) {
if (nums.length == 1) {
return Node(nums[0]);
} else {
Number half = nums.length/2;
return Node(make_tree(nums[0..half]), make_tree(nums[half..$]));
}
}
Now, when I try to run it through dmd, I get the following error message:
Error: constructor Node.this (ulong n) is not callable using argument types (Node, Node)
This makes no sense to me - why is it trying to call a one-argument constructor when given two arguments?

The problem has nothing to do with constructors. It has to do with passing by ref. The constructor that you're trying to use
this(ref Node u, ref Node v) {...}
accepts its arguments by ref. That means that they must be lvalues (i.e. something that can be on the left-hand side of an assignment). But you're passing it the result of a function call which does not return by ref (so, it's returning a temporary, which is an rvalue - something that can go on the right-hand side of an assignment but not the left). So, what you're trying to do is illegal. Now, the error message isn't great, since it's giving an error with regards to the first constructor rather than the second, but regardless, you don't have a constructor which matches what you're trying to do. At the moment, I can think of 3 options:
1. Get rid of the ref on the constructor's parameters. If you're only going to be passing it the result of a function call like you're doing now, having it accept ref doesn't help you anyway. The returned value will be moved into the function's parameter, so no copy will take place, and ref isn't buying you anything. Certainly, assigning the return values to local variables so that you can pass them to the constructor as it's currently written would lose you something, since then you'd be making unnecessary copies.
2. Overload the constructor so that it accepts either ref or non-ref. e.g.
void foo(ref Bar b) { ... }
void foo(Bar b) { foo(b); } //this calls the other foo
In general, this works reasonably well when you have one parameter, but it would be a bit annoying here, because you end up with an exponential explosion of function signatures as you add parameters. So, for your constructor, you'd end up with
this(ref Node u, ref Node v) {...}
this(ref Node u, Node v) { this(u, v); }
this(Node u, ref Node v) { this(u, v); }
this(Node u, Node v) { this(u, v); }
And if you added a 3rd parameter, you'd end up with eight overloads. So, it really doesn't scale beyond a single parameter.
3. Templatize the constructor and use auto ref. This essentially does what #2 does, but you only have to write the function once:
this()(auto ref Node u, auto ref Node v) {...}
This will then generate a copy of the function to match the arguments given (up to 4 different versions of it with the full function body in each rather than 3 of them just forwarding to the 4th one), but you only had to write it once. And in this particular case, it's probably reasonable to templatize the function, since you're dealing with a struct. If Node were a class though, it might not make sense, since templated functions can't be virtual.
So, if you really want to be able to pass by ref, then in this particular case, you should probably go with #3 and templatize the constructor and use auto ref. However, personally, I wouldn't bother. I'd just go with #1. Your usage pattern here wouldn't get anything from auto ref, since you're always passing it two rvalues, and your Node struct isn't exactly huge anyway, so while you obviously wouldn't want to copy it if you don't need to, copying an lvalue to pass it to the constructor probably wouldn't matter much unless you were doing it a lot. But again, you're only going to end up with a copy if you pass it an lvalue, since an rvalue can be moved rather than copied, and you're only passing it rvalues right now (at least with the code shown here). So, unless you're doing something different with that constructor which would involve passing it lvalues, there's no point in worrying about lvalues - or about the Nodes being copied when they're returned from a function and passed into the constructor (since that's a move, not a copy). As such, just removing the refs would be the best choice.

Related

When the formal parameter in Go is a map, what is passed in?

When a function's formal parameter is a map, assigning a new value directly to the parameter does not change the caller's argument, but adding a new key and value through the parameter is visible to the caller outside the function. Why is that?
I don't understand the output of the following code, in which the formal parameter behaves differently from the actual argument.
func main() {
t := map[int]int{
1: 1,
}
fmt.Println(unsafe.Pointer(&t))
copysss(t)
fmt.Println(t)
}
func copysss(m map[int]int) {
//pointer := unsafe.Pointer(&m)
//fmt.Println(pointer)
m = map[int]int{
1: 2,
}
}
stdout :0xc000086010
map[1:1]
func main() {
t := map[int]int{
1: 1,
}
fmt.Println(unsafe.Pointer(&t))
copysss(t)
fmt.Println(t)
}
func copysss(m map[int]int) {
//pointer := unsafe.Pointer(&m)
//fmt.Println(pointer)
m[1] = 2
}
stdout :0xc00007a010
map[1:2]
func main() {
t := map[int]int{
1: 1,
}
fmt.Println(unsafe.Pointer(&t))
copysss(t)
fmt.Println(t)
}
func copysss(m map[int]int) {
pointer := unsafe.Pointer(&m)
fmt.Println(pointer)
m[1] = 2
}
stdout:0xc00008a008
0xc00008a018
map[1:2]
I want to know if the parameter is a value or a pointer.
The parameter is both a value and a pointer.
Wait.. whut?
Yes, a map (and a slice, for that matter) is a type pretty similar to one you might implement yourself. Think of a map like this:
type map struct {
// meta information on the map
meta struct{
keyT type
valueT type
len int
}
value *hashTable // pointer to the underlying data structure
}
So in your first function, where you reassign m, you're passing a copy of the struct above (pass by value), and you're assigning a new map to it, creating a new hashtable pointer in the process. The variable in the function scope is updated, but the one you passed still holds a reference to the original map, and with it, the pointer to the original map is preserved.
In the second snippet, you're accessing the underlying hash table (a copy of the pointer, but the pointer points to the same memory). You're directly manipulating the original map, because you're just changing the contents of the memory.
So TL;DR
A map is a value, containing meta information of what the map looks like, and a pointer to the actual data stored inside. The pointer is passed by value, like anything else (same way pointers are passed by value in C/C++), but of course, dereferencing a pointer means you're changing the values in memory directly.
Careful...
Like I said, slices work pretty much in the same way:
type slice struct {
meta struct {
type T
len, cap int
}
value *array // yes, it's a pointer to an underlying array
}
The underlying array of, say, a slice of ints will be a [10]int if the cap of the slice is 10, regardless of the length. A slice is managed by the Go runtime, so if you exceed the capacity, a new, larger array is allocated (roughly twice the cap of the previous one for small slices; the exact growth factor is an implementation detail), the existing data is copied over, and the slice's value field is set to point to the new array. That's the reason why append returns the slice that you're appending to: the underlying pointer may have changed. More in-depth information is in the references below.
The thing you have to be careful with is that a function like this:
func update(s []int) {
for i, v := range s {
s[i] = v*2
}
}
will behave in much the same way as the function you have where you're assigning m[1] = 2, but once you start appending, the runtime is free to move the underlying array around and point to a new memory address. So bottom line: maps and slices have an internal pointer, which can produce side-effects, but you're better off avoiding bugs/ambiguities. Go supports multiple return values, so just return a slice if you set about changing it.
Notes:
In your attempt to figure out what a map is (reference, value, pointer...), I noticed you tried this:
pointer := unsafe.Pointer(&m)
fmt.Println(pointer)
What you're doing there is actually printing the address of the argument variable, not any address that actually corresponds to the map itself. The argument passed to unsafe.Pointer isn't of type map[int]int, but rather of type *map[int]int.
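To make that distinction visible, here is a minimal sketch. The helpers mapData and inspect are hypothetical names; reflect.Value.Pointer is used only to peek at the internal pointer a map value carries. It compares the address of the parameter variable with the address of the caller's variable, and then compares the internal pointers:

```go
package main

import (
	"fmt"
	"reflect"
)

// mapData returns the internal pointer carried by a map value.
func mapData(m map[int]int) uintptr {
	return reflect.ValueOf(m).Pointer()
}

// inspect receives a copy of the map value plus the address of the
// caller's variable, and reports what the two have in common.
func inspect(m map[int]int, callerVar *map[int]int) (sameVariable, sameData bool) {
	sameVariable = &m == callerVar               // the parameter is its own variable
	sameData = mapData(m) == mapData(*callerVar) // but it points at the same map
	return sameVariable, sameData
}

func main() {
	orig := map[int]int{1: 1}
	sameVariable, sameData := inspect(orig, &orig)
	fmt.Println(sameVariable, sameData) // false true
}
```

The parameter lives at a different address than the caller's variable, yet both refer to the same underlying map, which is why m[1] = 2 is visible to the caller while m = map[int]int{...} is not.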
Personally, I think there's too much confusion around passing by value vs passing by reference. Go works exactly like C in this regard: just like in C, absolutely everything is passed by value. It just so happens that this value can sometimes be a memory address (pointer).
More details (references)
Slices: usage & internals
Maps Note: there's some confusion caused by this one, as pointers, slices, and maps are referred to as *reference types*, but as explained by others, and elsewhere, this is not to be confused with C++ references
In Go, map is a reference type. This means that the map actually resides in the heap and variable is just a pointer to that.
The map is passed by copy. You can change the local copy in your function, but this will not be reflected in caller's scope.
But, since the map variable is a pointer to the unique map residing in the heap, every change can be seen by any variable that points to the same map.
This article can clarify the concept: https://www.ardanlabs.com/blog/2014/12/using-pointers-in-go.html.

"cannot take the address of" and "cannot call pointer method on"

This compiles and works:
diff := projected.Minus(c.Origin)
dir := diff.Normalize()
This does not (yields the errors in the title):
dir := projected.Minus(c.Origin).Normalize()
Can someone help me understand why? (learning Go)
Here are those methods:
// Minus subtracts another vector from this one
func (a *Vector3) Minus(b Vector3) Vector3 {
return Vector3{a.X - b.X, a.Y - b.Y, a.Z - b.Z}
}
// Normalize makes the vector of length 1
func (a *Vector3) Normalize() Vector3 {
d := a.Length()
return Vector3{a.X / d, a.Y / d, a.Z / d}
}
The Vector3.Normalize() method has a pointer receiver, so in order to call this method, a pointer to Vector3 value is required (*Vector3). In your first example you store the return value of Vector3.Minus() in a variable, which will be of type Vector3.
Variables in Go are addressable, and when you write diff.Normalize(), this is a shortcut, and the compiler will automatically take the address of the diff variable to have the required receiver value of type *Vector3 in order to call Normalize(). So the compiler will "transform" it to
(&diff).Normalize()
This is detailed in Spec: Calls:
A method call x.m() is valid if the method set of (the type of) x contains m and the argument list can be assigned to the parameter list of m. If x is addressable and &x's method set contains m, x.m() is shorthand for (&x).m().
The reason your second example doesn't work is that return values of function and method calls are not addressable, so the compiler cannot apply the same shorthand here: it is not able to take the address of the return value of the Vector3.Minus() call.
What is addressable is exactly listed in the Spec: Address operators:
The operand must be addressable, that is, either a variable, pointer indirection, or slice indexing operation; or a field selector of an addressable struct operand; or an array indexing operation of an addressable array. As an exception to the addressability requirement, x [in the expression of &x] may also be a (possibly parenthesized) composite literal.
See related questions:
How to get the pointer of return value from function call?
How can I store reference to the result of an operation in Go?
Possible "workarounds"
"Easiest" (requiring the least change) is simply to assign to a variable, and call the method after that. This is your first working solution.
Another way is to modify the methods to have a value receiver (instead of pointer receiver), so that there is no need to take the address of the return values of the methods, so calls can be "chained". Note that this might not be viable if a method needs to modify the receiver, as that is only possible if it is a pointer (as the receiver is passed just like any other parameters – by making a copy –, and if it's not a pointer, you could only modify the copy).
Another way is to modify the return values to return pointers (*Vector3) instead of Vector3. If the return value is already a pointer, no need to take its address as it's good as-is for the receiver to a method that requires a pointer receiver.
You may also create a simple helper function which returns its address. It could look something like this:
func pv(v Vector3) *Vector3 {
return &v
}
Using it:
dir := pv(projected.Minus(c.Origin)).Normalize()
This could also be a method of Vector3, e.g.:
func (v Vector3) pv() *Vector3 {
return &v
}
And then using it:
dir := projected.Minus(c.Origin).pv().Normalize()
Some notes:
If your type consists of 3 float64 values only, you should not see significant performance differences. But you should be consistent about your receiver and result types. If most of your methods have pointer receivers, so should all of them. If most of your methods return pointers, so should all of them.
The accepted answer is really long so I'm just going to post what helped me:
I got this error regarding this line:
services.HashingServices{}.Hash("blabla")
so I just changed it to:
(&services.HashingServices{}).Hash("blabla")

Rust cannot move out of dereference pointer

I'm trying to run this code:
impl FibHeap {
fn insert(&mut self, key: int) -> () {
let new_node = Some(box create_node(key, None, None));
match self.min{
Some(ref mut t) => t.right = new_node,
None => (),
};
println!("{}",get_right(self.min));
}
}
fn get_right(e: Option<Box<Node>>) -> Option<Box<Node>> {
match e {
Some(t) => t.right,
None => None,
}
}
And I get this error:
error: cannot move out of dereference of `&mut`-pointer
println!("{}",get_right(self.min));
^
I don't understand why I get this error or what I should do to avoid it.
Your problem is that get_right() accepts Option<Box<Node>>, while it should really accept Option<&Node> and return Option<&Node> as well. The call site should be also changed appropriately.
Here is the explanation. Box<T> is a heap-allocated box. It obeys value semantics (that is, it behaves like plain T except that it has an associated destructor, so it is always moved, never copied). Hence passing just Box<T> into a function means giving up ownership of the value and moving it into the function. However, that is neither what you want nor something you can do here. The get_right() function only queries the existing structure, so it does not need ownership. And if ownership is not needed, then references are the answer. Moreover, it is simply impossible to move self.min into a function, because self.min is accessed through self, which is a borrowed pointer, and you can't move out of borrowed data; this is one of the basic safety guarantees provided by the compiler.
Change your get_right() definition to something like this:
fn get_right(e: Option<&Node>) -> Option<&Node> {
e.and_then(|n| n.right.as_ref().map(|r| &**r))
}
Then println!() call should be changed to this:
println!("{}", get_right(self.min.as_ref().map(|r| &**r)));
Here is what happens here. In order to obtain Option<&Node> from Option<Box<Node>> you need to apply the "conversion" to insides of the original Option. There is a method exactly for that, called map(). However, map() takes its target by value, which would mean moving Box<Node> into the closure. However, we only want to borrow Node, so first we need to go from Option<Box<Node>> to Option<&Box<Node>> in order for map() to work.
Option<T> has a method, as_ref(), which takes its target by reference and returns Option<&T>, a possible reference to the internals of the option. In our case it would be Option<&Box<Node>>. Now this value can be safely map()ped over since it contains a reference and a reference can be freely moved without affecting the original value.
So, next, map(|r| &**r) is a conversion from Option<&Box<Node>> to Option<&Node>. The closure argument is applied to the internals of the option if they are present, otherwise None is just passed through. &**r should be read inside out: &(*(*r)), that is, first we dereference &Box<Node>, obtaining Box<Node>, then we dereference the latter, obtaining just Node, and then we take a reference to it, finally getting &Node. Because these reference/dereference operations are juxtaposed, there is no movement/copying involved. So, we got an optional reference to a Node, Option<&Node>.
You can see that a similar thing happens in the get_right() function. However, a new method, and_then(), is also called there. It is equivalent to what you have written in get_right() initially: if its target is None, it returns None, otherwise it returns the result of the Option-returning closure passed as its argument:
fn and_then<U>(self, f: |T| -> Option<U>) -> Option<U> {
match self {
Some(e) => f(e),
None => None
}
}
I strongly suggest reading the official guide which explains what ownership and borrowing are and how to use them, because these are the very foundation of Rust language and it is very important to grasp them in order to be productive with Rust.

Passing custom slice types by reference

I'm having trouble wrapping my head around how pointers, slices, and interfaces interact in Go. This is what I currently have coded up:
type Loader interface {
Load(string, string)
}
type Foo struct {
a, b string
}
type FooList []Foo
func (l FooList) Load(a, b string) {
l = append(l, Foo{a, b})
// l contains 1 Foo here
}
func Load(list Loader) {
list.Load("1", "2")
// list is still nil here
}
Given this setup, I then try to do the following:
var list FooList
Load(list)
fmt.Println(list)
However, list is always nil here. My FooList.Load function does add an element to the l slice, but that's as far as it gets. The list in Load continues to be nil. I think I should be able to just pass the reference to my slice around and have things append to it. I'm obviously missing something on how to get it to work though.
(Code in http://play.golang.org/p/uuRKjtxs9D)
If you intend your method to make changes, you probably want to use a pointer receiver.
// We also define a method Load on a FooList pointer receiver.
func (l *FooList) Load(a, b string) {
*l = append(*l, Foo{a, b})
}
This has a consequence, though, that a FooList value won't itself satisfy the Loader interface.
var list FooList
Load(list) // You should see a compiler error at this point.
A pointer to a FooList value, though, will satisfy the Loader interface.
var list FooList
Load(&list)
Complete code below:
package main
import "fmt"
/////////////////////////////
type Loader interface {
Load(string, string)
}
func Load(list Loader) {
list.Load("1", "2")
}
/////////////////////////////
type Foo struct {
a, b string
}
// We define a FooList to be a slice of Foo.
type FooList []Foo
// We also define a method Load on a FooList pointer receiver.
func (l *FooList) Load(a, b string) {
*l = append(*l, Foo{a, b})
}
// Given that we've defined the method with a pointer receiver, then a plain
// old FooList won't satisfy the Loader interface... but a FooList pointer will.
func main() {
var list FooList
Load(&list)
fmt.Println(list)
}
I'm going to simplify the problem so it's easier to understand. What is being done there is very similar to this, which also does not work (you can run it here):
type myInt int
func (a myInt) increment() { a = a + 1 }
func increment(b myInt) { b.increment() }
func main() {
var c myInt = 42
increment(c)
fmt.Println(c) // => 42
}
The reason why this does not work is because Go passes parameters by value, as the documentation describes:
In a function call, the function value and arguments are evaluated in the usual
order. After they are evaluated, the parameters of the call are passed by value
to the function and the called function begins execution.
In practice, this means that each of a, b, and c in the example above are pointing to different int variables, with a and b being copies of the initial c value.
To fix it, we must use pointers so that we can refer to the same area of memory (runnable here):
type myInt int
func (a *myInt) increment() { *a = *a + 1 }
func increment(b *myInt) { b.increment() }
func main() {
var c myInt = 42
increment(&c)
fmt.Println(c) // => 43
}
Now a and b are both pointers that contain the address of variable c, allowing their respective logic to change the original value. Note that the documented behavior still holds here: a and b are still copies of the original value, but the original value provided as a parameter to the increment function is the address of c.
The case for slices is no different than this. They are references, but the reference itself is provided as a parameter by value, so if you change the reference, the call site will not observe the change since they are different variables.
There's also a different way to make it work, though: implementing an API that resembles that of the standard append function. Again using the simpler example, we might implement increment without mutating the original value, and without using a pointer, by returning the changed value instead:
func increment(i int) int { return i+1 }
You can see that technique used in a number of places in the standard library, such as the strconv.AppendInt function.
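Applied to the FooList from the question, an append-style API might look like the sketch below. The method name Loaded is made up for illustration, and because its signature differs from Load(string, string) it would not satisfy the Loader interface as declared in the question.

```go
package main

import "fmt"

type Foo struct {
	a, b string
}

type FooList []Foo

// Loaded is a hypothetical append-style variant of Load: it keeps the
// value receiver and returns the grown slice instead of mutating it.
func (l FooList) Loaded(a, b string) FooList {
	return append(l, Foo{a, b})
}

func main() {
	var list FooList
	list = list.Loaded("1", "2") // the caller stores the returned slice
	fmt.Println(len(list), list) // 1 [{1 2}]
}
```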
It's worth keeping a mental model of how Go's data structures are implemented. That usually makes it easier to reason about behaviour like this.
http://research.swtch.com/godata is a good introduction to the high-level view.
Go is pass-by-value. This is true for both parameters and receivers. If you need to assign to the slice value, you need to use a pointer.
Then I read somewhere that you shouldn't pass pointers to slices since
they are already references
This is not entirely true, and is missing part of the story.
When we say something is a "reference type", including a map type, a channel type, etc., we mean that it is actually a pointer to an internal data structure. For example, you can think of a map type as basically defined as:
// pseudocode
type map *SomeInternalMapStructure
So to modify the "contents" of the associative array, you don't need to assign to a map variable; you can pass a map variable by value and that function can change the contents of the associative array pointed to by the map variable, and it will be visible to the caller. This makes sense when you realize it's a pointer to some internal data structure. You would only assign to a map variable if you want to change which internal associative array you want it to point to.
However, a slice is more complicated. It is a pointer (to an internal array), plus the length and capacity, two integers. So basically, you can think of it as:
// pseudocode
type slice struct {
underlyingArray uintptr
length int
capacity int
}
So it's not "just" a pointer. It is a pointer with respect to the underlying array. But the length and capacity are "value" parts of the slice type.
So if you just need to change an element of the slice, then yes, it acts like a reference type, in that you can pass the slice by value and have the function change an element and it's visible to the caller.
However, when you append() (which is what you're doing in the question), it's different. First, appending affects the length of the slice, and length is one of the direct parts of the slice, not behind a pointer. Second, appending may produce a different underlying array (if the capacity of the original underlying array is not enough, it allocates a new one); thus the array pointer part of the slice might also be changed. Thus it is necessary to change the slice value. (This is why append() returns something.) In this sense, it cannot be regarded as a reference type, because we are not just "changing what it points to"; we are changing the slice directly.
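Both effects of append can be observed directly. The following is a small sketch; the helper sharesArray is a made-up name that compares the addresses of the first elements of two slices.

```go
package main

import "fmt"

// sharesArray reports whether two non-empty slices start at the same
// element of the same underlying array.
func sharesArray(a, b []int) bool {
	return &a[0] == &b[0]
}

func main() {
	// Capacity exhausted: append must allocate a new underlying array.
	full := make([]int, 2, 2)
	grown := append(full, 7)
	fmt.Println(len(full), len(grown), sharesArray(full, grown)) // 2 3 false

	// Spare capacity: append reuses the array, but the length still
	// changes only in the returned slice value.
	roomy := make([]int, 2, 4)
	extended := append(roomy, 7)
	fmt.Println(len(roomy), len(extended), sharesArray(roomy, extended)) // 2 3 true
}
```

In both cases the original slice value keeps its old length, which is why the result of append has to be stored somewhere the caller can see it.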

What is a 'Closure'?

I asked a question about Currying and closures were mentioned.
What is a closure? How does it relate to currying?
Variable scope
When you declare a local variable, that variable has a scope. Generally, local variables exist only within the block or function in which you declare them.
function() {
var a = 1;
console.log(a); // works
}
console.log(a); // fails
If I try to access a local variable, most languages will look for it in the current scope, then up through the parent scopes until they reach the root scope.
var a = 1;
function() {
console.log(a); // works
}
console.log(a); // works
When a block or function is done with, its local variables are no longer needed and are usually blown out of memory.
This is how we normally expect things to work.
A closure is a persistent local variable scope
A closure is a persistent scope which holds on to local variables even after the code execution has moved out of that block. Languages which support closure (such as JavaScript, Swift, and Ruby) will allow you to keep a reference to a scope (including its parent scopes), even after the block in which those variables were declared has finished executing, provided you keep a reference to that block or function somewhere.
The scope object and all its local variables are tied to the function and will persist as long as that function persists.
This gives us function portability. We can expect any variables that were in scope when the function was first defined to still be in scope when we later call the function, even if we call the function in a completely different context.
For example
Here's a really simple example in JavaScript that illustrates the point:
outer = function() {
var a = 1;
var inner = function() {
console.log(a);
}
return inner; // this returns a function
}
var fnc = outer(); // execute outer to get inner
fnc();
Here I have defined a function within a function. The inner function gains access to all the outer function's local variables, including a. The variable a is in scope for the inner function.
Normally when a function exits, all its local variables are blown away. However, if we return the inner function and assign it to a variable fnc so that it persists after outer has exited, all of the variables that were in scope when inner was defined also persist. The variable a has been closed over -- it is within a closure.
Note that the variable a is totally private to fnc. This is a way of creating private variables in a functional programming language such as JavaScript.
As you might be able to guess, when I call fnc() it prints the value of a, which is "1".
In a language without closure, the variable a would have been garbage collected and thrown away when the function outer exited. Calling fnc would have thrown an error because a no longer exists.
In JavaScript, the variable a persists because the variable scope is created when the function is first declared and persists for as long as the function continues to exist.
a belongs to the scope of outer. The scope of inner has a parent pointer to the scope of outer. fnc is a variable which points to inner. a persists as long as fnc persists. a is within the closure.
Further reading (watching)
I made a YouTube video looking at this code with some practical examples of usage.
I'll give an example (in JavaScript):
function makeCounter () {
var count = 0;
return function () {
count += 1;
return count;
}
}
var x = makeCounter();
x(); // returns 1
x(); // returns 2
// ...etc...
What this function, makeCounter, does is it returns a function, which we've called x, that will count up by one each time it's called. Since we're not providing any parameters to x, it must somehow remember the count. It knows where to find it based on what's called lexical scoping - it must look to the spot where it's defined to find the value. This "hidden" value is what is called a closure.
Here is my currying example again:
function add (a) {
return function (b) {
return a + b;
}
}
var add3 = add(3);
add3(4); // returns 7
What you can see is that when you call add with the parameter a (which is 3), that value is contained in the closure of the returned function that we're defining to be add3. That way, when we call add3, it knows where to find the a value to perform the addition.
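The same two ideas carry over to other languages with closures. As a rough sketch in Go (a direct translation of makeCounter and add above, not part of the original examples):

```go
package main

import "fmt"

// makeCounter returns a function that remembers count between calls.
func makeCounter() func() int {
	count := 0
	return func() int {
		count++ // count lives on after makeCounter has returned
		return count
	}
}

// add returns a function that has closed over a.
func add(a int) func(int) int {
	return func(b int) int {
		return a + b
	}
}

func main() {
	x := makeCounter()
	fmt.Println(x(), x()) // 1 2

	add3 := add(3)
	fmt.Println(add3(4)) // 7
}
```

Each call to makeCounter creates a fresh count, so two counters made this way do not interfere with each other.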
First of all, contrary to what most of the people here tell you, closure is not a function! So what is it?
It is a set of symbols defined in a function's "surrounding context" (known as its environment) which make it a CLOSED expression (that is, an expression in which every symbol is defined and has a value, so it can be evaluated).
For example, when you have a JavaScript function:
function closed(x) {
return x + 3;
}
it is a closed expression because all the symbols occurring in it are defined in it (their meanings are clear), so you can evaluate it. In other words, it is self-contained.
But if you have a function like this:
function open(x) {
return x*y + 3;
}
it is an open expression because there are symbols in it which have not been defined in it. Namely, y. When looking at this function, we can't tell what y is and what does it mean, we don't know its value, so we cannot evaluate this expression. I.e. we cannot call this function until we tell what y is supposed to mean in it. This y is called a free variable.
This y begs for a definition, but this definition is not part of the function – it is defined somewhere else, in its "surrounding context" (also known as the environment). At least that's what we hope for :P
For example, it could be defined globally:
var y = 7;
function open(x) {
return x*y + 3;
}
Or it could be defined in a function which wraps it:
var global = 2;
function wrapper(y) {
var w = "unused";
return function(x) {
return x*y + 3;
}
}
The part of the environment which gives the free variables in an expression their meanings, is the closure. It is called this way, because it turns an open expression into a closed one, by supplying these missing definitions for all of its free variables, so that we could evaluate it.
In the example above, the inner function (which we didn't give a name because we didn't need it) is an open expression because the variable y in it is free – its definition is outside the function, in the function which wraps it. The environment for that anonymous function is the set of variables:
{
global: 2,
w: "unused",
y: [whatever has been passed to that wrapper function as its parameter `y`]
}
Now, the closure is that part of this environment which closes the inner function by supplying the definitions for all its free variables. In our case, the only free variable in the inner function was y, so the closure of that function is this subset of its environment:
{
y: [whatever has been passed to that wrapper function as its parameter `y`]
}
The other two symbols defined in the environment are not part of the closure of that function, because it doesn't require them to run. They are not needed to close it.
More on the theory behind that here:
https://stackoverflow.com/a/36878651/434562
It's worth noting that in the example above, the wrapper function returns its inner function as a value. The moment we call this function can be remote in time from the moment the function has been defined (or created). In particular, its wrapping function is no longer running, and its parameters, which were on the call stack, are no longer there :P This creates a problem, because the inner function needs y to be there when it is called! In other words, it requires the variables from its closure to somehow outlive the wrapper function and be there when needed. Therefore, the inner function has to make a snapshot of these variables which make its closure and store them somewhere safe for later use. (Somewhere outside the call stack.)
And this is why people often confuse the term closure to be that special type of function which can do such snapshots of the external variables they use, or the data structure used to store these variables for later. But I hope you understand now that they are not the closure itself – they're just ways to implement closures in a programming language, or language mechanisms which allow the variables from the function's closure to be there when needed. There are a lot of misconceptions around closures which (unnecessarily) make this subject much more confusing and complicated than it actually is.
Kyle's answer is pretty good. I think the only additional clarification is that the closure is basically a snapshot of the stack at the point that the lambda function is created. Then when the function is re-executed the stack is restored to that state before executing the function. Thus as Kyle mentions, that hidden value (count) is available when the lambda function executes.
A closure is a function that can reference state in another function. For example, in Python, this uses the closure "inner":
def outer(a):
    b = "variable in outer()"
    def inner(c):
        print(a, b, c)
    return inner
# Now the return value from outer() can be saved for later
func = outer("test")
func(1)  # prints "test variable in outer() 1"
To help facilitate understanding of closures it might be useful to examine how they might be implemented in a procedural language. This explanation will follow a simplistic implementation of closures in Scheme.
To start, I must introduce the concept of a namespace. When you enter a command into a Scheme interpreter, it must evaluate the various symbols in the expression and obtain their value. Example:
(define x 3)
(define y 4)
(+ x y) ; returns 7
The define expressions store the value 3 in the spot for x and the value 4 in the spot for y. Then when we call (+ x y), the interpreter looks up the values in the namespace and is able to perform the operation and return 7.
However, in Scheme there are expressions that allow you to temporarily override the value of a symbol. Here's an example:
(define x 3)
(define y 4)
(let ((x 5))
  (+ x y)) ; returns 9
x ; returns 3
What the let keyword does is introduce a new namespace in which x has the value 5. You will notice that it's still able to see that y is 4, making the sum returned 9. You can also see that once the expression has ended, x is back to being 3. In this sense, x has been temporarily masked by the local value.
Procedural and object-oriented languages have a similar concept. Whenever you declare a variable in a function that has the same name as a global variable you get the same effect.
How would we implement this? A simple way is with a linked list - the head contains the new value and the tail contains the old namespace. When you need to look up a symbol, you start at the head and work your way down the tail.
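The linked-list idea can be sketched in a few lines of JavaScript. This is only an illustration, not how any particular Scheme is implemented; extend, lookup, globalEnv and letEnv are hypothetical names.

```javascript
// Each node holds one binding (the head) and a pointer to the older
// namespace (the tail). null marks the end of the list.
function extend(name, value, parent) {
  return { name, value, parent };
}

// Start at the head and work down the tail until the symbol is found.
function lookup(env, name) {
  for (let node = env; node !== null; node = node.parent) {
    if (node.name === name) return node.value;
  }
  throw new ReferenceError(name + " is not defined");
}

// (define x 3) (define y 4)
const globalEnv = extend("y", 4, extend("x", 3, null));
// (let ((x 5)) ...) puts a new x in front of the old namespace.
const letEnv = extend("x", 5, globalEnv);

console.log(lookup(letEnv, "x") + lookup(letEnv, "y")); // 9
console.log(lookup(globalEnv, "x")); // 3, the old x was only masked
```

Because the new binding sits in front of the old list instead of overwriting it, leaving the let simply means going back to the old head.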
Now let's skip to the implementation of first-class functions for the moment. More or less, a function is a set of instructions to execute when the function is called culminating in the return value. When we read in a function, we can store these instructions behind the scenes and run them when the function is called.
(define x 3)
(define (plus-x y)
  (+ x y))
(let ((x 5))
  (plus-x 4)) ; returns ?
We define x to be 3 and plus-x to be its parameter, y, plus the value of x. Finally we call plus-x in an environment where x has been masked by a new x, this one valued 5. If we merely store the operation, (+ x y), for the function plus-x, since we're in the context of x being 5 the result returned would be 9. This is what's called dynamic scoping.
However, Scheme, Common Lisp, and many other languages have what's called lexical scoping - in addition to storing the operation (+ x y) we also store the namespace at that particular point. That way, when we're looking up the values we can see that x, in this context, is really 3. This is a closure.
(define x 3)
(define (plus-x y)
  (+ x y))
(let ((x 5))
  (plus-x 4)) ; returns 7
In summary, we can use a linked list to store the state of the namespace at the time of function definition, allowing us to access variables from enclosing scopes, as well as providing us the ability to locally mask a variable without affecting the rest of the program.
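To make the difference between the two scoping rules concrete, here is a small JavaScript sketch of the plus-x example. It assumes the linked-list namespace described above; makeClosure, callLexical and callDynamic are hypothetical helpers, not part of any real interpreter.

```javascript
function extend(name, value, parent) {
  return { name, value, parent };
}

function lookup(env, name) {
  for (let node = env; node !== null; node = node.parent) {
    if (node.name === name) return node.value;
  }
  throw new ReferenceError(name + " is not defined");
}

// A function value stores its body AND the namespace at definition time.
function makeClosure(param, body, definitionEnv) {
  return { param, body, definitionEnv };
}

// Lexical scoping: the argument is added in front of the stored namespace.
function callLexical(closure, arg) {
  return closure.body(extend(closure.param, arg, closure.definitionEnv));
}

// Dynamic scoping: the argument is added in front of the caller's namespace.
function callDynamic(closure, arg, callerEnv) {
  return closure.body(extend(closure.param, arg, callerEnv));
}

const globalEnv = extend("x", 3, null); // (define x 3)
const plusX = makeClosure(
  "y",
  (env) => lookup(env, "x") + lookup(env, "y"), // (+ x y)
  globalEnv
);
const letEnv = extend("x", 5, globalEnv); // (let ((x 5)) ...)

console.log(callLexical(plusX, 4)); // 7
console.log(callDynamic(plusX, 4, letEnv)); // 9
```

The only difference between the two calls is which list the parameter is prepended to, and that alone decides which x is found.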
Functions containing no free variables are called pure functions.
Functions containing one or more free variables are called closures.
var pure = function pure(x){
  return x
  // only own environment is used
}
var foo = "bar"
var closure = function closure(){
  return foo
  // foo is a free variable from the outer environment
}
src: https://leanpub.com/javascriptallongesix/read#leanpub-auto-if-functions-without-free-variables-are-pure-are-closures-impure
Here's a real world example of why Closures kick ass... This is straight out of my Javascript code. Let me illustrate.
Function.prototype.delay = function(ms /*[, arg...]*/) {
  var fn = this,
      args = Array.prototype.slice.call(arguments, 1);
  return window.setTimeout(function() {
    return fn.apply(fn, args);
  }, ms);
};
And here's how you would use it:
var startPlayback = function(track) {
  Player.play(track);
};
startPlayback(someTrack);
Now imagine you want the playback to start delayed, for example 5 seconds after this code snippet runs. Well, that's easy with delay and its closure:
startPlayback.delay(5000, someTrack);
// Keep going, do other things
When you call delay with 5000ms, the first snippet runs and stores the passed-in arguments in its closure. Then 5 seconds later, when the setTimeout callback happens, the closure still maintains those variables, so it can call the original function with the original parameters.
This is a type of currying, or function decoration.
Without closures, you would have to somehow maintain the state of those variables outside the function, thus littering code outside the function with something that logically belongs inside it. Using closures can greatly improve the quality and readability of your code.
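Stripped of the timer and of the change to Function.prototype, the mechanism looks roughly like this. It is a sketch only: defer, played and the track name are hypothetical stand-ins, and a plain array replaces the Player object so that the snippet runs anywhere.

```javascript
// Capture a function and its arguments now, run them later.
function defer(fn, ...args) {
  return function () {
    // fn and args are free variables here; they live on in the closure.
    return fn(...args);
  };
}

const played = [];
const startPlayback = function (track) {
  played.push(track); // stand-in for Player.play(track)
};

const later = defer(startPlayback, "track-1");

// Keep going, do other things. Nothing has been played yet.
console.log(played.length); // 0

later();
console.log(played); // [ 'track-1' ]

// With a timer, the same function could be handed to setTimeout:
// setTimeout(later, 5000);
```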
tl;dr
A closure is a function and its scope assigned to (or used as) a variable. Thus, the name closure: the scope and the function are enclosed and used just like any other entity.
In depth Wikipedia style explanation
According to Wikipedia, a closure is:
Techniques for implementing lexically scoped name binding in languages with first-class functions.
What does that mean? Let's look into some definitions.
I will explain closures and other related definitions by using this example:
function startAt(x) {
  return function (y) {
    return x + y;
  }
}
var closure1 = startAt(1);
var closure2 = startAt(5);
console.log(closure1(3)); // 4 (x == 1, y == 3)
console.log(closure2(3)); // 8 (x == 5, y == 3)
First-class functions
Basically that means we can use functions just like any other entity. We can modify them, pass them as arguments, return them from functions or assign them to variables. Technically speaking, they are first-class citizens, hence the name: first-class functions.
In the example above, startAt returns an (anonymous) function, which gets assigned to closure1 and closure2. So as you see, JavaScript treats functions just like any other entities (first-class citizens).
Name binding
Name binding is about finding out what data a variable (identifier) references. The scope is really important here, as that is the thing that will determine how a binding is resolved.
In the example above:
In the inner anonymous function's scope, y is bound to 3.
In startAt's scope, x is bound to 1 or 5 (depending on the closure).
Inside the anonymous function's scope, x is not bound to any value, so it needs to be resolved in an upper (startAt's) scope.
Lexical scoping
As Wikipedia says, the scope:
Is the region of a computer program where the binding is valid: where the name can be used to refer to the entity.
There are two techniques:
Lexical (static) scoping: A variable's definition is resolved by searching its containing block or function, then if that fails searching the outer containing block, and so on.
Dynamic scoping: Calling function is searched, then the function which called that calling function, and so on, progressing up the call stack.
For more explanation, check out this question and take a look at Wikipedia.
In the example above, we can see that JavaScript is lexically scoped: when x is resolved, the binding is searched for in the upper (startAt's) scope based on the source code (the anonymous function that looks for x is defined inside startAt), and not based on the call stack, that is, the scope from which the function was called.
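One way to check this is to call the returned function from a scope that has its own, unrelated x. The caller function below is a hypothetical addition to the startAt example.

```javascript
function startAt(x) {
  return function (y) {
    return x + y;
  };
}

function caller() {
  // A different x, local to the caller. Under dynamic scoping this one
  // would be found first when the inner function runs.
  const x = 100;
  const add = startAt(1);
  return add(3);
}

console.log(caller()); // 4, not 103
```

The x used is the one from startAt, where the anonymous function was written, not the one in caller, where it was invoked.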
Wrapping (closuring) up
In our example, when we call startAt, it returns a (first-class) function that is assigned to closure1 and closure2. Thus a closure is created, because the passed values 1 and 5 are saved within startAt's scope, which is enclosed with the returned anonymous function. When we call this anonymous function via closure1 and closure2 with the same argument (3), the value of y is found immediately (as it is the parameter of that function), but x is not bound in the scope of the anonymous function, so the resolution continues in the (lexically) upper function scope (the one saved in the closure), where x is found to be bound to either 1 or 5. Now we know everything needed for the summation, so the result can be returned, then printed.
Now you should understand closures and how they behave, which is a fundamental part of JavaScript.
Currying
Oh, and you also learned what currying is about: you use functions (closures) to pass each argument of an operation one at a time, instead of using one function with multiple parameters.
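For instance, a three-argument addition can be written so that each argument is passed in its own call. The names add3, addOne and addOneAndTwo are hypothetical.

```javascript
// Each call returns a new function that closes over the arguments so far.
const add3 = (a) => (b) => (c) => a + b + c;

const addOne = add3(1); // remembers a == 1
const addOneAndTwo = addOne(2); // remembers a == 1 and b == 2

console.log(addOneAndTwo(3)); // 6
console.log(add3(10)(20)(30)); // 60
```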
Closure is a feature in JavaScript where a function has access to its own scope variables, access to the outer function variables and access to the global variables.
Closure has access to its outer function scope even after the outer function has returned. This means a closure can remember and access variables and arguments of its outer function even after the function has finished.
The inner function can access the variables defined in its own scope, the outer function’s scope, and the global scope. And the outer function can access the variable defined in its own scope and the global scope.
Example of Closure:
var globalValue = 5;
function functOuter() {
  var outerFunctionValue = 10;
  //Inner function has access to the outer function value
  //and the global variables
  function functInner() {
    var innerFunctionValue = 5;
    alert(globalValue + outerFunctionValue + innerFunctionValue);
  }
  functInner();
}
functOuter();
The output will be 20, which is the sum of the inner function's own variable, the outer function's variable and the global variable.
In a normal situation, variables are bound by scoping rule: Local variables work only within the defined function. Closure is a way of breaking this rule temporarily for convenience.
def n_times(a_thing)
  return lambda{|n| a_thing * n}
end
In the above code, lambda{|n| a_thing * n} is the closure, because a_thing is referred to by the lambda (an anonymous function creator).
Now, if you put the resulting anonymous function in a variable:
foo = n_times(4)
foo will break the normal scoping rule and start using 4 internally.
foo.call(3)
returns 12.
In short, a function pointer is just a pointer to a location in the program code base (like the program counter), whereas Closure = Function pointer + Stack frame.
Closures provide JavaScript with state.
State in programming simply means remembering things.
Example
var a = 0;
a = a + 1; // => 1
a = a + 1; // => 2
a = a + 1; // => 3
In the case above, state is stored in the variable "a". We follow by adding 1 to "a" several times. We can only do that because we are able to "remember" the value. The state holder, "a", holds that value in memory.
Often, in programming languages, you want to keep track of things, remember information and access it at a later time.
This, in other languages, is commonly accomplished through the use of classes. A class, just like a variable, keeps track of its state. And instances of that class, in turn, also have state within them. State simply means information that you can store and retrieve later.
Example
class Bread {
  constructor (weight) {
    this.weight = weight;
  }
  render () {
    return `My weight is ${this.weight}!`;
  }
}
How can we access "weight" from within the "render" method? Well, thanks to state. Each instance of the class Bread can render its own weight by reading it from the "state", a place in memory where we could store that information.
Now, JavaScript is a rather unique language which historically did not have classes (it now does, but under the hood there are only functions and variables), so closures provide a way for JavaScript to remember things and access them later.
Example
var n = 0;
var count = function () {
  n = n + 1;
  return n;
};
count(); // # 1
count(); // # 2
count(); // # 3
The example above achieved the goal of "keeping state" with a variable. This is great! However, this has the disadvantage that the variable (the "state" holder) is now exposed. We can do better. We can use Closures.
Example
var countGenerator = function () {
  var n = 0;
  var count = function () {
    n = n + 1;
    return n;
  };
  return count;
};
var count = countGenerator();
count(); // # 1
count(); // # 2
count(); // # 3
This is fantastic.
Now our "count" function can count. It is only able to do so because it can "hold" state. The state in this case is the variable "n". This variable is now closed. Closed in time and space. In time because you won't ever be able to recover it, change it, assign it a value or interact directly with it. In space because it's geographically nested within the "countGenerator" function.
Why is this fantastic? Because without involving any other sophisticated and complicated tool (e.g. classes, methods, instances, etc) we are able to
1. conceal
2. control from a distance
We conceal the state, the variable "n", which makes it a private variable!
We have also created an API that can control this variable in a pre-defined way. In particular, we can call the API like so, "count()", and that adds 1 to "n" from a "distance". In no way, shape or form will anyone ever be able to access "n" except through the API.
JavaScript is truly amazing in its simplicity.
Closures are a big part of why this is.
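A follow-up sketch, assuming the same countGenerator as above: every call to countGenerator creates a fresh "n", so two counters never interfere with each other. The names countA and countB are hypothetical.

```javascript
var countGenerator = function () {
  var n = 0;
  return function count() {
    n = n + 1;
    return n;
  };
};

var countA = countGenerator();
var countB = countGenerator();

countA();
countA();
console.log(countA()); // 3
console.log(countB()); // 1, countB has its own private n
```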
Here is another real life example, and using a scripting language popular in games - Lua. I needed to slightly change the way a library function worked to avoid a problem with stdin not being available.
local old_dofile = dofile
function dofile( filename )
  if filename == nil then
    error( 'Can not use default of stdin.' )
  end
  old_dofile( filename )
end
The value of old_dofile disappears when this block of code finishes its scope (because it's local); however, the value has been enclosed in a closure, so the newly redefined dofile function CAN access it, or rather a copy stored along with the function as an 'upvalue'.
From Lua.org:
When a function is written enclosed in another function, it has full access to local variables from the enclosing function; this feature is called lexical scoping. Although that may sound obvious, it is not. Lexical scoping, plus first-class functions, is a powerful concept in a programming language, but few languages support that concept.
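The same wrapping trick can be sketched in JavaScript. Everything here is hypothetical: lib and its load method stand in for the library function, and oldLoad plays the role of old_dofile.

```javascript
// Hypothetical library object standing in for the real one.
const lib = {
  load(name) {
    return "loaded " + name;
  },
};

{
  // Block-scoped: oldLoad is not visible outside these braces,
  // but the replacement function below keeps it alive in its closure.
  const oldLoad = lib.load;
  lib.load = function (name) {
    if (name === undefined) {
      throw new Error("A file name is required.");
    }
    return oldLoad(name);
  };
}

console.log(lib.load("init.lua")); // loaded init.lua
```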
If you are from the Java world, you can compare a closure with a member function of a class. Look at this example
var f=function(){
  var a=7;
  var g=function(){
    return a;
  }
  return g;
}
The function g is a closure: g closes a in. So g can be compared with a member function, a can be compared with a class field, and the function f with a class.
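To make the comparison concrete, here are the two versions side by side. The class F is a hypothetical counterpart written for this illustration; it uses a private field, which needs a reasonably recent JavaScript engine.

```javascript
// Class version: the state lives in a private field of the instance.
class F {
  #a = 7;
  g() {
    return this.#a;
  }
}

// Closure version: the state lives in a variable that g closes over.
var f = function () {
  var a = 7;
  var g = function () {
    return a;
  };
  return g;
};

var viaClass = new F();
var viaClosure = f();

console.log(viaClass.g()); // 7
console.log(viaClosure()); // 7
```

In both cases a cannot be reached from outside except through g.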
Closures
Whenever we have a function defined inside another function, the inner function has access to the variables declared in the outer function. Closures are best explained with examples.
In Listing 2-18, you can see that the inner function has access to a variable (variableInOuterFunction) from the outer scope. The variables in the outer function have been closed by (or bound in) the inner function. Hence the term closure. The concept in itself is simple enough and fairly intuitive.
Listing 2-18:
function outerFunction(arg) {
  var variableInOuterFunction = arg;
  function bar() {
    console.log(variableInOuterFunction); // Access a variable from the outer scope
  }
  // Call the local function to demonstrate that it has access to arg
  bar();
}
outerFunction('hello closure!'); // logs hello closure!
source: http://index-of.es/Varios/Basarat%20Ali%20Syed%20(auth.)-Beginning%20Node.js-Apress%20(2014).pdf
Have a look at the code below to understand closures in more depth:
for(var i=0; i< 5; i++){
  setTimeout(function(){
    console.log(i);
  }, 1000);
}
What will the output be here? Not 0,1,2,3,4; it will be 5,5,5,5,5, because of the closure.
So how can this be solved? The answer is below:
for(var i=0; i< 5; i++){
  (function(j){ //using IIFE
    setTimeout(function(){
      console.log(j);
    },1000);
  })(i);
}
Let me explain simply: when a function is created, nothing happens until it is called. In the first snippet the for loop runs 5 times and registers 5 callbacks, but they are not called immediately; they are called after 1 second. Because setTimeout is asynchronous, the for loop has already finished by then and stored the value 5 in var i, so the setTimeout callbacks finally execute five times and print 5,5,5,5,5.
Here is how the IIFE (Immediately Invoked Function Expression) solves it:
(function(j){ //i is passed here
  setTimeout(function(){
    console.log(j);
  },1000);
})(i); // called immediately, so j stores i=0 on the 1st iteration, i=1 on the 2nd, and so on; prints 0,1,2,3,4
For more, study execution contexts to understand closures.
There is one more way to solve this, using let (an ES6 feature); under the hood it works like the function above:
for(let i=0; i< 5; i++){
  setTimeout(function(){
    console.log(i);
  },1000);
}
Output: 0,1,2,3,4
=> More explanation:
In memory, when the for loop executes, the picture looks like this:
Loop 1)
setTimeout(function(){
  console.log(i);
},1000);
Loop 2)
setTimeout(function(){
  console.log(i);
},1000);
Loop 3)
setTimeout(function(){
  console.log(i);
},1000);
Loop 4)
setTimeout(function(){
  console.log(i);
},1000);
Loop 5)
setTimeout(function(){
  console.log(i);
},1000);
Here i is not evaluated when each function is created; after the loop completes, var i holds the value 5 in memory. Its scope is always visible to its child functions, so when the functions inside setTimeout execute, five times, they print 5,5,5,5,5.
So to resolve this, use an IIFE as explained above.
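The same effect can be observed without timers by collecting the functions in an array and calling them after the loop has finished. This is a sketch; withVar and withLet are hypothetical names.

```javascript
// var: one shared i for the whole loop, so every function sees its final value.
var withVar = [];
for (var i = 0; i < 5; i++) {
  withVar.push(function () {
    return i;
  });
}

// let: a fresh j for each iteration, so every function keeps its own value.
var withLet = [];
for (let j = 0; j < 5; j++) {
  withLet.push(function () {
    return j;
  });
}

console.log(withVar.map(function (f) { return f(); })); // [ 5, 5, 5, 5, 5 ]
console.log(withLet.map(function (f) { return f(); })); // [ 0, 1, 2, 3, 4 ]
```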
Currying : It allows you to partially evaluate a function by only passing in a subset of its arguments. Consider this:
function multiply (x, y) {
  return x * y;
}
const double = multiply.bind(null, 2);
const eight = double(4);
eight == 8;
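What bind does for the leading argument here can be approximated with a closure. This is a simplified sketch, not how bind is actually implemented (it ignores this, for one); partial is a hypothetical helper. Fixing some arguments in advance like this is often called partial application.

```javascript
// Remember some arguments now, accept the rest later.
function partial(fn, ...preset) {
  return function (...rest) {
    // fn and preset come from the closure; rest arrives at call time.
    return fn(...preset, ...rest);
  };
}

function multiply(x, y) {
  return x * y;
}

const double = partial(multiply, 2);
console.log(double(4)); // 8
```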
Closure: A closure is nothing more than accessing a variable outside of a function's scope. It is important to remember that a function inside a function, or a nested function, isn't a closure. Closures are used whenever a function needs to access variables outside its own scope.
function apple(x){
  function google(y,z) {
    console.log(x*y);
  }
  google(7,2);
}
apple(3);
// the answer here will be 21
Closure is very easy. We can consider it as follows :
Closure = function + its lexical environment
Consider the following function:
function init() {
  var name = "Mozilla";
}
What will be the closure in the above case ?
Function init() and variables in its lexical environment ie name.
Closure = init() + name
Consider another function :
function init() {
  var name = "Mozilla";
  function displayName(){
    alert(name);
  }
  displayName();
}
What will be the closures here?
An inner function can access the variables of its outer function. displayName() can access the variable name declared in the parent function, init(). However, local variables with the same name in displayName() will be used instead if they exist.
Closure 1 : init function + ( name variable + displayName() function) --> lexical scope
Closure 2 : displayName function + ( name variable ) --> lexical scope
A simple example in Groovy for your reference:
def outer() {
def x = 1
return { -> println(x)} // inner
}
def innerObj = outer()
innerObj() // prints 1
Here is an example illustrating a closure in the Scheme programming language.
First we define a function defining a local variable, not visible outside the function.
; Function using a local variable
(define (function)
  (define a 1)
  (display a) ; prints 1, when calling (function)
  )
(function) ; prints 1
(display a) ; fails: a undefined
Here is the same example, but now the function uses a global variable, defined outside the function.
; Function using a global variable
(define b 2)
(define (function)
  (display b) ; prints 2, when calling (function)
  )
(function) ; prints 2
(display b) ; prints 2
And finally, here is an example of a function carrying its own closure:
; Function with closure
(define (outer)
  (define c 3)
  (define (inner)
    (display c))
  inner ; outer function returns the inner function as result
  )
(define function (outer))
(function) ; prints 3

Resources