Rcpp function to construct a function - r

In R the possibility exists to have a function that creates another function, e.g.
create_ax2 <- function(a) {
ax2 <- function(x) {
y <- a * x^2
return(y)
}
return(ax2)
}
The result of which is
> fun <- create_ax2(3)
> fun(1)
[1] 3
> fun(2)
[1] 12
> fun(2.5)
[1] 18.75
I have a complicated creator function like this in R. It takes a couple of arguments, sets some of the constants used in the returned function, does some intermediate computations, and so on. But the resulting function is far too slow, so I tried to translate the code to C++ to use it with Rcpp. However, I can't figure out how to construct a function inside a C++ function and return it for use in R.
This is what I have so far:
Rcpp::Function createax2Rcpp(int a) {
double ax2(double x) {
return(a * pow(x, 2));
};
return (ax2);
}
This gives me the error 'function definition is not allowed here', and I am stuck on how to create the function.
EDIT: The question RcppArmadillo pass user-defined function comes close, but as far as I can tell, it only provides a way to pass a C++ function to R. It does not provide a way to initialise some values in the C++ function before it is passed to R.

OK, as far as I understand, you want a function that returns a function with a closure, i.e. "the function defined in the closure 'remembers' the environment in which it was created."
In C++11 and later it is quite possible to define such a function, along the lines of
std::function<double(double)> createax2Rcpp(int a) {
auto ax2 = [a](double x) { return(double(a) * pow(x, 2)); };
return ax2;
}
What happens is that an anonymous class with an overloaded operator() is created, together with an object of that class; the object captures a by value and is moved out of the creator function. The return value is then stored in an instance of std::function, which erases the concrete type.
But a C/C++ function callable from R must have a certain type, which is narrower than std::function (you can capture a narrow object in a wide one, but not vice versa).
Thus, I don't know how to turn a std::function into a proper R function; it looks like it is impossible.
Perhaps emulating the closure with a static variable, as below, might help:
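// Emulated closure state: a single shared value, so every function returned
// by createax2Rcpp sees the most recently set a.
static int captured_a;
double ax2(double x) {
return(captured_a * pow(x, 2));
}
Rcpp::Function createax2Rcpp(int a) {
captured_a = a;
return (ax2);
}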

Related

Nargin function in R (number of function inputs)

Goal
I am trying to create a function in R to replicate the functionality of a homonymous MATLAB function which returns the number of arguments that were passed to a function.
Example
Consider the function below:
addme <- function(a, b) {
if (nargin() == 2) {
c <- a + b
} else if (nargin() == 1) {
c <- a + a
} else {
c <- 0
}
return(c)
}
Once the user runs addme(), I want nargin() to basically look at how many parameters were passed―2 (a and b), only 1 (a) or none―and calculate c accordingly.
What I have tried
After spending a lot of time messing around with environments, this is the closest I ever got to a working solution:
nargin <- function() {
length(as.list(match.call(envir = parent.env(environment()))))
}
The problem with this function is that it always returns 0, and the reason why is that I think it's looking at its own environment instead of its parent's (in spite of my attempt of throwing in a parent.env there).
I know I can use missing() and args() inside addme() to achieve the same functionality, but I'll be needing this quite a few other times throughout my project, so wrapping it in a function is definitely something I should try to do.
Question
How can I get nargin() to return the number of arguments that were passed to its parent function?
You could use
nargin <- function() {
if(sys.nframe()<2) stop("must be called from inside a function")
length(as.list(sys.call(-1)))-1
}
Basically, you use sys.call(-1) to go up the call stack to the calling function and get its call, then count the number of elements and subtract one for the function name itself.

using callCC with higher-order functions in R

I'm trying to figure out how to get R's callCC function for short-circuiting evaluation of a function to work with functions like lapply and Reduce.
Motivation
This would let Reduce and lapply do better than a full O(n) pass, by allowing you to exit a computation early.
For example, if I'm searching for a value in a list I could map a 'finder' function across the list, and the second it is found lapply stops running and that value is returned (much like breaking a loop, or using a return statement to break out early).
The problem is I am having trouble writing the functions that lapply and Reduce should take using a style that callCC requires.
Example
Say I'm trying to write a function to find the value '100' in a list: something equivalent to
imperativeVersion <- function (xs) {
for (val in xs) if (val == 100) return (val)
}
The function to pass to lapply would look like:
find100 <- function (val) { if (val == 100) SHORT_CIRCUIT(val) }
functionalVersion <- function (xs) lapply(xs, find100)
This (obviously) crashes, since the short circuiting function hasn't been defined yet.
callCC( function (SHORT_CIRCUIT) lapply(1:1000, find100) )
The problem is that this also crashes, because the short circuiting function wasn't around when find100 was defined. I would like for something similar to this to work.
The following works because SHORT_CIRCUIT is defined at the time that the function passed to lapply is created.
callCC(
function (SHORT_CIRCUIT) {
lapply(1:1000, function (val) {
if (val == 100) SHORT_CIRCUIT(val)
})
}
)
How can I make SHORT_CIRCUIT be defined in the function passed to lapply without defining it inline like above?
I'm aware this example can be achieved using loops, reduce or any other number of ways. I am looking for a solution to the problem of using callCC with lapply and Reduce in specific.
If I was vague or any clarification is needed please leave a comment below. I hope someone can help with this :)
Edit One:
The approach should be 'production-quality'; no deparsing functions or similar black magic.
I found a solution to this problem:
find100 <- function (val) {
if (val == 100) SHORT_CIRCUIT(val)
}
short_map <- function (fn, coll) {
callCC(function (SHORT_CIRCUIT) {
clone_env <- new.env(parent = environment(fn))
clone_env$SHORT_CIRCUIT <- SHORT_CIRCUIT
environment(fn) <- clone_env
lapply(coll, fn)
})
}
short_map(find100, c(1,2,100,3))
The trick to making higher-order functions work with callCC is to assign the short-circuiting function into the input function's environment before carrying on with the rest of the program. I made a clone of the environment to avoid unintended side effects.
You can achieve this using metaprogramming in R.
@alexis_laz's approach was in fact already metaprogramming.
However, he used strings, which are a dirty hack and error-prone, so you did well to reject it.
The correct way to implement @alexis_laz's idea is to manipulate the code itself rather than strings. In base R this is done using substitute(). There are better packages for this, e.g. rlang by Hadley Wickham, but I give you a base R solution (fewer dependencies).
lapply_ <- function(lst, FUN) {
eval.parent(
substitute(
callCC(function(return_) {
lapply(lst_, FUN_)
}),
list(lst_ = lst, FUN_=substitute(FUN))))
}
Your SHORT_CIRCUIT function is actually a more general, control flow return function (or a break function which takes an argument to return it). Thus, I call it return_.
We want to have a lapply_ function, in which we can in the FUN= part use a return_ to break out of the usual lapply().
As you showed, this is the aim:
callCC(
function (return_) {
lapply(1:1000, function (x) if (x == 100) return_(x))
}
)
The only problem is that we want to be able to generalize this expression.
We want
callCC(
function(return_) lapply(lst, FUN_)
)
where the function definition we give for FUN_ can use return_.
However, the function definition can see return_ only if we insert the function definition code into this expression.
This is exactly what @alexis_laz tried to do using a string and eval,
and what you did by manipulating the function's environment.
We can safely insert literal code using substitute(expr, replacer_list), where expr is the code to be manipulated and replacer_list is the lookup table for the replacements.
With substitute(FUN) we take the literal code given for FUN= in lapply_ without evaluating it. This expression returns literal quoted code (better than the string in @alexis_laz's approach).
The big substitute command says: "Take the expression callCC(function(return_) lapply(lst_, FUN_)) and replace lst_ in this expression by the list given for lst and FUN_ by the literal quoted expression given for FUN."
This replaced expression is then evaluated in the parent environment (eval.parent()), meaning that the resulting expression replaces the lapply_() call and is executed exactly where it was placed.
Such use of eval.parent() (or eval( ... , envir=parent.frame())) is foolproof (otherwise, tidyverse packages wouldn't be production level).
So in this way, you can generalize callCC() calls.
lapply_(1:1000, FUN=function(x) if (x==100) return_(x))
## [1] 100
I don't know if it can be of use, but:
find100 <- "function (val) { if (val == 100) SHORT_CIRCUIT(val) }"
callCC( function (SHORT_CIRCUIT) lapply(1:1000, eval(parse(text = find100))) )
#[1] 100

How do I convert a function into point free form?

Let's say I have a JavaScript function
function f(x) {
return a(b(x), c(x));
}
How would I convert that into a point-free function? By composing functions? Also, are there resources for more info on this?
In general, there's no easy rule to follow when you turn functions into point free style. Either you are going to have to guess, or you can just automate it. In the Haskell IRC channel, we have the lambdabot which is great at turning Haskell functions into point-free style. I usually just consult that, and then work my way backwards if I need to know how it works.
Your particular example can be solved using a couple of helpful functions. I'll show you below how it works, but be aware that it might require a lot of playing around to understand. It also helps if you know really, really basic lambda calculus, because the JavaScript syntax tends to get in the way sometimes.
Anyway, here goes:
Basically, to do this properly, you need three functions: fmap(f, g), ap(f, g) and curry(f). When you have those, f(x) is easily defined as (and this looks much neater in e.g. Haskell)
f = ap(fmap(curry(a), b), c);
The interesting bit lies in defining those three functions.
curry
Normally when you define functions of multiple arguments in JavaScript, you define them like
function f(x, y) {
// body
}
and you call them by doing something like f(3, 4). This is what is called an "uncurried function" in functional programming. You could also imagine defining functions like
function f(x) {
return function(y) {
//body
}
}
These functions are called "curried functions." (By the way, they are named after a mathematician whose name was Curry, if you wonder about the strange name.) Curried functions are instead called by doing
f(3)(4)
but other than that, the two functions behave very similarly. One difference is that it is easier to work with a point-free style when the functions are curried. Our curry function simply takes an uncurried function like the first one and turns it into a curried function like the second one. curry can be defined as
function curry(f) {
return function(a) {
return function(b) {
return f(a, b);
}
}
}
Now, you can use this. Instead of doing pow(3, 4) to get 81, you can do
cpow = curry(pow);
cpow(3)(4);
cpow is the curried version of pow. It doesn't take both arguments at the same time -- it takes them separately. In your specific case, this allows us to go from
function f(x) {
return a(b(x), c(x));
}
to
function f(x) {
return curry(a)(b(x))(c(x));
}
This is progress! (Although I admit it looks very weird in JavaScript...) Now, on to less spicy pastures.
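As a quick check of the currying step, here is a minimal sketch. It assumes Math.pow stands in for the bare pow used in the text (in JavaScript the built-in lives on Math), and powersOfTwo is a name made up for illustration.

```javascript
// Two-argument curry, as defined above.
function curry(f) {
  return function (a) {
    return function (b) {
      return f(a, b);
    };
  };
}

// Math.pow stands in for the bare `pow` used in the text (assumption).
const cpow = curry(Math.pow);

console.log(Math.pow(3, 4)); // uncurried call: 81
console.log(cpow(3)(4));     // curried call: 81

// Supplying only the first argument gives a reusable one-argument function.
const powersOfTwo = cpow(2); // hypothetical name, for illustration only
console.log([0, 1, 2, 3].map(powersOfTwo)); // [1, 2, 4, 8]
```

The last line is why currying helps with point-free style: a partially applied function can be handed straight to something like map without writing a wrapper.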
fmap
The second piece of the puzzle is fmap(f, g), which takes two functions as arguments and composes them. What I'm saying is,
fmap(f, g)(x) == f(g(x))
This is easy to define, we just let
function fmap(f, g) {
return function(x) {
return f(g(x));
}
}
This is useful when you want to do two things in sequence. Say you want to do the useless operation log(exp(x)). You could do this the traditional way:
function logexp(x) {
return log(exp(x));
}
You could instead just do
logexp = fmap(log, exp);
This is commonly called composing two functions. To connect this to your example, last we left it off, we had refactored it into
function f(x) {
return curry(a)(b(x))(c(x));
}
We now notice some visual similarity between this and the function body of fmap. Let's rewrite this with fmap and it becomes
function f(x) {
return fmap(curry(a), b)(x)(c(x));
}
(to see how I got there, imagine that f = curry(a) and g = b. The last bit with c(x) isn't changed.)
ap
Our last puzzle piece is ap(f, g), which takes two functions and an argument, and does a weird thing. I won't even try to explain it, so I'll just show you what it does:
ap(f, g)(x) == f(x)(g(x))
Remember that f is really just a function of two arguments, only we write it a little differently to be able to do magic. ap is defined in JavaScript as
function ap(f, g) {
return function(x) {
return f(x)(g(x));
}
}
So, to put this in a more practical context: Say you want to raise a number to the square root of itself. You could do
function powsqrt(x) {
return pow(x, sqrt(x));
}
or, with your newfound knowledge of ap and remembering cpow from the first part about currying, you could also do
powsqrt = ap(cpow, sqrt);
This works because cpow is the curried version of pow. You can verify for yourself that this becomes the right thing when the definition of ap is expanded.
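To verify the powsqrt claim without expanding ap by hand, the sketch below runs both versions side by side. It assumes Math.pow and Math.sqrt stand in for pow and sqrt, and powsqrtPointful is a hypothetical name for the traditional version.

```javascript
const curry = (f) => (a) => (b) => f(a, b);
const ap = (f, g) => (x) => f(x)(g(x));

// Math.pow and Math.sqrt stand in for the bare pow and sqrt in the text.
const cpow = curry(Math.pow);

// Traditional version, mentioning its argument explicitly.
function powsqrtPointful(x) {
  return Math.pow(x, Math.sqrt(x));
}

// Point-free version built from ap.
const powsqrt = ap(cpow, Math.sqrt);

console.log(powsqrtPointful(4), powsqrt(4)); // 16 16   (4 to the power 2)
console.log(powsqrtPointful(9), powsqrt(9)); // 729 729 (9 to the power 3)
```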
Now, to tie all this together with your example, we need to turn
function f(x) {
return fmap(curry(a), b)(x)(c(x));
}
into the final, completely point-free version. If we look at the definition of ap, we see we can do something here to turn this into the point-free version!
function f(x) {
return ap(fmap(curry(a), b), c)(x);
}
Basically, the easiest way to understand this is to now "unfold" the call to ap. Replace the call to ap with the function body! What we get then, by merely substituting, is
function f(x) {
return function(y) {
return fmap(curry(a), b)(y)(c(y));
}(x);
}
I've renamed one x to y to avoid name collisions. This is still a bit weird, but we can make it a little shorter. After all, it is the same thing as
function f(x) {
return fmap(curry(a), b)(x)(c(x));
}
which was what we started with! Our call to ap was correct. If you want to, you can further unfold this to see that after everything is said and done, we actually end up with the very thing we started with. I leave that as an exercise.
Wrapping Up
Anyway, the last refactoring of your code made it into
function f(x) {
return ap(fmap(curry(a), b), c)(x);
}
which of course is the same thing as
f = ap(fmap(curry(a), b), c);
And that's it!
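Putting the three helpers together, the following sketch compares the original f with the point-free one. The bodies of a, b and c are hypothetical stand-ins chosen only so the results are easy to read; the question does not say what they do.

```javascript
const curry = (fn) => (x) => (y) => fn(x, y);
const fmap = (f, g) => (x) => f(g(x));
const ap = (f, g) => (x) => f(x)(g(x));

// Hypothetical stand-ins for the question's a, b and c.
const a = (left, right) => `${left}|${right}`;
const b = (x) => x * 2;
const c = (x) => x + 1;

// The original definition, which names its argument.
const pointful = (x) => a(b(x), c(x));

// The point-free definition derived in the answer.
const pointFree = ap(fmap(curry(a), b), c);

for (const x of [0, 1, 5]) {
  // Both columns should agree for every input.
  console.log(pointful(x), pointFree(x));
}
```

Using a string-building a makes argument order visible: if b and c were swapped somewhere in the derivation, the two columns would differ.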

Forcing specific data types as arguments to a function

I was just wondering if there was a way to force a function to only accept certain data types, without having to check for it within the function; or, is this not possible because R's type-checking is done at runtime (as opposed to those programming languages, such as Java, where type-checking is done during compilation)?
For example, in Java, you have to specify a data type:
class t2 {
public int addone (int n) {
return n+1;
}
}
In R, a similar function might be
addone <- function(n)
{
return(n+1)
}
but if a vector is supplied, a vector will (obviously) be returned. If you only want a single integer to be accepted, is the only way to do this to have a condition within the function, along the lines of
addone <- function(n)
{
if(is.vector(n) && length(n)==1)
{
return(n+1)
} else
{
return ("You must enter a single integer")
}
}
Thanks,
Chris
This is entirely possible using S3 classes. Your example is somewhat contrived in the context of R, since I can't think of a practical reason why one would want to create a class of a single value. Nonetheless, this is possible. As an added bonus, I demonstrate how the function addone can be used to add the value of one to numeric vectors (trivial) and character vectors (so A turns to B, etc.):
Start by creating an S3 generic for addone, utilising the S3 dispatch mechanism UseMethod:
addone <- function(x){
UseMethod("addone", x)
}
Next, create the contrived class single, defined as the first element of whatever is passed to it:
as.single <- function(x){
ret <- unlist(x)[1]
class(ret) <- "single"
ret
}
Now create methods to handle the various classes. The default method will be called unless a specific class is defined:
addone.default <- function(x) x + 1
addone.character <- function(x)rawToChar(as.raw(as.numeric(charToRaw(x))+1))
addone.single <- function(x)x + 1
Finally, test it with some sample data:
addone(1:5)
[1] 2 3 4 5 6
addone(as.single(1:5))
[1] 2
attr(,"class")
[1] "single"
addone("abc")
[1] "bcd"
Some additional information:
Hadley's devtools wiki is a valuable source of information on all things, including the S3 object system.
The S3 system doesn't provide strict typing. It can quite easily be abused. For stricter object orientation, have a look at S4 classes, reference classes, or the proto package for prototype-based programming.
You could write a wrapper like the following:
check.types = function(classes, func) {
n = as.name
params = formals(func)
param.names = lapply(names(params), n)
handler = function() { }
formals(handler) = params
checks = lapply(seq_along(param.names), function(I) {
as.call(list(n('assert.class'), param.names[[I]], classes[[I]]))
})
body(handler) = as.call(c(
list(n('{')),
checks,
list(as.call(list(n('<-'), n('.func'), func))),
list(as.call(c(list(n('.func')), lapply(param.names, as.name))))
))
handler
}
assert.class = function(x, cls) {
stopifnot(cls %in% class(x))
}
And use it like
f = check.types(c('numeric', 'numeric'), function(x, y) {
x + y
})
> f(1, 2)
[1] 3
> f("1", "2")
Error: cls %in% class(x) is not TRUE
Made somewhat inconvenient by R not having decorators. This is kind of hacky and it suffers from some serious problems:
1. You lose lazy evaluation, because you must evaluate an argument to determine its type.
2. You still can't check the types until call time; real static type checking lets you check the types even of a call that never actually happens.
Since R uses lazy evaluation, (2) might make type checking not very useful, because the call might not actually occur until very late, or never.
The answer to (2) would be to add static type information. You could probably do this by transforming expressions, but I don't think you want to go there.
I've found stopifnot() to be highly useful for these situations as well.
x <- function(n) {
stopifnot(is.vector(n) && length(n)==1)
print(n)
}
The reason it is so useful is because it provides a pretty clear error message to the user if the condition is false.

What is a 'Closure'?

I asked a question about Currying and closures were mentioned.
What is a closure? How does it relate to currying?
Variable scope
When you declare a local variable, that variable has a scope. Generally, local variables exist only within the block or function in which you declare them.
(function () {
var a = 1;
console.log(a); // works
})();
console.log(a); // fails
If I try to access a local variable, most languages will look for it in the current scope, then up through the parent scopes until they reach the root scope.
var a = 1;
(function () {
console.log(a); // works
})();
console.log(a); // works
When a block or function is done with, its local variables are no longer needed and are usually blown out of memory.
This is how we normally expect things to work.
A closure is a persistent local variable scope
A closure is a persistent scope which holds on to local variables even after the code execution has moved out of that block. Languages which support closures (such as JavaScript, Swift, and Ruby) allow you to keep a reference to a scope (including its parent scopes), even after the block in which those variables were declared has finished executing, provided you keep a reference to that block or function somewhere.
The scope object and all its local variables are tied to the function and will persist as long as that function persists.
This gives us function portability. We can expect any variables that were in scope when the function was first defined to still be in scope when we later call the function, even if we call the function in a completely different context.
For example
Here's a really simple example in JavaScript that illustrates the point:
outer = function() {
var a = 1;
var inner = function() {
console.log(a);
}
return inner; // this returns a function
}
var fnc = outer(); // execute outer to get inner
fnc();
Here I have defined a function within a function. The inner function gains access to all the outer function's local variables, including a. The variable a is in scope for the inner function.
Normally when a function exits, all its local variables are blown away. However, if we return the inner function and assign it to a variable fnc so that it persists after outer has exited, all of the variables that were in scope when inner was defined also persist. The variable a has been closed over -- it is within a closure.
Note that the variable a is totally private to fnc. This is a way of creating private variables in a functional programming language such as JavaScript.
As you might be able to guess, when I call fnc() it prints the value of a, which is "1".
In a language without closure, the variable a would have been garbage collected and thrown away when the function outer exited. Calling fnc would have thrown an error because a no longer exists.
In JavaScript, the variable a persists because the variable scope is created when the function is first declared and persists for as long as the function continues to exist.
a belongs to the scope of outer. The scope of inner has a parent pointer to the scope of outer. fnc is a variable which points to inner. a persists as long as fnc persists. a is within the closure.
Further reading (watching)
I made a YouTube video looking at this code with some practical examples of usage.
I'll give an example (in JavaScript):
function makeCounter () {
var count = 0;
return function () {
count += 1;
return count;
}
}
var x = makeCounter();
x(); // returns 1
x(); // returns 2
// ...etc...
What this function, makeCounter, does is it returns a function, which we've called x, that will count up by one each time it's called. Since we're not providing any parameters to x, it must somehow remember the count. It knows where to find it based on what's called lexical scoping - it must look to the spot where it's defined to find the value. This "hidden" value is what is called a closure.
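One consequence worth seeing in code: each call to makeCounter creates a fresh count, so two counters do not interfere. This is a small sketch built on the function above; the second counter y is added purely for illustration.

```javascript
function makeCounter() {
  let count = 0;
  return function () {
    count += 1;
    return count;
  };
}

const x = makeCounter();
const y = makeCounter(); // second, independent counter (illustration only)

console.log(x()); // 1
console.log(x()); // 2
console.log(y()); // 1, because y closes over its own count
console.log(x()); // 3, unaffected by the call to y

// count itself is not reachable from outside the returned functions.
console.log(typeof count); // "undefined"
```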
Here is my currying example again:
function add (a) {
return function (b) {
return a + b;
}
}
var add3 = add(3);
add3(4); // returns 7
What you can see is that when you call add with the parameter a (which is 3), that value is contained in the closure of the returned function that we're defining to be add3. That way, when we call add3, it knows where to find the a value to perform the addition.
First of all, contrary to what most of the people here tell you, closure is not a function! So what is it?
It is a set of symbols defined in a function's "surrounding context" (known as its environment) which make it a CLOSED expression (that is, an expression in which every symbol is defined and has a value, so it can be evaluated).
For example, when you have a JavaScript function:
function closed(x) {
return x + 3;
}
it is a closed expression because all the symbols occurring in it are defined in it (their meanings are clear), so you can evaluate it. In other words, it is self-contained.
But if you have a function like this:
function open(x) {
return x*y + 3;
}
it is an open expression because there are symbols in it which have not been defined in it, namely y. When looking at this function, we can't tell what y is or what it means; we don't know its value, so we cannot evaluate this expression. I.e. we cannot call this function until we tell what y is supposed to mean in it. This y is called a free variable.
This y begs for a definition, but this definition is not part of the function – it is defined somewhere else, in its "surrounding context" (also known as the environment). At least that's what we hope for :P
For example, it could be defined globally:
var y = 7;
function open(x) {
return x*y + 3;
}
Or it could be defined in a function which wraps it:
var global = 2;
function wrapper(y) {
var w = "unused";
return function(x) {
return x*y + 3;
}
}
The part of the environment which gives the free variables in an expression their meanings, is the closure. It is called this way, because it turns an open expression into a closed one, by supplying these missing definitions for all of its free variables, so that we could evaluate it.
In the example above, the inner function (which we didn't give a name because we didn't need it) is an open expression because the variable y in it is free – its definition is outside the function, in the function which wraps it. The environment for that anonymous function is the set of variables:
{
global: 2,
w: "unused",
y: [whatever has been passed to that wrapper function as its parameter `y`]
}
Now, the closure is that part of this environment which closes the inner function by supplying the definitions for all its free variables. In our case, the only free variable in the inner function was y, so the closure of that function is this subset of its environment:
{
y: [whatever has been passed to that wrapper function as its parameter `y`]
}
The other two symbols defined in the environment are not part of the closure of that function, because it doesn't require them to run. They are not needed to close it.
More on the theory behind that here:
https://stackoverflow.com/a/36878651/434562
It's worth noting that in the example above, the wrapper function returns its inner function as a value. The moment we call this function can be remote in time from the moment the function was defined (or created). In particular, its wrapping function is no longer running, and its parameters, which were on the call stack, are no longer there :P This is a problem, because the inner function needs y to be there when it is called! In other words, it requires the variables from its closure to somehow outlive the wrapper function and be there when needed. Therefore, the inner function has to make a snapshot of the variables which make up its closure and store them somewhere safe for later use (somewhere outside the call stack).
And this is why people often confuse the term closure with that special type of function which can take such snapshots of the external variables it uses, or with the data structure used to store these variables for later. But I hope you understand now that they are not the closure itself – they're just ways to implement closures in a programming language, or language mechanisms which allow the variables from the function's closure to be there when needed. There are a lot of misconceptions around closures which (unnecessarily) make this subject much more confusing and complicated than it actually is.
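To make the distinction concrete, here is a rough sketch of one possible implementation mechanism, written out by hand: the open function body takes an explicit record of bindings, and "closing" it means pairing the code with exactly the bindings it needs. The names openBody, close and call are hypothetical; this is not how any particular JavaScript engine does it.

```javascript
// The open expression x*y + 3, with its free variable y looked up in an
// explicit record instead of the surrounding scope.
function openBody(env, x) {
  return x * env.y + 3;
}

// "Closing" the code: pair it with the bindings for its free variables.
function close(code, env) {
  return { code, env };
}

// Calling a hand-made closure: pass the stored bindings back to the code.
function call(closure, ...args) {
  return closure.code(closure.env, ...args);
}

function wrapper(y) {
  const w = "unused"; // in the environment, but not needed to close openBody
  return close(openBody, { y });
}

const f = wrapper(7);
console.log(call(f, 2));         // 2 * 7 + 3 = 17
console.log(Object.keys(f.env)); // ["y"], the closure is only this subset
```

The stored record contains y but not w or any global, which matches the answer's point that the closure is the part of the environment the function needs, not the whole environment.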
Kyle's answer is pretty good. I think the only additional clarification is that the closure is basically a snapshot of the stack at the point that the lambda function is created. Then when the function is re-executed the stack is restored to that state before executing the function. Thus as Kyle mentions, that hidden value (count) is available when the lambda function executes.
A closure is a function that can reference state in another function. For example, in Python, this uses the closure "inner":
def outer(a):
    b = "variable in outer()"
    def inner(c):
        print(a, b, c)
    return inner
# Now the return value from outer() can be saved for later
func = outer("test")
func(1)  # prints "test variable in outer() 1"
To help facilitate understanding of closures it might be useful to examine how they might be implemented in a procedural language. This explanation will follow a simplistic implementation of closures in Scheme.
To start, I must introduce the concept of a namespace. When you enter a command into a Scheme interpreter, it must evaluate the various symbols in the expression and obtain their value. Example:
(define x 3)
(define y 4)
(+ x y) ; returns 7
The define expressions store the value 3 in the spot for x and the value 4 in the spot for y. Then when we call (+ x y), the interpreter looks up the values in the namespace and is able to perform the operation and return 7.
However, in Scheme there are expressions that allow you to temporarily override the value of a symbol. Here's an example:
(define x 3)
(define y 4)
(let ((x 5))
(+ x y)) ; returns 9
x ; returns 3
What the let keyword does is introduce a new namespace in which x has the value 5. You will notice that it's still able to see that y is 4, making the sum returned 9. You can also see that once the expression has ended, x is back to being 3. In this sense, x has been temporarily masked by the local value.
Procedural and object-oriented languages have a similar concept. Whenever you declare a variable in a function that has the same name as a global variable you get the same effect.
How would we implement this? A simple way is with a linked list - the head contains the new value and the tail contains the old namespace. When you need to look up a symbol, you start at the head and work your way down the tail.
Now let's skip to the implementation of first-class functions for the moment. More or less, a function is a set of instructions to execute when the function is called culminating in the return value. When we read in a function, we can store these instructions behind the scenes and run them when the function is called.
(define x 3)
(define (plus-x y)
(+ x y))
(let ((x 5))
(plus-x 4)) ; returns ?
We define x to be 3 and plus-x to be its parameter, y, plus the value of x. Finally we call plus-x in an environment where x has been masked by a new x, this one valued 5. If we merely store the operation, (+ x y), for the function plus-x, since we're in the context of x being 5 the result returned would be 9. This is what's called dynamic scoping.
However, Scheme, Common Lisp, and many other languages have what's called lexical scoping - in addition to storing the operation (+ x y) we also store the namespace at that particular point. That way, when we're looking up the values we can see that x, in this context, is really 3. This is a closure.
(define x 3)
(define (plus-x y)
(+ x y))
(let ((x 5))
(plus-x 4)) ; returns 7
In summary, we can use a linked list to store the state of the namespace at the time of function definition, allowing us to access variables from enclosing scopes, as well as providing us the ability to locally mask a variable without affecting the rest of the program.
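The linked-list idea can be sketched in a few lines of JavaScript. This is a toy model under stated assumptions, not a Scheme interpreter: extend, lookup, callLexical and callDynamic are hypothetical helper names, and plus-x is represented as a plain object holding its body and its definition-time namespace.

```javascript
// A namespace is a linked list of frames: the head holds the newest binding,
// and parent points at the older namespace.
const extend = (name, value, parent) => ({ name, value, parent });

function lookup(env, name) {
  for (let frame = env; frame !== null; frame = frame.parent) {
    if (frame.name === name) return frame.value;
  }
  throw new ReferenceError(`${name} is not defined`);
}

// (define x 3)
const globalEnv = extend("x", 3, null);

// (define (plus-x y) (+ x y)): store the operation together with the
// namespace that was current when the function was defined.
const plusX = {
  definedIn: globalEnv,
  body: (env) => lookup(env, "x") + lookup(env, "y"),
};

// Lexical scoping: bind the parameter on top of the definition-time namespace.
const callLexical = (fn, arg) => fn.body(extend("y", arg, fn.definedIn));

// Dynamic scoping: bind the parameter on top of the caller's namespace.
const callDynamic = (fn, arg, callerEnv) => fn.body(extend("y", arg, callerEnv));

// (let ((x 5)) (plus-x 4))
const letEnv = extend("x", 5, globalEnv);

console.log(callLexical(plusX, 4));         // 7
console.log(callDynamic(plusX, 4, letEnv)); // 9
console.log(lookup(globalEnv, "x"));        // 3, the outer x was only masked
```

The only difference between the two call helpers is which list the parameter frame is pushed onto, which is the whole difference between the 9 and the 7 in the Scheme examples.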
Functions containing no free variables are called pure functions.
Functions containing one or more free variables are called closures.
var pure = function pure(x){
return x
// only own environment is used
}
var foo = "bar"
var closure = function closure(){
return foo
// foo is a free variable from the outer environment
}
src: https://leanpub.com/javascriptallongesix/read#leanpub-auto-if-functions-without-free-variables-are-pure-are-closures-impure
Here's a real world example of why Closures kick ass... This is straight out of my Javascript code. Let me illustrate.
Function.prototype.delay = function(ms /*[, arg...]*/) {
var fn = this,
args = Array.prototype.slice.call(arguments, 1);
return window.setTimeout(function() {
return fn.apply(fn, args);
}, ms);
};
And here's how you would use it:
var startPlayback = function(track) {
Player.play(track);
};
startPlayback(someTrack);
Now imagine you want the playback to start delayed, for example 5 seconds after this code snippet runs. Well, that's easy with delay and its closure:
startPlayback.delay(5000, someTrack);
// Keep going, do other things
When you call delay with 5000ms, the first snippet runs and stores the passed-in arguments in its closure. Then 5 seconds later, when the setTimeout callback happens, the closure still maintains those variables, so it can call the original function with the original parameters.
This is a type of currying (more precisely, partial application), or function decoration.
Without closures, you would have to somehow maintain the state of those variables outside the function, thus littering code outside the function with something that logically belongs inside it. Using closures can greatly improve the quality and readability of your code.
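If you would rather not extend Function.prototype, the same idea can be sketched as a plain function. This is only an illustration: the schedule parameter is my own assumption, added so the timer can be swapped out; it defaults to setTimeout.

```javascript
// Sketch: same idea as Function.prototype.delay, but as a plain function.
// `schedule` is a hypothetical extra parameter; it defaults to setTimeout.
function delay(fn, ms, args, schedule) {
  const run = schedule || setTimeout;
  // The callback closes over fn and args, so nothing is stored globally.
  return run(function () {
    return fn.apply(null, args);
  }, ms);
}

// A fake scheduler that records the callback instead of waiting.
const pending = [];
const fakeSchedule = function (callback, ms) {
  pending.push({ callback: callback, ms: ms });
  return pending.length; // stand-in for a timer id
};

const played = [];
const startPlayback = function (track) {
  played.push(track);
};

delay(startPlayback, 5000, ["someTrack"], fakeSchedule);

// Nothing has run yet; the arguments live only inside the closure.
const playedBefore = played.length;
pending[0].callback(); // simulate the timer firing
```

After the simulated timer fires, played holds "someTrack", even though nothing outside the closure ever stored that argument.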
tl;dr
A closure is a function and its scope assigned to (or used as) a variable. Thus the name closure: the scope and the function are enclosed and used just like any other entity.
In depth Wikipedia style explanation
According to Wikipedia, a closure is:
Techniques for implementing lexically scoped name binding in languages with first-class functions.
What does that mean? Let's look into some definitions.
I will explain closures and other related definitions by using this example:
function startAt(x) {
return function (y) {
return x + y;
}
}
var closure1 = startAt(1);
var closure2 = startAt(5);
console.log(closure1(3)); // 4 (x == 1, y == 3)
console.log(closure2(3)); // 8 (x == 5, y == 3)
First-class functions
Basically that means we can use functions just like any other entity. We can modify them, pass them as arguments, return them from functions or assign them to variables. Technically speaking, they are first-class citizens, hence the name: first-class functions.
In the example above, startAt returns an (anonymous) function, which gets assigned to closure1 and closure2. So, as you can see, JavaScript treats functions just like any other entity (first-class citizens).
Name binding
Name binding is about finding out what data a variable (identifier) references. The scope is really important here, as that is the thing that will determine how a binding is resolved.
In the example above:
In the inner anonymous function's scope, y is bound to 3.
In startAt's scope, x is bound to 1 or 5 (depending on the closure).
Inside the anonymous function's scope, x is not bound to any value, so it needs to be resolved in an upper (startAt's) scope.
Lexical scoping
As Wikipedia says, the scope:
Is the region of a computer program where the binding is valid: where the name can be used to refer to the entity.
There are two techniques:
Lexical (static) scoping: A variable's definition is resolved by searching its containing block or function, then if that fails searching the outer containing block, and so on.
Dynamic scoping: Calling function is searched, then the function which called that calling function, and so on, progressing up the call stack.
For more explanation, check out this question and take a look at Wikipedia.
In the example above, we can see that JavaScript is lexically scoped, because when x is resolved, the binding is searched in the upper (startAt's) scope, based on the source code (the anonymous function that looks for x is defined inside startAt) and not based on the call stack, the way (the scope where) the function was called.
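A small sketch makes the difference visible. The names showX and caller are made up for this illustration; the point is that the x inside showX is resolved from where showX is written, not from who calls it.

```javascript
var x = "global";

function showX() {
  // x is resolved where showX is written (top level), not where it is called.
  return x;
}

function caller() {
  var x = "local to caller"; // this is what dynamic scoping would find
  return showX();
}

const resolved = caller(); // "global"
```

Under dynamic scoping the call stack would be searched and caller's x would win; JavaScript returns "global" instead.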
Wrapping (closuring) up
In our example, when we call startAt, it will return a (first-class) function that will be assigned to closure1 and closure2 thus a closure is created, because the passed variables 1 and 5 will be saved within startAt's scope, that will be enclosed with the returned anonymous function. When we call this anonymous function via closure1 and closure2 with the same argument (3), the value of y will be found immediately (as that is the parameter of that function), but x is not bound in the scope of the anonymous function, so the resolution continues in the (lexically) upper function scope (that was saved in the closure) where x is found to be bound to either 1 or 5. Now we know everything for the summation so the result can be returned, then printed.
Now you should understand closures and how they behave, which is a fundamental part of JavaScript.
Currying
Oh, and you also learned what currying is about: you use functions (closures) to pass each argument of an operation one at a time, instead of using one function with multiple parameters.
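As a rough sketch of that idea with three arguments (volume and curriedVolume are names I made up for the example):

```javascript
// One function with several parameters...
function volume(length, width, height) {
  return length * width * height;
}

// ...and a curried form that takes one argument per call.
// Each inner function closes over the arguments supplied so far.
function curriedVolume(length) {
  return function (width) {
    return function (height) {
      return length * width * height;
    };
  };
}

const withLength2 = curriedVolume(2);     // remembers length = 2
const withLength2Width3 = withLength2(3); // remembers width = 3 as well
const curriedResult = withLength2Width3(4);
const plainResult = volume(2, 3, 4);
```

Both curriedResult and plainResult are 24; the curried version just collects its arguments one closure at a time.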
A closure is a feature in JavaScript where a function has access to its own scope variables, to the outer function's variables and to the global variables.
A closure has access to its outer function's scope even after the outer function has returned. This means a closure can remember and access variables and arguments of its outer function even after that function has finished.
The inner function can access the variables defined in its own scope, the outer function's scope, and the global scope. The outer function can access the variables defined in its own scope and the global scope.
Example of Closure:
var globalValue = 5;
function functOuter() {
var outerFunctionValue = 10;
//Inner function has access to the outer function value
//and the global variables
function functInner() {
var innerFunctionValue = 5;
alert(globalValue + outerFunctionValue + innerFunctionValue);
}
functInner();
}
functOuter();
The output will be 20, which is the sum of the inner function's own variable, the outer function's variable and the global variable.
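The example above calls the inner function while the outer one is still running. To illustrate the "even after the outer function has returned" part, here is a minimal sketch (makeGreeter is a hypothetical name used only here):

```javascript
function makeGreeter(greeting) {
  var punctuation = "!";
  // The returned function keeps access to greeting and punctuation.
  return function (name) {
    return greeting + ", " + name + punctuation;
  };
}

var hello = makeGreeter("Hello");
// makeGreeter has already returned here, yet its variables are still reachable
// through the closure.
var message = hello("world");
```

Here message is "Hello, world!", built from an argument and a local variable of a function call that has already finished.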
Normally, variables are bound by a scoping rule: local variables work only within the function that defines them. A closure is a way of breaking this rule temporarily for convenience.
def n_times(a_thing)
return lambda{|n| a_thing * n}
end
In the above code, lambda{|n| a_thing * n} is the closure, because a_thing is referenced by the lambda (an anonymous function creator).
Now, if you put the resulting anonymous function in a variable:
foo = n_times(4)
foo will break the normal scoping rule and start using 4 internally.
foo.call(3)
returns 12.
In short, a function pointer is just a pointer to a location in the program's code (like a program counter), whereas closure = function pointer + stack frame.
Closures provide JavaScript with state.
State in programming simply means remembering things.
Example
var a = 0;
a = a + 1; // => 1
a = a + 1; // => 2
a = a + 1; // => 3
In the case above, state is stored in the variable "a". We follow by adding 1 to "a" several times. We can only do that because we are able to "remember" the value. The state holder, "a", holds that value in memory.
Often, in programming languages, you want to keep track of things, remember information and access it at a later time.
This, in other languages, is commonly accomplished through the use of classes. A class, just like a variable, keeps track of its state. And instances of that class, in turn, also have state within them. State simply means information that you can store and retrieve later.
Example
class Bread {
constructor (weight) {
this.weight = weight;
}
render () {
return `My weight is ${this.weight}!`;
}
}
How can we access "weight" from within the "render" method? Well, thanks to state. Each instance of the class Bread can render its own weight by reading it from the "state", a place in memory where we could store that information.
Now, JavaScript is a rather unusual language that historically did not have classes (it now does, but under the hood there are only functions and variables), so closures provide a way for JavaScript to remember things and access them later.
Example
var n = 0;
var count = function () {
n = n + 1;
return n;
};
count(); // # 1
count(); // # 2
count(); // # 3
The example above achieved the goal of "keeping state" with a variable. This is great! However, this has the disadvantage that the variable (the "state" holder) is now exposed. We can do better. We can use Closures.
Example
var countGenerator = function () {
var n = 0;
var count = function () {
n = n + 1;
return n;
};
return count;
};
var count = countGenerator();
count(); // # 1
count(); // # 2
count(); // # 3
This is fantastic.
Now our "count" function can count. It is only able to do so because it can "hold" state. The state in this case is the variable "n". This variable is now closed. Closed in time and space. In time because you won't ever be able to recover it, change it, assign it a value or interact directly with it. In space because it's geographically nested within the "countGenerator" function.
Why is this fantastic? Because without involving any other sophisticated and complicated tool (e.g. classes, methods, instances, etc) we are able to
1. conceal
2. control from a distance
We conceal the state, the variable "n", which makes it a private variable!
We have also created an API that can control this variable in a pre-defined way. In particular, we can call the API like so, "count()", and that adds 1 to "n" from a "distance". In no way, shape or form will anyone ever be able to access "n" except through the API.
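A quick sketch to check both claims, reusing countGenerator from above (the names first, second and leaked are just for this illustration): every call to countGenerator gets its own private "n", and "n" is not visible from the outside.

```javascript
var countGenerator = function () {
  var n = 0;
  var count = function () {
    n = n + 1;
    return n;
  };
  return count;
};

var first = countGenerator();
var second = countGenerator();

first();
first();
var firstValue = first();   // third call on the first counter
var secondValue = second(); // first call on the second counter

// "n" is concealed: no variable with that name exists out here.
var leaked = typeof n;
```

firstValue is 3 while secondValue is 1, so the two counters do not share state, and leaked is "undefined".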
JavaScript is truly amazing in its simplicity.
Closures are a big part of why this is.
Here is another real-life example, this time using a scripting language popular in games: Lua. I needed to slightly change the way a library function worked to avoid a problem with stdin not being available.
local old_dofile = dofile
function dofile( filename )
if filename == nil then
error( 'Can not use default of stdin.' )
end
old_dofile( filename )
end
The value of old_dofile disappears when this block of code finishes its scope (because it's local); however, the value has been enclosed in a closure, so the newly redefined dofile function CAN access it, or rather a copy stored along with the function as an 'upvalue'.
From Lua.org:
When a function is written enclosed in another function, it has full access to local variables from the enclosing function; this feature is called lexical scoping. Although that may sound obvious, it is not. Lexical scoping, plus first-class functions, is a powerful concept in a programming language, but few languages support that concept.
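For readers more at home in JavaScript, roughly the same wrapping trick can be sketched as follows. loadFile is a hypothetical stand-in for the library function; it is not a real API.

```javascript
// Hypothetical stand-in for a library function with an unwanted default.
let loadFile = function (filename) {
  return "loaded " + (filename === undefined ? "stdin" : filename);
};

(function () {
  // oldLoadFile is local to this block and vanishes when the block ends...
  const oldLoadFile = loadFile;
  // ...but the replacement function closes over it and can still call it.
  loadFile = function (filename) {
    if (filename === undefined) {
      throw new Error("Can not use default of stdin.");
    }
    return oldLoadFile(filename);
  };
})();

const loaded = loadFile("config.lua");
```

Calling loadFile("config.lua") still reaches the original implementation, while calling loadFile() with no argument now throws.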
If you are from the Java world, you can compare a closure with a member function of a class. Look at this example
var f=function(){
var a=7;
var g=function(){
return a;
}
return g;
}
The function g is a closure: g closes a in. So g can be compared with a member function, a can be compared with a class field, and the function f with a class.
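To spell out the comparison, here is a sketch that puts a class next to the closure; the class F is my own addition for illustration.

```javascript
// Class version: a is a field, g is a member function.
class F {
  constructor() {
    this.a = 7;
  }
  g() {
    return this.a;
  }
}

// Closure version from above: a is a local variable closed in by g.
var f = function () {
  var a = 7;
  var g = function () {
    return a;
  };
  return g;
};

var fromClass = new F().g();
var fromClosure = f()();

// One difference: the field is reachable from outside, the closed-in variable is not.
var fieldVisible = new F().a;
var closedInVisible = f().a;
```

Both fromClass and fromClosure are 7. Note that fieldVisible is also 7, whereas closedInVisible is undefined, so the closure behaves more like a class with a private field.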
Closures
Whenever we have a function defined inside another function, the inner function has access to the variables declared in the outer function. Closures are best explained with examples.
In Listing 2-18, you can see that the inner function has access to a variable (variableInOuterFunction) from the outer scope. The variables in the outer function have been closed by (or bound in) the inner function. Hence the term closure. The concept in itself is simple enough and fairly intuitive.
Listing 2-18:
function outerFunction(arg) {
var variableInOuterFunction = arg;
function bar() {
console.log(variableInOuterFunction); // Access a variable from the outer scope
}
// Call the local function to demonstrate that it has access to arg
bar();
}
outerFunction('hello closure!'); // logs hello closure!
source: http://index-of.es/Varios/Basarat%20Ali%20Syed%20(auth.)-Beginning%20Node.js-Apress%20(2014).pdf
Please have a look at the code below to understand closures in more depth:
for(var i=0; i< 5; i++){
setTimeout(function(){
console.log(i);
}, 1000);
}
What will the output be here? Not 0,1,2,3,4 but 5,5,5,5,5, because every callback closes over the same variable i.
So how do we solve it? The answer is below:
for(var i=0; i< 5; i++){
(function(j){ //using IIFE
setTimeout(function(){
console.log(j);
},1000);
})(i);
}
Let me explain simply. When a function is created, nothing happens until it is called. In the first snippet the loop body runs 5 times, but the callbacks are not called immediately; they are called after 1 second. Because setTimeout is asynchronous, the for loop finishes first and leaves the value 5 in var i; only then do the five setTimeout callbacks run, and each prints 5, giving 5,5,5,5,5.
Here is how it is solved using an IIFE, i.e. an Immediately Invoked Function Expression:
(function(j){ //i is passed here
setTimeout(function(){
console.log(j);
},1000);
})(i); // called immediately, so j stores i=0 on the 1st iteration, i=1 on the 2nd, and so on; prints 0,1,2,3,4
For more, study execution contexts to understand closures.
There is one more way to solve this, using let (an ES6 feature); under the hood it works like the function above, giving each iteration its own binding:
for(let i=0; i< 5; i++){
setTimeout(function(){
console.log(i);
},1000);
}
Output: 0,1,2,3,4
=> More explanation:
In memory, as the for loop executes, the picture looks like this:
Loop 1)
setTimeout(function(){
console.log(i);
},1000);
Loop 2)
setTimeout(function(){
console.log(i);
},1000);
Loop 3)
setTimeout(function(){
console.log(i);
},1000);
Loop 4)
setTimeout(function(){
console.log(i);
},1000);
Loop 5)
setTimeout(function(){
console.log(i);
},1000);
Here i is not evaluated when the callbacks are created. After the loop completes, var i holds the value 5 in memory, and it stays visible to the child functions, so when the five functions passed to setTimeout execute, they print 5,5,5,5,5.
To resolve this, use an IIFE as explained above.
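The same effect can be seen without any timers. The sketch below is my own variation, not the code above: it stores the callbacks in arrays (withVar and withLet are made-up names) and only calls them after the loops have finished.

```javascript
// var: one shared binding for the whole loop.
var withVar = [];
for (var i = 0; i < 5; i++) {
  withVar.push(function () {
    return i;
  });
}

// let: a fresh binding for every iteration.
var withLet = [];
for (let k = 0; k < 5; k++) {
  withLet.push(function () {
    return k;
  });
}

// Call the stored functions only after both loops are done.
var varResults = withVar.map(function (fn) {
  return fn();
});
var letResults = withLet.map(function (fn) {
  return fn();
});
```

varResults is [5, 5, 5, 5, 5] and letResults is [0, 1, 2, 3, 4], which shows that the behaviour comes from how the variable is bound and not from setTimeout itself.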
Currying: it allows you to partially evaluate a function by passing in only a subset of its arguments (strictly speaking, the bind call below is partial application, which is closely related). Consider this:
function multiply (x, y) {
return x * y;
}
const double = multiply.bind(null, 2);
const eight = double(4);
eight == 8;
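What bind does here can also be written by hand with a closure. A rough sketch, where partial and curriedMultiply are names I made up for the illustration:

```javascript
function multiply(x, y) {
  return x * y;
}

// Hypothetical helper: fix some leading arguments now, supply the rest later.
// The returned function closes over fn and fixed.
function partial(fn, ...fixed) {
  return function (...rest) {
    return fn(...fixed, ...rest);
  };
}

const timesTwo = partial(multiply, 2);
const viaPartial = timesTwo(4);

// A curried multiply, for comparison: exactly one argument per call.
const curriedMultiply = function (x) {
  return function (y) {
    return x * y;
  };
};
const viaCurry = curriedMultiply(2)(4);
```

Both viaPartial and viaCurry are 8. In both cases the fixed argument 2 is remembered by a closure rather than stored anywhere else.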
Closure: A closure is nothing more than a function accessing a variable from outside its own scope. It is important to remember that a nested function (a function inside a function) is not by itself a closure; a closure arises when the function needs to access variables outside its own scope.
function apple(x){
function google(y,z) {
console.log(x*y);
}
google(7,2);
}
apple(3);
// the answer here will be 21
Closure is very easy. We can consider it as follows :
Closure = function + its lexical environment
Consider the following function:
function init() {
var name = "Mozilla";
}
What will be the closure in the above case?
The function init() and the variables in its lexical environment, i.e. name.
Closure = init() + name
Consider another function :
function init() {
var name = "Mozilla";
function displayName(){
alert(name);
}
displayName();
}
What will be the closures here?
An inner function can access the variables of its outer function. displayName() can access the variable name declared in the parent function, init(). However, if displayName() declares local variables with the same names, those will be used instead.
Closure 1 : init function + ( name variable + displayName() function) --> lexical scope
Closure 2 : displayName function + ( name variable ) --> lexical scope
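The last point, that a local variable of the inner function wins over an outer one with the same name, can be sketched like this. displayShadowed is a made-up addition, and the functions return values instead of calling alert so the result can be inspected.

```javascript
function init() {
  var name = "Mozilla";

  // No local "name": the lookup continues in init's lexical environment.
  function displayName() {
    return name;
  }

  // A local "name" exists: it is used instead of the outer one.
  function displayShadowed() {
    var name = "local";
    return name;
  }

  // The outer variable is untouched by the shadowing inside displayShadowed.
  return [displayName(), displayShadowed(), name];
}

var names = init();
```

names comes out as ["Mozilla", "local", "Mozilla"].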
A simple example in Groovy for your reference:
def outer() {
def x = 1
return { -> println(x)} // inner
}
def innerObj = outer()
innerObj() // prints 1
Here is an example illustrating a closure in the Scheme programming language.
First we define a function defining a local variable, not visible outside the function.
; Function using a local variable
(define (function)
(define a 1)
(display a) ; prints 1, when calling (function)
)
(function) ; prints 1
(display a) ; fails: a undefined
Here is the same example, but now the function uses a global variable, defined outside the function.
; Function using a global variable
(define b 2)
(define (function)
(display b) ; prints 2, when calling (function)
)
(function) ; prints 2
(display b) ; prints 2
And finally, here is an example of a function carrying its own closure:
; Function with closure
(define (outer)
(define c 3)
(define (inner)
(display c))
inner ; outer function returns the inner function as result
)
(define function (outer))
(function) ; prints 3

Resources