Closure Because of What it Can Do or Because it Does - functional-programming

OK, this is a bit of a pedantic question, but I'd like to make sure I'm understanding the definition correctly. Is the closure moniker used to describe anonymous functions that can lift variables in the local scope (whether they actually do or not), or only when they do lift variables in the local scope?
In other words, if an anonymous function has the ability to lift variables in its local scope (because the language offers that capability) but doesn't, is it still considered a closure?
My understanding is that it is a closure only when local variables are lifted, and that an anonymous function that doesn't lift any (even though it can) is not a closure. So not all anonymous functions are closures, but all closures are anonymous functions.
Again, sorry for the pedantry, but these things gnaw at me. :)

Assuming you mean within the context of computer science...
A closure is a first class function which captures the lexical bindings of free variables in its defining environment. Once it has captured the lexical bindings the function becomes a closure because it "closes over" those variables.
Note this means closures only exist at run time.
For a function to be a closure is orthogonal to the function being anonymous or named. You can create a language that allows you to define named functions to be closures.
Here is a "named" closure in Python:
def maker():
    count = [0]
    def counter():
        count[0] = count[0] + 1
        return count[0]
    return counter
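As a rough illustration of the distinction the question asks about, the sketch below calls maker twice and inspects the results through CPython's `__closure__` and `__code__.co_freevars` attributes. The names plain_maker and constant are hypothetical, added here only for contrast; they are not part of the original answer.

```python
def maker():
    count = [0]
    def counter():
        count[0] = count[0] + 1
        return count[0]
    return counter

def plain_maker():
    # Hypothetical contrast case: a nested function that captures nothing.
    def constant():
        return 42
    return constant

c1 = maker()
c2 = maker()
first = c1()    # advances c1's own count
second = c1()   # advances it again
other = c2()    # c2 closed over a separate count list

captured_names = c1.__code__.co_freevars  # names of the captured variables
no_capture = plain_maker().__closure__    # None when nothing is captured
```

In CPython, at least, a nested function that could capture variables but does not ends up with no closure cells at all, which matches the view that a function is a closure only when it actually captures something.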

One good definition of a closure is given on lua.org:
When a function is written enclosed in another function, it has full access to local variables from the enclosing function; this feature is called lexical scoping. Although that may sound obvious, it is not. Lexical scoping, plus first-class functions, is a powerful concept in a programming language, but few languages support that concept.

Related

How to pass an object by reference and value in Julia?

I know that from here:
Julia function arguments follow a convention sometimes called "pass-by-sharing", which means that values are not copied when they are passed to functions. Function arguments themselves act as new variable bindings (new locations that can refer to values), but the values they refer to are identical to the passed values. Modifications to mutable values (such as Arrays) made within a function will be visible to the caller. This is the same behavior found in Scheme, most Lisps, Python, Ruby and Perl, among other dynamic languages.
Given this, it's clear to me that to pass by reference, all you need to do is have a mutable type that you pass into a function and edit.
My question then becomes, how can I clearly distinguish between pass by value and pass by reference? Does anyone have an example that shows a function being called twice; once with pass by reference, and once with pass by value?
I saw this post which alludes to some similar ideas, but it did not fully answer my question.
In Julia, functions always have pass-by-sharing argument-passing behavior:
https://docs.julialang.org/en/v1/manual/functions/
This argument-passing convention is also used in most general purpose dynamic programming languages, including various Lisps, Python, Perl and Ruby. A good and useful description can be found here:
https://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_sharing
In short, pass-by-sharing works like pass-by-reference, except that you cannot change which value a binding in the calling scope refers to by reassigning an argument in the function being called; if you reassign an argument, the binding in the caller is unchanged. This means that in general you cannot use functions to change bindings, for example to swap two variables. (Macros can, however, modify bindings in the caller.) In particular, if a variable in the caller refers to an immutable value like an integer or a floating-point number, its value cannot be changed by a function call: the function cannot change which object the variable refers to, and the value itself cannot be modified because it is immutable.
If you want something like the pass-by-value behavior of R or Matlab, you need to explicitly create a copy of the argument before modifying it. This is precisely what R and Matlab do when an argument that was passed in is modified while an external reference to it remains. In Julia it must be done explicitly by the programmer rather than automatically by the system. A downside is that the system can sometimes know that no copy is required (no external references remain) when the programmer generally cannot know this. That ability, however, is deeply tied to the reference-counting garbage collection technique, which Julia does not use for performance reasons.
By convention, functions that mutate the contents of an argument have a ! suffix (e.g., sort vs. sort!).
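Since Python uses the same pass-by-sharing convention, the three cases above (mutating, rebinding, and copying first) can be sketched there. The function names mutate, rebind and mutate_copy are made up for this illustration.

```python
import copy

def mutate(xs):
    # Mutates the shared object: the caller sees the change.
    xs.append(99)

def rebind(xs):
    # Rebinds only the local name: the caller's binding is untouched.
    xs = [0]
    return xs

def mutate_copy(xs):
    # Emulates pass-by-value by copying before modifying.
    xs = copy.copy(xs)
    xs.append(99)
    return xs

a = [1, 2, 3]
mutate(a)                # a now ends with 99

b = [1, 2, 3]
rebound = rebind(b)      # b is unchanged, rebound is a new list

c = [1, 2, 3]
copied = mutate_copy(c)  # c is unchanged, copied ends with 99
```

There is no separate "pass by value" call syntax in such languages; the closest equivalent is the explicit copy in mutate_copy.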

Are functions mutable in multiple dispatch systems?

Have I understood correctly that in (most? some?) multiple dispatch languages each method gets added to the function at some point in time of the program's execution?
Can I then conclude that multiple dispatch as a feature forces functions to be mutable?
Is there a multiple dispatch language, where all methods are attached to a (generic)function together (at load time?), so that it's not possible to see the function in different states at different points in time?
at some point in time of program's execution.
In Common Lisp the methods get added/replaced when the method definitions are executed - for a compiled system this is typically at load-time of the compiled code - not necessarily during the program's execution.
Remember, that Common Lisp has an object system (CLOS, the Common Lisp Object System), which is defined by its behaviour. It's slightly different from a language or a language extension.
Common Lisp allows runtime modification of the object system, for example adding, removing, or replacing methods.
Common Lisp also may combine more than one applicable method into an effective method, which then gets executed. Typical example: all applicable :before methods and the most specific applicable primary method will be combined into one effective method.
There exist extensions for CLOS in some implementations, which seal a generic function against changes.
For a longer treatment of the idea of an object system see: The Structure of a Programming Language Revolution by Richard P. Gabriel.
In Common Lisp, you can read the following from the specification:
7.6.1 Introduction to Generic Functions
When a defgeneric form is evaluated, one of three actions is taken (due to ensure-generic-function):
If a generic function of the given name already exists, the existing generic function object is modified. Methods specified by the current defgeneric form are added, and any methods in the existing generic function that were defined by a previous defgeneric form are removed. Methods added by the current defgeneric form might replace methods defined by defmethod, defclass, define-condition, or defstruct. No other methods in the generic function are affected or replaced.
If the given name names an ordinary function, a macro, or a special operator, an error is signaled.
Otherwise a generic function is created with the methods specified by the method definitions in the defgeneric form.
7.6.2 Introduction to Methods
When a method-defining form is evaluated, a method object is created and one of four actions is taken:
If a generic function of the given name already exists and if a method object already exists that agrees with the new one on parameter specializers and qualifiers, the new method object replaces the old one. For a definition of one method agreeing with another on parameter specializers and qualifiers, see Section 7.6.3 (Agreement on Parameter Specializers and Qualifiers).
If a generic function of the given name already exists and if there is no method object that agrees with the new one on parameter specializers and qualifiers, the existing generic function object is modified to contain the new method object.
If the given name names an ordinary function, a macro, or a special operator, an error is signaled.
Otherwise a generic function is created with the method specified by the method-defining form.
The definition of ensure-generic-function:
If function-name specifies a generic function that has a different value for the :lambda-list argument, and the new value is congruent with the lambda lists of all existing methods or there are no methods, the value is changed; otherwise an error is signaled.
If function-name specifies a generic function that has a different value for the :generic-function-class argument and if the new generic function class is compatible with the old, change-class is called to change the class of the generic function; otherwise an error is signaled.
If function-name specifies a generic function that has a different value for the :method-class argument, the value is changed, but any existing methods are not changed.
You also have add-method and remove-method.
As you can see, generic functions retain their identity between defmethod definitions, and even between defgeneric definitions. Generic functions are mutable in Common Lisp.
In Julia, you can read the following from the documentation:
Defining Methods
To define a function with multiple methods, one simply defines the function multiple times, with different numbers and types of arguments. The first method definition for a function creates the function object, and subsequent method definitions add new methods to the existing function object.
As you can see, function objects are mutable in Julia.
This says nothing about all other multiple dispatch languages. You can invent a multiple dispatch language right now just for the purpose of showing you can do it with immutability, e.g. adding methods would return a new function similar to the previous function but with the added method. Or a language where functions are generated statically at compile-time, such that you can't change it at runtime in any way, not even to add or remove methods.
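To make the hypothetical immutable design a bit more concrete, here is a minimal Python sketch in which adding a method returns a new generic function and leaves the old one untouched. The class GenericFunction and its with_method operation are invented for this illustration and do not correspond to any existing language or library. Dispatch here simply picks the first applicable method in definition order, which is a simplification: real multiple dispatch systems choose the most specific one.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass(frozen=True)
class GenericFunction:
    name: str
    # Each entry is (tuple of argument types, implementation).
    methods: Tuple[Tuple[Tuple[type, ...], Callable], ...] = ()

    def with_method(self, signature, implementation):
        # Return a NEW generic function; self is never modified.
        signature = tuple(signature)
        kept = tuple(m for m in self.methods if m[0] != signature)
        return GenericFunction(self.name, kept + ((signature, implementation),))

    def __call__(self, *args):
        # Simplified dispatch: first method whose types match all arguments.
        for signature, implementation in self.methods:
            if len(signature) == len(args) and all(
                isinstance(arg, typ) for arg, typ in zip(args, signature)
            ):
                return implementation(*args)
        raise TypeError(f"no applicable method for {self.name}")

describe_v0 = GenericFunction("describe")
describe_v1 = describe_v0.with_method((int, int), lambda a, b: "two ints")
describe_v2 = describe_v1.with_method((str, int), lambda a, b: "str then int")
```

Each describe_vN is a distinct value, so code holding describe_v1 never observes the method added for describe_v2.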
Paraphrasing from the excellent "Getting started with Julia" book which has a nice section on this (emphasis mine):
We already saw that functions are inherently defined as generic, that is, they can be used for different types of their arguments. The compiler will generate a separate version of the function each time it is called with arguments of a new type. A concrete version of a function for a specific combination of argument types is called a method in Julia. To define a new method for a function (also called overloading), just use the same function name but a different signature, that is, with different argument types.
A list of all the methods is stored in a virtual method table ( vtable ) on the function itself; methods do not belong to a particular type. When a function is called, Julia will do a lookup in that vtable at runtime to find which concrete method it should call based on the types of all its arguments; this is Julia's mechanism of multiple dispatch, which neither Python, nor C++ or Fortran implements. It allows open extensions where normal object-oriented code would have forced you to change a class or subclass an existing class and thus change your library. Note that only the positional arguments are taken into account for multiple dispatch, and not the keyword arguments.
For each of these different methods, specialized low-level code is generated, targeted to the processor's instruction set. In contrast to object-oriented (OO) languages, vtable is stored in the function, and not in the type (or class). In OO languages, a method is called on a single object, object.method(), which is generally called single dispatch. In Julia, one can say that a function belongs to multiple types, or that a function is specialized or overloaded for different types. Julia's ability to compile code that reads like a high-level dynamic language into machine code that performs like C almost entirely is derived from its ability to do multiple dispatch.
So, the way I understand this (I may be wrong) is that:
The generic function needs to be defined in the session before you can use it
Explicitly defined methods for concrete arguments are added to the function's multiple-dispatch lookup table at the point where they're defined.
Whenever a function is called with specific arguments for which an explicitly defined method does not exist, a concrete version for those arguments is compiled and added to the vtable. (however, this does not show up as an explicit method if you run methods() on that function name)
The first call of such a function will result in some compilation overhead; however, subsequent calls will use the existing compiled version*.
I wouldn't say this makes functions mutable though, that's an altogether different issue. You can confirm yourself they're immutable using the isimmutable() function on a function 'handle'.
*I know modules can be precompiled, but I am not entirely sure if these on-the-fly compiled versions are saved between sessions in any form -- comments welcome :)
Dynamicity can be a real asset in your application, if only for debugging. Trying to prevent a function from being later updated, redefined, etc. might be a little bit short-sighted. But if you are sure you want static dispatch, you can define your own class of generic functions thanks to the MOP, the Meta-Object Protocol, which is not part of the standard but still largely supported. That's what the Inlined-Generic-Function library provides (and this is possible because CLOS is open to extensions).
In Dylan, methods are generally added to a generic function at compile time, but they may also be added at run time (via add-method or remove-method). However, a generic function may be sealed, which prevents libraries other than the one in which the g.f. is defined from adding methods. So to answer your question, in Dylan generic functions are always mutable within the defining library but they may be rendered immutable to other libraries.

Is there any difference between closure in Scheme and usual closure in other languages?

I'm studying SICP right now. And I found the definition of closure in SICP is (maybe) different from closure definition in other languages.
Here's what SICP says:
The ability to create pairs whose elements are pairs is the essence of list structure's importance as a representational tool. We refer to this ability as the closure property of cons. In general, an operation for combining data objects satisfies the closure property if the results of combining things with that operation can themselves be combined using the same operation.
Here closure is closer to closure in mathematics, I think, not what I have seen in JavaScript, where it means the ability of a function to access variables of its enclosing environment.
Am I wrong?
You're right; this text is not referring to "closures"--an implementation strategy to ensure that functions-as-values refer correctly to lexical bindings--but more generally to the mathematical notion of "closure", as for instance in the statement "the integers are closed under the addition operation". That is: applying the operation to any two elements of the set produces a result that is still a member of the set.
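A small Python sketch of the mathematical sense, for finite sets only: the helper is_closed_under is a made-up name, and cons is modeled here as a plain 2-tuple purely for illustration.

```python
from itertools import product

def is_closed_under(elements, operation):
    # True if applying the operation to any two elements stays inside the set.
    elements = set(elements)
    return all(operation(a, b) in elements
               for a, b in product(elements, repeat=2))

# Integers modulo 5 are closed under addition modulo 5.
closed = is_closed_under(range(5), lambda a, b: (a + b) % 5)

# {0, 1, 2} is not closed under ordinary addition (2 + 2 = 4 falls outside).
not_closed = is_closed_under({0, 1, 2}, lambda a, b: a + b)

# SICP's closure property of cons: combining pairs yields something
# that can itself be combined with the same operation.
def cons(a, b):
    return (a, b)

nested = cons(cons(1, 2), cons(3, 4))
```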
There is a difference in the use of "closure" in SICP from the way it is typically used in computing. From SICP Chapter 2, footnote 6:
The use of the word 'closure' here comes from abstract algebra,
where a set of elements is said to be closed under an operation if
applying the operation to elements in the set produces an element that
is again an element of the set. The Lisp community also
(unfortunately) uses the word 'closure' to describe a totally
unrelated concept: A closure is an implementation technique for
representing procedures with free variables. We do not use the word
'closure' in this second sense in this book.
On the other hand, Schemers use "closure" to refer to lexical closures, just like programmers using other languages with lexical closures.

Is it possible to determine the calling context (function, symbol) in a Common Lisp function?

There are probably several ways to implement this introspection feature through macros and code walkers, but is there a simpler (possibly implementation-dependent) way? I'd imagine invoking and then releasing the debugger could give access to the frame stack, but that seems like overkill too.
What would be some simpler ideas to try?
Macros can take an &environment parameter that receives the lexical environment of the calling context. You can then query the lexical environment using these functions: https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node102.html
In particular, check out variable-information and function-information.
I believe there are also implementation-specific ways to get the current lexical environment at run-time.

More explanation on Lexical Binding in Closures?

There are many SO posts related to this, but I am asking this again with a different purpose.
I am trying to understand why closures are important and useful. One of the things I've read in other SO posts is that when you pass a variable to a closure, the closure remembers its value from then onwards. Is this the entire technical aspect of it, or is there more to what happens there?
What I wonder, then, is what happens when the variable used inside the closure gets modified from outside. Should closed-over variables be constants only?
In Clojure I can write the code below, but since values there are immutable, this issue does not arise. What about other languages, and what is the proper technical definition of a closure?
(defn make-greeter [greeting-prefix]
  (fn [username] (str greeting-prefix ", " username)))
((make-greeter "Hello") "World")
This is not the sort of answer that appears to get up-votes around here, but I would heartily urge you to discover the answer to your question by reading Shriram Krishnamurthi's (free!) (online!) textbook, Programming Languages: Application and Interpretation.
I will paraphrase the book very, very briefly, by summarizing the development of the teeny tiny interpreters that it leads you through:
an arithmetic expression language (AE)
an arithmetic expression language with named expressions (WAE); implementing this involves developing a substitution function that can replace names with values
a language that adds first-order functions (F1WAE): using a function involves substituting values for each of the parameter names
the same language, without substitution: it turns out that "environments" allow you to avoid the overhead of pre-emptive substitution
a language that eliminates the separation between functions and expressions by allowing functions to be defined at arbitrary locations (FWAE)
This is the key point: you implement this, and then you discover that with substitution it works fine, but with environments it's broken. In particular, in order to fix it up, you must be sure to associate with an evaluated function definition the environment that was in place when it was evaluated. This pair (fundef + environment-of-definition) is what's called a "closure".
Whew!
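To give a flavour of that last step, here is a very small environment-based interpreter sketched in Python rather than in the book's Racket. The tuple-based expression format and the names Closure and evaluate are assumptions made for this sketch and are not taken from PLAI.

```python
from dataclasses import dataclass

@dataclass
class Closure:
    # A function definition paired with its environment of definition.
    param: str
    body: object
    env: dict

def evaluate(expr, env):
    if isinstance(expr, (int, float)):
        return expr                      # number literal
    if isinstance(expr, str):
        return env[expr]                 # variable lookup
    op = expr[0]
    if op == "+":
        return evaluate(expr[1], env) + evaluate(expr[2], env)
    if op == "with":                     # ("with", name, named_expr, body)
        _, name, named, body = expr
        return evaluate(body, {**env, name: evaluate(named, env)})
    if op == "fun":                      # ("fun", param, body)
        _, param, body = expr
        return Closure(param, body, env)  # capture the defining environment
    if op == "app":                      # ("app", fun_expr, arg_expr)
        _, fun_expr, arg_expr = expr
        closure = evaluate(fun_expr, env)
        arg = evaluate(arg_expr, env)
        # Extend the closure's environment, not the caller's.
        return evaluate(closure.body, {**closure.env, closure.param: arg})
    raise ValueError(f"unknown expression: {expr!r}")

# {with {x 3} {with {f {fun {y} {+ x y}}} {with {x 5} {f 4}}}}
program = ("with", "x", 3,
           ("with", "f", ("fun", "y", ("+", "x", "y")),
            ("with", "x", 5, ("app", "f", 4))))
result = evaluate(program, {})
```

If the "app" case extended the caller's env instead of closure.env, the body of f would see x bound to 5 rather than 3, which is the breakage the paragraph above describes.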
Okay, what happens when we add mutable bindings to the picture? If you try this yourself, you'll see that the natural implementation replaces an environment that associates names with values with an environment that associates names with bindings. This is orthogonal to the notion of closures; since closures capture environments, and since environments now map names to bindings, you get the behavior you describe, whereby mutation of a variable captured in an environment is visible and persistent.
Again, I would very much urge you to take a look at PLAI.
A closure is really a data structure used by the compiler to make sure that a function will always have access to the data that it needs to operate. Here is an example of a function that records when it was defined.
;; get-time-of-day is a placeholder for any function returning the current time
(defn outer []
  (let [foo (get-time-of-day)]
    (fn inner []
      (str "then:" foo " now:" (get-time-of-day)))))
(def then-and-now (outer))
(then-and-now) ; ==> "then:1:02:03 now:2:30:01"
....
(then-and-now) ; ==> "then:1:02:03 now:2:31:02"
When this function is defined, a class is created and a small structure (a closure) is allocated on the heap that stores the value of foo. The class has a pointer to that structure (or contains it; I'm not sure). If you run outer again, a second closure is allocated to hold that other foo. When we say "this function closes over foo", we mean that it has a reference to a structure/class/whatever that stores the state of foo at the time the closure was created. The reason you need to close over something is that the function that contains it goes away before the data is used. In this case outer (which contains the value of foo) ends and is gone long before foo is used, so nobody will be around to modify foo. Of course, a reference to foo could be passed to somebody who could then modify it.
A lexical closure is one in which the enclosed variables (e.g. greeting-prefix in your example) are enclosed by reference. The closure created does not simply get the value of greeting-prefix at the time it is created, but gets a reference. If greeting-prefix is modified after the closure is created, then its new value will be used by the closure every time it is called.
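Python happens to behave this way, so the point can be sketched with a Python analogue of make-greeter. The set_prefix function is a hypothetical addition whose only purpose is to modify the enclosed variable after the closure has been created.

```python
def make_greeter(greeting_prefix):
    def greet(username):
        # Looks up greeting_prefix each time it is called.
        return f"{greeting_prefix}, {username}"

    def set_prefix(new_prefix):
        # Rebinds the same variable that greet closed over.
        nonlocal greeting_prefix
        greeting_prefix = new_prefix

    return greet, set_prefix

greet, set_prefix = make_greeter("Hello")
before = greet("World")
set_prefix("Goodbye")
after = greet("World")
```

Both inner functions share one binding of greeting_prefix, so the change made through set_prefix is visible to greet.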
In pure functional languages this isn't much of a distinction, because values are never changed. So it doesn't matter if the value of greeting-prefix is copied into the closure: there's no possible difference in behaviour that could arise from referring to the original versus its copy.
In "imperative-languages-with-closures", such as C# and Java (via anonymous classes), some decision has to be made about whether the enclosed variable is enclosed by value or by reference. In Java this decision is pre-empted by only allowing final variables to be enclosed, effectively mimicking a functional language as far as that variable is concerned. In C# I believe it is a different matter.
Enclosing by value simplifies the implementation: the variable to be enclosed will often exist on the stack and hence will be destroyed when the function constructing the closure returns -- that means it can't be enclosed by reference. If you need enclosure by reference, a workaround is to identify such variables and keep them in an object allocated each time that function is called. This object is then kept as part of the closure's environment and must remain live as long as all closures using it are live. (I do not know if any compiled languages directly use this technique.)
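The observable difference between the two choices can be seen in Python, which encloses by reference by default. A common idiom there for getting by-value behavior is to bind the current value to a default argument. This sketch shows only the behavioral difference, not the stack-versus-heap implementation issue described above.

```python
# Enclosed by reference: every lambda shares the same variable i
# and sees whatever it holds when the lambda is finally called.
by_reference = [lambda: i for i in range(3)]

# Enclosed by value (emulated): the default argument is evaluated
# when each lambda is created, so each keeps its own copy.
by_value = [lambda i=i: i for i in range(3)]

reference_results = [f() for f in by_reference]
value_results = [f() for f in by_value]
```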
For more descriptions see for example:
Common Lisp HyperSpec, 3.1.4 Closures and Lexical Binding
and
Common Lisp the Language, 2nd Edition, Chapter 3., Scope and Extent
You can think of a closure as an "environment", in which names are bound to values. Those names are entirely private to the closure, which is why we say that it "closes over" its environment. So your question isn't meaningful, in that the "outside" cannot affect the closed-over environment. Yes, a closure can refer to a name in a global environment (in other words, if it uses a name that is not bound in its private, closed-over environment), but that's a different story.
If you like, you can think of an environment as a dictionary, or hash table. A closure gets its own little dictionary where names are looked up.
You might enjoy reading On lambdas, capture, and mutability, which describes how this works in C# and F#, for comparison.
Have a look at this blog post: ADTs in Clojure. It shows a nice application of closures to the problem of locking up data so that it is accessible exclusively through a particular interface (rendering the data type opaque).
The main idea behind this type of locking is more simply illustrated with the counter example, which huaiyuan posted in Common Lisp while I was composing this answer. Actually, the Clojure version is interesting in that it shows that the issue of a closed-over variable changing its value does arise in Clojure if the variable happens to hold an instance of one of the reference types.
(defn create-counter []
  (let [counter (atom 0)
        inc-counter! #(swap! counter inc)
        get-counter (fn [] @counter)]
    [inc-counter! get-counter]))
As for the original make-greeter example, you could rewrite it thus (note the deref/@):
(defn make-greeter [greeting-prefix]
  (fn [username] (str @greeting-prefix ", " username)))
Then you can use it to render personalised greetings from the different operators of various sections of a website. :-)
;; the prefix must be a reference type (here an atom) for @ to work
((make-greeter (atom "Hello from Gizmos Dept")) "John")
((make-greeter (atom "Hello from Gadgets Dept")) "Jack")
You can think of a closure as an
"environment", in which names are
bound to values. Those names are
entirely private to the closure, which
is why we say that it "closes over"
its environment. So your question
isn't meaningful, in that the
"outside" cannot affect the
closed-over environment. Yes, a
closure can refer to a name in a
global environment (in other words, if
it uses a name that is not bound in
its private, closed-over environment),
but that's a different story.
I suppose that the question was if things like these are possible in languages which allow mutation of local variables:
CL-USER> (let ((x (list 1 2 3)))
           (prog1
               (let ((y x))
                 (lambda () y))
             (rplaca x 2)))
#<COMPILED-LEXICAL-CLOSURE #x9FEC77E>
CL-USER> (funcall *)
(2 2 3)
And -- since they are obviously possible -- I think the question is legitimate.
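For comparison, a rough Python rendering of the same experiment, with a Python list standing in for the Lisp list and item assignment standing in for rplaca:

```python
x = [1, 2, 3]
y = x               # y refers to the same list object as x
get_y = lambda: y   # the closure captures y
x[0] = 2            # mutate the shared list from "outside"
result = get_y()
```

The closure returns the mutated list, because it captured a reference to the shared object rather than a snapshot of its contents.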
