There are many SO posts related to this, but I am asking this again with a different purpose
I am trying to understand why closures are important and useful. One of things that I've read in other SO posts related to this is that when you pass a variable to closure, the closure starts remembering this value from then onwards. Is this the entire Technical aspect of it or there is more to what happens there.
What I wonder then is what would happen when the variable used inside the closure gets modified from outside. Should they be constants only?
In the language Clojure, I can do the following: But since there are value is immutable, this issue does not arise. What about other languages and what is the proper technical definition of a closure?
(defn make-greeter [greeting-prefix]
(fn [username] (str greeting-prefix ", " username)))
((make-greeter "Hello") "World")
This is not the sort of answer that appears to get up-votes around here, but I would heartily urge you to discover the answer to your question by reading Shriram Krishnamurthi's (free!) (online!) textbook, Programming Languages: Application and Interpretation.
I will paraphrase the book very, very briefly, by summarizing the development of the teeny tiny interpreters that it leads you through:
an arithmetic expression language (AE)
an arithmetic expression language with named expressions (WAE);
implementing this involves developing a substitution function that can
replace names with values
a language that adds first-order functions (F1WAE): using a function involves substituting
values for each of the parameter names.
The same language, without substitution: it turns out that "environments" allow you to avoid the overhead of pre-emptive substitution.
a language that eliminates the separation between functions and expressions by allowing
functions to be defined at arbitrary locations (FWAE)
This is the key point: you implement this, and then you discover that with substitution it works fine, but with environments it's broken. In particular, in order to fix it up, you must be sure to associate with an evaluated function definition the environment that was in place when it was evaluated. This pair (fundef + environment-of-definition) is what's called a "closure".
Whew!
Okay, what happens when we add mutable bindings to the picture? If you try this yourself, you'll see that the natural implementation replaces an environment that associates names with values with an environment that associates names with bindings. This is orthogonal to the notion of closures; since closures capture environments, and since environments now map names to bindings, you get the behavior you describe, whereby mutation of a variable captured in an environment is visible and persistent.
Again, I would very much urge you to take a look at PLAI.
A closure is really a data structure used by the compiler to make sure that a function will always have access to the data that it needs to opperate. here is an example of a function that recordes when it was defined.
(defn outer []
(let [foo (get-time-of-day)]
(defn inner []
#(str "then:" foo " now:" (get-time-of-day)))))
(def then-and-now (outer))
(then-and-now) ==> "then:1:02:03 now:2:30:01"
....
(then-and-now) ==> "then:1:02:03 now:2:31:02"
when this function is defined a class is created and a small structure (a closure) is allocated on the heap that stores the value of foo. the class has a pointer to that (or it contains it im not sure). if you run this again then a second closure would be allocated to hold that other foo. When we say "this function closes over foo" we mean to say that it has a reference to a stricture/class/whatever that stores the state of foo at the time it was compiled. The reason you need to close over something is because the function that contains it is going away before the data will be used. In this case outer (which contains the value of foo) is going to end and be gone long before foo is used so nobody will be around to modify foo. of course foo could pas a ref to somebody who could then modify it.
A lexical closure is one in which the enclosed variables (e.g. greeting-prefix in your example) are enclosed by reference. The closure created does not simply get the value of greeting-prefix at the time it is created, but gets a reference. If greeting-prefix is modified after the closure is created, then its new value will be used by the closure every time it is called.
In pure functional languages this isn't much of a distinction, because values are never changed. So it doesn't matter if the value of greeting-prefix is copied into the closure: there's no possible difference in behaviour that could arise from referring to the original versus its copy.
In "imperative-languages-with-closures", such as C# and Java (via anonymous classes), some decision has to be made about whether the enclosed variable is enclosed by value or by reference. In Java this decision is pre-empted by only allowing final variables to be enclosed, effectively mimicking a functional language as far as that variable is concerned. In C# I believe it is a different matter.
Enclosing by value simplifies the implementation: the variable to be enclosed will often exist on the stack and hence will be destroyed when the function constructing the closure returns -- that means it can't be enclosed by reference. If you need enclosure by reference, a workaround is to identify such variables and keep them in an object allocated each time that function is called. This object is then kept as part of the closure's environment and must remain live as long as all closures using it are live. (I do not know if any compiled languages directly use this technique.)
For more descriptions see for example:
Common Lisp HyperSpec, 3.1.4 Closures and Lexical Binding
and
Common Lisp the Language, 2nd Edition, Chapter 3., Scope and Extent
You can think of a closure as an "environment", in which names are bound to values. Those names are entirely private to the closure, which is why we say that it "closes over" its environment. So your question isn't meaningful, in that the "outside" cannot affect the closed-over environment. Yes, a closure can refer to a name in a global environment (in other words, if it uses a name that is not bound in its private, closed-over environment), but that's a different story.
If you like, you can think of an environment as a dictionary, or hash table. A closure gets its own little dictionary where names are looked up.
You might enjoy reading On lambdas, capture, and mutability, which describes how this works in C# and F#, for comparison.
Have a look at this blog post: ADTs in Clojure. It shows a nice application of closures to the problem of locking up data so that it is accessible exclusively through a particular interface (rendering the data type opaque).
The main idea behind this type of locking is more simply illustrated with the counter example, which huaiyuan posted in Common Lisp while I was composing this answer. Actually, the Clojure version is interesting in that it shows that the issue of a closed-over variable changing its value does arise in Clojure if the variable happens to hold an instance of one of the reference types.
(defn create-counter []
(let [counter (atom 0)
inc-counter! #(swap! counter inc)
get-counter (fn [] #counter)]
[inc-counter! get-counter]))
As for the original make-greeter example, you could rewrite it thus (note the deref/#):
(defn make-greeter [greeting-prefix]
(fn [username] (str #greeting-prefix ", " username)))
Then you can use it to render personalised greetings from the different operators of various sections of a website. :-)
((make-greeter "Hello from Gizmos Dept") "John")
((make-greeter "Hello from Gadgets Dept") "Jack").
You can think of a closure as an
"environment", in which names are
bound to values. Those names are
entirely private to the closure, which
is why we say that it "closes over"
its environment. So your question
isn't meaningful, in that the
"outside" cannot affect the
closed-over environment. Yes, a
closure can refer to a name in a
global environment (in other words, if
it uses a name that is not bound in
its private, closed-over environment),
but that's a different story.
I suppose that the question was if things like these are possible in languages which allow mutation of local variables:
CL-USER> (let ((x (list 1 2 3)))
(prog1
(let ((y x))
(lambda () y))
(rplaca x 2)))
#<COMPILED-LEXICAL-CLOSURE #x9FEC77E>
CL-USER> (funcall *)
(2 2 3)
And -- since they are obviously possible -- I think the question is legitimate.
Related
I know that from here:
Julia function arguments follow a convention sometimes called "pass-by-sharing", which means that values are not copied when they are passed to functions. Function arguments themselves act as new variable bindings (new locations that can refer to values), but the values they refer to are identical to the passed values. Modifications to mutable values (such as Arrays) made within a function will be visible to the caller. This is the same behavior found in Scheme, most Lisps, Python, Ruby and Perl, among other dynamic languages.
Given this, it's clear to me that to pass by reference, all you need to do is have a mutable type that you pass into a function and edit.
My question then becomes, how can I clearly distinguish between pass by value and pass by reference? Does anyone have an example that shows a function being called twice; once with pass by reference, and once with pass by value?
I saw this post which alludes to some similar ideas, but it did not fully answer my question.
In Julia, functions always have pass-by-sharing argument-passing behavior:
https://docs.julialang.org/en/v1/manual/functions/
This argument-passing convention is also used in most general purpose dynamic programming languages, including various Lisps, Python, Perl and Ruby. A good and useful description can be found here:
https://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_sharing
In short, pass-by-sharing works like pass-by-reference but you cannot change which value a binding in the calling scope refers to by reassigning to an argument in the function being called—if you reassign an argument, the binding in the caller is unchanged. This means that in general you cannot use functions to change bindings, such as for example to swap to variables. (Macros can, however, modify bindings in the caller.) In particular, if a variable in the caller refers to an immutable value like an integer or a floating-point number, its value cannot be changed by a function call since which object the variable refers to cannot be changed by a function call and the value itself cannot be modified as it is immutable.
If you want to have something like R or Matlab pass by value behavior, you need to explicitly create a copy of the argument before modifying it. This is precisely what R and Matlab do when an argument is passed in a modified and an external reference to the argument remains. In Julia it must be done explicitly by the programmer rather than being done automatically by the system. A downside is that the system can sometimes know that no copy is required (no external references remain) when the programmer cannot generally know this. That ability, however, is deeply tied with the reference counting garbage collections technique, which is not used by Julia due to performance considerations.
By convention, functions which mutate the contents of an argument have a ! postfix (e.g., sort v/s sort!).
(define hypot
(lambda (a b)
(sqrt (+ (* a a) (* b b)))))
This is a Scheme programming language.
"define" creates a variable and global binding
lambda creates a procedure
I would like to know if "define" would be considered as an imperative language feature! As long as I know imperative feature is static scoping. I think it is an imperative feature since "define" create a global binding and static scoped looks at global binding for any variable definition where as in dynamic it looks at the current most active binding.
Please help me find the correct answer!! And I would like to know why or why not?
In a Scheme program (define var expr) statement is both a declaration and an initialization. Declarations introduce a new name into the scope. Declarations and initialization are present in both imperative and declarative languages.
However if the same variable is defined twice, then define behave as an assignment - which belongs to the imperative paradigm.
You've put your finger on a subtle and contentious issue. There have long been two informal camps on how define should work, which I would label (very imperfectly, and very controversially!) as the static vs. dynamic camps.
The static camp sees define as a non-side-effecting top-level declaration—it's a syntax that simply defines a name in a top-level scope, just like let is a syntax that defines a name in a local scope. A bit more precisely, this camp tends to see the top-level environment as equivalent to a big letrec with all the defines as the bindings, and all "loose" top-level expressions as the body. This is, incidentally, similar to the way that simple compilers work—read the whole program from one or more files, figure out all of the top-level bindings and generate code with knowledge of the whole program's source text.
The dynamic camp, on the other hand, tends to conceive of the top-level environment as a mutable data structure to which bindings can be added at runtime, and define is then an operation that modifies the top-level environment. This is, incidentally, similar to how simple interactive interpreters work—read definitions interactively from input, one at a time, and incorporate them into the environment as the user provides them.
To give one example, the SLIB library is one that I recall has been criticized for being much too firmly in the "dynamic" camp. If you read Section 1.1 on "features", you see this right from the beginning:
SLIB maintains a list of features supported by a Scheme session. The set of features provided by a session may change during that session.
The documentation for the require form that you use in SLIB to "load" modules continues with this:
Procedure: require feature
If (provided? feature) is true, then require just returns.
Otherwise, if feature is found in the catalog, then the corresponding files will be loaded and (provided? feature) will henceforth return #t. That feature is thereafter provided.
Otherwise (feature not found in the catalog), an error is signaled.
If you read this carefully, you will be struck that it's framing the whole thing as modules being "loaded" at runtime—and not as compile-time linking, which is foreign to the design.
So a "session" is a set of bindings whose keys—not just their values—changes during the runtime of the program. Programs are able to mutate the session with provide and require. They are able to directly observe the mutation with provided?. And it is implied that they can indirectly observe the set of identifiers bound in top-level environment change as a result of require—a call to require causes procedure invocations that would result in a runtime error before its invocation to no longer be so afterwards.
So we can't help but conclude that going by the philosophy of the people who designed this library, define is imperative. But not every Scheme user or implementer shares this philosophy.
First off Scheme is lexically scoped. Define usually is not limited to top level bindings like it is in Racket. It can create bindings within other procedure bodies.
In some implementations define can manipulate state but only for top level definitions. Otherwise it acts like let and binds a variable to the local scope. To actually take advantage of the top-level rebinding programatically is difficult.
So define doesn't introduce an imperative style into scheme code. Compare define to set! and its relatives, which by modify the variable in whatever environment it is bound, thereby allowing imperative style in scheme code.
I would like to read in a string from an input file (which may or may not have been modified by the user). I would like to treat this string as a format directive to be called with a fixed number of arguments. However, I understand that some format directives (particularly, the ~/ comes to mind) could potentially be used to inject function calls, making this approach inherently unsafe.
When using read to parse data in Common Lisp, the language provides the *read-eval* dynamic variable which can be set to nil to disable #. code injection. I'm looking for something similar that would prevent code injection and arbitrary function calls inside format directives.
If the user cannot introduce custom code but only format strings, then you can avoid the problems of print-object. Remember to use with-standard-io-syntax (or a customized version of it) to control to exact kind of output you will generate (think about *print-base*, ...).
You can scan the input strings to detect the presence of ~/ (but ~~/ is valid) and refuse to interpret format that contains blacklisted constructs.
However, some analysis are more difficult and you might need to act at runtime.
For example, if the format string is malformed, you will probably encouter an error, which must be handled (also, you may give bad values to the expected arguments).
Even if the user is not malicious, you can also have problems with iteration constructs:
~{<X>~:*~}
... never stops because ~:* rewinds current argument. In order to handle this, you must consider that <X> may, or not, print something. You could implement both of those strategies:
have a timeout to limit the time formatting takes
have the underlying stream reach end-of-file when writing too much (e.g. write into a string buffer).
There might be other problems I currently don't see, be careful.
It's not just ~/ that you'd need to worry about. The pretty printer functionality has lots of possibilities for code extension, and even ~A can cause problems, because objects may have methods on print-object defined. E.g.,
(defclass x () ())
(defmethod print-object ((x x) stream)
(format *error-output* "Executing arbitrary code...~%")
(call-next-method x stream))
CL-USER> (format t "~A" (make-instance 'x))
Executing arbitrary code...
#<X {1004E4B513}>
NIL
I think you'd need to define for yourself which directives are safe, using whatever criteria you consider important, and then include only those.
What is the correct definition of destructive and non-destructive constructs in LISP (or in general). I have tried to search for the actual meaning but I have only found a lot of usage of these terms without actually explaining them.
It came to my understanding, that by destructive function is meant a function, that changes the meaning of the construct (or variable) - so when I pass a list as a parameter to a function, which changes it, it is called a destructive operation, because it changes the initial list and return a brand new one. Is this right or are there some exceptions?
So is for example set a destructive function (because it changes the value of x)? I think not but I do not how, how would I justify this.
(set 'x 1)
Sorry for probably a very basic question.... Thanks for any answers!
I would not interpret too much into the word 'destructive'.
In list processing, a destructive operation is one that potentially changes one or more of the input lists as a visible side effect.
Now, you can widen the meaning to operations over arrays, structures, CLOS objects, etc. You can also call variable assignment 'destructive' and so on.
In Common Lisp, it makes sense to talk about destructive operations over sequences (which are lists, strings, and vectors in general) and multi-dimensional arrays.
Practical Common Lisp distinguishes two kinds of destructive operations: for-side-effect operations and recycling operations.
set is destructive and for-side-effect: it always modifies its first argument. Beware, that it changes the binding for a symbol, but not the thing currently bound to that symbol. setf can change either bindings or objects in-place.
By contrast, nreverse is recycling: it is allowed to modify its argument list, although there's no guarantee that it will, so it should be used just like reverse (take the return value), except that the input argument may be "destroyed" and should no longer be used. [Scheme programmers may call this a "linear update" function.]
Is it unwise to return a var bound using let?
(let [pipeline (Channels/pipeline)]
(.addLast pipeline "codec" (HttpClientCodec.))
;; several more lines like this
pipeline)
Is the binding here just about the lexical scope (as opposed to def) and not unsafe to pass around?
Update
In writing this question I realised the above was ugly. And if something is ugly in Clojure you are probably doing it wrong.
I think this is probably the more idiomatic way of handling the above (which makes the question moot, btw, but still handy knowledge).
(doto (Channels/pipeline)
(.addLast "codec" (HttpClientCodec.)))
let is purely lexically scoped and doesn't create a var. The locals created by let (or loop) behave exactly like function arguments. So yeah, it's safe to use as many let/loop-defined locals as you like, close over them, etc. Returning a local from the function simply returns its value, not the internal representation (which is actually on the stack, unless closed over). let/loop bindings are therefore also reentrancy/thread-safe.
By the way, for your specific code example with lots of java calls, you may want to consider using doto instead or additionally. http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/doto