The place of closures in functional programming - functional-programming

I have watched the talk of Robert C Martin "Functional Programming; What? Why? When?"
https://www.youtube.com/watch?v=7Zlp9rKHGD4
The main message of this talk is that state is unacceptable in functional programming.
Martin goes even further and claims that assignments are 'evil'.
So, keeping this talk in mind, my question is: where is the place for closures in functional programming?
When there is no state and no variable in functional code, what would be the main reason to create and use a closure (one that does not enclose any state or any variable)? Is the closure mechanism still useful?
Without state or variables (perhaps with only immutable identifiers), is there any need to reference the current lexical scope, given that nothing in it could change?
In that case, a Java-like lambda mechanism would be enough, where there is no link to the current lexical scope (which is why captured variables have to be final).
Yet some sources say that closures are a must-have element of a functional language.

A lexical scope that can be closed over does not need to be mutable to be useful. Just consider curried functions as an example:
add = \a -> \b -> a+b
add1 = add(1)
add3 = add(3)
[add1(0), add1(2), add3(2), add3(5)] // [1, 2, 5, 8]
Here, the inner lambda closes over the value of a (or over the variable a, which doesn't make a difference because of immutability).
Closures are not ultimately necessary for functional programming, but local variables are not either. Still, they're both very good ideas. Closures allow for a very simple notation of the most(?) important task of functional programming: to dynamically create new functions with specialised behaviour from an abstracted code.

You use closures as you would in a language with mutable variables. The difference is obviously that they (usually) can't be modified.
The following is a simple example, in Clojure (which ironically I'm writing in right now):
(let [a 10
      f (fn [b]
          (+ a b))]
  (println (f 4))) ; Prints "14"
The main benefit of closures in a case like this is that I can "partially apply" a function using a closure, then pass the partially applied function around, instead of needing to pass the non-applied function along with any data I'll need to call it (very useful in many scenarios). In the example above, what if I didn't want to call the function right away? Without the closure, I would need to pass a along with it so it's available when f is called.
But you could also add some mutability into the mix if you deemed it necessary (although, as @Bergi points out, this example is "evil"):
(let [a (atom 10) ; Atoms are mutable
      f (fn [b]
          (do
            (swap! a inc) ; Increment a
            (+ @a b)))]   ; @a dereferences the atom
  (println (f 4))  ; Prints "15"
  (println (f 4))) ; Prints "16"
In this way you can emulate static variables. You can use this to do cool things like define memoize. It uses a "static variable" to cache the input/output of referentially transparent functions. This increases memory use, but can save CPU time if used properly.
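As a rough illustration of the memoize idea, here is a minimal sketch in Python rather than Clojure. The names memoize, slow_square and calls are hypothetical and exist only for this example; the closed-over dictionary plays the role of the "static variable".

```python
def memoize(fn):
    """Return a caching wrapper around fn (assumed referentially transparent)."""
    cache = {}  # closed over by wrapper; invisible to every other piece of code

    def wrapper(*args):
        if args not in cache:
            cache[args] = fn(*args)  # compute once, remember the result
        return cache[args]

    return wrapper


calls = []  # hypothetical call log, only here to show when the real work happens


def slow_square(n):
    calls.append(n)
    return n * n


fast_square = memoize(slow_square)
first = fast_square(4)   # computed by slow_square
second = fast_square(4)  # served from the cache
```

The cache is reachable only through wrapper, so callers cannot observe it except as a speed-up.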
I have to disagree with the idea that having state is unacceptable. State isn't evil; it's necessary. Every program has state. Global, mutable state is evil.
Also note, you can have mutability, and still program functionally. Say I have a function, containing a map over a list. Also say, I need to maintain an accumulator while mapping. I really have 2 options (ignoring "doing it manually"):
Switch the map to a fold.
Create a mutable variable, and mutate it while mapping.
Although option one should be preferred, both of these methods can be used in functional programming. From the view of "outside the function", there would be no difference, even if one version is internally using a mutable variable. The function can still be referentially transparent, and pure, since the only mutable state being affected is local to the function, and can't possibly affect anything outside.
Example code mutating a local variable:
(defn mut-fn [xs]
  (let [a (atom 0)]
    (map
      (fn [x]
        (swap! a inc) ; Increment a
        (+ x @a))     ; Return x plus the current value of a
      xs)))
Note the variable a cannot be seen from outside the function, so any effect it has can in no way cause global changes. The function will produce the same output for each input, so it's effectively pure.
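For comparison, option one (switching the map to a fold) might look like the following sketch. It is written in Python rather than Clojure, and the function names are made up for this illustration. Both versions add a 1-based running counter to each element, which is what mut-fn does, and a caller cannot tell them apart.

```python
from functools import reduce


def add_position_fold(xs):
    """Fold version: the counter travels through the accumulator."""
    def step(state, x):
        count, out = state
        new_count = count + 1
        return new_count, out + [x + new_count]

    _, result = reduce(step, xs, (0, []))
    return result


def add_position_mutable(xs):
    """Mutable version: the counter is a local variable that never escapes."""
    count = 0
    out = []
    for x in xs:
        count += 1
        out.append(x + count)
    return out
```

Calling either function with [10, 20, 30] gives [11, 22, 33].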

Related

What are the typical use-cases of (defun (setf …)) defsetf and define-setf-expander

When developing with Common Lisp, we have four possibilities to define new setf-forms:
We can define a function whose name is a list of two symbols, the first one being setf, e.g. (defun (setf some-observable) (…)).
We can use the short form of defsetf.
We can use the long form of defsetf.
We can use define-setf-expander.
I am not sure what is the right or intended use-case for each of these possibilities.
A response to this question could hint at the most generic solution and outline contexts where other solutions are superior.
define-setf-expander is the most general of these. All of setf's functionality is encompassed by it.
Defining a setf function works fine for most accessors. It is also valid to use a generic function, so polymorphism is insufficient to require using something else. Controlling evaluation either for correctness or performance is the main reason to not use a setf function.
For correctness, many forms of destructuring are not possible to do with a setf function (e.g. (setf (values ...) ...)). Similarly I've seen an example that makes functional data structures behave locally like a mutable one by changing (setf (dict-get key some-dict) 2) to assign a new dictionary to some-dict.
For performance, consider the silly case of (incf (nth 10000 list)) which if the nth writer were implemented as a function would require traversing 10k list nodes twice, but in a setf expander can be done with a single traversal.

How does one implement a "stackless" interpreted language?

I am making my own Lisp-like interpreted language, and I want to do tail call optimization. I want to free my interpreter from the C stack so I can manage my own jumps from function to function and my own stack magic to achieve TCO. (I really don't mean stackless per se, just the fact that calls don't add frames to the C stack. I would like to use a stack of my own that does not grow with tail calls). Like Stackless Python, and unlike Ruby or... standard Python I guess.
But, as my language is a Lisp derivative, all evaluation of s-expressions is currently done recursively (because it's the most obvious way I thought of to do this nonlinear, highly hierarchical process). I have an eval function, which calls a Lambda::apply function every time it encounters a function call. The apply function then calls eval to execute the body of the function, and so on. Mutual stack-hungry non-tail C recursion. The only iterative part I currently use is to eval a body of sequential s-expressions.
(defun f (x y)
  (a x y)) ; tail call! goto instead of call.
           ; (do not grow the stack, keep return addr)
(defun a (x y)
  (+ x y))
; ...
(print (f 1 2)) ; how does the return work here? how does it know it's supposed to
                ; return the value here to be used by print, and how does it know
                ; how to continue execution here??
So, how do I avoid using C recursion? Or can I use some kind of goto that jumps across C functions? longjmp, perhaps? I really don't know. Please bear with me, I am mostly self- (Internet-) taught in programming.
One solution is what is sometimes called "trampolined style". The trampoline is a top-level loop that dispatches to small functions that do some small step of computation before returning.
I've sat here for nearly half an hour trying to contrive a good, short example. Unfortunately, I have to do the unhelpful thing and send you to a link:
http://en.wikisource.org/wiki/Scheme:_An_Interpreter_for_Extended_Lambda_Calculus/Section_5
The paper is called "Scheme: An Interpreter for Extended Lambda Calculus", and section 5 implements a working Scheme interpreter in an outdated dialect of Lisp. The secret is in how they use the **CLINK** instead of a stack. The other globals are used to pass data around between the implementation functions, like the registers of a CPU. I would ignore **QUEUE**, **TICK**, and **PROCESS**, since those deal with threading and fake interrupts. **EVLIS** and **UNEVLIS** are, specifically, used to evaluate function arguments. Unevaluated args are stored in **UNEVLIS** until they are evaluated and put into **EVLIS**.
Functions to pay attention to, with some small notes:
MLOOP: MLOOP is the main loop of the interpreter, or "trampoline". Ignoring **TICK**, its only job is to call whatever function is in **PC**. Over and over and over.
SAVEUP: SAVEUP conses all the registers onto the **CLINK**, which is basically the same as when C saves the registers to the stack before a function call. The **CLINK** is actually a "continuation" for the interpreter. (A continuation is just the state of a computation. A saved stack frame is technically a continuation, too. Hence, some Lisps save the stack to the heap to implement call/cc.)
RESTORE: RESTORE restores the "registers" as they were saved in the **CLINK**. It's similar to restoring a stack frame in a stack-based language. So, it's basically "return", except some function has explicitly stuck the return value into **VALUE**. (**VALUE** is obviously not clobbered by RESTORE.) Also note that RESTORE doesn't always have to return to a calling function. Some functions will actually SAVEUP a whole new computation, which RESTORE will happily "restore".
AEVAL: AEVAL is the EVAL function.
EVLIS: EVLIS exists to evaluate a function's arguments, and apply a function to those args. To avoid recursion, it SAVEUPs EVLIS-1. EVLIS-1 would just be regular old code after the function application if the code was written recursively. However, to avoid recursion, and the stack, it is a separate "continuation".
I hope I've been of some help. I just wish my answer (and link) was shorter.
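A very small sketch of the trampolined style described in this answer, in Python rather than Lisp or C. This is not the paper's interpreter: trampoline, is_even and is_odd are hypothetical names, and the "small step" is simply a zero-argument function that returns either a final value or the next step.

```python
def trampoline(step):
    """Top-level loop: keep running steps until a non-callable result appears."""
    while callable(step):
        step = step()
    return step


def is_even(n):
    # Instead of calling is_odd directly, return a thunk describing the next step.
    return True if n == 0 else (lambda: is_odd(n - 1))


def is_odd(n):
    return False if n == 0 else (lambda: is_even(n - 1))


# Mutual recursion 100000 levels "deep", yet the host stack stays flat because
# every step returns to the trampoline before the next one starts.
result = trampoline(is_even(100000))
```

Written as direct mutual recursion, the same call would exceed Python's default recursion limit.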
What you're looking for is called continuation-passing style. This style adds an additional item to each function call (you could think of it as a parameter, if you like), that designates the next bit of code to run (the continuation k can be thought of as a function that takes a single parameter). For example you can rewrite your example in CPS like this:
(defun f (x y k)
  (a x y k))
(defun a (x y k)
  (+ x y k))
(f 1 2 print)
The implementation of + will compute the sum of x and y, then pass the result to k sort of like (k sum).
Your main interpreter loop then doesn't need to be recursive at all. It will, in a loop, apply each function application one after another, passing the continuation around.
It takes a little bit of work to wrap your head around this. I recommend some reading materials such as the excellent SICP.
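The same CPS rewrite can be sketched in Python, under the assumption that the primitive + is replaced by a hypothetical helper add_cps that takes the continuation k as its last argument. Python itself does not eliminate tail calls, so this only shows the shape of the code, not the stack behaviour an interpreter loop would give you.

```python
def add_cps(x, y, k):
    """CPS version of +: compute the sum, then hand it to the continuation k."""
    return k(x + y)


def a(x, y, k):
    return add_cps(x, y, k)  # tail call: a has nothing left to do afterwards


def f(x, y, k):
    return a(x, y, k)  # tail call: f passes its own continuation straight on


results = []
f(1, 2, results.append)  # the continuation plays the role of print here
```

No function ever "returns a value to its caller"; the value 3 flows forward into whatever continuation was supplied at the top.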
Tail recursion can be thought of as reusing for the callee the same stack frame that you are currently using for the caller. So you could just re-set the arguments and goto to the beginning of the function.
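To make the "re-set the arguments and goto the beginning" idea concrete, here is a small Python sketch with made-up function names. The loop version is what the tail-recursive version turns into when the frame is reused.

```python
def sum_to_recursive(n, acc=0):
    """Tail-recursive form: the recursive call is the last thing that happens."""
    if n == 0:
        return acc
    return sum_to_recursive(n - 1, acc + n)


def sum_to_loop(n, acc=0):
    """Same function with the tail call replaced by argument re-assignment."""
    while True:  # the top of the loop is the "beginning of the function"
        if n == 0:
            return acc
        n, acc = n - 1, acc + n  # re-set the arguments, then jump back
```

The loop version handles inputs such as 100000 that would overflow the stack in the recursive version, because Python does not perform tail-call elimination.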

What are best practices for including parameters such as an accumulator in functions?

I've been writing more Lisp code recently. In particular, recursive functions that take some data, and build a resulting data structure. Sometimes it seems I need to pass two or three pieces of information to the next invocation of the function, in addition to the user supplied data. Let's call these accumulators.
What is the best way to organize these interfaces to my code?
Currently, I do something like this:
(defun foo (user1 user2 &optional acc1 acc2 acc3)
  ;; do something
  (foo user1 user2 (cons x acc1) (cons y acc2) (cons z acc3)))
This works as I'd like it to, but I'm concerned because I don't really need to present the &optional parameters to the programmer.
Three approaches I'm considering:
have a wrapper function that a user is encouraged to use that immediately invokes the extended definition.
use labels internally within a function whose signature is concise.
just start using a loop and variables. However, I'd prefer not since I'd like to really wrap my head around recursion.
Thanks guys!
If you want to write idiomatic Common Lisp, I'd recommend the loop and variables for iteration. Recursion is cool, but it's only one tool of many for the Common Lisper. Besides, tail-call elimination is not guaranteed by the Common Lisp spec.
That said, I'd recommend the labels approach if you have a structure, a tree for example, that is unavoidably recursive and you can't get tail calls anyway. Optional arguments let your implementation details leak out to the caller.
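The shape of the labels approach can be sketched in Python, where a nested function stands in for a labels-bound helper. The names leaves and walk are hypothetical; the point is only that the accumulator never appears in the public signature.

```python
def leaves(tree):
    """Collect the leaves of a nested list, left to right."""
    def walk(node, acc):
        # walk is the local helper; acc is the accumulator callers never see.
        if isinstance(node, list):
            for child in node:
                walk(child, acc)
        else:
            acc.append(node)
        return acc

    return walk(tree, [])
```

For example, leaves([1, [2, [3, 4]], 5]) yields [1, 2, 3, 4, 5], and there is no optional parameter a caller could set by mistake.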
Your impulse to shield implementation details from the user is a smart one, I think. I don't know Common Lisp, but in Scheme you do it by defining your helper function in the public function's lexical scope.
(define (fibonacci n)
  (let fib-accum ((a 0)
                  (b 1)
                  (n n))
    (if (< n 1)
        a
        (fib-accum b (+ a b) (- n 1)))))
The let expression defines a function and binds it to a name that's only visible within the let, then invokes the function.
I have used all the options you mention. All have their merits, so it boils down to personal preference.
I have arrived at using whatever I deem appropriate. If I think that leaving the &optional accumulators in the API might make sense for the user, I leave it in. For example, in a reduce-like function, the accumulator can be used by the user for providing a starting value. Otherwise, I'll often rewrite it as a loop, do, or iter (from the iterate library) form, if it makes sense to perceive it as such. Sometimes, the labels helper is also used.

What are "downward funargs"?

Jamie Zawinski uses that term in his (1997) article "java sucks" as if you should know what it means:
I really hate the lack of downward-funargs; anonymous classes are a lame substitute. (I can live without long-lived closures, but I find lack of function pointers a huge pain.)
It seems to be Lisper's slang, and I could find the following brief definition here, but somehow, I think I still don't get it:
Many closures are used only during the extent of the bindings they refer to; these are known as "downward funargs" in Lisp parlance.
Were it not for Steve Yegge, I'd just feel stupid now, but it seems, it might be OK to ask:
Jamie Zawinski is a hero. A living legend. [...] A guy who can use the term "downward funargs" and then glare at you just daring you to ask him to explain it, you cretin.
-- XEmacs is dead, long live XEmacs
So is there a Lisper here who can compile this for C-style-programmers like me?
Downward funargs are local functions that are not returned and do not otherwise leave their declaration scope. They can only be passed downwards to other functions from the current scope.
Two examples. This is a downward funarg:
function () {
  var a = 42;
  var f = function () { return a + 1; };
  foo(f); // `foo` is a function declared somewhere else.
}
While this is not:
function () {
  var a = 42;
  var f = function () { return a + 1; };
  return f;
}
To better understand where the term comes from, you need to know some history.
The reason why an old Lisp hacker might distinguish downward funargs from funargs in general is that downward funargs are easy to implement in a traditional Lisp that lacks lexical variables, whereas the general case is hard.
Traditionally a local variable was implemented in a Lisp interpreter by adding a binding (the symbol name of the variable, paired with its value) to the environment. Such an environment was simple to implement using an association list. Each function had its own environment, and a pointer to the environment of the parent function. A variable reference was resolved by looking in the current environment, and if not found there, then in the parent environment, and so on up the stack of environments until the global environment was reached.
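The lookup scheme just described can be sketched in a few lines of Python. This is an illustrative model only, not the code of any particular Lisp: the class name Env and its fields are assumptions, and each environment is an association list of (name, value) pairs with a pointer to its parent.

```python
class Env:
    """One environment frame: an association list plus a parent pointer."""

    def __init__(self, bindings, parent=None):
        self.bindings = list(bindings)  # [(name, value), ...]
        self.parent = parent

    def lookup(self, name):
        env = self
        while env is not None:  # walk up the stack of environments
            for key, value in env.bindings:
                if key == name:
                    return value
            env = env.parent
        raise NameError(name)  # fell off the top: not even a global binding


global_env = Env([("print-length", 10), ("x", 1)])
local_env = Env([("print-length", 3)], parent=global_env)
```

Here local_env.lookup("print-length") finds the inner binding 3, while local_env.lookup("x") falls through to the global value 1.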
In such an implementation, local variables shadow global variables with the same name. For example, in Emacs Lisp, print-length is a global variable that specifies the maximum length of list to print before abbreviating it. By binding this variable around the call to a function you can change the behaviour of print statements within that function:
(defun foo () (print '(1 2 3 4 5 6))) ; output depends on the value of print-length
(foo) ; use global value of print-length
==> (1 2 3 4 5 6)
(let ((print-length 3)) (foo)) ; bind print-length locally around the call to foo.
==> (1 2 3 ...)
You can see that in such an implementation, downward funargs are really easy to implement, because variables that are in the environment of the function when it's created will still be in the environment of the function when it's evaluated.
Variables that act like this are called special or dynamic variables, and you can create them in Common Lisp using the special declaration.
In Common Lisp:
(let ((a 3))
  (mapcar (lambda (b) (+ a b))
          (list 1 2 3 4)))
-> (4 5 6 7)
In the above form the lambda function is passed DOWNWARD. When called by the higher-order function MAPCAR (which gets a function and a list of values as arguments, and then applies the function to each element of the list and returns a list of the results), the lambda function still refers to the variable 'a' from the LET expression. But it all happens within the LET expression.
Compare the above with this version:
(mapcar (let ((a 3))
          (lambda (b) (+ a b)))
        (list 1 2 3 4))
Here the lambda function is returned from the LET. UPWARD a bit. It then gets passed to the MAPCAR. When MAPCAR calls the lambda function, its surrounding LET is no longer executing - still the function needs to reference the variable 'a' from the LET.
There's a pretty descriptive article on Wikipedia called "Funarg problem":
"A downwards funarg may also refer to a function's state when that function is not actually executing. However, because, by definition, the existence of a downwards funarg is contained in the execution of the function that creates it, the activation record for the function can usually still be stored on the stack."

Nested functions: Improper use of side-effects?

I'm learning functional programming, and have tried to solve a couple of problems in a functional style. One thing I noticed, while dividing up my problem into functions, was that I seemed to have two options: use several disparate functions with similar parameter lists, or use nested functions which, as closures, can simply refer to bindings in the parent function.
Though I ended up going with the second approach, because it made function calls smaller and it seemed to "feel" better, from my reading it seems like I may be missing one of the main points of functional programming, in that this seems "side-effecty"? Now granted, these nested functions cannot modify the outer bindings, as the language I was using prevents that, but if you look at each individual inner function, you can't say "given the same parameters, this function will return the same results" because they do use the variables from the parent scope... am I right?
What is the desirable way to proceed?
Thanks!
Functional programming isn't all-or-nothing. If nesting the functions makes more sense, I'd go with that approach. However, if you really want the internal functions to be purely functional, explicitly pass all the needed parameters into them.
Here's a little example in Scheme:
(define (foo a)
  (define (bar b)
    (+ a b)) ; getting a from outer scope, not purely functional
  (bar 3))
(define (foo a)
  (define (bar a b)
    (+ a b)) ; getting a from function parameters, purely functional
  (bar a 3))
(define (bar a b) ; since this is purely functional, we can remove it from its
  (+ a b))        ; environment and it still works
(define (foo a)
  (bar a 3))
Personally, I'd go with the first approach, but either will work equally well.
Nesting functions is an excellent way to divide up the labor in many functions. It's not really "side-effecty"; if it helps, think of the captured variables as implicit parameters.
One example where nested functions are useful is to replace loops. The parameters to the nested function can act as induction variables which accumulate values. A simple example:
let factorial n =
  let rec facHelper p n =
    (* n <= 1 rather than n = 1, so that factorial 0 terminates *)
    if n <= 1 then p else facHelper (p*n) (n-1)
  in
  facHelper 1 n
In this case, it wouldn't really make sense to declare a function like facHelper globally, since users shouldn't have to worry about the p parameter.
Be aware, however, that it can be difficult to test nested functions individually, since they cannot be referred to outside of their parent.
Consider the following (contrived) Haskell snippet:
putLines :: [String] -> IO ()
putLines lines = putStr string
  where string = concat lines
string is a locally bound named constant. But isn't it also a function taking no arguments that closes over lines and is therefore referentially intransparent? (In Haskell, constants and nullary functions are indeed indistinguishable!) Would you consider the above code “side-effecty” or non-functional because of this?

Resources