In functional languages such as Scheme or Lisp there exist for and for-all loops. However for loops require mutation since it's not a new stack frame each iteration. Since mutation is not available in these languages explicitly how do these functional languages implement their respective iterative loops?
Scheme loops are implemented using recursion under the hood; constructs such as do are just macros that get translated to recursive procedures. For example, this loop in a typical procedural language:
void print(int n) {
for (int i = 0; i < n; i++) {
display(i);
}
}
... Is equivalent to the following procedure in Scheme; here you can see that each part of the loop (initialization, exit condition, increment, body) has a corresponding expression:
(define (print n)
(define (loop i) ; helper procedure, a "named let" would be better
(when (< i n) ; exit condition, if this is false the recursion ends
(display i) ; body
(loop (+ i 1)))) ; increment
(loop 0)) ; initialization
Did you notice that there's nothing left to do after the recursion is called? the compiler is smart enough to optimize this to use a single stack frame, effectively making it as efficient as a for loop - read about tail recursion for more details. And just to clarify, in Scheme mutation is explicitly available, read about the set! instruction.
This question is really two questions, and a confusion.
Iteration in Scheme
In Scheme, iteration is implemented by recursion together with the semantics of the language mandating that certain kinds of recursion do not consume memory, in particular tail recursion. Note that this does not imply mutation. So, for instance, here is a definition of a while loop i Racket.
(define-syntax-rule (while test form ...)
(let loop ([val test])
(if val
(begin
form ...
(loop test))
(void))))
As you can see the recursive call to loop is in tail position and thus consumes no memory.
Iteration in traditional Lisps
Traditional Lisps do not mandate tail-call elimination and thus require iterative constructs: these are generally provided by the language, but usually can be implemented in terms of lower-level constructs, such as GO TO. Here is a definition of while in Common Lisp which does this:
(defmacro while (test &body forms)
(let ((lname (make-symbol "LOOP")))
`(tagbody
,lname
(if ,test
(progn
,#forms
(go ,lname))))))
A confusion about mutation
Both Scheme and traditional Lisps provide mutation operators: neither are the pure functional languages that you may think they are. Scheme is closer to being one, but it still isn't very close.
Related
Given the following function, am I allowed to say that it is recursive? Why I ask this question is because the 'fac' function actually doesn't call itself recursively, so am I still allowed to say that it is a recursive function even though the only function that calls itself is fac-iter?
(define (fac n)
(define (fac-iter counter result)
(if (= counter 0)
result
(fac-iter (- counter 1) (* result counter))))
(fac-iter n 1))
fac is not recursive: it does not refer to its own definition in any way.
fac-iter is recursive: it does refer to its own definition. But in Scheme it will create an iterative process, since its calls to itself are tail calls.
(In casual speech I think people would often say that neither fac nor fac-iter is recursive in Scheme, but I think speaking more precisely the above is correct.)
One problem with calling fac formally recursive is that fac-iter is liftable out of fac. You can write the code like this:
(define (fac-iter counter result)
(if (= counter 0)
result
(fac-iter (- counter 1) (* result counter))))
(define (fac n)
(fac-iter n 1))
fac is an interface which is implemented by a recursive helper function.
If the helper function had a reason to be inside fac, such as accessing the parent's local variables, then there would be more justification for calling fac formally recursive: a significant piece of the interior of fac, a local funcction doing the bulk of the work, is internally recursive, and that interior cannot be moved to the top level without some refactoring.
Informally we can call fac recursive regardless if what we mean by that is that the substantial bulk of its work is done by a recursive approach. We are emphasizing the algorithm that is used, not the details over how it is integrated.
If a homework problem states "please implement a recursive solution to the binary search problem", and the solution is required to take the form of a bsearch.scm file, then obviously the problem statement doesn't mean that the bsearch.scm file must literally invoke itself, right? It means that the main algorithmic content in that file is recursive.
Or when we say that "the POSIX utility find performs a recursive traversal of the filesystem" we don't mean that find forks and executes a copy of itself for every directory it visits.
There is room for informality: calling something recursive without meaning that the entry point of that thing which has that thing's name is literally calling itself.
On another note, in some situations the term "recursion" in the Scheme context is used to denote recursion that requires stack storage; tail calls that are required to be rewritten to express iteration aren't called recursion. That's just taking the point of view of the implementation; what the compiled code is doing. Tail calls are sometimes called "stackless recursion" as a kind of compromise. The situation is complicated because tail calls alone do not eliminate true recursion. There is a way of compiling programs such that all procedure calls become tail calls, namely transformation to CPS (continuation passing style). Yet if the source program performs true recursion that requires a stack, the CPS-transformed program will also, in spite of using nothing but tail calls. What will happen is that an ad hoc stack will emerge via a chain of captured lambda environments. A lambda being used as a continuation captures the previous continuation as a lexical variable. The previous continuation itself captures another such a continuation in its environment, and so on. A heap-allocated chain emerges which constitutes the de facto return stack for the recursion. For reasons like this we cannot automatically conclude that when we see tail calls, we have iteration and not recursion.
An example looks like this. The traversal of a binary tree is truly recursive, right? When we visit the left child, that visitation must return, so that we can then visit the right child. The right child visit can be a tail call, but the left one isn't. Under CPS, they can both be tail calls:
(define (traverse tree contin)
(cond
[(null? tree) (contin)] ;; tail call to continuation
[else (traverse (tree-left tree) ;; tail call to traverse
(lambda ()
(traverse (right tree) contin)))])) ;; ditto!
so here, when the left node is traversed, that is a tail call: the last thing our procedure does is call (traverse (tree-left tree) (lambda ...)). But it passes that lambda as a continuation, and that continuation contains more statements to execute when it is invoked, which is essentially the same as if control returned there via a procedure retun. If we take the point of view that tail calls aren't recursion then we are justified in saying that the function isn't recursive. Yet it has the recursive control flow structure, uses storage proportional to the left depth of the tree, and does so without appearing to maintain an explicit stack structure. As if that weren't enough, the following obviously recursive program can be automatically converted to the above:
(define (traverse tree)
(cond
[(null? tree)] ;; return
[else (traverse (tree-left tree))
(traverse (tree-right tree))]))
The CPS transformation inserts the continuations and lambdas, turning everything into tail calls that pass a continuation argument.
I've looked at everything I can find about letrec, and I still don't understand what it brings to a language as a feature. It seems like everything expressible with letrec could just as easily be written as a recursive function. But are there any reasons to expose letrec as a feature of a programming language, if the language already supports recursive functions? Why do several languages expose both?
I get that letrec might be used to implement other features including recursive functions, but that's not relevant to why it should itself be a feature. I've also read that some people find it more readable than recursive functions in some lisps, but again this is not relevant, because the designer of the language can make an effort to make recursive functions readable enough to not need another feature. Finally, I've been told that letrec makes it possible to express some kinds of recursive values more succinctly, but I have yet to find a motivating example.
TL;DR: define is letrec. This is what enables us to write recursive defintions in the first place.
Consider
let fact = fun (n => (n==0 -> 1 ; n * fact (n-1)))
To what entity does the name fact inside the body of this definiton refer? With let foo = val, val is defined in terms of already known entities, so it can't refer to foo which is not defined yet. In terms of scope this can be said (and usually is) that the RHS of the let equation is defined in the outer scope.
The only way for the inner fact to actually point at the one being defined, is to use letrec, where the entity being defined is allowed to refer to the scope in which it is being defined. So while causing evaluation of an entity while its definition is in progress is an error, storing a reference to its (future, at this point in time) value is fine -- in the case of using letrec that is.
The define you refer to, is just letrec under another name. In Scheme as well.
Without the ability of an entity being defined to refer to itself, i.e. in languages with non-recursive let, to have recursion one has to resort to the use of arcane devices such as the y-combinator. Which is cumbersome and usually inefficient. Another way is the definitions like
let fact = (fun (f => f f)) (fun (r => n => (n==0 -> 1 ; n * r r (n-1))))
So letrec brings to the table the efficiency of implementation, and convenience for a programmer.
The quesion then becomes, why expose the non-recursive let? Haskell indeed does not. Scheme has both letrec and let. One reason might be for completeness. Another might be a simpler implementation for let, with less self-referential run-time structures in memory making it easier on the garbage collector.
You ask for a motivational example. Consider defining Fibonacci numbers as a self-referential lazy list:
letrec fibs = {0} + {1} + add fibs (tail fibs)
With non-recursive let another copy of the list fibs will be defined, to be used as the input to the element-wise addition function add. Which will cause the definition of another copy of fibs for this one to be defined in its terms. And so on; accessing the nth Fibonacci number will cause a chain of n-1 lists to be created and maintained at run-time! Not a pretty picture.
And that's assuming the same fibs was used for tail fibs as well. If not, all bets are off.
What is needed is that fibs uses itself, refers to itself, so only one copy of the list is maintained.
NB: Although this is not a Scheme specific problem I'm using Scheme to demonstrate the differences. Hope you can read a little lisp code
A letrec is just a special let where the bindings themselves are defined before the expressions that represent their values are evaluated. Imagine this:
(define (fib n)
(let ((fib (lambda (n a b)
(if (zero? n)
a
(fib (- n 1) b (+ a b))))))
(fib n))
This code fails since while fib does exist in the body of the let it does exist in the closure it defines since the binding didn't exist when the lambda was evaluated. To fix this letrec comes to the rescue:
(define (fib n)
(letrec ((fib (lambda (n a b)
(if (zero? n)
a
(fib (- n 1) b (+ a b))))))
(fib n))
That letrec is just syntax that does something like this:
(define (fib n)
(let ((fib 'undefined))
(let ((tmp (lambda (n a b)
(if (zero? n)
a
(fib (- n 1) b (+ a b))))))
(set! fib tmp))
(fib n)))
So here you clearly see fib exists when the lambda gets evaluated and the binding is later set to the closure itself. The binding is the same, only it's pointer has changed. It's circular reference 101..
So what happens when you make a global function? Clearly if it is to recurse it needs to exist before the lambda is evaluated or the environment has to be mutated. It needs to fix the same problem here too.
In a functional language implementation where mutation is not ok you can solve this problem with a Y (or Z) combinator.
If you are interested in how languages are implemented I suggest you start at Matt Mights articles.
I was randomly reading through Clojure source code and I saw that partition function was defined in terms of recursion without using "recur":
(defn partition
... ...
([n step coll]
(lazy-seq
(when-let [s (seq coll)]
(let [p (doall (take n s))]
(when (= n (count p))
(cons p (partition n step (nthrest s step))))))))
... ...)
Is there any reason of doing this?
Partition is lazy. The recursive call to partition occurs within the body of a lazy-seq. Therefore, it is not immediately invoked but returned in a special seq-able object to be evaluated when needed and cache the results realized thus far. The stack depth is limited to one invocation at a time.
A recur without a lazy-seq could be used to create an eager version, but you would not want to use it on sequences of indeterminate length as you can with the version in core.
To build on #A.Webb's answer and #amalloy's comment:
recur is not a shorthand to call the function and it's not a function. It's a special form (what in another language you would call a syntax) to perform tail call optimization.
Tail call optimization is a technique that allows to use recursion without blowing up the stack (without it, every recursive call adds its call frame to the stack). It is not implemented natively in Java (yet), which is why recur is used to mark a tail call in Clojure.
Recursion using lazy-seq is different because it delays the recursive call by wrapping it in a closure. What it means is that a call to a function implemented in terms of lazy-seq (and in particular every recursive call in such a function) returns (immediately) a LazySeq sequence, whose computation is delayed until it is iterated through.
To illustrate and qualify #amalloy's comment that recur and laziness are mutually exclusive, here's an implementation of filter that uses both techniques:
(defn filter [pred coll]
(letfn [(step [pred coll]
(when-let [[x & more] (seq coll)]
(if (pred x)
(cons x (lazy-seq (step pred more))) ;; lazy recursive call
(recur pred more))))] ;; tail call
(lazy-seq (step pred coll))))
(filter even? (range 10))
;; => (0 2 4 6 8)
Both techniques can be used in the same function, but not for the same recursive call ; the function wouldn't compile if the lazy recursive call to step used recur because in this case recur wouldn't be in tail call position (the result of the tail call would not be returned directly but would be passed to lazy-seq).
All of the lazy functions are written this way. This implementation of partition would blow the stack without the call to lazy-seq for a large enough sequence.
Read a bit about TCO (tail call optimization) if you are more interested in how recur works. When you are using tail recursion it means you can jump out of your current function call without losing anything. In the case of this implementation you wouldn't be able to do that because you are cons-ing p on the next call of partition. Being in tail position means you are the last thing being called. In the implementation cons is in tail position. recur only works on tail position to guarantee TCO.
In the following (Clojure) SO question: my own interpose function as an exercise
The accepted answers says this:
Replace your recursive call with a call to recur because as written it
will hit a stack overflow
(defn foo [stuff]
(dostuff ... )
(foo (rest stuff)))
becomes:
(defn foo [stuff]
(dostuff ...)
(recur (rest stuff)))
to avoid blowing the stack.
It may be a silly question but I'm wondering why the recursive call to foo isn't automatically replaced by recur?
Also, I took another SO example and wrote this ( without using cond on purpose, just to try things out):
(defn is-member [elem ilist]
(if (empty? ilist)
false
(if (= elem (first ilist))
true
(is-member elem (rest ilist)))))
And I was wondering if I should replace the call to is-member with recur (which also seems to work) or not.
Are there cases where you do recurse and specifically should not use recur?
There's pretty much never a reason not to use recur if you have a tail-recursive method, although unless you're in a performance-sensitive area of code it just won't make any difference.
I think the basic argument is that having recur be explicit makes it very clear whether a function is tail-recursive or not; all tail-recursive functions use recur, and all functions that recur are tail-recursive (the compiler will complain if you try to use recur from a non-tail-position.) So it's a bit of an aesthetic decision.
recur also helps distinguish Clojure from languages which will do TCO on all tail calls, like Scheme; Clojure can't do that effectively because its functions are compiled as Java functions and the JVM doesn't support it. By making recur a special case, there's hopefully no inflated expectations.
I don't think there would be any technical reason why the compiler couldn't insert recur for you, if it were designed that way, but maybe someone will correct me.
I asked Rich Hickey that and his reasoning was basically (and I paraphrase)
"make the special cases look special"
he did not see the value in papering over a special case most of the time and leaving people
to wonder why if blows the stack later when something changes and the compiler can't guarantee the optimization. Untimely it was just one of the design decisions made to try and keep the language simple
I was wondering if I should replace the call to is-member with recur
In general, as mquander says, there is no reason to not use recur whenever you can. With small inputs (a few dozen to a few hundred elements) they are the same, but the version without recur will blow up on large inputs (a few thousand elements).
Explicit recursion (i.e. without 'recur') is good for many things, but iterating through long sequences is not one of them.
Are there cases where you specifically should not use recur?
Only when you can't use it, which is when
it is not tail recursive - i.e. it wants to do something with the return value.
the recursion is to a different function.
Some examples:
(defn foo [coll]
(when coll
(println (first coll))
(recur (next coll))) ;; OK: Tail recursive
(defn fie [coll]
(when coll
(cons (first coll)
(fie (next coll))))) ;; Can't use recur: Not tail recursive.
(defn fum
([coll]
(fum coll [])) ;; Can't use recur: Different function.
([coll acc]
(if (empty? coll) acc
(recur (next coll) ;; OK: Tail recursive
(conj acc (first coll))))))
As to why recur isn't inserted automatically when appropriate: I don't know, but at least one positive side-effect is to make actual function calls visually distinct from the non-calls (i.e. recur).
Since this can be the difference between "works" and "blows up with StackOverflowError", I think it's a fair design choice to make it explicit - visible in the code - rather than implicit, where you would have to start second-guessing the compiler when it doesn't work as expected.
I enjoy using recursion whenever I can, it seems like a much more natural way to loop over something then actual loops. I was wondering if there is any limit to recursion in lisp? Like there is in python where it freaks out after like 1000 loops?
Could you use it for say, a game loop?
Testing it out now, simple counting recursive function. Now at >7000000!
Thanks alot
First, you should understand what tail call is about.
Tail call are call that do not consumes stack.
Now you need to recognize when your are consuming stack.
Let's take the factorial example:
(defun factorial (n)
(if (= n 1)
1
(* n (factorial (- n 1)))))
Here is the non-tail recursive implementation of factorial.
Why? This is because in addition to a return from factorial, there is a pending computation.
(* n ..)
So you are stacking n each time you call factorial.
Now let's write the tail recursive factorial:
(defun factorial-opt (n &key (result 1))
(if (= n 1)
result
(factorial-opt (- n 1) :result (* result n))))
Here, the result is passed as an argument to the function.
So you're also consuming stack, but the difference is that the stack size stays constant.
Thus, the compiler can optimize it by using only registers and leaving the stack empty.
The factorial-opt is then faster, but is less readable.
factorial is limited to the size of the stack will factorial-opt is not.
So you should learn to recognize tail recursive function in order to know if the recursion is limited.
There might be some compiler technique to transform a non-tail recursive function into a tail recursive one. Maybe someone could point out some link here.
Scheme mandates tail call optimization, and some CL implementations offer it as well. However, CL does not mandate it.
Note that for tail call optimization to work, you need to make sure that you don't have to return. E.g. a naive implementation of Fibonacci where there is a need to return and add to another recursive call, will not be tail call optimized and as a result you will run out of stack space.