Why aren't recursive calls automagically replaced by recur?

In the following (Clojure) SO question: my own interpose function as an exercise
The accepted answer says this:
Replace your recursive call with a call to recur because as written it
will hit a stack overflow
(defn foo [stuff]
(dostuff ... )
(foo (rest stuff)))
becomes:
(defn foo [stuff]
(dostuff ...)
(recur (rest stuff)))
to avoid blowing the stack.
It may be a silly question but I'm wondering why the recursive call to foo isn't automatically replaced by recur?
Also, I took another SO example and wrote this (without using cond on purpose, just to try things out):
(defn is-member [elem ilist]
(if (empty? ilist)
false
(if (= elem (first ilist))
true
(is-member elem (rest ilist)))))
And I was wondering if I should replace the call to is-member with recur (which also seems to work) or not.
Are there cases where you do recurse and specifically should not use recur?

There's pretty much never a reason not to use recur if you have a tail-recursive method, although unless you're in a performance-sensitive area of code it just won't make any difference.
I think the basic argument is that having recur be explicit makes it very clear whether a function is tail-recursive or not; all tail-recursive functions use recur, and all functions that recur are tail-recursive (the compiler will complain if you try to use recur from a non-tail-position.) So it's a bit of an aesthetic decision.
recur also helps distinguish Clojure from languages which will do TCO on all tail calls, like Scheme; Clojure can't do that effectively because its functions are compiled to Java methods and the JVM doesn't support general tail-call elimination. By making recur a special case, there are hopefully no inflated expectations.
I don't think there would be any technical reason why the compiler couldn't insert recur for you, if it were designed that way, but maybe someone will correct me.
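The compiler check mentioned above (recur is rejected outside tail position) can be seen directly. A minimal sketch, assuming a standard Clojure REPL; the name compile-result is made up for illustration:

```clojure
;; Hypothetical name: compile-result exists only for this sketch.
;; The recur below is an argument to cons, so it is not in tail position.
;; The compiler refuses the form rather than silently falling back to an
;; ordinary (stack-consuming) call.
(def compile-result
  (try
    (eval '(fn [coll]
             (cons (first coll) (recur (next coll)))))
    :compiled
    (catch Exception _
      :rejected)))

compile-result ;; evaluates to :rejected
```

So a recur that compiles is always a real tail call, which is the guarantee an automatic rewrite could not make visible in the source.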

I asked Rich Hickey that and his reasoning was basically (and I paraphrase)
"make the special cases look special"
He did not see the value in papering over a special case most of the time and leaving people
to wonder why it blows the stack later when something changes and the compiler can't guarantee the optimization. Ultimately it was just one of the design decisions made to try and keep the language simple.

I was wondering if I should replace the call to is-member with recur
In general, as mquander says, there is no reason to not use recur whenever you can. With small inputs (a few dozen to a few hundred elements) they are the same, but the version without recur will blow up on large inputs (a few thousand elements).
Explicit recursion (i.e. without 'recur') is good for many things, but iterating through long sequences is not one of them.
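To make the difference concrete, here is a rough sketch with two variants of the is-member function from the question. The names member-plain? and member-recur? are invented for this example, and the overflow depends on the JVM's stack size, so treat the second result as what usually happens with default settings rather than a guarantee:

```clojure
;; Hypothetical names; same logic as is-member in the question.
(defn member-plain? [elem coll]
  (cond
    (empty? coll) false
    (= elem (first coll)) true
    :else (member-plain? elem (rest coll)))) ; ordinary call: one stack frame per element

(defn member-recur? [elem coll]
  (cond
    (empty? coll) false
    (= elem (first coll)) true
    :else (recur elem (rest coll))))         ; recur: jumps back, no new frame

;; One million elements, none of them matching: runs in constant stack space.
(member-recur? -1 (range 1000000))

;; The plain version is expected to overflow here with default JVM settings;
;; the try/catch only keeps the REPL session tidy.
(def plain-result
  (try
    (member-plain? -1 (range 1000000))
    (catch StackOverflowError _
      :stack-overflow)))
```

On small inputs both functions return the same answers.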
Are there cases where you specifically should not use recur?
Only when you can't use it, which is when
it is not tail recursive - i.e. it wants to do something with the return value.
the recursion is to a different function.
Some examples:
(defn foo [coll]
(when coll
(println (first coll))
(recur (next coll)))) ;; OK: Tail recursive
(defn fie [coll]
(when coll
(cons (first coll)
(fie (next coll))))) ;; Can't use recur: Not tail recursive.
(defn fum
([coll]
(fum coll [])) ;; Can't use recur: different arity, effectively a different function.
([coll acc]
(if (empty? coll) acc
(recur (next coll) ;; OK: Tail recursive
(conj acc (first coll))))))
As to why recur isn't inserted automatically when appropriate: I don't know, but at least one positive side-effect is to make actual function calls visually distinct from the non-calls (i.e. recur).
Since this can be the difference between "works" and "blows up with StackOverflowError", I think it's a fair design choice to make it explicit - visible in the code - rather than implicit, where you would have to start second-guessing the compiler when it doesn't work as expected.

Related

Is this Scheme function recursive?

Given the following function, am I allowed to say that it is recursive? Why I ask this question is because the 'fac' function actually doesn't call itself recursively, so am I still allowed to say that it is a recursive function even though the only function that calls itself is fac-iter?
(define (fac n)
(define (fac-iter counter result)
(if (= counter 0)
result
(fac-iter (- counter 1) (* result counter))))
(fac-iter n 1))
fac is not recursive: it does not refer to its own definition in any way.
fac-iter is recursive: it does refer to its own definition. But in Scheme it will create an iterative process, since its calls to itself are tail calls.
(In casual speech I think people would often say that neither fac nor fac-iter is recursive in Scheme, but I think speaking more precisely the above is correct.)
One problem with calling fac formally recursive is that fac-iter is liftable out of fac. You can write the code like this:
(define (fac-iter counter result)
(if (= counter 0)
result
(fac-iter (- counter 1) (* result counter))))
(define (fac n)
(fac-iter n 1))
fac is an interface which is implemented by a recursive helper function.
If the helper function had a reason to be inside fac, such as accessing the parent's local variables, then there would be more justification for calling fac formally recursive: a significant piece of the interior of fac, a local function doing the bulk of the work, is internally recursive, and that interior cannot be moved to the top level without some refactoring.
Informally we can call fac recursive regardless if what we mean by that is that the substantial bulk of its work is done by a recursive approach. We are emphasizing the algorithm that is used, not the details over how it is integrated.
If a homework problem states "please implement a recursive solution to the binary search problem", and the solution is required to take the form of a bsearch.scm file, then obviously the problem statement doesn't mean that the bsearch.scm file must literally invoke itself, right? It means that the main algorithmic content in that file is recursive.
Or when we say that "the POSIX utility find performs a recursive traversal of the filesystem" we don't mean that find forks and executes a copy of itself for every directory it visits.
There is room for informality: calling something recursive without meaning that the entry point of that thing which has that thing's name is literally calling itself.
On another note, in some situations the term "recursion" in the Scheme context is used to denote recursion that requires stack storage; tail calls that are required to be rewritten to express iteration aren't called recursion. That's just taking the point of view of the implementation: what the compiled code is doing. Tail calls are sometimes called "stackless recursion" as a kind of compromise. The situation is complicated because tail calls alone do not eliminate true recursion. There is a way of compiling programs such that all procedure calls become tail calls, namely transformation to CPS (continuation passing style). Yet if the source program performs true recursion that requires a stack, the CPS-transformed program will also, in spite of using nothing but tail calls. What will happen is that an ad hoc stack will emerge via a chain of captured lambda environments. A lambda being used as a continuation captures the previous continuation as a lexical variable. The previous continuation itself captures another such continuation in its environment, and so on. A heap-allocated chain emerges which constitutes the de facto return stack for the recursion. For reasons like this we cannot automatically conclude that when we see tail calls, we have iteration and not recursion.
An example looks like this. The traversal of a binary tree is truly recursive, right? When we visit the left child, that visitation must return, so that we can then visit the right child. The right child visit can be a tail call, but the left one isn't. Under CPS, they can both be tail calls:
(define (traverse tree contin)
(cond
[(null? tree) (contin)] ;; tail call to continuation
[else (traverse (tree-left tree) ;; tail call to traverse
(lambda ()
(traverse (tree-right tree) contin)))])) ;; ditto!
So here, when the left node is traversed, that is a tail call: the last thing our procedure does is call (traverse (tree-left tree) (lambda ...)). But it passes that lambda as a continuation, and that continuation contains more statements to execute when it is invoked, which is essentially the same as if control returned there via a procedure return. If we take the point of view that tail calls aren't recursion then we are justified in saying that the function isn't recursive. Yet it has the recursive control flow structure, uses storage proportional to the left depth of the tree, and does so without appearing to maintain an explicit stack structure. As if that weren't enough, the following obviously recursive program can be automatically converted to the above:
(define (traverse tree)
(cond
[(null? tree)] ;; return
[else (traverse (tree-left tree))
(traverse (tree-right tree))]))
The CPS transformation inserts the continuations and lambdas, turning everything into tail calls that pass a continuation argument.

How to write a Racket function using tail recursion with a couple of conditionals

I'm having a problem understanding how to set up a racket function that has a couple conditionals for tail recursion. Normally with one conditional, I would set up the helper function and assign acc to my base case, and then call the helper function. With multiple conditionals though I'm confused on how to proceed.
Please provide more specifics. But in general, you can simply call any function of your choosing in tail position.
So with "couple of conditionals" you join them into one cond with several clauses, and in each clause's tail position you're free to call your function again (or any other):
(cond
(if this then do this)
(if this then do that))
(cond
(if this then do the other))
joined together becomes
(cond
(if this then do this)
(if this then do that)
(if this then do the other))
Now this, that, and other at the end of each clause are each in tail position, provided the whole cond form is in tail position.
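For a concrete picture of "several clauses, each ending in a tail call", here is a small sketch. It is written in Clojure rather than Racket only to stay in the same language as the other examples on this page; the shape (one cond, an accumulator, a self-call at the end of each clause) carries over to Racket directly. The function count-signs and its keys are invented for illustration:

```clojure
;; Hypothetical example: tally negative, zero and positive numbers.
;; Every non-base clause ends in a recur, so each one is a tail call.
(defn count-signs [nums]
  (loop [xs  (seq nums)
         acc {:neg 0 :zero 0 :pos 0}]   ; acc starts at the base-case value
    (cond
      (nil? xs)          acc                                    ; base case
      (neg? (first xs))  (recur (next xs) (update acc :neg inc))
      (zero? (first xs)) (recur (next xs) (update acc :zero inc))
      :else              (recur (next xs) (update acc :pos inc)))))

(count-signs [-1 0 3 4]) ;; => {:neg 1, :zero 1, :pos 2}
```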

Limit the recursion level in a Clojure repl

While working in the repl, is there a way to specify the maximum number of times to recur before the repl will automatically end the evaluation of an expression? As an example, suppose the following function:
(defn looping []
(loop [n 1]
(recur (inc n))))
(looping)
Is there a way to instruct the repl to give up after 100 levels of recursion? Something similar to print-level.
I respectfully hope that I'm not ignoring the spirit of your question, but why not simply use a when expression? It's nice and succinct and wouldn't change the body of your function much at all (1 extra line and a closing paren).
Whilst I don't believe what you want exists, it would be trivial to implement your own:
(def ^:dynamic *recur-limit* 100)
(defn looping []
(loop [n 1]
(when (< n *recur-limit*)
(recur (inc n)))))
Presumably this hasn't been added to the language because it's easy to construct what you need with the existing language primitives; apart from that, if the facility did exist but was 'invisible', it could cause an awful lot of confusion and bugs because code wouldn't always behave in a predictable and referentially transparent manner.
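Because the limit is a dynamic var, it can be changed per call with binding instead of editing the function. A small sketch along those lines; looping-up-to-limit is a made-up variant of the function above that returns the counter so the effect is visible:

```clojure
(def ^:dynamic *recur-limit* 100)

;; Hypothetical variant of looping: returns the value of n at which it gave up.
(defn looping-up-to-limit []
  (loop [n 1]
    (if (< n *recur-limit*)
      (recur (inc n))
      n)))

(looping-up-to-limit)                           ;; => 100
(binding [*recur-limit* 5]
  (looping-up-to-limit))                        ;; => 5
```

The limit stays visible in the code, which fits the point about avoiding an invisible facility.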

Is recursion a smell (in idiomatic Clojure) because of zippers and HOFs?

The classic book The Little Lisper (The Little Schemer) is founded on two big ideas
You can solve most problems in a recursive way (instead of using loops) (assuming you have Tail Call Optimisation)
Lisp is great because it is easy to implement in itself.
Now one might think this holds true for all Lispy languages (including Clojure). The trouble is, the book is an artefact of its time (1989), probably before Functional Programming with Higher Order Functions (HOFs) was what we have today (or at least before it was considered palatable for undergraduates).
The benefit of recursion (at least in part) is the ease of traversal of nested data structures like ('a 'b ('c ('d 'e))).
For example:
(def leftmost
(fn [l]
(println "(leftmost " l)
(println (non-atom? l))
(cond
(null? l) '()
(non-atom? (first l)) (leftmost (first l))
true (first l))))
Now with Functional Zippers - we have a non-recursive approach to traversing nested data structures, and can traverse them as we would any lazy data structure. For example:
(defn map-zipper [m]
(zip/zipper
(fn [x] (or (map? x) (map? (nth x 1))))
(fn [x] (seq (if (map? x) x (nth x 1))))
(fn [x children]
(if (map? x)
(into {} children)
(assoc x 1 (into {} children))))
m))
(def m {:a 3 :b {:x true :y false} :c 4})
(-> (map-zipper m) zip/down zip/right zip/node)
;;=> [:b {:y false, :x true}]
Now it seems you can solve any nested list traversal problem with either:
a zipper as above, or
a zipper that walks the structure and returns a set of keys that will let you modify the structure using assoc.
Assumptions:
I'm assuming of course data structures that are fixed-size, and fully known prior to traversal
I'm excluding the streaming data source scenario.
My question is: Is recursion a smell (in idiomatic Clojure) because of zippers and HOFs?
I would say that, yes, if you are doing manual recursion you should at least reconsider whether you need to. But I wouldn't say that zippers have anything to do with this. My experience with zippers has been that they are of theoretical use, and are very exciting to Clojure newcomers, but of little practical value once you get the hang of things, because the situations in which they are useful are vanishingly rare.
It's really because of higher-order functions that have already implemented the common recursive patterns for you that manual recursion is uncommon. However, it's certainly not the case that you should never use manual recursion: it's just a warning sign, suggesting you might be able to do something else. I can't even recall a situation in my four years of using Clojure that I've actually needed a zipper, but I end up using recursion fairly often.
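As a rough illustration of HOFs standing in for hand-written recursion, here are two small sketches. All three function names are invented for this example; some and reduce are the clojure.core functions:

```clojure
;; Membership test: the manual is-member style loop collapses to `some`.
(defn member? [elem coll]
  (boolean (some #(= elem %) coll)))

;; Summation written by hand with loop/recur ...
(defn sum-manual [coll]
  (loop [xs (seq coll)
         acc 0]
    (if xs
      (recur (next xs) (+ acc (first xs)))
      acc)))

;; ... and the same accumulate-over-a-sequence pattern via `reduce`.
(defn sum-hof [coll]
  (reduce + 0 coll))

(member? 3 [1 2 3])      ;; => true
(sum-hof (range 101))    ;; => 5050
```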
Clojure idioms discourage explicit recursion because the call stack is limited: usually to about 10K deep. Amending the first of Halloway & Bedra's Six Rules of Clojure Functional Programming (Programming Clojure (p 89)),
Avoid unbounded recursion. The JVM cannot optimize recursive calls and
Clojure programs that recurse without bound will blow their stack.
There are a couple of palliatives:
recur deals with tail recursion.
Lazy sequences can turn a deep call stack into a shallow call stack
across an unfolding data structure. Many HOFs in the sequence
library, such as map and filter, do this.
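The second palliative can be sketched like this. The function countdown is a made-up example; the self-call is not a tail call (its result goes to cons), but because it sits inside lazy-seq it only runs when the next element is requested, so the stack stays shallow:

```clojure
;; Hypothetical example: a lazily built sequence n, n-1, ..., 1.
(defn countdown [n]
  (lazy-seq
    (when (pos? n)
      (cons n (countdown (dec n))))))   ; deferred until the tail is realized

;; A million "recursive" steps, realized one at a time:
(count (countdown 1000000))                             ;; => 1000000

;; map and filter are lazy too, so they compose without deep stacks:
(take 3 (map inc (filter even? (countdown 1000000))))   ;; => (1000001 999999 999997)
```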

recursion in implementation of "partition" function

I was randomly reading through Clojure source code and I saw that partition function was defined in terms of recursion without using "recur":
(defn partition
... ...
([n step coll]
(lazy-seq
(when-let [s (seq coll)]
(let [p (doall (take n s))]
(when (= n (count p))
(cons p (partition n step (nthrest s step))))))))
... ...)
Is there any reason of doing this?
Partition is lazy. The recursive call to partition occurs within the body of a lazy-seq. Therefore, it is not immediately invoked but returned in a special seq-able object that is evaluated when needed and caches the results realized thus far. The stack depth is limited to one invocation at a time.
A recur without a lazy-seq could be used to create an eager version, but you would not want to use it on sequences of indeterminate length as you can with the version in core.
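For comparison, an eager version might look roughly like the following. This is only a sketch, not anything from clojure.core: partition-eager is a made-up name, it covers just the [n step coll] arity, and it assumes n and step are positive:

```clojure
;; Hypothetical eager counterpart to the 3-arity partition.
;; The accumulator makes the self-call a tail call, so recur applies,
;; but the whole input is consumed before anything is returned.
(defn partition-eager [n step coll]
  (loop [s   (seq coll)
         acc []]
    (let [p (doall (take n s))]
      (if (= n (count p))
        (recur (seq (nthrest s step)) (conj acc p))
        acc))))

(partition-eager 2 2 (range 5))   ;; => [(0 1) (2 3)]
```

Calling it on an infinite sequence such as (range) would never return, which is the drawback described above.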
To build on @A.Webb's answer and @amalloy's comment:
recur is not a shorthand to call the function and it's not a function. It's a special form (what in another language you would call a syntax) to perform tail call optimization.
Tail call optimization is a technique that allows recursion to be used without blowing up the stack (without it, every recursive call adds its call frame to the stack). It is not implemented natively by the JVM (yet), which is why recur is used to mark a tail call in Clojure.
Recursion using lazy-seq is different because it delays the recursive call by wrapping it in a closure. What it means is that a call to a function implemented in terms of lazy-seq (and in particular every recursive call in such a function) returns (immediately) a LazySeq sequence, whose computation is delayed until it is iterated through.
To illustrate and qualify @amalloy's comment that recur and laziness are mutually exclusive, here's an implementation of filter that uses both techniques:
(defn filter [pred coll]
(letfn [(step [pred coll]
(when-let [[x & more] (seq coll)]
(if (pred x)
(cons x (lazy-seq (step pred more))) ;; lazy recursive call
(recur pred more))))] ;; tail call
(lazy-seq (step pred coll))))
(filter even? (range 10))
;; => (0 2 4 6 8)
Both techniques can be used in the same function, but not for the same recursive call; the function wouldn't compile if the lazy recursive call to step used recur because in this case recur wouldn't be in tail call position (the result of the tail call would not be returned directly but would be passed to lazy-seq).
All of the lazy functions are written this way. This implementation of partition would blow the stack without the call to lazy-seq for a large enough sequence.
Read a bit about TCO (tail call optimization) if you are more interested in how recur works. When you are using tail recursion it means you can jump out of your current function call without losing anything. In the case of this implementation you wouldn't be able to do that because you are cons-ing p onto the result of the next call of partition. Being in tail position means you are the last thing being called. In the implementation cons is in tail position. recur only works in tail position to guarantee TCO.
