Recursively run through a vector in Clojure

I'm just starting to play with Clojure.
How do I run through a vector of items?
My naive recursive function would have a form like the classic map, e.g.
(defn map [f xs]
  (if (= xs [])
    []
    (cons (f (first xs)) (map f (rest xs)))))
The thing is, I can't find any examples of this kind of code on the web. I find a lot of examples using built-in sequence-traversing functions like for, map, and loop, but no one writing the raw recursive version.
Is that because you SHOULDN'T do this kind of thing in Clojure (e.g. because it relies on lower-level Java primitives that don't have tail-call optimisation or something)?

When you say "run through a vector", this is quite vague. As Clojure is a Lisp, and thus specializes in sequence analysis and manipulation, the beauty of using this language is that you don't think in terms of "run through a vector and then do something with each element"; instead you'd more idiomatically say "pull this out of a vector", or "transform this vector into X", or "I want this vector to give me X".
It is because of this perspective in Lisp languages that you will see so many examples, and so much production code, that don't just loop/recur through a vector but rather go specifically after what is wanted in a short, idiomatic way. Simple functions like reduce, map, filter, for, and into allow you to elegantly move over a sequence such as a vector while simultaneously doing what you want with the contents. In most other languages, this would take at least two separate parts: the loop, and then the actual logic to do what you want.
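For instance, a quick sketch of that style (the data and functions here are my own illustration, not from the answer):
(def xs [1 2 3 4 5])

;; "pull this out of a vector": keep only the even numbers
(filter even? xs)      ;=> (2 4)

;; "transform this vector into X": increment everything, back into a vector
(into [] (map inc) xs) ;=> [2 3 4 5 6]

;; "I want this vector to give me X": its sum
(reduce + xs)          ;=> 15
Each form traverses and transforms in a single expression; there is no separate loop.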
You'll often find that if you think about sequences with the more imperative mindset you get from languages like C, C++, or Java, your code is about 4x longer (at least) than it would be if you first thought through your plan with a more functional approach.

Clojure re-uses stack frames only with tail recursion, and only when you use the explicit recur form. Everything else will consume stack. The above map example is not tail recursive, because the cons happens after the recursive call, so it can't be TCO'd in any language. If you switch it to an accumulator-passing (or continuation-passing) style and use an explicit recur instead of the direct call to map, you should be good to go.
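Here is one way that rewrite could look; a minimal sketch of the accumulator variant using loop/recur (my-map is my name, chosen to avoid shadowing clojure.core/map):
(defn my-map [f xs]
  (loop [xs  xs
         acc []]
    (if (empty? xs)
      acc
      ;; recur re-uses the current stack frame, so this consumes no stack
      (recur (rest xs) (conj acc (f (first xs)))))))

(my-map inc [1 2 3]) ;=> [2 3 4]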

Related

Filter a range without using an intermediate list

I have written a function is-prime that verifies whether a given number is a prime number or not, and returns t or nil accordingly.
(is-prime 2) ; => T
(is-prime 3) ; => T
(is-prime 4) ; => NIL
So far, so good. Now I would like to generate a list of prime numbers between min and max, so I would like to come up with a function that takes those two values as parameters and returns a list of all prime numbers:
(defun get-primes (min max)
  ...)
Now this is where I am currently stuck. What I could do, of course, is create a list with the range of all numbers from min to max and run remove-if-not on it.
Anyway, this means that I first have to create a potentially huge list with lots of numbers that I throw away anyway. So I would like to do it the other way round: build up a list that contains only the numbers between min and max that actually are prime according to the is-prime predicate.
How can I do this in a functional way, i.e. without using loop? My current approach (with loop) looks like this:
(defun get-primes (min max)
  (loop
    for guess from min to max
    when (is-prime guess)
    collect guess))
Maybe this is a totally dumb question, but I guess I don't see the forest for the trees.
Any hints?
Common Lisp does not favor pure functional programming approaches. The language is based on a more pragmatic view of an underlying machine: no TCO, stacks are limited, various resources are limited (the number of arguments allowed, etc.), and mutation is possible. That does not sound very motivating to a Haskell enthusiast, but Common Lisp was developed to write Lisp applications, not to advance FP research and development. Evaluation in Common Lisp (as usual in Lisp) is strict, not lazy. The default data structures are also not lazy, though there are lazy libraries. Common Lisp also does not provide, in the standard, features like continuations or coroutines, which might be useful in this case.
The default Common Lisp approach is non-functional and uses some kind of iteration construct: DO, LOOP or the ITERATE library. Like in your example. Common Lisp users find it perfectly fine. Some, like me, think that Iterate is the best looping construct ever invented. ;-) The advantage is that it is relatively easy to create efficient code, even though the macros have a large implementation.
There are various ways to do it differently:
Lazy streams. Reading SICP helps; see the part on streams. This can easily be done in Common Lisp, too. Note that Common Lisp uses the word stream for I/O streams, which is something different. (A Clojure sketch of the lazy idea appears at the end of this answer.)
Use a generator function next-prime. Generate this function from the initial range and call it until you have all the primes you are interested in.
Here is a simple example of a generator function, generating even numbers from some start value:
(defun make-next-even-fn (start)
  (let ((current (- start
                    (if (evenp start) 2 1))))
    (lambda ()
      (incf current 2))))
CL-USER 14 > (let ((next-even-fn (make-next-even-fn 13)))
               (flet ((get-next-even ()
                        (funcall next-even-fn)))
                 (print (get-next-even))
                 (print (get-next-even))
                 (print (get-next-even))
                 (print (get-next-even))
                 (list (get-next-even)
                       (get-next-even)
                       (get-next-even))))
14
16
18
20
(22 24 26)
Defining the next-prime generator is left as an exercise. ;-)
Use some kind of more sophisticated lazy data structure. For Common Lisp there is Series, which was an early iteration proposal, also published in CLtL2. The book Common Lisp Recipes by Edi Weitz (a math professor in Hamburg) has a Series example for computing primes; see chapter 7.15. You can download the source code for the book and find the example there.
There are simpler variants of Series, for example pipes.
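For comparison, here is how the lazy approach looks in Clojure, where range is already lazy; a sketch with a naive stand-in for the question's is-prime:
(defn is-prime [n]
  ;; naive trial division, standing in for the question's is-prime
  (and (> n 1)
       (not-any? #(zero? (mod n %)) (range 2 n))))

(defn get-primes [min max]
  ;; range produces its numbers lazily, so no full intermediate
  ;; list is built up front; filter pulls them through on demand
  (filter is-prime (range min (inc max))))

(get-primes 10 30) ;=> (11 13 17 19 23 29)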

Fixed-Point Combinators

I am new to the world of fixed-point combinators, and I gather they are used to recurse on anonymous lambdas, but I haven't really gotten to use them, or even been able to wrap my head around them completely.
I have seen the example in Javascript for a Y-combinator but haven't been able to successfully run it.
The question here is, can some one give an intuitive answer to:
What are Fixed-point combinators, (not just theoretically, but in context of some example, to reveal what exactly is the fixed-point in that context)?
What are the other kinds of fixed-point combinators, apart from the Y-combinator?
Bonus Points: If the example is not just in one language, preferably in Clojure as well.
UPDATE:
I have been able to find a simple example in Clojure, but still find it difficult to understand the Y-Combinator itself:
(defn Y [r]
  ((fn [f] (f f))
   (fn [f]
     (r (fn [x] ((f f) x))))))
Though the example is concise, I find it difficult to understand what is happening within the function. Any help provided would be useful.
Suppose you wanted to write the factorial function. Normally, you would write it as something like
function fact(n) = if n=0 then 1 else n * fact(n-1)
But that uses explicit recursion. If you wanted to use the Y-combinator instead, you could first abstract fact as something like
function factMaker(myFact) = lambda n. if n=0 then 1 else n * myFact(n-1)
This takes an argument (myFact) which it calls where the "true" fact would have called itself. I call this style of function "Y-ready", meaning it's ready to be fed to the Y-combinator.
The Y-combinator uses factMaker to build something equivalent to the "true" fact.
newFact = Y(factMaker)
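Since the question asked for Clojure, here is that recipe spelled out against the Y from the question's update (fact-maker is my rendering of factMaker):
(defn fact-maker [my-fact]
  (fn [n]
    (if (zero? n) 1 (* n (my-fact (dec n))))))

(def new-fact (Y fact-maker)) ; the equivalent of the "true" fact

(new-fact 5) ;=> 120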
Why bother? Two reasons. The first is theoretical: we don't really need recursion if we can "simulate" it using the Y-combinator.
The second is more pragmatic. Sometimes we want to wrap each function call with some extra code to do logging or profiling or memoization or a host of other things. If we try to do this to the "true" fact, the extra code will only be called for the original call to fact, not all the recursive calls. But if we want to do this for every call, including all the recursive calls, we can do something like
loggingFact = LoggingY(factMaker)
where LoggingY is a modified version of the Y combinator that introduces logging. Notice that we did not need to change factMaker at all!
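A minimal Clojure sketch of that idea (logging-Y is my own name and implementation, since the answer doesn't give one): it is the question's Y with a println wrapped around the self-application, so every recursive call gets logged, and fact-maker is untouched:
(defn logging-Y [r]
  ((fn [f] (f f))
   (fn [f]
     (r (fn [x]
          (println "recursive call with" x) ; the wrapped-in extra behavior
          ((f f) x))))))

((logging-Y fact-maker) 3)
;; recursive call with 2
;; recursive call with 1
;; recursive call with 0
;;=> 6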
All this is more motivation for why the Y-combinator matters than a detailed explanation of how that particular implementation of Y works (because there are many different ways to implement Y).
To answer your second question, about fixed-point combinators other than Y: there are countably infinitely many standard fixed-point combinators, that is, combinators fix that satisfy the equation
fix f = f (fix f)
There are also countably many non-standard fixed-point combinators, which satisfy the equation
fix f = f (f (fix f))
etc. Standard fixed-point combinators are recursively enumerable, but non-standard ones are not. Please see the following web page for examples, references, and discussion.
http://okmij.org/ftp/Computation/fixed-point-combinators.html#many-fixes

What are best practices for including parameters such as an accumulator in functions?

I've been writing more Lisp code recently. In particular, recursive functions that take some data and build a resulting data structure. Sometimes it seems I need to pass two or three pieces of information to the next invocation of the function, in addition to the user-supplied data. Let's call these accumulators.
What is the best way to organize these interfaces to my code?
Currently, I do something like this:
(defun foo (user1 user2 &optional acc1 acc2 acc3)
  ;; do something
  (foo user1 user2 (cons x acc1) (cons y acc2) (cons z acc3)))
This works as I'd like it to, but I'm concerned because I don't really need to present the &optional parameters to the programmer.
Three approaches I'm considering:
have a wrapper function that the user is encouraged to use, which immediately invokes the extended definition.
use labels internally within a function whose signature is concise.
just start using a loop and variables. However, I'd prefer not to, since I'd like to really wrap my head around recursion.
Thanks guys!
If you want to write idiomatic Common Lisp, I'd recommend the loop and variables for iteration. Recursion is cool, but it's only one tool of many for the Common Lisper. Besides, tail-call elimination is not guaranteed by the Common Lisp spec.
That said, I'd recommend the labels approach if you have a structure, a tree for example, that is unavoidably recursive and you can't get tail calls anyway. Optional arguments let your implementation details leak out to the caller.
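To make the labels idea concrete, here is its shape transposed to Clojure's letfn; the function and its accumulators are my own example, not the poster's:
(defn split-parity [nums]
  ;; the accumulators evens and odds stay hidden inside the helper,
  ;; so the public signature takes only the user-supplied data
  (letfn [(step [nums evens odds]
            (cond
              (empty? nums)        [evens odds]
              (even? (first nums)) (recur (rest nums) (conj evens (first nums)) odds)
              :else                (recur (rest nums) evens (conj odds (first nums)))))]
    (step nums [] [])))

(split-parity [1 2 3 4 5]) ;=> [[2 4] [1 3 5]]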
Your impulse to shield implementation details from the user is a smart one, I think. I don't know Common Lisp, but in Scheme you do it by defining your helper function in the public function's lexical scope.
(define (fibonacci n)
  (let fib-accum ((a 0)
                  (b 1)
                  (n n))
    (if (< n 1)
        a
        (fib-accum b (+ a b) (- n 1)))))
The let expression defines a function and binds it to a name that's only visible within the let, then invokes the function.
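Clojure has no named let, but the same shape is usually written with loop/recur (a sketch for comparison):
(defn fibonacci [n]
  (loop [a 0
         b 1
         n n]
    (if (< n 1)
      a
      (recur b (+ a b) (dec n)))))

(fibonacci 10) ;=> 55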
I have used all the options you mention. All have their merits, so it boils down to personal preference.
I have arrived at using whatever I deem appropriate. If I think that leaving the &optional accumulators in the API might make sense for the user, I leave it in. For example, in a reduce-like function, the accumulator can be used by the user for providing a starting value. Otherwise, I'll often rewrite it as a loop, do, or iter (from the iterate library) form, if it makes sense to perceive it as such. Sometimes, the labels helper is also used.

Are there problems that cannot be written using tail recursion?

Tail recursion is an important performance optimisation strategy in functional languages because it allows recursive calls to consume constant stack space (rather than O(n)).
Are there any problems that simply cannot be written in a tail-recursive style, or is it always possible to convert a naively-recursive function into a tail-recursive one?
If so, one day might functional compilers and interpreters be intelligent enough to perform the conversion automatically?
Yes, actually you can take some code and convert every function call—and every return—into a tail call. What you end up with is called continuation-passing style, or CPS.
For example, here's a function containing two recursive calls:
(define (count-tree t)
  (if (pair? t)
      (+ (count-tree (car t)) (count-tree (cdr t)))
      1))
And here's how it would look if you converted this function to continuation-passing style:
(define (count-tree-cps t ctn)
  (if (pair? t)
      (count-tree-cps (car t)
                      (lambda (L)
                        (count-tree-cps (cdr t)
                                        (lambda (R) (ctn (+ L R))))))
      (ctn 1)))
The extra argument, ctn, is a procedure which count-tree-cps tail-calls instead of returning. (sdcvvc's answer says that you can't do everything in O(1) space, and that is correct; here each continuation is a closure which takes up some memory.)
I didn't transform the calls to car or cdr or + into tail-calls. That could be done as well, but I assume those leaf calls would actually be inlined.
Now for the fun part. Chicken Scheme actually does this conversion on all code it compiles. Procedures compiled by Chicken never return. There's a classic paper explaining why Chicken Scheme does this, written in 1994 before Chicken was implemented: CONS should not cons its arguments, Part II: Cheney on the M.T.A.
Surprisingly enough, continuation-passing style is fairly common in JavaScript. You can use it to do long-running computation, avoiding the browser's "slow script" popup. And it's attractive for asynchronous APIs. jQuery.get (a simple wrapper around XMLHttpRequest) is clearly in continuation-passing style; the last argument is a function.
It's true but not useful to observe that any collection of mutually recursive functions can be turned into a tail-recursive function. This observation is on a par with the old chestnut from the 1960s that control-flow constructs could be eliminated because every program could be written as a loop with a case statement nested inside.
What's useful to know is that many functions which are not obviously tail-recursive can be converted to tail-recursive form by the addition of accumulating parameters. (An extreme version of this transformation is the transformation to continuation-passing style (CPS), but most programmers find the output of the CPS transform difficult to read.)
Here's an example of a function that is "recursive" (actually it's just iterating) but not tail-recursive:
factorial n = if n == 0 then 1 else n * factorial (n-1)
In this case the multiply happens after the recursive call.
We can create a version that is tail-recursive by putting the product in an accumulating parameter:
factorial n = f n 1
  where f n product = if n == 0 then product else f (n-1) (n * product)
The inner function f is tail-recursive and compiles into a tight loop.
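The same accumulating-parameter transformation in Clojure, where an explicit recur gives you the tight loop directly (a sketch):
(defn factorial [n]
  (loop [n n
         product 1]
    (if (zero? n)
      product
      (recur (dec n) (* n product)))))

(factorial 5) ;=> 120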
I find the following distinctions useful:
In an iterative or recursive program, you solve a problem of size n by first solving one subproblem of size n-1. Computing the factorial function falls into this category, and it can be done either iteratively or recursively. (This idea generalizes, e.g., to the Fibonacci function, where you need both n-1 and n-2 to solve n.)
In a recursive program, you solve a problem of size n by first solving two subproblems of size n/2. Or, more generally, you solve a problem of size n by first solving a subproblem of size k and one of size n-k, where 1 < k < n. Quicksort and mergesort are two examples of this kind of problem, which can easily be programmed recursively, but is not so easy to program iteratively or using only tail recursion. (You essentially have to simulate recursion using an explicit stack.)
In dynamic programming, you solve a problem of size n by first solving all subproblems of all sizes k, where k < n. Finding the shortest route from one point to another on the London Underground is an example of this kind of problem. (The London Underground is a multiply-connected graph, and you solve the problem by first finding all points for which the shortest path is 1 stop, then those for which it is 2 stops, etc.)
Only the first kind of program has a simple transformation into tail-recursive form.
Any recursive algorithm can be rewritten as an iterative algorithm (perhaps requiring a stack or list) and iterative algorithms can always be rewritten as tail-recursive algorithms, so I think it's true that any recursive solution can somehow be converted to a tail-recursive solution.
(In comments, Pascal Cuoq points out that any algorithm can be converted to continuation-passing style.)
Note that just because something is tail-recursive doesn't mean that its memory usage is constant. It just means that the call-return stack doesn't grow.
You can't do everything in O(1) space (space hierarchy theorem). If you insist on using tail recursion, then you can store the call stack as one of the arguments. Obviously this doesn't change anything; somewhere internally, there is a call stack, you're simply making it explicitly visible.
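For instance, here is a Clojure sketch of that trick applied to the count-tree example from earlier in the thread, using nested vectors as trees; the stack argument plays the role of the call stack:
(defn count-tree [t]
  (loop [stack [t] ; the explicit stack, replacing the implicit call stack
         n     0]
    (if (empty? stack)
      n
      (let [top  (peek stack)
            more (pop stack)]
        (if (sequential? top)
          (recur (into more top) n)  ; "push" the children
          (recur more (inc n)))))))  ; count a leaf

(count-tree [1 [2 [3 4]] 5]) ;=> 5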
If so, one day might functional compilers and interpreters be intelligent enough to perform the conversion automatically?
Such conversion will not decrease space complexity.
As Pascal Cuoq commented, another way is to use CPS; all calls are tail recursive then.
I don't think something like tak could be implemented using only tail calls (not allowing continuations).

Nested functions: Improper use of side-effects?

I'm learning functional programming, and have tried to solve a couple of problems in a functional style. One thing I experienced, while dividing up my problem into functions, was that it seemed I had two options: use several disparate functions with similar parameter lists, or use nested functions which, as closures, can simply refer to bindings in the parent function.
Though I ended up going with the second approach, because it made function calls smaller and it seemed to "feel" better, from my reading it seems I may be missing one of the main points of functional programming, in that this seems "side-effecty". Now granted, these nested functions cannot modify the outer bindings, as the language I was using prevents that, but if you look at each individual inner function, you can't say "given the same parameters, this function will return the same results", because they do use variables from the parent scope... am I right?
What is the desirable way to proceed?
Thanks!
Functional programming isn't all-or-nothing. If nesting the functions makes more sense, I'd go with that approach. However, If you really want the internal functions to be purely functional, explicitly pass all the needed parameters into them.
Here's a little example in Scheme:
(define (foo a)
  (define (bar b)
    (+ a b)) ; getting a from the outer scope: not purely functional
  (bar 3))

(define (foo a)
  (define (bar a b)
    (+ a b)) ; getting a from the function parameters: purely functional
  (bar a 3))

(define (bar a b) ; since this is purely functional, we can remove it from its
  (+ a b))        ; environment and it still works

(define (foo a)
  (bar a 3))
Personally, I'd go with the first approach, but either will work equally well.
Nesting functions is an excellent way to divide up the labor in many functions. It's not really "side-effecty"; if it helps, think of the captured variables as implicit parameters.
One example where nested functions are useful is to replace loops. The parameters to the nested function can act as induction variables which accumulate values. A simple example:
let factorial n =
  let rec facHelper p n =
    if n = 1 then p else facHelper (p*n) (n-1)
  in
  facHelper 1 n
In this case, it wouldn't really make sense to declare a function like facHelper globally, since users shouldn't have to worry about the p parameter.
Be aware, however, that it can be difficult to test nested functions individually, since they cannot be referred to outside of their parent.
Consider the following (contrived) Haskell snippet:
putLines :: [String] -> IO ()
putLines lines = putStr string
  where string = concat lines
string is a locally bound named constant. But isn't it also a function taking no arguments that closes over lines, and therefore not referentially transparent? (In Haskell, constants and nullary functions are indeed indistinguishable!) Would you consider the above code "side-effecty" or non-functional because of this?
