Practical use of fold/reduce in functional languages - functional-programming

Fold (aka reduce) is considered a very important higher order function. Map can be expressed in terms of fold (see here). But it sounds more academical than practical to me. A typical use could be to get the sum, or product, or maximum of numbers, but these functions usually accept any number of arguments. So why write (fold + 0 '(2 3 5)) when (+ 2 3 5) works fine. My question is, in what situation is it easiest or most natural to use fold?

The point of fold is that it's more abstract. It's not that you can do things that you couldn't before, it's that you can do them more easily.
Using a fold, you can generalize any function that is defined on two elements to apply to an arbitrary number of elements. This is a win because it's usually much easier to write, test, maintain and modify a single function that applies two arguments than to a list. And it's always easier to write, test, maintain, etc. one simple function instead of two with similar-but-not-quite functionality.
Since fold (and for that matter, map, filter, and friends) have well-defined behaviour, it's often much easier to understand code using these functions than explicit recursion.
Basically, once you have the one version, you get the other "for free". Ultimately, you end up doing less work to get the same result.

Here are a few simple examples where reduce works really well.
Find the sum of the maximum values of each sub-list
Clojure:
user=> (def x '((1 2 3) (4 5) (0 9 1)))
#'user/x
user=> (reduce #(+ %1 (apply max %2)) 0 x)
17
Racket:
> (define x '((1 2 3) (4 5) (0 9 1)))
> (foldl (lambda (a b) (+ b (apply max a))) 0 x)
17
Construct a map from a list
Clojure:
user=> (def y '(("dog" "bark") ("cat" "meow") ("pig" "oink")))
#'user/y
user=> (def z (reduce #(assoc %1 (first %2) (second %2)) {} y))
#'user/z
user=> (z "pig")
"oink"
For a more complicated clojure example featuring reduce, check out my solution to Project Euler problems 18 & 67.
See also: reduce vs. apply

In Common Lisp functions don't accept any number of arguments.
There is a constant defined in every Common Lisp implementation CALL-ARGUMENTS-LIMIT, which must be 50 or larger.
This means that any such portably written function should accept at least 50 arguments. But it could be just 50.
This limit exists to allow compilers to possibly use optimized calling schemes and to not provide the general case, where an unlimited number of arguments could be passed.
Thus to really process large (larger than 50 elements) lists or vectors in portable Common Lisp code, it is necessary to use iteration constructs, reduce, map, and similar. Thus it is also necessary to not use (apply '+ large-list) but use (reduce '+ large-list).

Code using fold is usually awkward to read. That's why people prefer map, filter, exists, sum, and so on—when available. These days I'm primarily writing compilers and interpreters; here's some ways I use fold:
Compute the set of free variables for a function, expression, or type
Add a function's parameters to the symbol table, e.g., for type checking
Accumulate the collection of all sensible error messages generated from a sequence of definitions
Add all the predefined classes to a Smalltalk interpreter at boot time
What all these uses have in common is that they're accumulating information about a sequence into some kind of set or dictionary. Eminently practical.

Your example (+ 2 3 4) only works because you know the number of arguments beforehand. Folds work on lists the size of which can vary.
fold/reduce is the general version of the "cdr-ing down a list" pattern. Each algorithm that's about processing every element of a sequence in order and computing some return value from that can be expressed with it. It's basically the functional version of the foreach loop.

Here's an example that nobody else mentioned yet.
By using a function with a small, well-defined interface like "fold", you can replace that implementation without breaking the programs that use it. You could, for example, make a distributed version that runs on thousands of PCs, so a sorting algorithm that used it would become a distributed sort, and so on. Your programs become more robust, simpler, and faster.
Your example is a trivial one: + already takes any number of arguments, runs quickly in little memory, and has already been written and debugged by whoever wrote your compiler. Those properties are not often true of algorithms I need to run.

Related

Is there a intersection function for vectors?

Very often one finds statements that lists have a performance disadvantage compared to vectors because of consing and additional gc steps and some function work on generic sequences accepting lists and vectors.
But some functions like intersection expect two lists. Is there a library providing an alternative for vectors?
I started with something like this, but have the feeling that there should be a more mature solution.
(defun vec-intersec (vec-1 vec-2 &aux (result (make-array 0 :adjustable t :fill-pointer 0)))
"A simple implementation of intersection for vectors instead of lists."
(loop :for v1 :across vec-1
:if (find v1 vec-2 :test #'equal)
:do (vector-push-extend v1 result))
result)
It always depends on the size of your collection and what you want to do with it.
Below about 20 to 50 elements, lists are often perfectly OK even for random access (if you're not in a tight inner loop, or consing a lot).
If you already have vectors, it might be most convenient to sort one of them so that you can do a binary search instead of a naïve linear one. If that is not enough, and your collections bigger, putting the elements into a hash-table (as keys, with an appropriate :test) gives you faster (amortized) lookup.
This should take you quite far. If you identify an issue that cannot be solved in such a simple way, you might want to look into FSet or CL-Containers, which support more advanced data structures.

Why does apply throw a CONTROL-STACK-EXHAUSTED-ERROR on a large list?

(apply #'+ (loop for i from 1 to x collect 1))
works if x has value 253391, but fails with a (SB-KERNEL::CONTROL-STACK-EXHAUSTED-ERROR) on 253392*. This is orders of magnitude smaller than call-arguments-limit**.
Is recursion exhausting the stack? If so, is it in apply? Why hasn't it been optimized out?
*Also interesting, (apply #'max (loop for i from 1 to 253391 collect 1)) throws the error, but 253390 is fine.
**call-arguments-limit evaluates to 4611686018427387903 (with the help of format's ~R, it turns out this is four quintillion six hundred eleven quadrillion six hundred eighty-six trillion eighteen billion four hundred twenty-seven million three hundred eighty-seven thousand nine hundred three)
parameters that can be passed to a function in SBCL
You don't pass parameters. You pass arguments.
(defun foo (x y) (list x y))
x and y are parameters of the function foo.
(foo 20 22)
20 and 22 are arguments in a call of the function foo.
See the variables call-arguments-limit and lambda-parameters-limit.
SBCL and call-arguments-limit
If a function can't nearly handle the claimed number of arguments, then this looks like a bug in SBCL. You might want to report this bug. Maybe they need to change the value of call-arguments-limit.
Testing
APPLY is one way to test it.
Another:
(eval (append '(drop-params)
(loop for i from 1 to 2533911 collect 1)))
One can also use FUNCALL with a number of arguments spread out.
Why does a limit exist?
The Common Lisp standard was written to allow efficient implementations on various different computers. It was thought that some machine-level function calling implementations only support limited number of arguments. The standard says the number of supported arguments can be as low as 50. Actually some implementations have a relatively low number of supported arguments.
Thus apply in Common Lisp is not a tool for list processing, but to call functions with computed arglists.
For list and vector processing use REDUCE, instead of APPLY
If we want to sum all numbers in a list, replace
(apply #'+ list) ; don't use this
with
(reduce #'+ list) ; can handle arbitrary long lists
Recursion
apply is a non-optimized recursive function
I cannot see why the function APPLY should use recursion.
For example if you think of
(apply #'+ '(1 2 3 4 5))
The repeated summing of the arguments is done by the function + and not by apply.
This is different from
(reduce #'+ '(1 2 3 4 5))
where the repeated call of the function + with two arguments is done by reduce.
What's causing the stack exhaustion?
Though recursion is often a likely culprit for stack exhaustion, that is not true in this case. According to the SBCL Internals Manual:
In full call, the arguments are passed creating a partial frame on the stack top and storing stack arguments into that frame.
Each generated list element is stored in the new stack frame, quickly exhausting the stack. Presumably, passing SBCL a larger value through –control-stack-size would increase this practical limit to the number of arguments that can be passed in a function call.
Why is call-arguments-limit so much larger than the practical limit?
An SBCL mailing list response to someone with a similar problem explains why the practical limit of the stack size isn't reflected in call-arguments-limit:
The condition you're seeing here is not due to a fundamental implementation limitation, but rather because of the particular choice of stack size that someone chose -- if the stack size were bigger, this call of yours would not error. [...] Consider that, given our strategy of passing excess arguments on the stack, that the actual maximum number of arguments passable at any given time depends on the program state, and isn't in fact a constant.
Spec says that call-arguments-limit must be a constant, so SBCL seems to have defined it as most-positive-fixnum.
There are and a couple of bug reports discussing the issue, and a TODO in the source suggesting that at least one contributor feels it should be reduced to a less absurd value:
;; TODO: Reducing CALL-ARGUMENTS-LIMIT to something reasonable to
;; allow DWORD ops without it looking like a bug would make sense.
;; With a stack size of about 2MB, the limit is absurd anyway.
SBCL's particular way of implementing call-arguments-limit might have room for improvement and could lead to unexpected behaviour, but it does follow ANSI spec.
The practical limit varies depending on the space remaining on the stack, so defining call-arguments-limit according to this value would not obey the spec requirement for a constant value.

Filter a range without using an intermediate list

I have written a function is-prime that verifies whether a given number is a prime number or not, and returns t or nil accordingly.
(is-prime 2) ; => T
(is-prime 3) ; => T
(is-prime 4) ; => NIL
So far, so good. Now I would like to generate a list of prime numbers between min and max, so I would like to come up with a function that takes those two values as parameters and returns a list of all prime numbers:
(defun get-primes (min max)
...)
Now this is where I am currently stuck. What I could do, of course, is create a list with the range of all numbers from min to max and run remove-if-not on it.
Anyway, this means, that I first have to create a potentially huge list with lots of numbers that I throw away anyway. So I would like to do it the other way round, build up a list that contains only the numbers between min and max that actually are prime according to the is-prime predicate.
How can I do this in a functional way, i.e. without using loop? My current approach (with loop) looks like this:
(defun get-primes (min max)
(loop
for guess from min to max
when (is-prime guess)
collect guess))
Maybe this is a totally dumb question, but I guess I don't see the forest for the trees.
Any hints?
Common Lisp does not favor pure Functional Programming approaches. The language is based on a more pragmatic view of an underlying machine: no TCO, stacks are limited, various resources are limited (the number of arguments which are allowed, etc.), mutation is possible. That does not sound very motivating for any Haskell enthusiast. But Common Lisp was developed to write Lisp applications, not for advancing FP research and development. Evaluation in Common Lisp (as usual in Lisp) is strict and not lazy. The default data structures are also not lazy. Though there are lazy libraries. Common Lisp also does not provide in the standard features like continuations or coroutines - which might be useful in this case.
The default Common Lisp approach is non-functional and uses some kind of iteration construct: DO, LOOP or the ITERATE library. Like in your example. Common Lisp users find it perfectly fine. Some, like me, think that Iterate is the best looping construct ever invented. ;-) The advantage is that it is relatively easy to create efficient code, even though the macros have a large implementation.
There are various ways to do it differently:
lazy streams. reading SICP helps, see the part on streams. Can be easily done in Common Lisp, too. Note that Common Lisp uses the word stream for I/O streams - which is something different.
use a generator function next-prime. Generate this function from the initial range and call it until you got all primes you are interested in.
Here is a simple example of a generator function, generating even numbers from some start value:
(defun make-next-even-fn (start)
(let ((current (- start
(if (evenp start) 2 1))))
(lambda ()
(incf current 2))))
CL-USER 14 > (let ((next-even-fn (make-next-even-fn 13)))
(flet ((get-next-even ()
(funcall next-even-fn)))
(print (get-next-even))
(print (get-next-even))
(print (get-next-even))
(print (get-next-even))
(list (get-next-even)
(get-next-even)
(get-next-even))))
14
16
18
20
(22 24 26)
Defining the next-prime generator is left as an exercise. ;-)
use some kind of more sophisticated lazy data structure. For Common Lisp there is Series, which was an early iteration proposal - also published in CLtL2: Series. The new book Common Lisp Recipes by Edi Weitz (math prof in Hamburg) has a Series example for computing primes. See chapter 7.15. You can download the source code for the book and find the example there.
the are simpler variants of Series, for example pipes

What is the definition of "natural recursion"?

The Third Commandment of The Little Schemer states:
When building a list, describe the first typical element, and then cons it onto the natural recursion.
What is the exact definition of "natural recursion"? The reason why I am asking is because I am taking a class on programming language principles by Daniel Friedman and the following code is not considered "naturally recursive":
(define (plus x y)
(if (zero? y) x
(plus (add1 x) (sub1 y))))
However, the following code is considered "naturally recursive":
(define (plus x y)
(if (zero? y) x
(add1 (plus x (sub1 y)))))
I prefer the "unnaturally recursive" code because it is tail recursive. However, such code is considered anathema. When I asked as to why we shouldn't write the function in tail recursive form then the associate instructor simply replied, "You don't mess with the natural recursion."
What's the advantage of writing the function in the "naturally recursive" form?
"Natural" (or just "Structural") recursion is the best way to start teaching students about recursion. This is because it has the wonderful guarantee that Joshua Taylor points out: it's guaranteed to terminate[*]. Students have a hard enough time wrapping their heads around this kind of program that making this a "rule" can save them a huge amount of head-against-wall-banging.
When you choose to leave the realm of structural recursion, you (the programmer) have taken on an additional responsibility, which is to ensure that your program halts on all inputs; it's one more thing to think about & prove.
In your case, it's a bit more subtle. You have two arguments, and you're making structurally recursive calls on the second one. In fact, with this observation (program is structurally recursive on argument 2), I would argue that your original program is pretty much just as legitimate as the non-tail-calling one, since it inherits the same proof-of-convergence. Ask Dan about this; I'd be interested to hear what he has to say.
[*] To be precise here you have to legislate out all kinds of other goofy stuff like calls to other functions that don't terminate, etc.
The natural recursion has to do with the "natural", recursive definition of the type you are dealing with. Here, you are working with natural numbers; since "obviously" a natural number is either zero or the successor of another natural number, when you want to build a natural number, you naturally output 0 or (add1 z) for some other natural z which happens to be computed recursively.
The teacher probably wants you to make the link between recursive type definitions and recursive processing of values of that type. You would not have the kind of problem you have with numbers if you tried to process trees or lists, because you routinely use natural numbers in "unnatural ways" and thus, you might have natural objections thinking in terms of Church numerals.
The fact that you already know how to write tail-recursive functions is irrelevant in that context: this is apparently not the objective of your teacher to talk about tail-call optimizations, at least for now.
The associate instructor was not very helpful at first ("messing with natural recursion" sounds as "don't ask"), but the detailed explanation he/she gave in the snapshot you gave was more appropriate.
(define (plus x y)
(if (zero? y) x
(add1 (plus x (sub1 y)))))
When y != 0 it has to remember that once the result of (plus x (sub1 y)) is known, it has to compute add1 on it. Hence when y reaches zero, the recursion is at its deepest. Now the backtracking phase begins and the add1's are executed. This process can be observed using trace.
I did the trace for :
(require racket/trace)
(define (add1 x) ...)
(define (sub1 x) ...)
(define (plus x y) ...)
(trace plus)
(plus 2 3)
Here's the trace :
>(plus 2 3)
> (plus 2 2)
> >(plus 2 1)
> > (plus 2 0) // Deepest point of recursion
< < 2 // Backtracking begins, performing add1 on the results
< <3
< 4
<5
5 // Result
The difference is that the other version has no backtracking phase. It is calling itself for a few times but it is iterative, because it is remembering intermediate results (passed as arguments). Hence the process is not consuming extra space.
Sometimes implementing a tail-recursive procedure is easier or more elegant then writing it's iterative equivalent. But for some purposes you can not/may not implement it in a recursive way.
PS : I had a class which was covering a bit about garbage collection algorithms. Such algorithms may not be recursive as there may be no space left, hence having no space for the recursion. I remember an algorithm called "Deutsch-Schorr-Waite" which was really hard to understand at first. First he implemented the recursive version just to understand the concept, afterwards he wrote the iterative version (hence manually having to remember from where in memory he came), believe me the recursive one was way easier but could not be used in practice...

Why is foldl defined in a strange way in Racket?

In Haskell, like in many other functional languages, the function foldl is defined such that, for example, foldl (-) 0 [1,2,3,4] = -10.
This is OK, because foldl (-) 0 [1, 2,3,4] is, by definition, ((((0 - 1) - 2) - 3) - 4).
But, in Racket, (foldl - 0 '(1 2 3 4)) is 2, because Racket "intelligently" calculates like this: (4 - (3 - (2 - (1 - 0)))), which indeed is 2.
Of course, if we define auxiliary function flip, like this:
(define (flip bin-fn)
(lambda (x y)
(bin-fn y x)))
then we could in Racket achieve the same behavior as in Haskell: instead of (foldl - 0 '(1 2 3 4)) we can write: (foldl (flip -) 0 '(1 2 3 4))
The question is: Why is foldl in racket defined in such an odd (nonstandard and nonintuitive) way, differently than in any other language?
The Haskell definition is not uniform. In Racket, the function to both folds have the same order of inputs, and therefore you can just replace foldl by foldr and get the same result. If you do that with the Haskell version you'd get a different result (usually) — and you can see this in the different types of the two.
(In fact, I think that in order to do a proper comparison you should avoid these toy numeric examples where both of the type variables are integers.)
This has the nice byproduct where you're encouraged to choose either foldl or foldr according to their semantic differences. My guess is that with Haskell's order you're likely to choose according to the operation. You have a good example for this: you've used foldl because you want to subtract each number — and that's such an "obvious" choice that it's easy to overlook the fact that foldl is usually a bad choice in a lazy language.
Another difference is that the Haskell version is more limited than the Racket version in the usual way: it operates on exactly one input list, whereas Racket can accept any number of lists. This makes it more important to have a uniform argument order for the input function).
Finally, it is wrong to assume that Racket diverged from "many other functional languages", since folding is far from a new trick, and Racket has roots that are far older than Haskell (or these other languages). The question could therefore go the other way: why is Haskell's foldl defined in a strange way? (And no, (-) is not a good excuse.)
Historical update:
Since this seems to bother people again and again, I did a little bit of legwork. This is not definitive in any way, just my second-hand guessing. Feel free to edit this if you know more, or even better, email the relevant people and ask. Specifically, I don't know the dates where these decisions were made, so the following list is in rough order.
First there was Lisp, and no mention of "fold"ing of any kind. Instead, Lisp has reduce which is very non-uniform, especially if you consider its type. For example, :from-end is a keyword argument that determines whether it's a left or a right scan and it uses different accumulator functions which means that the accumulator type depends on that keyword. This is in addition to other hacks: usually the first value is taken from the list (unless you specify an :initial-value). Finally, if you don't specify an :initial-value, and the list is empty, it will actually apply the function on zero arguments to get a result.
All of this means that reduce is usually used for what its name suggests: reducing a list of values into a single value, where the two types are usually the same. The conclusion here is that it's serving a kind of a similar purpose to folding, but it's not nearly as useful as the generic list iteration construct that you get with folding. I'm guessing that this means that there's no strong relation between reduce and the later fold operations.
The first relevant language that follows Lisp and has a proper fold is ML. The choice that was made there, as noted in newacct's answer below, was to go with the uniform types version (ie, what Racket uses).
The next reference is Bird & Wadler's ItFP (1988), which uses different types (as in Haskell). However, they note in the appendix that Miranda has the same type (as in Racket).
Miranda later on switched the argument order (ie, moved from the Racket order to the Haskell one). Specifically, that text says:
WARNING - this definition of foldl differs from that in older versions of Miranda. The one here is the same as that in Bird and Wadler (1988). The old definition had the two args of `op' reversed.
Haskell took a lot of stuff from Miranda, including the different types. (But of course I don't know the dates so maybe the Miranda change was due to Haskell.) In any case, it's clear at this point that there was no consensus, hence the reversed question above holds.
OCaml went with the Haskell direction and uses different types
I'm guessing that "How to Design Programs" (aka HtDP) was written at roughly the same period, and they chose the same type. There is, however, no motivation or explanation — and in fact, after that exercise it's simply mentioned as one of the built-in functions.
Racket's implementation of the fold operations was, of course, the "built-ins" that are mentioned here.
Then came SRFI-1, and the choice was to use the same-type version (as Racket). This decision was question by John David Stone, who points at a comment in the SRFI that says
Note: MIT Scheme and Haskell flip F's arg order for their reduce and fold functions.
Olin later addressed this: all he said was:
Good point, but I want consistency between the two functions.
state-value first: srfi-1, SML
state-value last: Haskell
Note in particular his use of state-value, which suggests a view where consistent types are a possibly more important point than operator order.
"differently than in any other language"
As a counter-example, Standard ML (ML is a very old and influential functional language)'s foldl also works this way: http://www.standardml.org/Basis/list.html#SIG:LIST.foldl:VAL
Racket's foldl and foldr (and also SRFI-1's fold and fold-right) have the property that
(foldr cons null lst) = lst
(foldl cons null lst) = (reverse lst)
I speculate the argument order was chosen for that reason.
From the Racket documentation, the description of foldl:
(foldl proc init lst ...+) → any/c
Two points of interest for your question are mentioned:
the input lsts are traversed from left to right
And
foldl processes the lsts in constant space
I'm gonna speculate on how the implementation for that might look like, with a single list for simplicity's sake:
(define (my-foldl proc init lst)
(define (iter lst acc)
(if (null? lst)
acc
(iter (cdr lst) (proc (car lst) acc))))
(iter lst init))
As you can see, the requirements of left-to-right traversal and constant space are met (notice the tail recursion in iter), but the order of the arguments for proc was never specified in the description. Hence, the result of calling the above code would be:
(my-foldl - 0 '(1 2 3 4))
> 2
If we had specified the order of the arguments for proc in this way:
(proc acc (car lst))
Then the result would be:
(my-foldl - 0 '(1 2 3 4))
> -10
My point is, the documentation for foldl doesn't make any assumptions on the evaluation order of the arguments for proc, it only has to guarantee that constant space is used and that the elements in the list are evaluated from left to right.
As a side note, you can get the desired evaluation order for your expression by simply writing this:
(- 0 1 2 3 4)
> -10

Resources