Where is the recursion in this code? - recursion

I'm studying Scheme by myself and recently I've encountered this code:
((lambda (gcd) (gcd (12 8 gcd))
(lambda(a b gcdnew)
(if (= b 0)
a
(gcdnew b (modulo a b) gcdnew))))
The author said there is a recursion in this one. It was an old topic so I couldn't contact him. So where is it (=recursion)? It seems that the second 'lambda' goes to first 'gcd' as value so our body is actually:
(gcdnew 8 4 gcdnew)
Well, what's the point of 'gcdnew' as a parameter in 'gcdnew'? Thanks for help.

The point of gcdnew is that it holds the function to use for the recursive call.
The outer lambda sets the recursion up by taking a function (gcd) and applying it to 12 and 8, and also passing that same function as a third parameter.
Inside the inner function, gcdnew refers to the inner function itself, and it calls itself recursively by using gcdnew, making sure to pass it (i.e. itself) along in the recursion.
BTW: There's a slight typo in that you have too many brackets - it should probably be
((lambda (gcd) (gcd 12 8 gcd))
(lambda (a b gcdnew)
(if (= b 0)
a
(gcdnew b (modulo a b) gcdnew))))

This is known as the Y combinator.
And it's actually possible to do recursion with it in languages that don't support explicit recursion. (in fact you don't even need to define it recursivly)
http://mvanier.livejournal.com/2700.html

Related

What is the definition of "natural recursion"?

The Third Commandment of The Little Schemer states:
When building a list, describe the first typical element, and then cons it onto the natural recursion.
What is the exact definition of "natural recursion"? The reason why I am asking is because I am taking a class on programming language principles by Daniel Friedman and the following code is not considered "naturally recursive":
(define (plus x y)
(if (zero? y) x
(plus (add1 x) (sub1 y))))
However, the following code is considered "naturally recursive":
(define (plus x y)
(if (zero? y) x
(add1 (plus x (sub1 y)))))
I prefer the "unnaturally recursive" code because it is tail recursive. However, such code is considered anathema. When I asked as to why we shouldn't write the function in tail recursive form then the associate instructor simply replied, "You don't mess with the natural recursion."
What's the advantage of writing the function in the "naturally recursive" form?
"Natural" (or just "Structural") recursion is the best way to start teaching students about recursion. This is because it has the wonderful guarantee that Joshua Taylor points out: it's guaranteed to terminate[*]. Students have a hard enough time wrapping their heads around this kind of program that making this a "rule" can save them a huge amount of head-against-wall-banging.
When you choose to leave the realm of structural recursion, you (the programmer) have taken on an additional responsibility, which is to ensure that your program halts on all inputs; it's one more thing to think about & prove.
In your case, it's a bit more subtle. You have two arguments, and you're making structurally recursive calls on the second one. In fact, with this observation (program is structurally recursive on argument 2), I would argue that your original program is pretty much just as legitimate as the non-tail-calling one, since it inherits the same proof-of-convergence. Ask Dan about this; I'd be interested to hear what he has to say.
[*] To be precise here you have to legislate out all kinds of other goofy stuff like calls to other functions that don't terminate, etc.
The natural recursion has to do with the "natural", recursive definition of the type you are dealing with. Here, you are working with natural numbers; since "obviously" a natural number is either zero or the successor of another natural number, when you want to build a natural number, you naturally output 0 or (add1 z) for some other natural z which happens to be computed recursively.
The teacher probably wants you to make the link between recursive type definitions and recursive processing of values of that type. You would not have the kind of problem you have with numbers if you tried to process trees or lists, because you routinely use natural numbers in "unnatural ways" and thus, you might have natural objections thinking in terms of Church numerals.
The fact that you already know how to write tail-recursive functions is irrelevant in that context: this is apparently not the objective of your teacher to talk about tail-call optimizations, at least for now.
The associate instructor was not very helpful at first ("messing with natural recursion" sounds as "don't ask"), but the detailed explanation he/she gave in the snapshot you gave was more appropriate.
(define (plus x y)
(if (zero? y) x
(add1 (plus x (sub1 y)))))
When y != 0 it has to remember that once the result of (plus x (sub1 y)) is known, it has to compute add1 on it. Hence when y reaches zero, the recursion is at its deepest. Now the backtracking phase begins and the add1's are executed. This process can be observed using trace.
I did the trace for :
(require racket/trace)
(define (add1 x) ...)
(define (sub1 x) ...)
(define (plus x y) ...)
(trace plus)
(plus 2 3)
Here's the trace :
>(plus 2 3)
> (plus 2 2)
> >(plus 2 1)
> > (plus 2 0) // Deepest point of recursion
< < 2 // Backtracking begins, performing add1 on the results
< <3
< 4
<5
5 // Result
The difference is that the other version has no backtracking phase. It is calling itself for a few times but it is iterative, because it is remembering intermediate results (passed as arguments). Hence the process is not consuming extra space.
Sometimes implementing a tail-recursive procedure is easier or more elegant then writing it's iterative equivalent. But for some purposes you can not/may not implement it in a recursive way.
PS : I had a class which was covering a bit about garbage collection algorithms. Such algorithms may not be recursive as there may be no space left, hence having no space for the recursion. I remember an algorithm called "Deutsch-Schorr-Waite" which was really hard to understand at first. First he implemented the recursive version just to understand the concept, afterwards he wrote the iterative version (hence manually having to remember from where in memory he came), believe me the recursive one was way easier but could not be used in practice...

operation between two lists

In common lisp there is map, which lets you do this kind of thing:
(map (lambda (x y) (/ x y)) (list 2 4 6 8 10 12) (list 1 2 3 4 5 6))
returning (2 2 2 2 2 2)
However now I am working at ACL2 and there is no such a thing as map.
So in my mind the only choice left I have is doing recursion to calculate what I want, unless there is another simpler and/or more efficient way of doing it.
... Which is exactly my question. Is there a better way of doing it than to create a recursive function called something like divide-two-lists? It just feels like something that a lisp-based language should naturally do instead of having you to create another function specifically just for it, hence why I am asking.
You could pretty easily write your own map. From the GNU Emacs guide:
(defun mapcar* (function &rest args)
"Apply FUNCTION to successive cars of all ARGS.
Return the list of results."
;; If no list is exhausted,
(if (not (memq nil args))
;; apply function to cars.
(cons (apply function (mapcar 'car args))
(apply 'mapcar* function
;; Recurse for rest of elements.
(mapcar 'cdr args)))))
(mapcar* 'cons '(a b c) '(1 2 3 4))
⇒ ((a . 1) (b . 2) (c . 3))
I'm unfamiliar with acl2, so you might have to change some functions (e.g. memq), or deal differently with how apply or &rest arguments work, but this is the meat of the code.
ACL2 is based on first order logic. In first order logic, statements like
(define (P R A) (R A))
are not allowed because R is being used as both a parameter and a function.
It is theoretically possible to get around this limitation by literally defining your own language within first order logic that includes the constructs for higher order logic. Otherwise, you are correct, your best option is to define something like divide-two-lists every single time you want to use a map function.
That's tedious, but it is how ACL2 was meant to be used.
This isn't exactly suitable to your question, but it's related, and so I mention it in case it helps someone else who is looking at your question.
Consider the book "std/util/defprojection", which provides a macro that lets you map a function across a list.

Scheme: Procedures that return another inner procedure

This is from the SICP book that I am sure many of you are familiar with. This is an early example in the book, but I feel an extremely important concept that I am just not able to get my head around yet. Here it is:
(define (cons x y)
(define (dispatch m)
(cond ((= m 0) x)
((= m 1) y)
(else (error "Argument not 0 or 1 - CONS" m))))
dispatch)
(define (car z) (z 0))
(define (cdr z) (z 1))
So here I understand that car and cdr are being defined within the scope of cons, and I get that they map some argument z to 1 and 0 respectively (argument z being some cons). But say I call (cons 3 4)...how are the arguments 3 and 4 evaluated, when we immediately go into this inner-procedure dispatch which takes some argument m that we have not specified yet? And, maybe more importantly, what is the point of returning 'dispatch? I don't really get that part at all. Any help is appreciated, thanks!
This is one of the weirder (and possibly one of the more wonderful) examples of exploiting first-class functions in Scheme. Something similar is also in the Little Schemer, which is where I first saw it, and I remember scratching my head for days over it. Let me see if I can explain it in a way that makes sense, but I apologize if it's not clear.
I assume you understand the primitives cons, car, and cdr as they are implemented in Scheme already, but just to remind you: cons constructs a pair, car selects the first component of the pair and returns it, and cdr selects the second component and returns it. Here's a simple example of using these functions:
> (cons 1 2)
(1 . 2)
> (car (cons 1 2))
1
> (cdr (cons 1 2))
2
The version of cons, car, and cdr that you've pasted should behave exactly the same way. I'll try to show you how.
First of all, car and cdr are not defined within the scope of cons. In your snippet of code, all three (cons, car, and cdr) are defined at the top-level. The function dispatch is the only one that is defined inside cons.
The function cons takes two arguments and returns a function of one argument. What's important about this is that those two arguments are visible to the inner function dispatch, which is what is being returned. I'll get to that in a moment.
As I said in my reminder, cons constructs a pair. This version of cons should do the same thing, but instead it's returning a function! That's ok, we don't really care how the pair is implemented or laid out in memory, so long as we can get at the first and second components.
So with this new function-based pair, we need to be able to call car and pass the pair as an argument, and get the first component. In the definition of car, this argument is called z. If you were to execute the same REPL session I had above with these new cons ,car, and cdr functions, the argument z in car will be bound to the function-based pair, which is what cons returns, which is dispatch. It's confusing, but just think it through carefully and you'll see.
Based on the implementation of car, it appears to be that it take a function of one argument, and applies it to the number 0. So it's applying dispatch to 0, and as you can see from the definition of dispatch, that's what we want. The cond inside there compares m with 0 and 1 and returns either x or y. In this case, it returns x, which is the first argument to cons, in other words the first component of the pair! So car selects the first component, just as the normal primitive does in Scheme.
If you follow this same logic for cdr, you'll see that it behaves almost the same way, but returns the second argument to cons, y, which is the second component of the pair.
There are a couple of things that might help you understand this better. One is to go back to the description of the substitution model of evaluation in Chapter 1. If you carefully and meticulously follow that substitution model for some very simple example of using these functions, you'll see that they work.
Another way, which is less tedious, is to try playing with the dispatch function directly at the REPL. Below, the variable p is defined to refer to the dispatch function returned by cons.
> (define p (cons 1 2))
#<function> ;; what the REPL prints here will be implementation specific
> (p 0)
1
> (p 1)
2
The code in the question shows how to redefine the primitive procedure cons that creates a cons-cell (a pair of two elements: the car and the cdr), using only closures and message-dispatching.
The dispatch procedure acts as a selector for the arguments passed to cons: x and y. If the message 0 is received, then the first argument of cons is returned (the car of the cell). Likewise, if 1 is received, then the second argument of cons is returned (the cdr of the cell). Both arguments are stored inside the closure defined implicitly for the dispatch procedure, a closure that captures x and y and is returned as the product of invoking this procedural implementation of cons.
The next redefinitions of car and cdr build on this: car is implemented as a procedure that passes 0 to a closure as returned in the above definition, and cdr is implemented as a procedure that passes 1 to the closure, in each case ultimately returning the original value that was passed as x and y respectively.
The really nice part of this example is that it shows that the cons-cell, the most basic unit of data in a Lisp system can be defined as a procedure, therefore blurring the distinction between data and procedure.
This is the "closure/object isomorphism", basically.
The outer function (cons) is a class constructor. It returns an object, which is a function of one argument, where the argument is equivalent to the name of a method. In this case, the methods are getters, so they evaluate to values. You could just as easily have stored more procedures in the object returned by the constructor.
In this case, numbers where chosen as method names and sugary procedures defined outside the object itself. You could have used symbols:
(define (cons x y)
(lambda (method)
(cond ((eq? method 'car) x)
((eq? method 'cdr) y)
(else (error "unknown method")))))
In which case what you have more closely resembles OO:
# (define p (cons 1 2))
# (p 'car)
1
# (p 'cdr)
2

Accumulators, conj and recursion

I've solved 45 problems from 4clojure.com and I noticed a recurring problem in the way I try to solve some problems using recursion and accumulators.
I'll try to explain the best I can what I'm doing to end up with fugly solutions hoping that some Clojurers would "get" what I'm not getting.
For example, problem 34 asks to write a function (without using range) taking two integers as arguments and creates a range (without using range). Simply put you do (... 1 7) and you get (1 2 3 4 5 6).
Now this question is not about solving this particular problem.
What if I want to solve this using recursion and an accumulator?
My thought process goes like this:
I need to write a function taking two arguments, I start with (fn [x y] )
I'll need to recurse and I'll need to keep track of a list, I'll use an accumulator, so I write a 2nd function inside the first one taking an additional argument:
(fn
[x y]
((fn g [x y acc] ...)
x
y
'())
(apparently I can't properly format that Clojure code on SO!?)
Here I'm already not sure I'm doing it correctly: the first function must take exactly two integer arguments (not my call) and I'm not sure: if I want to use an accumulator, can I use an accumulator without creating a nested function?
Then I want to conj, but I cannot do:
(conj 0 1)
so I do weird things to make sure I've got a sequence first and I end up with this:
(fn
[x y]
((fn g [x y acc] (if (= x y) y (conj (conj acc (g (inc x) y acc)) x)))
x
y
'()))
But then this produce this:
(1 (2 (3 4)))
Instead of this:
(1 2 3 4)
So I end up doing an additional flatten and it works but it is totally ugly.
I'm beginning to understand a few things and I'm even starting, in some cases, to "think" in a more clojuresque way but I've got a problem writing the solution.
For example here I decided:
to use an accumulator
to recurse by incrementing x until it reaches y
But I end up with the monstrosity above.
There are a lot of way to solve this problem and, once again, it's not what I'm after.
What I'm after is how, after I decided to cons/conj, use an accumulator, and recurse, I can end up with this (not written by me):
#(loop [i %1
acc nil]
(if (<= %2 i)
(reverse acc)
(recur (inc i) (cons i acc))))
Instead of this:
((fn
f
[x y]
(flatten
((fn
g
[x y acc]
(if (= x y) acc (conj (conj acc (g (inc x) y acc)) x)))
x
y
'())))
1
4)
I take it's a start to be able to solve a few problems but I'm a bit disappointed by the ugly solutions I tend to produce...
i think there are a couple of things to learn here.
first, a kind of general rule - recursive functions typically have a natural order, and adding an accumulator reverses that. you can see that because when a "normal" (without accumulator) recursive function runs, it does some work to calculate a value, then recurses to generate the tail of the list, finally ending with an empty list. in contrast, with an accumulator, you start with the empty list and add things to the front - it's growing in the other direction.
so typically, when you add an accumulator, you get a reversed order.
now often this doesn't matter. for example, if you're generating not a sequence but a value that is the repeated application of a commutative operator (like addition or multiplication). then you get the same answer either way.
but in your case, it is going to matter. you're going to get the list backwards:
(defn my-range-0 [lo hi] ; normal recursive solution
(if (= lo hi)
nil
(cons lo (my-range-0 (inc lo) hi))))
(deftest test-my-range-1
(is (= '(0 1 2) (my-range-0 0 3))))
(defn my-range-1 ; with an accumulator
([lo hi] (my-range-1 lo hi nil))
([lo hi acc]
(if (= lo hi)
acc
(recur (inc lo) hi (cons lo acc)))))
(deftest test-my-range-1
(is (= '(2 1 0) (my-range-1 0 3)))) ; oops! backwards!
and often the best you can do to fix this is just reverse that list at the end.
but here there's an alternative - we can actually do the work backwards. instead of incrementing the low limit you can decrement the high limit:
(defn my-range-2
([lo hi] (my-range-2 lo hi nil))
([lo hi acc]
(if (= lo hi)
acc
(let [hi (dec hi)]
(recur lo hi (cons hi acc))))))
(deftest test-my-range-2
(is (= '(0 1 2) (my-range-2 0 3)))) ; back to the original order
[note - there's another way of reversing things below; i didn't structure my argument very well]
second, as you can see in my-range-1 and my-range-2, a nice way of writing a function with an accumulator is as a function with two different sets of arguments. that gives you a very clean (imho) implementation without the need for nested functions.
also you have some more general questions about sequences, conj and the like. here clojure is kind-of messy, but also useful. above i've been giving a very traditional view with cons based lists. but clojure encourages you to use other sequences. and unlike cons lists, vectors grow to the right, not the left. so another way to reverse that result is to use a vector:
(defn my-range-3 ; this looks like my-range-1
([lo hi] (my-range-3 lo hi []))
([lo hi acc]
(if (= lo hi)
acc
(recur (inc lo) hi (conj acc lo)))))
(deftest test-my-range-3 ; except that it works right!
(is (= [0 1 2] (my-range-3 0 3))))
here conj is adding to the right. i didn't use conj in my-range-1, so here it is re-written to be clearer:
(defn my-range-4 ; my-range-1 written using conj instead of cons
  ([lo hi] (my-range-4 lo hi nil))
  ([lo hi acc]
    (if (= lo hi)
      acc
      (recur (inc lo) hi (conj acc lo)))))
(deftest test-my-range-4
(is (= '(2 1 0) (my-range-4 0 3))))
note that this code looks very similar to my-range-3 but the result is backwards because we're starting with an empty list, not an empty vector. in both cases, conj adds the new element in the "natural" position. for a vector that's to the right, but for a list it's to the left.
and it just occurred to me that you may not really understand what a list is. basically a cons creates a box containing two things (its arguments). the first is the contents and the second is the rest of the list. so the list (1 2 3) is basically (cons 1 (cons 2 (cons 3 nil))). in contrast, the vector [1 2 3] works more like an array (although i think it's implemented using a tree).
so conj is a bit confusing because the way it works depends on the first argument. for a list, it calls cons and so adds things to the left. but for a vector it extends the array(-like thing) to the right. also, note that conj takes an existing sequence as first arg, and thing to add as second, while cons is the reverse (thing to add comes first).
all the above code available at https://github.com/andrewcooke/clojure-lab
update: i rewrote the tests so that the expected result is a quoted list in the cases where the code generates a list. = will compare lists and vectors and return true if the content is the same, but making it explicit shows more clearly what you're actually getting in each case. note that '(0 1 2) with a ' in front is just like (list 0 1 2) - the ' stops the list from being evaluated (without it, 0 would be treated as a command).
After reading all that, I'm still not sure why you'd need an accumulator.
((fn r [a b]
(if (<= a b)
(cons a (r (inc a) b))))
2 4)
=> (2 3 4)
seems like a pretty intuitive recursive solution. the only thing I'd change in "real" code is to use lazy-seq so that you won't run out of stack for large ranges.
how I got to that solution:
When you're thinking of using recursion, I find it helps to try and state the problem with the fewest possible terms you can think up, and try to hand off as much "work" to the recursion itself.
In particular, if you suspect you can drop one or more arguments/variables, that is usually the way to go - at least if you want the code to be easy to understand and debug; sometimes you end up compromising simplicity in favor of execution speed or reducing memory usage.
In this case, what I thought when I started writing was: "the first argument to the function is also the start element of the range, and the last argument is the last element". Recursive thinking is something you kind of have to train yourself to do, but a fairly obvious solution then is to say: a range [a, b] is a sequence starting with element a followed by a range of [a + 1, b]. So ranges can indeed be described recursively. The code I wrote is pretty much a direct implementation of that idea.
addendum:
I've found that when writing functional code, accumulators (and indexes) are best avoided. Some problems require them, but if you can find a way to get rid of them, you're usually better off if you do.
addendum 2:
Regarding recursive functions and lists/sequences, the most useful way to think when writing that kind of code is to state your problem in terms of "the first item (head) of a list" and "the rest of the list (tail)".
I cannot add to the already good answers you have received, but I will answer in general. As you go through the Clojure learning process, you may find that many but not all solutions can be solved using Clojure built-ins, like map and also thinking of problems in terms of sequences. This doesn't mean you should not solve things recursively, but you will hear -- and I believe it to be wise advice -- that Clojure recursion is for solving very low level problems you cannot solve another way.
I happen to do a lot of .csv file processing, and recently received a comment that nth creates dependencies. It does, and use of maps can allow me to get at elements for comparison by name and not position.
I'm not going to throw out the code that uses nth with clojure-csv parsed data in two small applications already in production. But I'm going to think about things in a more sequency way the next time.
It is difficult to learn from books that talk about vectors and nth, loop .. recur and so on, and then realize learning Clojure grows you forward from there.
One of the things I have found that is good about learning Clojure, is the community is respectful and helpful. After all, they're helping someone whose first learning language was Fortran IV on a CDC Cyber with punch cards, and whose first commercial programming language was PL/I.
If I solved this using an accumulator I would do something like:
user=> (defn my-range [lb up c]
(if (= lb up)
c
(recur (inc lb) up (conj c lb))))
#'user/my-range
then call it with
#(my-range % %2 [])
Of course, I'd use letfn or something to get around not having defn available.
So yes, you do need an inner function to use the accumulator approach.
My thought process is that once I'm done the answer I want to return will be in the accumulator. (That contrasts with your solution, where you do a lot of work on finding the ending-condition.) So I look for my ending-condition and if I've reached it, I return the accumulator. Otherwise I tack on the next item to the accumulator and recur for a smaller case. So there are only 2 things to figure out, what the end-condition is, and what I want to put in the accumulator.
Using a vector helps a lot because conj will append to it and there's no need to use reverse.
I'm on 4clojure too, btw. I've been busy so I've fallen behind lately.
It looks like your question is more about "how to learn" then a technical/code problem. You end up writing that kind of code because from whatever way or source you learned programming in general or Clojure in specific has created a "neural highway" in your brain that makes you thinking about the solutions in this particular way and you end up writing code like this. Basically whenever you face any problem (in this particular case recursion and/or accumulation) you end up using that "neural highway" and always come up with that kind of code .
The solution for getting rid of this "neural highway" is to stop writing code for the moment, keep that keyboard away and start reading a lot of existing clojure code (from existing solutions of 4clojure problem to open source projects on github) and think about it deeply (even read a function 2-3 times to really let it settle down in your brain). This way you would end up destroying your existing "neural highway" (which produce the code that you write now) and will create a new "neural highway" that would produce the beautiful and idiomatic Clojure code. Also, try not to jump to typing code as soon as you saw a problem, rather give yourself some time to think clearly and deeply about the problem and solutions.

Why is foldl defined in a strange way in Racket?

In Haskell, like in many other functional languages, the function foldl is defined such that, for example, foldl (-) 0 [1,2,3,4] = -10.
This is OK, because foldl (-) 0 [1, 2,3,4] is, by definition, ((((0 - 1) - 2) - 3) - 4).
But, in Racket, (foldl - 0 '(1 2 3 4)) is 2, because Racket "intelligently" calculates like this: (4 - (3 - (2 - (1 - 0)))), which indeed is 2.
Of course, if we define auxiliary function flip, like this:
(define (flip bin-fn)
(lambda (x y)
(bin-fn y x)))
then we could in Racket achieve the same behavior as in Haskell: instead of (foldl - 0 '(1 2 3 4)) we can write: (foldl (flip -) 0 '(1 2 3 4))
The question is: Why is foldl in racket defined in such an odd (nonstandard and nonintuitive) way, differently than in any other language?
The Haskell definition is not uniform. In Racket, the function to both folds have the same order of inputs, and therefore you can just replace foldl by foldr and get the same result. If you do that with the Haskell version you'd get a different result (usually) — and you can see this in the different types of the two.
(In fact, I think that in order to do a proper comparison you should avoid these toy numeric examples where both of the type variables are integers.)
This has the nice byproduct where you're encouraged to choose either foldl or foldr according to their semantic differences. My guess is that with Haskell's order you're likely to choose according to the operation. You have a good example for this: you've used foldl because you want to subtract each number — and that's such an "obvious" choice that it's easy to overlook the fact that foldl is usually a bad choice in a lazy language.
Another difference is that the Haskell version is more limited than the Racket version in the usual way: it operates on exactly one input list, whereas Racket can accept any number of lists. This makes it more important to have a uniform argument order for the input function).
Finally, it is wrong to assume that Racket diverged from "many other functional languages", since folding is far from a new trick, and Racket has roots that are far older than Haskell (or these other languages). The question could therefore go the other way: why is Haskell's foldl defined in a strange way? (And no, (-) is not a good excuse.)
Historical update:
Since this seems to bother people again and again, I did a little bit of legwork. This is not definitive in any way, just my second-hand guessing. Feel free to edit this if you know more, or even better, email the relevant people and ask. Specifically, I don't know the dates where these decisions were made, so the following list is in rough order.
First there was Lisp, and no mention of "fold"ing of any kind. Instead, Lisp has reduce which is very non-uniform, especially if you consider its type. For example, :from-end is a keyword argument that determines whether it's a left or a right scan and it uses different accumulator functions which means that the accumulator type depends on that keyword. This is in addition to other hacks: usually the first value is taken from the list (unless you specify an :initial-value). Finally, if you don't specify an :initial-value, and the list is empty, it will actually apply the function on zero arguments to get a result.
All of this means that reduce is usually used for what its name suggests: reducing a list of values into a single value, where the two types are usually the same. The conclusion here is that it's serving a kind of a similar purpose to folding, but it's not nearly as useful as the generic list iteration construct that you get with folding. I'm guessing that this means that there's no strong relation between reduce and the later fold operations.
The first relevant language that follows Lisp and has a proper fold is ML. The choice that was made there, as noted in newacct's answer below, was to go with the uniform types version (ie, what Racket uses).
The next reference is Bird & Wadler's ItFP (1988), which uses different types (as in Haskell). However, they note in the appendix that Miranda has the same type (as in Racket).
Miranda later on switched the argument order (ie, moved from the Racket order to the Haskell one). Specifically, that text says:
WARNING - this definition of foldl differs from that in older versions of Miranda. The one here is the same as that in Bird and Wadler (1988). The old definition had the two args of `op' reversed.
Haskell took a lot of stuff from Miranda, including the different types. (But of course I don't know the dates so maybe the Miranda change was due to Haskell.) In any case, it's clear at this point that there was no consensus, hence the reversed question above holds.
OCaml went with the Haskell direction and uses different types
I'm guessing that "How to Design Programs" (aka HtDP) was written at roughly the same period, and they chose the same type. There is, however, no motivation or explanation — and in fact, after that exercise it's simply mentioned as one of the built-in functions.
Racket's implementation of the fold operations was, of course, the "built-ins" that are mentioned here.
Then came SRFI-1, and the choice was to use the same-type version (as Racket). This decision was question by John David Stone, who points at a comment in the SRFI that says
Note: MIT Scheme and Haskell flip F's arg order for their reduce and fold functions.
Olin later addressed this: all he said was:
Good point, but I want consistency between the two functions.
state-value first: srfi-1, SML
state-value last: Haskell
Note in particular his use of state-value, which suggests a view where consistent types are a possibly more important point than operator order.
"differently than in any other language"
As a counter-example, Standard ML (ML is a very old and influential functional language)'s foldl also works this way: http://www.standardml.org/Basis/list.html#SIG:LIST.foldl:VAL
Racket's foldl and foldr (and also SRFI-1's fold and fold-right) have the property that
(foldr cons null lst) = lst
(foldl cons null lst) = (reverse lst)
I speculate the argument order was chosen for that reason.
From the Racket documentation, the description of foldl:
(foldl proc init lst ...+) → any/c
Two points of interest for your question are mentioned:
the input lsts are traversed from left to right
And
foldl processes the lsts in constant space
I'm gonna speculate on how the implementation for that might look like, with a single list for simplicity's sake:
(define (my-foldl proc init lst)
(define (iter lst acc)
(if (null? lst)
acc
(iter (cdr lst) (proc (car lst) acc))))
(iter lst init))
As you can see, the requirements of left-to-right traversal and constant space are met (notice the tail recursion in iter), but the order of the arguments for proc was never specified in the description. Hence, the result of calling the above code would be:
(my-foldl - 0 '(1 2 3 4))
> 2
If we had specified the order of the arguments for proc in this way:
(proc acc (car lst))
Then the result would be:
(my-foldl - 0 '(1 2 3 4))
> -10
My point is, the documentation for foldl doesn't make any assumptions on the evaluation order of the arguments for proc, it only has to guarantee that constant space is used and that the elements in the list are evaluated from left to right.
As a side note, you can get the desired evaluation order for your expression by simply writing this:
(- 0 1 2 3 4)
> -10

Resources