Can someone help me to break down exactly the order of execution for the following versions of flatten? I'm using Racket.
Version 1 is from Racket itself, while version 2 is a more common(?) implementation.
(define (flatten1 list)
(let loop ([l list] [acc null])
(printf "l = ~a acc = ~a\n" l acc)
(cond [(null? l) acc]
[(pair? l) (loop (car l) (loop (cdr l) acc))]
[else (cons l acc)])))
(define (flatten2 l)
(printf "l = ~a\n" l)
(cond [(null? l) null]
[(atom? l) (list l)]
[else (append (flatten2 (car l)) (flatten2 (cdr l)))]))
Now, running the first example with '(1 2 3) produces:
l = (1 2 3) acc = ()
l = (2 3) acc = ()
l = (3) acc = ()
l = () acc = ()
l = 3 acc = ()
l = 2 acc = (3)
l = 1 acc = (2 3)
'(1 2 3)
while the second produces:
l = (1 2 3)
l = 1
l = (2 3)
l = 2
l = (3)
l = 3
l = ()
'(1 2 3)
The order of execution seems different. In the first example, it looks like the second loop (loop (cdr l) acc) is firing before the first loop since '(2 3) is printing right away. Whereas in the second example, 1 prints before the '(2 3), which seems like the first call to flatten inside of append is evaluated first.
I'm going through the Little Schemer but these are more difficult examples that I could really use some help on.
Thanks a lot.
Not really an answer to your question (Chris provided an excellent answer already!), but for completeness' sake here's yet another way to implement flatten, similar to flatten2 but a bit more concise:
(define (atom? x)
(and (not (null? x))
(not (pair? x))))
(define (flatten lst)
(if (atom? lst)
(list lst)
(apply append (map flatten lst))))
And another way to implement the left-fold version (with more in common to flatten1), using standard Racket procedures:
(define (flatten lst)
(define (loop lst acc)
(if (atom? lst)
(cons lst acc)
(foldl loop acc lst)))
(reverse (loop lst '())))
The main difference is this:
flatten1 works by storing the output elements (first from the cdr side, then from the car side) into an accumulator. This works because lists are built from right to left, so working on the cdr side first is correct.
flatten2 works by recursively flattening the car and cdr sides, then appending them together.
flatten1 is faster, especially if the tree is heavy on the car side: the use of an accumulator means that there is no extra list copying, no matter what. Whereas, the append call in flatten2 causes the left-hand side of the append to be copied, which means lots of extra list copying if the tree is heavy on the car side.
So in summary, I would consider flatten2 a beginner's implementation of flatten, and flatten1 a more polished, professional version. See also my implementation of flatten, which works using the same principles as flatten1, but using a left-fold instead of the right-fold that flatten1 uses.
(A left-fold solution uses less stack space but potentially more heap space. A right-fold solution uses more stack and usually less heap, though a quick read of flatten1 suggests in this case that the heap usage is about the same as my implementation.)
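The trace in the question can be reproduced without printf. Racket evaluates the arguments of an application before applying the procedure, so in (loop (car l) (loop (cdr l) acc)) the inner call on the cdr has to finish before the outer call on the car can start. The sketch below is a minimal illustration, not Racket's own code: trace-flatten1 and seen are hypothetical names, and the body is a copy of flatten1 that records each value of l instead of printing it.

```racket
#lang racket

;; Hypothetical instrumented copy of flatten1: records every value
;; bound to l, in the order the calls to loop actually start.
(define (trace-flatten1 tree)
  (define seen '())
  (define result
    (let loop ([l tree] [acc null])
      (set! seen (cons l seen))
      (cond [(null? l) acc]
            ;; the inner (loop (cdr l) acc) is an argument of the outer
            ;; call, so it runs to completion first
            [(pair? l) (loop (car l) (loop (cdr l) acc))]
            [else (cons l acc)])))
  (values result (reverse seen)))

;; (trace-flatten1 '(1 2 3))
;; => '(1 2 3) and '((1 2 3) (2 3) (3) () 3 2 1)
```

The recorded order matches the printed output in the question: the cdr side is walked to the end before any car is consed onto the accumulator.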
I'm making my way through the book The Little Schemer to start to learn to think in Lisp. As you get into it and really cover the use of lambdas, the 'remove' procedure is written in the following general form, which returns a remove procedure for arbitrary test test?:
(define rember-f
(lambda (test?)
(lambda (a l)
(cond
((null? l) (quote ()))
((test? (car l) a) (cdr l))
(else (cons (car l)
((rember-f test?) a (cdr l))))))))
I understand how this works just fine, but a plain reading of it suggests that at each recursive step, the procedure rember-f is called again to generate a new enclosed procedure. This would mean that when you call the returned procedure on a list, it calls rember-f to generate the same procedure anew, and that new one is what is called for the recursion (if that is not clear, see my fix below). I understand that this may be optimized away, but not knowing whether it is (and also attempting to get my head around this syntax anyway), I managed after some experimentation to move the recursion to the procedure itself rather than the enclosing procedure, as follows:
(define rember-f
(lambda (test?)
(define retfun
(lambda (a l)
(cond
((null? l) (quote ()))
((test? (car l) a) (cdr l))
(else (cons (car l) (retfun a (cdr l)))))))
retfun))
I have verified that this works as expected. The return value is a procedure that removes the first element of a list (arg 2) matching a value (arg 1). It looks to me like this one only calls rember-f once, which guarantees it only generates one enclosed procedure (this time with a name, retfun).
This is actually interesting to me because unlike the usual tail call optimization, which is about not consuming space on the call stack and so making recursion about as efficient as iteration, in this case the compiler would have to determine that (rember-f test?) is the enclosing procedure scope unmodified and so replace it with the same return value, which is the anonymous (lambda (a l) ...). It would not surprise me at all to learn that the interpreter / compiler does not catch this.
Yes, I know that Scheme is a specification and that there are many implementations, which get the various functional programming optimizations right to differing degrees. I am currently learning by experimenting in the Guile REPL, but would be interested in how different implementations compare on this issue.
Does anyone know how Scheme is supposed to behave in this instance?
You are right to be concerned about the additional repeated lambda abstractions. For example you wouldn't write this, would you?
(cond ((> (some-expensive-computation x) 0) ...)
((< (some-expensive-computation x) 0) ...)
(else ...))
Instead we bind the result of some-expensive-computation to an identifier so we can check multiple conditions on the same value -
(let ((result (some-expensive-computation x)))
(cond ((> result 0) ...)
((< result 0) ...)
(else ...)))
You discovered the essential purpose of so-called "named let" expressions. Here's your program -
(define rember-f
(lambda (test?)
(define retfun
(lambda (a l)
(cond
((null? l) (quote ()))
((test? (car l) a) (cdr l))
(else (cons (car l) (retfun a (cdr l)))))))
retfun))
And its equivalent using a named-let expression. Below we bind the let body to loop, which is a callable procedure allowing recursion of the body. Notice how the lambda abstractions are used just once, and the inner lambda can be repeated without creating/evaluating additional lambdas -
(define rember-f
(lambda (test?)
(lambda (a l)
(let loop ; name, "loop", or anything of your choice
((l l)) ; bindings, here we shadow l, or could rename it
(cond
((null? l) (quote ()))
((test? (car l) a) (cdr l))
(else (cons (car l) (loop (cdr l))))))))) ; apply "loop" with args
Let's run it -
((rember-f eq?) 'c '(a b c d e f))
'(a b d e f)
The syntax for named-let is -
(let proc-identifier ((arg-identifier initial-expr) ...)
body ...)
Named-let is syntactic sugar for a letrec binding -
(define rember-f
(lambda (test?)
(lambda (a l)
(letrec ((loop (lambda (l)
(cond
((null? l) (quote ()))
((test? (car l) a) (cdr l))
(else (cons (car l) (loop (cdr l))))))))
(loop l)))))
((rember-f eq?) 'c '(a b c d e f))
'(a b d e f)
Similarly, you could imagine using a nested define -
(define rember-f
(lambda (test?)
(lambda (a l)
(define (loop l)
(cond
((null? l) (quote ()))
((test? (car l) a) (cdr l))
(else (cons (car l) (loop (cdr l))))))
(loop l))))
((rember-f eq?) 'c '(a b c d e f))
'(a b d e f)
PS, you can write '() in place of (quote ())
Both procedures have the same asymptotic time complexity. Let's consider the evaluation of ((rember-f =) 1 '(5 4 3 2 1 0)).
A partial evaluation proceeds as follows:
((rember-f =) 1 '(5 4 3 2 1 0))
((lambda (a l)
(cond
((null? l) (quote ()))
((= (car l) a) (cdr l))
(else (cons (car l)
((rember-f =) a (cdr l)))))) 1 '(5 4 3 2 1 0))
(cons 5 ((rember-f =) 1 '(4 3 2 1 0)))
Note that the creation of the temporary lambda procedure takes O(1) time and space. So it doesn't actually add any substantial overhead to the cost of calling the function. At best, factoring out the function will lead to a constant-factor speedup and the use of a constant amount less of memory.
But how much memory does it really take to make a closure? It turns out it takes very little memory. A closure consists of a pointer to the environment and a pointer to compiled code. Basically, creating the closure requires as much time and space as making a cons cell. So even though it looks like we're using a lot of memory when I show the evaluation, very little memory and very little time is actually used to make and store the lambda.
So essentially, by factoring out the recursive function, you've allocated a single cons cell rather than writing code which allocates that cons cell one time per recursive call.
For more information on this, see Lambda is cheap, and Closures are Fast.
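To make the "one closure per recursive call" point concrete, here is a minimal sketch that counts how often the outer procedure runs. The names rember-f/counted and rember-f-calls are hypothetical and exist only for this illustration; the body is otherwise the original rember-f from the question.

```racket
#lang racket

;; Hypothetical counter, incremented each time the outer lambda runs,
;; i.e. each time a fresh inner closure is created.
(define rember-f-calls 0)

(define rember-f/counted
  (lambda (test?)
    (set! rember-f-calls (+ rember-f-calls 1))
    (lambda (a l)
      (cond
        ((null? l) '())
        ((test? (car l) a) (cdr l))
        (else (cons (car l)
                    ((rember-f/counted test?) a (cdr l))))))))

(define result ((rember-f/counted =) 1 '(5 4 3 2 1 0)))
;; result         => '(5 4 3 2 0)
;; rember-f-calls => 5  (one initial call plus one per element skipped)
```

The count grows linearly with the number of elements skipped, which is consistent with the answer above: each extra closure costs a constant amount, so the asymptotic complexity does not change.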
"to start to learn to think in Lisp"
That book is not about thinking in Lisp, but about recursive thinking, which is one of the models of computation discovered in the 20th century by Gödel, Herbrand, and Rózsa Péter.
"Does anyone know how Scheme is supposed to behave in this instance?"
After you finish The Little LISPer, you should take up SICP, which will help you understand the kinds of decisions a language implementation can make. What you are really asking is how different implementations behave, and the best way to understand their decisions is to learn from SICP. Be warned: unless you already have a computer science degree, this textbook will take you a few years to master, even if you study it every day. If you are already a graduate, it will take you only about a year.
I would like to create a function for class that would take two arguments L and L1 as lists and put all even numbers from L into L1.
I've tried for several hours to make it work, but unfortunately I couldn't.
This is my Scheme code:
(define (pair L L1)
(cond
((and (not (empty? L)) (= (modulo (first L) 2) 0))
(begin (append (list (first L)) L1) (pair (rest L) L1)))
((and (not (empty? L)) (= (modulo (first L) 2) 1))
(pair (rest L) L1))
(else L1)
))
I assume you want to use L1 as an accumulator, and at the end return its content.
About your code:
It's enough to check once in the first clause of cond if L is empty (null?).
append is fine when you want to append a list. In your case you append one element, so cons is much better.
You don't have to take the modulo of a number to check if it's even. There is a built-in even? predicate.
So, after all these considerations, your code should look something like this:
(define (pair L L1)
(cond ((null? L) L1)
((even? (first L))
(pair (rest L) (cons (first L) L1)))
(else (pair (rest L) L1))))
Now let's test it:
> (pair '(0 1 2 3 4 5 6 7) '())
(6 4 2 0)
As you can see, it returns numbers in reverse order. It's because as we move down the list L from head to tail, we cons new values to the head (and not tail, like append would) of the list L1. To fix it, it's enough to (reverse L1) in the first cond clause instead of simply returning L1.
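The reverse fix described above can be sketched as follows. The name pair-evens is hypothetical (chosen here so it does not clash with the original pair), and the sketch assumes L1 starts out empty; a non-empty initial L1 would be reversed along with the collected numbers.

```racket
#lang racket

;; Collect the even numbers of L into the accumulator L1,
;; then restore the original order when L is exhausted.
(define (pair-evens L L1)
  (cond ((null? L) (reverse L1))
        ((even? (first L))
         (pair-evens (rest L) (cons (first L) L1)))
        (else (pair-evens (rest L) L1))))

;; (pair-evens '(0 1 2 3 4 5 6 7) '())
;; => '(0 2 4 6)
```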
I highly recommend the book "The Little Schemer". After reading it, you will be able to write any kind of recursive function, even in your sleep ;)
I'm totally new to Scheme and I am trying to implement my own map function. I've tried to find it online, however all the questions I encountered were about some complex versions of map function (such as mapping functions that take two lists as an input).
The best answer I've managed to find is here: (For-each and map in Scheme). Here is the code from this question:
(define (map func lst)
(let recur ((rest lst))
(if (null? rest)
'()
(cons (func (car rest)) (recur (cdr rest))))))
It doesn't solve my problem though because of the usage of an obscure function recur. It doesn't make sense to me.
My code looks like this:
(define (mymap f L)
(cond ((null? L) '())
(f (car L))
(else (mymap (f (cdr L))))))
I do understand the logic behind the functional approach when programming in this language, however I've been having great difficulties with coding it.
The first code snippet you posted is indeed one way to implement the map function. It uses a named let. See my comment for a link on how it works. It is basically an abstraction over a recursive function. If you were to write a function that prints all numbers from 10 to 0, you could write it like this:
(define (printer x)
(display x)
(if (> x 0)
(printer (- x 1))))
and then call it:
(printer 10)
But since it's just a loop, you could write it using a named let:
(let loop ((x 10))
(display x)
(if (> x 0)
(loop (- x 1))))
This named let is, as Alexis King pointed out, syntactic sugar for a lambda that is immediately called. The above construct is equivalent to the snippet shown below.
(letrec ((loop (lambda (x)
(display x)
(if (> x 0)
(loop (- x 1))))))
(loop 10))
In spite of being a letrec it's not really special. It allows for the expression (the lambda, in this case) to call itself. This way you can do recursion. More on letrec and let here.
Now for the map function you wrote, you are almost there. There is an issue with your last two cases. If the list is not empty, you want to take the first element, apply your function to it, and then apply the function to the rest of the list. I think you misunderstand what you have actually written down. I'll elaborate.
Recall that a conditional clause is formed like this:
(cond (test1? consequence)
(test2? consequence2)
(else elsebody))
You have any number of tests, each normally followed by a consequence. The evaluator evaluates test1?, and if it yields a true value (anything other than #f), it evaluates the consequence as the result of the entire conditional. If test1? and test2? both fail, it evaluates elsebody.
Sidenote
Everything in Scheme is truthy except for #f (false). For example:
(if (lambda (x) x)
1
2)
This if test will evaluate to 1 because the if test will check if (lambda (x) x) is truthy, which it is. It is a lambda. Truthy values are values that will evaluate to true in an expression where truth values are expected (e.g., if and cond).
Now for your cond. The first case of your cond will test if L is null. If that is evaluated to #t, you return the empty list. That is indeed correct. Mapping something over the empty list is just the empty list.
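The second case, (f (car L)), literally states "if f is true, then return the car of L". Since f is a procedure, and therefore truthy, this clause always fires for a non-empty list.
The else case states "otherwise, return the result of mymap on the rest of my list L". As written, though, it calls mymap with a single argument, (f (cdr L)), and it is never reached because the second clause always succeeds first.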
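The way cond reads that second clause can be checked directly. The sketch below uses a hypothetical procedure, clause-demo, with the same clause shape as the code in the question; the else branch is replaced by a symbol so that the example runs without the arity problem.

```racket
#lang racket

;; Same clause layout as the mymap from the question:
;; in (f (car L)) the test is f itself and the result is (car L).
(define (clause-demo f L)
  (cond ((null? L) '())
        (f (car L))
        (else 'else-branch)))

;; (clause-demo add1 '(1 2 3)) => 1
;;   add1 is a procedure, hence truthy, so the clause returns (car L)
;; (clause-demo #f '(1 2 3))   => 'else-branch
;;   only a literal #f in place of f lets control reach else
```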
What I think you really want to do is use an if test. If the list is empty, return the empty list. If it is not empty, apply the function to the first element, map the function over the rest of the list, and then cons the first result onto the mapped rest.
(define (mymap f L)
(cond ((null? L) '())
(f (car L))
(else (mymap (f (cdr L))))))
So what you want might look like this:
(define (mymap f L)
(cond ((null? L) '())
(else
(cons (f (car L))
(mymap f (cdr L))))))
Using an if:
(define (mymap f L)
(if (null? L) '()
(cons (f (car L))
(mymap f (cdr L)))))
Since you are new to Scheme, this function will do just fine. Try to understand it. However, there are better and faster ways to implement this kind of function. Read this page to understand things like accumulator functions and tail recursion. I will not go into detail about everything here since it's 1) not the question and 2) might be information overload.
If you're taking on implementing your own list procedures, you should probably make sure they're using a proper tail call, when possible
(define (map f xs)
(define (loop xs ys)
(if (empty? xs)
ys
(loop (cdr xs) (cons (f (car xs)) ys))))
(loop (reverse xs) empty))
(map (λ (x) (* x 10)) '(1 2 3 4 5))
; => '(10 20 30 40 50)
Or you can make this a little sweeter with the named let expression, as seen in your original code. This one, however, uses a proper tail call
(define (map f xs)
(let loop ([xs (reverse xs)] [ys empty])
(if (empty? xs)
ys
(loop (cdr xs) (cons (f (car xs)) ys)))))
(map (λ (x) (* x 10)) '(1 2 3 4 5))
; => '(10 20 30 40 50)
I'm new to Scheme (via Racket) and (to a lesser extent) functional programming, and could use some advice on the pros and cons of accumulation via variables vs recursion. For the purposes of this example, I'm trying to calculate a moving average. So, for a list '(1 2 3 4 5), the 3-period moving average would be '(1 2 2 3 4). The idea is that any numbers before the period are not yet part of the calculation, and once we reach the period length in the set, we start averaging the subset of the list according to the chosen period.
So, my first attempt looked something like this:
(define (avg lst)
(cond
[(null? lst) '()]
[(/ (apply + lst) (length lst))]))
(define (make-averager period)
(let ([prev '()])
(lambda (i)
(set! prev (cons i prev))
(cond
[(< (length prev) period) i]
[else (avg (take prev period))]))))
(map (make-averager 3) '(1 2 3 4 5))
> '(1 2 2 3 4)
This works, and I like the use of map. It seems composable and open to refactoring. I could see in the future having cousins like:
(map (make-bollinger 5) '(1 2 3 4 5))
(map (make-std-deviation 2) '(1 2 3 4 5))
etc.
But, it's not in the spirit of Scheme (right?) because I'm accumulating with side effects. So I rewrote it to look like this:
(define (moving-average l period)
(let loop ([l l] [acc '()])
(if (null? l)
l
(let* ([acc (cons (car l) acc)]
[next
(cond
[(< (length acc) period) (car acc)]
[else (avg (take acc period))])])
(cons next (loop (cdr l) acc))))))
(moving-average '(1 2 3 4 5) 3)
> '(1 2 2 3 4)
Now, this version is more difficult to grok at first glance. So I have a couple questions:
Is there a more elegant way to express the recursive version using some of the built in iteration constructs of racket (like for/fold)? Is it even tail recursive as written?
Is there any way to write the first version without the use of an accumulator variable?
Is this type of problem part of a larger pattern for which there are accepted best practices, especially in Scheme?
It's a little strange to me that you're starting before the first of the list but stopping sharply at the end of it. That is, you're taking the first element by itself and the first two elements by themselves, but you don't do the same for the last element or the last two elements.
That's somewhat orthogonal to the solution for the problem. I don't think the accumulator is making your life any easier here, and I would write the solution without it:
#lang racket
(require rackunit)
;; given a list of numbers and a period,
;; return a list of the averages of all
;; consecutive sequences of 'period'
;; numbers taken from the list.
(define ((moving-average period) l)
(cond [(< (length l) period) empty]
[else (cons (mean (take l period))
((moving-average period) (rest l)))]))
;; compute the mean of a list of numbers
(define (mean l)
(/ (apply + l) (length l)))
(check-equal? (mean '(4 4 1)) 3)
(check-equal? ((moving-average 3) '(1 3 2 7 6)) '(2 4 5))
Well, as a general rule, you want to separate the manner in which you recurse and/or iterate from the content of the iteration steps. You mention fold in your question, and this points in the right direction: you want some form of higher-order function that will handle the list traversal mechanics, and call a function you supply with the values in the window.
I cooked this up in three minutes; it's probably wrong in many ways, but it should give you an idea:
;;;
;;; Traverse a list from left to right and call fn with the "windows"
;;; of the list. fn will be called like this:
;;;
;;; (fn prev cur next accum)
;;;
;;; where cur is the "current" element, prev and next are the
;;; predecessor and successor of cur, and accum is either init or the
;;; accumulated result from the preceding call to fn (like
;;; fold-left).
;;;
;;; The left-end and right-end arguments specify the values to use
;;; as the predecessor of the first element of the list and the
;;; successor of the last.
;;;
;;; If the list is empty, returns init.
;;;
(define (windowed-traversal fn left-end right-end init list)
(if (null? list)
init
(windowed-traversal fn
(car list)
right-end
(fn left-end
(car list)
(if (null? (cdr list))
right-end
(second list))
init)
(cdr list))))
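(define (moving-average lst)
  ;; the parameter is lst, not list, so the lambda below can still
  ;; call the built-in list procedure
  (reverse
   (windowed-traversal (lambda (prev cur next list-accum)
                         ;; drop the #f edge markers before averaging
                         (cons (avg (filter (lambda (x) x) (list prev cur next)))
                               list-accum))
                       #f
                       #f
                       '()
                       lst)))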
Alternately, you could define a function that converts a list into n-element windows and then map average over the windows.
(define (partition lst default size)
  (define (iter lst len result)
    (if (< len size)
        (reverse result)
        (iter (rest lst)
              (- len 1)
              (cons (take lst size) result))))
  ;; pad the front with size - 1 copies of default
  (iter (append (make-list (- size 1) default) lst)
        (+ (length lst) (- size 1))
        empty))
(define (avg lst)
(cond
[(null? lst) 0]
[(/ (apply + lst) (length lst))]))
(map avg (partition (list 1 2 3 4 5) 0 3))
Also notice that the partition function is tail-recursive, so it doesn't eat up stack space -- this is the point of result and the reverse call. I explicitly keep track of the length of the list to avoid either repeatedly calling length (which would lead to O(N^2) runtime) or hacking together an at-least-size-3 function. If you don't care about tail recursion, the following variant of partition should work:
(define (partition lst default size)
  (define (iter lst len)
    (if (< len size)
        empty
        (cons (take lst size)
              (iter (rest lst)
                    (- len 1)))))
  ;; pad the front with size - 1 copies of default
  (iter (append (make-list (- size 1) default) lst)
        (+ (length lst) (- size 1))))
Final comment - using '() as the default value for an empty list could be dangerous if you don't explicitly check for it. If your numbers are greater than 0, then 0 (or -1) would probably work better as a default value: it won't kill whatever code is using the value, but it is easy to check for and can't appear as a legitimate average.
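The question also asked whether the recursive version could be written with for/fold. The following is a minimal sketch under that reading, not code from any of the answers: moving-average/fold is a hypothetical name, avg is simplified from the question, and the "echo the element until the window is full" behaviour is copied from the asker's expected output.

```racket
#lang racket

;; Mean of a non-empty list of numbers.
(define (avg lst)
  (/ (apply + lst) (length lst)))

;; Keep the most recent `period` elements (newest first) in one
;; accumulator and the averages (also newest first) in the other.
(define (moving-average/fold lst period)
  (define-values (final-window averages)
    (for/fold ([window '()] [results '()])
              ([x (in-list lst)])
      (define w (cons x window))
      (define trimmed (if (> (length w) period) (take w period) w))
      (values trimmed
              (cons (if (< (length trimmed) period)
                        x               ; window not full yet: echo x
                        (avg trimmed))
                    results))))
  (reverse averages))

;; (moving-average/fold '(1 2 3 4 5) 3)
;; => '(1 2 2 3 4)
```

for/fold runs as a loop with explicit accumulators, so this version does not grow the stack the way the (cons next (loop ...)) version in the question does. It still calls length on every step, which is cheap here only because the window never exceeds period + 1 elements.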
I've been teaching myself functional programming, and I'm currently writing different higher order functions using folds. I'm stuck implementing scan (also known as prefix sum). My map implementation using fold looks like:
(define (map op sequence)
(fold-right (lambda (x l) (cons (op x) l)) nil sequence))
And my shot at scan looks like:
(define (scan sequence)
(fold-left (lambda (x y) (append x (list (+ y (car (reverse x)))))) (list 0) sequence))
My observation being that the "x" is the resulting array so far, and "y" is the next element in the incoming list. This produces:
(scan (list 1 4 8 3 7 9)) -> (0 1 5 13 16 23 32)
But this looks pretty ugly, with the reversing of the resulting list going on inside the lambda. I'd much prefer to not do global operations on the resulting list, since my next attempt is to try and parallelize much of this (that's a different story, I'm looking at several CUDA papers).
Does anyone have a more elegant solution for scan?
BTW my implementation of fold-left and fold-right is:
(define (fold-left op initial sequence)
(define (iter result rest)
(if (null? rest)
result
(iter (op result (car rest)) (cdr rest))))
(iter initial sequence))
(define (fold-right op initial sequence)
(if (null? sequence)
initial
(op (car sequence) (fold-right op initial (cdr sequence)))))
IMHO, scan is easily expressed in terms of fold.
Haskell example:
scan func list = reverse $ foldl (\l e -> (func e (head l)) : l) [head list] (tail list)
Should translate into something like this
(define scan
(lambda (func seq)
(reverse
(fold-left
(lambda (l e) (cons (func e (car l)) l))
(list (car seq))
(cdr seq)))))
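As a quick check of the translation, here is a self-contained sketch that pairs it with the fold-left definition from the question (Racket's built-in foldl takes its arguments in a different order, so the asker's version is reproduced here). It assumes a non-empty input list, since (car seq) is taken unconditionally.

```racket
#lang racket

;; fold-left as defined in the question.
(define (fold-left op initial sequence)
  (define (iter result rest)
    (if (null? rest)
        result
        (iter (op result (car rest)) (cdr rest))))
  (iter initial sequence))

;; scan as translated above: build the running results newest-first
;; with cons, then reverse once at the end.
(define scan
  (lambda (func seq)
    (reverse
     (fold-left
      (lambda (l e) (cons (func e (car l)) l))
      (list (car seq))
      (cdr seq)))))

;; (scan + (list 1 4 8 3 7 9))
;; => '(1 5 13 16 23 32)
```

Compared with the version in the question, the single reverse at the end replaces a reverse and an append inside every step, so the work per element is constant.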
I wouldn’t do this. fold can actually be implemented in terms of scan (last element of the scanned list). But scan and fold are in fact orthogonal operations. If you’ve read the CUDA papers you’ll notice that a scan consists of two phases: the first yields the fold result as a by-product. The second phase is only used for the scan (of course, this only counts for parallel implementations; a sequential implementation of fold is more efficient if it doesn’t rely on scan at all).
IMHO Dario cheated by using reverse, since the exercise was about expressing scan in terms of fold, not a reversed fold. This, of course, is a horrible way to express scan, but it is a fun exercise in jamming a square peg into a round hole.
Here it is in Haskell; I don't know Lisp.
let scan f list = foldl (\ xs next -> xs++[f (last xs) next]) [0] list
scan (+) [1, 4, 8, 3, 7, 9]
[0,1,5,13,16,23,32]
Of course, using the same trick as Dario, one can get rid of that leading 0:
let scan f list = foldl (\ xs next -> xs++[f (last xs) next]) [head list] (tail list)
scan (+) [1, 4, 8, 3, 7, 9]
[1,5,13,16,23,32]