Problem
Let's say you have a number of lists or arrays, let's say two for the sake of example :
(defparameter *arr* #(1 2 3))
(defparameter *list* '(4 5 6))
You can loop over them using either across or in keywords :
(loop for elem across *arr* do (format t "~a" elem))
=> 123
(loop for elem in *list* do (format t "~a" elem))
=> 456
I want to be able to loop over these arrays or lists using the same syntax. I am using SBCL and execution speed is a concern.
Using being the elements of
This syntax is nice, as it works regardless of its argument being a list or array.
(loop for elem being the elements of *arr* do (format t "~a" elem))
=> 123
(loop for elem being the elements of *list* do (format t "~a" elem))
=> 456
But its speed is horrendous. If we do a quick comparison by accessing lists or arrays of 100 elements 1M times :
(format t "# Test 1.1.1 : Accessing list of doubles with loop 'in': ") (terpri)
(let ((test-list (make-list 100 :initial-element 12.2d0))
(testvar 0d0))
(declare (optimize (speed 3))
(type list test-list)
(type double-float testvar))
(time (dotimes (it 1000000 t) (loop for el in test-list do
(setf testvar (the double-float el))))))
(format t "# Test 1.1.2 : Accessing list of doubles with loop 'elements': ") (terpri)
(let ((test-list (make-list 100 :initial-element 12.2d0))
(testvar 0d0))
(declare (optimize (speed 3))
(type list test-list)
(type double-float testvar))
(time (dotimes (it 1000000 t) (loop for el being the elements of test-list do
(setf testvar (the double-float el))))))
(format t "# Test 1.2.1 : Accessing simple-array of doubles using loop 'across' : ") (terpri)
(let ((test-array (make-array 100 :initial-element 12.2d0 :element-type 'double-float))
(testvar 0d0))
(declare (optimize (speed 3))
(type double-float testvar)
(type simple-array test-array))
(time (dotimes (it 1000000 t) (loop for el across test-array do
(setf testvar (the double-float el))))))
(format t "# Test 1.2.2 : Accessing simple-array of doubles using loop 'elements' : ") (terpri)
(let ((test-array (make-array 100 :initial-element 12.2d0 :element-type 'double-float))
(testvar 0d0))
(declare (optimize (speed 3))
(type double-float testvar)
(type simple-array test-array))
(time (dotimes (it 1000000 t) (loop for el being the elements of test-array do
(setf testvar (the double-float el))))))
It gives us :
# Test 1.1.1 : Accessing list of doubles with loop 'in':
Evaluation took:
0.124 seconds of real time
0.123487 seconds of total run time (0.123471 user, 0.000016 system)
99.19% CPU
445,008,960 processor cycles
672 bytes consed
# Test 1.1.2 : Accessing list of doubles with loop 'elements':
Evaluation took:
0.843 seconds of real time
0.841639 seconds of total run time (0.841639 user, 0.000000 system)
99.88% CPU
3,034,104,192 processor cycles
0 bytes consed
# Test 1.2.1 : Accessing simple-array of doubles using loop 'across' :
Evaluation took:
0.062 seconds of real time
0.062384 seconds of total run time (0.062384 user, 0.000000 system)
100.00% CPU
224,896,032 processor cycles
0 bytes consed
# Test 1.2.2 : Accessing simple-array of doubles using loop 'elements' :
Evaluation took:
1.555 seconds of real time
1.554472 seconds of total run time (1.541572 user, 0.012900 system)
[ Run times consist of 0.094 seconds GC time, and 1.461 seconds non-GC time. ]
99.94% CPU
5,598,161,100 processor cycles
1,600,032,848 bytes consed
I think it must use the elt accessor ? Anyway the penalty in speed is unacceptable.
Trying to be smart with macros
I wrote something to be able to achieve my goal of having the same syntax for list and array. I think it's not great because it seems overly awkward, but here :
(defun forbuild (el-sym list-or-array list-or-array-sym)
"Outputs either :
* (for el-sym in list-or-array)
* (for el-sym across list-or-array)
Depending on type of list-or-array.
el-sym : symbol, eg. 'it1
list-or-array : declared, actual data for list or array
list-or-array-sym : symbol name for the table, to avoid writing the data in full
in the 'loop' call using eval.
Example call : (forbuild 'it1 arr 'arr)"
(cond ((typep list-or-array 'array)
`(for ,el-sym across ,list-or-array-sym))
((typep list-or-array 'list)
`(for ,el-sym in ,list-or-array-sym))))
(defun forbuild-l (l-elsyms l-lars l-larsyms)
"forbuild but over lists of things."
(let ((for-list nil)
(list-temp nil))
(loop for elem in l-elsyms
for lar in l-lars
for larsym in l-larsyms do
(setf list-temp (forbuild elem lar larsym))
(loop for word-temp in list-temp do
(push word-temp for-list)))
(nreverse for-list)))
(defun loop-expr (forlist body)
"Creates the expression ready to be evaluated to execute the loop.
forlist : List of symbols to be inserted syntactically. eg.
FOR IT1 ACROSS ARR1 FOR IT2 IN ARR2
body : all the expression after the 'for' clauses in the 'loop'."
`(loop ,#forlist ,#body))
(defmacro looparl (element list-or-array &rest body)
(let ((forlist (gensym)))
`(let ((,forlist (forbuild2-l (quote ,element)
(list ,#list-or-array)
(quote ,list-or-array))))
(loop-expr ,forlist (quote ,body)))))
Basically I build the right loop syntax from the arguments. The version of looparl given here can be called this way :
(let ((arr1 #(7 8 9))
(list2 (list 10 11 12)))
(looparl (it1 it2) (arr1 list2) do (format t "~a ~a" it1 it2) (terpri)))
=> (LOOP FOR IT1 ACROSS ARR1
FOR IT2 IN LIST2
DO (FORMAT T "~a ~a" IT1 IT2) (TERPRI))
The actual evaluation of this outputted expression is omitted in this example, because it doesn't work on non-global names. If we throw in an eval at the end of looparl :
(defmacro looparl (element list-or-array &rest body)
(let ((forlist (gensym)))
`(let ((,forlist (forbuild2-l (quote ,element)
(list ,#list-or-array)
(quote ,list-or-array))))
(eval (loop-expr ,forlist (quote ,body))))))
And work on global variables, we see that we still have a speed issue, since there are evaluations happening at runtime :
(looparl (it1 it2) (*arr* *list*) for it from 100
do (format t "~a ~a ~a" it1 it2 it) (terpri))
=> 1 4 100
2 5 101
3 6 102
(time (dotimes (iter 1000 t) (looparl (it1 it2) (*arr* *list*) for it from 100
do (format t "~a ~a ~a" it1 it2 it) (terpri))))
=> Evaluation took:
1.971 seconds of real time
1.932610 seconds of total run time (1.892329 user, 0.040281 system)
[ Run times consist of 0.097 seconds GC time, and 1.836 seconds non-GC time. ]
98.07% CPU
1,000 forms interpreted
16,000 lambdas converted
7,096,353,696 processor cycles
796,545,680 bytes consed
The macros are evaluated each one at a time a thousand times. But surely the type is known at compile time no ? The type of syntax in looparl is very nice, and I'd like to be able to use it without speed penalty.
I read this note in Peter Seibel's book Practical Common Lisp, chapter "Loop for Black Belts"
3 You may wonder why LOOP can't figure out whether it's looping over a list or a vector without needing different prepositions. This is another consequence of LOOP being a macro: the value of the list or vector won't be known until runtime, but LOOP, as a macro, has to generate code at compile time. And LOOP's designers wanted it to generate extremely efficient code. To be able to generate efficient code for looping across, say, a vector, it needs to know at compile time that the value will be a vector at runtime--thus, the different prepositions are needed.
Am I committing some big Common-Lisp nonsense ? How would you go about creating a working, quick looparl ?
Edit 1 : FOR library
Thank you Ehvince for the reference to the FOR library. The over keyword in the for:for function is indeed exactly what I'd need. However the benchmarks are really underwhelming :
(let ((test-list (make-list 100 :initial-element 12.2d0))
(testvar 0d0))
(declare (optimize (speed 3))
(type list test-list)
(type double-float testvar))
(time (dotimes (it 1000000 t)
(for:for ((el over test-list))
(setf testvar (the double-float el))))))
(let ((test-array (make-array 100 :initial-element 12.2d0))
(testvar 0d0))
(declare (optimize (speed 3))
(type simple-array test-array)
(type double-float testvar))
(time (dotimes (it 1000000 t)
(for:for ((el over test-array))
(setf testvar (the double-float el))))))
Evaluation took:
4.802 seconds of real time
4.794485 seconds of total run time (4.792492 user, 0.001993 system)
[ Run times consist of 0.010 seconds GC time, and 4.785 seconds non-GC time. ]
99.83% CPU
17,286,934,536 processor cycles
112,017,328 bytes consed
Evaluation took:
6.758 seconds of real time
6.747879 seconds of total run time (6.747879 user, 0.000000 system)
[ Run times consist of 0.004 seconds GC time, and 6.744 seconds non-GC time. ]
99.85% CPU
24,329,311,848 processor cycles
63,995,808 bytes consed
The speed of this library using the specialized keywords in and across is the same as for the standard loop. But very slow with over.
Edit 2 : map and etypecase
Thank you sds and Rainer Joswig for the suggestions. It would indeed work for the simple case in which I would only have one array/list to iterate over. Let me tell you about one use case I had in mind : I was implementing a gnuplot wrapper, both as training and to have my own program in my toolkit. I wanted to take from the user lists or arrays indifferently to serve as series to pipe to gnuplot. This is why I need to be able to loop over multiple array/lists simultaneously + using the elegant loop clauses for iteration number etc.
In this use case (gnuplot wrapper), I only have two or three for clauses in my loop for each "data block", so I have thought of writing each combination depending on the type of input by hand and it is possible, but very awkward. And I'd be stuck if I had to do something like :
(loop for el1 in list1
for el2 across arr1
for el3 in list2
for el4 in list3
...)
With the list-i and arr-i being inputs. Another fallback plan for this use case is just to convert everything to arrays.
I thought that since it is quite easily conceptualized, I could write something flexible and fast once and for all, but there must be a reason why it is neither in the specs nor in SBCL-specific code.
What you are looking for is called
map:
both
(map nil #'princ '(1 2 3))
and
(map nil #'princ #(1 2 3))
print 123.
However, lists and arrays are very different beasts, and it is best to decide in advance which one you want to use.
The library For, by Shinmera, has the generic over iterator:
(ql:quickload "for")
(for:for ((a over *arr*)
(b over *list*))
(print (list a b)))
;; (1 4)
;; (2 5)
;; (3 6)
It also has "in" and "accross", so it might help to use "over" during development and to refine later, if needed.
I'll let you do the benchmarks :)
For trivial uses you might do
(flet ((do-something (e)
...))
(etypecase foo
(vector (loop for e across foo do (do-something e)))
(list (loop for e in foo do (do-something e))))
The runtime type dispatch probably will be faster than a generic iteration construct using the sequence abstraction.
Coercing an array to a list and then looping in gives the same performance as if it had been a list in the first place, which isn't as good as with array, but not nearly so bad as using element and it does have the virtue of working with either a list or an array without additional machinery:
(loop for x in (coerce array 'list) do something)
Related
I am trying to compare the performance of a function and a macro.
EDIT: Why do I want to compare the two?
Paul Graham wrote in his ON LISP book that macros can be used to make a system more efficient because a lot of the computation can be done at compile time. so in the example below (length args) is dealt with at compile time in the macro case and at run time in the function case. So, I just wanted how much faster did (avg2 super-list) get computed relative to (avg super-list).
Here is the function and the macro:
(defun avg (args)
(/ (apply #'+ args) (length args)))
(defmacro avg2 (args)
`(/ (+ ,#args) ,(length args)))
I have looked at this question How to pass a list to macro in common lisp? and a few other ones but they do not help because their solutions do not work; for example, in one of the questions a user answered by saying to do this:
(avg2 (2 3 4 5))
instead of this:
(avg2 '(2 3 4))
This works but I want a list containg 100,000 items:
(defvar super-list (loop for i from 1 to 100000 collect i))
But this doesnt work.
So, how can I pass super-list to avg2?
First of all, it simply makes no sense to 'compare the performance of a function and a macro'. It only makes sense to compare the performance of the expansion of a macro with a function. So that's what I'll do.
Secondly, it only makes sense to compare the performance of a function with the expansion of a macro if that macro is equivalent to the function. In other words the only places this comparison is useful is where the macro is being used as a hacky way of inlining a function. It doesn't make sense to compare the performance of something which a function can't express, like if or and say. So we must rule out all the interesting uses of macros.
Thirdly it makes no sense to compare the performance of things which are broken: it is very easy to make programs which do not work be as fast as you like. So I'll successively modify both your function and macro so they're not broken.
Fourthly it makes no sense to compare the performance of things which use algorithms which are gratuitously terrible, so I'll modify both your function and your macro to use better algrorithms.
Finally it makes no sense to compare the performance of things without using the tools the language provides to encourage good performance, so I will do that as the last step.
So let's address the third point above: let's see how avg (and therefore avg2) is broken.
Here's the broken definition of avg from the question:
(defun avg (args)
(/ (apply #'+ args) (length args)))
So let's try it:
> (let ((l (make-list 1000000 :initial-element 0)))
(avg l))
Error: Last argument to apply is too long: 1000000
Oh dear, as other people have pointed out. So probably I need instead to make avg at least work. As other people have, again, pointed out, the way to do this is reduce:
(defun avg (args)
(/ (reduce #'+ args) (length args)))
And now a call to avg works, at least. avg is now non-buggy.
We need to make avg2 non-buggy as well. Well, first of all the (+ ,#args) thing is a non-starter: args is a symbol at macroexpansion time, not a list. So we could try this (apply #'+ ,args) (the expansion of the macro is now starting to look a bit like the body of the function, which is unsurprising!). So given
(defmacro avg2 (args)
`(/ (apply #'+ ,args) (length ,args)))
We get
> (let ((l (make-list 1000000 :initial-element 0)))
(avg2 l))
Error: Last argument to apply is too long: 1000000
OK, unsurprising again. let's fix it to use reduce again:
(defmacro avg2 (args)
`(/ (reduce #'+ ,args) (length ,args)))
So now it 'works'. Except it doesn't: it's not safe. Look at this:
> (macroexpand-1 '(avg2 (make-list 1000000 :initial-element 0)))
(/ (reduce #'+ (make-list 1000000 :initial-element 0))
(length (make-list 1000000 :initial-element 0)))
t
That definitely is not right: it will be enormously slow but also it will just be buggy. We need to fix the multiple-evaluation problem.
(defmacro avg2 (args)
`(let ((r ,args))
(/ (reduce #'+ r) (length r))))
This is safe in all sane cases. So this is now a reasonably safe 70s-style what-I-really-want-is-an-inline-function macro.
So, let's write a test-harness both for avg and avg2. You will need to recompile av2 each time you change avg2 and in fact you'll need to recompile av1 for a change we're going to make to avg as well. Also make sure everything is compiled!
(defun av0 (l)
l)
(defun av1 (l)
(avg l))
(defun av2 (l)
(avg2 l))
(defun test-avg-avg2 (nelements niters)
;; Return time per call in seconds per iteration per element
(let* ((l (make-list nelements :initial-element 0))
(lo (let ((start (get-internal-real-time)))
(dotimes (i niters (- (get-internal-real-time) start))
(av0 l)))))
(values
(let ((start (get-internal-real-time)))
(dotimes (i niters (float (/ (- (get-internal-real-time) start lo)
internal-time-units-per-second
nelements niters)))
(av1 l)))
(let ((start (get-internal-real-time)))
(dotimes (i niters (float (/ (- (get-internal-real-time) start lo)
internal-time-units-per-second
nelements niters)))
(av2 l))))))
So now we can test various combinations.
OK, so now the fouth point: both avg and avg2 use awful algorithms: they traverse the list twice. Well we can fix this:
(defun avg (args)
(loop for i in args
for c upfrom 0
summing i into s
finally (return (/ s c))))
and similarly
(defmacro avg2 (args)
`(loop for i in ,args
for c upfrom 0
summing i into s
finally (return (/ s c))))
These changes made a performance difference of about a factor of 4 for me.
OK so now the final point: we should use the tools the language gives us. As has been clear throughout this whole exercise only make sense if you're using a macro as a poor-person's inline function, as people had to do in the 1970s.
But it's not the 1970s any more: we have inline functions.
So:
(declaim (inline avg))
(defun avg (args)
(loop for i in args
for c upfrom 0
summing i into s
finally (return (/ s c))))
And now you will have to make sure you recompile avg and then av1. And when I look at av1 and av2 I can now see that they are the same code: the entire purpose of avg2 has now gone.
Indeed we can do even better than this:
(define-compiler-macro avg (&whole form l &environment e)
;; I can't imagine what other constant forms there might be in this
;; context, but, well, let's be safe
(if (and (constantp l e)
(listp l)
(eql (first l) 'quote))
(avg (second l))
form))
Now we have something which:
has the semantics of a function, so, say (funcall #'avg ...) will work;
isn't broken;
uses a non-terrible algorithm;
will be inlined on any competent implementation of the language (which I bet is 'all implementations' now) when it can be;
will detect (some?) cases where it can be compiled completely away and replaced by a compile-time constant.
Since the value of super-list is known, one can do all computation at macro expansion time:
(eval-when (:execute :compile-toplevel :load-toplevel)
(defvar super-list (loop for i from 1 to 100000 collect i)))
(defmacro avg2 (args)
(setf args (eval args))
(/ (reduce #'+ args) (length args)))
(defun test ()
(avg2 super-list))
Trying the compiled code:
CL-USER 10 > (time (test))
Timing the evaluation of (TEST)
User time = 0.000
System time = 0.000
Elapsed time = 0.000
Allocation = 0 bytes
0 Page faults
100001/2
Thus the runtime is near zero.
The generated code is just a number, the result number:
CL-USER 11 > (macroexpand '(avg2 super-list))
100001/2
Thus for known input this macro call in compiled code has a constant runtime of near zero.
I don't think you really want a list of 100,000 items. That would have terrible performance with all that cons'ing. You should consider a vector instead, e.g.
(avg2 #(2 3 4))
You didn't mention why it didn't work; if the function never returns, it's likely a memory issue from such a large list, or attempting to apply on such a large function argument list; there are implementation defined limits on how many arguments you can pass to a function.
Try reduce on a super-vector instead:
(reduce #'+ super-vector)
The function set-difference is restricted to finding the difference between two sets. Can this be efficiently extended to allow more set arguments--eg, (my-set-difference A B C)--in the same way the function - works--eg, (- 9 3 1) => 5? Using (reduce #'set-difference ...) is not very efficient, as it first requires appending all of the set arguments into a sequence.
Actually, I think concatenating all the lists except the first one is probably the best solution.
Each invocation of set-difference will be O(n) (where n is the maximum size of the two lists), so reducing will be O(n*m) (where m is the number of lists). But if you do
(set-difference A (append B C D E F ...))
Appending all the lists is O(total length of B...), and the complexity of set-difference will be similar.
I don't know how accurate the following quick test is, but it says Barmar's append method is about 14 times faster than the reduce method, but conses twice as much.
(defparameter A
(mapcar (lambda (elt)
(declare (ignore elt))
(random 100))
(make-list 100)))
(defparameter B
(mapcar (lambda (elt)
(declare (ignore elt))
(random 100))
(make-list 100)))
(defparameter C
(mapcar (lambda (elt)
(declare (ignore elt))
(random 100))
(make-list 100)))
* (time (dotimes (i 100000) (reduce #'set-difference (list A B C))))
Evaluation took:
0.877 seconds of real time
0.875000 seconds of total run time (0.875000 user, 0.000000 system)
[ Run times consist of 0.016 seconds GC time, and 0.859 seconds non-GC time. ]
99.77% CPU
3,155,360,287 processor cycles
78,380,176 bytes consed
NIL
* (time (dotimes (i 100000) (set-difference A (append B C))))
Evaluation took:
0.064 seconds of real time
0.062500 seconds of total run time (0.062500 user, 0.000000 system)
96.88% CPU
229,293,666 processor cycles
159,971,568 bytes consed
NIL
But I've heard the SBCL time report is not very accurate (and this test may be faulty!).
This is not a homework assignment. In the following code:
(defparameter nums '())
(defun fib (number)
(if (< number 2)
number
(push (+ (fib (- number 1)) (fib (- number 2))) nums))
return nums)
(format t "~a " (fib 100))
Since I am quite inexperienced with Common Lisp, I am at a loss as to why the function does not return an value. I am a trying to print first 'n' values, e.g., 100, of the Fibonacci Sequence.
Thank you.
An obvious approach to computing fibonacci numbers is this:
(defun fib (n)
(if (< n 2)
n
(+ (fib (- n 1)) (fib (- n 2)))))
(defun fibs (n)
(loop for i from 1 below n
collect (fib i)))
A little thought should tell you why no approach like this is going to help you compute the first 100 Fibonacci numbers: the time taken to compute (fib n) is equal to or a little more than the time taken to compute (fib (- n 1)) plus the time taken to compute (fib (- n 2)): this is exponential (see this stack overflow answer).
A good solution to this is memoization: the calculation of (fib n) repeats subcalculations a huge number of times, and if we can just remember the answer we computed last time we can avoid doing so again.
(An earlier version of this answer has an overcomplex macro here: something like that may be useful in general but is not needed here.)
Here is how you can memoize fib:
(defun fib (n)
(check-type n (integer 0) "natural number")
(let ((so-far '((2 . 1) (1 . 1) (0 . 0))))
(labels ((fibber (m)
(when (> m (car (first so-far)))
(push (cons m (+ (fibber (- m 1))
(fibber (- m 2))))
so-far))
(cdr (assoc m so-far))))
(fibber n))))
This keeps a table – an alist – of the results it has computed so far, and uses this to avoid recomputation.
With this memoized version of the function:
> (time (fib 1000))
Timing the evaluation of (fib 1000)
User time = 0.000
System time = 0.000
Elapsed time = 0.000
Allocation = 101944 bytes
0 Page faults
43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875
The above definition uses a fresh cache for each call to fib: this is fine, because the local function, fibber does reuse the cache. But you can do better than this by putting the cache outside the function altogether:
(defmacro define-function (name expression)
;; Install EXPRESSION as the function value of NAME, returning NAME
;; This is just to avoid having to say `(setf ...)`: it should
;; probably do something at compile-time too so the compiler knows
;; the function will be defined.
`(progn
(setf (fdefinition ',name) ,expression)
',name))
(define-function fib
(let ((so-far '((2 . 1) (1 . 1) (0 . 0))))
(lambda (n)
(block fib
(check-type n (integer 0) "natural number")
(labels ((fibber (m)
(when (> m (car (first so-far)))
(push (cons m (+ (fibber (- m 1))
(fibber (- m 2))))
so-far))
(cdr (assoc m so-far))))
(fibber n))))))
This version of fib will share its cache between calls, which means it is a little faster, allocates a little less memory but may be less thread-safe:
> (time (fib 1000))
[...]
Allocation = 96072 bytes
[...]
> (time (fib 1000))
[...]
Allocation = 0 bytes
[...]
Interestingly memoization was invented (or at least named) by Donald Michie, who worked on breaking Tunny (and hence with Colossus), and who I also knew slightly: the history of computing is still pretty short!
Note that memoization is one of the times where you can end up fighting a battle with the compiler. In particular for a function like this:
(defun f (...)
...
;; no function bindings or notinline declarations of F here
...
(f ...)
...)
Then the compiler is allowed (but not required) to assume that the apparently recursive call to f is a recursive call into the function it is compiling, and thus to avoid a lot of the overhead of a full function call. In particular it is not required to retrieve the current function value of the symbol f: it can just call directly into the function itself.
What this means is that an attempt to write a function, memoize which can be used to mamoize an existing recursive function, as (setf (fdefinition 'f) (memoize #'f)) may not work: the function f still call directly into the unmemoized version of itself and won't notice that the function value of f has been changed.
This is in fact true even if the recursion is indirect in many cases: the compiler is allowed to assume that calls to a function g for which there is a definition in the same file are calls to the version defined in the file, and again avoid the overhead of a full call.
The way to deal with this is to add suitable notinline declarations: if a call is covered by a notinline declaration (which must be known to the compiler) then it must be made as a full call. From the spec:
A compiler is not free to ignore this declaration; calls to the specified functions must be implemented as out-of-line subroutine calls.
What this means is that, in order to memoize functions you have to add suitable notinline declarations for recursive calls, and this means that memoizing either needs to be done by a macro, or must rely on the user adding suitable declarations to the functions to be memoized.
This is only a problem because the CL compiler is allowed to be smart: almost always that's a good thing!
Your function unconditionally returns nums (but only if a variable called return exists). To see why, we can format it like this:
(defun fib (number)
(if (< number 2)
number
(push (+ (fib (- number 1)) (fib (- number 2))) nums))
return
nums)
If the number is less than 2, then it evaluates the expression number, uselessly, and throws away the result. Otherwise, it pushes the result of the (+ ....) expression onto the nums list. Then it uselessly evaluates return, throwing away the result. If a variable called return doesn't exist, that's an error situation. Otherwise, it evaluates nums and that is the return value.
In Common Lisp, there is a return operator for terminating and returning out of anonymous named blocks (blocks whose name is the symbol nil). If you define a named function with defun, then an invisible block exists which is not anonymous: it has the same name as that function. In that case, return-from can be used:
(defun function ()
(return-from function 42) ;; function terminates, returns 42
(print 'notreached)) ;; this never executes
Certain standard control flow and looping constructs establish a hidden anonymous block, so return can be used:
(dolist (x '(1 2 3))
(return 42)) ;; loop terminates, yields 42 as its result
If we use (return ...) but there is no enclosing anonymous block, that is an error.
The expression (return ...) is different from just return, which evaluates a variable named by the symbol return, retrieving its contents.
It is not clear how to repair your fib function, because the requirements are unknown. The side effect of pushing values into a global list normally doesn't belong inside a mathematical function like this, which should be pure (side-effect-free).
So you might know that if you know the two previous numbers you can compute the next. What comes after 3, 5? If you guess 8 you have understood it. Now if you start with 0, 1 and roll 1, 1, 1, 2, etc you collect the first variable until you have the number of numbers you'd like:
(defun fibs (elements)
"makes a list of elements fibonacci numbers starting with the first"
(loop :for a := 0 :then b
:for b := 1 :then c
:for c := (+ a b)
:for n :below elements
:collect a))
(fibs 10)
; ==> (0 1 1 2 3 5 8 13 21 34)
Every form in Common Lisp "returns" a value. You can say it evaluates to. eg.
(if (< a b)
5
10)
This evaluates either to 5 or 10. Thus you can do this and expect that it evaluates to either 15 or 20:
(+ 10
(if (< a b)
5
10))
You basically want your functions to have one expression that calculates the result. eg.
(defun fib (n)
(if (zerop n)
n
(+ (fib (1- n)) (fib (- n 2)))))
This evaluates to the result og the if expression... loop with :collect returns the list. You also have (return expression) and (return-from name expression) but they are usually unnecessary.
Your global variable num is actually not that a bad idea.
It is about to have a central memory about which fibonacci numbers were already calculated. And not to calculate those already calculated numbers again.
This is the very idea of memoization.
But first, I do it in bad manner with a global variable.
Bad version with global variable *fibonacci*
(defparameter *fibonacci* '(1 1))
(defun fib (number)
(let ((len (length *fibonacci*)))
(if (> len number)
(elt *fibonacci* (- len number 1)) ;; already in *fibonacci*
(labels ((add-fibs (n-times)
(push (+ (car *fibonacci*)
(cadr *fibonacci*))
*fibonacci*)
(cond ((zerop n-times) (car *fibonacci*))
(t (add-fibs (1- n-times))))))
(add-fibs (- number len))))))
;;> (fib 10)
;; 89
;;> *fibonacci*
;; (89 55 34 21 13 8 5 3 2 1 1)
Good functional version (memoization)
In memoization, you hide the global *fibonacci* variable
into the environment of a lexical function (the memoized version of a function).
(defun memoize (fn)
(let ((cache (make-hash-table :test #'equal)))
#'(lambda (&rest args)
(multiple-value-bind (val win) (gethash args cache)
(if win
val
(setf (gethash args cache)
(apply fn args)))))))
(defun fib (num)
(cond ((zerop num) 1)
((= 1 num) 1)
(t (+ (fib (- num 1))
(fib (- num 2))))))
The previously global variable *fibonacci* is here actually the local variable cache of the memoize function - encapsulated/hidden from the global environment,
accessible/look-up-able only through the function fibm.
Applying memoization on fib (bad version!)
(defparameter fibm (memoize #'fib))
Since common lisp is a Lisp 2 (separated namespace between function and variable names) but we have here to assign the memoized function to a variable,
we have to use (funcall <variable-name-bearing-function> <args for memoized function>).
(funcall fibm 10) ;; 89
Or we define an additional
(defun fibm (num)
(funcall fibm num))
and can do
(fibm 10)
However, this saves/memoizes only the out calls e.g. here only the
Fibonacci value for 10. Although for that, Fibonacci numbers
for 9, 8, ..., 1 are calculated, too.
To make them saved, look the next section!
Applying memoization on fib (better version by #Sylwester - thank you!)
(setf (symbol-function 'fib) (memoize #'fib))
Now the original fib function is the memoized function,
so all fib-calls will be memoized.
In addition, you don't need funcall to call the memoized version,
but just do
(fib 10)
How do I use cons or other way to print a list of Pell numbers till the Nth number?
(defun pellse (k)
(if (or (zerop k) (= k 1))
k
(+ (* 2 (pellse (- k 1)))
(pellse (- k 2)))))
(print (pellse 7))
Here is how to do it in a way that won’t be exponential:
(defun pells (n)
(loop repeat n
for current = 0 then next
and next = 1 then (+ (* 2 next) current)
collect current))
The time complexity to calculate the nth element given the two previous elements is O(log(Pn)) where Pn is the nth Pell number; you need log(Pn) bits for the answer and log(Pn) operations for the addition. We don’t actually need to work out what Pn is: It is defined by a simple linear recurrence relation so the solution must be exponential so log(Pn) = O(n). Therefore the complexity of calculating the first n Pell numbers is O(n*n) = O(n2).
One cannot[*] do better than O(n2) as one must write O(n2) bits to represent all these numbers.
[*] Although I very much doubt this, it might, in theory, be possible to represent the list in some more compact way by somehow sharing data.
Here is an approach to solving this problem which works by defining an infinite stream of Pell numbers. This is based on the ideas presented in SICP, and particularly section 3.5. Everyone should read this book.
First of all we need to define a construct which will let us talk about infinite data structures. We do this by delaying the evaluation of all but a finite part of them. So start with a macro called delay which delays the evaluation of a form, returning a 'promise' (which is a function of course), and a function called force which forces the system to make good on its promise:
(defmacro delay (form)
;; Delay FORM, which may evaluate to multiple values. This has
;; state so the delayed thing is only called once.
(let ((evaluatedp-n (make-symbol "EVALUATEDP"))
(values-n (make-symbol "VALUES")))
`(let ((,evaluatedp-n nil) ,values-n)
(lambda ()
(unless ,evaluatedp-n
(setf ,evaluatedp-n t
,values-n (multiple-value-list
(funcall (lambda () ,form)))))
(values-list ,values-n)))))
(defun force (promise)
;; force a promise (delayed thing)
(funcall promise))
(This implementation is slightly overcomplex for our purposes, but it's what I had to hand.).
Now we'll use delay to define streams which are potentially infinite chains of conses. There are operations on these corresponding to operations on conses but prefixed by stream-, and there is an object called null-stream which corresponds to () (and is in fact the same object in this implementation).
(defmacro stream-cons (car cdr)
;; a cons whose cdr is delayed
`(cons ,car (delay ,cdr)))
(defun stream-car (scons)
;; car of a delayed cons
(car scons))
(defun stream-cdr (scons)
;; cdr of a delayed cons, forced
(force (cdr scons)))
(defconstant null-stream
;; the empty delayed cons
nil)
(defun stream-null (stream)
;; is a delayed cons empty
(eq stream null-stream))
Now define a function pell-stream which returns a stream of Pell numbers. This function hand-crafts the first two elements of the stream, and then uses a generator to make the rest.
(defun pell-stream ()
;; A stream of Pell numbers
(labels ((pell (pn pn-1)
(let ((p (+ (* 2 pn) pn-1)))
(stream-cons p (pell p pn)))))
(stream-cons 0 (stream-cons 1 (pell 1 0)))))
And now we can simply repeatedly takes stream-cdr to compute Pell numbers.
(defun n-pell-numbers (n)
(loop repeat n
for scons = (pell-stream) then (stream-cdr scons)
collect (stream-car scons)))
And now
> (n-pell-numbers 20)
(0
1
2
5
12
29
70
169
408
985
2378
5741
13860
33461
80782
195025
470832
1136689
2744210
6625109)
Note that, in fact, pell-stream can be a global variable: it doesn't need to be a function:
(defparameter *pell-stream*
(labels ((pell (pn pn-1)
(let ((p (+ (* 2 pn) pn-1)))
(stream-cons p (pell p pn)))))
(stream-cons 0 (stream-cons 1 (pell 1 0)))))
(defun n-stream-elements (stream n)
(loop repeat n
for scons = stream then (stream-cdr scons)
collect (stream-car scons)))
If we define a little benchmarking program:
(defun bench-pell (n)
(progn (n-stream-elements *pell-stream* n) n))
Then it's interesting to see that this is clearly essentially a linear process (it slows down for later elements because the numbers get big and so operations on them take a long time), and that the stateful implementation of promises makes it much faster after the first iteration (at the cost of keeping quite a lot of bignums around):
> (time (bench-pell 100000))
Timing the evaluation of (bench-pell 100000)
User time = 2.020
System time = 0.803
Elapsed time = 2.822
Allocation = 1623803280 bytes
441714 Page faults
100000
> (time (bench-pell 100000))
Timing the evaluation of (bench-pell 100000)
User time = 0.007
System time = 0.000
Elapsed time = 0.006
Allocation = 1708248 bytes
0 Page faults
100000
One possible solution would be to use the LOOP macro of Common Lisp, e.g.:
(print
(loop for x in '(1 2 3 4 5 6 7)
for y = (pellse x)
collect y))
That prints out the following result:
(1 2 5 12 29 70 169)
Based on this, you can build the following function:
(defun list-of-n-pell-numbers (n)
(loop for x from 0 to n
for y = (pellse x)
collect y))
And run it like the following:
(print (list-of-n-pell-numbers 7))
(0 1 2 5 12 29 70 169)
But please be careful when using this code, because your definition of pellse function is recursive, and has the risk of a stack overflow: make it call itself repeatedly enough (e.g. for big values of N), and it might blow up the call stack, unless you do some tail recursion. You might want to check the following explanations:
http://www.lispworks.com/documentation/lcl50/aug/aug-51.html
https://0branch.com/notes/tco-cl.html
On page 329 of Land of Lisp, Conrad Barski explains the technique of memoization with the following example code
(let ((old-neighbors (symbol-function 'neighbors))
(previous (make-hash-table)))
(defun neighbors (pos)
(or (gethash pos previous)
(setf (gethash pos previous) (funcall old-neighbors pos)))))
The idea is nice: when I call the neighbors function, I save the result into a hash table, so that the next time calling neighbors with the same value of pos, I can just look-up the result without having to compute it again.
So I was wondering, whether it would not be easier to rename the function neighbors to old-neighbors by editing and recompiling its source code (given on page 314 of the book). Then the memoization example could be simplified to
(let ((previous (make-hash-table)))
(defun neighbors (pos)
(or (gethash pos previous)
(setf (gethash pos previous) (funcall old-neighbors pos)))))
or, by turning previous into a global variable *previous-neighbors* beforehand, even into
(defun neighbors (pos)
(or (gethash pos *previous-neighbors*)
(setf (gethash pos *previous-neighbors*) (funcall old-neighbors pos))))
thus rendering the closure unnecessary.
So my question is: what is the reason for doing it this way?
Reasons I could imagine:
It's didactical, showing what could be done with a closure (which had been explained just before) and providing an example of symbol-function.
This technique is applicable even in situations, where you cannot or may not change the source code of neighbors.
I am missing something.
You have made some good observations.
Generally the style to use closures like that is more likely to be found in Scheme code - where Scheme developers often use functions to return functions.
Generally as you have detected it makes little sense to have a memoize function foo calling an function old-foo. Using global variables reduces encapsulation (-> information hiding), but increases the ability to debug a program, since one can more easily inspect the memoized values.
A similar, but potentially more useful, pattern would be this:
(defun foo (bar)
<does some expensive computation>)
(memoize 'foo)
Where ˋmemoizeˋ would be something like this
(defun memoize (symbol)
(let ((original-function (symbol-function symbol))
(values (make-hash-table)))
(setf (symbol-function symbol)
(lambda (arg &rest args)
(or (gethash arg values)
(setf (gethash arg values)
(apply original-function arg args)))))))
The advantage is that you don't need to write the memoizing code for each function. You only need one function memoize. In this case the closure also makes sense - though you could also have a global table storing memoize tables.
Note the limitations of above: the comparison uses EQL and only the first argument of the function to memoize.
There are also more extensive tools to provide similar functionality.
See for example:
https://github.com/fare/fare-memoization
https://github.com/sharplispers/cormanlisp/blob/master/examples/memoize.lisp
https://github.com/AccelerationNet/function-cache
Using my memoize from above:
CL-USER 22 > (defun foo (n)
(sleep 3)
(expt 2 n))
FOO
CL-USER 23 > (memoize 'foo)
#<Closure 1 subfunction of MEMOIZE 40600008EC>
The first call with arg 10 runs three seconds:
CL-USER 24 > (foo 10)
1024
The second call with arg 10 runs faster:
CL-USER 25 > (foo 10)
1024
The first call with arg 2 runs three seconds:
CL-USER 26 > (foo 2)
4
The second call with arg 2 runs faster:
CL-USER 27 > (foo 2)
4
The third call with arg 10 runs fast:
CL-USER 28 > (foo 10)
1024