I'm writing a recursive Lisp macro taking a number n and evaluating the body n times (an exercise from ANSI Lisp). I've tried two ways of doing this -- having the macro call itself in its expansion, and having the macro expand into a local recursive function. Neither works as I want it to.
Here's the first one -- it has a stack overflow, but when I look at its expansion using macroexpand-1 it seems fine to me.
(defmacro n-times (n &rest body)
  (let ((i (gensym)))
    `(let ((,i ,n))
       (if (zerop ,i) nil
           (progn
             ,@body
             (n-times (- ,i 1) ,@body))))))
Here's the second one -- it makes an error, "undefined function #xxxx called with arguments (z)" where #xxxx is the name of the gensym and z is 1 less than the number I call it with. I think there's a problem with the way I use gensyms and flet together, but I'm not sure how to do this correctly.
(defmacro n-times (n &rest body)
  (let ((g (gensym)))
    `(flet ((,g (x)
              (if (zerop x) nil
                  (progn
                    ,@body
                    (,g (- x 1))))))
       (,g ,n))))
To answer your first question, you have a recursive macro expansion that never stops recursing. The presence of the if doesn't stop the recursive macro expansion, since macro expansion happens at compile-time and your if happens at run-time.
To answer your second question, you can't use flet to specify recursive functions, you have to use labels instead.
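For reference, here is a sketch of that second attempt with labels substituted for flet (otherwise unchanged, so the body is still evaluated n times at run time):

(defmacro n-times (n &rest body)
  (let ((g (gensym)))
    `(labels ((,g (x)
                (if (zerop x) nil
                    (progn
                      ,@body
                      (,g (- x 1))))))
       (,g ,n))))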
Since macro expansion in Common Lisp happens kind of before runtime, this is slightly tricky.
Remember, the macro sees source code. This means:
the number n must be passed as a literal number, not a variable, when you use the macro. Thus the number is known at macro expansion time. For such a macro, I would check that in the macro - otherwise you will always be tempted to write something like (let ((n 10)) (n-times n ...)), which won't work.
the macro needs to compute the recursive iteration. Thus the logic is in the macro, and not in the generated code. Each expansion needs to generate code that is one step simpler, until the base case is reached. A sketch follows below.
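A minimal sketch along those lines (my own illustration, not an official solution to the exercise), requiring n to be a literal non-negative integer:

(defmacro n-times (n &rest body)
  (check-type n (integer 0) "a literal non-negative integer")
  (if (zerop n)
      nil
      `(progn
         ,@body
         (n-times ,(- n 1) ,@body))))

(n-times 3 (print 'hi))   ; expands, step by step, into three nested PROGNs of the body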
I've looked at everything I can find about letrec, and I still don't understand what it brings to a language as a feature. It seems like everything expressible with letrec could just as easily be written as a recursive function. But are there any reasons to expose letrec as a feature of a programming language, if the language already supports recursive functions? Why do several languages expose both?
I get that letrec might be used to implement other features including recursive functions, but that's not relevant to why it should itself be a feature. I've also read that some people find it more readable than recursive functions in some lisps, but again this is not relevant, because the designer of the language can make an effort to make recursive functions readable enough to not need another feature. Finally, I've been told that letrec makes it possible to express some kinds of recursive values more succinctly, but I have yet to find a motivating example.
TL;DR: define is letrec. This is what enables us to write recursive definitions in the first place.
Consider
let fact = fun (n => (n==0 -> 1 ; n * fact (n-1)))
To what entity does the name fact inside the body of this definition refer? With let foo = val, val is defined in terms of already known entities, so it can't refer to foo, which is not defined yet. In terms of scope, this is usually expressed by saying that the RHS of the let equation is evaluated in the outer scope.
The only way for the inner fact to actually point at the one being defined, is to use letrec, where the entity being defined is allowed to refer to the scope in which it is being defined. So while causing evaluation of an entity while its definition is in progress is an error, storing a reference to its (future, at this point in time) value is fine -- in the case of using letrec that is.
The define you refer to is just letrec under another name. In Scheme as well.
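A small concrete Scheme illustration of this (my own sketch, not part of the original answer): with letrec, the right-hand side can see the very binding being created.

(letrec ((fact (lambda (n)
                 (if (= n 0) 1 (* n (fact (- n 1)))))))
  (fact 5))   ; => 120

;; The same form with plain let would not work: the fact inside the
;; lambda would refer to an outer (probably unbound) fact, not to the
;; one being defined.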
Without the ability of an entity being defined to refer to itself (i.e. in languages with only a non-recursive let), to get recursion one has to resort to arcane devices such as the Y combinator, which is cumbersome and usually inefficient. Another way is to write definitions like
let fact = (fun (f => f f)) (fun (r => n => (n==0 -> 1 ; n * r r (n-1))))
So letrec brings to the table the efficiency of implementation, and convenience for a programmer.
The question then becomes: why expose the non-recursive let at all? Haskell indeed does not. Scheme has both letrec and let. One reason might be completeness. Another might be a simpler implementation for let, with fewer self-referential run-time structures in memory, making it easier on the garbage collector.
You ask for a motivational example. Consider defining Fibonacci numbers as a self-referential lazy list:
letrec fibs = {0} + {1} + add fibs (tail fibs)
With a non-recursive let, the fibs on the right-hand side would have to be another, previously defined copy of the list, used as input to the element-wise addition function add; that copy would in turn need yet another copy of fibs for its own definition, and so on. Accessing the nth Fibonacci number would cause a chain of n-1 lists to be created and maintained at run time. Not a pretty picture.
And that's assuming the same fibs was used for tail fibs as well. If not, all bets are off.
What is needed is that fibs uses itself, refers to itself, so only one copy of the list is maintained.
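For concreteness, here is a rough Scheme sketch of that self-referential definition (my own illustration using delay/force; scons, scar, scdr, sadd and sref are ad-hoc helpers defined here, not standard procedures):

(define-syntax scons            ; a stream cell whose tail is delayed
  (syntax-rules ()
    ((_ a d) (cons a (delay d)))))
(define (scar s) (car s))
(define (scdr s) (force (cdr s)))
(define (sadd s t)              ; element-wise addition of two streams
  (scons (+ (scar s) (scar t)) (sadd (scdr s) (scdr t))))
(define (sref s n)              ; nth element of a stream
  (if (zero? n) (scar s) (sref (scdr s) (- n 1))))

(define fibs                    ; define acts as letrec: fibs refers to itself
  (scons 0 (scons 1 (sadd fibs (scdr fibs)))))

(sref fibs 10)                  ; => 55, with a single fibs list in memory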
NB: Although this is not a Scheme-specific problem, I'm using Scheme to demonstrate the differences. Hope you can read a little Lisp code.
A letrec is just a special let where the bindings themselves are defined before the expressions that represent their values are evaluated. Imagine this:
(define (fib n)
  (let ((fib (lambda (n a b)
               (if (zero? n)
                   a
                   (fib (- n 1) b (+ a b))))))
    (fib n 0 1)))
This code fails: while fib does exist in the body of the let, it does not exist inside the closure it defines, since the binding didn't exist yet when the lambda was evaluated. To fix this, letrec comes to the rescue:
(define (fib n)
  (letrec ((fib (lambda (n a b)
                  (if (zero? n)
                      a
                      (fib (- n 1) b (+ a b))))))
    (fib n 0 1)))
That letrec is just syntax that does something like this:
(define (fib n)
  (let ((fib 'undefined))
    (let ((tmp (lambda (n a b)
                 (if (zero? n)
                     a
                     (fib (- n 1) b (+ a b))))))
      (set! fib tmp))
    (fib n 0 1)))
So here you clearly see fib exists when the lambda gets evaluated, and the binding is later set to the closure itself. The binding is the same, only its pointer has changed. It's circular reference 101.
So what happens when you make a global function? Clearly, if it is to recurse, it needs to exist before the lambda is evaluated, or the environment has to be mutated. The implementation needs to solve the same problem here too.
In a functional language implementation where mutation is not OK, you can solve this problem with a Y (or Z) combinator.
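For example, here is a small sketch (my own, not part of the original answer) of the Z combinator, the applicative-order variant of Y, giving recursion without letrec or mutation:

(define Z
  (lambda (f)
    ((lambda (x) (f (lambda (v) ((x x) v))))
     (lambda (x) (f (lambda (v) ((x x) v)))))))

(define fact
  (Z (lambda (self)
       (lambda (n)
         (if (zero? n) 1 (* n (self (- n 1))))))))

(fact 5)   ; => 120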
If you are interested in how languages are implemented, I suggest you start with Matt Might's articles.
I am able to set a default argument and do a regular recursion with it, but for some reason I cannot do it with recur for tail call optimization... I keep getting a java.lang.UnsupportedOperationException: nth not supported on this type: Long error.
For example, for a Tail Call Factorial, here is what works, but isn't optimized for tail call recursion and will fail for large recursion stacks.
(defn foo [n & [optional]]
  (if (= n 0) (or optional 1)
      (foo (dec n) (*' (or optional 1) n))))
And I call this by (foo 3)
And when I try this to get TCO, I get the unsupported operation error...
(defn foo [n & [optional]]
  (if (= n 0) (or optional 1)
      (recur (dec n) (*' (or optional 1) n))))
And I call this one the same way (foo 3)
Why is this difference causing an error? How exactly would I be able to do TCO with optional default arguments?
Thank you!
EDIT:
and when I try to take out the (or optional 1) in the recursion call and make it just optional, I get a null pointer exception... which makes sense.
This also does not get fixed when I try to remove the ' from *' in the recursion call
EDIT: I would also prefer to do this without loop as well
It is a known issue:
Recur doesn't re-enter the function, it just goes back to the top (the vararging doesn't happen again) ... recur with a collection and you will be fine.
I personally feel it should either be mentioned in the recur docstring, or at least appear in the doc. Takes a bit of digging to understand what's happening (I had to check Clojure compiler source along with the compiled classes.)
Why is this difference causing an error?
In short, it's trying to destructure a Long, which it can't do.
Straight foo call
Takes n arguments
Automatically puts everything after the first argument (n) into a seq behind the scenes, which can be destructured
recur call to foo
Takes exactly 2 arguments
First argument: n
Second argument: Something seqable with the rest of the arguments
How exactly would I be able to do TCO with optional default arguments?
Simply wrap the second argument to recur like so:
(defn foo [n & [optional]]
  (if (= n 0) (or optional 1)
      (recur (dec n) [(*' (or optional 1) n)])))
(foo 3)
;;=> 6
Recommendations
Although he didn't answer your questions, @DanielCompton's recommendation is the way to go: it completely avoids the problem in the first place, in a clearer and more efficient way.
You can give a function multiple different arities. This might be what you're after?
(defn foo
  ([n]
   (foo n 1))
  ([n optional]
   (if (= n 0)
     (or optional 1)
     (recur (dec n) (*' (or optional 1) n)))))
I don't quite understand why there is an error, but recur wouldn't normally be used in a function with optional arguments.
Edit: after reading the other answer links, I understand the problem now. recur doesn't destructure the rest args like it does when you call the function. If you recur with a collection as the second arg, it will work, but it is probably still better to be explicit with two different arities:
(defn foo [n & [optional]]
  (if (= n 0)
    (or optional 1)
    (recur (dec n) [(*' (or optional 1) n)])))
For the Racket programming language, why is lambda not considered a function?
For example, it can't be defined as a higher-order function like this:
(define (my-lambda args body)
  (lambda args body))
There's a key distinction that your question is missing:
lambda is syntax.
Procedures are values.
A lambda form is a form of expression whose value is a procedure. The question whether "lambda is a function" starts off with a type error, so to speak, because lambdas and procedures don't live in the same world.
But let's set that aside. The other way to look at this is by thinking of it in terms of evaluation rules. The default Scheme evaluation rule, for the application of a procedure to arguments, can be expressed in pseudo-code like this:
(define (eval-application expr env)
  (let ((values
         ;; Evaluate each subexpression in the same environment as the
         ;; enclosing expression, and collect the result values.
         (map (lambda (subexpr) (eval subexpr env))
              expr)))
    ;; Apply the first value (which must be a procedure) to the
    ;; other ones in the results.
    (apply (car values) (cdr values))))
In English:
Evaluate all of the subexpressions in the same environment as the "parent".
apply the first result (which must have evaluated to a procedure) to the list of the rest.
And now, another reason lambda can't be a procedure is that this evaluation rule doesn't work for lambda expressions. In particular, the point of lambda is to not evaluate its body right away! This, in particular, is what afflicts your my-lambda. If you try to use it this way:
(my-lambda (x) (+ x x))
...the (x) in the middle must be immediately evaluated as an invocation of a procedure named x in the environment where the whole expression appears. The (+ x x) must also be immediately evaluated.
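What you can do in Racket is define my-lambda as syntax rather than as a procedure, so that its sub-forms are not evaluated eagerly. A minimal sketch (my addition, not part of the original answer):

(define-syntax-rule (my-lambda args body ...)
  (lambda args body ...))

((my-lambda (x) (+ x x)) 21)   ; => 42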
So lambda requires its own evaluation rule. As Basile's answer points out, this is normally implemented as a primitive in the Scheme system implementation, but we can sketch it in pseudocode with something like this:
;;;
;;; Evaluate an expression of this form, returning a procedure:
;;;
;;; (lambda <formals> <body> ...)
;;;
(define (eval-lambda expr env)
  (let ((formals (second expr))
        (body (cddr expr)))
    ;; We don't evaluate `body` right away, we return a procedure.
    (lambda args
      ;; `formals` is never evaluated, since it's not really an
      ;; expression on its own, but rather a subpart that cannot
      ;; be severed from its enclosing `lambda`. Or if we want to
      ;; say it all fancy, the `formals` is *syncategorematic*...
      (let ((bindings (make-bindings formals args)))
        ;; When the procedure we return is called, *then* we evaluate
        ;; the `body`--but in an extended environment that binds its
        ;; formal parameters to the arguments supplied in that call.
        (eval `(begin ,@body) (extend-environment env bindings))))))
;;;
;;; "Tie" each formal parameter of the procedure to the corresponding
;;; argument values supplied in a given call. Returns the bindings
;;; as an association list.
;;;
(define (make-bindings formals args)
  (cond ((null? formals) '())
        ((symbol? formals)
         `((,formals . ,args)))
        ((pair? formals)
         `((,(car formals) . ,(car args))
           ,@(make-bindings (cdr formals) (cdr args))))))
To understand this pseudocode, the time-tested thing is to study one of the many Scheme books that demonstrate how to build a meta-circular interpreter (a Scheme interpreter written in Scheme). See for example this section of Structure and Interpretation of Computer Programs.
lambda needs to be a core language feature in Scheme (like if, let, and define are) because it constructs a closure, so it needs to manage the set of closed-over (free) variables (and somehow put their bindings in the closure).
For example:
(define (translate d) (lambda (x) (+ d x)))
When you invoke or evaluate (translate 3), d is 3, so the dynamically constructed closure should remember that d is bound to 3. BTW, you generally want the results of (translate 3) and of (translate 7) to be two different closures sharing some common code (but having different bindings for d).
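For instance (a small usage sketch of the above, not in the original answer):

(define add3 (translate 3))
(define add7 (translate 7))
(add3 10)   ; => 13
(add7 10)   ; => 17, same code, different remembered binding for d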
Read also about λ-calculus.
Explaining all that in detail requires an entire book. Fortunately, C. Queinnec has written it, so read his book Lisp In Small Pieces.
(If you read French, you could read the latest French version of that book)
See also the Kernel programming language.
Read also wikipage about evaluation strategy.
PS. You could define lambda as a macro, and some Lisp implementations (notably MELT and probably SBCL) do that; e.g. it would expand into building some closure in an implementation-specific way (but lambda cannot be defined as a function).
A function call (e0 e1 e2) is evaluated like this
e0 is evaluated, the result is (hopefully) a function f
e1 is evaluated, the result is a value v1
e2 is evaluated, the result is a value v2
The function body of f is evaluated in an environment in which
the formal parameters are bound to the values v1 and v2.
Note that all expressions e0, e1, and, e2 are evaluated before the body of the function is activated.
This means that a function call like (foo #t 2 (/ 3 0)) will result in an error when (/ 3 0) is evaluated - before control is handed over to the body of foo.
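A quick sketch of that point (my own example; foo here never even looks at its third argument):

(define (foo a b c) (if a b 'other))
(foo #t 2 (/ 3 0))   ; error: division by zero, before foo's body ever runs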
Now consider the special form lambda. In (lambda (x) (+ x 1)) this creates a function of one variable x which when called with a value v will compute (+ v 1).
If, in contrast, lambda were a function, then the expressions (x) and (+ x 1) would be evaluated before the body of lambda is activated. And (x) will most likely produce an error, since (x) means call the function x with no arguments.
In short: Function calls will always evaluate all arguments, before the control is passed to the function body. If some expressions are not to be evaluated a special form is needed.
Here lambda is a form that doesn't evaluate all its subforms, so lambda needs to be a special form.
In Scheme lingo we use the term procedure instead of function throughout the standard report. Thus since this is about scheme dialects I'll use the term procedure.
In eager languages like standard #!racket and #!r6rs, procedures get their arguments evaluated before the body is evaluated in the new lexical environment. Thus, since if and lambda have evaluation rules different from those of procedures, special forms and macros are the way to introduce such new syntax.
In a lazy language like #!lazy racket, evaluation is by need, and thus many forms that are implemented as macros/special forms in an eager language can be implemented as procedures. E.g. you can make if as a procedure using cond, but you cannot make cond using if, because the clause terms themselves would be evaluated as forms on access, and e.g. (cond (#t 'true-value)) would fail since #t is not a procedure. lambda has a similar issue with the argument list.
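A rough sketch of that claim (my own example, assuming #lang lazy behaves as described above):

#lang lazy
(define (my-if c t e)
  (cond (c t)
        (else e)))

(my-if #t 'yes (error "never forced"))   ; => 'yes, the error argument is never forced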
Here is some sample code from the text of On Lisp.
My question is that why it bothers to use a lambda function,
`(funcall (alrec ,rec #'(lambda () ,base)) ,@lsts))
as second argument to alrec in the definition of on-cdrs?
What is the difference if I just define it without using lambda?
`(funcall (alrec ,rec ,base) ,@lsts))
(defun lrec (rec &optional base)
  (labels ((self (lst)
             (if (null lst)
                 (if (functionp base)
                     (funcall base)
                     base)
                 (funcall rec (car lst)
                          #'(lambda ()
                              (self (cdr lst)))))))
    #'self))
(defmacro alrec (rec &optional base)
  "cltl2 version"
  (let ((gfn (gensym)))
    `(lrec #'(lambda (it ,gfn)
               (symbol-macrolet ((rec (funcall ,gfn)))
                 ,rec))
           ,base)))
(defmacro on-cdrs (rec base &rest lsts)
  `(funcall (alrec ,rec #'(lambda () ,base)) ,@lsts))
You don't say how this is intended to be called and this code is a bit of a tangle so at a quick glance I couldn't say how it's supposed to work. However, I can answer your question.
First, let me say that
(if (functionp base) (funcall base) base)
is terrible programming style. This effectively puts a hole in your semantic space, creating a completely different handling of functions as objects than of other things as objects. In Common Lisp, a function is supposed to be an object you can choose to pass around. If you want to call it, you should do so, but you shouldn't just say to someone "if I give you a function you should call it and otherwise you should not." (Why this matters will be seen as you read on.)
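To see the issue concretely, here is a small sketch of my own (demo-base is a hypothetical helper, not from the book):

(defun demo-base (base)
  (if (functionp base) (funcall base) base))

(demo-base 3)                  ; => 3
(demo-base #'(lambda () 3))    ; => 3, the function is silently called
(demo-base #'(lambda (x) x))   ; => error: called with the wrong number of arguments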
Second, as Barmar notes, if you write ,base you are basically saying "take the code and insert it for evaluation here". If you write
#'(lambda () ,base)
you are saying put the code inside a function so that its execution is delayed. Now, you're passing it to a function that, when it receives the function, is going to call it. And, moreover, calling it will evaluate it in the lexical environment of the caller, and there is no intervening change in dynamic state. So you'd think this would be the same thing as just evaluating it at the call site (other than a little more overhead). However, there is a case where it's different.
If the thing you put in the base argument position is a variable (let's say X) or a number (let's say 3), then you'll either be doing (lrec ... X) or (lrec ... 3), or else you'll be doing
(lrec ... #'(lambda () X))
or
(lrec ... #'(lambda () 3))
So far so good. If it gets to the caller, it's going to say "Oh, you just meant the value of X (or of 3)." But there's more...
If you say instead an expression that yields a function in the base argument position of your call to on-cdrs or your call to alrec, you're going to get different results depending on whether you wrote ,base or #'(lambda () ,base). For example, you might have put
#'f
or
#'(lambda () 3)
or, even worse,
#'(lambda (x) x)
in the base argument position. In that case, if you had used ,base, then that expression would be immediately evaluated before being passed to lrec, and lrec would receive a function. That function would then be called a second time (which is probably not what the macro user expects, unless the documentation is very clear about this inelegance and the user has cared enough to read it in detail). In the first case, it will call f with no arguments and return whatever that yields; in the second case, it will return 3; and in the third case an error situation will occur because the function will be called with the wrong number of arguments.
If instead you implemented it with
#'(lambda () ,base)
then lrec will receive as an argument the result of evaluating one of
#'(lambda () #'f)
or
#'(lambda () #'(lambda () 3))
or
#'(lambda () #'(lambda (x) x))
depending on what you gave it as an argument from our examples above. But in any case what lrec gets is a function of no arguments that, when called, will return the result of evaluating its body, that is, will return a function.
The important takeaways are these:
The comma is dropping in a piece of evaluable code, and wrapping the comma'd expression with a lambda (or wrapping any expression with a lambda) delays evaluation.
The conditional in the lrec definition should either expect that the value is already evaluated or not, and should not take a conditional effect because it can't know whether you already evaluated something based purely on type unless it basically makes a mess of functions as first-class data.
I hope that helps you see the difference. It's subtle, but it's real.
So using
#'(lambda () ,base)
protects the macro from double-evaluation of a base that might yield a function, but on the other hand the bad style is something that shouldn't (in my view) happen. My recommendation is to remove the conditional function call to base and make it either always or never call the base as a function. If you make it never call the function, the caller should definitely use ,base. If you make it always call the function, the caller should definitely include the lambda wrapper. That would make the number of evaluations deterministic.
Also, as a purely practical matter, I think it's more in the style of Common Lisp just to use ,base and not bother with the closure unless the expression is going to do something more than travel across a function call boundary to be immediately called. It's a waste of time and effort and perhaps extra consing to have the function where it's really not serving any interesting purpose. This is especially true if the only purpose of the lrec function is to support this facility. If lrec has an independent reason to have the contract that it does, that's another matter and maybe you'd write your macro to accommodate.
It's more common in a functional language like Scheme, which has a different aesthetic, to have a regular function as an alternative to any macro, and to have that function take such a zero-argument function as an argument just in case some user doesn't like working with macros. But mostly Common Lisp programmers don't bother, and your question was about Common Lisp, so I've biased the majority of my writing here to that dialect.
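For concreteness, a typical call to on-cdrs with the definitions from the question would look like this (my own sketch; it and rec are the anaphors the macro captures):

(on-cdrs (+ it rec) 0 '(1 2 3 4))   ; => 10, i.e. a right fold of + over the list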
This recursive definition of a macro does what it should (sum integers from 1 to n):
(defmacro sum-int-seq (n)
  `(cond
     ((equal 0 ,n) 0)
     (t (+ ,n (sum-int-seq (- ,n 1))))))
For example (sum-int-seq 5) gives 15.
But why does it work? When the macro gets expanded I get this:
(macroexpand '(sum-int-seq 5))
(IF (EQUAL 0 5) 0 (+ 5 (SUM-INT-SEQ (- 5 1))))
But because sum-int-seq is a macro, the macro expansion should become an infinite loop. Does the compiler create a recursive function instead? If this definition creates a recursive function, is there any way to define macros recursively?
(This is a silly example for the sake of brevity, a function would of course work better for this)
Your example does not work.
It may work in an interpreter. But with a compiler you'll see an endless loop during compilation.
CL-USER 23 > (defun test (foo)
               (sum-int-seq 5))
TEST
Let's use the LispWorks interpreter:
CL-USER 24 > (test :foo)
15
Let's try to compile the function:
CL-USER 25 > (compile 'test)
Stack overflow (stack size 15997).
1 (continue) Extend stack by 50%.
2 Extend stack by 300%.
3 (abort) Return to level 0.
4 Return to top loop level 0.
Type :b for backtrace or :c <option number> to proceed.
Type :bug-form "<subject>" for a bug report template or :? for other options.
So, now the next question: why does it work in the interpreter, but the compiler can't compile it?
Okay, I'll explain it.
Let's look at the interpreter first.
it sees (sum-int-seq 5).
it macroexpands it to (COND ((EQUAL 0 5) 0) (T (+ 5 (SUM-INT-SEQ (- 5 1))))).
it then evaluates above form. It determines that it needs to compute (+ 5 (SUM-INT-SEQ (- 5 1))). For that it needs to macroexpand (SUM-INT-SEQ (- 5 1)).
eventually it will expand into something like (cond ((EQUAL 0 (- (- (- (- (- 5 1) 1) 1) 1) 1)) 0) .... Which then will return 0 and the computation can use this result and add the other terms to it.
The interpreter takes the code, evaluates what it can and macroexpands if necessary. The generated code is then evaluated or macroexpanded. And so on.
Now let's look at the compiler.
it sees (sum-int-seq 5) and macroexpands it into (COND ((EQUAL 0 5) 0) (T (+ 5 (SUM-INT-SEQ (- 5 1))))).
now the macroexpansion will be done on the subforms, eventually.
the compiler will macroexpand (SUM-INT-SEQ (- 5 1)). Note that the code never gets evaluated, only expanded.
the compiler will macroexpand (SUM-INT-SEQ (- (- 5 1) 1)), and so forth. Finally you'll see a stack overflow.
The compiler walks (recursively compiles / expands) the code. It may not execute the code (unless it does optimizations or a macro actually evaluates it explicitly).
For a recursive macro you'll need to actually count down. If you eval inside the macro, then something like (sum-int-seq 5) can be made to work. But for (defun foo (n) (sum-int-seq n)) this is hopeless, since the compiler does not know what the value of n is.
One other thing to add: in your example, the occurrence of sum-int-seq inside the macro is inside a quoted expression, so it doesn't get expanded when the macro is evaluated. It's just data until the macro is called. And since it is nested inside a cond, at run-time the inner macro only gets called when the condition is true, same as in a regular function.
Expanding a macro generates Lisp code that is then evaluated. Calling a function diverts the execution flow to a copy of pre-existing Lisp code which is then run. Other than that, the two are pretty similar, and recursion works in the same way. In particular, macro expansion stops for the same reason that a properly written recursive function stops: because there is a termination condition, and the transformation between one call and the next has been written so that this condition is actually reached. If it weren't reached, the macro expansion would enter a loop, just like an improperly written recursive function.
To the answer of Kilan I'd add that macroexpand doesn't produce a full expansion of all macros in your form until there's no macro left :) If you look at the HyperSpec, you'll see that it re-expands the whole form only until its head is no longer a macro (in your case it stops at if). During compilation, however, all the macros are expanded, as if macroexpand were applied to each element of the source tree, not only to its root.
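For illustration (my own sketch; the exact printed output is implementation-dependent):

(macroexpand-1 '(sum-int-seq 5))
;; => (COND ((EQUAL 0 5) 0) (T (+ 5 (SUM-INT-SEQ (- 5 1)))))
;; macroexpand keeps re-expanding only the top form and stops once its head
;; is no longer a macro; the inner (SUM-INT-SEQ (- 5 1)) is left untouched.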
Here's an implementation that does work:
(defmacro sum-int-seq (n)
  (cond
    ((equal 0 n) `0)
    (t `(+ ,n (sum-int-seq ,(- n 1))))))
It is possible to write a recursive macro, but (as was mentioned), the expansion must be able to hit the base case at compile time. So the values of all arguments passed to the macro must be known at compile time.
(sum-int-seq 5)
Works, but
(sum-int-seq n)
Does not.
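To see why, here is a rough sketch of how the working version expands, since each n it sees is a literal:

(macroexpand-1 '(sum-int-seq 3))
;; => (+ 3 (SUM-INT-SEQ 2))
;; The compiler expands the inner forms in turn, bottoming out at the literal
;; base case; the net effect is roughly (+ 3 (+ 2 (+ 1 0))), i.e. 6.
;; With (sum-int-seq n) the macro receives the *symbol* N, so the
;; expansion-time (- n 1) has nothing numeric to work with and fails.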