I have this function that finds the even numbers in a list and returns a new list with only those numbers:
def even([]), do: []
def even([head | tail]) when rem(head, 2) == 0 do
  [head | even(tail)]
end
def even([_head | tail]) do
  even(tail)
end
Is this already tail-call optimized? Or does every clause have to end in a call to itself (the second clause of even doesn't)? If not, how can it be refactored to be tail-recursive?
I know this can be done with filter or reduce, but I wanted to try without them.
You're right that this function is not tail-recursive, because in the second clause the last operation is the list prepend, not the recursive call. To make it tail-recursive, you'll have to thread an accumulator through the calls. Since the accumulation happens in reverse, the base-case clause has to reverse the accumulator before returning it.
def even(list), do: even(list, [])
def even([], acc), do: :lists.reverse(acc)
def even([head | tail], acc) when rem(head, 2) == 0 do
  even(tail, [head | acc])
end
def even([_head | tail], acc) do
  even(tail, acc)
end
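For comparison, the same accumulate-then-reverse shape can be sketched in Python (purely illustrative: CPython does not perform tail-call optimization, so this only mirrors the structure of the Elixir clauses):

```python
def even(lst, acc=None):
    # Accumulate matching elements in reverse, then reverse once at the end,
    # mirroring the three Elixir clauses above.
    if acc is None:
        acc = []
    if not lst:                       # even([], acc) -> reverse accumulator
        return list(reversed(acc))
    head, *tail = lst
    if head % 2 == 0:                 # guard clause: rem(head, 2) == 0
        return even(tail, [head] + acc)
    return even(tail, acc)            # skip odd head
```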
But in Erlang, your "body-recursive" code is automatically optimized and may not be slower than a tail-recursive solution which does a :lists.reverse call at the end. The Erlang documentation recommends writing whichever of the two results in cleaner code in such cases.
According to the myth, using a tail-recursive function that builds a list in reverse followed by a call to lists:reverse/1 is faster than a body-recursive function that builds the list in correct order; the reason being that body-recursive functions use more memory than tail-recursive functions.
That was true to some extent before R12B. It was even more true before R7B. Today, not so much. A body-recursive function generally uses the same amount of memory as a tail-recursive function. It is generally not possible to predict whether the tail-recursive or the body-recursive version will be faster. Therefore, use the version that makes your code cleaner (hint: it is usually the body-recursive version).
The quote above is the myth "Tail-Recursive Functions are Much Faster Than Recursive Functions" from the Erlang Efficiency Guide. For a more thorough discussion of tail and body recursion, see the blog post "Erlang's Tail Recursion is Not a Silver Bullet".
Related
I'm reading Learn Functional Programming with Elixir now. In chapter 4 the author discusses tail-call optimization and says that a tail-recursive function will use less memory than a body-recursive one. But when I tried the examples from the book, the result was the opposite.
# tail-recursive
defmodule TRFactorial do
  def of(n), do: factorial_of(n, 1)
  defp factorial_of(0, acc), do: acc
  defp factorial_of(n, acc) when n > 0, do: factorial_of(n - 1, n * acc)
end
TRFactorial.of(200_000)

# body-recursive
defmodule Factorial do
  def of(0), do: 1
  def of(n) when n > 0, do: n * of(n - 1)
end
Factorial.of(200_000)
On my computer, the beam.smp of the tail-recursive version uses 2.5-3 GB of memory, while the body-recursive one only uses around 1 GB. Am I misunderstanding something?
TL;DR: the Erlang virtual machine appears to apply TCO to both.
Nowadays compilers and virtual machines are smart enough that predicting their behavior is hard. The advantage of tail recursion is not lower memory consumption, but, as the documentation puts it:
This is to ensure that no system resources, for example, call stack, are consumed.
When the call is not tail-recursive, the stack must be preserved across calls. Consider the following example.
defmodule NTC do
  def inf do
    inf()
    IO.puts(".")
    DateTime.utc_now()
  end
end
Here we need to preserve the stack so that execution of the caller can continue when the recursion returns. It never returns, because this recursion is infinite. The compiler is unable to optimize it, and here is what we get:
NTC.inf
[1] 351729 killed iex
Please note that no output happened, which means the function kept calling itself recursively until the stack blew up. With TCO, infinite recursion is possible (and it is widely used in message handling).
Turning back to your example: as we saw, TCO happened in both cases (otherwise we'd have ended up with a stack overflow). The former keeps an accumulator in a dedicated variable, while the latter uses only the return value on the stack. That is the difference you see: Elixir data is immutable, and the accumulator's content (which is huge for the factorial of 200,000) gets copied and kept in memory for each call.
Sidenote:
You can disassemble both modules with :erts_debug.df(Factorial) (which produces an Elixir.Factorial.dis file in the same directory) and see that the calls were implicitly TCO'ed.
Quoting from Wikipedia:
...a tail call is a subroutine call performed as the final action of a procedure. If a tail call might lead to the same subroutine being called again later in the call chain, the subroutine is said to be tail-recursive, which is a special case of recursion.
Now I've the following routine written in C
int foo(int x) {
    if (x > 100)
        return x - 10;
    else
        return foo(foo(x + 11));
}
Based on the definition above, it seems to me that foo should be a tail-recursive function, since its recursive call is the final action of the procedure. But somewhere I once read that this is not a tail-recursive function.
Hence the question:
why isn't this function tail-recursive?
This function is typically not considered tail-recursive, because it involves two recursive calls to foo: the outer call is in tail position, but the inner call foo(x + 11) is not, since its result still has to be fed to the outer call.
Tail recursion is particularly interesting because a tail call can be trivially rewritten (for example by a compiler optimization) into a loop. This tail-call-optimization technique cannot completely eliminate recursion in your example, and thus one would not consider the function tail-recursive, even though its last statement is a recursive call.
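To see which call blocks the optimization, here is an illustrative sketch in Python: the outer tail call becomes a loop, but the inner call must stay a genuine recursive call because its result is still needed.

```python
def foo(x):
    # Outer tail call eliminated: re-bind x and loop instead of calling.
    # The inner call foo(x + 11) is NOT in tail position -- its result
    # feeds the next iteration -- so it remains truly recursive.
    while x <= 100:
        x = foo(x + 11)
    return x - 10
```

(This is McCarthy's 91 function, which returns 91 for every input not exceeding 100.)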
I am making my own Lisp-like interpreted language, and I want to do tail call optimization. I want to free my interpreter from the C stack so I can manage my own jumps from function to function and my own stack magic to achieve TCO. (I really don't mean stackless per se, just the fact that calls don't add frames to the C stack. I would like to use a stack of my own that does not grow with tail calls). Like Stackless Python, and unlike Ruby or... standard Python I guess.
But, as my language is a Lisp derivative, all evaluation of s-expressions is currently done recursively (because it's the most obvious way I thought of to do this nonlinear, highly hierarchical process). I have an eval function, which calls a Lambda::apply function every time it encounters a function call. The apply function then calls eval to execute the body of the function, and so on. Mutual stack-hungry non-tail C recursion. The only iterative part I currently use is to eval a body of sequential s-expressions.
(defun f (x y)
(a x y)) ; tail call! goto instead of call.
; (do not grow the stack, keep return addr)
(defun a (x y)
(+ x y))
; ...
(print (f 1 2)) ; how does the return work here? how does it know it's supposed to
; return the value here to be used by print, and how does it know
; how to continue execution here??
So, how do I avoid using C recursion? Or can I use some kind of goto that jumps across c functions? longjmp, perhaps? I really don't know. Please bear with me, I am mostly self- (Internet- ) taught in programming.
One solution is what is sometimes called "trampolined style". The trampoline is a top-level loop that dispatches to small functions that do some small step of computation before returning.
I've sat here for nearly half an hour trying to contrive a good, short example. Unfortunately, I have to do the unhelpful thing and send you to a link:
http://en.wikisource.org/wiki/Scheme:_An_Interpreter_for_Extended_Lambda_Calculus/Section_5
The paper is called "Scheme: An Interpreter for Extended Lambda Calculus", and section 5 implements a working Scheme interpreter in an outdated dialect of Lisp. The secret is in how they use the **CLINK** instead of a stack. The other globals are used to pass data around between the implementation functions, like the registers of a CPU. I would ignore **QUEUE**, **TICK**, and **PROCESS**, since those deal with threading and fake interrupts. **EVLIS** and **UNEVLIS** are used specifically to evaluate function arguments. Unevaluated args are stored in **UNEVLIS** until they are evaluated and moved into **EVLIS**.
Functions to pay attention to, with some small notes:
MLOOP: MLOOP is the main loop of the interpreter, or "trampoline". Ignoring **TICK**, its only job is to call whatever function is in **PC**. Over and over and over.
SAVEUP: SAVEUP conses all the registers onto the **CLINK**, which is basically the same as when C saves the registers to the stack before a function call. The **CLINK** is actually a "continuation" for the interpreter. (A continuation is just the state of a computation. A saved stack frame is technically a continuation, too. Hence, some Lisps save the stack to the heap to implement call/cc.)
RESTORE: RESTORE restores the "registers" as they were saved in the **CLINK**. It's similar to restoring a stack frame in a stack-based language. So, it's basically "return", except some function has explicitly stuck the return value into **VALUE**. (**VALUE** is obviously not clobbered by RESTORE.) Also note that RESTORE doesn't always have to return to a calling function. Some functions will actually SAVEUP a whole new computation, which RESTORE will happily "restore".
AEVAL: AEVAL is the EVAL function.
EVLIS: EVLIS exists to evaluate a function's arguments, and apply a function to those args. To avoid recursion, it SAVEUPs EVLIS-1. EVLIS-1 would just be regular old code after the function application if the code was written recursively. However, to avoid recursion, and the stack, it is a separate "continuation".
I hope I've been of some help. I just wish my answer (and link) was shorter.
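The core trick can be sketched far more briefly in Python (illustrative only, not the paper's CLINK machinery): instead of calling the next function directly, each step returns a zero-argument thunk, and a top-level loop, the trampoline, keeps invoking thunks until a non-callable final value shows up.

```python
def trampoline(step):
    # The "MLOOP": call whatever step comes next, over and over,
    # until a step returns a plain (non-callable) value.
    while callable(step):
        step = step()
    return step

def is_even(n):
    # Return a thunk instead of calling is_odd directly,
    # so the host-language stack never grows.
    return True if n == 0 else (lambda: is_odd(n - 1))

def is_odd(n):
    return False if n == 0 else (lambda: is_even(n - 1))

trampoline(lambda: is_even(1_000_000))  # finishes without stack overflow
```

Using callable as the "keep going" sentinel is a simplification; a real interpreter would use a distinct thunk type so that callable values can also be returned as results.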
What you're looking for is called continuation-passing style. This style adds an additional item to each function call (you could think of it as a parameter, if you like), that designates the next bit of code to run (the continuation k can be thought of as a function that takes a single parameter). For example you can rewrite your example in CPS like this:
(defun f (x y k)
(a x y k))
(defun a (x y k)
(+ x y k))
(f 1 2 print)
The implementation of + will compute the sum of x and y, then pass the result to k sort of like (k sum).
Your main interpreter loop then doesn't need to be recursive at all. It will, in a loop, apply each function application one after another, passing the continuation around.
It takes a little bit of work to wrap your head around this. I recommend some reading materials such as the excellent SICP.
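Here is the same f/a example sketched in Python (illustrative; the names are just the ones from the snippet above): each function takes the continuation k as an extra argument and hands its result to k instead of returning it up the chain.

```python
def add(x, y, k):
    # The "+" primitive computes the sum, then passes it to the continuation.
    return k(x + y)

def a(x, y, k):
    return add(x, y, k)   # tail call: just forward the same continuation

def f(x, y, k):
    return a(x, y, k)

f(1, 2, print)            # the continuation here is print, which receives 3
```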
Tail recursion can be thought of as reusing, for the callee, the same stack frame that you are currently using for the caller. So you just reset the arguments and jump (goto) back to the beginning of the function.
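Sketched in Python, that "reuse the frame" view of a tail call is literally a loop over re-bound arguments:

```python
def factorial(n, acc=1):
    # Each "tail call" factorial(n - 1, n * acc) becomes: re-bind the
    # arguments, jump back to the top. The frame is reused; the stack
    # never grows.
    while n > 1:
        n, acc = n - 1, n * acc
    return acc
```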
Let's say I have this code here:
do_recv_loop(State) ->
    receive
        {do, Stuff} ->
            case Stuff of
                one_thing ->
                    do_one_thing(),
                    do_recv_loop(State);
                another_thing ->
                    do_another_thing(),
                    do_recv_loop(State);
                _ ->
                    im_dead_now
            end;
        {die} -> im_dead_now;
        _ -> do_recv_loop(State)
    end.
Now, in theory this is tail-recursive, as none of the three calls to do_recv_loop require anything to be returned afterwards. But will Erlang recognize that this is tail-recursive and optimize accordingly? I'm worried that the nested structure might prevent it from recognizing it.
Yes, it will. Erlang is required to optimize tail calls, and this is clearly a tail call since nothing happens after the function is called.
I used to wish there were a tailcall keyword in Erlang so the compiler could warn me about invalid uses, but then I got used to it.
Yes, it is tail recursive. The main gotcha to be aware of is if you are wrapped inside exceptions. In that case, sometimes the exception needs to live on the stack and that will make something that looks tail-recursive into something deceptively not so.
The tail-call optimization is applicable if the call is in tail-position. Tail position is the "last thing before the function will return". Note that in
fact(0) -> 1;
fact(N) -> N * fact(N-1).
the recursive call to fact is not in tail position because after fact(N-1) is calculated, you need to run the continuation N * _ (i.e., multiply by N).
I think this is relevant because you are asking how you can know whether your recursive function is optimized by the compiler. Since you aren't using lists:reverse/1, the quote below might not apply, but for someone else with the same question and a different code example it could be very relevant.
From the The Eight Myths of Erlang Performance in the Erlang Efficiency Guide
In R12B and later releases, there is an optimization that will in many cases reduce the number of words used on the stack in body-recursive calls, so that a body-recursive list function and a tail-recursive function that calls lists:reverse/1 at the end will use exactly the same amount of memory.
http://www.erlang.org/doc/efficiency_guide/myths.html#id58884
I think the take away message is that you may have to measure in some cases to see what will be best.
I'm pretty new to Erlang but from what I've gathered, the rule seems to be that in order to be tail-recursive, the function has to do one of two things in any given logical branch:
not make a recursive call
return the value of the recursive call and do nothing else after it
That recursive call can be nested inside as many if, case, or receive expressions as you want, as long as nothing actually happens after it.
I enjoy using recursion whenever I can; it seems like a much more natural way to loop over something than actual loops. I was wondering if there is any limit to recursion in Lisp, like there is in Python, where it gives up after about 1000 levels?
Could you use it for, say, a game loop?
Testing it out now with a simple counting recursive function. Now at >7,000,000!
Thanks a lot
First, you should understand what a tail call is.
A tail call is a call that does not consume stack.
Now you need to recognize when you are consuming stack.
Let's take the factorial example:
(defun factorial (n)
  (if (= n 1)
      1
      (* n (factorial (- n 1)))))
This is the non-tail-recursive implementation of factorial.
Why? Because in addition to the return from factorial, there is a pending computation:
(* n ...)
So you are stacking n each time you call factorial.
Now let's write the tail recursive factorial:
(defun factorial-opt (n &key (result 1))
  (if (= n 1)
      result
      (factorial-opt (- n 1) :result (* result n))))
Here, the result is passed as an argument to the function.
So stack is still consumed, but the difference is that the stack size stays constant.
Thus, the compiler can optimize it, using only registers and leaving the stack empty.
factorial-opt is then faster, but less readable.
factorial is limited by the size of the stack, while factorial-opt is not.
So you should learn to recognize tail recursive function in order to know if the recursion is limited.
There might be some compiler technique to transform a non-tail recursive function into a tail recursive one. Maybe someone could point out some link here.
Scheme mandates tail call optimization, and some CL implementations offer it as well. However, CL does not mandate it.
Note that for tail-call optimization to apply, the recursive call must be the last thing the function does. E.g. a naive implementation of Fibonacci, where the result of one recursive call still has to be added to another, will not be tail-call optimized, and as a result you will run out of stack space.
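To make the Fibonacci point concrete, here is an illustrative sketch in Python: in the naive version the pending + after the first recursive call rules out tail-call optimization, while the accumulator version keeps every recursive call in tail position.

```python
def fib_naive(n):
    if n < 2:
        return n
    # NOT a tail call: after fib_naive(n - 1) returns, we still have to add.
    return fib_naive(n - 1) + fib_naive(n - 2)

def fib_tail(n, a=0, b=1):
    if n == 0:
        return a
    # Tail call: nothing is pending after it, so the frame could be reused
    # by an implementation that performs TCO.
    return fib_tail(n - 1, b, a + b)
```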