Related
Let's say we had two versions of a recursive function, one of them tail-recursive. Is there any benefit to using the tail-recursive version if the language being used does not have tail-call optimization? From my understanding, without the optimization, both versions (tail and non-tail) would use the same number of stack frames in most cases.
I know that in some cases, like the Fibonacci function, using tail recursion can be more efficient even without tail-call optimization since it avoids double calls. But what if neither version of the function makes double calls? Would the tail-recursive function still be more efficient?
The answer to that question is hardware- and implementation-dependent. In most cases, however, I'd have to think that a simple GOTO is faster than a CALL/RETURN instruction pair.
I've recently been learning about functional languages and how many of them don't include for loops. While I don't personally view recursion as more difficult than a for loop (and often easier to reason about), I realized that many examples of recursion aren't tail recursive and therefore cannot use simple tail-call optimization to avoid stack overflows. According to this question, all iterative loops can be translated into recursion, and those iterative loops can be transformed into tail recursion, so it confuses me when the answers on a question like this suggest that you have to explicitly manage the translation of your recursion into tail recursion yourself if you want to avoid stack overflows. It seems like it should be possible for a compiler to do all the translation, either from recursion to tail recursion, or from recursion straight to an iterative loop, without stack overflows.
Are functional compilers able to avoid stack overflows in more general recursive cases? Are you really forced to transform your recursive code yourself in order to avoid stack overflows? If they aren't able to perform general recursive stack-safe compilation, why aren't they?
Any recursive function can be converted into a tail recursive one.
For instance, consider the transition function of a Turing machine, that is, the mapping from one configuration to the next. To simulate the Turing machine you just need to iterate the transition function until you reach a final state, which is easily expressed in tail-recursive form. Similarly, a compiler typically translates a recursive program into an iterative one simply by adding a stack of activation records.
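As a rough illustration of that last point (not from the original answer, and with a made-up function name), here is a minimal Python sketch that runs the recursive factorial with an explicit stack of activation records and a plain loop:
def fact_with_explicit_stack(n):
    # Each "activation record" here is just the pending multiplier.
    stack = []
    while n > 1:          # the "calling" phase: push one record per recursive call
        stack.append(n)
        n -= 1
    result = 1
    while stack:          # the "returning" phase: pop records and do the deferred work
        result *= stack.pop()
    return result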
You can also give a translation into tail-recursive form using continuation passing style (CPS). As a classic example, consider the Fibonacci function.
This can be expressed in CPS style in the following way, where the second
parameter is the continuation (essentially, a callback function):
def fibc(n, cont):
    if n <= 1:
        return cont(n)
    return fibc(n - 1, lambda a: fibc(n - 2, lambda b: cont(a + b)))
Again, you are simulating the recursion stack using a dynamic data structure:
in this case, lambda abstractions.
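To get an ordinary value back out, the top-level call just passes the identity function as the continuation (note that Python itself does not perform tail-call optimization, so this only demonstrates the result):
print(fibc(10, lambda x: x))  # prints 55, the 10th Fibonacci number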
The use of dynamic structures (lists, stacks, functions, etc.) in all the previous examples is essential. That is to say, in order to simulate a generic recursive function iteratively, you cannot avoid dynamic memory allocation, and hence you cannot avoid stack overflow in general.
So, memory consumption is not only related to the iterative/recursive nature of the program. On the other hand, if you prevent dynamic memory allocation, your programs are essentially finite state machines, with limited computational capabilities (more interesting would be to parameterize memory according to the size of the inputs).
In general, in the same way that you cannot predict termination, you cannot predict unbounded memory consumption of your program: working with a Turing-complete language, you cannot, at compile time, rule out divergence, and you cannot rule out stack overflow.
Tail Call Optimization:
The natural way to handle arguments and calls is to sort out the cleanup when exiting, that is, when returning.
For tail calls to work you need to alter this so that the tail call inherits the current frame. Instead of making a new frame, the call massages the existing frame so that the next call returns to the current function's caller instead of to this function, which, being a tail call, would really only clean up and return anyway.
Thus TCO is all about cleaning up before the last call.
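As a rough sketch of the effect (in Python, which does not itself perform TCO, and with a made-up example): a tail call amounts to rebinding the arguments and jumping back to the top of the function, which is exactly what a loop does.
def countdown(n):
    # What TCO effectively turns a tail-recursive countdown into:
    # the call becomes a rebind of the argument plus a jump, reusing the frame.
    while True:
        if n <= 0:
            return "done"
        n = n - 1  # the tail call countdown(n - 1) becomes rebind + jump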
Continuation Passing Style: make tail calls out of everything
A compiler can change the code so that it only performs primitive operations and passes their results to continuations. The stack usage thus moves onto the heap, since the computation to be continued is made into a function (a closure).
An example is:
function hypotenuse(k1, k2) {
    return sqrt(add(square(k1), square(k2)))
}
becomes
function hypotenuse(k, k1, k2) {
    // k is the continuation: it receives the final result instead of it being returned
    (function (sk1) {
        (function (sk2) {
            (function (ar) {
                k(sqrt(ar));
            }(add(sk1, sk2)));
        }(square(k2)));
    }(square(k1)));
}
Notice that every function now makes exactly one call, and the order of evaluation is fixed.
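To use the transformed version, the caller supplies the continuation, for example a function that simply prints or stores its argument; the result arrives through k rather than through a chain of returns.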
According to this question, all iterative loops can be translated into recursion
"Translated" might be a bit of a stretch. The proof that for every iterative loop there is an equivalent recursive program is trivial if you understand Turing completeness: since a Turing machine can be implemented using strictly iterative structures and strictly recursive structures, every program that can be expressed in an iterative language can be expressed in a recursive language, and vice-versa. This means that for every iterative loop there is an equivalent recursive construct (and the other way around). However, that doesn't mean we have some automated way of transforming one into the other.
and those iterative loops can be transformed into tail recursion
Tail recursion can perhaps be easily transformed into an iterative loop, and the other way around. But not all recursion is tail recursion. Here's an example. Suppose we have some binary tree. It consists of nodes. Each node can have a left and a right child and a value. If a node has no children, then isLeaf returns true for it. We'll assume there's some function max that returns the maximum of two values, and if one of the values is null it returns the other one. Now we want to define a function that finds the maximum value among all the leaf nodes. Here it is in some pseudo-code I cooked up.
findmax(node) {
    if (node == null) {
        return null
    }
    if (node.isLeaf) {
        return node.value
    } else {
        return max(findmax(node.left), findmax(node.right))
    }
}
There are two recursive calls in findmax, so we can't optimize for tail recursion. We need the results of both before we can supply them to the max function and determine the result of the call for the current node.
Now, there may be a way of getting the same result, using recursion and only a single tail-recursive call. It is functionally equivalent, but it is a different algorithm. Compilers can do a lot of transformations to create a functionally equivalent program with lots of optimizations, but they're not quite clever enough to create functionally equivalent algorithms.
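For what it's worth, here is a hedged Python sketch of what such a hand-written, functionally equivalent alternative could look like, replacing the call stack with an explicit stack (the names findmax_iterative, is_leaf, left, right, and value are hypothetical stand-ins for the pseudo-code above):
def findmax_iterative(root):
    best = None
    stack = [root]
    while stack:                  # the loop plays the role of the recursion
        node = stack.pop()
        if node is None:
            continue
        if node.is_leaf:
            # mimic the max-that-ignores-null described above
            best = node.value if best is None else max(best, node.value)
        else:
            stack.append(node.left)
            stack.append(node.right)
    return best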
Even the transformation of a function that only calls itself recursively once into a tail-recursive version would be far from trivial. Such an adaptation usually employs some argument passed into the recursive invocation that is used as an "accumulator" for the current results.
Look at this naive implementation for calculating the factorial of a number (e.g. fact(5) = 5*4*3*2*1):
fact(number) {
    if (number == 1) {
        return 1
    } else {
        return number * fact(number - 1)
    }
}
It's not tail-recursive. But it can be made so in this way:
fact(number, acc) {
    if (number == 1) {
        return acc
    } else {
        return fact(number - 1, number * acc)
    }
}
// Helper function
fact(number) {
    return fact(number, 1)
}
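For example, fact(5) now evaluates as fact(5, 1) -> fact(4, 5) -> fact(3, 20) -> fact(2, 60) -> fact(1, 120) -> 120, with the multiplication done on the way in rather than on the way back out.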
This requires an interpretation of what is being done. Recognizing cases like this is easy enough, but what if you call a function instead of performing a multiplication? How will the compiler know that for the initial call the accumulator must be 1 and not, say, 0? How do you translate this program?
recsub(number) {
    if (number == 1) {
        return 1
    } else {
        return number - recsub(number - 1)
    }
}
This is, as yet, outside the scope of the sort of compiler we have now, and may in fact always be.
Maybe it would be interesting to ask this on the computer science Stack Exchange to see if they know of some papers or proofs that investigate this more in-depth.
Is tail recursion better than forward recursion for performance in Erlang?
Or does the Erlang compiler optimize forward recursion too?
I mean, are there any reasons to use tail recursion instead of forward recursion?
In my opinion, forward recursion looks prettier.
Tail recursion and forward recursion are totally different concepts.
See this discussion.
It is possible to write a forward recursion that is tail recursive, and thus optimized. It is also possible to write a forward recursion that is not tail recursive: in this case, it will not be optimized, i.e. it will consume stack space.
Tail recursion is usually better because it uses less memory. You only bring what you need into the next call, which minimizes memory utilization on the stack. Also, when tail-recursive code is optimized, function returns that are not needed are thrown away, which will make it slightly faster in some cases.
For example, if a function's return value is the call to another function, there is no need to keep the intermediary function on the stack. So the code jumps back directly to the caller from the inner function.
Non-tail recursion is optimized to tail recursion in some cases by the Erlang compiler, but don't count on it. Make it a good habit to code tail recursive functions whenever you can.
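For illustration only (sketched in Python rather than Erlang, with made-up names), here is the difference between a forward (body) recursive sum and a tail-recursive one with an accumulator:
def sum_forward(xs):
    # body recursion: the addition happens after the call returns,
    # so every element keeps a stack frame alive
    if not xs:
        return 0
    return xs[0] + sum_forward(xs[1:])

def sum_tail(xs, acc=0):
    # tail recursion: nothing is left to do after the call,
    # so an optimizing compiler can reuse the current frame
    if not xs:
        return acc
    return sum_tail(xs[1:], acc + xs[0])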
In the chapter about functions in the Oz tutorial, it says that:
similar to lazy functional languages Oz allows certain forms of tail-recursion optimizations that are not found in certain strict functional languages including Standard ML, Scheme, and the concurrent functional language Erlang. However, standard function definitions in Oz are not lazy.
It then goes on to show the following function which is tail-recursive in Oz:
fun {Map Xs F}
   case Xs
   of nil then nil
   [] X|Xr then {F X}|{Map Xr F}
   end
end
What this does is map the empty list to the empty list, and a non-empty list to the result of applying the function F to its head and then prepending that to the result of calling Map on the tail. In other languages this would not be tail recursive, because the last operation is the prepend, not the recursive call to Map.
So my question is: If "standard function definitions in Oz are not lazy", what does Oz do that languages like Scheme or Erlang can't (or won't?) to be able to perform tail-recursion optimization for this function? And exactly when is a function tail-recursive in Oz?
This is called Tail Recursion Modulo Cons. Basically, prepending to the list directly after the recursive call is the same as appending to the list directly before the recursive call (and thus building the list as a "side-effect" of the purely functional "loop"). This is a generalization of tail recursion that works not just with cons lists but any data constructor with constant operations.
It was first described (but not named) as a LISP compilation technique in 1974 by Daniel P. Friedman and David S. Wise in Technical Report TR19: Unwinding Structured Recursions into Iterations and it was formally named and introduced by David H. D. Warren in 1980 in the context of writing the first-ever Prolog compiler.
The interesting thing about Oz, though, is that TRMC is neither a language feature nor an explicit compiler optimization, it's just a side-effect of the language's execution semantics. Specifically, the fact that Oz is a declarative concurrent constraint language, which means that every variable is a dataflow variable (or "everything is a promise", including every storage location). Since everything is a promise, we can model returning from a function as first setting up the return value as a promise, and then later on fulfilling it.
Peter Van Roy, co-author (with Seif Haridi) of the book Concepts, Techniques, and Models of Computer Programming, one of the designers of Oz, and one of its implementers, explains how exactly TRMC works in a comment thread on Lambda the Ultimate: Tail-recursive map and declarative agents:
The above example of bad Scheme code turns into good tail-recursive Oz code when translated directly into Oz syntax. This gives:
fun {Map F Xs}
   if Xs==nil then nil
   else {F Xs.1}|{Map F Xs.2} end
end
This is because Oz has single-assignment variables. To understand the execution, we translate this example into the Oz kernel language (I give just a partial translation for clarity):
proc {Map F Xs Ys}
   if Xs==nil then Ys=nil
   else local Y Yr in
      Ys=Y|Yr
      {F Xs.1 Y}
      {Map F Xs.2 Yr}
   end end
end
That is, Map is tail-recursive because Yr is initially unbound. This is not just a clever trick; it is profound because it allows declarative concurrency and declarative multi-agent systems.
I am not too familiar with lazy functional languages, but if you think about the function Map in your question, it is easy to translate it to a tail-recursive implementation if temporarily incomplete values in the heap are allowed (mutated into more complete values one call at a time).
I have to assume that they are talking about this transformation in Oz. Lispers used to do this optimization by hand -- all values were mutable, and in this case a function called setcdr would be used -- but you had to know what you were doing. Computers did not always have gigabytes of memory. It was justified to do this by hand; it arguably no longer is.
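As a hedged sketch of that hand-rolled technique (in Python, with made-up names; a minimal mutable cons cell stands in for Lisp's), the recursive map becomes a loop that keeps filling in the tail of the last cell it allocated, which is exactly the setcdr trick:
class Cons:
    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail    # the "hole" that will be filled in later

def map_trmc(f, xs):
    if xs is None:
        return None
    result = Cons(f(xs.head))   # allocate the first cell, tail still unknown
    last = result
    xs = xs.tail
    while xs is not None:
        last.tail = Cons(f(xs.head))  # the setcdr step: fill the hole, leave a new one
        last = last.tail
        xs = xs.tail
    return result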
Back to your question, other modern languages probably do not do it automatically because it would be possible to observe the incomplete value while it is being built, and this must be what Oz has found a solution to. What other differences are there in Oz as compared to other languages that would explain it?
How good is 'pure' functional programming for basic routine implementations, e.g. list sorting, string matching etc.?
It's common to implement such basic functions within the base interpreter of any functional language, which means that they will be written in an imperative language (C/C++). Although there are many exceptions...
At least, I wish to ask: How difficult is it to emulate imperative style while coding in 'pure' functional language?
How good is 'pure' functional programming for basic routine implementations, e.g. list sorting, string matching etc.?
Very. I'll do your problems in Haskell, and I'll be slightly verbose about it. My aim is not to convince you that the problem can be done in 5 characters (it probably can in J!), but rather to give you an idea of the constructs.
import Data.List -- for `sort`
stdlistsorter :: (Ord a) => [a] -> [a]
stdlistsorter list = sort list
Sorting a list using the sort function from Data.List
import Data.List -- for `delete`
selectionsort :: (Ord a) => [a] -> [a]
selectionsort [] = []
selectionsort list = minimum list : (selectionsort . delete (minimum list) $ list)
Selection sort implementation.
quicksort :: (Ord a) => [a] -> [a]
quicksort [] = []
quicksort (x:xs) =
    let smallerSorted = quicksort [a | a <- xs, a <= x]
        biggerSorted  = quicksort [a | a <- xs, a > x]
    in  smallerSorted ++ [x] ++ biggerSorted
Quick sort implementation.
import Data.List -- for `isInfixOf`
stdstringmatch :: (Eq a) => [a] -> [a] -> Bool
stdstringmatch list1 list2 = list1 `isInfixOf` list2
String matching using the isInfixOf function from Data.List
It's common to implement such basic functions within the base interpreter of any functional language, which means that they will be written in an imperative language (C/C++). Although there are many exceptions...
Depends. Some functions are more naturally expressed imperatively. However, I hope I have convinced you that some algorithms are also expressed naturally in a functional way.
At least, I wish to ask: How difficult is it to emulate imperative style while coding in 'pure' functional language?
It depends on how hard you find monads in Haskell. Personally, I find them quite difficult to grasp.
1) Good by what standard? What properties do you desire?
List sorting? Easy. Let's do Quicksort in Haskell:
sort [] = []
sort (x:xs) = sort (filter (< x) xs) ++ [x] ++ sort (filter (>= x) xs)
This code has the advantage of being extremely easy to understand. If the list is empty, it's sorted. Otherwise, call the first element x, find elements less than x and sort them, find elements greater than x and sort those. Then concatenate the sorted lists with x in the middle. Try making that look comprehensible in C++.
Of course, Mergesort is much faster for sorting linked lists, but the code is also 6 times longer.
2) It's extremely easy to implement imperative style while staying purely functional. The essence of imperative style is sequencing of actions. Actions are sequenced in a pure setting by using monads. The essence of monads is the binding function:
(>>=) :: (Monad m) => m a -> (a -> m b) -> m b
This function exists in C++, and it's called ;.
A sequence of actions in Haskell, for example, is written thusly:
putStrLn "What's your name?" >>=
const (getLine >>= \name -> putStrLn ("Hello, " ++ name))
Some syntax sugar is available to make this look more imperative (but note that this is the exact same code):
do {
    putStrLn "What's your name?";
    name <- getLine;
    putStrLn ("Hello, " ++ name);
}
Nearly all functional programming languages have some construct to allow for imperative coding (like do in Haskell). There are many problem domains that can't be solved with "pure" functional programming. One of those is network protocols, for example, where you need a series of commands issued in the right order; such things don't lend themselves well to pure functional programming.
I have to agree with Lothar, though, that list sorting and string matching are not really examples you need to solve imperatively. There are well-known algorithms for such things and they can be implemented efficiently in functional languages already.
I think that 'algorithms' (e.g. method bodies and basic data structures) are where functional programming is at its best. Assuming nothing completely IO/state-dependent, functional programming excels at authoring algorithms and data structures, often resulting in shorter/simpler/cleaner code than you'd get with an imperative solution. (Don't emulate imperative style; FP style is better for most of these kinds of tasks.)
You want imperative stuff sometimes to deal with IO or low-level performance, and you want OOP for partitioning the high-level design and architecture of a large program, but "in the small" where you write most of your code, FP is a win.
See also
How does functional programming affect the structure of your code?
It works pretty well the other way round: emulating functional style with an imperative one.
Remember that the internals of an interpreter or VM are so close to the metal and so performance-critical that you should even consider going down to assembler level and counting the clock cycles for each instruction (Dolphin Smalltalk does just that, and the results are impressive).
CPUs are imperative.
But there is no problem implementing all the basic algorithms; the ones you mention are NOT low-level, they are basics.
I don't know about list sorting, but you'd be hard pressed to bootstrap a language without some kind of string matching in the compiler or runtime, so you need that routine to create the language. As there isn't a great deal of point in writing the same code twice, when you create the library for matching strings within the language, you call the code written earlier. The degree to which this happens in successive releases will depend on how self-hosting the language is, but unless that's a strong design goal there won't be any reason to change it.