Quicksort and tail recursive optimization - recursion

In Introduction to Algorithms p169 it talks about using tail recursion for Quicksort.
The original Quicksort algorithm earlier in the chapter is (in pseudo-code)
Quicksort(A, p, r)
{
if (p < r)
{
q: <- Partition(A, p, r)
Quicksort(A, p, q)
Quicksort(A, q+1, r)
}
}
The optimized version using tail recursion is as follows
Quicksort(A, p, r)
{
while (p < r)
{
q: <- Partition(A, p, r)
Quicksort(A, p, q)
p: <- q+1
}
}
Where Partition sorts the array according to a pivot.
The difference is that the second algorithm only calls Quicksort once to sort the LHS.
Can someone explain to me why the 1st algorithm could cause a stack overflow, whereas the second wouldn't? Or am I misunderstanding the book.

First let's start with a brief, probably not accurate but still valid, definition of what stack overflow is.
As you probably know right now there are two different kind of memory which are implemented in too different data structures: Heap and Stack.
In terms of size, the Heap is bigger than the stack, and to keep it simple let's say that every time a function call is made a new environment(local variables, parameters, etc.) is created on the stack. So given that and the fact that stack's size is limited, if you make too many function calls you will run out of space hence you will have a stack overflow.
The problem with recursion is that, since you are creating at least one environment on the stack per iteration, then you would be occupying a lot of space in the limited stack very quickly, so stack overflow are commonly associated with recursion calls.
So there is this thing called Tail recursion call optimization that will reuse the same environment every time a recursion call is made and so the space occupied in the stack is constant, preventing the stack overflow issue.
Now, there are some rules in order to perform a tail call optimization. First, each call most be complete and by that I mean that the function should be able to give a result at any moment if you interrupts the execution, in SICP
this is called an iterative process even when the function is recursive.
If you analyze your first example, you will see that each iteration is defined by two recursive calls, which means that if you stop the execution at any time you won't be able to give a partial result because you the result depends of those calls to be finished, in this scenario you can't reuse the stack environment because the total information is split between all those recursive calls.
However, the second example doesn't have that problem, A is constant and the state of p and r can be locally determined, so since all the information to keep going is there then TCO can be applied.

The essence of the tail recursion optimization is that there is no recursion when the program is actually executed. When the compiler or interpreter is able to kick TRO in, it means that it will essentially figure out how to rewrite your recursively-defined algorithm into a simple iterative process with the stack not used to store nested function invocations.
The first code snippet can't be TR-optimized because there are 2 recursive calls in it.

Tail recursion by itself is not enough. The algorithm with the while loop can still use O(N) stack space, reducing it to O(log(N)) is left as exercise in that section of CLRS.
Assume we are working in a language with array slices and tail call optimization. Consider the difference between these two algorithms:
Bad:
Quicksort(arraySlice) {
if (arraySlice.length > 1) {
slices = Partition(arraySlice)
(smallerSlice, largerSlice) = sortBySize(slices)
Quicksort(largerSlice) // Not a tail call, requires a stack frame until it returns.
Quicksort(smallerSlice) // Tail call, can replace the old stack frame.
}
}
Good:
Quicksort(arraySlice) {
if (arraySlice.length > 1){
slices = Partition(arraySlice)
(smallerSlice, largerSlice) = sortBySize(slices)
Quicksort(smallerSlice) // Not a tail call, requires a stack frame until it returns.
Quicksort(largerSlice) // Tail call, can replace the old stack frame.
}
}
The second one is guarenteed to never need more than log2(length) stack frames because smallerSlice is less than half as long as arraySlice. But for the first one, the inequality is reversed and it will always need more than or equal to log2(length) stack frames, and can require O(N) stack frames in the worst case where smallerslice always has length 1.
If you don't keep track of which slice is smaller or larger, you will have similar worst cases to the first overflowing case, even though it will require O(log(n)) stack frames on average. If you always sort the smaller slice first, you will never need more than log_2(length) stack frames.
If you are using a language that doesn't have tail call optimization, you can write the second (not stack-blowing) version as:
Quicksort(arraySlice) {
while (arraySlice.length > 1) {
slices = Partition(arraySlice)
(smallerSlice, arraySlice) = sortBySize(slices)
Quicksort(smallerSlice) // Still not a tail call, requires a stack frame until it returns.
}
}
Another thing worth noting is that if you are implementing something like Introsort which changes to Heapsort if the recursion depth exceeds some number proportional to log(N), you will never hit the O(N) worst case stack memory usage of quicksort, so you technically don't need to do this. Doing this optimization (popping smaller slices first) still improves the constant factor of the O(log(N)) though, so it is strongly recommended.

Well, the most obvious observation would be:
Most common stack overflow problem - definition
The most common cause of stack overflow is excessively deep or infinite recursion.
The second uses less deep recursion than the first (n branches per call instead of n^2) hence it is less likely to cause a stack overflow..
(so lower complexity means less chance to cause a stack overflow)
But somebody would have to add why the second can never cause a stack overflow while the first can.
source

Well If you consider the complexity of the two methods the first method obviously has more complexity than the second since it calls Recursion on both LHS and RHS as a result there are more chances of getting stack overflow
Note: That doesnt mean that there are absolutely no chances of getting SO in second method

In the function 2 that you shared, Tail Call elimination is implemented. Before proceeding further let us understand what is tail recursion function?. If the last statement in the code is the recursive call and does do anything after that, then it is called tail recursive function. So the first function is a tail recursion function. For such a function with some changes in the code one can remove the last recursion call like you showed in function 2 which performs the same work as function 1. This process is called tail recursion optimization or tail call elimination and following are the result of it
Optimizing in terms of auxiliary space
Optimizing in terms of recursion call overhead
Last recursive call is eliminated by using the while loop. The good thing is that for function 2, no auxiliary space is used for the right call as its recursion is eliminated using p: <- q+1 and the overall function does not have recursion call overhead. So whatever way partition happens maximum space needed is theta(log n)

Related

Iterative Postorder Traversal of Binary Tree in Python Optimality

I'm brushing up on leet code tree questions and every solution of Iterative Postorder Traversal of Binary Tree in Python type problems seem to use recursion.
Since there is no tail recursion in python, I believe iterative algorithm is faster because it takes less time to move around a stack than to jump through stack call frames.
Additionally I believe the iterative approach uses less memory because tracking just one stack takes up less space than the call frame stack from recursion.
I understand the typical approach to postorder iterative traversal takes 2 stacks and while loops so it seems to complicated.
However, there is an algorithm to use just 1 stack and while loop.
Here it is:
def postorderIterative(root):
curr = root
stack = []
while current or stack:
if current:
stack.append[current]
current = current.left
else:
tmp = stack[-1].right
if not tmp:
tmp = stack.pop()
#process postorder here
while stack and tmp == stack[-1].right:
tmp = stack.pop()
#process postorder here
else:
current = tmp
credit and explanation goes to here
Is there any reason why most video solutions seem to use recursive postorder traversal more than iterative even though iterative is better? is it because recursive is easier to understand?
It seems that with a big enough test case the recursive may fail in any call so I tend to always do iterative. Is this a valid approach?
I literally cannot find an iterative solution video to
leet code: 543. Diameter of Binary Tree
these are all recursive and thus suboptimal, even though they say it is optimal.
Please let me know if I am wrong and why. If not. please let me know as well. I am here to learn!
Both the iterative and recursive implementations have the same time and space complexity. The difference in running time is insignificant in terms of big O. When speaking of optimal, people usually mean the time/space complexity is optimal.
Secondly, the call stack limitation will only be a problem for a binary tree that has a height that goes into the hundreds. If a tree of such height were balanced, it would have more nodes than there are atoms in the universe. So we would only get into stack limitations when the binary tree is awfully skewed, which would make it rather useless for a real life problem. In short, I don't think the call stack limitation is such an important argument here.

In functional languages, how is the compiler able to translate non-tail recursion into loops to avoid stack overflows (if at all)?

I've been recently learning about functional languages and how many don't include for loops. While I don't personally view recursion as more difficult than a for loop (and often easier to reason out) I realized that many examples of recursion aren't tail recursive and therefor cannot use simple tail recursion optimization in order to avoid stack overflows. According to this question, all iterative loops can be translated into recursion, and those iterative loops can be transformed into tail recursion, so it confuses me when the answers on a question like this suggest that you have to explicitly manage the translation of your recursion into tail recursion yourself if you want to avoid stack overflows. It seems like it should be possible for a compiler to do all the translation from either recursion to tail recursion, or from recursion straight to an iterative loop with out stack overflows.
Are functional compilers able to avoid stack overflows in more general recursive cases? Are you really forced to transform your recursive code in order to avoid stack overflows yourself? If they aren't able to perform general recursive stack-safe compilation, why aren't they?
Any recursive function can be converted into a tail recursive one.
For instance, consider the transition function of a Turing machine, that
is the mapping from a configuration to the next one. To simulate the
turing machine you just need to iterate the transition function until
you reach a final state, that is easily expressed in tail recursive
form. Similarly, a compiler typically translates a recursive program into
an iterative one simply adding a stack of activation records.
You can also give a translation into tail recursive form using continuation
passing style (CPS). To make a classical example, consider the fibonacci
function.
This can be expressed in CPS style in the following way, where the second
parameter is the continuation (essentially, a callback function):
def fibc(n, cont):
if n <= 1:
return cont(n)
return fibc(n - 1, lambda a: fibc(n - 2, lambda b: cont(a + b)))
Again, you are simulating the recursion stack using a dynamic data structure:
in this case, lambda abstractions.
The use of dynamic structures (lists, stacks, functions, etc.) in all previous
examples is essential. That is to say, that in order to simulate a generic
recursive function iteratively, you cannot avoid dynamic memory allocation,
and hence you cannot avoid stack overflow, in general.
So, memory consumption is not only related to the iterative/recursive
nature of the program. On the other side, if you prevent dynamic memory
allocation, your
programs are essentially finite state machines, with limited computational
capabilities (more interesting would be to parametrise memory according to
the dimension of inputs).
In general, in the same way as you cannot predict termination, you cannot
predict an unbound memory consumption of your program: working with
a Turing complete language, at compile time
you cannot avoid divergence, and you cannot avoid stack overflow.
Tail Call Optimization:
The natural way to do arguments and calls is to sort out the cleaning up when exiting or when returning.
For tail calls to work you need to alter it so that the tail call inherits the current frame. Thus instead of making a new frame it massages the frame so that the next call returns to the current functions caller instead of this function, which really only cleans up and returns if it's a tail call.
Thus TCO is all about cleaning up before the last call.
Continuation Passing Style - make tail calls out of everything
A compiler can change the code such that it only does primitive operations and pass it to continuations. Thus the stack usage gets moved onto the heap since the computation to be continued is made a function.
An example is:
function hypotenuse(k1, k2) {
return sqrt(add(square(k1), square(k2)))
}
becomes
function hypotenuse(k, k1, k2) {
(function (sk1) {
(function (sk2) {
(function (ar) {
k(sqrt(ar));
}(add(sk1,sk2));
}(square(k2));
}(square(k1));
}
Notice every function has exactly one call now and the order of evaluation is set.
According to this question, all iterative loops can be translated into recursion
"Translated" might be a bit of a stretch. The proof that for every iterative loop there is an equivalent recursive program is trivial if you understand Turing completeness: since a Turing machine can be implemented using strictly iterative structures and strictly recursive structures, every program that can be expressed in an iterative language can be expressed in a recursive language, and vice-versa. This means that for every iterative loop there is an equivalent recursive construct (and the other way around). However, that doesn't mean we have some automated way of transforming one into the other.
and those iterative loops can be transformed into tail recursion
Tail recursion can perhaps be easily transformed into an iterative loop, and the other way around. But not all recursion is tail recursion. Here's an example. Suppose we have some binary tree. It consists of nodes. Each node can have a left and a right child and a value. If a node has no children, then isLeaf returns true for it. We'll assume there's some function max that returns the maximum of two values, and if one of the values is null it returns the other one. Now we want to define a function that finds the maximum value among all the leaf nodes. Here it is in some pseudo-code I cooked up.
findmax(node) {
if (node == null) {
return null
}
if (node.isLeaf) {
return node.value
} else {
return max(findmax(node.left), findmax(node.right))
}
}
There's two recursive calls in the max function, so we can't optimize for tail recursion. We need the results of both before we can supply them to the max function and determine the result of the call for the current node.
Now, there may be a way of getting the same result, using recursion and only a single tail-recursive call. It is functionally equivalent, but it is a different algorithm. Compilers can do a lot of transformations to create a functionally equivalent program with lots of optimizations, but they're not quite clever enough to create functionally equivalent algorithms.
Even the transformation of a function that only calls itself recursively once into a tail-recursive version would be far from trivial. Such an adaptation usually employs some argument passed into the recursive invocation that is used as an "accumulator" for the current results.
Look at the next naive implementation for calculating a factorial of a number (e.g. fact(5) = 5*4*3*2*1):
fact(number) {
if (number == 1) {
return 1
} else {
return number * fact(number - 1)
}
}
It's not tail-recursive. But it can be made so in this way:
fact(number, acc) {
if (number == 1) {
return acc
} else {
return fact(number - 1, number * acc)
}
}
// Helper function
fact(number) {
return fact(number, 1)
}
This requires an interpretation of what is being done. Recognizing the case for stuff like this is easy enough, but what if you call a function instead of a multiplication? How will the compiler know that for the initial call the accumulator must be 1 and not, say, 0? How do you translate this program?
recsub(number) {
if (number == 1) {
return 1
} else {
return number - recsub(number - 1)
}
}
This is as of yet outside the scope of the sort of compiler we have now, and may in fact always be.
Maybe it would be interesting to ask this on the computer science Stack Exchange to see if they know of some papers or proofs that investigate this more in-depth.

Recursion in java help

I am new to the site and am not familiar with how and where to post so please excuse me. I am currently studying recursion and am having trouble understanding the output of this program. Below is the method body.
public static int Asterisk(int n)
{
if (n<1)
return;
Asterisk(n-1);
for (int i = 0; i<n; i++)
{
System.out.print("*");
}
System.out.println();
}
This is the output
*
**
***
****
*****
it is due to the fact that the "Asterisk(n-1)" lies before the for loop.
I would think that the output should be
****
***
**
*
This is the way head recursion works. The call to the function is made before execution of other statements. So, Asterisk(5) calls Asterisk(4) before doing anything else. This further cascades into serial function calls from Asterisk(3) → Asterisk(2) → Asterisk(1) → Asterisk(0).
Now, Asterisk(0) simply returns as it passes the condition n<1. The control goes back to Asterisk(1) which now executes the rest of its code by printing n=1 stars. Then it relinquishes control to Asterisk(2) which again prints n=2 stars, and so on. Finally, Asterisk(5) prints its n=5 stars and the function calls end. This is why you see the pattern of ascending number of stars.
There are two ways to create programming loops. One is using imperative loops normally native to the language (for, while, etc) and the other is using functions (functional loops). In your example the two kinds of loops are presented.
One loop is the unrolling of the function
Asterisk(int n)
This unrolling uses recursion, where the function calls itself. Every functional loop must know when to stop, otherwise it goes on forever and blows up the stack. This is called the "stopping condition". In your case it is :
if (n<1)
return;
There is bidirectional equivalence between functional loops and imperative loops (for, while, etc). You can turn any functional loop into a regular loop and vice versa.
IMO this particular exercise was meant to show you the two different ways to build loops. The outer loop is functional (you could substitute it for a for loop) and the inner loop is imperative.
Think of recursive calls in terms of a stack. A stack is a data structure which adds to the top of a pile. A real world analogy is a pile of dishes where the newest dish goes on the top. Therefore recursive calls add another layer to the top of the stack, then once some criteria is met which prevents further recursive calls, the stack starts to unwind and we work our way back down to the original item (the first plate in pile of dishes).
The input of a recursive method tends towards a base case which is the termination factor and prevents the method from calling itself indefinitely (infinite loop). Once this base condition is met the method returns a value rather than calling itself again. This is how the stack in unwound.
In your method, the base case is when $n<1$ and the recursive calls use the input $n-1$. This means the method will call itself, each time decreasing $n$ by 1, until $n<1$ i.e. $n=0$. Once the base condition is met, the value 0 is returned and we start to execute the $for$ loop. This is why the first line contains a single asterix.
So if you run the method with an input of 5, the recursive calls build a stack of values of $n$ as so
0
1
2
3
4
5
Then this stack is unwound starting with the top, 0, all the way down to 5.

R programming: Level of allowed recursion depth differs when calling a local helper function

this is a purely academic question:
I have recently been working with languages that use tail recursion optimization. For practice I wrote two recursive implementations of sum functions in R, one of them being tail recursive. I quickly realized there is no tail recursion optimization in R. I can live with that.
However, I also noticed a different level of allowed depth when using the local helper function for the tail recursion.
Here is the code:
## Recursive
sum <- function (i, end, fun){
if (i>=end) 0
else fun(i) + sum(i+1, end, fun)
}
## Tail recursive
sum_tail <- function (i, end, fun){
sum_helper<- function(i, acc){
if (i>=end) acc
else sum_helper(i+1, acc+fun(i))
}
sum_helper(i, 0)
}
## Simple example
harmonic <- function(k){
return(1/(k))
}
print(sum(1, 1200, harmonic)) # <- This works fine, but is close to the limit
# print(sum_tail(1, 1200, harmonic)) <- This will crash
print(sum_tail(1, 996, harmonic)) # <- This is the deepest allowed
I am fairly intrigued. Can someone explain this behavior or point me towards a document explaining how the allowed recursion depth is calculated?
I'm not sure of R's internal implementation of the call stack, but it's pretty obvious from here that there is a maximum stack depth. (Many languages have this for various reasons, mostly related to memory and detecting infinite recursion.) You can set it with options(), and the default setting seems to depends on the platform -- on my machine, I can do print(sum_tail(1, 996, harmonic)) without difficulty.
Sidebar: you really shouldn't name your naive implementation sum() because you wind up shadowing a builtin. I know you're just playing with recursion here, but you should also generally avoid doing your own implementation of sum() -- it's not provided just as a convenience function but also because it's non trivial to implement a numerically correct version of sum() with floating point.
In your naive implementation, the call to fun() returns before the recursive call -- this means that each recursive call increases the depth of the call stack by exactly 1. In the other case, you've got an additional function call that's waiting to be evaluated. For more details, you should look into how R handles closures and how lazy / eager evaluation in R is handled. If I recall correctly, R uses environments (roughly, R's notion of scope, and deeply related to closures) to wrap arguments in certain situations and delay their evaluation, thus effectively using lazy evaluation. There's a lot of information on R internals available online, see here for a quick overview of argument evaluation. I'm not sure how accurate I am on the details, but it seems that the arguments to the tail-call are themselves getting placed on the call stack, thus increasing the depth of the call-stack by more than 1.
Sidebar the Second: I don't recall well enough how R implements this, and I know placing helper functions in the body is common practice, but placing a helper function definition in the recursive call could lead to each recursive call defining the helper function anew. This could interact in various ways with the way environments and closures are handled, but I'm not sure.
The functions traceback() and trace() could be useful in exploring the call behavior if you're curious about more of the details.

What is recursion and when should I use it?

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
One of the topics that seems to come up regularly on mailing lists and online discussions is the merits (or lack thereof) of doing a Computer Science Degree. An argument that seems to come up time and again for the negative party is that they have been coding for some number of years and they have never used recursion.
So the question is:
What is recursion?
When would I use recursion?
Why don't people use recursion?
There are a number of good explanations of recursion in this thread, this answer is about why you shouldn't use it in most languages.* In the majority of major imperative language implementations (i.e. every major implementation of C, C++, Basic, Python, Ruby,Java, and C#) iteration is vastly preferable to recursion.
To see why, walk through the steps that the above languages use to call a function:
space is carved out on the stack for the function's arguments and local variables
the function's arguments are copied into this new space
control jumps to the function
the function's code runs
the function's result is copied into a return value
the stack is rewound to its previous position
control jumps back to where the function was called
Doing all of these steps takes time, usually a little bit more than it takes to iterate through a loop. However, the real problem is in step #1. When many programs start, they allocate a single chunk of memory for their stack, and when they run out of that memory (often, but not always due to recursion), the program crashes due to a stack overflow.
So in these languages recursion is slower and it makes you vulnerable to crashing. There are still some arguments for using it though. In general, code written recursively is shorter and a bit more elegant, once you know how to read it.
There is a technique that language implementers can use called tail call optimization which can eliminate some classes of stack overflow. Put succinctly: if a function's return expression is simply the result of a function call, then you don't need to add a new level onto the stack, you can reuse the current one for the function being called. Regrettably, few imperative language-implementations have tail-call optimization built in.
* I love recursion. My favorite static language doesn't use loops at all, recursion is the only way to do something repeatedly. I just don't think that recursion is generally a good idea in languages that aren't tuned for it.
** By the way Mario, the typical name for your ArrangeString function is "join", and I'd be surprised if your language of choice doesn't already have an implementation of it.
Simple english example of recursion.
A child couldn't sleep, so her mother told her a story about a little frog,
who couldn't sleep, so the frog's mother told her a story about a little bear,
who couldn't sleep, so the bear's mother told her a story about a little weasel...
who fell asleep.
...and the little bear fell asleep;
...and the little frog fell asleep;
...and the child fell asleep.
In the most basic computer science sense, recursion is a function that calls itself. Say you have a linked list structure:
struct Node {
Node* next;
};
And you want to find out how long a linked list is you can do this with recursion:
int length(const Node* list) {
if (!list->next) {
return 1;
} else {
return 1 + length(list->next);
}
}
(This could of course be done with a for loop as well, but is useful as an illustration of the concept)
Whenever a function calls itself, creating a loop, then that's recursion. As with anything there are good uses and bad uses for recursion.
The most simple example is tail recursion where the very last line of the function is a call to itself:
int FloorByTen(int num)
{
if (num % 10 == 0)
return num;
else
return FloorByTen(num-1);
}
However, this is a lame, almost pointless example because it can easily be replaced by more efficient iteration. After all, recursion suffers from function call overhead, which in the example above could be substantial compared to the operation inside the function itself.
So the whole reason to do recursion rather than iteration should be to take advantage of the call stack to do some clever stuff. For example, if you call a function multiple times with different parameters inside the same loop then that's a way to accomplish branching. A classic example is the Sierpinski triangle.
You can draw one of those very simply with recursion, where the call stack branches in 3 directions:
private void BuildVertices(double x, double y, double len)
{
if (len > 0.002)
{
mesh.Positions.Add(new Point3D(x, y + len, -len));
mesh.Positions.Add(new Point3D(x - len, y - len, -len));
mesh.Positions.Add(new Point3D(x + len, y - len, -len));
len *= 0.5;
BuildVertices(x, y + len, len);
BuildVertices(x - len, y - len, len);
BuildVertices(x + len, y - len, len);
}
}
If you attempt to do the same thing with iteration I think you'll find it takes a lot more code to accomplish.
Other common use cases might include traversing hierarchies, e.g. website crawlers, directory comparisons, etc.
Conclusion
In practical terms, recursion makes the most sense whenever you need iterative branching.
Recursion is a method of solving problems based on the divide and conquer mentality.
The basic idea is that you take the original problem and divide it into smaller (more easily solved) instances of itself, solve those smaller instances (usually by using the same algorithm again) and then reassemble them into the final solution.
The canonical example is a routine to generate the Factorial of n. The Factorial of n is calculated by multiplying all of the numbers between 1 and n. An iterative solution in C# looks like this:
public int Fact(int n)
{
int fact = 1;
for( int i = 2; i <= n; i++)
{
fact = fact * i;
}
return fact;
}
There's nothing surprising about the iterative solution and it should make sense to anyone familiar with C#.
The recursive solution is found by recognising that the nth Factorial is n * Fact(n-1). Or to put it another way, if you know what a particular Factorial number is you can calculate the next one. Here is the recursive solution in C#:
public int FactRec(int n)
{
if( n < 2 )
{
return 1;
}
return n * FactRec( n - 1 );
}
The first part of this function is known as a Base Case (or sometimes Guard Clause) and is what prevents the algorithm from running forever. It just returns the value 1 whenever the function is called with a value of 1 or less. The second part is more interesting and is known as the Recursive Step. Here we call the same method with a slightly modified parameter (we decrement it by 1) and then multiply the result with our copy of n.
When first encountered this can be kind of confusing so it's instructive to examine how it works when run. Imagine that we call FactRec(5). We enter the routine, are not picked up by the base case and so we end up like this:
// In FactRec(5)
return 5 * FactRec( 5 - 1 );
// which is
return 5 * FactRec(4);
If we re-enter the method with the parameter 4 we are again not stopped by the guard clause and so we end up at:
// In FactRec(4)
return 4 * FactRec(3);
If we substitute this return value into the return value above we get
// In FactRec(5)
return 5 * (4 * FactRec(3));
This should give you a clue as to how the final solution is arrived at so we'll fast track and show each step on the way down:
return 5 * (4 * FactRec(3));
return 5 * (4 * (3 * FactRec(2)));
return 5 * (4 * (3 * (2 * FactRec(1))));
return 5 * (4 * (3 * (2 * (1))));
That final substitution happens when the base case is triggered. At this point we have a simple algrebraic formula to solve which equates directly to the definition of Factorials in the first place.
It's instructive to note that every call into the method results in either a base case being triggered or a call to the same method where the parameters are closer to a base case (often called a recursive call). If this is not the case then the method will run forever.
Recursion is solving a problem with a function that calls itself. A good example of this is a factorial function. Factorial is a math problem where factorial of 5, for example, is 5 * 4 * 3 * 2 * 1. This function solves this in C# for positive integers (not tested - there may be a bug).
public int Factorial(int n)
{
if (n <= 1)
return 1;
return n * Factorial(n - 1);
}
Recursion refers to a method which solves a problem by solving a smaller version of the problem and then using that result plus some other computation to formulate the answer to the original problem. Often times, in the process of solving the smaller version, the method will solve a yet smaller version of the problem, and so on, until it reaches a "base case" which is trivial to solve.
For instance, to calculate a factorial for the number X, one can represent it as X times the factorial of X-1. Thus, the method "recurses" to find the factorial of X-1, and then multiplies whatever it got by X to give a final answer. Of course, to find the factorial of X-1, it'll first calculate the factorial of X-2, and so on. The base case would be when X is 0 or 1, in which case it knows to return 1 since 0! = 1! = 1.
Consider an old, well known problem:
In mathematics, the greatest common divisor (gcd) … of two or more non-zero integers, is the largest positive integer that divides the numbers without a remainder.
The definition of gcd is surprisingly simple:
where mod is the modulo operator (that is, the remainder after integer division).
In English, this definition says the greatest common divisor of any number and zero is that number, and the greatest common divisor of two numbers m and n is the greatest common divisor of n and the remainder after dividing m by n.
If you'd like to know why this works, see the Wikipedia article on the Euclidean algorithm.
Let's compute gcd(10, 8) as an example. Each step is equal to the one just before it:
gcd(10, 8)
gcd(10, 10 mod 8)
gcd(8, 2)
gcd(8, 8 mod 2)
gcd(2, 0)
2
In the first step, 8 does not equal zero, so the second part of the definition applies. 10 mod 8 = 2 because 8 goes into 10 once with a remainder of 2. At step 3, the second part applies again, but this time 8 mod 2 = 0 because 2 divides 8 with no remainder. At step 5, the second argument is 0, so the answer is 2.
Did you notice that gcd appears on both the left and right sides of the equals sign? A mathematician would say this definition is recursive because the expression you're defining recurs inside its definition.
Recursive definitions tend to be elegant. For example, a recursive definition for the sum of a list is
sum l =
if empty(l)
return 0
else
return head(l) + sum(tail(l))
where head is the first element in a list and tail is the rest of the list. Note that sum recurs inside its definition at the end.
Maybe you'd prefer the maximum value in a list instead:
max l =
if empty(l)
error
elsif length(l) = 1
return head(l)
else
tailmax = max(tail(l))
if head(l) > tailmax
return head(l)
else
return tailmax
You might define multiplication of non-negative integers recursively to turn it into a series of additions:
a * b =
if b = 0
return 0
else
return a + (a * (b - 1))
If that bit about transforming multiplication into a series of additions doesn't make sense, try expanding a few simple examples to see how it works.
Merge sort has a lovely recursive definition:
sort(l) =
if empty(l) or length(l) = 1
return l
else
(left,right) = split l
return merge(sort(left), sort(right))
Recursive definitions are all around if you know what to look for. Notice how all of these definitions have very simple base cases, e.g., gcd(m, 0) = m. The recursive cases whittle away at the problem to get down to the easy answers.
With this understanding, you can now appreciate the other algorithms in Wikipedia's article on recursion!
A function that calls itself
When a function can be (easily) decomposed into a simple operation plus the same function on some smaller portion of the problem. I should say, rather, that this makes it a good candidate for recursion.
They do!
The canonical example is the factorial which looks like:
int fact(int a)
{
if(a==1)
return 1;
return a*fact(a-1);
}
In general, recursion isn't necessarily fast (function call overhead tends to be high because recursive functions tend to be small, see above) and can suffer from some problems (stack overflow anyone?). Some say they tend to be hard to get 'right' in non-trivial cases but I don't really buy into that. In some situations, recursion makes the most sense and is the most elegant and clear way to write a particular function. It should be noted that some languages favor recursive solutions and optimize them much more (LISP comes to mind).
A recursive function is one which calls itself. The most common reason I've found to use it is traversing a tree structure. For example, if I have a TreeView with checkboxes (think installation of a new program, "choose features to install" page), I might want a "check all" button which would be something like this (pseudocode):
function cmdCheckAllClick {
checkRecursively(TreeView1.RootNode);
}
function checkRecursively(Node n) {
n.Checked = True;
foreach ( n.Children as child ) {
checkRecursively(child);
}
}
So you can see that the checkRecursively first checks the node which it is passed, then calls itself for each of that node's children.
You do need to be a bit careful with recursion. If you get into an infinite recursive loop, you will get a Stack Overflow exception :)
I can't think of a reason why people shouldn't use it, when appropriate. It is useful in some circumstances, and not in others.
I think that because it's an interesting technique, some coders perhaps end up using it more often than they should, without real justification. This has given recursion a bad name in some circles.
Recursion is an expression directly or indirectly referencing itself.
Consider recursive acronyms as a simple example:
GNU stands for GNU's Not Unix
PHP stands for PHP: Hypertext Preprocessor
YAML stands for YAML Ain't Markup Language
WINE stands for Wine Is Not an Emulator
VISA stands for Visa International Service Association
More examples on Wikipedia
Recursion works best with what I like to call "fractal problems", where you're dealing with a big thing that's made of smaller versions of that big thing, each of which is an even smaller version of the big thing, and so on. If you ever have to traverse or search through something like a tree or nested identical structures, you've got a problem that might be a good candidate for recursion.
People avoid recursion for a number of reasons:
Most people (myself included) cut their programming teeth on procedural or object-oriented programming as opposed to functional programming. To such people, the iterative approach (typically using loops) feels more natural.
Those of us who cut our programming teeth on procedural or object-oriented programming have often been told to avoid recursion because it's error prone.
We're often told that recursion is slow. Calling and returning from a routine repeatedly involves a lot of stack pushing and popping, which is slower than looping. I think some languages handle this better than others, and those languages are most likely not those where the dominant paradigm is procedural or object-oriented.
For at least a couple of programming languages I've used, I remember hearing recommendations not to use recursion if it gets beyond a certain depth because its stack isn't that deep.
A recursive statement is one in which you define the process of what to do next as a combination of the inputs and what you have already done.
For example, take factorial:
factorial(6) = 6*5*4*3*2*1
But it's easy to see factorial(6) also is:
6 * factorial(5) = 6*(5*4*3*2*1).
So generally:
factorial(n) = n*factorial(n-1)
Of course, the tricky thing about recursion is that if you want to define things in terms of what you have already done, there needs to be some place to start.
In this example, we just make a special case by defining factorial(1) = 1.
Now we see it from the bottom up:
factorial(6) = 6*factorial(5)
= 6*5*factorial(4)
= 6*5*4*factorial(3) = 6*5*4*3*factorial(2) = 6*5*4*3*2*factorial(1) = 6*5*4*3*2*1
Since we defined factorial(1) = 1, we reach the "bottom".
Generally speaking, recursive procedures have two parts:
1) The recursive part, which defines some procedure in terms of new inputs combined with what you've "already done" via the same procedure. (i.e. factorial(n) = n*factorial(n-1))
2) A base part, which makes sure that the process doesn't repeat forever by giving it some place to start (i.e. factorial(1) = 1)
It can be a bit confusing to get your head around at first, but just look at a bunch of examples and it should all come together. If you want a much deeper understanding of the concept, study mathematical induction. Also, be aware that some languages optimize for recursive calls while others do not. It's pretty easy to make insanely slow recursive functions if you're not careful, but there are also techniques to make them performant in most cases.
Hope this helps...
I like this definition:
In recursion, a routine solves a small part of a problem itself, divides the problem into smaller pieces, and then calls itself to solve each of the smaller pieces.
I also like Steve McConnells discussion of recursion in Code Complete where he criticises the examples used in Computer Science books on Recursion.
Don't use recursion for factorials or Fibonacci numbers
One problem with
computer-science textbooks is that
they present silly examples of
recursion. The typical examples are
computing a factorial or computing a
Fibonacci sequence. Recursion is a
powerful tool, and it's really dumb to
use it in either of those cases. If a
programmer who worked for me used
recursion to compute a factorial, I'd
hire someone else.
I thought this was a very interesting point to raise and may be a reason why recursion is often misunderstood.
EDIT:
This was not a dig at Dav's answer - I had not seen that reply when I posted this
1.)
A method is recursive if it can call itself; either directly:
void f() {
... f() ...
}
or indirectly:
void f() {
... g() ...
}
void g() {
... f() ...
}
2.) When to use recursion
Q: Does using recursion usually make your code faster?
A: No.
Q: Does using recursion usually use less memory?
A: No.
Q: Then why use recursion?
A: It sometimes makes your code much simpler!
3.) People use recursion only when it is very complex to write iterative code. For example, tree traversal techniques like preorder, postorder can be made both iterative and recursive. But usually we use recursive because of its simplicity.
Here's a simple example: how many elements in a set. (there are better ways to count things, but this is a nice simple recursive example.)
First, we need two rules:
if the set is empty, the count of items in the set is zero (duh!).
if the set is not empty, the count is one plus the number of items in the set after one item is removed.
Suppose you have a set like this: [x x x]. let's count how many items there are.
the set is [x x x] which is not empty, so we apply rule 2. the number of items is one plus the number of items in [x x] (i.e. we removed an item).
the set is [x x], so we apply rule 2 again: one + number of items in [x].
the set is [x], which still matches rule 2: one + number of items in [].
Now the set is [], which matches rule 1: the count is zero!
Now that we know the answer in step 4 (0), we can solve step 3 (1 + 0)
Likewise, now that we know the answer in step 3 (1), we can solve step 2 (1 + 1)
And finally now that we know the answer in step 2 (2), we can solve step 1 (1 + 2) and get the count of items in [x x x], which is 3. Hooray!
We can represent this as:
count of [x x x] = 1 + count of [x x]
= 1 + (1 + count of [x])
= 1 + (1 + (1 + count of []))
= 1 + (1 + (1 + 0)))
= 1 + (1 + (1))
= 1 + (2)
= 3
When applying a recursive solution, you usually have at least 2 rules:
the basis, the simple case which states what happens when you have "used up" all of your data. This is usually some variation of "if you are out of data to process, your answer is X"
the recursive rule, which states what happens if you still have data. This is usually some kind of rule that says "do something to make your data set smaller, and reapply your rules to the smaller data set."
If we translate the above to pseudocode, we get:
numberOfItems(set)
if set is empty
return 0
else
remove 1 item from set
return 1 + numberOfItems(set)
There's a lot more useful examples (traversing a tree, for example) which I'm sure other people will cover.
Well, that's a pretty decent definition you have. And wikipedia has a good definition too. So I'll add another (probably worse) definition for you.
When people refer to "recursion", they're usually talking about a function they've written which calls itself repeatedly until it is done with its work. Recursion can be helpful when traversing hierarchies in data structures.
An example: A recursive definition of a staircase is:
A staircase consists of:
- a single step and a staircase (recursion)
- or only a single step (termination)
To recurse on a solved problem: do nothing, you're done.
To recurse on an open problem: do the next step, then recurse on the rest.
In plain English:
Assume you can do 3 things:
Take one apple
Write down tally marks
Count tally marks
You have a lot of apples in front of you on a table and you want to know how many apples there are.
start
Is the table empty?
yes: Count the tally marks and cheer like it's your birthday!
no: Take 1 apple and put it aside
Write down a tally mark
goto start
The process of repeating the same thing till you are done is called recursion.
I hope this is the "plain english" answer you are looking for!
A recursive function is a function that contains a call to itself. A recursive struct is a struct that contains an instance of itself. You can combine the two as a recursive class. The key part of a recursive item is that it contains an instance/call of itself.
Consider two mirrors facing each other. We've seen the neat infinity effect they make. Each reflection is an instance of a mirror, which is contained within another instance of a mirror, etc. The mirror containing a reflection of itself is recursion.
A binary search tree is a good programming example of recursion. The structure is recursive with each Node containing 2 instances of a Node. Functions to work on a binary search tree are also recursive.
This is an old question, but I want to add an answer from logistical point of view (i.e not from algorithm correctness point of view or performance point of view).
I use Java for work, and Java doesn't support nested function. As such, if I want to do recursion, I might have to define an external function (which exists only because my code bumps against Java's bureaucratic rule), or I might have to refactor the code altogether (which I really hate to do).
Thus, I often avoid recursion, and use stack operation instead, because recursion itself is essentially a stack operation.
You want to use it anytime you have a tree structure. It is very useful in reading XML.
Recursion as it applies to programming is basically calling a function from inside its own definition (inside itself), with different parameters so as to accomplish a task.
"If I have a hammer, make everything look like a nail."
Recursion is a problem-solving strategy for huge problems, where at every step just, "turn 2 small things into one bigger thing," each time with the same hammer.
Example
Suppose your desk is covered with a disorganized mess of 1024 papers. How do you make one neat, clean stack of papers from the mess, using recursion?
Divide: Spread all the sheets out, so you have just one sheet in each "stack".
Conquer:
Go around, putting each sheet on top of one other sheet. You now have stacks of 2.
Go around, putting each 2-stack on top of another 2-stack. You now have stacks of 4.
Go around, putting each 4-stack on top of another 4-stack. You now have stacks of 8.
... on and on ...
You now have one huge stack of 1024 sheets!
Notice that this is pretty intuitive, aside from counting everything (which isn't strictly necessary). You might not go all the way down to 1-sheet stacks, in reality, but you could and it would still work. The important part is the hammer: With your arms, you can always put one stack on top of the other to make a bigger stack, and it doesn't matter (within reason) how big either stack is.
Recursion is the process where a method call iself to be able to perform a certain task. It reduces redundency of code. Most recurssive functions or methods must have a condifiton to break the recussive call i.e. stop it from calling itself if a condition is met - this prevents the creating of an infinite loop. Not all functions are suited to be used recursively.
hey, sorry if my opinion agrees with someone, I'm just trying to explain recursion in plain english.
suppose you have three managers - Jack, John and Morgan.
Jack manages 2 programmers, John - 3, and Morgan - 5.
you are going to give every manager 300$ and want to know what would it cost.
The answer is obvious - but what if 2 of Morgan-s employees are also managers?
HERE comes the recursion.
you start from the top of the hierarchy. the summery cost is 0$.
you start with Jack,
Then check if he has any managers as employees. if you find any of them are, check if they have any managers as employees and so on. Add 300$ to the summery cost every time you find a manager.
when you are finished with Jack, go to John, his employees and then to Morgan.
You'll never know, how much cycles will you go before getting an answer, though you know how many managers you have and how many Budget can you spend.
Recursion is a tree, with branches and leaves, called parents and children respectively.
When you use a recursion algorithm, you more or less consciously are building a tree from the data.
In plain English, recursion means to repeat someting again and again.
In programming one example is of calling the function within itself .
Look on the following example of calculating factorial of a number:
public int fact(int n)
{
if (n==0) return 1;
else return n*fact(n-1)
}
Any algorithm exhibits structural recursion on a datatype if basically consists of a switch-statement with a case for each case of the datatype.
for example, when you are working on a type
tree = null
| leaf(value:integer)
| node(left: tree, right:tree)
a structural recursive algorithm would have the form
function computeSomething(x : tree) =
if x is null: base case
if x is leaf: do something with x.value
if x is node: do something with x.left,
do something with x.right,
combine the results
this is really the most obvious way to write any algorith that works on a data structure.
now, when you look at the integers (well, the natural numbers) as defined using the Peano axioms
integer = 0 | succ(integer)
you see that a structural recursive algorithm on integers looks like this
function computeSomething(x : integer) =
if x is 0 : base case
if x is succ(prev) : do something with prev
the too-well-known factorial function is about the most trivial example of
this form.
function call itself or use its own definition.

Resources