Iterative Postorder Traversal of Binary Tree in Python Optimality - recursion

I'm brushing up on leet code tree questions and every solution of Iterative Postorder Traversal of Binary Tree in Python type problems seem to use recursion.
Since there is no tail recursion in python, I believe iterative algorithm is faster because it takes less time to move around a stack than to jump through stack call frames.
Additionally I believe the iterative approach uses less memory because tracking just one stack takes up less space than the call frame stack from recursion.
I understand the typical approach to postorder iterative traversal takes 2 stacks and while loops so it seems to complicated.
However, there is an algorithm to use just 1 stack and while loop.
Here it is:
def postorderIterative(root):
curr = root
stack = []
while current or stack:
if current:
stack.append[current]
current = current.left
else:
tmp = stack[-1].right
if not tmp:
tmp = stack.pop()
#process postorder here
while stack and tmp == stack[-1].right:
tmp = stack.pop()
#process postorder here
else:
current = tmp
credit and explanation goes to here
Is there any reason why most video solutions seem to use recursive postorder traversal more than iterative even though iterative is better? is it because recursive is easier to understand?
It seems that with a big enough test case the recursive may fail in any call so I tend to always do iterative. Is this a valid approach?
I literally cannot find an iterative solution video to
leet code: 543. Diameter of Binary Tree
these are all recursive and thus suboptimal, even though they say it is optimal.
Please let me know if I am wrong and why. If not. please let me know as well. I am here to learn!

Both the iterative and recursive implementations have the same time and space complexity. The difference in running time is insignificant in terms of big O. When speaking of optimal, people usually mean the time/space complexity is optimal.
Secondly, the call stack limitation will only be a problem for a binary tree that has a height that goes into the hundreds. If a tree of such height were balanced, it would have more nodes than there are atoms in the universe. So we would only get into stack limitations when the binary tree is awfully skewed, which would make it rather useless for a real life problem. In short, I don't think the call stack limitation is such an important argument here.

Related

In functional languages, how is the compiler able to translate non-tail recursion into loops to avoid stack overflows (if at all)?

I've been recently learning about functional languages and how many don't include for loops. While I don't personally view recursion as more difficult than a for loop (and often easier to reason out) I realized that many examples of recursion aren't tail recursive and therefor cannot use simple tail recursion optimization in order to avoid stack overflows. According to this question, all iterative loops can be translated into recursion, and those iterative loops can be transformed into tail recursion, so it confuses me when the answers on a question like this suggest that you have to explicitly manage the translation of your recursion into tail recursion yourself if you want to avoid stack overflows. It seems like it should be possible for a compiler to do all the translation from either recursion to tail recursion, or from recursion straight to an iterative loop with out stack overflows.
Are functional compilers able to avoid stack overflows in more general recursive cases? Are you really forced to transform your recursive code in order to avoid stack overflows yourself? If they aren't able to perform general recursive stack-safe compilation, why aren't they?
Any recursive function can be converted into a tail recursive one.
For instance, consider the transition function of a Turing machine, that
is the mapping from a configuration to the next one. To simulate the
turing machine you just need to iterate the transition function until
you reach a final state, that is easily expressed in tail recursive
form. Similarly, a compiler typically translates a recursive program into
an iterative one simply adding a stack of activation records.
You can also give a translation into tail recursive form using continuation
passing style (CPS). To make a classical example, consider the fibonacci
function.
This can be expressed in CPS style in the following way, where the second
parameter is the continuation (essentially, a callback function):
def fibc(n, cont):
if n <= 1:
return cont(n)
return fibc(n - 1, lambda a: fibc(n - 2, lambda b: cont(a + b)))
Again, you are simulating the recursion stack using a dynamic data structure:
in this case, lambda abstractions.
The use of dynamic structures (lists, stacks, functions, etc.) in all previous
examples is essential. That is to say, that in order to simulate a generic
recursive function iteratively, you cannot avoid dynamic memory allocation,
and hence you cannot avoid stack overflow, in general.
So, memory consumption is not only related to the iterative/recursive
nature of the program. On the other side, if you prevent dynamic memory
allocation, your
programs are essentially finite state machines, with limited computational
capabilities (more interesting would be to parametrise memory according to
the dimension of inputs).
In general, in the same way as you cannot predict termination, you cannot
predict an unbound memory consumption of your program: working with
a Turing complete language, at compile time
you cannot avoid divergence, and you cannot avoid stack overflow.
Tail Call Optimization:
The natural way to do arguments and calls is to sort out the cleaning up when exiting or when returning.
For tail calls to work you need to alter it so that the tail call inherits the current frame. Thus instead of making a new frame it massages the frame so that the next call returns to the current functions caller instead of this function, which really only cleans up and returns if it's a tail call.
Thus TCO is all about cleaning up before the last call.
Continuation Passing Style - make tail calls out of everything
A compiler can change the code such that it only does primitive operations and pass it to continuations. Thus the stack usage gets moved onto the heap since the computation to be continued is made a function.
An example is:
function hypotenuse(k1, k2) {
return sqrt(add(square(k1), square(k2)))
}
becomes
function hypotenuse(k, k1, k2) {
(function (sk1) {
(function (sk2) {
(function (ar) {
k(sqrt(ar));
}(add(sk1,sk2));
}(square(k2));
}(square(k1));
}
Notice every function has exactly one call now and the order of evaluation is set.
According to this question, all iterative loops can be translated into recursion
"Translated" might be a bit of a stretch. The proof that for every iterative loop there is an equivalent recursive program is trivial if you understand Turing completeness: since a Turing machine can be implemented using strictly iterative structures and strictly recursive structures, every program that can be expressed in an iterative language can be expressed in a recursive language, and vice-versa. This means that for every iterative loop there is an equivalent recursive construct (and the other way around). However, that doesn't mean we have some automated way of transforming one into the other.
and those iterative loops can be transformed into tail recursion
Tail recursion can perhaps be easily transformed into an iterative loop, and the other way around. But not all recursion is tail recursion. Here's an example. Suppose we have some binary tree. It consists of nodes. Each node can have a left and a right child and a value. If a node has no children, then isLeaf returns true for it. We'll assume there's some function max that returns the maximum of two values, and if one of the values is null it returns the other one. Now we want to define a function that finds the maximum value among all the leaf nodes. Here it is in some pseudo-code I cooked up.
findmax(node) {
if (node == null) {
return null
}
if (node.isLeaf) {
return node.value
} else {
return max(findmax(node.left), findmax(node.right))
}
}
There's two recursive calls in the max function, so we can't optimize for tail recursion. We need the results of both before we can supply them to the max function and determine the result of the call for the current node.
Now, there may be a way of getting the same result, using recursion and only a single tail-recursive call. It is functionally equivalent, but it is a different algorithm. Compilers can do a lot of transformations to create a functionally equivalent program with lots of optimizations, but they're not quite clever enough to create functionally equivalent algorithms.
Even the transformation of a function that only calls itself recursively once into a tail-recursive version would be far from trivial. Such an adaptation usually employs some argument passed into the recursive invocation that is used as an "accumulator" for the current results.
Look at the next naive implementation for calculating a factorial of a number (e.g. fact(5) = 5*4*3*2*1):
fact(number) {
if (number == 1) {
return 1
} else {
return number * fact(number - 1)
}
}
It's not tail-recursive. But it can be made so in this way:
fact(number, acc) {
if (number == 1) {
return acc
} else {
return fact(number - 1, number * acc)
}
}
// Helper function
fact(number) {
return fact(number, 1)
}
This requires an interpretation of what is being done. Recognizing the case for stuff like this is easy enough, but what if you call a function instead of a multiplication? How will the compiler know that for the initial call the accumulator must be 1 and not, say, 0? How do you translate this program?
recsub(number) {
if (number == 1) {
return 1
} else {
return number - recsub(number - 1)
}
}
This is as of yet outside the scope of the sort of compiler we have now, and may in fact always be.
Maybe it would be interesting to ask this on the computer science Stack Exchange to see if they know of some papers or proofs that investigate this more in-depth.

R programming: Level of allowed recursion depth differs when calling a local helper function

this is a purely academic question:
I have recently been working with languages that use tail recursion optimization. For practice I wrote two recursive implementations of sum functions in R, one of them being tail recursive. I quickly realized there is no tail recursion optimization in R. I can live with that.
However, I also noticed a different level of allowed depth when using the local helper function for the tail recursion.
Here is the code:
## Recursive
sum <- function (i, end, fun){
if (i>=end) 0
else fun(i) + sum(i+1, end, fun)
}
## Tail recursive
sum_tail <- function (i, end, fun){
sum_helper<- function(i, acc){
if (i>=end) acc
else sum_helper(i+1, acc+fun(i))
}
sum_helper(i, 0)
}
## Simple example
harmonic <- function(k){
return(1/(k))
}
print(sum(1, 1200, harmonic)) # <- This works fine, but is close to the limit
# print(sum_tail(1, 1200, harmonic)) <- This will crash
print(sum_tail(1, 996, harmonic)) # <- This is the deepest allowed
I am fairly intrigued. Can someone explain this behavior or point me towards a document explaining how the allowed recursion depth is calculated?
I'm not sure of R's internal implementation of the call stack, but it's pretty obvious from here that there is a maximum stack depth. (Many languages have this for various reasons, mostly related to memory and detecting infinite recursion.) You can set it with options(), and the default setting seems to depends on the platform -- on my machine, I can do print(sum_tail(1, 996, harmonic)) without difficulty.
Sidebar: you really shouldn't name your naive implementation sum() because you wind up shadowing a builtin. I know you're just playing with recursion here, but you should also generally avoid doing your own implementation of sum() -- it's not provided just as a convenience function but also because it's non trivial to implement a numerically correct version of sum() with floating point.
In your naive implementation, the call to fun() returns before the recursive call -- this means that each recursive call increases the depth of the call stack by exactly 1. In the other case, you've got an additional function call that's waiting to be evaluated. For more details, you should look into how R handles closures and how lazy / eager evaluation in R is handled. If I recall correctly, R uses environments (roughly, R's notion of scope, and deeply related to closures) to wrap arguments in certain situations and delay their evaluation, thus effectively using lazy evaluation. There's a lot of information on R internals available online, see here for a quick overview of argument evaluation. I'm not sure how accurate I am on the details, but it seems that the arguments to the tail-call are themselves getting placed on the call stack, thus increasing the depth of the call-stack by more than 1.
Sidebar the Second: I don't recall well enough how R implements this, and I know placing helper functions in the body is common practice, but placing a helper function definition in the recursive call could lead to each recursive call defining the helper function anew. This could interact in various ways with the way environments and closures are handled, but I'm not sure.
The functions traceback() and trace() could be useful in exploring the call behavior if you're curious about more of the details.

Quicksort and tail recursive optimization

In Introduction to Algorithms p169 it talks about using tail recursion for Quicksort.
The original Quicksort algorithm earlier in the chapter is (in pseudo-code)
Quicksort(A, p, r)
{
if (p < r)
{
q: <- Partition(A, p, r)
Quicksort(A, p, q)
Quicksort(A, q+1, r)
}
}
The optimized version using tail recursion is as follows
Quicksort(A, p, r)
{
while (p < r)
{
q: <- Partition(A, p, r)
Quicksort(A, p, q)
p: <- q+1
}
}
Where Partition sorts the array according to a pivot.
The difference is that the second algorithm only calls Quicksort once to sort the LHS.
Can someone explain to me why the 1st algorithm could cause a stack overflow, whereas the second wouldn't? Or am I misunderstanding the book.
First let's start with a brief, probably not accurate but still valid, definition of what stack overflow is.
As you probably know right now there are two different kind of memory which are implemented in too different data structures: Heap and Stack.
In terms of size, the Heap is bigger than the stack, and to keep it simple let's say that every time a function call is made a new environment(local variables, parameters, etc.) is created on the stack. So given that and the fact that stack's size is limited, if you make too many function calls you will run out of space hence you will have a stack overflow.
The problem with recursion is that, since you are creating at least one environment on the stack per iteration, then you would be occupying a lot of space in the limited stack very quickly, so stack overflow are commonly associated with recursion calls.
So there is this thing called Tail recursion call optimization that will reuse the same environment every time a recursion call is made and so the space occupied in the stack is constant, preventing the stack overflow issue.
Now, there are some rules in order to perform a tail call optimization. First, each call most be complete and by that I mean that the function should be able to give a result at any moment if you interrupts the execution, in SICP
this is called an iterative process even when the function is recursive.
If you analyze your first example, you will see that each iteration is defined by two recursive calls, which means that if you stop the execution at any time you won't be able to give a partial result because you the result depends of those calls to be finished, in this scenario you can't reuse the stack environment because the total information is split between all those recursive calls.
However, the second example doesn't have that problem, A is constant and the state of p and r can be locally determined, so since all the information to keep going is there then TCO can be applied.
The essence of the tail recursion optimization is that there is no recursion when the program is actually executed. When the compiler or interpreter is able to kick TRO in, it means that it will essentially figure out how to rewrite your recursively-defined algorithm into a simple iterative process with the stack not used to store nested function invocations.
The first code snippet can't be TR-optimized because there are 2 recursive calls in it.
Tail recursion by itself is not enough. The algorithm with the while loop can still use O(N) stack space, reducing it to O(log(N)) is left as exercise in that section of CLRS.
Assume we are working in a language with array slices and tail call optimization. Consider the difference between these two algorithms:
Bad:
Quicksort(arraySlice) {
if (arraySlice.length > 1) {
slices = Partition(arraySlice)
(smallerSlice, largerSlice) = sortBySize(slices)
Quicksort(largerSlice) // Not a tail call, requires a stack frame until it returns.
Quicksort(smallerSlice) // Tail call, can replace the old stack frame.
}
}
Good:
Quicksort(arraySlice) {
if (arraySlice.length > 1){
slices = Partition(arraySlice)
(smallerSlice, largerSlice) = sortBySize(slices)
Quicksort(smallerSlice) // Not a tail call, requires a stack frame until it returns.
Quicksort(largerSlice) // Tail call, can replace the old stack frame.
}
}
The second one is guarenteed to never need more than log2(length) stack frames because smallerSlice is less than half as long as arraySlice. But for the first one, the inequality is reversed and it will always need more than or equal to log2(length) stack frames, and can require O(N) stack frames in the worst case where smallerslice always has length 1.
If you don't keep track of which slice is smaller or larger, you will have similar worst cases to the first overflowing case, even though it will require O(log(n)) stack frames on average. If you always sort the smaller slice first, you will never need more than log_2(length) stack frames.
If you are using a language that doesn't have tail call optimization, you can write the second (not stack-blowing) version as:
Quicksort(arraySlice) {
while (arraySlice.length > 1) {
slices = Partition(arraySlice)
(smallerSlice, arraySlice) = sortBySize(slices)
Quicksort(smallerSlice) // Still not a tail call, requires a stack frame until it returns.
}
}
Another thing worth noting is that if you are implementing something like Introsort which changes to Heapsort if the recursion depth exceeds some number proportional to log(N), you will never hit the O(N) worst case stack memory usage of quicksort, so you technically don't need to do this. Doing this optimization (popping smaller slices first) still improves the constant factor of the O(log(N)) though, so it is strongly recommended.
Well, the most obvious observation would be:
Most common stack overflow problem - definition
The most common cause of stack overflow is excessively deep or infinite recursion.
The second uses less deep recursion than the first (n branches per call instead of n^2) hence it is less likely to cause a stack overflow..
(so lower complexity means less chance to cause a stack overflow)
But somebody would have to add why the second can never cause a stack overflow while the first can.
source
Well If you consider the complexity of the two methods the first method obviously has more complexity than the second since it calls Recursion on both LHS and RHS as a result there are more chances of getting stack overflow
Note: That doesnt mean that there are absolutely no chances of getting SO in second method
In the function 2 that you shared, Tail Call elimination is implemented. Before proceeding further let us understand what is tail recursion function?. If the last statement in the code is the recursive call and does do anything after that, then it is called tail recursive function. So the first function is a tail recursion function. For such a function with some changes in the code one can remove the last recursion call like you showed in function 2 which performs the same work as function 1. This process is called tail recursion optimization or tail call elimination and following are the result of it
Optimizing in terms of auxiliary space
Optimizing in terms of recursion call overhead
Last recursive call is eliminated by using the while loop. The good thing is that for function 2, no auxiliary space is used for the right call as its recursion is eliminated using p: <- q+1 and the overall function does not have recursion call overhead. So whatever way partition happens maximum space needed is theta(log n)

Does Memoization improve running time of this algorithm?

"Given an array of n integers, return an array of their factorials."
Instead of the straight forward way of iterating through the array and finding factorial for each, I was thinking of a memoized approach, where I store previously calculated factorials and use them in subsequent ones.
For example: 7! can be calculated much fasted if the result 6! is stored somewhere. However, I noticed the run time of both algorithms is still O(n). (I might be wrong) Does that imply that we're not speeding up the process here? If so, does that mean that memoization is not useful in problems with non-tree recursion? (In Fibonacci, we effectively prune the recursion tree by memoizing the previously found values, in the case of factorial, we don't really have the tree, more like a recursion ladder)
Any comments appreciated.
However, I noticed the run time of both algorithms is still O(n).
O-Notation is hiding the most important characteristic here. It only considers the length of the array and not the size of the numbers which is much much more important when dealing with factorials.
You should implement memoization with a hashtable. If you do then you will get O(1) for each entry asymptotically.
In the case without memoization, the time complexity should be O(n^2) since you would need (i-1) multiplications to calculate factorial(i) without memoization.

Pure functional bottom up tree algorithm

Say I wanted to write an algorithm working on an immutable tree data structure that has a list of leaves as its input. It needs to return a new tree with changes made to the old tree going upwards from those leaves.
My problem is that there seems to be no way to do this purely functional without reconstructing the entire tree checking at leaves if they are in the list, because you always need to return a complete new tree as the result of an operation and you can't mutate the existing tree.
Is this a basic problem in functional programming that only can be avoided by using a better suited algorithm or am I missing something?
Edit: I not only want to avoid to recreate the entire tree but also the functional algorithm should have the same time complexity than the mutating variant.
The most promising I have seen so far (which admittedly is not very long...) is the Zipper data structure: It basically keeps a separate structure, a reverse path from the node to root, and does local edits on this separate structure.
It can do multiple local edits, most of which are constant time, and write them back to the tree (reconstructing the path to root, which are the only nodes that need to change) all in one go.
The Zipper is a standard library in Clojure (see the heading Zippers - Functional Tree Editing).
And there's the original paper by Huet with an implementation in OCaml.
Disclaimer: I have been programming for a long time, but only started functional programming a couple of weeks ago, and had never even heard of the problem of functional editing of trees until last week, so there may very well be other solutions I'm unaware of.
Still, it looks like the Zipper does most of what one could wish for. If there are other alternatives at O(log n) or below, I'd like to hear them.
You may enjoy reading
http://lorgonblog.spaces.live.com/blog/cns!701679AD17B6D310!248.entry
This depends on your functional programming language. For instance in Haskell, which is a Lazy functional programming language, results are calculated at the last moment; when they are acutally needed.
In your example the assumption is that because your function creates a new tree, the whole tree must be processed, whereas in reality the function is just passed on to the next function and only executed when necessary.
A good example of lazy evaluation is the sieve of erastothenes in Haskell, which creates the prime numbers by eliminating the multiples of the current number in the list of numbers. Note that the list of numbers is infinite. Taken from here
primes :: [Integer]
primes = sieve [2..]
where
sieve (p:xs) = p : sieve [x|x <- xs, x `mod` p > 0]
I recently wrote an algorithm that does exactly what you described - https://medium.com/hibob-engineering/from-list-to-immutable-hierarchy-tree-with-scala-c9e16a63cb89
It works in 2 phases:
Sort the list of nodes by their depth in the hierarchy
constructs the tree from bottom up
Some caveats:
No Node mutation, The result is an Immutable-tree
The complexity is O(n)
Ignores cyclic referencing in the incoming list

Resources