OCaml binary tree depth with no stack overflow - recursion

I have the following implementation for a binary tree and a depth function to calculate its depth:
type 'a btree =
| Empty
| Node of 'a * 'a btree * 'a btree;;
let rec depth t = match t with
| Empty -> 0
| Node (_, t1, t2) -> 1 + Int.max (depth t1) (depth t2)
The problem here is that "depth" is recursive and can cause a stack overflow when the tree is too big.
I read about tail recursion and how the compiler can optimise it into a loop so that the call stack does not grow.
How would you make this function tail recursive or make it use a while/for loop instead?

type 'a btree =
| Empty
| Node of 'a * 'a btree * 'a btree;;
let max x y = if x > y then x else y
let depth t =
let rec dep m = function (* d records current level, m records max depth so far *)
| [] -> m
| (Empty,d)::tl -> dep (max m d) tl
| (Node (_,l,r),d)::tl -> dep (max m d) ((l,d+1)::(r,d+1)::tl)
in
dep 0 [(t,0)]
Basically, you need 3 things:
a list (stack) to store the nodes along the paths
an indicator to record the current depth
the current max depth so far
Whenever we face a problem where a stack overflow is possible, we should think of two things: tail recursion and an explicit stack.
For tail recursion, you have to find a way to explicitly store the temporary data generated by each recursion step.
For an explicit stack, remember that recursion works because internally it uses a call stack of limited size. If we analyse the logic and make that stack explicit (on the heap), we no longer need the internal call stack at all.
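For example, here is a small tree (made up for illustration, it is not from the question) you can use to check that the stack-based depth agrees with the naive recursive version:
let example =
  Node (1, Node (2, Empty, Empty),
           Node (3, Node (4, Empty, Empty), Empty))
let () =
  (* both the naive recursive depth and the stack-based depth print 3 *)
  Printf.printf "%d\n" (depth example)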

In pragmatic cases the solution is to use a balanced tree, which limits the depth to some multiple of log(n). Even for very large n, log(n) is small enough that you won't run out of stack space.
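For instance, a balanced tree with a million nodes is only about log2(10^6) ≈ 20 levels deep, and even a billion nodes only pushes that to roughly 30.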
Otherwise see the SO page linked by Kadaku. It has ridiculously good answers to the question.

I already answered a similar question once. Reposting the solution:
There's a neat and generic solution using fold_tree and CPS - continuation-passing style:
let fold_tree tree f acc =
  let rec loop t cont =
    match t with
    | Empty -> cont acc
    | Node (x, left, right) ->
      loop left (fun lacc ->
        loop right (fun racc ->
          cont (f x lacc racc)))
  in loop tree (fun x -> x)
let depth tree = fold_tree tree (fun x dl dr -> 1 + (max dl dr)) 0
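This gives the same result as the direct recursive version from the question. The same fold_tree can express other folds too; for instance, a node count could look like this (a sketch, the name size is just for illustration):
let size tree = fold_tree tree (fun _ sl sr -> 1 + sl + sr) 0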

Related

Tail-recursive solution to the coin change problem

I am trying to solve the coin change problem with tail recursion. The recursive solutions I come across are usually something like this
let rec combinations (amount:int) (coins:list<int>) =
    if amount = 0 then
        1
    elif coins.IsEmpty || amount < 0 then
        0
    else
        combinations (amount - coins.Head) coins + combinations amount coins.Tail
This is clearly inefficient and not tail recursive. I tried to make the solution tail recursive myself:
let combinationsTail (amount:int) (coins:list<int>) : int =
    let rec go (amount:int) (sum:int) (coins:list<int>) =
        match amount, sum, coins with
        | _, _, [] -> 0
        | n, s, _ when n = 0 -> s
        | n, _, cs when n < 0 || cs.IsEmpty -> 0
        | n, s, h::t -> go (n - h) (n + s) t
    go amount 0 coins
But it doesn't work. Does anyone know how to implement a tail recursive solution to this problem? Is it even possible?
For achieving tail-recursiveness, you probably want to look into continuation-passing style. The same transformation works for anything that aggregates results over a recursive tree of calls (the Fibonacci sequence is another classic example); here it is applied directly to the coin change problem.
It's not the last word on efficiency.
let cc amount coins =
    let rec aux k = function
        | amount, _ when amount = 0 -> k 1
        | amount, _ when amount < 0 -> k 0
        | _, [] -> k 0
        | amount, hd::tl ->
            let k' x =
                let k'' y = k (x + y)
                aux k'' (amount - hd, hd::tl)
            aux k' (amount, tl)
    aux id (amount, coins)
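For example, cc 10 [1; 2; 5] evaluates to 10, the same result the naive combinations function gives for that input.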

Is there an equivalent for map or fmap to replace while loops?

Haskell replaces for loops over iterable objects with map :: (a -> b) -> [a] -> [b] or
fmap :: (a -> b) -> f a -> f b. (This question isn't limited to Haskell, I'm just using the syntax here.)
Is there something similar that replaces a while loop, like
wmap :: ([a] -> b) -> [a] -> ([b] -> Bool) -> [b]?
This function returns a list of b.
The first argument is a function that takes a list and computes a value that will end up in the list returned by wmap (so it's a very specific kind of while loop).
The second argument is the list that we use as our starting point.
The third argument is a function that evaluates the stopping criterion.
And as a functor,
wfmap :: (f a -> b) -> f a -> (f b -> Bool) -> f b
For example, a Jacobi solver would look like this (with b now the same type as a):
jacobi :: ([a] -> [a]) -> [a] -> ([a] -> Bool) -> [a]
What I'm looking for isn't really pure. wmap could have values that mutate internally, but only exist inside the function. It also has nondeterministic runtime, if it terminates at all.
In the case of a Gauss-Seidel solver, there would be no return value, since the [a] would be modified in place.
Something like this:
gs :: ([a] -> [a]) -> [a] -> ([a] -> Bool) -> ???
Does wmap or wfmap exist as part of any language by default, and what is it called?
Answer 1 (thanks to Bergi): Instead of the silly wmap/wfmap signature, we already have until :: (a -> Bool) -> (a -> a) -> a -> a in the Prelude.
Does an in place version of until exist for things like gs?
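For readers more familiar with the ML syntax used elsewhere in this thread, Prelude's until behaves essentially like the following OCaml sketch (the names here are just for illustration):
let rec until p f x = if p x then x else until p f (f x)
(* until (fun x -> x > 100) (fun x -> 2 * x) 1 evaluates to 128 *)
It repeatedly applies f until the predicate p holds, which is exactly the shape of a while loop that threads a single piece of state.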
There is a proverb in engineering which states "Don't generalize before you have at least 3 implementations". There is some truth to it - especially when looking for new functional iteration concepts before doing it by foot a few times.
"Doing it by foot" here means, you should - if there is no friendly helper function you know of - resort to recursion. Write your "special cases" recursively. Preferably in a tail recursive form. Then, if you start to see recurring patterns, you might come up with a way to refactor into some recurring iteration scheme and its "kernel".
For the sake of clarifying the above, let's assume you have never heard of foldl and you want to accumulate a result from iterating over a list... Then, you would write something like:
myAvg values =
    total / fromIntegral (length values)
  where
    mySum acc [] = acc
    mySum acc (x:xs) = mySum (acc + x) xs
    total = mySum 0 values
And after doing this a couple of times, the pattern might show that the recursions in those where clauses always look darn similar. You might then come up with a name like "fold" or "reduce" for that inner recursion snippet and end up with:
myAvg values = (foldl (+) 0.0 values) / fromIntegral (length values) :: Float
So, if you are looking for helper functions which help with your use-cases, my advice is you first write a few instances as recursive functions and then look for patterns.
So, with all that said, let's get our fingers wet and see how the Jacobi algorithm could translate to Haskell, just so we have something to talk about. Now - usually I do not use Haskell for anything requiring arrays (containers with O(1) element access), because there are at least 5 array packages I know of and I would have to read for 2 days to decide which one is suitable for my application (TL;DR ;)). So I stick with lists and NO package dependencies beyond the Prelude in the code below. Given that the example equations we try to solve are tiny, that is not a bad thing at all. Plus, the code demonstrates that list comprehensions in lazy Haskell allow for un-imperative and yet performant operations on sets of cells (e.g. in the matrix), without any need for explicit looping.
type Matrix = [[Double]]
-- sorry - my mind went blank while looking for a better name for this...
-- but it is useful nonetheless
idefix nr nc =
    [ [(r,c) | c <- [0..nc-1]] | r <- [0..nr-1]]
matElem m (r,c) = (m !! r) !! c
transpose (r,c) = (c,r)
matrixDim m = (length m, length . head $ m)
-- constructs a Matrix by enumerating the indices and querying
-- 'unfolder' for a value.
-- try "unfoldMatrix 3 3 id" and you see how indices relate to
-- cells in the matrix.
unfoldMatrix nr nc unfolder =
    fmap (\row -> fmap (\cell -> unfolder cell) row) $ idefix nr nc
-- Not really needed for Jacobi problem but good
-- training to get our fingers wet with unfoldMatrix.
transposeMatrix m =
    let (nr,nc) = matrixDim m in
    unfoldMatrix nc nr (matElem m . transpose)
addMatrix m1 m2
    | (matrixDim m1) == (matrixDim m2) =
        let (nr,nc) = matrixDim m1 in
        unfoldMatrix nr nc (\idx -> matElem m1 idx + matElem m2 idx)
subMatrix m1 m2
    | (matrixDim m1) == (matrixDim m2) =
        let (nr,nc) = matrixDim m1 in
        unfoldMatrix nr nc (\idx -> matElem m1 idx - matElem m2 idx)
dluMatrix :: Matrix -> (Matrix,Matrix,Matrix)
dluMatrix m
    | (fst . matrixDim $ m) == (snd . matrixDim $ m) =
        let n = fst . matrixDim $ m in
        ( unfoldMatrix n n (\(r,c) -> if r == c then matElem m (r,c) else 0.0)
        , unfoldMatrix n n (\(r,c) -> if r > c then matElem m (r,c) else 0.0)
        , unfoldMatrix n n (\(r,c) -> if c > r then matElem m (r,c) else 0.0)
        )
mulMatrix m1 m2
    | (snd . matrixDim $ m1) == (fst . matrixDim $ m2) =
        let (nr, nc) = ((fst . matrixDim $ m1),(snd . matrixDim $ m2)) in
        let inner = snd . matrixDim $ m1 in -- shared dimension of m1 and m2
        unfoldMatrix nr nc
            (\(ro,co) ->
                sum [ matElem m1 (ro,i) * matElem m2 (i,co) | i <- [0..inner-1]]
            )
isSquareMatrix m = let (nr,nc) = matrixDim m in nr == nc
jacobi :: Double -> Matrix -> Matrix -> Matrix -> Matrix
jacobi errMax a b x0
    | isSquareMatrix a && (snd . matrixDim $ a) == (fst . matrixDim $ b) =
        approximate x0
        -- We could possibly avoid our hand rolled recursion
        -- with the help of 'loop' from Control.Monad.Extra
        -- according to hoogle. But it would not look better at all.
        -- loop (\x -> let x' = jacobiStep x in if converged x' then Right x' else Left x') x0
    where
        (nra, nca) = matrixDim a
        (d,l,u) = dluMatrix a
        dinv = unfoldMatrix nra nca (\(r,c) ->
            if r == c
            then 1.0 / matElem d (r,c)
            else 0.0)
        lu = addMatrix l u
        converged x =
            let delta = (subMatrix (mulMatrix a x) b) in
            let (nrd,ncd) = matrixDim delta in
            let err = sum (fmap (\idx -> let v = matElem delta idx in v * v)
                                (concat (idefix nrd ncd))) in
            err < errMax
        jacobiStep x =
            (mulMatrix dinv (subMatrix b (mulMatrix lu x)))
        approximate x =
            let x' = jacobiStep x in
            if converged x' then x' else approximate x'
wikiExample errMax =
    let a = [[ 2.0, 1.0],[5.0,7.0]] in
    let b = [[11], [13]] in
    jacobi errMax a b [[1.0],[1.0]]
Function idefix, despite its silly name, IMHO is an eye opener for people coming from non-lazy languages. Their first reflex is to get scared: "What - he creates a list with the indices instead of writing loops? What a waste!" But it is not a waste in lazy languages. What you see in this function (the list comprehension) produces a lazy list. It is not really created. What happens behind the scenes is similar in spirit to what LINQ does in C# - IEnumerator<T> juggling.
We use idefix a second time when we want to sum all elements in our delta. There, we do not care about the concrete structure of the matrix. And so we use the standard prelude function concat to flatten the Matrix into a linear list. Lazy as well, of course. That is the beauty.
The next notable difference to the imperative Wikipedia pseudo code is that matrix notation is much less complicated than nested looping over single cells. Fortunately, the Wikipedia article shows both. So, instead of a while loop with 2 nested loops, we only need an equivalent of the outermost while loop, which is covered by our 2-line recursive function approximate.
Lessons learned:
Lists and list comprehensions can help simplify code otherwise requiring nested loops. (In lazy languages).
OCaml and Common Lisp have mutability, built-in arrays, and loops. Together that makes a very convenient package when translating algorithms from imperative languages or imperative pseudo code.
Haskell has immutability, no built-in arrays, and no loops, but instead it has a similarly powerful set of tools, namely laziness, tail call optimization, and a terse syntax. That combination requires more planning (and writing some usually short helper functions) instead of the classical C approach of "Let's write it all in main()."
Sometimes it is easier to write a 2 line long recursive function than to think about how to abstract it.
In FP, you don't usually try to fit everything "inside the loop." You do one step and pass it on to the next function. There are lots of combinations that are useful in different situations. A common replacement for a while loop is a map followed by a takeWhile or a dropWhile, but there are many other possibilities, up to just plain recursion.
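As a rough illustration of the "map followed by takeWhile" idea in the ML-style syntax used elsewhere in this thread, here is an OCaml sketch; iterates and take_while are hand-rolled helpers defined for the example, not the Haskell functions:
let rec take_while p = function
  | x :: xs when p x -> x :: take_while p xs
  | _ -> []
let iterates n f x0 =
  (* the first n states x0, f x0, f (f x0), ... *)
  let rec go i x = if i = 0 then [] else x :: go (i - 1) (f x) in
  go n x0
let () =
  (* keep doubling while below 100: prints 1 2 4 8 16 32 64 *)
  let states = take_while (fun x -> x < 100) (iterates 20 (fun x -> 2 * x) 1) in
  List.iter (Printf.printf "%d ") states
In a lazy language the intermediate list of states need not be bounded by an explicit n; here the bound stands in for the laziness.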

Simplify a recursive function from 3 to 2 clauses

I am doing some exercises in F#; I have this function that calculates the alternating sum:
let rec altsum = function
| [] -> 0
| [x] -> x
| x0::x1::xs -> x0 - x1 + altsum xs;;
val altsum : int list -> int
The exercise consists in declaring the same function with only two clauses... but how can I do this?
The answer from mydogisbox is correct and works!
But after some attempts I found a smaller and more readable solution to the problem.
let rec altsum2 = function
| [] -> 0
| x0::xs -> x0 - altsum2 xs
Example
altsum2 [1;2;3] essentially does this:
1 - (2 - (3 - 0))
It's a bit tricky, but it works!
OFF TOPIC:
Another elegant way to solve the problem, using the F# List library, is:
let altsum3 list = List.foldBack (fun x acc -> x - acc) list 0;;
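For example, altsum3 [1;2;3] computes 1 - (2 - (3 - 0)) = 2, exactly like altsum2.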
After phoog's comment I started trying to solve the problem with a tail-recursive function:
let tail_altsum4 list =
    let pl l = List.length l % 2 = 0
    let rec rt = function
        | ([],acc) -> if pl list then -acc else acc
        | (x0::xs,acc) -> rt (xs, x0 - acc)
    rt (list,0)
This is also a bit tricky... subtraction is not commutative, and it's not practical to reverse a long list with List.rev just to fix the order... but I found a workaround! :)
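The trick: folding from the left computes the alternating sum anchored at the last element, which equals the desired sum when the list has odd length and its negation when the length is even - that is exactly what the pl parity check compensates for.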
To reduce the number of cases, you need to move your algorithm back closer to the original problem. The problem says to negate alternating values, so that's what your solution should do.
let altsum lst =
    let rec altsumRec lst negateNext =
        match lst with
        | [] -> 0
        | head::tail -> (if negateNext then -head else head) + altsumRec tail (not negateNext)
    altsumRec lst false
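For example, altsum [1;2;3;4] evaluates to 1 - 2 + 3 - 4 = -2. Note that, like altsum2, this version is still not tail recursive: the addition happens after the recursive call returns.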

Stack overflow during evaluation (looping recursion?). OCaml

I'm trying to write a function that accepts an int n and returns a list that runs down from n to 0.
This is what I have
let rec downFrom n =
let m = n+1 in
if m = 0 then
[]
else
(m-1) :: downFrom (m - 1);;
The function compiles ok but when I test it with any int it gives me the error
Stack overflow during evaluation (looping recursion?).
I know it's the local variable that gets in the way but I don't know another way to declare it. Thank you!!!
First, the real thing wrong with your program is that you have an infinite loop. Why? Because your inductive base case is 0, but you always stay at n! You recurse on m - 1, which is really n + 1 - 1, i.e. n again.
To avoid stack overflows in OCaml, you generally switch to a tail-recursive style, such as the following:
let downFrom n =
let rec h n acc =
if n = 0 then List.rev acc else h (n-1) (n::acc)
in
h n []
Someone suggested the following edit:
let downFrom n =
let rec h m acc =
if m > n then acc else h (m + 1) (m::acc)
in
h 0 []
This saves a call to List.rev, I agree.
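It works because the helper counts up from 0, consing each m onto the front of the accumulator, so when m finally exceeds n the list is already in descending order ending in 0 and needs no reversal.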
The key with recursion is that the recursive call has to be a smaller version of the problem. Your recursive call doesn't create a smaller version of the problem. It just repeats the same problem.
You can also pattern match directly on the parameter with the function keyword.
The syntax is:
let f = function
p1 -> expr1
| p2 -> expr2
| p3 -> ...;;
let rec n_to_one =function
0->[]
|n->n::n_to_one (n-1);;
# n_to_one 3;;
- : int list = [3; 2; 1]

F# tail call optimization with 2 recursive calls?

As I was writing this function I knew that I wouldn't get tail call optimization. I still haven't come up with a good way of handling this and was hoping someone else might offer suggestions.
I've got a tree:
type Heap<'a> =
| E
| T of int * 'a * Heap<'a> * Heap<'a>
And I want to count how many nodes are in it:
let count h =
    let rec count' h acc =
        match h with
        | E -> 0 + acc
        | T(_, value, leftChild, rightChild) ->
            let acc = 1 + acc
            (count' leftChild acc) + (count' rightChild acc)
    count' h 0
This isn't optimized because of the addition of the counts for the child nodes. Any idea how to make something like this work if the tree has 1 million nodes?
Thanks, Derek
Here is the implementation of count using CPS. It still blew the stack though.
let count h =
    let rec count' h acc cont =
        match h with
        | E -> cont (1 + acc)
        | T(_,_,left,right) ->
            let f = (fun lc -> count' right lc cont)
            count' left acc f
    count' h 0 (fun (x: int) -> x)
Maybe I can come up with some way to partition the tree into enough pieces that I can count without blowing the stack?
Someone asked about the code which generates the tree. It is below.
member this.ParallelHeaps threads =
    let rand = new Random()
    let maxVal = 1000000
    let rec heaper i h =
        if i < 1 then
            h
        else
            let heap = LeftistHeap.insert (rand.Next(100, 2 * maxVal)) h
            heaper (i - 1) heap
    let heaps = Array.create threads E
    printfn "Creating heap of %d elements, with %d threads" maxVal threads
    let startTime = DateTime.Now
    seq { for i in 0 .. (threads - 1) ->
            async { Array.set heaps i (heaper (maxVal / threads) E) } }
    |> Async.Parallel
    |> Async.RunSynchronously
    |> ignore
    printfn "Creating %d sub-heaps took %f milliseconds" threads (DateTime.Now - startTime).TotalMilliseconds
    let startTime = DateTime.Now
    Array.length heaps |> should_ equal threads <| "The size of the heaps array should match the number of threads to process the heaps"
    let rec reMerge i h =
        match i with
        | -1 -> h
        | _ ->
            printfn "heap[%d].count = %d" i (LeftistHeap.count heaps.[i])
            LeftistHeap.merge heaps.[i] (reMerge (i-1) h)
    let heap = reMerge (threads-1) E
    printfn "Merging %d heaps took %f milliseconds" threads (DateTime.Now - startTime).TotalMilliseconds
    printfn "heap min: %d" (LeftistHeap.findMin heap)
    LeftistHeap.count heap |> should_ equal maxVal <| "The count of the reMerged heap should equal maxVal"
You can use continuation-passing style (CPS) to solve that problem. See Recursing on Recursion - Continuation Passing by Matthew Podwysocki.
let tree_size_cont tree =
    let rec size_acc tree acc cont =
        match tree with
        | Leaf _ -> cont (1 + acc)
        | Node(_, left, right) ->
            size_acc left acc (fun left_size ->
                size_acc right left_size cont)
    size_acc tree 0 (fun x -> x)
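Note that this snippet is written against a tree type with values at the leaves (Leaf/Node) rather than the Heap type from the question (E/T), but the continuation-passing shape is exactly the one used in the count attempt above.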
Note also that in Debug builds, tail call optimization is disabled. If you don't want to run in Release mode, you can enable the optimization in the project's properties in Visual Studio.
CPS is a good general solution but you might also like to consider explicit use of a stack because it will be faster and is arguably simpler:
let count heap =
    let stack = System.Collections.Generic.Stack[heap]
    let mutable n = 0
    while stack.Count > 0 do
        match stack.Pop() with
        | E -> ()
        | T(_, _, heap1, heap2) ->
            n <- n + 1
            stack.Push heap1
            stack.Push heap2
    n
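Because the stack here is an ordinary heap-allocated data structure, its size is limited only by available memory rather than by the thread's call stack, so a tree with a million nodes poses no problem.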
