How to make a tail-recursive function and test it? - recursion

I would like to make these functions tail-recursive, but I don't know where to start.
let rec rlist r n =
  if n < 1 then []
  else Random.int r :: rlist r (n-1);;
let rec divide = function
    h1::h2::t -> let t1,t2 = divide t in
                 h1::t1, h2::t2
  | l -> l,[];;
let rec merge ord (l1,l2) = match l1,l2 with
    [],l | l,[] -> l
  | h1::t1,h2::t2 -> if ord h1 h2
                     then h1::merge ord (t1,l2)
                     else h2::merge ord (l1,t2);;
Is there any way to test whether a function is tail-recursive or not?

If you give a man a fish, you feed him for a day. But if you give him a fishing rod, you feed him for a lifetime.
Thus, instead of giving you the solution, I would rather teach you how to solve it yourself.
A tail-recursive function is a recursive function, where all recursive calls are in a tail position. A call position is called a tail position if it is the last call in a function, i.e., if the result of a called function will become a result of a caller.
Let's take the following simple function as our working example:
let rec sum n = if n = 0 then 0 else n + sum (n-1)
It is not a tail-recursive function: the call sum (n-1) is not in a tail position, because its result is then added to n. It is not always easy to translate a general recursive function into a tail-recursive form. Sometimes there is a tradeoff between efficiency, readability, and tail recursion.
The general techniques are:
use an accumulator
use continuation-passing style
Using an accumulator
Sometimes a function really needs to store intermediate results, because the results of the recursive calls must be combined in a non-trivial way. A recursive function gives us a free container for arbitrary data: the call stack, the place where the language runtime stores the parameters of the functions currently being called. Unfortunately, the stack is bounded, and its size is unpredictable. So it is sometimes better to switch from the stack to the heap. The heap is slightly slower (because it creates more work for the garbage collector), but it is bigger and more controllable. In our case, we need only one word to store the running sum, so we have a clear win: we use less space, and we don't produce any garbage:
let sum n =
  let rec loop n acc = if n = 0 then acc else loop (n-1) (acc+n) in
  loop n 0
However, as you may see, this came with a tradeoff - the implementation became slightly bigger and less understandable.
We used a general pattern here. Since we need an accumulator, we need an extra parameter. Since we don't want to (or can't) change the interface of our function, we introduce a new helper function that is recursive and carries the extra parameter. The trick is that we do the addition before the recursive call, not after it.
Using continuation-passing style
It is not always possible to rewrite a recursive algorithm using an accumulator. In that case, a more general technique can be used: continuation-passing style (CPS). It is close to the previous technique, but we use a continuation in place of the accumulator. A continuation is a function that postpones the work that has to be done after the recursion to a later time. Conventionally, we call this function return, or simply k (for continuation). Mentally, a continuation is a way of throwing the result of a computation back into the future: "back" because you are returning the result to the caller, and "in the future" because the result will be used not now, but once everything is ready. But let's look at the implementation:
let sum n =
  let rec loop n k = if n = 0 then k 0 else loop (n-1) (fun x -> k (x+n)) in
  loop n (fun x -> x)
As you can see, we employed the same strategy, except that instead of an int accumulator we used a function k as the second parameter. In the base case, when n is zero, we return 0 (you can read k 0 as return 0). In the general case, we recurse in a tail position, with the usual decrement of the inductive variable n, but we pack the work that should be done with the result of the recursive call into a function: fun x -> k (x+n). This function says: once x, the result of the recursive call, is ready, add it to n and return. (Again, with the name return instead of k it may be more readable: fun x -> return (x+n).)
There is no magic here: we still have the same tradeoff as with the accumulator, since we create a new closure (a function object) at every recursive call. Each newly created closure contains a reference to the previous one (the one passed in as the parameter). For example, fun x -> k (x+n) is a function that captures two free variables: the value n and the function k, which was the previous continuation. These continuations form a linked list, where each node holds a computation and all of its arguments except one. The computation is thus delayed until that last argument is known.
Of course, for our simple example there is no need to use CPS, since it creates unnecessary garbage and is much slower. This is only for demonstration. However, for more complex algorithms, in particular those that combine the results of two or more recursive calls in a non-trivial way (e.g., folding over a graph data structure), CPS may be the most practical way to keep every call in a tail position.
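A minimal sketch of a case where two recursive calls have to be combined: summing a binary tree in CPS. The tree type and the name sum_tree are assumptions made for this illustration; they do not come from the question.

```ocaml
(* Hypothetical binary tree type, introduced only for this example. *)
type tree = Leaf | Node of tree * int * tree

(* Every call to loop and to k is in a tail position: the work that
   remains after each recursive call is packed into a closure. *)
let sum_tree t =
  let rec loop t k =
    match t with
    | Leaf -> k 0
    | Node (l, v, r) ->
      loop l (fun sl ->
        loop r (fun sr ->
          k (sl + v + sr)))
  in
  loop t (fun x -> x)
```

The pending work lives in heap-allocated closures rather than in stack frames, so the depth of the tree should no longer be limited by the stack size.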
So now, armed with the new knowledge, I hope that you will be able to solve your problems as easy as pie.
Testing for tail recursion
A tail call is a well-defined syntactic notion, so it is usually easy to see whether a call is in a tail position. Still, it is not always obvious, because tail-call optimization also applies in some less evident cases. For example, a call on the right-hand side of a short-circuit logical operator is also a tail call. So there are a few methods for checking whether a call is in a tail position. Recent versions of OCaml allow you to put an annotation at the call site, e.g.,
let rec sum n = if n = 0 then 0 else n + (sum [@tailcall]) (n-1)
If the call is not really a tail call, the compiler issues a warning:
Warning 51: expected tailcall
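For contrast, here is a sketch of the same annotation on calls that really are in a tail position, so the compiler is expected to stay silent. The function for_all is a hypothetical helper added to illustrate the short-circuit case.

```ocaml
(* Accumulator version: the annotated call is the last thing loop does. *)
let sum n =
  let rec loop n acc =
    if n = 0 then acc else (loop [@tailcall]) (n - 1) (acc + n)
  in
  loop n 0

(* The right operand of && is also in a tail position. *)
let rec for_all p = function
  | [] -> true
  | x :: xs -> p x && (for_all [@tailcall]) p xs
```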
Another method is to compile with the -annot option. The annotation file will contain an annotation for each call. For example, if we put the above function into a file sum.ml and compile it with ocamlc -annot sum.ml, we can open the sum.annot file and look for all calls:
"sum.ml" 1 0 41 "sum.ml" 1 0 64
call(
stack
)
If, however, we use our third implementation, we see that all calls are tail calls, e.g., with grep call -A1 sum.annot:
call(
tail
--
call(
tail
--
call(
tail
--
call(
tail
Finally, you can just test your program with some big input and see whether it fails with a stack overflow. You can even reduce the size of the stack; this is controlled with the environment variable OCAMLRUNPARAM. For example, to limit the stack to one thousand words:
export OCAMLRUNPARAM='l=1000'
ocaml sum.ml
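As a rough sketch, a sum.ml for this kind of experiment could look as follows. The input size of ten million is an arbitrary choice for illustration; whether a non-tail-recursive version actually overflows at that size depends on the OCaml version and the stack limit in effect.

```ocaml
(* sum.ml: the accumulator version, exercised with a large input. *)
let sum n =
  let rec loop n acc = if n = 0 then acc else loop (n - 1) (acc + n) in
  loop n 0

let () =
  (* Ten million iterations in constant stack space. *)
  Printf.printf "%d\n" (sum 10_000_000)
```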

You could do the following:
let rlist r n =
  let rec aux acc n =
    if n < 1 then acc
    else aux (Random.int r :: acc) (n-1)
  in aux [] n;;
(* Both halves come out in reverse order, which does not matter for merge sort. *)
let divide l =
  let rec aux acc1 acc2 = function
    | h1::h2::t ->
      aux (h1::acc1) (h2::acc2) t
    | [e] -> e::acc1, acc2
    | [] -> acc1, acc2
  in aux [] [] l;;
But for divide I prefer this solution:
let divide l =
  let rec aux acc1 acc2 = function
    | [] -> acc1, acc2
    | hd::tl -> aux acc2 (hd :: acc1) tl
  in aux [] [] l;;
let merge ord (l1,l2) =
  let rec aux acc l1 l2 =
    match l1,l2 with
    | [],l | l,[] -> List.rev_append acc l
    | h1::t1,h2::t2 -> if ord h1 h2
      then aux (h1 :: acc) t1 l2
      else aux (h2 :: acc) l1 t2
  in aux [] l1 l2;;
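To check that the tail-recursive divide and merge still work together, here is a small sketch that plugs them into a merge sort. The msort function is an assumption added for this illustration; it is not part of the question.

```ocaml
(* Tail-recursive split: elements alternate between the two accumulators. *)
let divide l =
  let rec aux acc1 acc2 = function
    | [] -> (acc1, acc2)
    | hd :: tl -> aux acc2 (hd :: acc1) tl
  in
  aux [] [] l

(* Tail-recursive merge: build the result reversed, then flip it once. *)
let merge ord (l1, l2) =
  let rec aux acc l1 l2 =
    match l1, l2 with
    | [], l | l, [] -> List.rev_append acc l
    | h1 :: t1, h2 :: t2 ->
      if ord h1 h2 then aux (h1 :: acc) t1 l2
      else aux (h2 :: acc) l1 t2
  in
  aux [] l1 l2

(* Hypothetical driver: its recursion depth is only logarithmic in the
   length of the list, so it does not need to be tail-recursive. *)
let rec msort ord = function
  | ([] | [_]) as l -> l
  | l ->
    let l1, l2 = divide l in
    merge ord (msort ord l1, msort ord l2)
```

For example, msort (<=) [3; 1; 2] evaluates to [1; 2; 3].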
As to your question about testing whether a function is tail-recursive, a quick search would have found the answer here.

Related

Simplify multiway tree traversal with continuation passing style

I am fascinated by the approach used in this blog post to traverse a rose tree a.k.a multiway tree a.k.a n-ary tree using CPS.
Here is my code, with type annotations removed and names changed, which I did while trying to understand the technique:
type 'a Tree = Node of 'a * 'a Tree list | Leaf of 'a
let rec reduce recCalls cont =
    match recCalls with
    | [] -> [] |> cont
    | findMaxCall :: pendingCalls ->
        findMaxCall (fun maxAtNode ->
            reduce pendingCalls (fun maxVals -> maxAtNode :: maxVals |> cont))
let findMaxOf (roseTree : int Tree) =
    let rec findMax tr cont =
        match tr with
        | Leaf i -> i |> cont
        | Node (i, chld) ->
            let recCalls = chld |> List.map findMax
            reduce recCalls (fun maxVals -> List.max (i :: maxVals) |> cont)
    findMax roseTree id
// test it
let FindMaxOfRoseTree =
    let t = Node (1, [ Leaf 2; Leaf 3 ])
    let maxOf = findMaxOf t // will be 3
    maxOf
My problem is, I find this approach hard to follow. The mutual recursion (assuming that's the right term) is really clever to my simpleton brain, but I get lost while trying to understand how it works, even when using simple examples and writing down steps manually etc.
I need to use CPS with rose trees, and I'll be doing the kind of traversals that require CPS because, just like in this example, computing results for my tree nodes requires that the children of the nodes are computed first. In any case, I do like CPS and I'd like to improve my understanding of it.
So my question is: is there an alternative way of implementing CPS on rose trees that I might find easier to follow? Is there a way to refactor the above code to make it easier to follow (eliminating the mutual recursion)?
If there is a name for the above approach, or some resources/books I can read to understand it better, hints are also most welcome.
CPS can definitely be confusing, but there are some things you can do to simplify this code:
Remove the Leaf case from your type because it's redundant. A leaf is just a Node with an empty list of children.
Separate general-purpose CPS logic from logic that's specific to rose trees.
Use the continuation monad to simplify CPS code.
First, let's define the continuation monad:
type ContinuationMonad() =
    member __.Bind(m, f) = fun c -> m (fun a -> f a c)
    member __.Return(x) = fun k -> k x
let cont = ContinuationMonad()
Using this builder, we can define a general-purpose CPS reduce function that combines a list of "incomplete" computations into a single incomplete computation (where an incomplete computation is any function that takes a continuation of type 't -> 'u and uses it to produce a value of type 'u).
let rec reduce fs =
    cont {
        match fs with
        | [] -> return []
        | head :: tail ->
            let! result = head
            let! results = reduce tail
            return result :: results
    }
I think this is certainly clearer, but it might seem like magic. The key to understanding let! x = f for this builder is that x is the value passed to f's implied continuation. This allows us to get rid of lots of lambdas and nested parens.
Now we're ready to work with rose trees. Here's the simplified type definition:
type 'a Tree = Node of 'a * 'a Tree list
let leaf a = Node (a, [])
Finding the maximum value in a tree now looks like this:
let rec findMax (Node (i, chld)) =
    cont {
        let! maxVals = chld |> List.map findMax |> reduce
        return List.max (i :: maxVals)
    }
Note that there's no mutual recursion here. Both reduce and findMax are self-recursive, but reduce doesn't call findMax and doesn't know anything about rose trees.
You can test the refactored code like this:
let t = Node (1, [ leaf 2; leaf 3 ])
findMax t (printfn "%A") // will be 3
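For readers coming from the OCaml threads on this page, the same structure can be sketched in OCaml with a binding operator standing in for the F# builder. This is a transliteration under assumptions, not the answerer's code: it needs OCaml 4.08 or later for let*, and the names return, find_max and leaf are chosen here for illustration.

```ocaml
type 'a tree = Node of 'a * 'a tree list

let leaf a = Node (a, [])

(* Continuation monad: a computation is a function that takes a
   continuation and eventually calls it with its result. *)
let return x = fun k -> k x
let ( let* ) m f = fun k -> m (fun a -> f a k)

(* Combine a list of computations into one computation of a list. *)
let rec reduce fs =
  match fs with
  | [] -> return []
  | head :: tail ->
    let* result = head in
    let* results = reduce tail in
    return (result :: results)

(* Maximum of a rose tree; reduce knows nothing about trees. *)
let rec find_max (Node (i, children)) =
  let* max_vals = reduce (List.map find_max children) in
  return (List.fold_left max i max_vals)
```

Calling find_max (Node (1, [leaf 2; leaf 3])) (fun x -> x) passes the identity function as the final continuation and yields the maximum.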
For convenience, I created a gist containing all the code.
The accepted answer from brianberns does provide an alternative way of doing CPS on a rose tree.
I'm also adding this alternative solution from Tomas Petricek. It shows how we can eliminate the extra function call by changing the type of the tree from a single node to a list of nodes in the inner loop.
I should have used the term multiway tree (which I'll change in a minute) but at least this question now documents three different methods. Hopefully it'll help others.

Sum of Odds in OCaml

I have some code written in OCaml
let rec sumodds n =
if (n mod 2)<>0 then
let sum = sumodds (n-1) in
n + sum
else sumodds(n-1);;
and I am trying to add up all odd numbers from 0 to n, but I am not sure how to make the program stop once n reaches zero. If I could get some help that would be awesome. If there are any other mistakes in the program, feel free to let me know what they are.
The way to get the function to stop is to test for the "stop condition". I.e., you need to add an if statement that tests whether you have reached such a low value for n that the result is obvious.
Very commonly a recursive function looks like this:
let rec myfun arg =
if is_trivial_value arg then
obvious_answer
else
let s = myfun (smallish_part_of arg) in
combine arg s
You just need to add a test for the trivial value of your argument n.
Update
As @Goswin_von_Brederlow points out, another very common pattern for recursive functions is this:
let rec myfun2 accum arg =
  if is_trivial_value arg then
    accum
  else
    myfun2 (combine arg accum) (smallish_part_of arg)
This is the tail recursive transformation of the above form. You can code in either form, but they are different. So you need to keep them straight in your mind. As an FP programmer you need to (and will pretty easily) learn to translate between the two forms.
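As a sketch, the two patterns applied to the sumodds function from the question could look like this. The stop condition n <= 0 and the name sumodds_tail are assumptions made for the illustration.

```ocaml
(* First pattern: test the trivial case, then combine after the call. *)
let rec sumodds n =
  if n <= 0 then 0
  else if n mod 2 <> 0 then n + sumodds (n - 1)
  else sumodds (n - 1)

(* Second pattern: carry the running total in an accumulator so that
   the recursive call is in a tail position. *)
let sumodds_tail n =
  let rec loop accum n =
    if n <= 0 then accum
    else loop (if n mod 2 <> 0 then accum + n else accum) (n - 1)
  in
  loop 0 n
```

Both return 9 for an argument of 5 or 6 (1 + 3 + 5).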
Just add "if n <= 0 then 0 else" as the second line.

Memoisation in OCaml and a Reference List

I am learning OCaml. I know that OCaml provides us with both imperative style of programming and functional programming.
I came across this code as part of my course to compute the n'th Fibonacci number in OCaml
let memoise f =
  let table = ref []
  in
  let rec find tab n =
    match tab with
    | [] ->
      let v = (f n)
      in
      table := (n, v) :: !table;
      v
    | (n', v) :: t ->
      if n' = n then v else (find t n)
  in
  fun n -> find !table n
let fibonacci2 = memoise fibonacci1
Where the function fibonacci1 is implemented in the standard way as follows:
let rec fibonacci1 n =
match n with
| 0 | 1 -> 1
| _ -> (fibonacci1 (n - 1)) + (fibonacci1 (n - 2))
My question is: how are we achieving memoisation in fibonacci2? table has been defined inside the function fibonacci2 and thus, my logic dictates that after the function finishes its computation, the list table should be lost, and that the table will be rebuilt on every call.
I ran a simple test where I called fibonacci2 35 twice in the OCaml REPL, and the second call returned the answer significantly faster than the first (contrary to my expectations).
I thought that this might be possible if declaring a variable using ref gives it a global scope by default.
So I tried this
let f y = let x = ref 5 in y;;
print_int !x;;
But this gave me an error saying that the value x is unbound.
Why does this behave this way?
The function memoise returns a value, call it f. (f happens to be a function). Part of that value is the table. Every time you call memoise you're going to get a different value (with a different table).
In the example, the returned value f is given the name fibonacci2. So, the thing named fibonacci2 has a table inside it that can be used by the function f.
There is no global scope by default, that would be a huge mess. At any rate, this is a question of lifetime not of scope. Lifetimes in OCaml last as long as an object can be reached somehow. In the case of the table, it can be reached through the returned function, and hence it lasts as long as the function does.
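The same lifetime rule can be seen in a smaller sketch. The function make_counter is a hypothetical example, not part of the question: each call creates a fresh reference that stays alive for as long as the returned closure is reachable, just like the table in memoise.

```ocaml
(* Each call to make_counter allocates its own count. The name count is
   out of scope outside this function, but the returned closure keeps
   the reference alive. *)
let make_counter () =
  let count = ref 0 in
  fun () -> incr count; !count

let c1 = make_counter ()
let c2 = make_counter ()

let a = c1 ()  (* first call on c1 *)
let b = c1 ()  (* second call on c1 sees the state left by the first *)
let c = c2 ()  (* c2 has its own, independent count *)
```

Here a is 1, b is 2, and c is 1.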
In your second example you are testing the scope (not the lifetime) of x, and indeed the scope of x is restricted to the subexpression of its let. (I.e., it is meaningful only in the expression y, where it's not used.) In the original code, all the uses of table are within its let, hence there's no problem.
Although references are a little tricky, the underlying semantics of OCaml come from lambda calculus, and are extremely clean. That's why it's such a delight to code in OCaml (IMHO).

Create a list by consing vs tail-call accumulator

For example, one function creates a list via consing:
fun example1 _ _ [] = []
  | example1 f g (x::xs) =
      if f x
      then (g x)::(example1 f g xs)
      else x::(example1 f g xs)
One creates a list via tail-call accumulator:
fun example2 f g xs =
  let fun loop acc [] = acc
        | loop acc (x::xs') =
            if f x
            then loop (acc@[(g x)]) xs'
            else loop (acc@[x]) xs'
  in
    loop [] xs
  end
to produce the same list given the same arguments.
Which function has better running time?
Does the append operation @ traverse to the end of the list in order to append, so that this ends up with the same running time as the consing solution while using much less space and slightly more complicated code?
Does consing or appending create an entire new element (deep copy of object), even if there's no change to the original element or it simply reuses the existing elements?
This question gives a more concrete example for this question
x :: xs creates one new list cell whose head is x and whose tail is xs. It does not create a copy of xs - neither deep nor shallow. So it's an O(1) operation.
xs @ [x] creates a shallow copy of xs with the change that the tail of the previously last node is now [x]. This is an O(n) operation.
So the time complexity of your example1 function is O(n) and that of your example2 function is O(n^2). Both functions consume O(n) auxiliary space: example1 because of its stack usage, and example2 because @ creates lists on the heap that aren't part of the resulting list.
If you change example2 to use :: rather than @ and then use List.rev on the result when you reach the end of the list, its running time will be O(n), but it will still be somewhat slower than example1 because of the additional cost of reversing the list at the end. However, that might be an acceptable price to pay for being able to handle large lists without a stack overflow.
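The cons-then-reverse variant described above can be sketched as follows. It is written in OCaml rather than Standard ML to match the other examples on this page, so treat it as a transliteration rather than the asker's code.

```ocaml
(* Tail-recursive: cons onto the accumulator in O(1) per element,
   then reverse once at the end, for O(n) overall. *)
let example2 f g xs =
  let rec loop acc = function
    | [] -> List.rev acc
    | x :: rest -> loop ((if f x then g x else x) :: acc) rest
  in
  loop [] xs
```

For instance, example2 (fun x -> x mod 2 = 0) (fun x -> x * 10) [1; 2; 3; 4] gives [1; 20; 3; 40].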

Keeping a counter at each recursive call in OCaml

I am trying to write a function that returns the index of the passed value v in a given list x; -1 if not found. My attempt at the solution:
let rec index (x, v) =
let i = 0 in
match x with
[] -> -1
| (curr::rest) -> if(curr == v) then
i
else
succ i; (* i++ *)
index(rest, v)
;;
This is obviously wrong to me (it will return -1 every time) because it redefines i at each pass. I have some obscure ways of doing it with separate functions in my head, none which I can write down at the moment. I know this is a common pattern in all programming, so my question is, what's the best way to do this in OCaml?
Mutation is not a common way to solve problems in OCaml. For this task, you should use recursion and accumulate results by changing the index i on certain conditions:
let index(x, v) =
let rec loop x i =
match x with
| [] -> -1
| h::t when h = v -> i
| _::t -> loop t (i+1)
in loop x 0
Another thing is that using -1 as an exceptional case is not a good idea. You may forget this assumption somewhere and treat it as other indices. In OCaml, it's better to treat this exception using option type so the compiler forces you to take care of None every time:
let index(x, v) =
let rec loop x i =
match x with
| [] -> None
| h::t when h = v -> Some i
| _::t -> loop t (i+1)
in loop x 0
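A short sketch of what the option result looks like at a call site. The describe function is a hypothetical caller added for illustration; the point is that the match has to deal with None explicitly.

```ocaml
let index (x, v) =
  let rec loop x i =
    match x with
    | [] -> None
    | h :: _ when h = v -> Some i
    | _ :: t -> loop t (i + 1)
  in
  loop x 0

(* Hypothetical caller: the compiler warns if the None case is missing. *)
let describe (x, v) =
  match index (x, v) with
  | None -> "not found"
  | Some i -> Printf.sprintf "found at index %d" i
```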
This is pretty clearly a homework problem, so I'll just make two comments.
First, values like i are immutable in OCaml. Their values don't change. So succ i doesn't do what your comment says. It doesn't change the value of i. It just returns a value that's one bigger than i. It's equivalent to i + 1, not to i++.
Second, the essence of recursion is to imagine how you would solve the problem if you already had a function that solves the problem! The only trick is that you're only allowed to pass this other function a smaller version of the problem. In your case, a smaller version of the problem is one where the list is shorter.
You can't mutate variables in OCaml (well, there is a way but you really shouldn't for simple things like this)
A basic trick you can do is create a helper function that receives extra arguments corresponding to the variables you want to "mutate". Note how I added an extra parameter for the i and also "mutate" the current list head in a similar way.
let rec index_helper (x, vs, i) =
  match vs with
    [] -> -1
  | (curr::rest) ->
    (* = is structural equality; == would compare physical identity *)
    if (curr = x) then
      i
    else
      index_helper (x, rest, i+1)
;;
let index (x, vs) = index_helper (x, vs, 0) ;;
This kind of tail-recursive transformation is a way to translate loops to functional programming but to be honest it is kind of low level (you have full power but the manual recursion looks like programming with gotos...).
For some particular patterns what you can instead try to do is take advantage of reusable higher order functions, such as map or folds.
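As one possible sketch of the higher-order approach, the index search can be written with List.fold_left. The name index_fold and the choice of an option result are assumptions made here; note that a fold visits the whole list even after a match, unlike the hand-written recursion.

```ocaml
(* The fold state is (current position, result so far).
   Argument order follows index_helper: the value first, then the list. *)
let index_fold (x, vs) =
  let step (i, found) v =
    match found with
    | Some _ -> (i, found)  (* already found: keep the first match *)
    | None -> if v = x then (i, Some i) else (i + 1, None)
  in
  snd (List.fold_left step (0, None) vs)
```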

Resources