CPS merge sort causes a stack overflow


Since I had problems with stack overflows due to non-tail recursion, I used continuations to make sorting large lists feasible.
I implemented the sort this way (you can see the whole code here: http://paste.ubuntu.com/13481004/):
let merge_sort l =
  let rec merge_sort' l cont =
    match l with
    | [] -> cont []
    | [x] -> cont [x]
    | _ ->
      let (a, b) = split l in
      merge_sort' a
        (fun leftRes -> merge_sort' b
          (* OVERFLOW HERE *) (fun rightRes -> cont (merge leftRes rightRes)))
  in merge_sort' l (fun x -> x)
I get a stack overflow nevertheless, in the indicated line.
What am I doing wrong?

(@) in OCaml's standard library is not tail recursive. The merge function in your code (http://paste.ubuntu.com/13481004/) uses (@), and this is the cause of the stack overflow.
list.mli says:
val append : 'a list -> 'a list -> 'a list
(** Catenate two lists. Same function as the infix operator [@].
    Not tail-recursive (length of the first argument). The [@]
    operator is not tail-recursive either. *)
but unfortunately this fact is not written in pervasives.mli, where (@) is actually declared:
val ( @ ) : 'a list -> 'a list -> 'a list
(** List concatenation. *)
This is not good :-( I have filed an issue for it at the OCaml dev page.
I redefined (@) as fun x y -> rev_append (rev x) y, and then your code runs without a stack overflow. More elegantly, you can replace code like (rev a) @ l with rev_append a l.
P.S. (@) in pervasives.mli will be commented as "not tail recursive" in the next release of OCaml.
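As a minimal sketch of the idea above: the asker's actual merge is only in the linked paste, so the merge below is a hypothetical stand-in, not the original code. It accumulates the result in reverse and uses List.rev_append instead of (@), so neither the recursion nor the final concatenation grows the stack. The names append_tr and go are assumptions made for this illustration.

```ocaml
(* Tail-recursive replacement for (@), as described in the answer. *)
let append_tr x y = List.rev_append (List.rev x) y

(* Hypothetical tail-recursive merge of two sorted lists.
   The merged prefix is kept reversed in [acc]; when one input runs out,
   List.rev_append flips [acc] onto the remaining tail in one pass. *)
let merge cmp l1 l2 =
  let rec go acc l1 l2 =
    match l1, l2 with
    | [], rest | rest, [] -> List.rev_append acc rest
    | x :: xs, y :: ys ->
      if cmp x y <= 0 then go (x :: acc) xs l2
      else go (y :: acc) l1 ys
  in
  go [] l1 l2
```

For example, merge compare [1; 3; 5] [2; 4; 6] evaluates to [1; 2; 3; 4; 5; 6].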

Related

Stack Overflow when calling a function that generates a lazy list?

I can define an infinite data structure - aka lazy list - like this.
type 'a lazylist = Succ of 'a * (unit -> 'a lazylist);;
(Why can't I replace unit -> 'a lazylist with () -> 'a lazylist?)
The way I understand lazy data structures, the above definition says that a lazy list consists of a tuple of a generic element 'a and a function unit -> 'a lazylist that will compute the next element in the list when called with (), which is of type unit.
So e.g. I could generate a list that has every even number:
let rec even_list l =
  match l with
    Succ (a, l') ->
      if (a mod 2 = 0) then
        Succ (a, fun () -> even_list (l' ()))
      else
        even_list (l' ());;
The way I understand it: When fun() -> even_list (l'())) is called with the unit argument () it will call even_list with the successor of l' by giving it unit as an argument: l'()
But is it possible for the else even_list (l'());; part to lead to a stack overflow if, for example, we give even_list a lazylist that consists only of odd elements? In the then branch of the if-expression we only generate the next element of the list when called with (), whereas in the else branch we would search indefinitely.

First, you can use the built-in Seq.t type rather than define your own lazy list type.
Second, your function even_list is tail-recursive and cannot result in a stack overflow.
Third, if you are using the take function proposed in Call lazy-list function in OCaml, it is this function which is not tail-recursive and consumes stack.
You can write a tail-recursive version of this function
let rec take l n (Succ(x,f)) =
  if n = 0 then List.rev l
  else take (x::l) (n-1) (f ())
let take n l = take [] n l
or define a fold function
let rec fold_until n f acc (Succ(x,l)) =
  if n = 0 then acc
  else fold_until (n-1) f (f acc x) (l())
and use that function to define a printer that does not build an intermediary list.
(This is why it is generally advised to write down a fully self-contained example; otherwise the issue is too often hidden in the implicit context of the question.)
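A possible sketch of such a printer, built on fold_until, is shown here. The helpers from and print_first are hypothetical names introduced only for this illustration; the type and even_list follow the question.

```ocaml
type 'a lazylist = Succ of 'a * (unit -> 'a lazylist)

(* Fold over the first n elements; tail-recursive, no intermediate list. *)
let rec fold_until n f acc (Succ (x, l)) =
  if n = 0 then acc
  else fold_until (n - 1) f (f acc x) (l ())

(* Hypothetical helper: the integers n, n+1, n+2, ... *)
let rec from n = Succ (n, fun () -> from (n + 1))

(* Keep only the even elements; the else branch is a tail call. *)
let rec even_list (Succ (a, l')) =
  if a mod 2 = 0 then Succ (a, fun () -> even_list (l' ()))
  else even_list (l' ())

(* Hypothetical printer: prints the first n elements, one per line. *)
let print_first n l =
  fold_until n (fun () x -> print_int x; print_newline ()) () l
```

With these definitions, print_first 3 (even_list (from 1)) prints 2, 4 and 6 on separate lines. Note that this fold_until forces one element beyond the n it consumes, because the tail is evaluated before the n = 0 check.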

How do you generate all permutations of a list with repetition in a functional programming language?

I'm trying to self-learn some programming in a functional programming language and recently stumbled on the problem of generating all the permutations of length m from a list of length n, with repetition. Mathematically, this should result in a total of n^m possible permutations, because each of the m 'slots' can be filled with any of the n elements. The code I have currently, however, does not give me all the elements:
let rec permuts n list =
  match n, list with
    0, _ -> [[]]
  | _, [] -> []
  | n, h :: t -> (List.map (fun tt -> h::tt) (permuts (n-1) list))
                 @ permuts n t;;
The algorithm basically takes one element out of a list with m elements, slaps it onto the front of all the combinations with the rest of the elements, and concatenates the results into one list, giving only n C m results.
For example, the output for permuts 2 [1;2;3] yields
[[1;1]; [1;2]; [1;3]; [2;2]; [2;3]; [3;3]]
whereas I actually want
[[1;1]; [1;2]; [1;3]; [2;1]; [2;2]; [2;3]; [3;1]; [3;2]; [3;3]]
-- a total of 9 elements. How do I fix my code so that I get the result I need? Any guidance is appreciated.

Your error appears on the second line of:
| n, h :: t -> List.map (fun tt -> h::tt) (permuts (n-1) list)
               @ permuts n t
Indeed, with this you are decomposing the set of n-tuples with k elements as the sum of
the set of (n-1)-tuples prefixed with the first element
the set of n-tuples with (k-1) elements
Looking at the cardinalities of the three sets, there is an obvious mismatch since
k^n ≠ k^(n-1) + (k-1)^n
The problem is that the second term doesn't fit.
To avoid this issue, it is probably better to write a few helper functions.
I would suggest writing the following three:
val distribute: 'a list -> 'a list -> 'a list list
(** distribute [x_1;...;x_n] y returns [x_1::y;...;x_n::y] *)
val distribute_on_all: 'a list -> 'a list list -> 'a list list
(** distribute_on_all x [l_1;...;l_n] returns distribute x l_1 @ ... @ distribute x l_n *)
val repeat: int -> ('a -> 'a) -> 'a -> 'a
(** repeat n f x is f(...(f x)...) with f applied n times *)
then your function will be simply
let power n l = repeat n (distribute_on_all l) [[]]
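One way the three helpers could be implemented is sketched below. The answer gives only their signatures and doc comments, so the bodies here are an assumption that follows those comments.

```ocaml
(* distribute [x_1; ...; x_n] y = [x_1 :: y; ...; x_n :: y] *)
let distribute xs y = List.map (fun x -> x :: y) xs

(* distribute_on_all xs [l_1; ...; l_n]
   = distribute xs l_1 @ ... @ distribute xs l_n *)
let distribute_on_all xs ls = List.concat (List.map (distribute xs) ls)

(* repeat n f x applies f to x n times. *)
let rec repeat n f x =
  if n <= 0 then x else repeat (n - 1) f (f x)

let power n l = repeat n (distribute_on_all l) [[]]
```

With this sketch, power 2 [1; 2; 3] contains the nine expected tuples, although in a different order from the one listed in the question (the first component varies fastest).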

In Haskell, it's very natural to do this using a list comprehension:
samples :: Int -> [a] -> [[a]]
samples 0 _ = [[]]
samples n xs =
  [ p : ps
  | p <- xs
  , ps <- samples (n - 1) xs
  ]

It seems to me you never want to recurse on the tail of the list, since all your selections are from the whole list.
The Haskell code of @dfeuer looks right. Note that it never deconstructs the list xs. It just recurses on n.
You should be able to copy the Haskell code using List.map in place of the first two lines of the list comprehension, and a recursive call with (n - 1) in place of the next line.
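A rough OCaml transcription of the Haskell comprehension might look like this. It is a sketch that assumes OCaml 4.10 or later for List.concat_map; unlike the Haskell version, it also treats negative n like 0 instead of recursing forever.

```ocaml
(* All length-n selections from xs, with repetition.
   Recurses only on n; xs is never deconstructed. *)
let rec samples n xs =
  if n <= 0 then [[]]
  else
    let tails = samples (n - 1) xs in
    (* For each head p, put p in front of every shorter selection. *)
    List.concat_map (fun p -> List.map (fun ps -> p :: ps) tails) xs
```

Here samples 2 [1; 2; 3] yields the nine lists in the order requested in the question.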

Here's how I would write it in OCaml:
let perm src =
  let rec extend remaining_count tails =
    match remaining_count with
    | 0 -> tails
    | _ ->
      (* Put an element 'src_elt' taken from all the possible elements 'src'
         in front of each possible tail 'tail' taken from 'tails',
         resulting in 'new_tails'. The elements of 'new_tails' are one
         item longer than the elements of 'tails'. *)
      let new_tails =
        List.fold_left (fun new_tails src_elt ->
          List.fold_left (fun new_tails tail ->
            (src_elt :: tail) :: new_tails
          ) new_tails tails
        ) [] src
      in
      extend (remaining_count - 1) new_tails
  in
  extend (List.length src) [[]]
The List.fold_left calls may look a bit intimidating but they work well. So it's a good idea to practice using List.fold_left. Similarly, Hashtbl.fold is also common and idiomatic, and you'd use it to collect the keys and values of a hash table.

How to write a function that appends a variable number of elements to a lazy list with each iteration?

The motivating problem is: code a lazy list whose elements are all possible combinations of 0 and 1, i.e. [0], [1], [0;0], [0;1], etc.
Working in OCaml, I've written auxiliary functions for generating the list of permutations of length n+1 given n and for converting a list into a lazy list. The problem comes from the final function in the below block of code:
type 'a seq =
  | Nil
  | Cons of 'a * (unit -> 'a seq)
let rec adder = function
  | [] -> []
  | [[]] -> [[0];[1]]
  | xs::ys -> (0::xs)::(1::xs)::(adder ys)
let rec listtoseq = function
  | [] -> Nil
  | xs::ys -> Cons(xs, fun () -> listtoseq ys)
let rec appendq xq yq =
  match xq with
  | Nil -> yq
  | Cons (x, xf) -> Cons (x, fun () -> appendq (xf ()) yq)
let rec genlist xs = appendq (listtoseq xs) (genlist (adder xs))
Calling genlist [[0];[1]] results in a stack overflow. The issue seems to be that since genlist is an infinite loop I want to delay evaluation, yet evaluation is needed for appendq to work.
If this were a problem where one element is added to the lazy list at a time I could solve it, but I think the difficulty is that each set of length n permutations must be added at a time, and thus I don't know any other solution besides using an append function.

One way to look at your problem is that appendq isn't lazy enough. You can make things work if you define a function appendqf with this type:
'a seq -> (unit -> 'a seq) -> 'a seq
In other words, the second parameter isn't a sequence. It's a function that returns a sequence.
(Note that this type, unit -> 'a seq, is what actually appears inside a Cons.)
I tried this and it works for me.
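The answer does not show its appendqf, so the following is only one plausible way to write it, reusing the definitions from the question. The take helper is a hypothetical addition for inspecting a finite prefix.

```ocaml
type 'a seq =
  | Nil
  | Cons of 'a * (unit -> 'a seq)

let rec adder = function
  | [] -> []
  | [[]] -> [[0]; [1]]
  | xs :: ys -> (0 :: xs) :: (1 :: xs) :: adder ys

let rec listtoseq = function
  | [] -> Nil
  | x :: xs -> Cons (x, fun () -> listtoseq xs)

(* Like appendq, but the second sequence is a thunk that is forced
   only after the first sequence has been exhausted. *)
let rec appendqf xq yf =
  match xq with
  | Nil -> yf ()
  | Cons (x, xf) -> Cons (x, fun () -> appendqf (xf ()) yf)

(* The recursive call now sits inside a thunk, so it is delayed. *)
let rec genlist xs =
  appendqf (listtoseq xs) (fun () -> genlist (adder xs))

(* Hypothetical helper: the first n elements as an ordinary list
   (not tail-recursive; intended for small n). *)
let rec take n s =
  match n, s with
  | 0, _ | _, Nil -> []
  | n, Cons (x, f) -> x :: take (n - 1) (f ())
```

With this sketch, take 6 (genlist [[0]; [1]]) returns [[0]; [1]; [0; 0]; [1; 0]; [0; 1]; [1; 1]] instead of overflowing the stack.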

Is this zip function tail recursive?

I implemented it using continuation. I think this is tail recursive but I'm told it's not. Why isn't it tail recursive?
let rec zip_tr fc sc l1 l2 = match l1, l2 with
  | [], [] -> sc []
  | [], _ -> fc (List.length l2)
  | _, [] -> fc (List.length l1)
  | h1::t1, h2::t2 ->
    zip_tr fc (fun l -> sc ((h1, h2) :: l)) t1 t2
Isn't this tail recursive? Do the failure/success continuations have an effect on tail recursiveness?

There's only one recursive call in your code, and it is in tail position. So I would say your function is tail recursive.
It does build up a fairly large computation in the sc argument. However, the call to sc is in tail position also. In my tests, the function works for very large lists without running out of stack space.
If I try your function on two copies of a very long list (100,000,000 elements), it terminates successfully (after quite a long time). This suggests to me that it really is tail recursive.
Here is the session with the long list:
# let rec zip_tr fc sc l1 l2 = . . . ;;
val zip_tr :
(int -> 'a) -> (('b * 'c) list -> 'a) -> 'b list ->
'c list -> 'a = <fun>
# let rec mklong accum k =
if k <= 0 then accum
else mklong (k :: accum) (k - 1);;
val mklong : int list -> int -> int list = <fun>
# let long = mklong [] 100_000_000;;
val long : int list =
[1; 2; 3; 4; 5; ...]
# let long_pairs =
zip_tr (fun _ -> failwith "length mismatch")
(fun x -> x) long long;;
val long_pairs : (int * int) list =
[(1, 1); (2, 2); (3, 3); (4, 4); (5, 5); ...]
# List.length long_pairs;;
- : int = 100000000
If you change your code so that the call to sc is not a tail call:
zip_tr fc (fun l -> (h1, h2) :: sc l) t1 t2
It generates the result in reverse order, but it also fails for long lists:
# zip_tr (fun _ -> failwith "length mismatch")
(fun x -> x) [1;2] [3;4];;
- : (int * int) list = [(2, 4); (1, 3)]
# zip_tr (fun _ -> failwith "length mismatch")
(fun x -> x) long long;;
Stack overflow during evaluation (looping recursion?).
I don't know enough about OCaml code generation to explain this in detail, but it does suggest that your code really is tail recursive. However, it's possible this depends on the implementation of closures. For a different implementation, perhaps the generated computation for sc would consume a large amount of stack. Maybe this is what you're being told.

Using a tail-recursive function, you build something which is like a linked-list of continuations, by wrapping each sc inside another anonymous function; then, you call the resulting continuation.
Fortunately, your continuations are also tail-recursive, since the result of one call to sc directly gives the result of the anonymous closure. That explains why you don't have stack overflows when testing it.
The possible drawback of this function is that it allocates a lot of closures (but still with linear complexity) before starting to do any actual work, which is not what is usually done.
An advantage of this approach is that the success continuation is only called when both your lists are known to have the same size; more generally, compiling code to continuations is something that is interesting to know when working with languages (so your effort is not wasted).
If the function is part of some course, you are probably expected to directly build the result list while traversing your input lists, in a tail-recursive way, without delaying the work in continuations.
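A direct version along those lines might look like the sketch below. Returning an option on a length mismatch is a design choice made for this illustration, not something taken from the question, and the names zip and go are likewise hypothetical.

```ocaml
(* Tail-recursive zip with an accumulator: pairs are consed onto [acc]
   in reverse order and flipped once at the end with List.rev. *)
let zip l1 l2 =
  let rec go acc l1 l2 =
    match l1, l2 with
    | [], [] -> Some (List.rev acc)
    | h1 :: t1, h2 :: t2 -> go ((h1, h2) :: acc) t1 t2
    | _, [] | [], _ -> None  (* lengths differ *)
  in
  go [] l1 l2
```

Unlike the continuation version, this allocates no closures. Like the continuation version, it reports a length mismatch without returning a partial result.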

Folding a list in OCaml

In OCaml, a typical fold function looks like this:
let rec fold (combine: 'a -> 'b -> 'b) (base: 'b) (l: 'a list) : 'b =
  begin match l with
  | [] -> base
  | x :: xs -> combine x (fold combine base xs)
  end
For those familiar with OCaml (unlike me), it should be pretty straightforward what it's doing.
I'm writing a function that returns true when all items in the list satisfy the condition, that is, when condition x is true for all x in some list l. However, I'm implementing the function using a fold function and I'm stuck. Specifically, I don't know what the list should return. I know that ideally the condition should be applied to every item in the list, but I have no idea how the syntax should look. x && acc works, but it fails a very simple test (shown below):
let test () : bool =
not (for_all (fun x -> x > 0) [1; 2; -5; -33; 2])
;; run_test "for_all: multiple elements; returns false" test
Here is my preliminary attempt. Any help is appreciated:
let for_all (pred: 'a -> bool) (l: 'a list) : bool =
fold (fun(x:'a)(acc: bool)-> _?_&&_?_ )false l

let rec fold (combine: 'a -> 'b -> 'b) (base: 'b) (l: 'a list) : 'b =
  match l with
  | [] -> base
  | x::xs -> combine x (fold combine base xs)
let for_all (pred: 'a -> bool) (lst: 'a list) =
  let combine x accum =
    (pred x) && accum
  in
  fold combine true lst
Your combine function should not do x && base, because the elements of the list are not usually bool. You want your predicate function to first evaluate the element to a bool, and then you "and" it with the accumulator.
There is no need for begin and end in fold. You can just pattern match with match <identifier> with.
There are two widespread types of fold: fold_left and fold_right. You are using fold_right, which basically goes through the whole list and begins "combining" from the end of the list to the front. This is not tail-recursive.
fold_left, on the other hand, goes from the front of the list and combines every element with the accumulator right away. This does not "eat up" your stack with a number of recursive function calls.
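For comparison, a fold_left-based variant could be sketched as follows, using the standard library's List.fold_left rather than the hand-written fold from the question.

```ocaml
(* for_all via List.fold_left: tail-recursive, so it is safe on long lists.
   Once acc is false, && short-circuits and pred is no longer called,
   but the rest of the list is still traversed. *)
let for_all pred lst =
  List.fold_left (fun acc x -> acc && pred x) true lst
```

The standard library also provides List.for_all, which stops at the first failing element.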
