Lazy "n choose k" in OCaml - functional-programming

As part of a bigger problem of enumerating a set, I need to write an OCaml function 'choose' which takes a list and returns the list of all possible sequences of size k made up of elements of that list (without repeating sequences which can be obtained from each other by permutation). The order in which they appear in the resulting list is not relevant.
For example,
choose 2 [1;2;3;4] = [[1;2];[1;3];[1;4];[2;3];[2;4];[3;4]]
Any ideas?
I would like to have the whole thing to be lazy, outputting a lazy list, but if you have a strict solution, that'll be very useful too.

Here is a strict and suboptimal version. I hope it is clear. It avoids duplicates by assuming there are no duplicates in the input list, and by generating only sublists that are in the same order as in the original list.
The length computation could be factored by passing l's length as an argument of choose. That would make the code less readable but more efficient.
For the lazy version, sprinkle "lazy" and "Lazy.force" on the code...
let rec choose k l =
if k = 0
then [ [] ]
else
let len = List.length l in
if len < k
then []
else if k = len
then [ l ]
else
match l with
h :: t ->
let starting_with_h =
(List.map (fun sublist -> h :: sublist) (choose (pred k) t))
in
let not_starting_with_h = choose k t in
starting_with_h @ not_starting_with_h
| [] -> assert false
;;
val choose : int -> 'a list -> 'a list list = <fun>
# choose 3 [1; 2; 3; 4; 5; 6; 7] ;;
- : int list list =
[[1; 2; 3]; [1; 2; 4]; [1; 2; 5]; [1; 2; 6]; [1; 2; 7]; [1; 3; 4]; [1; 3; 5];
[1; 3; 6]; [1; 3; 7]; [1; 4; 5]; [1; 4; 6]; [1; 4; 7]; [1; 5; 6]; [1; 5; 7];
[1; 6; 7]; [2; 3; 4]; [2; 3; 5]; [2; 3; 6]; [2; 3; 7]; [2; 4; 5]; [2; 4; 6];
[2; 4; 7]; [2; 5; 6]; [2; 5; 7]; [2; 6; 7]; [3; 4; 5]; [3; 4; 6]; [3; 4; 7];
[3; 5; 6]; [3; 5; 7]; [3; 6; 7]; [4; 5; 6]; [4; 5; 7]; [4; 6; 7]; [5; 6; 7]]
EDIT:
A lazy_list_append as appears necessary from the comments below:
type 'a node_t =
| Empty
| Node of 'a * 'a zlist_t
and 'a zlist_t = 'a node_t lazy_t
let rec lazy_list_append l1 l2 =
lazy
(match Lazy.force l1 with
Empty -> Lazy.force l2
| Node (h, lt) ->
Node (h, lazy_list_append lt l2))
;;
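As an alternative to a hand-rolled lazy list, the same recursion can be sketched on top of the standard Seq module (assuming OCaml 4.07 or later). The name choose_seq is made up for this sketch, and it drops the length test of the strict version for brevity, so take it as an illustration rather than a tuned implementation:

```ocaml
(* Sketch only: choose_seq is a hypothetical name, not part of the answer above.
   Each element of the resulting sequence is computed on demand. *)
let rec choose_seq k l : 'a list Seq.t =
 fun () ->
  if k = 0 then Seq.Cons ([], Seq.empty)
  else
    match l with
    | [] -> Seq.Nil
    | h :: t ->
        (* combinations that contain h, then those that do not *)
        let starting_with_h =
          Seq.map (fun sublist -> h :: sublist) (choose_seq (k - 1) t)
        in
        let not_starting_with_h = choose_seq k t in
        Seq.append starting_with_h not_starting_with_h ()
```

Forcing the whole sequence with List.of_seq (choose_seq 2 [1; 2; 3; 4]) should give the same list as the strict choose 2 [1; 2; 3; 4].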

Plugging in again with a Haskell solution (it's just easier to work with lazy lists since they are built-in):
combinations 0 _ = [[]]
combinations k [] = []
combinations k (x:xs) = map (x:) (combinations (k-1) xs) ++ combinations k xs
The first two cases follow from the properties of binomial coefficients. More specifically: n choose 0 = 1 for all n including n=0 (that's why this equation comes first, so that it also handles the case 0 choose 0). The other one is 0 choose k = 0 for k > 0. The third equation is an exact translation of the recursive definition of combinations.
Unfortunately when you apply it to an infinite list it returns a trivial solution:
> take 10 $ combinations 3 [1..]
[[1,2,3],[1,2,4],[1,2,5],[1,2,6],[1,2,7],[1,2,8],[1,2,9],[1,2,10],[1,2,11],[1,2,12]]
EDIT:
OK, so we really want to go through each combination in a finite number of steps. With the above version we are obviously using only the expression to the left of ++, which generates only combinations starting with 1. We can work around this problem by defining an interesting list zipping function which builds a list by alternately picking the head of each of its argument lists (it's important to be non-strict in the second argument):
merge [] ys = ys
merge (x:xs) ys = x:merge ys xs
and use it instead of ++:
combinations k (x:xs) = map (x:) (combinations (k-1) xs) `merge` combinations k xs
Let's see:
> let comb_10_3 = combinations 3 [1..10]
> let comb_inf_3 = combinations 3 [1..]
> take 10 comb_inf_3
[[1,2,3],[2,3,4],[1,3,4],[3,4,5],[1,2,4],[2,4,5],[1,4,5],[4,5,6],[1,2,5],[2,3,5]]
> comb_10_3 `intersect` comb_inf_3 == comb_10_3
True
> last $ combinations 3 [1..10]
[6,8,10]
> elemIndex [6,8,10] $ combinations 3 [1..]
Just 351
All 10 choose 3 combinations are there!
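For readers who want the same fair interleaving in OCaml, here is a rough translation to the standard Seq module (assuming OCaml 4.07 or later). The helpers nats_from and take are made up for this sketch, and since Seq does not memoize, forcing the same sequence twice recomputes it:

```ocaml
(* Hypothetical helper: the infinite sequence n, n+1, n+2, ... *)
let rec nats_from n : int Seq.t = fun () -> Seq.Cons (n, nats_from (n + 1))

(* Hypothetical helper: the first n elements of a sequence, as a list. *)
let rec take n (s : 'a Seq.t) =
  if n <= 0 then []
  else
    match s () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: take (n - 1) rest

(* Alternately pick the head of each argument; ys is not forced until needed. *)
let rec merge (xs : 'a Seq.t) (ys : 'a Seq.t) : 'a Seq.t =
 fun () ->
  match xs () with
  | Seq.Nil -> ys ()
  | Seq.Cons (x, xs') -> Seq.Cons (x, merge ys xs')

let rec combinations k (s : 'a Seq.t) : 'a list Seq.t =
 fun () ->
  if k = 0 then Seq.Cons ([], Seq.empty)
  else
    match s () with
    | Seq.Nil -> Seq.Nil
    | Seq.Cons (x, xs) ->
        merge
          (Seq.map (fun c -> x :: c) (combinations (k - 1) xs))
          (combinations k xs)
          ()
```

With these definitions, take 5 (combinations 3 (nats_from 1)) should start with the same combinations as the Haskell session above.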

Just for the sake of completeness, I am putting here the final code, which brings together the strict code from Pascal with my lazy stuff and all of Pascal's other useful comments.
The lazy list type is defined, then two auxiliary lazy functions (append and map), and finally the function "choose" that we aim to define.
type 'a node_t =
| Nil
| Cons of 'a * 'a t
and 'a t = ('a node_t) Lazy.t
let rec append l1 l2 =
match Lazy.force l1 with
| Nil -> l2
| Cons (a, l) -> lazy (Cons (a, append l l2))
let rec map f ll = lazy (
match Lazy.force ll with
| Nil -> Nil
| Cons(h,t) -> Cons(f h, map f t) )
let rec choose k l len =
if k = 0
then lazy (Cons(lazy Nil,lazy Nil))
else
if len < k
then lazy Nil
else if k = len
then lazy (Cons (l,lazy Nil))
else
match Lazy.force l with
| Cons(h,t) -> let g h sublist = lazy (Cons (h,sublist))
in let starting_with_h = (map (g h) (choose (k-1) t (len-1)))
in let not_starting_with_h = choose k t (len-1)
in append starting_with_h not_starting_with_h
| Nil -> assert false
The result of evaluating "choose k ls n" is a lazy list of all choices of k elements of list ls, with ls considered up to size n. Note that, as pointed out by Pascal, because of the way the enumeration takes place, the function choose will not cover all choices of an infinite list.
Thanks, this was really useful!
Best,
Surikator.

Related

trace a nested recursion in Ocaml

I am trying to understand deeply nested recursion in OCaml by using the sorting list algorithm. For this reason I am tracing the below code which has a recursive function sort and calls another function insert.
let rec sort (lst : int list) =
match lst with [] -> [] | head :: tail -> insert head (sort tail)
and insert elt lst =
match lst with
| [] -> [ elt ]
| head :: tail -> if elt <= head then elt :: lst else head :: insert elt tail
I understand the first recursive calls for sort, but after that I cannot follow.
For instance, suppose we have the list [6; 2; 5; 3]. After the tail of this list has been sorted into [2; 3; 5], where in the code is the head 6 compared to each element of that tail? Can somebody provide a hint for reading the trace results?
utop # sort [6; 2; 5; 3];;
> sort <-- [6; 2; 5; 3]
> sort <-- [2; 5; 3]
> sort <-- [5; 3]
> sort <-- [3]
> sort <-- []
> sort --> []
> insert <-- 3
> insert -->
> insert* <-- []
> insert* --> [3]
> sort --> [3]
> insert <-- 5
> insert -->
> insert* <-- [3]
> insert <-- 5
> insert -->
> insert* <-- []
> insert* --> [5]
> insert* --> [3; 5]
> sort --> [3; 5]
> insert <-- 2
> insert -->
> insert* <-- [3; 5]
> insert* --> [2; 3; 5]
> sort --> [2; 3; 5]
> insert <-- 6
> insert -->
> insert* <-- [2; 3; 5]
> insert <-- 6
> insert -->
> insert* <-- [3; 5]
> insert <-- 6
> insert -->
> insert* <-- [5]
> insert <-- 6
> insert -->
> insert* <-- []
> insert* --> [6]
> insert* --> [5; 6]
> insert* --> [3; 5; 6]
> insert* --> [2; 3; 5; 6]
> sort --> [2; 3; 5; 6]
>
> - : int list = [2; 3; 5; 6]
First of all, there's no reason to have insert and sort being mutually recursive since insert doesn't depend on sort. So you could write it like this:
let rec insert elt lst =
match lst with
| [] -> [ elt ]
| head :: tail -> if elt <= head then elt :: lst else head :: insert elt tail
let rec sort (lst : int list) =
match lst with [] -> [] | head :: tail -> insert head (sort tail)
Now, what happens in insert? The function tries to insert an element elt in a sorted list with the invariant that all elements before it should be smaller and all the elements after should be higher.
Two cases happen:
if the list is empty, the invariant is ensured if you just return a list containing the element you were trying to insert.
if the list is not empty, it's composed of an element we'll call head and the rest of the list that we'll call tail. Now we have two new cases:
if elt <= head then all the elements of the list are greater than or equal to elt, so you just return elt :: lst (for example, if you call insert 1 [2; 3; 4] you'll return [1; 2; 3; 4])
otherwise, head < elt, so we need to add head in front of the list that will be returned by inserting elt into tail, hence the recursive call to insert elt tail
Now, when you call sort you call it like this:
insert head (sort tail)
Why so? Because the invariant only holds if the list you're trying to insert head into is sorted (hence the emphasis on "sorted list" above). So you need to sort tail before inserting head into it.
If you have the following list: [3; 2; 1], you'll call
insert 3 (sort [2; 1])
which is transformed in
insert 3 (insert 2 (sort [1]))
which is transformed in
insert 3 (insert 2 (insert 1 (sort [])))
which is resolved in
insert 3 (insert 2 [1])
which is resolved in
insert 3 [1; 2]
which is resolved in
[1; 2; 3]
And your list is sorted.
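The nesting insert 3 (insert 2 (insert 1 [])) is exactly the shape that List.fold_right produces, so the same sort can be sketched without explicit recursion. This is only a restatement of the code above, not a different algorithm:

```ocaml
let rec insert elt lst =
  match lst with
  | [] -> [ elt ]
  | head :: tail -> if elt <= head then elt :: lst else head :: insert elt tail

(* List.fold_right insert [a; b; c] [] is insert a (insert b (insert c [])) *)
let sort (lst : int list) = List.fold_right insert lst []
```

Here sort [3; 2; 1] unfolds to the chain of insert calls shown above and evaluates to [1; 2; 3].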
[EDIT]
Here's the code with some printing to see what's happening:
let pp_sep ppf () = Format.fprintf ppf "; "
let rec insert elt lst =
Format.printf "@[<v 2>(Insert %d in [%a]" elt
Format.(pp_print_list ~pp_sep (fun ppf d -> fprintf ppf "%d" d))
lst;
let l =
match lst with
| [] -> [ elt ]
| head :: tail ->
if elt <= head then elt :: lst
else (
Format.printf "@,";
head :: insert elt tail)
in
Format.printf ")@]";
l
let rec sort (lst : int list) =
match lst with
| [] -> []
| head :: tail ->
Format.printf "@[<v 2>(Sort [%a] then insert %d@,"
Format.(pp_print_list ~pp_sep (fun ppf d -> fprintf ppf "%d" d))
tail head;
let l = insert head (sort tail) in
Format.printf ")@]@,";
l
# sort [3;2;1];;
(Sort [2; 1] then insert 3
(Sort [1] then insert 2
(Sort [] then insert 1
(Insert 1 in []))
(Insert 2 in [1]
(Insert 2 in [])))
(Insert 3 in [1; 2]
(Insert 3 in [2]
(Insert 3 in []))))
- : int list = [1; 2; 3]
In a sort by insertion, it is the insertion function that performs the comparisons between the element to be inserted and the currently sorted list.
You can see that your trace inserts the elements of your list in reverse order:
insert <-- 3
...
insert <-- 5
...
insert <-- 5
...
insert <-- 2
...
insert <-- 6
...
insert <-- 6
...
insert <-- 6
...
insert <-- 6
...
A possible next step is to figure out why insert is called four times with 6 as an argument and only once with 2 as an argument.
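One way to explore that is to count the calls instead of reading the trace. The calls table below is a made-up piece of instrumentation for this sketch; the insert and sort logic is unchanged from the question:

```ocaml
(* Hypothetical instrumentation: maps each element to the number of times
   insert was entered with it. *)
let calls : (int, int) Hashtbl.t = Hashtbl.create 8

let rec insert elt lst =
  let seen = try Hashtbl.find calls elt with Not_found -> 0 in
  Hashtbl.replace calls elt (seen + 1);
  match lst with
  | [] -> [ elt ]
  | head :: tail -> if elt <= head then elt :: lst else head :: insert elt tail

let rec sort (lst : int list) =
  match lst with [] -> [] | head :: tail -> insert head (sort tail)

let () =
  ignore (sort [ 6; 2; 5; 3 ]);
  Hashtbl.iter
    (fun elt n -> Printf.printf "insert entered %d time(s) with %d\n" n elt)
    calls
```

Each count is one plus the number of elements of the already sorted tail that are strictly smaller than the element being inserted, which is why the largest element is the most expensive one.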

Applying a function n times

I am trying to apply a function to a value n times.
Currently, I have
let rec n_times (f, n, v) =
if n > 0 then
n_times f n-1 (f v)
else
v
For some reason I keep getting an error that says
This expression has type 'a but an expression was expected of type 'a * int * 'b
The type variable 'a occurs inside 'a * int * 'b
I saw a few posts that address the same problem I am working on but none of them gets the same error.
In the first line of your code, you declare a function n_times that takes a single argument, the triple (f, n, v); then at the call site (third line) you pass it three separate arguments.
To fix this, either write let rec n_times f n v = on line 1 (and parenthesize the call as n_times f (n-1) (f v), since function application binds tighter than -), or write n_times (f, n-1, (f v)) on line 3.
You have defined the function to take a 3-tuple of values. So when you call it recursively you need to supply a 3-tuple:
n_times (f, n - 1, f v)
There are at least two problems, it would help to know what the purpose is other than recursion.
To get this to run you have to change your third line. n_times is defined with three inputs so it needs to be called with three. Also the function is defined to take a general, integer, and general input and output a general type.
You could remove (f v) and input just v every loop,
# let rec n_times (f, n, v) =
if n > 0 then
n_times (f , n-1 , v)
else
v;;
val n_times : 'a * int * 'b -> 'b = <fun>
# n_times(2,3,4);;
- : int = 4
This will however always return just v at the end.
You could also replace (f v) with a list and prepend to it on each loop,
# let rec n_times (f, n, v) =
if n > 0 then
n_times (f , n-1 , f::v)
else
v;;
val n_times : 'a * int * 'a list -> 'a list = <fun>
# n_times(2,3,[4]);;
- : int list = [2; 2; 2; 4]
# n_times(2,5,[4]);;
- : int list = [2; 2; 2; 2; 2; 4]
This allows the list to grow with each loop.
There seems to be a misunderstanding in how OCaml functions with multiple arguments are defined. You should replace
let rec n_times (f, n, v) =
with:
let rec n_times f n v =
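Putting the pieces together, a minimal curried version could look like the following. Note the parentheses around n - 1: without them, n_times f n-1 (f v) is parsed as (n_times f n) - (1 (f v)), because function application binds tighter than the minus operator.

```ocaml
(* Apply f to v, n times: n_times f 3 v is f (f (f v)). *)
let rec n_times f n v =
  if n > 0 then n_times f (n - 1) (f v)
  else v
```

For instance, n_times (fun x -> x * 2) 3 1 doubles 1 three times, and n_times f 0 v returns v unchanged.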

What are the usual higher-order primitives on lists in functional programming? (map, fold, and?)

What do you usually do with a list in purely functional programming?
Obviously we have map f [x0; x1; x2] which produces [f x0; f x1; f x2] and fold f acc [x0; x1; x2] which produces f (f (f acc x0) x1) x2.
For map, no information is transmitted between calls to f; for fold, all the information produced by f is re-injected in the next call through the accumulator acc.
I've also seen things like flatmap that concatenates the lists produced by map f when f returns a list and for_all that applies a predicate to all elements of the lists, but these are just special cases of map and fold.
I can think about something intermediate that would produce a list but also keep an accumulator during iteration (OCaml syntax):
let rec map_acc ~f acc = function
| [] -> []
| x::xs -> let (y, acc) = f x acc in y::(map_acc ~f acc xs)
Questions:
is map_acc a usual concept in functional programming?
if yes, what's its canonical name?
is it implemented in standard libraries?
what are the other usual higher-order functions?
Notes:
Here is an example where map_acc is used to produce running sums:
let sums xs = map_acc ~f:(fun x sum -> let sum = sum + x in (sum, sum)) 0 xs
And sums [3; 5; 6; 9] produces [3; 8; 14; 23]. Granted, sums can be implemented using fold and a well-tailored accumulator, but it is simpler with map_acc.
map_acc looks like a slightly more general scan_left:
let scan_left f init list =
let _, xs = List.fold_left
(fun (acc, coll) elt ->
let acc' = f acc elt in acc', acc'::coll)
(init, []) list in
List.rev xs
scan_left (+) 0 [3; 5; 6; 9] => [3; 8; 14; 23]
This function and variants appear under various names in the standard libraries of a few functional languages, although not OCaml.
None of these functions really deserve being called primitives.
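Since these answers were written, OCaml's standard library has gained List.fold_left_map (from 4.11, as far as I know), which is close to map_acc except that the accumulator comes first and the final accumulator is returned alongside the list. A sketch of the running-sums example with it, assuming a recent enough compiler:

```ocaml
(* List.fold_left_map threads an accumulator through the list and also
   builds the list of per-element results. *)
let sums xs =
  let _final_sum, running =
    List.fold_left_map (fun sum x -> let sum = sum + x in (sum, sum)) 0 xs
  in
  running
```

As with map_acc, sums [3; 5; 6; 9] yields the running totals of the input.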

List of n balls in m buckets in OCaml [closed]

I am trying to find all permutations where n balls are spread into m buckets. I am approaching it through recursion but I am confused on what I should recurse n on since n could decrease by any numbers... (I am recursing on m-1) Any thoughts on how to do this with a functional language approach?
There's a solution in C++ but I don't understand C++.
List of combinations of N balls in M boxes in C++
There is no need to generate redundant results. The following code is a bit ugly, but it does the job :
let ( <|> ) s e =
let rec aux s e res =
if e - s < 0 then res
else aux (s + 1) e (s :: res) in
List.rev (aux s e [])
let rec generate n m =
let prepend_x l x = List.map (fun u -> x::u) l in
if m = 1 then [[n]]
else
let l = List.map (fun p -> prepend_x (generate (n - p) (m - 1)) p) (0 <|> n) in
List.concat l
The idea is simply that you want all lists of the form p::u with u in generate (n - p) (m - 1), with p ranging over 0..n
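The same idea can be written more compactly with List.init and List.concat_map, assuming OCaml 4.10 or later for concat_map. This is a restatement of the generate function above, not a new algorithm, and like the original it expects m >= 1:

```ocaml
(* All ways to put n balls in m buckets: choose p balls for the first bucket,
   then distribute the remaining n - p balls over the other m - 1 buckets. *)
let rec generate n m =
  if m = 1 then [ [ n ] ]
  else
    List.concat_map
      (fun p -> List.map (fun u -> p :: u) (generate (n - p) (m - 1)))
      (List.init (n + 1) (fun p -> p))
```

The number of results should be the binomial coefficient (n + m - 1) choose (m - 1), for example 21 for 5 balls in 3 buckets.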
let flatten_tail l =
let rec flat acc = function
| [] -> List.rev acc
| hd::tl -> flat (List.rev_append hd acc) tl
in
flat [] l
let concat_map_tail f l =
List.rev_map f l |> List.rev |> flatten_tail
let rm_dup l =
if List.length l = 0 then l
else
let sl = List.sort compare l in
List.fold_left (
fun (acc, e) x -> if x <> e then x::acc, x else acc,e
) ([List.hd sl], List.hd sl) (List.tl sl) |> fst |> List.rev
(* algorithm starts from here *)
let buckets m =
let rec generate acc m =
if m = 0 then acc
else generate (0::acc) (m-1)
in
generate [] m
let throw_1_ball bs =
let rec throw acc before = function
| [] -> acc
| b::tl ->
let new_before = b::before in
let new_acc = (List.rev_append before ((b+1)::tl))::acc in
throw new_acc new_before tl
in
throw [] [] bs
let throw_n_ball n m =
let bs = buckets m in
let rec throw i acc =
if i = 0 then acc
else throw (i-1) (concat_map_tail throw_1_ball acc |> rm_dup)
in
throw n [bs]
Above is the correct code. It looks scary because I added several utility functions and made things as tail-recursive as possible, but the idea is very simple.
Here is the algorithm:
Let's say we have 3 buckets, initially it is [0;0;0].
If we throw 1 ball into the 3 buckets, we have 3 cases each of which
is a snapshot of the buckets, i.e., [[1;0;0];[0;1;0];[0;0;1]].
Then if we have 1 more ball, for each case above, we will have 3 cases,
so the resulting case list has 9 cases
Then if we have 1 more ball, .....
In this way, we will generate 3^n cases and many of them may be redundant.
So each time a case list is generated, we just remove all duplicates from it.
utop # throw_n_ball 3 2;;
- : int list list = [[0; 3]; [1; 2]; [2; 1]; [3; 0]]
utop # throw_n_ball 5 3;;
- : int list list = [[0; 0; 5]; [0; 1; 4]; [0; 2; 3]; [0; 3; 2]; [0; 4; 1]; [0; 5; 0]; [1; 0; 4];[1; 1; 3]; [1; 2; 2]; [1; 3; 1]; [1; 4; 0]; [2; 0; 3]; [2; 1; 2]; [2; 2; 1]; [2; 3; 0]; [3; 0; 2]; [3; 1; 1]; [3; 2; 0]; [4; 0; 1]; [4; 1; 0]; [5; 0; 0]]

F# Finding the Powerset

I want to find the Powerset
powerset [1;2;3] = [[]; [3]; [2]; [2; 3]; [1]; [1; 3]; [1; 2]; [1; 2; 3]]
let rec powerset = function
| [] -> []
| x::xs -> List.map (fun ys -> xs) xs::powerset (xs)
I am having trouble with the code, this is what my output looks like now.
val it : int list list list = [[[2; 3]; [2; 3]]; [[3]]; []]
Others have already pointed to a link that uses sequence expressions and enumerates the sets lazily. That's how I would solve the problem (note that there is nothing impure or non-functional about using for inside a sequence comprehension - it is just a way to generate a sequence of results):
let rec powerset s = seq {
match s with
| [] -> yield []
| h::t -> for x in powerset t do yield! [x; h::x] }
That said, this can be easily translated to code that returns a list and uses higher-order functions:
let rec powerset =
function
| [] -> [[]]
| x::xs -> List.collect (fun subset -> [subset; x::subset]) (powerset xs)
The power set of an empty set is a set with single element [] (note that this is wrong in your snippet). To generate a powerset of x::xs, we first generate powerset of xs and then return two sets for every single element of the generated powerset - one is the sub set and the other is the subset with added x element. (This is done using List.collect which is like calling List.map followed by List.concat.)
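For comparison with the OCaml threads above, the list-based version carries over almost word for word, with List.concat_map (assuming OCaml 4.10 or later) playing the role of F#'s List.collect. This is a sketch of the same recursion, and it lists the subsets in a different order than the one in the question:

```ocaml
(* For every subset of xs, emit it once without x and once with x added. *)
let rec powerset = function
  | [] -> [ [] ]
  | x :: xs ->
      List.concat_map (fun subset -> [ subset; x :: subset ]) (powerset xs)
```

A list of n distinct elements gives 2^n subsets, so powerset [1; 2; 3] has 8 elements.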
