Any better way to implement quicksort in ocaml? - functional-programming

I implemented quicksort in OCaml. Here is the code:
let shuffle d =
let nd = List.map (fun c -> (Random.bits (), c)) d in
let sond = List.sort compare nd in
List.map snd sond;;
let partition = function
| [] -> ([], [], [])
| pivot::tl ->
let rec p (left, right) = function
| [] -> (left, right, [pivot])
| first::rest ->
let c = compare pivot first
in
if c > 0 then
p (first::left, right) rest
else
p (left, first::right) rest
in
p ([], []) tl;;
let quicksort l =
let sl = shuffle l
in
let rec qs = function
| [] -> []
| l ->
let (left, right, pivot) = partition l
in
(qs left) # pivot # (qs right)
in
qs sl;;
First, I think maybe there is a better way to implement partition. List.partition came to my mind, but I just wanted to implement the key part by myself
Second, I use # a lot in the sorting which is inefficient, right?
any suggestions?
Edit
One more question to think about is whether 3-way quicksort affects the implementation in OCaml?

The p function is badly indented; speaking of indentation, I tend to think the style of having a in on the next line is overkill for single-line declarations, so I'd rather put them at the end of the one-liner declaration.
More importantly, there is no need for it to take a tuple of lists as arguments, you'll have something syntactically lighter using two separate (curried) arguments. You could also use the List.partition function of the standard library.

A micro optimization that you can try is to do List.concat[[qs left]; [pivot]; [qs right]] to append the lists at once buth you will need to run some benchmarks to verify this even helps.

Related

Map List onto shifted self

I have finally found an excellent entry point into functional programming with elm, and boy, do I like it, yet I still lack some probably fundamental elegance concerning a few concepts.
I often find myself writing code similar to the one below, which seems to be doing what it should, but if someone more experienced could suggest a more compact and direct approach, I am sure that could give some valuable insights into this so(u)rcery.
What I imagine this could boil down to, is something like the following
(<-> is a vector subtraction operator):
edgeDirections : List Vector -> List Vector
edgeDirections corners = List.map2 (\p v -> p <-> v) corners (shiftr 1 corners)
but I don't really have a satisfying approach to a method that would do a shiftr.
But the rules of stackoverflow demand it, here is what I tried. I wrote an ugly example of a possible usage for shiftr (I absolutely dislike the Debug.crash and I am not happy about the Maybe):
Given a list of vectors (the corner points of a polygon), calculate the directional vectors by calculating the difference of each corner-vector to its previous one, starting with the diff between the first and the last entry in the list.
[v1,v2,v3] -> [v1-v3,v2-v1,v3-v2]
Here goes:
edgeDir : Vector -> ( Maybe Vector, List Vector ) -> ( Maybe Vector, List Vector )
edgeDir p ( v, list ) =
case v of
Nothing ->
Debug.crash ("nono")
Just vector ->
( Just p, list ++ [ p <-> vector ] )
edgeDirections : List Vector -> List Vector
edgeDirections corners =
let
last =
List.head <| List.reverse corners
in
snd <| List.foldl edgeDir ( last, [] ) corners
main =
show <| edgeDirections [ Vector -1 0, Vector 0 1, Vector 1 0 ]
I appreciate any insight into how this result could be achieved in a more direct manner, maybe using existing language constructs I am not aware of yet, or any pointers on how to lessen the pain with Maybe. The latter may Just not be possible, but I am certain that the former will a) blow me away and b) make me scratch my head a couple times :)
Thank you, and many thanks for this felicitous language!
If Elm had built-in init and last functions, this could be cleaner.
You can get away from all those Maybes by doing some pattern matching. Here's my attempt using just pattern matching and an accumulator.
import List exposing (map2, append, reverse)
shiftr list =
let shiftr' acc rest =
case rest of
[] -> []
[x] -> x :: reverse acc
(x::xs) -> shiftr' (x::acc) xs
in shiftr' [] list
edgeDirections vectors =
map2 (<->) vectors <| shiftr vectors
Notice also the shortened writing of the mapping function of (<->), which is equivalent to (\p v -> p <-> v).
Suppose Elm did have an init and last function - let's just define those quickly here:
init list =
case list of
[] -> Nothing
[_] -> Just []
(x::xs) -> Maybe.map ((::) x) <| init xs
last list =
case list of
[] -> Nothing
[x] -> Just x
(_::xs) -> last xs
Then your shiftr function could be shortened to something like:
shiftr list =
case (init list, last list) of
(Just i, Just l) -> l :: i
_ -> list
Just after I "hung up", I came up with this, but I am sure this can still be greatly improved upon, if it's even correct (and it only works for n=1)
shiftr : List a -> List a
shiftr list =
let
rev =
List.reverse list
in
case List.head rev of
Nothing ->
list
Just t ->
[ t ] ++ (List.reverse <| List.drop 1 rev)
main =
show (shiftr [ 1, 2, 3, 4 ] |> shiftr)

Map a list of options to list of strings

I have the following function in OCaml:
let get_all_parents lst =
List.map (fun (name,opt) -> opt) lst
That maps my big list with (name, opt) to just a list of opt. An option can contain of either None or Some value which in this case is a string. I want a list of strings with all my values.
I am a beginner learning OCaml.
I don't think filter and map used together is a good solution to this problem. This is because when you apply map to convert your string option to string, you will have the None case to deal with. Even if you know that you won't have any Nones because you filtered them away, the type checker doesn't, and can't help you. If you have non-exhaustive pattern match warnings enabled, you will get them, or you will have to supply some kind of dummy string for the None case. And, you will have to hope you don't introduce errors when refactoring later, or else write test cases or do more code review.
Instead, you need a function filter_map : ('a -> 'b option) -> 'a list -> 'b list. The idea is that this works like map, except filter_map f lst drops each element of lst for which f evaluates to None. If f evaluates to Some v, the result list will have v. You could then use filter_map like so:
filter_map (fun (_, opt) -> opt) lst
You could also write that as
filter_map snd lst
A more general example would be:
filter_map (fun (_, opt) ->
match opt with
| Some s -> Some (s ^ "\n")
| None -> None)
lst
filter_map can be implemented like this:
let filter_map f lst =
let rec loop acc = function
| [] -> List.rev acc
| v::lst' ->
match f v with
| None -> loop acc lst'
| Some v' -> loop (v'::acc) lst'
in
loop [] lst
EDIT For greater completeness, you could also do
let filter_map f lst =
List.fold_left (fun acc v ->
match f v with
| Some v' -> v'::acc
| None -> acc) [] lst
|> List.rev
It's a shame that this kind of function isn't in the standard library. It's present in both Batteries Included and Jane Street Core.
I'm going to expand on #Carsten's answer. He is pointing you the right direction.
It's not clear what question you're asking. For example, I'm not sure why you're telling us about your function get_all_parents. Possibly this function was your attempt to get the answer you want, and that it's not quite working for you. Or maybe you're happy with this function, but you want to do some further processing on its results?
Either way, List.map can't do the whole job because it always returns a list of the same length as its input. But you need a list that can be different lengths, depending on how many None values there are in the big list.
So you need a function that can extract only the parts of a list that you're interested in. As #Carsten says, the key function for this is List.filter.
Some combination of map and filter will definitely do what you want. Or you can just use fold, which has the power of both map and filter. Or you can write your own recursive function that does all the work.
Update
Maybe your problem is in extracting the string from a string option. The "nice" way to do this is to provide a default value to use when the option is None:
let get default xo =
match xo with
| None -> default
| Some x -> x
# get "none" (Some "abc");;
- : string = "abc"
# get "none" None;;
- : string = "none"
#
type opt = Some of string | None
List.fold_left (fun lres -> function
(name,Some value) -> value::lres
| (name,None) -> lres
) [] [("s1",None);("s2",Some "s2bis")]
result:
- : string list = ["s2bis"]

Using continuation / CPS to implement tail-recursive MergeSort in OCaml

I am trying to implement a tail-recursive MergeSort in OCaml.
Since Mergesort naturally is not tail-recursive, so I am using CPS to implement it.
Also my implementation is inspired by Tail-recursive merge sort in OCaml
Below is my code
let merge compare_fun l1 l2 =
let rec mg l1 l2 acc =
match l1, l2 with
| ([], []) -> List.rev acc
| ([], hd2::tl2) -> mg [] tl2 (hd2::acc)
| (hd1::tl1, []) -> mg tl1 [] (hd1::acc)
| (hd1::tl1, hd2::tl2) ->
let c = compare_fun hd1 hd2
in
if c = 1 then mg l1 tl2 (hd2::acc)
else if c = 0 then mg tl1 tl2 (hd2::hd1::acc)
else mg tl1 l2 (hd1::acc)
in
mg l1 l2 [];;
let split_list p l =
let rec split_list p (acc1, acc2) = function
| [] -> (List.rev acc1, List.rev acc2)
| hd::tl ->
if p > 0 then split_list (p-1) (hd::acc1, acc2) tl
else split_list (p-2) (acc1, hd::acc2) tl
in
split_list p ([], []) l;;
let mergeSort_cps compare_fun l =
let rec sort_cps l cf = (*cf = continuation func*)
match l with
| [] -> cf []
| hd::[] -> cf [hd]
| _ ->
let (left, right) = split_list ((List.length l)/2) l
in
sort_cps left (fun leftR -> sort_cps right (fun rightR -> cf (merge compare_fun leftR rightR)))
in
sort_cps l (fun x -> x);;
When I compile it, and run it with a 1,000,000 integers, it gives the error of stackoverflow. Why?
Edit
Here is the code I used for testing:
let compare_int x y =
if x > y then 1
else if x = y then 0
else -1;;
let create_list n =
Random.self_init ();
let rec create n' acc =
if n' = 0 then acc
else
create (n'-1) ((Random.int (n/2))::acc)
in
create n [];;
let l = create_list 1000000;;
let sl = mergeSort_cps compare_int l;;
in http://try.ocamlpro.com/, it gave this error: Exception: RangeError: Maximum call stack size exceeded.
in local ocaml top level, it didn't have any problem
Adding another answer to make a separate point: it seems that much of the confusion among answerers is caused by the fact that you don't use the standard OCaml compiler, but the TryOCaml website which runs a distinct OCaml backend, on top of javascript, and has therefore slightly different optimization and runtime characteristics.
I can reliably reproduce the fact that, on the TryOCaml website, the CPS-style function mergeSort_cps you show fails on lists of length 1_000_000 with the following error:
Exception: InternalError: too much recursion.
My analysis is that this is not due to a lack of tail-rec-ness, but by a lack of support, on the Javascript backend, of the non-obvious way in which the CPS-translated call is tailrec: recursion goes through a lambda-abstraction boundary (but still in tail position).
Turning the code in the direct, non-tail-rec version makes the problem go away:
let rec merge_sort compare = function
| [] -> []
| [hd] -> [hd]
| l ->
let (left, right) = split_list (List.length l / 2) l in
merge compare (merge_sort compare left) (merge_sort compare right);;
As I said in my other answer, this code has a logarithmic stack depth, so no StackOverflow will arise from its use (tail-rec is not everything). It is simpler code that the Javascript backend handles better.
Note that you can make it noticeably faster by using a better implementation split (still with your definition of merge) that avoids the double traversal of List.length then splitting:
let split li =
let rec split ls rs = function
| [] -> (ls, rs)
| x::xs -> split rs (x::ls) xs in
split [] [] li;;
let rec merge_sort compare = function
| [] -> []
| [hd] -> [hd]
| l ->
let (left, right) = split l in
merge compare (merge_sort compare left) (merge_sort compare right);;
Reading the comments, it seems that your Stack_overflow error is hard to reproduce.
Nevertheless, your code is not entirely in CPS or tail-recursive: in merge_sort, the calls to split_list and merge are made in a non-tail-call position.
The question is: by making your CPS transform and generous use of accumulators, what will be the worst stack depth related to recursion? Saving stack depth on the sort calls is in fact not very interesting: as each split the list in two, the worst stack depth would be O(log n) for n the size the input list.
On the contrary, split and merge would have made a linear O(n) usage of the stack if they weren't written in accumulator-passing style, so they are important to make tail-rec. As your implementation of those routines is tail-rec, there should be no need to worry about stack usage, and neither to convert the sort routine itself in CPS form that makes the code harder to read.
(Note that this logarithmic-decrease argument is specific to mergesort. A quicksort can have linear stack usage in worst case, so it could be important to make it tail-rec.)

F# Cutting a list in half using functional programming

I am trying to input a list into the function and it send me a list with the first half of the elements taken away using f# with the below recursion but I keep running into a base case problem that I just cant figure out. any thoughts? I am using the second shadow list to count how far I need to go until I am half way into the list (by removing two elements at a time)
let rec dropHalf listToDrop shadowList =
match shadowList with
| [] -> listToDrop
| shadowHead2::shadowHead1::shadowTail -> if shadowTail.Length<=1 then listToDrop else
match listToDrop with
|[] -> listToDrop
|listToDropHead::listToDropTail -> dropHalf listToDropTail shadowTail
let rec dropHalf listToDrop shadowList =
match shadowList with
| [] -> listToDrop
| shadowHead2::[] -> listToDrop (* odd number! *)
| shadowHead1::shadowHead2::shadowTail ->
match listToDrop with
| [] -> listToDrop (* should never happen? *)
| listToDropHead::listToDropTail -> dropHalf listToDropTail shadowTail
i'm afraid i don't use F#, but it's similar to ocaml, so hopefully the following is close to what you're looking for (maybe the comment format has changed?!). the idea is that when you exhaust the shadow you're done. your code was almost there, but the test for length on the shadow tail made no sense.
i want to emphasize that this isn't anything like anyone would write "in real life", but it sounds like you're battling with some weird requirements.
Because you use the shadow list with the same length as the original list and remove elements from these lists with different rates, it's better to create an auxiliary function:
let dropHalf xs =
let rec dropHalf' ys zs =
match ys, zs with
| _::_::ys', _::zs' -> dropHalf' ys' zs'
| _, zs' -> zs' (* One half of the shadow list ys *)
dropHalf' xs xs
If you don't care to traverse the list twice, the following solution is simpler:
let rec drop n xs =
match xs, n with
| _ when n < 0 -> failwith "n should be greater or equals to 0"
| [], _ -> []
| _, 0 -> xs
| _::xs', _ -> drop (n-1) xs'
let dropHalf xs =
xs |> drop (List.length xs/2)
and another simple solution needs some extra space but doesn't have to use recursion:
let dropHalf xs =
let xa = Array.ofList xs
xa.[xa.Length/2..] |> List.ofArray
As a general rule of thumb, if you're calling Length on a list, then there is most likely a better way to do what you're doing. Length has to iterate the entire list and is therefore O(n).
let secondHalf list =
let rec half (result : 'a list) = function
| a::b::sx -> half result.Tail sx
// uncomment to remove odd value from second half
// | (a::sx) -> result.Tail
| _ -> result
half list list
Here is a sample does what you described.
open System
open System.Collections.Generic
let items = seq { 1 .. 100 } |> Seq.toList
let getBackOfList ( targetList : int list) =
if (targetList.Length = 0) then
targetList
else
let len = targetList.Length
let halfLen = len / 2
targetList |> Seq.skip halfLen |> Seq.toList
let shortList = items |> getBackOfList
("Len: {0}", shortList.Length) |> Console.WriteLine
let result = Console.ReadLine()
Hope this helps

Recursive lambdas in F#

Take this example code (ignore it being horribly inefficient for the moment)
let listToString (lst:list<'a>) = ;;' prettify fix
let rec inner (lst:list<'a>) buffer = ;;' prettify fix
match List.length lst with
| 0 -> buffer
| _ -> inner (List.tl lst) (buffer + ((List.hd lst).ToString()))
inner lst ""
This is a common pattern I keep coming across in F#, I need to have an inner function who recurses itself over some value - and I only need this function once, is there in any way possible to call a lambda from within it self (some magic keyword or something) ? I would like the code to look something like this:
let listToString2 (lst:list<'a>) = ;;' prettify fix
( fun
(lst:list<'a>) buffer -> match List.length lst with ;;' prettify fix
| 0 -> buffer
| _ -> ##RECURSE## (List.tl lst) (buffer + ((List.hd lst).ToString()))
) lst ""
But as you might expect there is no way to refer to the anonymous function within itself, which is needed where I put ##RECURSE##
Yes, it's possible using so called y-combinators (or fixed-point combinators). Ex:
let rec fix f x = f (fix f) x
let fact f = function
| 0 -> 1
| x -> x * f (x-1)
let _ = (fix fact) 5 (* evaluates to "120" *)
I don't know articles for F# but this haskell entry might also be helpful.
But: I wouldn't use them if there is any alternative - They're quite hard to understand.
Your code (omit the type annotations here) is a standard construct and much more expressive.
let listToString lst =
let rec loop acc = function
| [] -> acc
| x::xs -> loop (acc ^ (string x)) xs
loop "" lst
Note that although you say you use the function only once, technically you refer to it by name twice, which is why it makes sense to give it a name.

Resources