Text Parsing and Nested Collection Transposition in F# - collections

I parse data from a csv file that looks like this:
X,..,..,Dx,..,..
Y,..,..,Dy,..,..
X,..,..,Dx,..,..
Y,..,..,Dy,..,..
X,..,..,Dx,..,..
Y,..,..,Dy,..,..
Each row is an element of an array of a type I defined and used with FileHelpers. This probably isn't relevant, but I'm including it in case someone knows a trick I could use at this stage of the process with FileHelpers.
I'm only interested in pairs X,Dx and Y,Dy
The data could have more than just X & Y eg.. (X,Dx); (Y,Dy); (Z,Dz); ...
I'll call the number of letters nL
The goal is to get the averages of Dx, Dy, ... for each group by processing an array of all D's which has SUM(nIterations) * nL elements.
I have a list of numbers of iterations:
let nIterations = [2000; 2000; 2000; 1000; 500; 400; 400; 400; 300; 300]
And for each of these numbers, I will have that many "letter groups." So the rows of data of interest for nIterations.[0], are rows 0 to (nIterations.[0] * nL)
To get the rows of interest for nIterations.[i], I make a list "nis" which is the result of a scan operation performed on nIterations.
let nis = List.scan (fun x e -> x + e) 0 nIterations
Then, to isolate the nIterations.[i] group:
let group = Array.sub Ds (nis.[i]*nL) (nIterations.[i]*nL)
Here's the whole thing:
nIterations |> List.mapi (fun i ni ->
    let igroup = Array.sub Ds (nis.[i]*nL) (ni*nL)
    let groupedbyLetter = (chunk nL igroup)
    let sums = seq { for idx in 0..(nL - 1) do
                        let d = seq { for g in groupedbyLetter do
                                        yield (Seq.head (Seq.skip idx g)) }
                        yield d |> Seq.sum }
    sums |> Seq.map (fun x -> (x / (float ni))) ) |> List.ofSeq
That "chunk" function is one I found on SO:
let rec chunk n xs =
    if Seq.isEmpty xs then Seq.empty
    else
        let (ys,zs) = splitAt n xs
        Seq.append (Seq.singleton ys) (chunk n zs)
I have verified this works, and gets me what I want - a size nL collection of size nIterations.Length collections.
The problem is speed - this only works on small data sets; the sizes I'm working with in the example I've given are too big. It gets "hung" at the chunk function.
So my question is: how do I go about improving the speed of this whole process? And/or, what is the best (or at least a better) way to do that "transposition"?
I figure I could:
try to rearrange the data as I'm reading it in
try to index the elements directly
try breaking the process into smaller stages or "passes"
???

I got it.
let averages =
    (nIterations |> List.mapi (fun i ni ->
        let igroup = Array.sub Ds (nis.[i]*nL) (ni*nL)
        let groupedbyLetter =
            [| for a in 1..nL..igroup.Length do
                yield igroup.[(a - 1)..(a - 1)+(nL-1)] |]
        [| for i in 0..(nL - 1) do
            yield [| for j in 0..(groupedbyLetter.Length - 1) do
                        yield groupedbyLetter.[j].[i] |]
                  |> Array.average |]) )
let columns = [| for i in 0..(nL - 1) do
                    yield [| for j in 0..(nIterations.Length - 1) do
                                yield averages.[j].[i] |]
              |]
The "columns" value just transposes the data again so I can easily print it:
----Average Ds----
nIterations X Y Z
2000 0.2 0.7 1.2
... ... ... ...
... ... ... ...
e.g. averages returns
[[x1,y1,z1,..], [x2,y2,z2,..], ... ]
and columns gives me
[ [x1,x2,..], [y1,y2,..], [z1,z2,..], ...]
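One of the options listed in the question, indexing the elements directly, avoids building the intermediate chunks altogether. The following is only a sketch of that idea, assuming Ds is a float array laid out as described (nL consecutive values per iteration); the name groupAverages and the sample data are made up for illustration.

```fsharp
// Hypothetical helper (not from the original post): per-letter averages
// for each iteration group, read straight out of the flat array.
let groupAverages (nL: int) (nIterations: int list) (ds: float[]) =
    // Running total of iterations gives the first letter group of each block.
    let starts = List.scan (+) 0 nIterations
    nIterations
    |> List.mapi (fun i ni ->
        let offset = starts.[i] * nL
        Array.init nL (fun letter ->
            let mutable sum = 0.0
            for k in 0 .. ni - 1 do
                // Element for this letter in the k-th iteration of the block.
                sum <- sum + ds.[offset + k * nL + letter]
            sum / float ni))

// Two letters (X, Y); one block of 2 iterations, then one block of 1.
let sample = [| 1.0; 10.0; 3.0; 30.0; 5.0; 50.0 |]
let result = groupAverages 2 [2; 1] sample
// result = [ [|2.0; 20.0|]; [|5.0; 50.0|] ]
```

Each element of the input is visited exactly once, and no sub-arrays are allocated besides the nL-sized result per block.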

Related

Is it possible to use List.unfold to list all factors of N?

I'm trying to wrap my head around functional programming using F#. I'm sticking to purely mathematical problems for now.
My current problem is simple enough: to write a function that takes an integer N and outputs a list of all the factors of N
Because of the similarities between sequences and C# IEnumerables formed by yield return I got this solution:
let seqFactorsOf n =
seq { for i in 2 .. (n / 2) do if n % i = 0 then yield i }
I don't think lists can be generated that way, though, so I turned to List.unfold:
let listFactorsOf n =
    2 |> List.unfold (fun state ->
        if state <= n / 2 then
            if state % 2 = 0 then
                Some (state, state + 1)
            else
                //need something here to appease the compiler. But what?
        else
            None)
My other attempt uses the concept of matching, with which I'm almost totally unfamiliar:
let listFactorsOf_2 n =
    2 |> List.unfold(fun state ->
        match state with
        | x when x > n / 2 -> None
        | x when n % x = 0 -> Some(x, x + 1)
        //I need a match for the general case or I get a runtime error
        )
Is there a way to create such list using List.unfold? Please notice that I'm a beginner (I started F# 3 days ago) and the documentation is not very kind to newbies, so if you'd try to be as didactic as possible I would appreciate it a lot.
First - yes, of course lists can be generated using that for..in syntax (it's called "list comprehensions" by the way). Just put the whole thing in square brackets instead of seq { }:
let seqFactorsOf n =
    [ for i in 2 .. (n / 2) do if n % i = 0 then yield i ]
As for unfold - every iteration is required to either produce an element of the resulting list (by returning Some) or to signal end of iteration (by returning None). There is nothing you can return from the body of unfold to indicate "skipping" the element.
Instead, what you have to do is to somehow "skip" the unwanted elements yourself, and only ever return the next divisor (or None).
One way to do that is with a helper function:
let rec nextDivisor n i =
    if i > n/2 then None // past the upper bound: stop, even if i divides n (e.g. n = 2)
    else if n % i = 0 then Some i
    else nextDivisor n (i+1)
Let's test it out:
nextDivisor 16 3
> Some 4
nextDivisor 16 5
> Some 8
nextDivisor 16 10
> None
Now we can use that in the body of unfold:
let listFactorsOf n =
    2 |> List.unfold (fun state ->
        match nextDivisor n state with
        | Some d -> Some (d, d + 1)
        | None -> None
    )
As a bonus, the construct match x with Some a -> f a | None -> None is a well-known and widely used concept usually called "map". In this particular case - it's Option.map. So the above can be rewritten like this:
let listFactorsOf n =
    2 |> List.unfold (fun state ->
        nextDivisor n state
        |> Option.map (fun d -> d, d+1)
    )
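A quick way to gain confidence in the unfold version is to compare it against the list comprehension over a range of inputs. The sketch below restates the helper, with the bound check first, so that it is self-contained; factorsByComprehension and agree are names chosen here for illustration.

```fsharp
// Skips non-divisors; the bound is checked first so that any i > n/2 ends the search.
let rec nextDivisor n i =
    if i > n / 2 then None
    elif n % i = 0 then Some i
    else nextDivisor n (i + 1)

// Each unfold step jumps straight to the next divisor, or stops with None.
let listFactorsOf n =
    2 |> List.unfold (fun state ->
        nextDivisor n state |> Option.map (fun d -> d, d + 1))

// The list comprehension serves as the reference result.
let factorsByComprehension n =
    [ for i in 2 .. (n / 2) do if n % i = 0 then yield i ]

// True when both versions agree for every n from 1 to 100.
let agree =
    [ 1 .. 100 ] |> List.forall (fun n -> listFactorsOf n = factorsByComprehension n)

let factorsOf36 = listFactorsOf 36
// factorsOf36 = [2; 3; 4; 6; 9; 12; 18]
```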

Functional Pattern to check Connect 4 board for winner

I want to learn some functional style programming, so I want to write a little Connect 4 engine.
Given a board I want to determine if a player has won in that board state, so I need a function
let winner (board : Board) : Player option = ???
'Usually' one could simply loop through the rows, the columns, and the diagonals, and as soon as we find a winner we return whoever we found and 'break out'. I'm not sure if something like that is even possible in F#.
In my current implementation I am using a helper function which takes a list of board cells and checks if there are four consecutive cells belonging to PlayerA or PlayerB. It returns a Player option type.
Then in my main 'winner' function I check if there is a winner in the rows, if yes, return that Player, if None, check the columns, etc.
So basically I am doing a lot of matching and stuff, and it seems to me like this should be easier to do with some kind of bind, but I wouldn't know how.
So how would one approach this problem in functional style?
EDIT: Some Code Snippets
These are my basic types
type Player =
    | PlayerA
    | PlayerB
type Cell =
    | Empty
    | Occupied of Player
type Board = Cell [] list
// Cell [] [] would probably be better, but some things were easier when I could use pattern matching x :: xs for lists
Here are some helper functions. This already seems like too much.
let rec getFours (l: 'a list):'a list list =
    if List.length l < 4 then
        [[]]
    elif List.length l = 4 then
        [l]
    else
        match l with
        | [] -> [[]]
        | x::xs -> [[x;l.[1];l.[2];l.[3]]] @ getFours xs
let quadToPlayer quad=
    if List.forall (fun x -> x = Occupied PlayerA) quad then
        Some PlayerA
    elif List.forall (fun x -> x = Occupied PlayerB) quad then
        Some PlayerB
    else
        None
let rowWinnerHelper (row : Cell []) : Player option=
    if Array.length row <4 then
        None
    else
        let possibleWinners = getFours (List.ofArray row) |> List.map quadToPlayer
        if List.exists (fun x -> x = Some PlayerA) possibleWinners then
            Some PlayerA
        elif List.exists (fun x -> x = Some PlayerB) possibleWinners then
            Some PlayerB
        else
            None
let rowWinner (board:Board) : Player option =
    let rowWinners = List.map rowWinnerHelper board
    if List.exists (fun x -> x = Some PlayerA) rowWinners then
        Some PlayerA
    elif List.exists (fun x -> x = Some PlayerB) rowWinners then
        Some PlayerB
    else
        None
What I don't like for example is that I am computing possible winners for all rows and all quadruples in each row etc. Instead of just stopping once I found the first winning Player.
You could improve your getFours by computing whether it's a win immediately rather than building lists.
// Defined first, because getFours below depends on it.
let quadToPlayer player quad =
    List.forall (fun x -> x = Occupied player) quad
let rec getFours player (l: Cell list): bool =
    if List.length l < 4 then
        false
    elif List.length l = 4 then
        quadToPlayer player l
    else
        match l with
        | [] -> false
        | x::xs -> (quadToPlayer player [x; l.[1];l.[2];l.[3]]) || (getFours player xs)
Alternatively, if you have a fixed board size, you can precompute the winning patterns and bitmask against them. This can improve performance significantly.
Encode each player's moves into its own bit array, perhaps using a long depending on the size of your board. The example below is for tic-tac-toe.
let white,black = board
let winningPatterns =
    [
        0b111000000; // horizontal
        0b000111000;
        0b000000111;
        0b100100100; // vertical
        0b010010010;
        0b001001001;
        0b100010001; // diagonal
        0b001010100 ]
let whiteWin = winningPatterns
               |> Seq.map( fun p -> white &&& p = p )
               |> Seq.reduce (||)
let blackWin = winningPatterns
               |> Seq.map( fun p -> black &&& p = p )
               |> Seq.reduce (||)
There is an Elm implementation of Connect 4 here.
Following ideas from there, I learned that fold does the trick, as it can simply keep track of how many consecutive pieces by one player we have seen.
let arrayWinner (row:Cell []) (player:Player) =
    Array.fold (fun count p->
        if count = 4 then
            4
        elif p = Occupied player then
            count + 1
        else
            0
        ) 0 row
    |> (=) 4
This can then be used in an 'exists'-check
let arrayOfArrayWinner (board:Cell [] []) (player:Player) =
    Array.exists (fun arr -> arrayWinner arr player) board
This bit of code accomplishes basically the same as the code snippet in the question.
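The "return as soon as we find a winner" behaviour asked about in the question can also be had from the standard library: Seq.tryPick stops at the first element for which the function returns Some. The following is a rough sketch under that idea; lineWinner and firstWinner are hypothetical names, and producing the sequence of rows, columns and diagonals is left out.

```fsharp
type Player = PlayerA | PlayerB
type Cell = Empty | Occupied of Player

// Hypothetical helper: Some player if the line contains four
// consecutive cells occupied by that player, otherwise None.
let lineWinner (line: Cell[]) : Player option =
    line
    |> Seq.windowed 4                // lazy sliding windows of length 4
    |> Seq.tryPick (fun w ->
        match w.[0] with
        | Occupied p when Array.forall ((=) (Occupied p)) w -> Some p
        | _ -> None)

// Stops at the first line that has a winner; later lines are not inspected.
let firstWinner (lines: seq<Cell[]>) : Player option =
    lines |> Seq.tryPick lineWinner

let a, e = Occupied PlayerA, Empty
let rows = [ [| e; a; a; e; a |]; [| e; a; a; a; a |] ]
let winnerInRows = firstWinner rows
// winnerInRows = Some PlayerA
```

Because both levels use tryPick on lazy sequences, evaluation ends at the first winning window, which addresses the concern about computing every quadruple of every row.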

F# Split Function

I'm building a merge sort function and my split method is giving me a value restriction error. I'm using 2 accumulating parameters, the 2 lists resulting from the split, that I package into a tuple in the end for the return. However I'm getting a value restriction error and I can't figure out what the problem is. Does anyone have any ideas?
let split lst =
    let a = []
    let b = []
    let ctr = 0
    let rec helper (lst,l1,l2,ctr) =
        match lst with
        | [] -> []
        | x::xs -> if ctr%2 = 0 then helper(xs, x::l1, l2, ctr+1)
                   else
                       helper(xs, l1, x::l2, ctr+1)
    helper (lst, a, b, ctr)
    (a,b)
Any input is appreciated.
The code, as you have written it, doesn't really make sense. F# uses immutable values by default, therefore your function, as it's currently written, can be simplified to this:
let split lst =
    let a = []
    let b = []
    (a,b)
This is probably not what you want. In fact, due to immutable bindings, there is no value in predeclaring a, b and ctr.
Here is a recursive function that will do the trick:
let split lst =
    let rec helper lst l1 l2 ctr =
        match lst with
        | [] -> l1, l2 // return accumulated lists
        | x::xs ->
            if ctr%2 = 0 then
                helper xs (x::l1) l2 (ctr+1) // prepend x to list 1 and increment
            else
                helper xs l1 (x::l2) (ctr+1) // prepend x to list 2 and increment
    helper lst [] [] 0
Instead of using a recursive function, you could also solve this problem using List.fold. fold is a higher-order function that generalises the accumulation process described explicitly in the recursive function above.
This approach is a bit more concise but very likely less familiar to someone new to functional programming, so I've tried to describe this process in more detail.
let split2 lst =
    /// Take a running total of each list and a index*value and return a new
    /// pair of lists with the supplied value prepended to the correct list
    let splitFolder (l1, l2) (i, x) =
        match i % 2 = 0 with
        |true -> x :: l1, l2 // return list 1 with x prepended and list2
        |false -> l1, x :: l2 // return list 1 and list 2 with x prepended
    lst
    |> List.mapi (fun i x -> i, x) // map list of values to list of index*values
    |> List.fold (splitFolder) ([],[]) // fold over the list using the splitFolder function
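Because both versions above prepend each element, the two resulting lists come out in reverse order. That may not matter for a merge sort, but if the original order should be kept, one option is to fold from the right instead. A small sketch, where splitOrdered is a name chosen here for illustration:

```fsharp
// Hypothetical variant: same alternating split, but List.foldBack walks the
// list from the end, so prepending rebuilds each half in its original order.
let splitOrdered lst =
    List.foldBack
        (fun (i, x) (l1, l2) ->
            if i % 2 = 0 then x :: l1, l2   // even index goes to the first list
            else l1, x :: l2)               // odd index goes to the second list
        (List.indexed lst)                  // pair every element with its index
        ([], [])

let halves = splitOrdered [1; 2; 3; 4; 5]
// halves = ([1; 3; 5], [2; 4])
```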

F# stop Seq.map when a predicate evaluates true

I'm currently generating a sequence in a similar way to:
migrators
|> Seq.map (fun m -> m())
The migrator function is ultimately returning a discriminated union like:
type MigratorResult =
    | Success of string * TimeSpan
    | Error of string * Exception
I want to stop the map once I encounter my first Error but I need to include the Error in the final sequence.
I have something like the following to display a final message to the user
match results |> List.rev with
| [] -> "No results equals no migrators"
| head :: _ ->
    match head with
    | Success (dt, t) -> "All migrators succeeded"
    | Error (dt, ex) -> "Migration halted owing to error"
So I need:
A way to stop the mapping when one of the map steps produces an Error
A way to have that error be the final element added to the sequence
I appreciate there may be a different sequence method other than map that will do this, I'm new to F# and searching online hasn't yielded anything as yet!
I guess there are multiple approaches here, but one way would be to use unfold:
migrators
|> Seq.unfold (fun ms ->
    match ms with
    | m :: tl ->
        match m () with
        | Success res -> Some (Success res, tl)
        | Error res -> Some (Error res, [])
    | [] -> None)
|> List.ofSeq
Note the List.ofSeq at the end; it is only there to realize the sequence. A different approach would be to use a sequence comprehension, which some might say results in clearer code.
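For reference, a sequence-comprehension version might look roughly like the sketch below. It assumes, as the unfold version does, that the migrators are held in an F# list of unit -> MigratorResult functions; runUntilError is a hypothetical name.

```fsharp
open System

type MigratorResult =
    | Success of string * TimeSpan
    | Error of string * Exception

// Hypothetical helper: runs migrators one by one, yields every result,
// and stops right after the first Error has been yielded.
let rec runUntilError (migrators: (unit -> MigratorResult) list) : seq<MigratorResult> =
    seq {
        match migrators with
        | [] -> ()
        | m :: rest ->
            let result = m ()
            yield result
            match result with
            | Success _ -> yield! runUntilError rest
            | Error _ -> ()   // the Error is already in the output; do not continue
    }

// Usage: the third migrator is never invoked.
let calls = ResizeArray<string>()
let mk name result =
    fun () ->
        calls.Add(name)
        result

let results =
    [ mk "a" (Success ("a", TimeSpan.Zero))
      mk "b" (Error ("b", Exception "boom"))
      mk "c" (Success ("c", TimeSpan.Zero)) ]
    |> runUntilError
    |> List.ofSeq
```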
The ugly things Tomaš alludes to are 1) mutable state, and 2) manipulation of the underlying enumerator. A higher-order function which returns up to and including when the predicate holds would then look like this:
module Seq =
    let takeUntil pred (xs : _ seq) = seq{
        use en = xs.GetEnumerator()
        let flag = ref true
        while !flag && en.MoveNext() do
            flag := not <| pred en.Current
            yield en.Current }
seq{1..10} |> Seq.takeUntil (fun x -> x % 5 = 0)
|> Seq.toList
// val it : int list = [1; 2; 3; 4; 5]
For your specific application, you'd map the cases of the DU to a boolean.
(migrators : seq<MigratorResult>)
|> Seq.takeUntil (function Success _ -> false | Error _ -> true)
I think the answer from @scrwtp is probably the nicest way to do this if your input is reasonably small (and you can turn it into an F# list to use pattern matching). I'll add one more version, which works when your input is just a sequence and you do not want to turn it into a list.
Essentially, you want to do something that's almost like Seq.takeWhile, but it gives you one additional item at the end (the one, for which the predicate fails).
To use a simpler example, the following returns all numbers from a sequence until one that is divisible by 5:
let nums = [ 2 .. 10 ]
nums
|> Seq.map (fun m -> m % 5)
|> Seq.takeWhile (fun n -> n <> 0)
So, you basically just need to look one element ahead. To do this, you could use Seq.pairwise, which gives you the current and the next element in the sequence:
nums
|> Seq.map (fun m -> m % 5)
|> Seq.pairwise // Get sequence of pairs with the next value
|> Seq.takeWhile (fun (p, n) -> p <> 0) // Look at the next value for test
|> Seq.mapi (fun i (p, n) -> // For the first item, we return both
    if i = 0 then [p;n] else [n]) // for all other, we return the second
|> Seq.concat
The only ugly thing here is that you then need to flatten the sequence again using mapi and concat.
This is not very nice, so a good thing to do would be to define your own higher-order function like Seq.takeUntilAfter that encapsulates the behavior you need (and hides all the ugly things). Then your code could just use the function and look nice & readable (and you can experiment with other ways of implementing this).

Homework help converting an iterative function to recursive

For an assignment, I have written the following code iteratively. It takes a list of a vector data type and a vector, and calculates the closeness of the two vectors. This method works fine, but I don't know how to write the recursive version.
let romulus_iter (x:vector list) (vec:vector) =
  let vector_close_hash = Hashtbl.create 10 in
  let prevkey = ref 10000.0 in (* Define previous key to be a large value since we initially want to set closefactor to prev key *)
  if List.length x = 0 then
    {a=0.;b=0.}
  else
    begin
      Hashtbl.clear vector_close_hash;
      for i = 0 to (List.length x)-1 do
        let vecinquestion = {a=(List.nth x i).a;b=(List.nth x i).b} in
        let closefactor = vec_close vecinquestion vec in
        if (closefactor < !prevkey) then
          begin
            prevkey := closefactor;
            Hashtbl.add vector_close_hash closefactor vecinquestion
          end
      done;
      Hashtbl.find vector_close_hash !prevkey
    end;;
The general recursive equivalent of
for i = 0 to (List.length x)-1 do
  f (List.nth x i)
done
is this:
let rec loop = function
  | x::xs -> f x; loop xs
  | [] -> ()
Note that just like a for-loop, this function only returns unit, though you can define a similar recursive function that returns a meaningful value (and in fact that's what most do). You can also use List.iter, which is meant just for this situation where you're applying an impure function that doesn't return anything meaningful to each item in the list:
List.iter f x

Resources