How do I print out year with average number recursively in F#?

Okay, so I have been stuck on this for a couple of days: I am trying to print each year together with the average of the values on its line of my text file. I asked a similar question a couple of days ago (How do I print out lines recursively from a text file along with the average value of total elements from per line?), so this is essentially a follow-up.
Since then I have written several functions, and here is my new question: why does my program's output look like the picture below? I have left a couple of questions as comments in my code. I was expecting output like
2010: 3.5788888
2009: 4.697858
and so on, recursively, for every line in the file.
Here is my updated code:
let ReadFile filename =
    [ for line in System.IO.File.ReadLines(filename) -> line ]

let ParseLine (line:string) =
    let strings = line.Split('\t')
    let strlist = Array.toList(strings)
    let year = System.Int32.Parse(strlist.Head)
    let values = List.map System.Double.Parse strlist.Tail
    (year, values)

let rec print (year, values) =
    if values = [] then
        ()
    else
        printfn "%A: %A" year values.Head
        print (year, values.Tail)

let avg (values:double list) = //this function can compute the average, but it wont work when I do in main, print(firstYear, avg (firstYear1))
    let rec sum values accum =
        match values with
        | [] -> accum
        | head :: tail -> sum tail (accum + head/12.0)
    sum values 0.0

let rec sum (year, values:double list) =
    if values = [] then
        0.0
    else
        values.Head + sum (year, values.Tail)

[<EntryPoint>]
let main argv =
    // read entire file as list of strings:
    let file = ReadFile "rainfall-midway.txt"
    printfn "** Rainfall Analysis Program **"
    printfn ""
    // let's parse first line into tuple (year, list of rainfall values),
    // and then print for debugging:
    let (year, values) = ParseLine file.Head
    let firstYear = file.Head
    let firstYear1 = file.Tail
    //let data = List.map ParseLine file //I know map would be the key, but how does this work with year and its elements?
    //let firstYear = data.Head
    //let firstYear = data.Head
    //print firstYear
    print (firstYear, firstYear1)
    //let S = sum firstYear
    //printfn "%A" S
    //let A = S / 12.0
    //printfn "%A" A
    // done:
    printfn ""
    printfn ""
    0 // return 0 => success

The code you have is actually quite close to giving you the data you expect. There are a couple of changes you could make to simplify things.
First, to answer your question:
Why does my program's output looks like this in the picture below?
This is because you are printing the year followed by every parsed value rather than their average. (This doesn't quite match the posted code, which just prints the lines of the file.) An easy way to resolve this is to have the ParseLine function calculate the average. You will need to move avg above ParseLine, because F# requires a function to be defined before it is used, but that should not be a problem. Note that avg divides by a fixed 12.0, so it assumes twelve values per line.
let avg (values:double list) =
    let rec sum values accum =
        match values with
        | [] -> accum
        | head :: tail -> sum tail (accum + head/12.0)
    sum values 0.0

let ReadFile filename =
    [ for line in System.IO.File.ReadLines(filename) -> line ]

let ParseLine (line:string) =
    let strings = line.Split('\t')
    let strlist = Array.toList(strings)
    let year = System.Int32.Parse(strlist.Head)
    let values = List.map System.Double.Parse strlist.Tail
    (year, avg values) // calculate avg here
Once that is done, you can use a map to run ParseLine on all lines from the file.
let result = file |> List.map ParseLine
Then to print out the results you need only iterate through the result list.
result |> List.iter(fun (year, avgRainfall) -> printfn "%i: %f" year avgRainfall)
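Since the question asks for recursion specifically, the List.iter call above could also be written as an explicit recursive function. This is only a minimal sketch: the name printAverages and the sample values are made up for illustration, and it assumes the list holds (int * float) tuples like the ones ParseLine now returns.

```fsharp
// Hypothetical helper: prints one "year: average" line per element, recursively.
let rec printAverages (results: (int * float) list) =
    match results with
    | [] -> ()                          // nothing left to print
    | (year, average) :: rest ->
        printfn "%i: %f" year average   // print the head...
        printAverages rest              // ...then recurse on the tail

// Made-up sample values, only to show the call shape.
printAverages [ (2010, 3.5); (2009, 4.25) ]
```

The recursive call is the last thing the function does, so it is a tail call and should not grow the stack on long lists.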
That said, we could remove the sum and avg functions altogether and use a fold in our ParseLine function instead.
let ParseLine (line:string) =
    let strings = line.Split('\t')
    let strlist = Array.toList(strings)
    let year = System.Int32.Parse(strlist.Head)
    year, (strlist.Tail |> List.fold(fun state el -> (System.Double.Parse el + state)) 0.0) / float strlist.Tail.Length
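As a further variation, List.averageBy can do the summing and dividing in one step. The following is a sketch under some assumptions rather than a drop-in replacement: the name parseLineAvg and the two sample lines are hypothetical, every line is assumed to hold a year followed by at least one tab-separated value, and numbers are parsed with the invariant culture.

```fsharp
open System
open System.Globalization

// Hypothetical sample data standing in for the lines of the input file.
let sampleLines = [ "2010\t1.0\t2.0\t3.0"; "2009\t4.0\t6.0" ]

// Hypothetical variant of ParseLine: returns (year, average of the values).
let parseLineAvg (line: string) : int * float =
    match line.Split('\t') |> Array.toList with
    | year :: values when not values.IsEmpty ->
        let average =
            values
            |> List.averageBy (fun (s: string) ->
                Double.Parse(s, CultureInfo.InvariantCulture))
        (int year, average)
    | _ -> failwithf "Malformed line: %s" line

let sampleResults = sampleLines |> List.map parseLineAvg
sampleResults |> List.iter (fun (year, average) -> printfn "%i: %f" year average)
```

Unlike the fixed division by 12.0, this divides by however many values the line actually contains.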
If you don't want to change the ParseLine function then you can do the following:
let result = file |> List.map(fun el ->
    let (year, values) = ParseLine el
    (year, avg values))

Related

How to read pair by pair from a file in SML?

I want to read N pairs from a file and store them as tuples in a list. For example, if I have these 3 pairs: 1-2, 7-3, 2-9, I want my list to look like this: [(1,2),(7,3),(2,9)]
I tried something like this:
fun ex filename =
  let
    fun readInt input = Option.valOf (TextIO.scanStream (Int.scan StringCvt.DEC) input)
    val instream = TextIO.openIn filename
    val T = readInt instream (*number of pairs*)
    val _ = TextIO.inputLine instream
    fun read_ints2 (x,acc) =
      if x = 0 then acc
      else read_ints2(x-1,(readInt instream,readInt instream)::acc)
  in
    ...
  end
When I run it I get an exception :/ What's wrong?
I came up with this solution. It reads a single line from the given file. In processing the text, it strips away anything that is not a digit, creating a single flat list of chars. Then it splits the flat list of chars into a list of pairs, converting the chars to ints in the process. Because every digit is converted on its own, it only handles single-digit numbers. I'm sure it could be improved.
fun readIntPairs file =
  let val is = TextIO.openIn file
  in
    case (TextIO.inputLine is)
      of NONE => ""
       | SOME line => line
  end

fun parseIntPairs data =
  let val cs = (List.filter Char.isDigit) (explode data)
      fun toInt c =
        case Int.fromString (str c)
          of NONE => 0
           | SOME i => i
      fun part [] = []
        | part [x] = []
        | part (x::y::zs) = (toInt x,toInt y)::part(zs)
  in
    part cs
  end

parseIntPairs (readIntPairs "pairs.txt");

F# recursion behavior

I recently started learning F# and because I am quite new to most of the functional concepts I tend to want to write small examples for myself and check my premises with the results of the test.
Now I can't seem to understand the result of the following code and why it behaves as it does. The use case: I roll four six-sided dice and only return their total when the sum is greater than 20.
This is my code:
let rnd = System.Random()
let d6 () = rnd.Next(1, 7)

let rec foo () =
    // create a list of 4 d6 throws and print out the list
    let numbers = seq { for i in 1 .. 4 -> d6() }
    numbers |> Seq.iter( fun n -> printf "%i " n )
    printfn "\n"
    // sum the list and return the sum only when the sum is greater than 20
    let total = numbers |> Seq.sum
    match total with
    | n when n < 21 -> foo ()
    | _ -> total
Now when you run this you will find that this will eventually return a number greater than 20.
When you look at the output you will find that it did not print out the last list of numbers and I can't figure out why.
Sequences are lazily evaluated and are not cached. What happens here is that a sequence with a side effect (the random roll) is evaluated multiple times.
The first evaluation yields the first sequence of random numbers:
numbers |> Seq.iter( fun n -> printf "%i " n )
The second call runs the evaluation again, producing a completely different sequence:
let total = numbers |> Seq.sum
What you need to do if you want to keep the first evaluation around to run through it multiple times is either materialize the sequence or cache it:
// create a list directly
let numbers = [ for i in 1 .. 4 -> d6() ]
// or create a list from sequence
let numbers = seq { for i in 1 .. 4 -> d6() } |> List.ofSeq
// or cache the sequence
let numbers = seq { for i in 1 .. 4 -> d6() } |> Seq.cache
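To make the re-evaluation visible without random numbers, here is a small sketch that counts how many elements a sequence produces. The counter and the names produced, withoutCache, and withCache are made up for this illustration; the counter is a ref cell so the sequence expression can update it.

```fsharp
// Counts how many elements have been produced so far.
let produced = ref 0

let numbers =
    seq {
        for i in 1 .. 4 do
            produced.Value <- produced.Value + 1   // side effect on every element
            yield i
    }

numbers |> Seq.iter ignore       // first enumeration: 4 elements produced
numbers |> Seq.sum |> ignore     // second enumeration: 4 more
let withoutCache = produced.Value

let cached = numbers |> Seq.cache
cached |> Seq.iter ignore        // produces 4 elements and stores them
cached |> Seq.sum |> ignore      // served from the cache, nothing new produced
let withCache = produced.Value - withoutCache

printfn "without cache: %d, with cache: %d" withoutCache withCache
```

The uncached sequence runs its body eight times for two passes, while the cached one runs it only four times, which is why the printed dice and the summed dice differ in the question.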

F#- AsyncSeq - how to return values in a list

I am attempting to find anagrams in a list of words using F# async sequences (I am aware there are better algorithms for finding anagrams, but I am trying to understand async sequences).
Given the 'runTest' below, how can I
1. async read the collection returned and output to screen
2. block until all results return & display final count/collection
open System
open System.ServiceModel
open System.Collections.Generic
open Microsoft.FSharp.Linq
open FSharp.Control

[<Literal>]
let testWord = "table"
let testWords = new List<string>()
testWords.Add("bleat")
testWords.Add("blate")
testWords.Add("junk")

let hasWord (word:string) =
    let mutable res = true
    let a = testWord.ToCharArray() |> Set.ofArray
    let b = word.ToCharArray() |> Set.ofArray
    let difference = Set.intersect a b
    match difference.Count with
    | 0 -> false
    | _ -> true

let test2 (words:List<string>, (word:string)) : AsyncSeq<string> =
    asyncSeq {
        let res =
            (words)
            |> Seq.filter(fun x-> (hasWord(x)) )
            |> AsyncSeq.ofSeq
        yield! res
    }

let runTest = test2(testWords,testWord)
              |> //pull stuff from stream
              |> // output to screen
              |> ignore
()
Your test2 function returns an asyncSeq, so taking your questions in turn:
1. async read the collection returned and output to screen
If you want to run some side-effecting code (such as outputting to the screen), you can use AsyncSeq.iter to apply a function to each item as it becomes available. AsyncSeq.iter returns an Async<unit>, which you can then "kick off" using an appropriate Async method (blocking or non-blocking).
For example:
let processItem i =
    // Do whatever side effecting code you want to do with an item
    printfn "Item is '%s'" i

let runTestQ1 =
    test2 (testWords, testWord)
    |> AsyncSeq.iter processItem
    |> Async.RunSynchronously
2. block until all results return & display final count/collection
If you want all the results collected so that you can work on them together, then you can convert the AsyncSeq into a normal Seq using AsyncSeq.toBlockingSeq and then convert it to a list to force the Seq to evaluate.
For example:
let runTestQ2 =
    let allResults =
        test2 (testWords, testWord)
        |> AsyncSeq.toBlockingSeq
        |> Seq.toList
    // Do whatever you would like with your list of results
    printfn "Final list is '%A' with a count of %i" allResults (allResults.Length)

Text Parsing and Nested Collection Transposition in F#

I parse data from a csv file that looks like this:
X,..,..,Dx,..,..
Y,..,..,Dy,..,..
X,..,..,Dx,..,..
Y,..,..,Dy,..,..
X,..,..,Dx,..,..
Y,..,..,Dy,..,..
Each row is an element of an array of a type I defined and used with FileHelpers. This probably isn't relevant, but I'm including it in case someone knows a trick I could use at this stage of the process with FileHelpers.
I'm only interested in the pairs X,Dx and Y,Dy.
The data could have more than just X and Y, e.g. (X,Dx); (Y,Dy); (Z,Dz); ...
I'll call the number of letters nL.
The goal is to get the averages of Dx, Dy, ... for each group by processing an array of all the D's (Ds), which has SUM(nIterations) * nL elements.
I have a list of numbers of iterations:
let nIterations = [2000; 2000; 2000; 1000; 500; 400; 400; 400; 300; 300]
For each of these numbers, I will have that many "letter groups", so the rows of interest for nIterations.[0] are rows 0 to (nIterations.[0] * nL).
To get the rows of interest for nIterations.[i], I make a list "nis", which is the result of a scan over nIterations:
let nis = List.scan (fun x e -> x + e) 0 nIterations
Then, to isolate the nIterations.[i] group:
let group = Array.sub Ds (nis.[i]*nL) (nIterations.[i]*nL)
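To illustrate what the scan produces, here is a sketch with smaller, made-up numbers than the ones in the question. The value ranges is a hypothetical name introduced only to show the start index and length that would be passed to Array.sub for each group.

```fsharp
// Made-up, smaller inputs so the results are easy to check by hand.
let nIterations = [ 2; 3; 1 ]
let nL = 2   // e.g. only the letters X and Y

// Running totals: nis.[i] is the number of letter groups before group i.
let nis = List.scan (+) 0 nIterations
// nis = [0; 2; 5; 6]

// (start index, length) into the flat array for each group i.
let ranges = nIterations |> List.mapi (fun i ni -> (nis.[i] * nL, ni * nL))
// ranges = [(0, 4); (4, 6); (10, 2)]

printfn "%A" nis
printfn "%A" ranges
```

The scan result has one more element than nIterations; the final entry is the total number of letter groups.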
Here's the whole thing:
nIterations |> List.mapi (fun i ni ->
    let igroup = Array.sub Ds (nis.[i]*nL) (ni*nL)
    let groupedbyLetter = (chunk nL igroup)
    let sums = seq { for idx in 0..(nL - 1) do
                        let d = seq { for g in groupedbyLetter do
                                        yield (Seq.head (Seq.skip idx g)) }
                        yield d |> Seq.sum }
    sums |> Seq.map (fun x -> (x / (float ni))) ) |> List.ofSeq
That "chunk" function is one I found on SO:
let rec chunk n xs =
    if Seq.isEmpty xs then Seq.empty
    else
        let (ys,zs) = splitAt n xs
        Seq.append (Seq.singleton ys) (chunk n zs)
I have verified this works, and gets me what I want - a size nL collection of size nIterations.Length collections.
The problem is speed: this only works on small data sets, and the sizes in the example I've given are too big. It gets "hung" in the chunk function.
So my question is: how do I go about improving the speed of this whole process, and/or what is the best (or at least a better) way to do that "transposition"?
I figure I could:
try to rearrange the data as I'm reading it in
try to index the elements directly
try breaking the process into smaller stages or "passes"
???
I got it.
let averages =
    (nIterations |> List.mapi (fun i ni ->
        let igroup = Array.sub Ds (nis.[i]*nL) (ni*nL)
        let groupedbyLetter =
            [| for a in 1..nL..igroup.Length do
                 yield igroup.[(a - 1)..(a - 1)+(nL-1)] |]
        [| for i in 0..(nL - 1) do
             yield [| for j in 0..(groupedbyLetter.Length - 1) do
                        yield groupedbyLetter.[j].[i] |]
                   |> Array.average |]) )

let columns = [| for i in 0..(nL - 1) do
                   yield [| for j in 0..(nIterations.Length - 1) do
                              yield averages.[j].[i] |]
              |]
The "columns" value just transposes the data again so I can easily print:
----Average Ds----
nIterations X Y Z
2000 0.2 0.7 1.2
... ... ... ...
... ... ... ...
e.g. averages returns
[[x1,y1,z1,..], [x2,y2,z2,..], ... ]
and columns gives me
[ [x1,x2,..], [y1,y2,..], [z1,z2,..], ...]
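For reference, recent versions of FSharp.Core ship Array.chunkBySize and Array.transpose, which cover the chunking and transposition steps directly. The following is a hedged sketch, not the code from the post: columnAverages and the sample array are hypothetical, and it assumes the array length is an exact multiple of nL (Array.transpose throws if the rows differ in length).

```fsharp
// Hypothetical helper: averages each "column" (letter) of a flat array
// laid out as X, Y, ..., X, Y, ... with nL letters per group.
let columnAverages (nL: int) (ds: float[]) : float[] =
    ds
    |> Array.chunkBySize nL      // one row per letter group
    |> Array.transpose           // one row per letter
    |> Array.map Array.average   // average of each letter's values

// Made-up data: two groups of (X, Y) values.
let sample = [| 1.0; 10.0; 3.0; 20.0 |]
let sampleAverages = columnAverages 2 sample
// sampleAverages = [| 2.0; 15.0 |]

printfn "%A" sampleAverages
```

Because this works on arrays by index rather than repeatedly skipping through sequences, it avoids the repeated traversal that made the sequence-based chunk slow.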

Foldl return a Tuple in SML?

The problem I'm working on needs to take in a list of integers and return the average of those numbers. It needs to fit a specific format that looks like this...
fun average (n::ns) =
  let
    val (a,b) = fold? (?) ? ?
  in
    real(a) / real(b)
  end;
I'm only allowed to replace the question marks and cannot use any built-in functions. I have a working solution, but it doesn't adhere to these rules.
fun average (n::ns) =
  let
    val (a,b) = ((foldl (fn(x, y)=>(x+y)) n ns), length(ns)+1)
  in
    real(a) / real(b)
  end;
So, is there a way to make a fold function return a tuple? Something like this is what I want it to do, but obviously I can't do this...
val (a,b) = ((foldl (fn(x, y)=>(x+y), count++) n ns)
The return type of foldl is the type of the initial accumulator, so the idea is to pass a tuple holding the running sum and the count of elements:
fun average (n::ns) =
  let
    val (a, b) = foldl (fn (x, (sum, count)) => (sum+x, count+1)) (n, 1) ns
  in
    real(a) / real(b)
  end
Notice that your solution fails if the list is empty. It is better to add another case for the empty list (either returning 0.0 or raising a custom exception):
fun average [] = 0.0
  | average (n::ns) = (* the same as above *)
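For comparison with the F# questions on this page, the same tuple-accumulator idea can be sketched in F#. This is an illustration under the same convention as above (0.0 for an empty list), not part of the SML exercise; note that List.fold passes the accumulator first and the element second, the reverse of SML's foldl.

```fsharp
// Hypothetical F# counterpart: the fold carries (sum, count) as its accumulator.
let average (numbers: int list) : float =
    match numbers with
    | [] -> 0.0   // same convention as the empty-list case above
    | _ ->
        let (sum, count) =
            numbers |> List.fold (fun (s, c) x -> (s + x, c + 1)) (0, 0)
        float sum / float count

printfn "%f" (average [ 1; 2; 3; 4 ])   // average of 1..4 is 2.5
```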
