Deedle - Recursive mapping on a Series with an initial value - recursion

I am attempting to create a new time series (Series[DateTime,float]) from an existing one of the same type, where the map to the new Series is recursive - for example:
NewSeries_T = NewSeries_T-1 + constant * OldSeries_T;
I have "NewSeries_0 = 1", as an initialization value for the new series.
I'm attempting to write a Series.map function that will do the job - I've got as far as the following non-working code, but I can't figure out the recursive part:
let rec newSeries = existingSeries |> Series.map (fun k v ->
    match k.Equals(initDate) with
    | true -> 1
    | false -> newSeries.LastValue() + constant * v
)
So, I think the trick is, how do I allow the function access to the "previous" value in the series to build this up recursively?
Edit - Moved to answer below.

Based on Fyodor's recommendation, Series.scanValues does exactly what I need:
let initalEntry = Series([initDate], [init])
let newSeries =
    existingSeries
    |> Series.filter (fun k v -> k.Equals(initDate) = false)
    |> Series.scanValues (fun n x -> lambda * n + (1.0 - lambda) * x) init
newSeries.Merge(initalEntry)
I took away the first value, as I want this to return the "init" value at the start of the series, and merged this back at the end.
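For readers who want to see the scan idea outside Deedle, here is a minimal Python sketch of the recurrence from the question (NewSeries_T = NewSeries_T-1 + constant * OldSeries_T, seeded with an initial value). The function name recursive_series and its parameters are made up for illustration, and it works on a plain list rather than a keyed series:

```python
from itertools import accumulate


def recursive_series(old_values, init, constant):
    """Build new[0] = init and new[t] = new[t-1] + constant * old[t] for t >= 1."""
    # Skip old_values[0]: its slot in the result is taken by the initial value,
    # which is the same "drop the first entry, then put init back" trick as above.
    return list(
        accumulate(old_values[1:], lambda prev, x: prev + constant * x, initial=init)
    )


print(recursive_series([5.0, 2.0, 4.0], init=1.0, constant=0.5))
```

The `initial` argument of `accumulate` plays the role of the seed passed to the scan, so the result has the same length as the input.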

How does one get the first key,value pair from F# Map without knowing the key?

I know that the Map type is used to get a corresponding value given a key, e.g. find.
I also know that one can convert the map to a list and use List.Head, e.g.
List.head (Map.toList map)
I would like to do this
1. without a key
2. without knowing the types of the key and value
3. without using a mutable
4. without iterating through the entire map
5. without doing a conversion that iterates through the entire map behind the scenes, e.g. Map.toList, etc.
I am also aware that if one gets the first key,value pair it might not be of use because the map documentation does not note if using map in two different calls guarantees the same order.
If the code can not be written then an existing reference from a site such as MSDN explaining and showing why not would be accepted.
TLDR;
How I arrived at this problem was converting this function:
let findmin l =
    List.foldBack
        (fun (_,pr1 as p1) (_,pr2 as p2) -> if pr1 <= pr2 then p1 else p2)
        (List.tail l) (List.head l)
which is based on list and is used to find the minimum value in the associative list of string * int.
An example list:
["+",10; "-",10; "*",20; "/",20]
The list is used for parsing binary operator expressions that have precedence, where the string is the binary operator and the int is the precedence. Other functions are performed on the data such that using an F# map might be an advantage over a list. I have not decided on a final solution but wanted to explore this problem with map while it was still in the forefront.
Currently I am using:
let findmin m =
    if Map.isEmpty m then
        None
    else
        let result =
            Map.foldBack
                (fun key value (k,v) ->
                    if value <= v then (key,value)
                    else (k,v))
                m ("",1000)
        Some(result)
but here I had to hard-code the initial state ("",1000). It would be better to use the first value in the map as the initial state and pass the remainder of the map as the starting map, as was done with the list:
(List.tail l) (List.head l)
Yes, this amounts to partitioning the map, but that did not work, e.g.:
let infixes = ["+",10; "-",10; "*",20; "/",20]
let infixMap = infixes |> Map.ofList
let mutable test = true
let fx k v : bool =
    if test then
        printfn "first"
        test <- false
        true
    else
        printfn "rest"
        false
let (first,rest) = Map.partition fx infixMap
which results in
val rest : Map<string,int> = map [("*", 20); ("+", 10); ("-", 10)]
val first : Map<string,int> = map [("/", 20)]
which are two maps and not a key,value pair for first
("/",20)
Notes about answers
For practical purposes in precedence parsing, seeing the + operations before - in the final transformation is preferable, so returning + before - is desirable. This variation of the answer by marklam
let findmin (map : Map<_,_>) = map |> Seq.minBy (fun kvp -> kvp.Value)
achieves this, as does this variation by Tomas
let findmin m =
    Map.foldBack (fun k2 v2 st ->
        match st with
        | Some(k1, v1) when v1 < v2 -> st
        | _ -> Some(k2, v2)) m None
Seq.head does return the first item in the map, but one must be aware that the map is constructed with the keys sorted. For my practical example I would like to start with the lowest value, 10, but since the items are sorted by key, the first one returned is ("*",20): the keys are strings, and * sorts first.
For me to practically use the answer by marklam I had to check for an empty list before calling and massage the output from a KeyValuePair into a tuple using let (a,b) = kvp.Key,kvp.Value
I don't think there is an answer that fully satisfies all your requirements, but:
You can just access the first key-value pair using m |> Seq.head. This is lazy unlike converting the map to list. This does not guarantee that you always get the same first element, but realistically, the implementation will guarantee that (it might change in the next version though).
For finding the minimum, you do not actually need the guarantee that Seq.head returns the same element always. It just needs to give you some element.
You can use other Seq-based functions, as @marklam mentioned in his answer.
You can also use fold with state of type option<'K * 'V>, which you can initialize with None and then you do not have to worry about finding the first element:
m |> Map.fold (fun st k2 v2 ->
    match st with
    | Some(k1, v1) when v1 < v2 -> st
    | _ -> Some(k2, v2)) None
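The same "start from None, so no first element is needed" idea can be sketched in Python. This is only an illustration under a few assumptions: find_min is a made-up name, a plain dict stands in for the F# Map, and the items are sorted by key first to imitate the key-ordered traversal of Map.fold:

```python
from functools import reduce


def find_min(mapping):
    """Return the (key, value) pair with the smallest value, or None if empty."""

    def step(state, item):
        # Keep the current best only when it is strictly smaller,
        # mirroring the `v1 < v2` guard in the fold above.
        if state is not None and state[1] < item[1]:
            return state
        return item

    # sorted(...) imitates the key-ordered traversal of an F# Map.
    return reduce(step, sorted(mapping.items()), None)


infixes = {"+": 10, "-": 10, "*": 20, "/": 20}
print(find_min(infixes))
```

Because the guard is strict, a tie goes to the later key in traversal order, so this forward fold returns ("-", 10) for the example data; folding from the back, as in the foldBack variation, would prefer "+" instead.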
Map implements IEnumerable<KeyValuePair<_,_>> so you can treat it as a Seq, like:
let findmin (map : Map<_,_>) = map |> Seq.minBy (fun kvp -> kvp.Key)
It's even simpler than the other answers. Map internally uses a balanced AVL tree, so the entries are already ordered by key. As mentioned by @marklam, Map implements IEnumerable<KeyValuePair<_,_>>, so:
let m = Map.empty.Add("Y", 2).Add("X", 1)
let (key, value) = m |> Seq.head
// will return ("X", 1)
It doesn't matter what order the elements were added to the map, Seq.head can operate on the map directly and return the key/value mapping for the min key.
Sometimes it's required to explicitly convert Map to Seq:
let m = Map.empty.Add("Y", 2).Add("X", 1)
let (key, value) = m |> Map.toSeq |> Seq.head
The error message I've seen for this case says "the type 'a * 'b does not match the type Collections.Generic.KeyValuePair<string, int>". It may also be possible to add type annotations rather than use Map.toSeq.

F# stop Seq.map when a predicate evaluates true

I'm currently generating a sequence in a similar way to:
migrators
|> Seq.map (fun m -> m())
The migrator function is ultimately returning a discriminated union like:
type MigratorResult =
    | Success of string * TimeSpan
    | Error of string * Exception
I want to stop the map once I encounter my first Error but I need to include the Error in the final sequence.
I have something like the following to display a final message to the user
match results |> List.rev with
| [] -> "No results equals no migrators"
| head :: _ ->
    match head with
    | Success (dt, t) -> "All migrators succeeded"
    | Error (dt, ex) -> "Migration halted owing to error"
So I need:
A way to stop the mapping when one of the map steps produces an Error
A way to have that error be the final element added to the sequence
I appreciate there may be a different sequence method other than map that will do this, I'm new to F# and searching online hasn't yielded anything as yet!
I guess there are multiple approaches here, but one way would be to use unfold:
migrators
|> Seq.unfold (fun ms ->
    match ms with
    | m :: tl ->
        match m () with
        | Success res -> Some (Success res, tl)
        | Error res -> Some (Error res, [])
    | [] -> None)
|> List.ofSeq
Note the List.ofSeq at the end; it's just there to realize the sequence. A different way to go would be to use sequence comprehensions, which some might say results in clearer code.
The ugly things Tomaš alludes to are 1) mutable state, and 2) manipulation of the underlying enumerator. A higher-order function which returns up to and including when the predicate holds would then look like this:
module Seq =
    let takeUntil pred (xs : _ seq) = seq {
        use en = xs.GetEnumerator()
        let flag = ref true
        while !flag && en.MoveNext() do
            flag := not <| pred en.Current
            yield en.Current }
seq { 1..10 } |> Seq.takeUntil (fun x -> x % 5 = 0)
|> Seq.toList
// val it : int list = [1; 2; 3; 4; 5]
For your specific application, you'd map the cases of the DU to a boolean.
(migrators : seq<MigratorResult>)
|> Seq.takeUntil (function Success _ -> false | Error _ -> true)
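As a cross-language illustration, the same "take up to and including the first match" behaviour is short to write as a Python generator. This is a sketch, not a port of the F# code: take_until is a name chosen here, and the generator hides the enumerator handling that the F# version does by hand:

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def take_until(pred: Callable[[T], bool], items: Iterable[T]) -> Iterator[T]:
    """Yield items up to and including the first one for which pred holds."""
    for item in items:
        yield item
        if pred(item):
            # Stop pulling from the source, so later items are never evaluated.
            return


print(list(take_until(lambda x: x % 5 == 0, range(1, 11))))
```

Since the source is consumed lazily, feeding it a generator such as `(m() for m in migrators)` would mean that migrators after the first error are never called.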
I think the answer from @scrwtp is probably the nicest way to do this if your input is reasonably small (and you can turn it into an F# list to use pattern matching). I'll add one more version, which works when your input is just a sequence and you do not want to turn it into a list.
Essentially, you want to do something that's almost like Seq.takeWhile, but that gives you one additional item at the end (the one for which the predicate fails).
To use a simpler example, the following returns all numbers from a sequence until one that is divisible by 5:
let nums = [ 2 .. 10 ]
nums
|> Seq.map (fun m -> m % 5)
|> Seq.takeWhile (fun n -> n <> 0)
So, you basically just need to look one element ahead. To do this, you could use Seq.pairwise, which gives you the current and the next element in the sequence:
nums
|> Seq.map (fun m -> m % 5)
|> Seq.pairwise                          // Get sequence of (previous, next) pairs
|> Seq.takeWhile (fun (p, n) -> p <> 0)  // Test the previous value, so the failing item is still kept as n
|> Seq.mapi (fun i (p, n) ->             // For the first pair, we return both items;
    if i = 0 then [p;n] else [n])        // for all others, we return the second
|> Seq.concat
The only ugly thing here is that you then need to flatten the sequence again using mapi and concat.
This is not very nice, so a good thing to do would be to define your own higher-order function like Seq.takeUntilAfter that encapsulates the behavior you need (and hides all the ugly things). Then your code could just use the function and look nice & readable (and you can experiment with other ways of implementing this).

Text Parsing and Nested Collection Transposition in F#

I parse data from a csv file that looks like this:
X,..,..,Dx,..,..
Y,..,..,Dy,..,..
X,..,..,Dx,..,..
Y,..,..,Dy,..,..
X,..,..,Dx,..,..
Y,..,..,Dy,..,..
Each row is an element of an array of a type I defined and used with FileHelpers. This probably isn't relevant, but I'm including it in case someone knows a trick I could do at this stage of the process using FileHelpers.
I'm only interested in pairs X,Dx and Y,Dy
The data could have more than just X & Y, e.g. (X,Dx); (Y,Dy); (Z,Dz); ...
I'll call the number of letters nL
The goal is to get the averages of Dx, Dy, ... for each group by processing an array of all D's which has SUM(nIterations) * nL elements.
I have a list of numbers of iterations:
let nIterations = [2000; 2000; 2000; 1000; 500; 400; 400; 400; 300; 300]
And for each of these numbers, I will have that many "letter groups." So the rows of data of interest for nIterations.[0], are rows 0 to (nIterations.[0] * nL)
To get the rows of interest for nIterations.[i], I make a list "nis" which is the result of a scan operation performed on nIterations.
let nis = List.scan (fun x e -> x + e) 0 nIterations
Then to isolate the nIterations.[i] group:
let group = Array.sub Ds (nis.[i]*nL) (nIterations.[i]*nL)
Here's the whole thing:
nIterations |> List.mapi (fun i ni ->
    let igroup = Array.sub Ds (nis.[i]*nL) (ni*nL)
    let groupedbyLetter = (chunk nL igroup)
    let sums = seq { for idx in 0..(nL - 1) do
                        let d = seq { for g in groupedbyLetter do
                                        yield (Seq.head (Seq.skip idx g)) }
                        yield d |> Seq.sum }
    sums |> Seq.map (fun x -> (x / (float ni))) ) |> List.ofSeq
That "chunk" function is one I found on SO:
let rec chunk n xs =
    if Seq.isEmpty xs then Seq.empty
    else
        let (ys,zs) = splitAt n xs
        Seq.append (Seq.singleton ys) (chunk n zs)
I have verified this works, and gets me what I want - a size nL collection of size nIterations.Length collections.
The problem is speed - this only works on small data sets; the sizes I'm working with in the example I've given are too big. It gets "hung" at the chunk function.
So my question is: How do I go about improving the speed of this whole process? And/or what is the best (or at least a better) way to do that "transposition"?
I figure I could:
try to rearrange the data as I'm reading it in
try to index the elements directly
try breaking the process into smaller stages or "passes"
???
I got it.
let averages =
    (nIterations |> List.mapi (fun i ni ->
        let igroup = Array.sub Ds (nis.[i]*nL) (ni*nL)
        let groupedbyLetter =
            [| for a in 1..nL..igroup.Length do
                yield igroup.[(a - 1)..(a - 1)+(nL-1)] |]
        [| for i in 0..(nL - 1) do
            yield [| for j in 0..(groupedbyLetter.Length - 1) do
                        yield groupedbyLetter.[j].[i] |]
                  |> Array.average |]) )
let columns = [| for i in 0..(nL - 1) do
                    yield [| for j in 0..(nIterations.Length - 1) do
                                yield averages.[j].[i] |]
              |]
The "columns" value just transposes the data again so I can easily print:
----Average Ds----
nIterations X Y Z
2000 0.2 0.7 1.2
... ... ... ...
... ... ... ...
e.g. averages returns
[[x1,y1,z1,..], [x2,y2,z2,..], ... ]
and columns gives me
[ [x1,x2,..], [y1,y2,..], [z1,z2,..], ...]
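For comparison, here is a small Python sketch of the same computation using direct indexing instead of chunking. The names group_averages and transpose are invented for this example, and ds is assumed to be a flat list laid out as described above (one D value per letter, repeated for every iteration of every group):

```python
from itertools import accumulate


def group_averages(ds, n_iterations, n_letters):
    """Average each letter's D values within each iteration group."""
    # Running totals of iterations give the starting row of every group,
    # like the List.scan that produces nis.
    offsets = [0] + list(accumulate(n_iterations))
    averages = []
    for start, ni in zip(offsets, n_iterations):
        block = ds[start * n_letters:(start + ni) * n_letters]
        # block[k::n_letters] picks every value that belongs to letter k.
        averages.append([sum(block[k::n_letters]) / ni for k in range(n_letters)])
    return averages


def transpose(rows):
    """Turn [[x1, y1], [x2, y2]] into [[x1, x2], [y1, y2]]."""
    return [list(col) for col in zip(*rows)]


ds = [1.0, 10.0, 3.0, 30.0, 5.0, 50.0]  # two letters; groups of 2 and 1 iterations
avgs = group_averages(ds, [2, 1], 2)
print(avgs, transpose(avgs))
```

The strided slice does the "transposition" in one step per letter, which avoids building the intermediate chunks that made the original version slow.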

Foldl return a Tuple in SML?

The problem I'm working on needs to take in a list of integers and return the average of those numbers. It needs to fit a specific format that looks like this...
fun average (n::ns) =
let
val (a,b) = fold? (?) ? ?
in
real(a) / real(b)
end;
I'm only allowed to replace the question marks and cannot use any built-in functions. I have a working solution, but it doesn't adhere to these rules.
fun average (n::ns) =
let
val (a,b) = ((foldl (fn(x, y)=>(x+y)) n ns), length(ns)+1)
in
real(a) / real(b)
end;
So, is there a way to make a fold function return a tuple? Something like this is what I want it to do, but obviously I can't do this...
val (a,b) = ((foldl (fn(x, y)=>(x+y), count++) n ns)
The return type of foldl is the type of the initial accumulator. So the idea here is to provide a tuple containing the sum and the count of elements in the list:
fun average (n::ns) =
let
val (a, b) = foldl (fn (x, (sum, count)) => (sum+x, count+1)) (n, 1) ns
in
real(a) / real(b)
end
Notice that your solution fails if the list is empty. It's better to add another case to handle the empty list (either returning 0.0 or throwing a custom exception):
fun average [] = 0.0
| average (n::ns) = (* the same as above *)
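The tuple-accumulator trick is not specific to SML. As a rough Python equivalent (the name average and the choice to return 0.0 for an empty list simply follow the answer above), a single fold can carry both the running sum and the running count:

```python
from functools import reduce


def average(numbers):
    """Average a list with one fold whose accumulator is a (sum, count) pair."""
    if not numbers:
        return 0.0
    # Seed with the first element, as (n, 1) does in the SML version.
    total, count = reduce(
        lambda acc, x: (acc[0] + x, acc[1] + 1),
        numbers[1:],
        (numbers[0], 1),
    )
    return total / count


print(average([1, 2, 3, 4]))
```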

iterative version of recursive algorithm to make a binary tree

Given this algorithm, I would like to know if there exists an iterative version. Also, I want to know if the iterative version can be faster.
This is some kind of pseudo-Python.
The algorithm returns a reference to the root of the tree:
make_tree(array a)
    if len(a) == 0
        return None
    node = pick a random point from the array
    calculate distances of the point against the others
    calculate median of such distances
    node.left = make_tree(subset of the array, such that the distance of points is lower than the median of distances)
    node.right = make_tree(subset, such that the distance is greater than or equal to the median)
    return node
A recursive function with only one recursive call can usually be turned into a tail-recursive function without too much effort, and then it's trivial to convert it into an iterative function. The canonical example here is factorial:
# naïve recursion
def fac(n):
    if n <= 1:
        return 1
    else:
        return n * fac(n - 1)

# tail-recursive with accumulator
def fac(n):
    def fac_helper(m, k):
        if m <= 1:
            return k
        else:
            return fac_helper(m - 1, m * k)
    return fac_helper(n, 1)

# iterative with accumulator
def fac(n):
    k = 1
    while n > 1:
        n, k = n - 1, n * k
    return k
However, your case here involves two recursive calls, and unless you significantly rework your algorithm, you need to keep a stack. Managing your own stack may be a little faster than using Python's function call stack, but the added speed and depth will probably not be worth the complexity. The canonical example here would be the Fibonacci sequence:
# naïve recursion
def fib(n):
    if n <= 1:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)

# tail-recursive with accumulator and stack
def fib(n):
    def fib_helper(m, k, stack):
        if m <= 1:
            if stack:
                m = stack.pop()
                return fib_helper(m, k + 1, stack)
            else:
                return k + 1
        else:
            stack.append(m - 2)
            return fib_helper(m - 1, k, stack)
    return fib_helper(n, 0, [])

# iterative with accumulator and stack
def fib(n):
    k, stack = 0, []
    while 1:
        if n <= 1:
            k = k + 1
            if stack:
                n = stack.pop()
            else:
                break
        else:
            stack.append(n - 2)
            n = n - 1
    return k
Now, your case is a lot tougher than this: a simple accumulator will have difficulties expressing a partly-built tree with a pointer to where a subtree needs to be generated. You'll want a zipper -- not easy to implement in a not-really-functional language like Python.
Making an iterative version is simply a matter of using your own stack instead of the normal language call stack. I doubt the iterative version would be faster, as the normal call stack is optimized for this purpose.
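To make the "use your own stack" idea concrete for the algorithm in the question, here is one possible Python sketch. Everything in it is an assumption made for illustration: the Node class, the names make_tree_iterative and tree_points, the dist callback, and the choice to exclude the picked point from both subsets so that every step shrinks the work. Each stack entry records the parent, which side to attach to, and the points still to be placed:

```python
import random
import statistics


class Node:
    def __init__(self, point):
        self.point = point
        self.left = None
        self.right = None


def make_tree_iterative(points, dist, rng=random):
    """Build the tree from the question with an explicit stack, no recursion."""
    holder = Node(None)  # dummy parent so the root is attached like any child
    stack = [(holder, "left", list(points))]
    while stack:
        parent, side, subset = stack.pop()
        if not subset:
            continue  # empty subset: the child stays None
        idx = rng.randrange(len(subset))
        pivot = subset[idx]
        rest = subset[:idx] + subset[idx + 1:]
        node = Node(pivot)
        setattr(parent, side, node)
        if rest:
            distances = [dist(pivot, p) for p in rest]
            median = statistics.median(distances)
            near = [p for p, d in zip(rest, distances) if d < median]
            far = [p for p, d in zip(rest, distances) if d >= median]
            stack.append((node, "left", near))
            stack.append((node, "right", far))
    return holder.left


def tree_points(root):
    """Collect every point in the tree, also without recursion."""
    out, stack = [], [root]
    while stack:
        node = stack.pop()
        if node is not None:
            out.append(node.point)
            stack.extend((node.left, node.right))
    return out


pts = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6, 5.0]
root = make_tree_iterative(pts, lambda a, b: abs(a - b), random.Random(0))
print(sorted(tree_points(root)))
```

Whether this is faster than the recursive version would need measuring; the main practical gain is that the depth of the tree is no longer limited by the interpreter's recursion limit.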
The data you're getting is random, so the tree can be an arbitrary binary tree. For this case, you can use a threaded binary tree, which can be traversed and built without recursion or a stack. The nodes have a flag that indicates whether the link is a link to another node or how to get to the "next node".
From http://en.wikipedia.org/wiki/Threaded_binary_tree
Depending on how you define "iterative", there is another solution not mentioned by the previous answers. If "iterative" just means "not subject to a stack overflow exception" (but "allowed to use 'let rec'"), then in a language that supports tail calls, you can write a version using continuations (rather than an "explicit stack"). The F# code below illustrates this. It is similar to your original problem, in that it builds a BST out of an array. If the array is shuffled randomly, the tree is relatively balanced and the recursive version does not create too deep a stack. But turn off shuffling, and the tree gets unbalanced, and the recursive version stack-overflows whereas the iterative-with-continuations version continues along happily.
#light
open System
let printResults = false
let MAX = 20000
let shuffleIt = true
// handy helper function
let rng = new Random(0)
let shuffle (arr : array<'a>) =
    let n = arr.Length
    for x in 1..n do
        let i = n-x
        let j = rng.Next(i+1)
        let tmp = arr.[i]
        arr.[i] <- arr.[j]
        arr.[j] <- tmp
// Same random array
let sampleArray = Array.init MAX (fun x -> x)
if shuffleIt then
    shuffle sampleArray
if printResults then
    printfn "Sample array is %A" sampleArray
// Tree type
type Tree =
    | Node of int * Tree * Tree
    | Leaf
// MakeTree1 is recursive
let rec MakeTree1 (arr : array<int>) lo hi = // [lo,hi)
    if lo = hi then
        Leaf
    else
        let pivot = arr.[lo]
        // partition
        let mutable storeIndex = lo + 1
        for i in lo + 1 .. hi - 1 do
            if arr.[i] < pivot then
                let tmp = arr.[i]
                arr.[i] <- arr.[storeIndex]
                arr.[storeIndex] <- tmp
                storeIndex <- storeIndex + 1
        Node(pivot, MakeTree1 arr (lo+1) storeIndex, MakeTree1 arr storeIndex hi)
// MakeTree2 has all tail calls (uses continuations rather than a stack, see
// http://lorgonblog.spaces.live.com/blog/cns!701679AD17B6D310!171.entry
// for more explanation)
let MakeTree2 (arr : array<int>) lo hi = // [lo,hi)
    let rec MakeTree2Helper (arr : array<int>) lo hi k =
        if lo = hi then
            k Leaf
        else
            let pivot = arr.[lo]
            // partition
            let storeIndex = ref(lo + 1)
            for i in lo + 1 .. hi - 1 do
                if arr.[i] < pivot then
                    let tmp = arr.[i]
                    arr.[i] <- arr.[!storeIndex]
                    arr.[!storeIndex] <- tmp
                    storeIndex := !storeIndex + 1
            MakeTree2Helper arr (lo+1) !storeIndex (fun lacc ->
                MakeTree2Helper arr !storeIndex hi (fun racc ->
                    k (Node(pivot,lacc,racc))))
    MakeTree2Helper arr lo hi (fun x -> x)
// MakeTree2 never stack overflows
printfn "calling MakeTree2..."
let tree2 = MakeTree2 sampleArray 0 MAX
if printResults then
    printfn "MakeTree2 yields"
    printfn "%A" tree2
// MakeTree1 might stack overflow
printfn "calling MakeTree1..."
let tree1 = MakeTree1 sampleArray 0 MAX
if printResults then
    printfn "MakeTree1 yields"
    printfn "%A" tree1
printfn "Trees are equal: %A" (tree1 = tree2)
Yes, it is possible to make any recursive algorithm iterative. Implicitly, when you create a recursive algorithm, each call places the prior call onto the stack. What you want to do is make the implicit call stack into an explicit one. The iterative version won't necessarily be faster, but you won't have to worry about a stack overflow. (Do I get a badge for using the name of the site in my answer?)
While it is true in the general sense that directly converting a recursive algorithm into an iterative one will require an explicit stack, there is a specific sub-set of algorithms which render directly in iterative form (without the need for a stack). These renderings may not have the same performance guarantees (iterating over a functional list vs recursive deconstruction), but they do often exist.
Here is a stack-based iterative solution (Java):
public static Tree builtBSTFromSortedArray(int[] inputArray){
Stack toBeDone=new Stack("sub trees to be created under these nodes");
//initialize start and end
int start=0;
int end=inputArray.length-1;
//keep memory of the position (in the array) of the previously created node
int previous_end=end;
int previous_start=start;
//Create the result tree
Node root=new Node(inputArray[(start+end)/2]);
Tree result=new Tree(root);
while(root!=null){
System.out.println("Current root="+root.data);
//calculate last middle (last node position using the last start and last end)
int last_mid=(previous_start+previous_end)/2;
//*********** add left node to the previously created node ***********
//calculate new start and new end positions
//end is the previous index position minus 1
end=last_mid-1;
//start will not change for left nodes generation
start=previous_start;
//check if the index exists in the array and add the left node
if (end>=start){
root.left=new Node(inputArray[((start+end)/2)]);
System.out.println("\tCurrent root.left="+root.left.data);
}
else
root.left=null;
//save previous_end value (to be used in right node creation)
int previous_end_bck=previous_end;
//update previous end
previous_end=end;
//*********** add right node to the previously created node ***********
//get the initial value (inside the current iteration) of previous end
end=previous_end_bck;
//start is the previous index position plus one
start=last_mid+1;
//check if the index exists in the array and add the right node
if (start<=end){
root.right=new Node(inputArray[((start+end)/2)]);
System.out.println("\tCurrent root.right="+root.right.data);
//save the created node and its index position (start & end) in the array to toBeDone stack
toBeDone.push(root.right);
toBeDone.push(new Node(start));
toBeDone.push(new Node(end));
}
//*********** update the value of root ***********
if (root.left!=null){
root=root.left;
}
else{
if (toBeDone.top!=null) previous_end=toBeDone.pop().data;
if (toBeDone.top!=null) previous_start=toBeDone.pop().data;
root=toBeDone.pop();
}
}
return result;
}