I am new to functional programming and so cannot imagine how to build a new dictionary from two other dictionaries with similar sets of keys. The new dictionary will have entries for all keys, but the values will be selected or computed based on some condition.
For example, having two dictionaries:
D1: [(1,100);(2,50);(3,150)]
D2: [(1,20);(2,30);(3,0);(4,10)]
and condition to get the average of two values, the resulting dictionary will be
DR: [(1,60);(2,40);(3,75);(4,10)]
I need implementation in F#.
Please could you give me some advice.
Viewing them as two (or more) lists of tuples that we concatenate makes it easier. The code below solves your specific problem. To generalise the process to aggregating a list of values into something else, you would replace averageBy with fold and provide a folding function instead of float. This assumes d1 and d2 match your example.
Seq.concat [ d1 ; d2 ]
|> Seq.map (|KeyValue|)
|> Seq.groupBy fst
|> Seq.map (fun (k, c) -> k, Seq.averageBy (snd >> float) c |> int)
|> dict
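As a rough sketch of the generalisation mentioned above, the averageBy step can be swapped for Seq.fold with a folding function of your choice. Here the values are summed per key; the name summed and the choice of summation are assumptions made only for illustration.

```fsharp
let d1 = dict [ (1, 100); (2, 50); (3, 150) ]
let d2 = dict [ (1, 20); (2, 30); (3, 0); (4, 10) ]

// Same pipeline as above, but each group is reduced with Seq.fold
// (here: summing the values) instead of Seq.averageBy.
let summed =
    Seq.concat [ d1; d2 ]
    |> Seq.map (|KeyValue|)
    |> Seq.groupBy fst
    |> Seq.map (fun (k, c) -> k, c |> Seq.fold (fun acc (_, v) -> acc + v) 0)
    |> dict

printfn "%A" (summed |> Seq.map (|KeyValue|) |> List.ofSeq)
```

Any other folding function and initial state can be substituted, as long as the state type matches what you want to store in the resulting dictionary.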
If you wanted to use an external library, you could do this using Deedle series, which has various operations for working with (time) series of data.
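Here, you have two data series that have different keys. Deedle lets you zip series based on keys and handle the cases where one of the values is missing using the opt type. Note that the snippet below substitutes 0 for a missing value, so a key present in only one series (key 4 here) gets half of its value, 5, rather than the 10 asked for in the question: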
#r "nuget:Deedle"
open Deedle
let s1 = series [(1,100);(2,50);(3,150)]
let s2 = series [(1,20);(2,30);(3,0);(4,10)]
Series.zip s1 s2
|> Series.mapValues (fun (v1, v2) ->
    ( (OptionalValue.defaultArg 0 v1) +
      (OptionalValue.defaultArg 0 v2) ) / 2)
This may not make sense if this is a thing that you need just in one or two places, but if you're working with key-value series of data more generally, it may be worth checking out.
Solution 1
From a functional perspective I would use a Map data-structure, instead of a dictionary. You can convert a dictionary to a Map like this
let d1 = dict [(1,100);(2,50);(3,150)]
let m1 = Map [for KeyValue (key,value) in d1 -> key, value]
But I wouldn't use a Dictionary and convert it; I would use a Map directly.
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
Next, you need a way to get all keys from both Maps. You can get the keys of a map with Map.keys but you need all the keys from both. You could get them by using a Set.
let keys = Set (Map.keys m1) + Set (Map.keys m2)
Adding two Sets gives you their union (the same as Set.union). Once you have the keys, you can traverse them and try to look up each key in both maps. Map.tryFind returns an option, so you can pattern match on both results at once.
let result = Map [
    for key in keys do
        match Map.tryFind key m1, Map.tryFind key m2 with
        | Some x, Some y -> key, (x + y) / 2
        | Some x, None   -> key, x
        | None  , Some y -> key, y
        | None  , None   -> failwith "Cannot happen"
]
This creates a new Map data-structure and saves it into result. If both cases are Some then you compute the average, otherwise you just keep the value. As you iterate the keys of both Maps the None,None case cannot happen. A Key always must be in either one or the other.
After all of this, result will be:
Map [(1, 60); (2, 40); (3, 75); (4, 10)]
Again, here is the whole code at once:
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
let keys = Set (Map.keys m1) + Set (Map.keys m2)
let result = Map [
    for key in keys do
        match Map.tryFind key m1, Map.tryFind key m2 with
        | Some x, Some y -> key, (x + y) / 2
        | Some x, None   -> key, x
        | None  , Some y -> key, y
        | None  , None   -> failwith "Cannot happen"
]
You also can inline the keys variable, if you want.
Solution 2
When you have a Map, you can make use of the fact that adding a value to a map always creates a new Map data structure. This lets you use Map.fold, which traverses one Map while using the other Map as the starting state.
With Map.change you then can read and change a value in one step. If a key is already available you calculate the average, otherwise just add the value.
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
let result =
    (m1,m2) ||> Map.fold (fun state key y ->
        state |> Map.change key (function
            | Some x -> Some ((x + y) / 2)
            | None   -> Some y))
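To make the role of Map.change a little more concrete, here is a small sketch of how it behaves on its own. The map and the names halved, added and removed are made up for illustration; the function passed in receives Some value when the key exists and None when it does not, and whatever it returns decides the new entry.

```fsharp
let m = Map [ (1, 100) ]

// Key present: the function receives Some 100 and returns the replacement.
let halved = m |> Map.change 1 (Option.map (fun x -> x / 2))

// Key absent: the function receives None; returning Some adds the entry.
let added = m |> Map.change 2 (fun _ -> Some 7)

// Returning None removes the entry (or leaves it absent).
let removed = m |> Map.change 1 (fun _ -> None)

printfn "%A %A %A" halved added removed
```

In every case the original map m is left untouched, which is what allows it to be threaded through Map.fold as state.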
Bonus: Adding Functions to Modules
It's sometimes sad that F# has so few functions on Map. But if you need this a lot, you can always add a union function to the module yourself. For example:
module Map =
    let union f map1 map2 =
        let keys = Set (Map.keys map1) + Set (Map.keys map2)
        Map [
            for key in keys do
                match Map.tryFind key map1, Map.tryFind key map2 with
                | Some x, Some y -> key, (f x y)
                | Some x, None   -> key, x
                | None  , Some y -> key, y
                | None  , None   -> failwith "Cannot happen"
        ]

let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
// Usage (assumed): average the values whose key is in both maps
let result = Map.union (fun x y -> (x + y) / 2) m1 m2
This way you get a Map.union and you can specify a lambda-function that is executed if both keys are present in both maps, otherwise the value is used unchanged.
There have been a couple of useful suggestions:
Group by keys with standard library functions from the Seq module, by user1981
Use a specialized library for dealing with data series, by Tomas Petricek
Use a map instead (a functional data structure based on comparison), by David Raab
To this I'd like to add
An imperative way, filling a combined dictionary by iterating through the keys of the source data structures, and finally
A query expression
An imperative way
The average calculation is hard-coded with the type int. You can still have generic keys, as their type does not figure in the function, except for the equality constraint required for dictionary keys. You could make the function generic for values too, by marking it inline, but that won't be a pretty sight as it will introduce a host of other constraints onto the type of values.
open System.Collections.Generic
let unionAverage (d1 : IDictionary<_,_>) (d2 : IDictionary<_,_>) =
    let d = Dictionary<_,_>()
    for k in Seq.append d1.Keys d2.Keys |> Seq.distinct do
        match d1.TryGetValue k, d2.TryGetValue k with
        | (true, v1), (true, v2) -> d.Add(k, (v1 + v2) / 2)
        | (true, v), _ | _, (true, v) -> d.Add(k, v)
        | _ -> failwith "Key not found"
    d
let d1 = dict[1, 100; 2, 50; 3, 150]
let d2 = dict[1, 20; 2, 30; 3, 0; 4, 10]
unionAverage d1 d2
A query expression
It operates on the same principle as the answer from user1981, but for re-usability the average function has been factored out. It expects an arbitrary number of #seq<KeyValuePair<_,_>> elements, which is just another way to represent dictionaries that are accessed through their enumerators.
As the query expression uses System.Linq.IGrouping under the hood, this is upcast to a regular sequence to reduce confusion. Then there's the conversion to float for Seq.average to operate on, because the type int does not have the required member DivideByInt.
module Dict =
    let unionByMany f src =
        query {
            for KeyValue(k, v) in Seq.concat src do
            groupValBy v k into group
            select (group.Key, f (group :> seq<_>)) }
        |> dict
Dict.unionByMany (Seq.averageBy float >> int) [d1; d2]
Dict.unionByMany Seq.sum [d1; d2]
Dict.unionByMany Seq.min [d1; d2]
I am brand new to Prolog and I feel like there is a concept that I am failing to understand, which is preventing me from grasping recursion in Prolog. I am trying to return S, which is the sum of the squares of the digits, taken as a list from an integer that the user enters in a query. E.g., if the user enters 12345, I must return S = (1^2)+(2^2)+(3^2)+(4^2)+(5^2) = 55.
In my program below, I understand why each segment of the calculation of S is printed multiple times, as it is part of the recursive rule. However, I do not understand how I would be able to print S as the final result. I figured that I could set a variable equal to the result from sos in the second rule and add it as a parameter for intToList, but I can't seem to figure this one out. The compiler warns that S is a singleton variable in the intToList rule.
sos([],0).
sos([H|T],S) :-
    sos(T, S1),
    S is (S1 + (H * H)),
    write('S is: '),write(S),nl.
intToList(0,[]).
intToList(N,[H|T]) :-
    N1 is floor(N/10),
    H is N mod 10,
    intToList(N1,T),
    sos([H|T],S).
The issue with your original code is that you're trying to handle your call to sos/2 within your recursive clause for intToList/2. Break it out (and rename intToList/2 to something more meaningful):
sosDigits(Number, SoS) :-
    number_digits(Number, Digits),
    sos(Digits, SoS).
Here's your original sos/2 without the write, which seems to work fine:
sos([], 0).
sos([H|T], S) :-
    sos(T, S1),
    S is (S1 + (H * H)).
Or better, use an accumulator for tail recursion:
sos(Numbers, SoS) :-
    sos(Numbers, 0, SoS).
sos([], SoS, SoS).
sos([X|Xs], A, SoS) :-
    A1 is A + X*X,
    sos(Xs, A1, SoS).
You can also implement sos/2 using maplist/3 and sumlist/2 (named sum_list/2 in current SWI-Prolog, where sumlist/2 is deprecated):
square(X, S) :- S is X * X.
sos(Numbers, SoS) :- maplist(square, Numbers, Squares), sumlist(Squares, SoS).
Your intToList/2 needs to be refactored using an accumulator to maintain correct digit order and to get rid of the call to sos/2. Renamed as explained above:
number_digits(Number, Digits) :-
    number_digits(Number, [], Digits).
number_digits(Number, DigitsSoFar, [Number | DigitsSoFar]) :-
    Number < 10.
number_digits(Number, DigitsSoFar, Digits) :-
    Number >= 10,
    NumberPrefix is Number div 10,
    ThisDigit is Number mod 10,
    number_digits(NumberPrefix, [ThisDigit | DigitsSoFar], Digits).
The above number_digits/2 also handles 0 correctly, so that number_digits(0, Digits) yields Digits = [0] rather than Digits = [].
You can rewrite the above implementation of number_digits/3 using the -> ; construct:
number_digits(Number, DigitsSoFar, Digits) :-
    (   Number < 10
    ->  Digits = [Number | DigitsSoFar]
    ;   NumberPrefix is Number div 10,
        ThisDigit is Number mod 10,
        number_digits(NumberPrefix, [ThisDigit | DigitsSoFar], Digits)
    ).
Then it won't leave a choice point.
Try this:
sos([],Accumulator,Accumulator).
sos([H|T],Accumulator,Result_out) :-
    Square is H * H,
    Accumulator1 is Accumulator + Square,
    sos(T,Accumulator1,Result_out).
int_to_list(N,R) :-
    atom_chars(N,Digit_Chars),
    int_to_list1(Digit_Chars,Digits),
    sos(Digits,0,R).
int_to_list1([],[]).
int_to_list1([Digit_Char|Digit_Chars],[Digit|Digits]) :-
    atom_number(Digit_Char,Digit),
    int_to_list1(Digit_Chars,Digits).
For int_to_list I used atom_chars which is built-in e.g.
?- atom_chars(12345,R).
R = ['1', '2', '3', '4', '5'].
And then used a typical loop to convert each character to a number using atom_number e.g.
?- atom_number('2',R).
R = 2.
For sos I used an accumulator to accumulate the answer, and then once the list was empty moved the value in the accumulator to the result with
sos([],Accumulator,Accumulator).
Notice that there are two different variables for the accumulator, e.g.
Accumulator1 is Accumulator + Square,
sos(T,Accumulator1,Result_out).
This is because in Prolog variables are immutable, so one cannot keep assigning new values to the same variable.
Here are some example runs
?- int_to_list(1234,R).
R = 30.
?- int_to_list(12345,R).
R = 55.
?- int_to_list(123456,R).
R = 91.
If you have any questions just ask in the comments under this answer.
I am attempting to create a new time series (Series[DateTime,float]) from an existing one of the same type, where the map to the new Series is recursive - for example:
NewSeries_T = NewSeries_T-1 + constant * OldSeries_T;
I have "NewSeries_0 = 1", as an initialization value for the new series.
I'm attempting to write a Series.map function that will do the job - I've got as far as the following non-working code, but I can't figure out the recursive part:
let rec newSeries = existingSeries |> Series.map (fun k v ->
    match k.Equals(initDate) with
    | true -> 1
    | false -> newSeries.LastValue() + constant * v)
So, I think the trick is, how do I allow the function access to the "previous" value in the series to build this up recursively?
Edit - Moved to answer below.
Based on Fyodor's recommendation, Series.scanValues does exactly what I need:
let initalEntry = Series([initDate], [init])
let newSeries =
    existingSeries
    |> Series.filter (fun k v -> k.Equals(initDate) = false)
    |> Series.scanValues (fun n x -> lambda * n + (1.0 - lambda) * x) init
newSeries.Merge(initalEntry)
I took away the first value, as I want this to return the "init" value at the start of the series, and merged this back at the end.
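The way a scan threads the "previous" result through the computation can be seen without Deedle by using List.scan from the F# core library. This is only an analogy under assumed sample values (lambda, init and oldValues are made up), and List.scan's handling of the initial state is not necessarily identical to Series.scanValues.

```fsharp
let lambda = 0.5
let init = 1.0
let oldValues = [ 2.0; 4.0; 8.0 ]

// Each step receives the previously computed value n and the next input x.
// List.scan returns the initial state as the first element of the result.
let newValues =
    oldValues |> List.scan (fun n x -> lambda * n + (1.0 - lambda) * x) init

printfn "%A" newValues   // [1.0; 1.5; 2.75; 5.375]
```

Because the accumulated value is passed in as an argument, no recursive reference to the series being built is needed.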
How does one get the first key,value pair from F# Map without knowing the key?
I know that the Map type is used to get a corresponding value given a key, e.g. find.
I also know that one can convert the map to a list and use List.Head, e.g.
List.head (Map.toList map)
I would like to do this
1. without a key
2. without knowing the types of the key and value
3. without using a mutable
4. without iterating through the entire map
5. without doing a conversion that iterates through the entire map behind the scenes, e.g. Map.toList, etc.
I am also aware that if one gets the first key,value pair it might not be of use because the map documentation does not note if using map in two different calls guarantees the same order.
If the code can not be written then an existing reference from a site such as MSDN explaining and showing why not would be accepted.
TLDR;
How I arrived at this problem was converting this function:
let findmin l =
    List.foldBack
        (fun (_,pr1 as p1) (_,pr2 as p2) -> if pr1 <= pr2 then p1 else p2)
        (List.tail l) (List.head l)
which is based on list and is used to find the minimum value in the associative list of string * int.
An example list:
["+",10; "-",10; "*",20; "/",20]
The list is used for parsing binary operator expressions that have precedence, where the string is the binary operator and the int is the precedence. Other functions are performed on the data such that using an F# map might be an advantage over a list. I have not decided on a final solution but wanted to explore this problem with map while it was still in the forefront.
Currently I am using:
let findmin m =
    if Map.isEmpty m then
        None
    else
        let result =
            Map.foldBack
                (fun key value (k,v) ->
                    if value <= v then (key,value)
                    else (k,v))
                m ("",1000)
        Some(result)
but here I had to hard code in the initial state ("",1000) when what would be better is just using the first value in the map as the initial state and then passing the remainder of the map as the starting map as was done with the list:
(List.tail l) (List.head l)
Yes this is partitioning the map but that did not work e.g.,
let infixes = ["+",10; "-",10; "*",20; "/",20]
let infixMap = infixes |> Map.ofList
let mutable test = true
let fx k v : bool =
    if test then
        printfn "first"
        test <- false
        true
    else
        printfn "rest"
        false
let (first,rest) = Map.partition fx infixMap
which results in
val rest : Map<string,int> = map [("*", 20); ("+", 10); ("-", 10)]
val first : Map<string,int> = map [("/", 20)]
which are two maps and not a key,value pair for first
("/",20)
Notes about answers
For practical purposes with regard to precedence parsing, seeing the + operations before - in the final transformation is preferable, so returning + before - is desirable. Thus this variation of the answer by marklam
let findmin (map : Map<_,_>) = map |> Seq.minBy (fun kvp -> kvp.Value)
achieves this, as does this variation by Tomas
let findmin m =
    Map.foldBack (fun k2 v2 st ->
        match st with
        | Some(k1, v1) when v1 < v2 -> st
        | _ -> Some(k2, v2)) m None
Seq.head does return the first item in the map, but one must be aware that the map is built with its keys sorted. For my practical example I would like to start with the lowest value, 10, but since the items are sorted by key, the first one returned is ("*",20): the keys are strings, and * sorts first.
For me to practically use the answer by marklam I had to check for an empty list before calling and massage the output from a KeyValuePair into a tuple using let (a,b) = kvp.Key,kvp.Value
I don't think there is an answer that fully satisfies all your requirements, but:
You can just access the first key-value pair using m |> Seq.head. This is lazy unlike converting the map to list. This does not guarantee that you always get the same first element, but realistically, the implementation will guarantee that (it might change in the next version though).
For finding the minimum, you do not actually need the guarantee that Seq.head returns the same element always. It just needs to give you some element.
You can use other Seq-based functions, as @marklam mentioned in his answer.
You can also use fold with state of type option<'K * 'V>, which you can initialize with None and then you do not have to worry about finding the first element:
m |> Map.fold (fun st k2 v2 ->
    match st with
    | Some(k1, v1) when v1 < v2 -> st
    | _ -> Some(k2, v2)) None
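As a small illustration of the option-state fold, the sketch below runs it in both directions over the operator table from the question. The helper names findMinFold and findMinFoldBack are hypothetical. Because ties are resolved in favour of the element visited last, the traversal direction decides which of the equal-precedence operators is returned.

```fsharp
let infixMap = Map.ofList [ "+", 10; "-", 10; "*", 20; "/", 20 ]

// Left-to-right over the key-sorted map: *, +, -, /
let findMinFold m =
    m |> Map.fold (fun st k v ->
        match st with
        | Some (_, best) when best < v -> st
        | _ -> Some (k, v)) None

// Right-to-left over the key-sorted map: /, -, +, *
let findMinFoldBack m =
    Map.foldBack (fun k v st ->
        match st with
        | Some (_, best) when best < v -> st
        | _ -> Some (k, v)) m None

printfn "%A" (findMinFold infixMap)      // Some ("-", 10)
printfn "%A" (findMinFoldBack infixMap)  // Some ("+", 10)
```

Both versions return None for an empty map, so no separate emptiness check or hard-coded initial value is needed.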
Map implements IEnumerable<KeyValuePair<_,_>> so you can treat it as a Seq, like:
let findmin (map : Map<_,_>) = map |> Seq.minBy (fun kvp -> kvp.Key)
It's even simpler than the other answers. Map internally uses an AVL balanced tree, so the entries are already ordered by key. As mentioned by @marklam, Map implements IEnumerable<KeyValuePair<_,_>>, so:
let m = Map.empty.Add("Y", 2).Add("X", 1)
// Seq.head yields a KeyValuePair, so destructure it with the KeyValue pattern
let (KeyValue (key, value)) = m |> Seq.head
// key = "X", value = 1
It doesn't matter what order the elements were added to the map, Seq.head can operate on the map directly and return the key/value mapping for the min key.
Sometimes it's required to explicitly convert Map to Seq:
let m = Map.empty.Add("Y", 2).Add("X", 1)
let (key, value) = m |> Map.toSeq |> Seq.head
The error message I've seen for this case says "the type 'a * 'b does not match the type Collections.Generic.KeyValuePair<string, int>". It may also be possible to add type annotations rather than using Map.toSeq.
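If the map may be empty, a variant along these lines avoids the exception that Seq.head would throw. It is only a sketch: tryFirst is a hypothetical helper name, and it relies on Map.toSeq producing tuples in key order together with Seq.tryHead from the core library.

```fsharp
// Returns the entry with the smallest key as a tuple, or None for an empty map.
let tryFirst (m: Map<'K, 'V>) : ('K * 'V) option =
    m |> Map.toSeq |> Seq.tryHead

let m = Map.empty.Add("Y", 2).Add("X", 1)

printfn "%A" (tryFirst m)                        // Some ("X", 1)
printfn "%A" (tryFirst (Map.empty<string, int>)) // None
```

Since Map.toSeq is lazy, only the first entry is requested from the enumerator.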
I just started learning functional programming and I find myself very confused by the concept of pattern matching (i'm using SML). Take for example the following expression to insert an element in an ordered list:
fun insert (n,nil) = [n]
  | insert (n, L as x::l) =
      if(n < x) then n::L
      else x::(insert(n,l))
How can the list L be expressed as x::l? I know x refers to the first element of the list and l to the rest, but I don't know what to call this construct or how it can be used. I have read a lot but all the documentation I find doesn't mention this at all. Here is another example that doesn't use the 'as' keyword.
(* returns a list of the element-wise sums of two lists *)
fun addLists (nil,L) = L
  | addLists (L,nil) = L
  | addLists (x::xs,y::ys) =
      (x + y)::(addLists(xs,ys))
Thank you for your help!
For the insert function here:
fun insert (n,nil) = [n]
  | insert (n, L as x::l) =
      if(n < x) then n::L
      else x::(insert(n,l))
The | insert (n, L as x::l) part is a pattern which will be matched against. L as x::l is called an as pattern. It allows us to:
Pattern match against a non empty list, where x is the head of the list and l is the tail of the list
Refer to the entire matched list x::l with the name L
This is similar to (although not quite the same as) doing:
| insert (n, x::l)
except that if you do that, the insert function will become:
fun insert (n,nil) = [n]
  | insert (n, x::l) =
      if(n < x) then n::x::l
      else x::(insert(n,l))
So the big advantage of using the L as x::l as pattern over a non as pattern is that it allows us to refer to the entire list, not just its head and tail, and avoids an additional list construction when we need to refer to the entire list. Observe that the only difference in the 2 pieces of code is n::L and n::x::l. Since we use the as pattern L as x::l in the first insert function, we are able to do n::L instead of n::x::l. This avoids one :: operation (also known as cons operation).
As for this:
fun addLists (nil,L) = L
  | addLists (L,nil) = L
  | addLists (x::xs,y::ys) =
      (x + y)::(addLists(xs,ys))
For the last pattern | addLists (x::xs,y::ys), nowhere do we reconstruct the lists x::xs and y::ys in the code following it, so we do not need as patterns. You could write it like:
fun addLists (nil,L) = L
  | addLists (L,nil) = L
  | addLists (ListX as x::xs, ListY as y::ys) =
      (x + y)::(addLists(xs,ys))
and it'll still work, except that we do not refer to ListX or ListY here, and hence these two as patterns are unnecessary.