Already existing high-order function for this algorithm? - collections

I've come up with this simple algorithm (convert list of tuples to a map collection of keys to lists) that I needed in my F# code:
let MergeIntoMap<'K,'V when 'K: comparison>(from: seq<'K*'V>): Map<'K,seq<'V>>=
let keys = from.Select(fun (k,v) -> k)
let keyValuePairs = seq {
for key in keys do
let valsForKey = from.Where(fun (k,v) -> key = k).Select(fun (k,v) -> v) |> seq
yield key,valsForKey
}
keyValuePairs |> Map.ofSeq
Example input:
[ ("a", 1); ("b", 2), ("a", 3) ]
Output:
dict [ ("a", [1; 3]), ("b", [2]) ]
And I was thinking this must be something that is already in the BCL or F#'s set of high order functions maybe? If yes, can someone reference me to it? Because I'm sure my code is not very efficient as it is...

It seems you want to get something like that
let toGroupMap x =
x
|> Seq.groupBy fst
|> Seq.map
(fun (k,v) -> k, v |> Seq.map snd |> Seq.toArray)
|> Map.ofSeq
fsi:
val toGroupMap : x:seq<'a * 'b> -> Map<'a,'b []> when 'a : comparison
val input : (string * int) list = [("a", 1); ("b", 2); ("a", 3)]
val output : Map<string,int []> = map [("a", [|1; 3|]); ("b", [|2|])]
Edit
As written Fyodor Soikin in the comments, there is a extension method ToLookup, which probably does what you need.
open System.Linq
let output = input.ToLookup(fst, snd)
You can read here about the difference between ILookup and IDictionary interfaces

Related

Removing from a list of a tuples that contains an empty element in the second projection F#

I need to learn the right way to do pattern matching on Pair types:
let pairToBeFiltered = Ok ([(1,[]);(2,[3;4]);(5,[6;7;8]);(9,[]);(10,[])])
let filterEmpty (pair: int * int list) =
match pair with
| (x,y) when y <> [] -> (x,y) //This gives error because of incomplete pattern matching!
let filtering = List.map(filterEmpty) pairToBeFiltered
Desired output:
Ok([(2,[3;4]);(5,[6;7;8])])
This should do it:
let pairsToBeFiltered = Ok ([(1,[]);(2,[3;4]);(5,[6;7;8]);(9,[]);(10,[])])
let filterEmpty pairs =
List.where (fun (_, y) -> y <> []) pairs // pattern match on pair occurs here
let filtering : Result<_, string> =
pairsToBeFiltered
|> Result.map filterEmpty
printfn "%A" filtering // Ok [(2, [3; 4]); (5, [6; 7; 8])]
There are a number of issues here:
For clarity, I modified filterEmpty so it processes the entire list, rather than a single pair. This is where we apply the filtering function, List.where, using pattern matching. (In your code, note that List.map with a match expression doesn't filter anything.)
Since your list is wrapped in a Result, you need to unwrap it via Result.map in order to process it. (Since you didn't specify a 'TError type, I assumed string to pacify the compiler.)
Three more versions:
(* using match statement *)
module Version1 =
let pairsToBeFiltered : Result<_, string> =
Ok [(1,[]);(2,[3;4]);(5,[6;7;8]);(9,[]);(10,[])]
let myWhere (pair : int * List<int>) =
match pair with
| _, [] -> false
| _, _ -> true
let myFilter l0 = l0 |> Result.map (List.filter myWhere)
let result = pairsToBeFiltered |> myFilter
(* using lambda functions and List.isEmpty *)
module Version2 =
let pairsToBeFiltered : Result<_, string> =
Ok [(1,[]);(2,[3;4]);(5,[6;7;8]);(9,[]);(10,[])]
let myFilter l0 =
l0
|> Result.map (fun l1 ->
l1 |> List.filter (fun (_, l2) ->
l2 |> List.isEmpty |> not))
let result = pairsToBeFiltered |> myFilter
(* shortening Version2 (point free style - take care, can be confusing) *)
module Version3 =
let pairsToBeFiltered : Result<_, string> =
Ok [(1,[]);(2,[3;4]);(5,[6;7;8]);(9,[]);(10,[])]
let myFilter = Result.map (List.filter (snd >> List.isEmpty >> not))
let result = pairsToBeFiltered |> myFilter

Parse the string into a list of tuples

I am looking for a piece of code in F# that can parse this type of string:
"x=1,y=42,A=[1,3,4,8]"
into a list of tuples that looks like this:
[("x",1);("y",42);("A",1);("A",3);("A",4);("A",8)]
Thanks in advance :)
You can quite nicely solve this using the FParsec parser combinator library. This is manageable using regular expressions, but it's not very elegant. Parser combinators make it very clear what the grammar of the inputs that you can handle is. You can also easily add other features like whitespace.
The following actually produces a list of string * Value pairs where Value is a new data type, corresponding to the possible right-hand-sides in the input:
type Value = Int of int | List of int list
Now, you can do the parsing using the following:
let ident = identifier (IdentifierOptions())
let rhs =
// Right-hand-side is either an integer...
( pint32 |>> Int ) <|>
// Or a list [ .. ] of integers separated by ','
( pchar '[' >>. (sepBy pint32 (pchar ',')) .>> pchar ']' |>> List )
let tuple =
// A single tuple is an identifier = right-hand-side
ident .>> pchar '=' .>>. rhs
let p =
// The input is a comma separated list of tuples
sepBy tuple (pchar ',')
run p "x=1,y=42,A=[1,3,4,8]"
Sometimes a named regex makes for readable code, even if not the regex.
(?<id>\w+)=((\[((?<list>(\d+))*,?\s*)*\])|(?<number>\d+))
This reads: Identifier = [Number followed by comma or space, zero or more] | Number
let parse input =
[
let regex = Regex("(?<id>\w+)=((\[((?<list>(\d+))*,?\s*)*\])|(?<number>\d+))")
let matches = regex.Matches input
for (expr : Match) in matches do
let group name = expr.Groups.[string name]
let id = group "id"
let list = group "list"
let number = group "number"
if list.Success then
for (capture : Capture) in list.Captures do
yield (id.Value, int capture.Value)
else if number.Success then
yield (id.Value, int number.Value)
]
Test
let input = "var1=1, var2=2, list=[1, 2, 3, 4], single=[1], empty=[], bad=[,,], bad=var"
printfn "%A" (parse input)
Output
[("var1", 1); ("var2", 2); ("list", 1); ("list", 2); ("list", 3); ("list", 4); "single", 1)]
It's quite advisable to follow the approach outlined by Tomas Petricek's answer, employing the established FParsec parser combinator library.
For educational purposes, you might want to roll your own parser combinator, and for this endeavor Scott W.'s blog ("Understanding parser combinators", and "Building a useful set of parser combinators") contains valuable information.
The parsing looks quite similar:
// parse a list of integers enclosed in brackets and separated by ','
let plist = pchar '[' >>. sepBy1 pint (pchar ',') .>> pchar ']'
// parser for the right hand side, singleton integer or a list of integers
let intOrList = pint |>> (fun x -> [x]) <|> plist
// projection for generation of string * integer tuples
let ungroup p =
p |>> List.collect (fun (key, xs) -> xs |> List.map (fun x -> key, x))
// parser for an input of zero or more string value pairs separated by ','
let parser =
sepBy (letters .>> pchar '=' .>>. intOrList) (pchar ',')
|> ungroup
"x=1,y=42,A=[1,3,4,8]"
|> run parser
// val it : ((String * int) list * string) option =
// Some ([("x", 1); ("y", 42); ("A", 1); ("A", 3); ("A", 4); ("A", 8)], "")
This simple grammar still requires 15 or so parser combinators. Another difference is that for simplicity's sake the Parser type has been modeled on FSharp's Option type.
type Parser<'T,'U> = Parser of ('T -> ('U * 'T) option)
let run (Parser f1) x = // run the parser with input
f1 x
let returnP arg = // lift a value to a Parser
Parser (fun x -> Some(arg, x))
let (>>=) (Parser f1) f = // apply parser-producing function
Parser(f1 >> Option.bind (fun (a, b) -> run (f a) b))
let (|>>) p f = // apply function to value inside Parser
p >>= (f >> returnP)
let (.>>.) p1 p2 = // andThen combinator
p1 >>= fun r1 ->
p2 >>= fun r2 ->
returnP (r1, r2)
let (.>>) p1 p2 = // andThen, but keep first value only
(p1 .>>. p2) |>> fst
let (>>.) p1 p2 = // andThen, keep second value only
(p1 .>>. p2) |>> snd
let pchar c = // parse a single character
Parser (fun s ->
if String.length s > 0 && s.[0] = c then Some(c, s.[1..])
else None )
let (<|>) (Parser f1) (Parser f2) = // orElse combinator
Parser(fun arg ->
match f1 arg with None -> f2 arg | res -> res )
let choice parsers = // choose any of a list of combinators
List.reduce (<|>) parsers
let anyOf = // choose any of a list of characters
List.map pchar >> choice
let many (Parser f) = // matches zero or more occurrences
let rec aux input =
match f input with
| None -> [], input
| Some (x, rest1) ->
let xs, rest2 = aux rest1
x::xs, rest2
Parser (fun arg -> Some(aux arg))
let many1 p = // matches one or more occurrences of p
p >>= fun x ->
many p >>= fun xs ->
returnP (x::xs)
let stringP p = // converts list of characters to string
p |>> (fun xs -> System.String(List.toArray xs))
let letters = // matches one or more letters
many1 (anyOf ['A'..'Z'] <|> anyOf ['a'..'z']) |> stringP
let pint = // matches an integer
many1 (anyOf ['0'..'9']) |> stringP |>> int
let sepBy1 p sep = // matches p one or more times, separated by sep
p .>>. many (sep >>. p) |>> (fun (x,xs) -> x::xs)
let sepBy p sep = // matches p zero or more times, separated by sep
sepBy1 p sep <|> returnP []
Try this:
open System.Text.RegularExpressions
let input = "x=1,y=42,A=[1,3,4,8]"
Regex.Split(input,",(?=[A-Za-z])") //output: [|"x=1"; "y=42"; "A=[1,3,4,8]"|]
|> Array.collect (fun x ->
let l,v = Regex.Split(x,"=") |> fun t -> Array.head t,Array.last t //label and value
Regex.Split(v,",") |> Array.map (fun x -> l,Regex.Replace(x,"\[|\]","") |> int))
|> List.ofArray

Optimized version of except for Map

I've written a generic except function for Maps that, given a source map and an other map, returns only the items of the source map without corresponding keys in the other map.
module MapExt =
let getKeys<'k,'v when 'k : comparison> : Map<'k,'v> -> 'k[] =
Map.toArray >> Array.map fst
let except<'k,'v when 'k : comparison>(other:Map<'k,'v>) (source:Map<'k,'v>) : ('k * 'v)[] =
source |> getKeys
|> Array.except (other |> getKeys)
|> Array.map(fun k -> (k, source.[k]))
Now, I've seen in the second part of this answer, that an optimized version of the map's keys is obtained via a Map.fold.
Therefore, can I do a similar optimization of my original MapExt module in the following way?
module MapExtOpt =
let getKeys<'k,'v when 'k : comparison> (m : Map<'k,'v>) : 'k list =
Map.fold (fun keys key _ -> key::keys) [] m
let except<'k,'v when 'k : comparison>
(other : Map<'k,'v>) (source : Map<'k,'v>) : ('k * 'v) list =
source
|> Map.fold (fun s k v ->
if (other.ContainsKey k) then
s
else
(k,v) :: s
) []
Or am I reinventing some already existing (and optimized) functions?
I don't think there is a built in function, but this is a simpler way of doing what you are trying to do. It only goes over the 'to be removed' map once, so its much more efficient.
let except toRemove source =
Map.fold (fun m k _ -> if Map.containsKey k m then Map.remove k m else m) source toRemove
Finally,
thanks to Loïc Denuzière for his comment on Slack:
The if is not necessary: if m doesn't contain k, Map.remove k m just returns m anyway
I think I can also apply a double eta reduction by considering that it makes sense to speak about the keys to remove (not about a map whose values are ignored), so I would simply redefine it as
let except<'k,'v when 'k : comparison> = List.foldBack Map.remove<'k,'v>

Asynchronous mapFold

For the following example, Array.mapFold produces the result ([|1; 4; 12|], 7).
let mapping s x = (s * x, s + x)
[| 1..3 |]
|> Array.mapFold mapping 1
Now suppose our mapping is asynchronous.
let asyncMapping s x = async { return (s * x, s + x) }
I am able to create Array.mapFoldAsync for the following to work.
[| 1..3 |]
|> Array.mapFoldAsync asyncMapping 1
|> Async.RunSynchronously
Is there a succinct way to achieve this without creating Array.mapFoldAsync?
I am asking as a way to learn other techniques - my attempts using Array.fold were horrible.
I don't think it would generally be of much benefit to combine mapFold with an Async function, because the expected result is a tuple ('values * 'accumulator), but using an Async function will at best give you an Async<'values * 'accumulator>. Consider the following attempt to make Array.mapFold work with Async:
let mapping s x = async {
let! s' = s
let! x' = x
return (s' * x', s' + x')
}
[| 1..3 |]
|> Array.map async.Return
|> Array.mapFold mapping (async.Return 1)
Even this doesn't work, because of the type mismatch: The type ''a * Async<'b>' does not match the type 'Async<'c * 'd>'.
You may also have noticed that while there is an Array.Parallel.map, there's no Array.Parallel.fold or Array.Parallel.mapFold. If you try to write your own mapFoldAsync, you may see why. The mapping part is pretty easy, just partially apply Array.map and compose with Async.Parallel:
let mapAsync f = Array.map f >> Async.Parallel
You can implement an async fold as well, but since each evaluation depends on the previous result, you can't leverage Async.Parallel this time:
let foldAsync f state array =
match array |> Array.length with
| 0 -> async.Return state
| length ->
async {
let mutable acc = state
for i = 0 to length - 1 do
let! value = f acc array.[i]
acc <- value
return acc
}
Now, when we try to combine these to build a mapFoldAsync, it becomes apparent that we can't leverage parallel execution on the mapping anymore, because both the values and the accumulator can be based on the result of the previous evaluation. That means our mapFoldAsync will be a modified 'foldAsync', not a composition of it with mapAsync:
let mapFoldAsync (f: 's -> 'a -> Async<'b * 's>) (state: 's) (array: 'a []) =
match array |> Array.length with
| 0 -> async.Return ([||], state)
| length ->
async {
let mutable acc = state
let results = Array.init length <| fun _ -> Unchecked.defaultof<'b>
for i = 0 to length - 1 do
let! (x,y) = f acc array.[i]
results.[i] <- x
acc <- y
return (results, acc)
}
While this will give you a way to do a mapFold with an async mapping function, the only real benefit would be if the mapping function did something with high-latency, such as a service call. You won't be able to leverage parallel execution for speed-up. If possible, I would suggest considering an alternative solution, based on your real-world scenario.
Without external libraries (I recommend to try AsyncSeq or Hopac.Streams)
you could do this:
let mapping s x = (fst s * x, snd s + x) |> async.Return
module Array =
let mapFoldAsync folderAsync (state: 'state) (array: 'elem []) = async {
let mutable finalState = state
for elem in array do
let! nextState = folderAsync finalState elem
finalState <- nextState
return finalState
}
[| 1..4 |]
|> Array.mapFoldAsync mapping (1,0)
|> Async.RunSynchronously

Swap key and value in a map in fsharp

How to create a new map that is similar to the original one, but with swapped keys and values in Fsharp?
For example, I have this
let map1 =
[("A", "1"); ("B", "2"); ("C", "3");]
|> Map.ofList
and want to get this:
let map2 =
[("1", "A"); ("2", "B"); ("3", "C");]
|> Map.ofList
Thank you for your help!
Perhaps you will approach this decision:
let map1 = Map.ofList [("A", "1"); ("B", "2"); ("C", "3")]
map1 |> printfn "%A"
let rev map: Map<string,string> =
Map.fold (fun m key value -> m.Add(value,key)) Map.empty map
rev map1 |> printfn "%A"
Print:
map [("A", "1"); ("B", "2"); ("C", "3")]
map [("1", "A"); ("2", "B"); ("3", "C")]
Link: http://ideone.com/cfN2yH
You could convert it to a list and back, calling a function to swap in the middle.
let swap (x, y) = y, x
let swapAll tuples = List.map swap tuples
let invert map = map |> Map.toList |> swapAll |> Map.ofList
This method somewhat highlights what's nice about functional programming--you can build up complex behavior just by combining small building blocks.

Resources