Removing from a list of a tuples that contains an empty element in the second projection F# - functional-programming

I need to learn the right way to do pattern matching on Pair types:
let pairToBeFiltered = Ok ([(1,[]);(2,[3;4]);(5,[6;7;8]);(9,[]);(10,[])])
let filterEmpty (pair: int * int list) =
match pair with
| (x,y) when y <> [] -> (x,y) //This gives error because of incomplete pattern matching!
let filtering = List.map(filterEmpty) pairToBeFiltered
Desired output:
Ok([(2,[3;4]);(5,[6;7;8])])

This should do it:
let pairsToBeFiltered = Ok ([(1,[]);(2,[3;4]);(5,[6;7;8]);(9,[]);(10,[])])
let filterEmpty pairs =
List.where (fun (_, y) -> y <> []) pairs // pattern match on pair occurs here
let filtering : Result<_, string> =
pairsToBeFiltered
|> Result.map filterEmpty
printfn "%A" filtering // Ok [(2, [3; 4]); (5, [6; 7; 8])]
There are a number of issues here:
For clarity, I modified filterEmpty so it processes the entire list, rather than a single pair. This is where we apply the filtering function, List.where, using pattern matching. (In your code, note that List.map with a match expression doesn't filter anything.)
Since your list is wrapped in a Result, you need to unwrap it via Result.map in order to process it. (Since you didn't specify a 'TError type, I assumed string to pacify the compiler.)

Three more versions:
(* using match statement *)
module Version1 =
let pairsToBeFiltered : Result<_, string> =
Ok [(1,[]);(2,[3;4]);(5,[6;7;8]);(9,[]);(10,[])]
let myWhere (pair : int * List<int>) =
match pair with
| _, [] -> false
| _, _ -> true
let myFilter l0 = l0 |> Result.map (List.filter myWhere)
let result = pairsToBeFiltered |> myFilter
(* using lambda functions and List.isEmpty *)
module Version2 =
let pairsToBeFiltered : Result<_, string> =
Ok [(1,[]);(2,[3;4]);(5,[6;7;8]);(9,[]);(10,[])]
let myFilter l0 =
l0
|> Result.map (fun l1 ->
l1 |> List.filter (fun (_, l2) ->
l2 |> List.isEmpty |> not))
let result = pairsToBeFiltered |> myFilter
(* shortening Version2 (point free style - take care, can be confusing) *)
module Version3 =
let pairsToBeFiltered : Result<_, string> =
Ok [(1,[]);(2,[3;4]);(5,[6;7;8]);(9,[]);(10,[])]
let myFilter = Result.map (List.filter (snd >> List.isEmpty >> not))
let result = pairsToBeFiltered |> myFilter

Related

Parse the string into a list of tuples

I am looking for a piece of code in F# that can parse this type of string:
"x=1,y=42,A=[1,3,4,8]"
into a list of tuples that looks like this:
[("x",1);("y",42);("A",1);("A",3);("A",4);("A",8)]
Thanks in advance :)
You can quite nicely solve this using the FParsec parser combinator library. This is manageable using regular expressions, but it's not very elegant. Parser combinators make it very clear what the grammar of the inputs that you can handle is. You can also easily add other features like whitespace.
The following actually produces a list of string * Value pairs where Value is a new data type, corresponding to the possible right-hand-sides in the input:
type Value = Int of int | List of int list
Now, you can do the parsing using the following:
let ident = identifier (IdentifierOptions())
let rhs =
// Right-hand-side is either an integer...
( pint32 |>> Int ) <|>
// Or a list [ .. ] of integers separated by ','
( pchar '[' >>. (sepBy pint32 (pchar ',')) .>> pchar ']' |>> List )
let tuple =
// A single tuple is an identifier = right-hand-side
ident .>> pchar '=' .>>. rhs
let p =
// The input is a comma separated list of tuples
sepBy tuple (pchar ',')
run p "x=1,y=42,A=[1,3,4,8]"
Sometimes a named regex makes for readable code, even if not the regex.
(?<id>\w+)=((\[((?<list>(\d+))*,?\s*)*\])|(?<number>\d+))
This reads: Identifier = [Number followed by comma or space, zero or more] | Number
let parse input =
[
let regex = Regex("(?<id>\w+)=((\[((?<list>(\d+))*,?\s*)*\])|(?<number>\d+))")
let matches = regex.Matches input
for (expr : Match) in matches do
let group name = expr.Groups.[string name]
let id = group "id"
let list = group "list"
let number = group "number"
if list.Success then
for (capture : Capture) in list.Captures do
yield (id.Value, int capture.Value)
else if number.Success then
yield (id.Value, int number.Value)
]
Test
let input = "var1=1, var2=2, list=[1, 2, 3, 4], single=[1], empty=[], bad=[,,], bad=var"
printfn "%A" (parse input)
Output
[("var1", 1); ("var2", 2); ("list", 1); ("list", 2); ("list", 3); ("list", 4); "single", 1)]
It's quite advisable to follow the approach outlined by Tomas Petricek's answer, employing the established FParsec parser combinator library.
For educational purposes, you might want to roll your own parser combinator, and for this endeavor Scott W.'s blog ("Understanding parser combinators", and "Building a useful set of parser combinators") contains valuable information.
The parsing looks quite similar:
// parse a list of integers enclosed in brackets and separated by ','
let plist = pchar '[' >>. sepBy1 pint (pchar ',') .>> pchar ']'
// parser for the right hand side, singleton integer or a list of integers
let intOrList = pint |>> (fun x -> [x]) <|> plist
// projection for generation of string * integer tuples
let ungroup p =
p |>> List.collect (fun (key, xs) -> xs |> List.map (fun x -> key, x))
// parser for an input of zero or more string value pairs separated by ','
let parser =
sepBy (letters .>> pchar '=' .>>. intOrList) (pchar ',')
|> ungroup
"x=1,y=42,A=[1,3,4,8]"
|> run parser
// val it : ((String * int) list * string) option =
// Some ([("x", 1); ("y", 42); ("A", 1); ("A", 3); ("A", 4); ("A", 8)], "")
This simple grammar still requires 15 or so parser combinators. Another difference is that for simplicity's sake the Parser type has been modeled on FSharp's Option type.
type Parser<'T,'U> = Parser of ('T -> ('U * 'T) option)
let run (Parser f1) x = // run the parser with input
f1 x
let returnP arg = // lift a value to a Parser
Parser (fun x -> Some(arg, x))
let (>>=) (Parser f1) f = // apply parser-producing function
Parser(f1 >> Option.bind (fun (a, b) -> run (f a) b))
let (|>>) p f = // apply function to value inside Parser
p >>= (f >> returnP)
let (.>>.) p1 p2 = // andThen combinator
p1 >>= fun r1 ->
p2 >>= fun r2 ->
returnP (r1, r2)
let (.>>) p1 p2 = // andThen, but keep first value only
(p1 .>>. p2) |>> fst
let (>>.) p1 p2 = // andThen, keep second value only
(p1 .>>. p2) |>> snd
let pchar c = // parse a single character
Parser (fun s ->
if String.length s > 0 && s.[0] = c then Some(c, s.[1..])
else None )
let (<|>) (Parser f1) (Parser f2) = // orElse combinator
Parser(fun arg ->
match f1 arg with None -> f2 arg | res -> res )
let choice parsers = // choose any of a list of combinators
List.reduce (<|>) parsers
let anyOf = // choose any of a list of characters
List.map pchar >> choice
let many (Parser f) = // matches zero or more occurrences
let rec aux input =
match f input with
| None -> [], input
| Some (x, rest1) ->
let xs, rest2 = aux rest1
x::xs, rest2
Parser (fun arg -> Some(aux arg))
let many1 p = // matches one or more occurrences of p
p >>= fun x ->
many p >>= fun xs ->
returnP (x::xs)
let stringP p = // converts list of characters to string
p |>> (fun xs -> System.String(List.toArray xs))
let letters = // matches one or more letters
many1 (anyOf ['A'..'Z'] <|> anyOf ['a'..'z']) |> stringP
let pint = // matches an integer
many1 (anyOf ['0'..'9']) |> stringP |>> int
let sepBy1 p sep = // matches p one or more times, separated by sep
p .>>. many (sep >>. p) |>> (fun (x,xs) -> x::xs)
let sepBy p sep = // matches p zero or more times, separated by sep
sepBy1 p sep <|> returnP []
Try this:
open System.Text.RegularExpressions
let input = "x=1,y=42,A=[1,3,4,8]"
Regex.Split(input,",(?=[A-Za-z])") //output: [|"x=1"; "y=42"; "A=[1,3,4,8]"|]
|> Array.collect (fun x ->
let l,v = Regex.Split(x,"=") |> fun t -> Array.head t,Array.last t //label and value
Regex.Split(v,",") |> Array.map (fun x -> l,Regex.Replace(x,"\[|\]","") |> int))
|> List.ofArray

Asynchronous mapFold

For the following example, Array.mapFold produces the result ([|1; 4; 12|], 7).
let mapping s x = (s * x, s + x)
[| 1..3 |]
|> Array.mapFold mapping 1
Now suppose our mapping is asynchronous.
let asyncMapping s x = async { return (s * x, s + x) }
I am able to create Array.mapFoldAsync for the following to work.
[| 1..3 |]
|> Array.mapFoldAsync asyncMapping 1
|> Async.RunSynchronously
Is there a succinct way to achieve this without creating Array.mapFoldAsync?
I am asking as a way to learn other techniques - my attempts using Array.fold were horrible.
I don't think it would generally be of much benefit to combine mapFold with an Async function, because the expected result is a tuple ('values * 'accumulator), but using an Async function will at best give you an Async<'values * 'accumulator>. Consider the following attempt to make Array.mapFold work with Async:
let mapping s x = async {
let! s' = s
let! x' = x
return (s' * x', s' + x')
}
[| 1..3 |]
|> Array.map async.Return
|> Array.mapFold mapping (async.Return 1)
Even this doesn't work, because of the type mismatch: The type ''a * Async<'b>' does not match the type 'Async<'c * 'd>'.
You may also have noticed that while there is an Array.Parallel.map, there's no Array.Parallel.fold or Array.Parallel.mapFold. If you try to write your own mapFoldAsync, you may see why. The mapping part is pretty easy, just partially apply Array.map and compose with Async.Parallel:
let mapAsync f = Array.map f >> Async.Parallel
You can implement an async fold as well, but since each evaluation depends on the previous result, you can't leverage Async.Parallel this time:
let foldAsync f state array =
match array |> Array.length with
| 0 -> async.Return state
| length ->
async {
let mutable acc = state
for i = 0 to length - 1 do
let! value = f acc array.[i]
acc <- value
return acc
}
Now, when we try to combine these to build a mapFoldAsync, it becomes apparent that we can't leverage parallel execution on the mapping anymore, because both the values and the accumulator can be based on the result of the previous evaluation. That means our mapFoldAsync will be a modified 'foldAsync', not a composition of it with mapAsync:
let mapFoldAsync (f: 's -> 'a -> Async<'b * 's>) (state: 's) (array: 'a []) =
match array |> Array.length with
| 0 -> async.Return ([||], state)
| length ->
async {
let mutable acc = state
let results = Array.init length <| fun _ -> Unchecked.defaultof<'b>
for i = 0 to length - 1 do
let! (x,y) = f acc array.[i]
results.[i] <- x
acc <- y
return (results, acc)
}
While this will give you a way to do a mapFold with an async mapping function, the only real benefit would be if the mapping function did something with high-latency, such as a service call. You won't be able to leverage parallel execution for speed-up. If possible, I would suggest considering an alternative solution, based on your real-world scenario.
Without external libraries (I recommend to try AsyncSeq or Hopac.Streams)
you could do this:
let mapping s x = (fst s * x, snd s + x) |> async.Return
module Array =
let mapFoldAsync folderAsync (state: 'state) (array: 'elem []) = async {
let mutable finalState = state
for elem in array do
let! nextState = folderAsync finalState elem
finalState <- nextState
return finalState
}
[| 1..4 |]
|> Array.mapFoldAsync mapping (1,0)
|> Async.RunSynchronously

Already existing high-order function for this algorithm?

I've come up with this simple algorithm (convert list of tuples to a map collection of keys to lists) that I needed in my F# code:
let MergeIntoMap<'K,'V when 'K: comparison>(from: seq<'K*'V>): Map<'K,seq<'V>>=
let keys = from.Select(fun (k,v) -> k)
let keyValuePairs = seq {
for key in keys do
let valsForKey = from.Where(fun (k,v) -> key = k).Select(fun (k,v) -> v) |> seq
yield key,valsForKey
}
keyValuePairs |> Map.ofSeq
Example input:
[ ("a", 1); ("b", 2), ("a", 3) ]
Output:
dict [ ("a", [1; 3]), ("b", [2]) ]
And I was thinking this must be something that is already in the BCL or F#'s set of high order functions maybe? If yes, can someone reference me to it? Because I'm sure my code is not very efficient as it is...
It seems you want to get something like that
let toGroupMap x =
x
|> Seq.groupBy fst
|> Seq.map
(fun (k,v) -> k, v |> Seq.map snd |> Seq.toArray)
|> Map.ofSeq
fsi:
val toGroupMap : x:seq<'a * 'b> -> Map<'a,'b []> when 'a : comparison
val input : (string * int) list = [("a", 1); ("b", 2); ("a", 3)]
val output : Map<string,int []> = map [("a", [|1; 3|]); ("b", [|2|])]
Edit
As written Fyodor Soikin in the comments, there is a extension method ToLookup, which probably does what you need.
open System.Linq
let output = input.ToLookup(fst, snd)
You can read here about the difference between ILookup and IDictionary interfaces

F# idiomatic way of determining if all (key*value) list entries sharing key have same singular distinct value

Given a set of dictionaries of int*string where the first is the "primary", I want to answer the question:
For all additional dictionaries, do they all have the same values for the same keys as the primary?
I've currently achieved this by doing the following:
let allSame = primary # remaining
|> Seq.groupBy (fun (pos, _) -> pos)
|> Seq.map (fun (pos, items) -> (pos, items |> Seq.map (fun (_, name) -> name) |> Seq.distinct |> List.ofSeq))
|> Seq.exists (fun (_, names) -> names.Length > 1))
I'm wondering if there's a more idiomatic way of achieving this?
Going through the duplication of pos in the grouping int * (int * string) list and then having to reduce down to int * string list seems a bit redundant, but groupBy unfortunately doesn't offer a value projection overload.
EDIT
Given a bunch of items (fields) with simplified structure {SortOrder:int;Name:string;...}
I'm going: Field list -> (int * string) list
The "primary" is just the head of the list, it doesn't matter which one I pick as "primary" because I'm only interested in whether all fields with the same position also share the same name.
That's why I'm grouping by position, then reducing down to a distinct list of names, and just counting the entries (>1 obviously means some divergence).
Eventual Solution
Here's what I ended up with:
let primary = getFields <| fst x
let allSame = (primary) # ((tail |> List.map (fun (m,_) -> getFields m)) |> List.collect (fun e -> e))
|> Seq.sortBy (fun (pos, _) -> pos)
|> Seq.pairwise
|> Seq.forall (fun ((_,namex),(_,namey)) -> Seq.forall2 (=) namex namey)
if allSame then
Some (fst x)
else
failwith "Some error message here"
Like #Carsten said in his comment, sort by key and then compare each KeyValuePair. As an added benefit, Seq.forall is lazy and stops evaluation at the first mismatch.
[primary; remaining1; remaining2]
|> Seq.map (Seq.sortBy (fun (KeyValue(k,_)) -> k))
|> Seq.pairwise
|> Seq.forall (fun (x, y) -> Seq.forall2 (=) x y)
Going for pure readability, I'd prefer to define a helper function to test each key/value pair.
Your question doesn't say if a mismatched key should be OK or not, so pick whichever is appropriate:
(mismatched key is OK)
let looselyAllSame (primary :: remaining) =
let hasDifferentName key value =
primary |> Map.tryFind key |> Option.exists ((<>) value)
not (remaining |> List.exists (Map.exists hasDifferentName))
(mismatched key is not OK)
let strictlyAllSame (primary :: remaining)=
let hasSameName key value =
primary |> Map.tryFind key |> Option.exists ((=) value)
remaining |> List.forall (Map.forall hasSameName)

Mutually recursive let bindings

I'm trying to implement a parser that looks something like this:
open System
type ParseResult<'a> =
{
Result : Option<'a>;
Rest : string
}
let Fail = fun input -> { Result = None; Rest = input }
let Return a = fun input -> { Result = Some a; Rest = input }
let ThenBind p f =
fun input ->
let r = p input
match r.Result with
| None -> { Result = None; Rest = input } // Recreate the result since p returns a ParseResult<'a>
| _ -> (f r.Result) r.Rest
let Then p1 p2 = ThenBind p1 (fun r -> p2)
let Or p1 p2 =
fun input ->
let r = p1 input
match r.Result with
| None -> p2 input
| _ -> r
let rec Chainl1Helper a p op =
Or
<| ThenBind op (fun f ->
ThenBind p (fun y ->
Chainl1Helper (f.Value a y.Value) p op))
<| Return a
let Chainl1 p op = ThenBind p (fun x -> Chainl1Helper x.Value p op)
let rec Chainr1 p op =
ThenBind p (fun x ->
Or
(ThenBind op (fun f ->
ThenBind (Chainr1 p op) (fun y ->
Return (f.Value x.Value y.Value))))
(Return x.Value))
let Next = fun input ->
match input with
| null -> { Result = None; Rest = input }
| "" -> { Result = None; Rest = input }
| _ -> { Result = Some <| char input.[0..1]; Rest = input.[1..] }
let Sat predicate = ThenBind Next (fun n -> if predicate n.Value then Return n.Value else Fail)
let Digit = ThenBind (Sat Char.IsDigit) (fun c -> Return <| float c.Value)
let rec NatHelper i =
Or
(ThenBind Digit (fun x ->
NatHelper (float 10 * i + x.Value) ))
(Return i)
let Nat = ThenBind Digit (fun d -> NatHelper d.Value)
let LiteralChar c = Sat (fun x -> x = c)
let rec Literal input token =
match input with
| "" -> Return token
| _ -> Then (LiteralChar <| char input.[0..1]) (Literal input.[1..] token)
let AddSub =
Or
<| ThenBind (LiteralChar '+') (fun c -> Return (+))
<| ThenBind (LiteralChar '-') (fun c -> Return (-))
let MulDiv =
Or
<| ThenBind (LiteralChar '*') (fun c -> Return (*))
<| ThenBind (LiteralChar '/') (fun c -> Return (/))
let Exp = ThenBind (LiteralChar '^') (fun c -> Return ( ** ))
let rec Expression = Chainl1 Term AddSub
and Term = Chainl1 Factor MulDiv
and Factor = Chainr1 Part Exp
and Part = Or Nat Paren
and Paren =
Then
<| LiteralChar '('
<| ThenBind Expression (fun e ->
Then (LiteralChar ')') (Return e.Value))
The last functions are mutually recursive in their definitions. Expression's definition depends on Term, which depends on Factor, which depends on Part, which depends on Paren, which depends on Expression.
When I try to compile this, I get an error about mutually recursive definitions with the suggestion to make Expression lazy or a function. I tried both of those, and I get a cryptic InvalidOperationException with both that says something about ValueFactory attempting to access the Value property.
In general, F# lets you use let rec .. and .. not just for defining mutually recursive functions, but also for defining mutually recursive values. This means that you might be able to write something like this:
let rec Expression = Chainl1 Term AddSub
and Paren =
Then
<| LiteralChar '('
<| ThenBind Expression (fun e ->
Then (LiteralChar ')') (Return e.Value))
and Part = Or Nat Paren
and Factor = Chainr1 Part Exp
and Term = Chainl1 Factor MulDiv
However, this only works if the computation is not evaluated immediately (because then the recursive definition would not make sense). This very much depends on the library you're using here (or on the rest of your code). But you can try the above and see if that works - if no, you'll need to provide more details.
EDIT In the updated example, there is an immediate loop in your recursive definition. You need to delay some part of the definition using fun _ -> ... so that not everything needs to be evaluated at once. In your example, you can do that by replacing Then with ThenBind in the definition of Paren:
let rec Expression = Chainl1 Term AddSub
and Term = Chainl1 Factor MulDiv
and Factor = Chainr1 Part Exp
and Part = Or Nat Paren
and Paren =
ThenBind
(LiteralChar '(')
(fun _ -> ThenBind Expression (fun e ->
Then (LiteralChar ')') (Return e.Value)))

Resources