F# - GroupBy and apply function to each property inside second tuple item - dictionary

I have an F# list of classes whose data I access through properties (I'm using a library developed in C#). I would like to group by one property and then apply a separate function to each property in the second item of the resulting tuple.
Example:
let grouped = list |> Seq.groupBy (fun x -> x.Year) //group by the year property. Results in Seq<int * seq<myClass>>
|> Seq.map (fun (a, b) -> (a, //How to map generic functions to each remaining property in the second tuple?
Hopefully this will make sense to someone. My second tuple item is a seq resulting from the groupBy. Each remaining property in MyClass needs to have a different function applied to it. In the past, to sum a property I have just done something like:
|> Seq.map (fun (a, b) -> (a, b |> Seq.sumBy (fun x -> x.myProperty)))
I'd like to do something like this using Seq.map for several properties.
Many Thanks for any help at all,
Richard

You need to somehow specify the properties that you want to work with - the simplest way is to create a list of functions that read the properties. Assuming your type is MyType, you can write something like this:
let properties = [ (fun (x:MyType) -> x.MyProperty) ]
After you construct the groups, you can iterate over all properties in properties (using List.map or an F# list comprehension) and calculate values |> Seq.sumBy prop, where values is the group and prop is the current property:
let grouped =
    list
    |> Seq.groupBy (fun x -> x.Year)
    |> Seq.map (fun (key, values) ->
        (key, [ for prop in properties -> values |> Seq.sumBy prop ]))
If you need to use other aggregation functions than Seq.sumBy, then you can build a list of aggregating operations that you need to run (instead of a list of properties).
let properties = [ "MyPropSum", Seq.sumBy (fun (x:MyType) -> x.MyProperty);
                   "MyProp2Avg", Seq.averageBy (fun (x:MyType) -> x.MyProperty2) ]
To make further processing easier, I would probably build a dictionary with the results - this can be easily done by passing the list with name-value pairs to the dict function:
let grouped =
    list
    |> Seq.groupBy (fun x -> x.Year)
    |> Seq.map (fun (key, values) ->
        (key, dict [ for name, aggregate in properties -> name, aggregate values ]))
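For reference, here is a self-contained sketch of the dictionary approach. The MyType record, its field names, and the sample data are assumptions made up for illustration; the original question uses a C# class whose shape is not shown.

```fsharp
// Hypothetical record standing in for the C# class from the question.
type MyType = { Year: int; MyProperty: float; MyProperty2: float }

// Made-up sample data.
let list =
    [ { Year = 2020; MyProperty = 1.0; MyProperty2 = 10.0 }
      { Year = 2020; MyProperty = 2.0; MyProperty2 = 20.0 }
      { Year = 2021; MyProperty = 5.0; MyProperty2 = 30.0 } ]

// Each entry pairs a result name with an aggregation over a whole group.
let properties : (string * (seq<MyType> -> float)) list =
    [ "MyPropSum", Seq.sumBy (fun (x: MyType) -> x.MyProperty)
      "MyProp2Avg", Seq.averageBy (fun (x: MyType) -> x.MyProperty2) ]

// Group by year, then run every aggregation on each group.
let grouped =
    list
    |> Seq.groupBy (fun x -> x.Year)
    |> Seq.map (fun (key, values) ->
        key, dict [ for name, aggregate in properties -> name, aggregate values ])
    |> Seq.toList

for year, results in grouped do
    printfn "%d: sum = %g, avg = %g" year results.["MyPropSum"] results.["MyProp2Avg"]
```

All aggregations in one list must share a result type (float here), which is why both properties in the sketch are floats.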

Related

flatten a map of map in F#

I have the following:
Map<Instrument, Map<PositionSide, PositionData>>
is there a way to flatten this to a:
PositionData list
without iterating through the 2 maps manually?
If we do not worry about duplicate values from the inner maps, it can be done by treating the maps as sequences. An F# Map implements IEnumerable<KeyValuePair<_,_>>, so enumerating it yields values of type System.Collections.Generic.KeyValuePair<_,_>.
These can be deconstructed with F#'s predefined active pattern KeyValue.
Map.empty<Instrument, Map<PositionSide, PositionData>>
|> Seq.collect (fun (KeyValue(_, v)) -> v)
|> Seq.map (fun (KeyValue(_, v)) -> v)
|> Seq.toList
// val it : PositionData list = []
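The same pipeline can be tried on non-empty data. The type definitions and sample values below are hypothetical stand-ins, since the question does not show what Instrument, PositionSide, or PositionData look like.

```fsharp
// Hypothetical stand-ins for the question's types.
type Instrument = Instrument of string
type PositionSide = Long | Short
type PositionData = { Quantity: int }

// Made-up nested map.
let positions : Map<Instrument, Map<PositionSide, PositionData>> =
    Map.ofList
        [ Instrument "A", Map.ofList [ Long, { Quantity = 1 }; Short, { Quantity = 2 } ]
          Instrument "B", Map.ofList [ Long, { Quantity = 3 } ] ]

// Enumerate the outer map, then the inner maps, keeping only the values.
let flattened : PositionData list =
    positions
    |> Seq.collect (fun (KeyValue(_, inner)) -> inner)
    |> Seq.map (fun (KeyValue(_, data)) -> data)
    |> Seq.toList

printfn "%A" flattened
```

Because Map enumerates in key order, the values come out ordered by outer key and then by inner key.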
I'm afraid it is not possible. There are several issues with it:
First, this is not two collections but potentially many collections nested in another collection.
Map is not built for iteration; it is dictionary-like. But you can convert it to a list of pairs first with Map.toList.
There is no alternative to iterating over the values if you need them all.
let toValuesList map =
    map
    |> Map.toList
    |> List.map snd
let ds =
    data
    |> toValuesList
    |> List.collect toValuesList
If you could look up PositionData by some key, you would at most avoid iterating over the inner maps:
let data =
    map
    |> toValuesList
    |> List.map (fun valueMap -> lookupWithin valueMap)

F# idiomatic way of determining if all (key*value) list entries sharing key have same singular distinct value

Given a set of dictionaries of int*string where the first is the "primary", I want to answer the question:
For all additional dictionaries, do they all have the same values for the same keys as the primary?
I've currently achieved this by doing the following:
let allSame =
    not (primary @ remaining
         |> Seq.groupBy (fun (pos, _) -> pos)
         |> Seq.map (fun (pos, items) -> (pos, items |> Seq.map (fun (_, name) -> name) |> Seq.distinct |> List.ofSeq))
         |> Seq.exists (fun (_, names) -> names.Length > 1))
I'm wondering if there's a more idiomatic way of achieving this?
Going through the duplication of pos in the grouping int * (int * string) list and then having to reduce down to int * string list seems a bit redundant, but groupBy unfortunately doesn't offer a value projection overload.
EDIT
Given a bunch of items (fields) with simplified structure {SortOrder:int;Name:string;...}
I'm going: Field list -> (int * string) list
The "primary" is just the head of the list, it doesn't matter which one I pick as "primary" because I'm only interested in whether all fields with the same position also share the same name.
That's why I'm grouping by position, then reducing down to a distinct list of names, and just counting the entries (>1 obviously means some divergence).
Eventual Solution
Here's what I ended up with:
let primary = getFields <| fst x
let allSame =
    (primary) @ ((tail |> List.map (fun (m,_) -> getFields m)) |> List.collect (fun e -> e))
    |> Seq.sortBy (fun (pos, _) -> pos)
    |> Seq.pairwise
    |> Seq.forall (fun ((_,namex),(_,namey)) -> Seq.forall2 (=) namex namey)
if allSame then
    Some (fst x)
else
    failwith "Some error message here"
Like @Carsten said in his comment, sort by key and then compare each KeyValuePair. As an added benefit, Seq.forall short-circuits and stops evaluation at the first mismatch.
[primary; remaining1; remaining2]
|> Seq.map (Seq.sortBy (fun (KeyValue(k,_)) -> k))
|> Seq.pairwise
|> Seq.forall (fun (x, y) -> Seq.forall2 (=) x y)
Going for pure readability, I'd prefer to define a helper function to test each key/value pair.
Your question doesn't say if a mismatched key should be OK or not, so pick whichever is appropriate:
(mismatched key is OK)
let looselyAllSame (primary :: remaining) =
    let hasDifferentName key value =
        primary |> Map.tryFind key |> Option.exists ((<>) value)
    not (remaining |> List.exists (Map.exists hasDifferentName))
(mismatched key is not OK)
let strictlyAllSame (primary :: remaining) =
    let hasSameName key value =
        primary |> Map.tryFind key |> Option.exists ((=) value)
    remaining |> List.forall (Map.forall hasSameName)
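To see how the two helpers differ, here is a small sketch that runs both on made-up maps. The explicit empty-list case and the Map<int, string> annotation are additions for illustration (they avoid the incomplete-pattern warning); the sample data is hypothetical.

```fsharp
// Mismatched keys are tolerated; only conflicting names for a shared key fail.
let looselyAllSame (maps: Map<int, string> list) =
    match maps with
    | [] -> true
    | primary :: remaining ->
        let hasDifferentName key value =
            primary |> Map.tryFind key |> Option.exists ((<>) value)
        not (remaining |> List.exists (Map.exists hasDifferentName))

// Every key in the remaining maps must exist in the primary with the same name.
let strictlyAllSame (maps: Map<int, string> list) =
    match maps with
    | [] -> true
    | primary :: remaining ->
        let hasSameName key value =
            primary |> Map.tryFind key |> Option.exists ((=) value)
        remaining |> List.forall (Map.forall hasSameName)

// Made-up sample data.
let primary = Map.ofList [ 1, "Id"; 2, "Name" ]
let same = Map.ofList [ 1, "Id"; 2, "Name" ]
let extraKey = Map.ofList [ 1, "Id"; 3, "Email" ]
let renamed = Map.ofList [ 1, "Id"; 2, "Title" ]

printfn "%b" (strictlyAllSame [ primary; same ])      // true
printfn "%b" (strictlyAllSame [ primary; extraKey ])  // false: key 3 is missing from primary
printfn "%b" (looselyAllSame [ primary; extraKey ])   // true: no shared key disagrees
printfn "%b" (looselyAllSame [ primary; renamed ])    // false: key 2 has two names
```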

Map a list of options to list of strings

I have the following function in OCaml:
let get_all_parents lst =
List.map (fun (name,opt) -> opt) lst
That maps my big list of (name, opt) pairs to just a list of opt. An option can be either None or Some value, where the value in this case is a string. I want a list of strings containing all my values.
I am a beginner learning OCaml.
I don't think filter and map used together is a good solution to this problem. This is because when you apply map to convert your string option to string, you will have the None case to deal with. Even if you know that you won't have any Nones because you filtered them away, the type checker doesn't, and can't help you. If you have non-exhaustive pattern match warnings enabled, you will get them, or you will have to supply some kind of dummy string for the None case. And, you will have to hope you don't introduce errors when refactoring later, or else write test cases or do more code review.
Instead, you need a function filter_map : ('a -> 'b option) -> 'a list -> 'b list. The idea is that this works like map, except filter_map f lst drops each element of lst for which f evaluates to None. If f evaluates to Some v, the result list will have v. You could then use filter_map like so:
filter_map (fun (_, opt) -> opt) lst
You could also write that as
filter_map snd lst
A more general example would be:
filter_map (fun (_, opt) ->
match opt with
| Some s -> Some (s ^ "\n")
| None -> None)
lst
filter_map can be implemented like this:
let filter_map f lst =
  let rec loop acc = function
    | [] -> List.rev acc
    | v::lst' ->
      match f v with
      | None -> loop acc lst'
      | Some v' -> loop (v'::acc) lst'
  in
  loop [] lst
EDIT For greater completeness, you could also do
let filter_map f lst =
  List.fold_left (fun acc v ->
    match f v with
    | Some v' -> v'::acc
    | None -> acc) [] lst
  |> List.rev
It's a shame that this kind of function wasn't in the standard library at the time of writing (List.filter_map was added in OCaml 4.08). It's also present in both Batteries Included and Jane Street Core.
I'm going to expand on @Carsten's answer. He is pointing you in the right direction.
It's not clear what question you're asking. For example, I'm not sure why you're telling us about your function get_all_parents. Possibly this function was your attempt to get the answer you want, and that it's not quite working for you. Or maybe you're happy with this function, but you want to do some further processing on its results?
Either way, List.map can't do the whole job because it always returns a list of the same length as its input. But you need a list that can be different lengths, depending on how many None values there are in the big list.
So you need a function that can extract only the parts of a list that you're interested in. As @Carsten says, the key function for this is List.filter.
Some combination of map and filter will definitely do what you want. Or you can just use fold, which has the power of both map and filter. Or you can write your own recursive function that does all the work.
Update
Maybe your problem is in extracting the string from a string option. The "nice" way to do this is to provide a default value to use when the option is None:
let get default xo =
match xo with
| None -> default
| Some x -> x
# get "none" (Some "abc");;
- : string = "abc"
# get "none" None;;
- : string = "none"
#
type opt = Some of string | None
List.fold_left (fun lres -> function
(name,Some value) -> value::lres
| (name,None) -> lres
) [] [("s1",None);("s2",Some "s2bis")]
result:
- : string list = ["s2bis"]

F# stop Seq.map when a predicate evaluates true

I'm currently generating a sequence in a similar way to:
migrators
|> Seq.map (fun m -> m())
The migrator function is ultimately returning a discriminated union like:
type MigratorResult =
| Success of string * TimeSpan
| Error of string * Exception
I want to stop the map once I encounter my first Error but I need to include the Error in the final sequence.
I have something like the following to display a final message to the user
match results |> List.rev with
| [] -> "No results equals no migrators"
| head :: _ ->
    match head with
    | Success (dt, t) -> "All migrators succeeded"
    | Error (dt, ex) -> "Migration halted owing to error"
So I need:
A way to stop the mapping when one of the map steps produces an Error
A way to have that error be the final element added to the sequence
I appreciate there may be a different sequence method other than map that will do this, I'm new to F# and searching online hasn't yielded anything as yet!
I guess there are multiple approaches here, but one way would be to use unfold:
migrators
|> Seq.unfold (fun ms ->
    match ms with
    | m :: tl ->
        match m () with
        | Success res -> Some (Success res, tl)
        | Error res -> Some (Error res, [])
    | [] -> None)
|> List.ofSeq
Note the List.ofSeq at the end, that's just there for realizing the sequence. A different way to go would be to use sequence comprehensions, some might say it results in a clearer code.
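A sequence-comprehension version might look roughly like the sketch below. The function name runUntilError and the three sample migrators are hypothetical; only the MigratorResult type comes from the question.

```fsharp
open System

type MigratorResult =
    | Success of string * TimeSpan
    | Error of string * Exception

// Runs the migrators in order, yielding each result and
// stopping right after the first Error has been yielded.
let rec runUntilError (migrators: (unit -> MigratorResult) list) : seq<MigratorResult> =
    seq {
        match migrators with
        | [] -> ()
        | m :: rest ->
            match m () with
            | Success _ as ok ->
                yield ok
                yield! runUntilError rest
            | Error _ as err ->
                yield err
    }

// Made-up migrators: the third one should never run.
let migrators =
    [ (fun () -> Success ("m1", TimeSpan.FromSeconds 1.0))
      (fun () -> Error ("m2", Exception "boom"))
      (fun () -> Success ("m3", TimeSpan.FromSeconds 1.0)) ]

let results = runUntilError migrators |> Seq.toList
printfn "%d results" results.Length
```

Like the unfold version, this needs the input as an F# list so that it can be pattern matched.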
The ugly things Tomaš alludes to are 1) mutable state, and 2) manipulation of the underlying enumerator. A higher-order function which returns up to and including when the predicate holds would then look like this:
module Seq =
    let takeUntil pred (xs : _ seq) = seq {
        use en = xs.GetEnumerator()
        let flag = ref true
        while !flag && en.MoveNext() do
            flag := not <| pred en.Current
            yield en.Current }
seq{1..10} |> Seq.takeUntil (fun x -> x % 5 = 0)
|> Seq.toList
// val it : int list = [1; 2; 3; 4; 5]
For your specific application, you'd map the cases of the DU to a boolean.
(migrators : seq<MigratorResult>)
|> Seq.takeUntil (function Success _ -> false | Error _ -> true)
I think the answer from @scrwtp is probably the nicest way to do this if your input is reasonably small (and you can turn it into an F# list to use pattern matching). I'll add one more version, which works when your input is just a sequence and you do not want to turn it into a list.
Essentially, you want to do something that's almost like Seq.takeWhile, but it gives you one additional item at the end (the one, for which the predicate fails).
To use a simpler example, the following returns all numbers from a sequence until one that is divisible by 5:
let nums = [ 2 .. 10 ]
nums
|> Seq.map (fun m -> m % 5)
|> Seq.takeWhile (fun n -> n <> 0)
So you basically just need to look one element ahead. To do this, you could use Seq.pairwise, which gives you the current and the next element in the sequence:
nums
|> Seq.map (fun m -> m % 5)
|> Seq.pairwise                          // Pair each value with the next one
|> Seq.takeWhile (fun (p, n) -> p <> 0)  // Continue while the first value of the pair passes the test
|> Seq.mapi (fun i (p, n) ->             // For the first pair, we return both values;
    if i = 0 then [p; n] else [n])       // for all others, we return only the second
|> Seq.concat
The only ugly thing here is that you then need to flatten the sequence again using mapi and concat.
This is not very nice, so a good thing to do would be to define your own higher-order function like Seq.takeUntilAfter that encapsulates the behavior you need (and hides all the ugly things). Then your code could just use the function and look nice & readable (and you can experiment with other ways of implementing this).

Returning object with different value

Is there a way to return an object used in a lambda expression, but with a different value? I've been using the "kind of linq-select" way, but I'd like to do something like this:
let bob = tab
          |> Seq.map (fun x -> ignore (x.Value=x.Value+1); x)
          |> Seq.iter (fun x -> x.Dump())
making all the x's in my sequence to have their value +1'ed.
instead of doing this:
let bob = tab
          |> Seq.map (fun x -> Ville(IdVille= 9, NoVille=x.Value+1, Nom=x.Nom, __RowVersion = x.__RowVersion))
          |> Seq.iter (fun x -> x.Dump())
edit:
What I expect to get : from this, a dump of the sequence, hence the Iter and Dump...
What I want the sequence to be? Here is an example, well the original sequence, but after applying a function to each element and get a copy of the result... (No side effect on the original sequence).
For example, I have a sequence of names, I'd like to have a copy of the original sequence, but with upper-cased names. Now imagine the same, but with a sequence of objects got from a database.
Edit2:
I made a test with LinqPad and AdventureWorks database, and I did this:
let dc = new TypedDataContext()
let tab = dc.GetTable<Address>()
let bob = tab
          |> Seq.map (fun x -> ignore (x.AddressLine1 <- "Bob"); x)
tab.Dump()
bob.Dump()
The two Dump() results are different. If I swap the two Dump() calls, both results are the same. You were right!
It's hard to tell what you're trying to do, but mutating a value suggests an imperative approach, so why not a for loop?
for x in tab do
    x.Value <- x.Value + 1
    x.Dump()
What value do you expect for bob? Seq.iter returns unit. If you mutate tab within Seq.map it will have the same value as bob.
EDIT
If you modify elements of a sequence within map the result and the original sequence will be one and the same. map is not intended to be used with side effects. An example:
type T(value) =
    member val Value = value with get, set
let tab = [T(0); T(1); T(3)]
let bob = tab |> Seq.map (fun x -> x.Value <- x.Value + 1; x)
tab = (Seq.toList bob) //true
You can try using map along with a copy-and-update record expression to update just one field:
let bob = tab
          |> Seq.map (fun x -> {x with Value = x.Value + 1})
          |> Seq.iter (fun x -> x.Dump())
Though bob will not get the results of Dump() assigned to it if you are using iter. You'd need to use map again for that.
Edit
This only works with record types.
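As a minimal sketch of that record-only approach: the Ville record below is an assumption made up for illustration (in the question Ville is a class coming from a data context, so this would not apply to it directly).

```fsharp
// Hypothetical record version of Ville.
type Ville = { IdVille: int; NoVille: int; Nom: string }

// Made-up sample data.
let tab =
    [ { IdVille = 1; NoVille = 10; Nom = "Lyon" }
      { IdVille = 2; NoVille = 20; Nom = "Nice" } ]

// Each element is copied with one field changed; tab itself is untouched.
let bob = tab |> List.map (fun x -> { x with NoVille = x.NoVille + 1 })

printfn "%A" (tab |> List.map (fun x -> x.NoVille))
printfn "%A" (bob |> List.map (fun x -> x.NoVille))
```

Because records are immutable by default, the copy gives a new sequence without side effects on the original, which is what the question asks for.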
