I am looking for a good concise way to find the subset of a dictionary o whose keys contained in the set option_set, or an alias of their key is in the option set.
o = Dict{Symbol,Any}(:a=>2,:b=>1.0,:c=>1//2,:d=>1,:e=>3.0)
options_set = Set([:a :d :f])
aliases = Dict{Symbol,Symbol}(:c=>:d,:b=>:f)
# I want the dictionary of the intersection, including the aliased names
# i.e. Dict(:a=>2,:d=>1,:f=>1.0) or Dict(:a=>2,:d=>1//2,:f=>1.0) (which one is easier?)
#Starting idea
Dict([Pair(k,o[k]) for k in (keys(o) ∩ options_set)]) # Dict(:a=>2)
Dict([Pair(k,o[k]) for k in ((keys(o) ∪ values(aliases)) ∩ options_set)]) # Dict(:a=>2,:d=>1)
Is there a good way to handle using the aliased key to get the right value in the resulting dictionary?
Edit: I realized that it's much easier to just have the aliases in the other direction, i.e.
aliases2 = Dict{Symbol,Symbol}(:d=>:c,:f=>:b)
dict1 = Dict([Pair(k,o[k]) for k in (keys(o) ∩ options_set)])
dict2 = Dict([Pair(k,o[aliases2[k]]) for k in (keys(aliases2) ∩ options_set)])
merge(dict1,dict2)
Still wondering if there's a way of accomplishing the task from the original dictionary which is more direct than inverting it first.
It might be more performant to invert the dictionary, but you could always just write out the loop.
julia> result = Dict{Symbol, Any}()
Dict{Symbol,Any} with 0 entries
julia> for (k, v) in o
if k in options_set
push!(result, k => v)
elseif haskey(aliases, k)
push!(result, aliases[k] => v)
end
end
Dict{Symbol,Any} with 3 entries:
:a => 2
:d => 1
:f => 1.0
This is an old question, but I haven't found answer on 'stackoverflow', so I share the code working in Julia 1.7.2.
"way to find the subset of a dictionary o whose keys contained in the set option_set"
o_result = filter( p -> p[1] in options_set , o)
"or an alias of their key is in the option set."
alias_result = filter( p -> p[2] in options_set, aliases)
Related
I am new to functional programming and so can not imagen how to build the new dictionary based on two other dictionaries with similar set of keys. The new dictionary will have the entries with all keys but values will be selected/computed based on some condition.
For example, having two dictionaries:
D1: [(1,100);(2,50);(3,150)]
D2: [(1,20);(2,30);(3,0);(4,10)]
and condition to get the average of two values, the resulting dictionary will be
DR: [(1,60);(2,40);(3,75);(4,10)]
I need implementation in F#.
Please could you give me some advise.
View them as two (or more...) lists of tuples that we concat makes it easier. The below solves your specfic problem. To generalise the process aggeragting a list of values to something specific you would need to change averageBy to fold and provide a fold function instead of float. Assuming d1 and d2 mataches your exmaple.
Seq.concat [ d1 ; d2 ]
|> Seq.map (|KeyValue|)
|> Seq.groupBy fst
|> Seq.map (fun (k, c) -> k, Seq.averageBy (snd >> float) c |> int)
|> dict
If you wanted to use an external library, you could do this using Deedle series, which has various operations for working with (time) series of data.
Here, you have two data series that have different keys. Deedle lets you zip series based on keys and handle the cases where one of the values is missing using the opt type:
#r "nuget:Deedle"
open Deedle
let s1 = series [(1,100);(2,50);(3,150)]
let s2 = series [(1,20);(2,30);(3,0);(4,10)]
Series.zip s1 s2
|> Series.mapValues (fun (v1, v2) ->
( (OptionalValue.defaultArg 0 v1) +
(OptionalValue.defaultArg 0 v2) ) / 2)
This may not make sense if this is a thing that you need just in one or two places, but if you're working with key-value series of data more generally, it may be worth checking out.
Solution 1
From a functional perspective I would use a Map data-structure, instead of a dictionary. You can convert a dictionary to a Map like this
let d1 = dict [(1,100);(2,50);(3,150)]
let m1 = Map [for KeyValue (key,value) in d1 -> key, value]
But i wouldn't use a Dictionary and convert it, I would use a Map diretly.
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
Next, you need a way to get all keys from both Maps. You can get the keys of a map with Map.keys but you need all the keys from both. You could get them by using a Set.
let keys = Set (Map.keys m1) + Set (Map.keys m2)
By adding two Sets you get a Set.union of both sets. Once you have them, you can traverse the keys, and try to get both values from both keys. If you use Map.find then you get an optional. You can Pattern match on both cases at once.
let result = Map [
for key in keys do
match Map.tryFind key m1, Map.tryFind key m2 with
| Some x, Some y -> key, (x + y) / 2
| Some x, None -> key, x
| None , Some y -> key, y
| None , None -> failwith "Cannot happen"
]
This creates a new Map data-structure and saves it into result. If both cases are Some then you compute the average, otherwise you just keep the value. As you iterate the keys of both Maps the None,None case cannot happen. A Key always must be in either one or the other.
After all of this, result will be:
Map [(1, 60); (2, 40); (3, 75); (4, 10)]
Again, here is the whole code at once:
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
let keys = Set (Map.keys m1) + Set (Map.keys m2)
let result = Map [
for key in keys do
match Map.tryFind key m1, Map.tryFind key m2 with
| Some x, Some y -> key, (x + y) / 2
| Some x, None -> key, x
| None , Some y -> key, y
| None , None -> failwith "Cannot happen"
]
You also can inline the keys variable, if you want.
Solution 2
When you have a Map then you can make use of the fact that adding a value always to a map, always creates a new Map data-structure. This way you are able to use Map.fold that traverses a Map data-structure and uses one of the map as the starting state while you traverse the other Map.
With Map.change you then can read and change a value in one step. If a key is already available you calculate the average, otherwise just add the value.
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
let result =
(m1,m2) ||> Map.fold (fun state key y ->
state |> Map.change key (function
| Some x -> Some ((x + y) / 2)
| None -> Some y
)
)
Bonus: Adding Functions to Modules
It's sad sometimes that F# has so few functions on Map. But you need the a lot, you always can add a union function youself to the Module. For example:
module Map =
let union f map1 map2 =
let keys = Set (Map.keys map1) + Set (Map.keys map2)
Map [
for key in keys do
match Map.tryFind key map1, Map.tryFind key map2 with
| Some x, Some y -> key, (f x y)
| Some x, None -> key, x
| None , Some y -> key, y
| None , None -> failwith "Cannot happen"
]
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
This way you get a Map.union and you can specify a lambda-function that is executed if both keys are present in both maps, otherwise the value is used unchanged.
There have been a couple of useful suggestions:
Group by keys with standard library functions from the Seq module, by user1981
Use a specialized library for dealing with data series, by Tomas Petricek
Use a map instead (a functional data structure based on comparison), by David Raab
To this I'd like to add
An imperative way, filling a combined dictionary by iterating through the keys of the source data structures, and finally
A query expression
An imperative way
The average calculation is hard-coded with the type int. You can still have generic keys, as their type does not figure in the function, except for the equality constraint required for dictionary keys. You could make the function generic for values too, by marking it inline, but that won't be a pretty sight as it will introduce a host of other constraints onto the type of values.
open System.Collections.Generic
let unionAverage (d1 : IDictionary<_,_>) (d2 : IDictionary<_,_>) =
let d = Dictionary<_,_>()
for k in Seq.append d1.Keys d2.Keys |> Seq.distinct do
match d1.TryGetValue k, d2.TryGetValue k with
| (true, v1), (true, v2) -> d.Add(k, (v1 + v2) / 2)
| (true, v), _ | _, (true, v) -> d.Add(k, v)
| _ -> failwith "Key not found"
d
let d1 = dict[1, 100; 2, 50; 3, 150]
let d2 = dict[1, 20; 2, 30; 3, 0; 4, 10]
unionAverage d1 d2
A query expression
It operates on the same principle as the answer from user1981, but for re-usability the average function has been factored out. It expects an arbitrary number of #seq<KeyValuePair<_,_>> elements, which is just another way to represent dictionaries that are accessed through their enumerators.
As the query expression uses System.Linq.IGrouping under the hood, this is upcast to a regular sequence to reduce confusion. Then there's the conversion to float for Seq.average to operate on, because the type int does not have the required member DivideByInt.
module Dict =
let unionByMany f src =
query{
for KeyValue(k, v) in Seq.concat src do
groupValBy v k into group
select (group.Key, f (group :> seq<_>)) }
|> dict
Dict.unionByMany (Seq.averageBy float >> int) [d1; d2]
Dict.unionByMany Seq.sum [d1; d2]
Dict.unionByMany Seq.min [d1; d2]
I have a computation where I'm inserting values into a Map and then looking them up again. I know that I never use a key before inserting it, but using (!) freely makes me nervous anyway. I'm looking for a way to get a total lookup function that doesn't return a Maybe, and which the type system prevents me from accidentally abusing.
My first thought was to make a monad transformer similar to StateT, where the state is a Map and there are special functions for inserts and lookup in the monad. The insert function returns a Receipt s k newtype, where s is a phantom index type in the style of the ST monad and k is the type of the key, and the lookup function takes a Receipt instead of a bare key. By hiding the Receipt constructor and using a quantified run function similar to runST, this should ensure that lookups only happen after inserts in the same map. (Full code is below.)
But I fear that I've reinvented a wheel, or that that there's an alternate way to get safe, total map lookups that's already in use. Is there any prior art for this problem in a public package somewhere?
{-# LANGUAGE DeriveFunctor, LambdaCase, RankNTypes #-}
module KeyedStateT (KeyedStateT, Receipt, insert, lookup, receiptToKey, runKeyedStateT)
where
import Prelude hiding (lookup)
import Control.Arrow ((&&&))
import Control.Monad (ap, (>=>))
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Maybe (fromJust)
newtype KeyedStateT s k v m a = KeyedStateT (Map k v -> m (a, Map k v)) deriving Functor
keyedState :: Applicative m => (Map k v -> (a, Map k v)) -> KeyedStateT s k v m a
keyedState f = KeyedStateT (pure . f)
instance Monad m => Applicative (KeyedStateT s k v m) where
pure = keyedState . (,)
(<*>) = ap
instance Monad m => Monad (KeyedStateT s k v m) where
KeyedStateT m >>= f = KeyedStateT $ m >=> uncurry ((\(KeyedStateT m') -> m') . f)
newtype Receipt s k = Receipt { receiptToKey :: k }
insert :: (Applicative m, Ord k) => k -> v -> KeyedStateT s k v m (Receipt s k)
insert k v = keyedState $ const (Receipt k) &&& Map.insert k v
lookup :: (Applicative m, Ord k) => Receipt s k -> KeyedStateT s k v m v
lookup (Receipt k) = keyedState $ (Map.! k) &&& id
runKeyedStateT :: (forall s. KeyedStateT s k v m a) -> m (a, Map k v)
runKeyedStateT (KeyedStateT m) = m Map.empty
module Main where
import Data.Functor.Identity (runIdentity)
import qualified KeyedStateT as KS
main = putStrLn . fst . runIdentity $ KS.runKeyedStateT $ do
one <- KS.insert 1 "hello"
two <- KS.insert 2 " world"
h <- KS.lookup one
w <- KS.lookup two
pure $ h ++ w
Edit: Several commenters have asked why I want to hold on to a Receipt instead of the actual value. I want to be able to use the Receipt in Sets and Maps (I didn't add the Eq and Ord instances for Receipt in my MVCE, but I have them in my project), but the values in my Map are not equatable. If I replaced Receipt with a key-value pair newtype, I'd have to implement a dishonest Eq instance for that pair that disregarded the value, and then I'd be nervous about that. The Map is there to ensure that there's only one value under consideration for any of my equatable "proxy" keys at any given time.
I suppose an alternate solution that would work just fine for me would be a monad transformer that provides a supply of Refs, where data Ref v = Ref Int v, with the monad ensuring that Refs are given out with unique Int IDs, and Eq Ref etc. only looking at the Int (and now honesty is guaranteed by the uniqueness of the Ints). I would accept pointers to such a transformer in the wild as well.
Your solution resembles the technique used by justified-containers to guarantee that keys are present in a map. But there are some differences:
justified-containers uses continuation-passing-style.
inserting a new key in justified-containers requires "recycling" existing receipts to work in the updated map. It seems that you don't need to "recyle" receipts because you never have multiple versions of the map around at the same time.
justified-containers supports key deletion. I'm not sure if your solution could be expanded to allow the same.
An expanded description of the technique used by justified-containers can be found in the functional pearl "Ghosts of departed proofs".
I want to create a list of lists in SML, which represents a truth table of the following form:
Example:
[
[("r",true),("p",true),("q",true)],
[("r",false),("p",false),("q",true)],
[("r",false),("p",true),("q",true)],
...
]
I think I could achieve this in two ways:
(1) with the cartesian product
(2) converting truth table index entry to binary, which would represent an encoded line in the list (e.g. 8(decimal) is 1000(binary) => [("r",true),("p",false),("q",false)]), but I think this is to complicated and there is probably an easier way.
What would be the easiest way to go about this?
fun tt [] = [[]]
| tt (x :: xs) =
let
val txs = tt xs
in
map (fn l => (x, true) :: l) txs #
map (fn l => (x, false) :: l) txs
end
- tt ["a", "b", "c"];
val it =
[[("a",true),("b",true),("c",true)],[("a",true),("b",true),("c",false)],
[("a",true),("b",false),("c",true)],[("a",true),("b",false),("c",false)],
[("a",false),("b",true),("c",true)],[("a",false),("b",true),("c",false)],
[("a",false),("b",false),("c",true)],[("a",false),("b",false),("c",false)]]
: (string * bool) list list
I know how to do the equivalent of Scheme's (or Python's) map and filter functions with the list monad using only the "bind" operation.
Here's some Scala to illustrate:
scala> // map
scala> List(1,2,3,4,5,6).flatMap {x => List(x * x)}
res20: List[Int] = List(1, 4, 9, 16, 25, 36)
scala> // filter
scala> List(1,2,3,4,5,6).flatMap {x => if (x % 2 == 0) List() else List(x)}
res21: List[Int] = List(1, 3, 5)
and the same thing in Haskell:
Prelude> -- map
Prelude> [1, 2, 3, 4, 5, 6] >>= (\x -> [x * x])
[1,4,9,16,25,36]
Prelude> -- filter
Prelude> [1, 2, 3, 4, 5, 6] >>= (\x -> if (mod x 2 == 0) then [] else [x])
[1,3,5]
Scheme and Python also have a reduce function that's often grouped with map and filter. The reduce function combines the first two elements of a list using the supplied binary function, and then combines that result the the next element, and then so on. A common use to to compute the sum or product of a list of values. Here's some Python to illustrate:
>>> reduce(lambda x, y: x + y, [1,2,3,4,5,6])
21
>>> (((((1+2)+3)+4)+5)+6)
21
Is there any way to do the equivalent of this reduce using just the bind operation on a list monad? If bind can't do this on its own, what's the most "monadic" way to perform this operation?
If possible, please limit/avoid the use of syntactic sugar (ie: do notation in Haskell or sequence comprehensions in Scala) when answering.
One of the defining properties of the bind operation is that the result is still "inside" the monad¹. So when you perform bind on a list, the result will again be a list. Since the reduce operation² often results in something other than a list, it can't be expressed in terms of the bind operation.
In addition to that the bind operation on lists (i.e. concatMap/flatMap) only looks at one element at a time and offers no way of reusing the result of previous steps. So even if we're okay with getting the result wrapped in a single-element list, there's no way to do it just with monad operations.
¹ So if you have a type that allows you to perform no operations on it except the ones defined by the monad type class, you can never "break out" of the monad. That's what makes the IO monad works.
² Which is called fold in Haskell and Scala by the way.
If bind can't do this on its own, what's the most "monadic" way to perform this operation?
While the answer given by #sepp2k is correct, there is a way to do a reduce-like operation on a list monadically, but using the product or "writer" monad and an operation which corresponds to distributing the product monad over the list functor.
The definition is:
import Control.Monad.Writer.Lazy
import Data.Monoid
reduce :: Monoid a => [a] -> a
reduce xs = snd . runWriter . sequence $ map tell xs
Let me unpack:
The Writer monad has a data type Writer w a which is basically a tuple (product) of a value a and "written" value w. The type of written values w must be a monoid where the bind operation of the Writer monad is defined something like:
(w, a) >>= f = let (w', b) = f a in (mappend w w', b)
i.e. take the incoming written value, and the result written value, and combine them using the binary operation of the monoid.
The tell operation writes a value, tell :: w -> Writer w (). Thus map tell has type [a] -> [Writer a ()] i.e. a list of monadic values where each element of the original list has been "written" in the monad.
sequence :: Monad m => [m a] -> m [a] corresponds to a distributive law between lists and monads i.e. distribute the monad type over the list type; sequence can be defined in terms of bind as:
sequence [] = return []
sequnece (x:xs) = x >>= (\x' -> (sequence xs) >>= (\xs' -> return $ x':xs'))
(actually the implementation in Prelude uses foldr, a clue to the reduction-like usage)
Thus, sequence $ map tell xs has type Writer a [()]
The runWriter operation unpacks the Writer type, runWriter :: Writer w a -> (a, w),
which is composed here with snd to project out the accumulated value.
An example usage on lists of Ints would be to use the monoid instance:
instance Monoid Int where
mappend = (+)
mempty = 0
then:
> reduce ([1,2,3,4]::[Int])
10
Is it possible to write recursive anonymous functions in SML? I know I could just use the fun syntax, but I'm curious.
I have written, as an example of what I want:
val fact =
fn n => case n of
0 => 1
| x => x * fact (n - 1)
The anonymous function aren't really anonymous anymore when you bind it to a
variable. And since val rec is just the derived form of fun with no
difference other than appearance, you could just as well have written it using
the fun syntax. Also you can do pattern matching in fn expressions as well
as in case, as cases are derived from fn.
So in all its simpleness you could have written your function as
val rec fact = fn 0 => 1
| x => x * fact (x - 1)
but this is the exact same as the below more readable (in my oppinion)
fun fact 0 = 1
| fact x = x * fact (x - 1)
As far as I think, there is only one reason to use write your code using the
long val rec, and that is because you can easier annotate your code with
comments and forced types. For examples if you have seen Haskell code before and
like the way they type annotate their functions, you could write it something
like this
val rec fact : int -> int =
fn 0 => 1
| x => x * fact (x - 1)
As templatetypedef mentioned, it is possible to do it using a fixed-point
combinator. Such a combinator might look like
fun Y f =
let
exception BlackHole
val r = ref (fn _ => raise BlackHole)
fun a x = !r x
fun ta f = (r := f ; f)
in
ta (f a)
end
And you could then calculate fact 5 with the below code, which uses anonymous
functions to express the faculty function and then binds the result of the
computation to res.
val res =
Y (fn fact =>
fn 0 => 1
| n => n * fact (n - 1)
)
5
The fixed-point code and example computation are courtesy of Morten Brøns-Pedersen.
Updated response to George Kangas' answer:
In languages I know, a recursive function will always get bound to a
name. The convenient and conventional way is provided by keywords like
"define", or "let", or "letrec",...
Trivially true by definition. If the function (recursive or not) wasn't bound to a name it would be anonymous.
The unconventional, more anonymous looking, way is by lambda binding.
I don't see what unconventional there is about anonymous functions, they are used all the time in SML, infact in any functional language. Its even starting to show up in more and more imperative languages as well.
Jesper Reenberg's answer shows lambda binding; the "anonymous"
function gets bound to the names "f" and "fact" by lambdas (called
"fn" in SML).
The anonymous function is in fact anonymous (not "anonymous" -- no quotes), and yes of course it will get bound in the scope of what ever function it is passed onto as an argument. In any other cases the language would be totally useless. The exact same thing happens when calling map (fn x => x) [.....], in this case the anonymous identity function, is still in fact anonymous.
The "normal" definition of an anonymous function (at least according to wikipedia), saying that it must not be bound to an identifier, is a bit weak and ought to include the implicit statement "in the current environment".
This is in fact true for my example, as seen by running it in mlton with the -show-basis argument on an file containing only fun Y ... and the val res ..
val Y: (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b
val res: int32
From this it is seen that none of the anonymous functions are bound in the environment.
A shorter "lambdanonymous" alternative, which requires OCaml launched
by "ocaml -rectypes":
(fun f n -> f f n)
(fun f n -> if n = 0 then 1 else n * (f f (n - 1))
7;; Which produces 7! = 5040.
It seems that you have completely misunderstood the idea of the original question:
Is it possible to write recursive anonymous functions in SML?
And the simple answer is yes. The complex answer is (among others?) an example of this done using a fix point combinator, not a "lambdanonymous" (what ever that is supposed to mean) example done in another language using features not even remotely possible in SML.
All you have to do is put rec after val, as in
val rec fact =
fn n => case n of
0 => 1
| x => x * fact (n - 1)
Wikipedia describes this near the top of the first section.
let fun fact 0 = 1
| fact x = x * fact (x - 1)
in
fact
end
This is a recursive anonymous function. The name 'fact' is only used internally.
Some languages (such as Coq) use 'fix' as the primitive for recursive functions, while some languages (such as SML) use recursive-let as the primitive. These two primitives can encode each other:
fix f => e
:= let rec f = e in f end
let rec f = e ... in ... end
:= let f = fix f => e ... in ... end
In languages I know, a recursive function will always get bound to a name. The convenient and conventional way is provided by keywords like "define", or "let", or "letrec",...
The unconventional, more anonymous looking, way is by lambda binding. Jesper Reenberg's answer shows lambda binding; the "anonymous" function gets bound to the names "f" and "fact" by lambdas (called "fn" in SML).
A shorter "lambdanonymous" alternative, which requires OCaml launched by "ocaml -rectypes":
(fun f n -> f f n)
(fun f n -> if n = 0 then 1 else n * (f f (n - 1))
7;;
Which produces 7! = 5040.