Deleting item out of a List - dictionary

Does anyone have an idea how I can delete items from the list (e.g. the words "a" and "the")? I have tried several approaches but have not arrived at a solution. Thank you for your help!
lst = list()
for key, val in list(counts.items()):
    lst.append((val, key))
lst.sort(reverse=True)
for key, val in lst[:10]:
    print(key, val)

You don't show the definition of counts but it looks like a dict of words to number of occurrences. If so, then this will do what you want
blacklist = ['a', 'the', 'an', 'and']
# Python 3: items() replaces iteritems(), and print is a function
for key, value in [(k, v) for (k, v) in sorted(counts.items(), reverse=True) if k not in blacklist]:
    print("%s: %s" % (key, value))
The code above doesn't actually remove keys from your counts; it just filters them out of the loop. If you actually need to delete them (since that's what you asked for), use dict.pop():
blacklist = ['a', 'the', 'an', 'and']
for key in blacklist:
    counts.pop(key, None)
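Putting both ideas together, here is a minimal Python 3 sketch that filters out unwanted words and prints the most frequent remaining ones. The sample data and the names top_words and stop_words are made up for illustration; counts is assumed to be a dict mapping words to occurrence counts, as guessed above.

```python
def top_words(counts, stop_words, n=10):
    """Return up to n (word, count) pairs, most frequent first, skipping stop_words."""
    kept = [(word, num) for word, num in counts.items() if word not in stop_words]
    # Sort by count, highest first; ties are broken alphabetically for stable output.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return kept[:n]

# Hypothetical sample data standing in for the asker's counts dict.
counts = {"the": 5, "a": 4, "cat": 3, "sat": 2, "mat": 2}
for word, num in top_words(counts, {"a", "the", "an", "and"}):
    print(word, num)
```

Unlike dict.pop(), this leaves counts untouched, which may be preferable if the full tally is needed later.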

Related

How to build the dictionary from two other dictionaries by some condition on their values

I am new to functional programming and cannot imagine how to build a new dictionary based on two other dictionaries with similar sets of keys. The new dictionary will have entries for all keys, but the values will be selected or computed based on some condition.
For example, having two dictionaries:
D1: [(1,100);(2,50);(3,150)]
D2: [(1,20);(2,30);(3,0);(4,10)]
and condition to get the average of two values, the resulting dictionary will be
DR: [(1,60);(2,40);(3,75);(4,10)]
I need an implementation in F#.
Could you please give me some advice?
Viewing them as two (or more) lists of tuples that we concatenate makes it easier. The code below solves your specific problem. To generalise the process of aggregating a list of values into something specific, you would need to change averageBy to fold and provide a fold function instead of float. This assumes d1 and d2 match your example.
Seq.concat [ d1 ; d2 ]
|> Seq.map (|KeyValue|)
|> Seq.groupBy fst
|> Seq.map (fun (k, c) -> k, Seq.averageBy (snd >> float) c |> int)
|> dict
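The same concatenate, group by key, then average pipeline can be sketched in Python for comparison. This is only an illustration of the idea, not a translation of the F# library calls; the function name average_by_key is hypothetical.

```python
from collections import defaultdict

def average_by_key(*dicts):
    """Average the values found for each key across all given dicts."""
    grouped = defaultdict(list)
    for d in dicts:
        for key, value in d.items():
            grouped[key].append(value)
    # int() truncates, mirroring the `|> int` conversion in the F# pipeline.
    return {key: int(sum(values) / len(values)) for key, values in grouped.items()}

d1 = {1: 100, 2: 50, 3: 150}
d2 = {1: 20, 2: 30, 3: 0, 4: 10}
print(average_by_key(d1, d2))  # {1: 60, 2: 40, 3: 75, 4: 10}
```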
If you wanted to use an external library, you could do this using Deedle series, which has various operations for working with (time) series of data.
Here, you have two data series that have different keys. Deedle lets you zip series based on keys and handle the cases where one of the values is missing using the opt type:
#r "nuget:Deedle"
open Deedle
let s1 = series [(1,100);(2,50);(3,150)]
let s2 = series [(1,20);(2,30);(3,0);(4,10)]
Series.zip s1 s2
|> Series.mapValues (fun (v1, v2) ->
    ( (OptionalValue.defaultArg 0 v1) +
      (OptionalValue.defaultArg 0 v2) ) / 2)
This may not make sense if this is a thing that you need just in one or two places, but if you're working with key-value series of data more generally, it may be worth checking out.
Solution 1
From a functional perspective I would use a Map data-structure, instead of a dictionary. You can convert a dictionary to a Map like this
let d1 = dict [(1,100);(2,50);(3,150)]
let m1 = Map [for KeyValue (key,value) in d1 -> key, value]
But I wouldn't use a Dictionary and convert it; I would use a Map directly.
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
Next, you need a way to get all keys from both Maps. You can get the keys of a map with Map.keys but you need all the keys from both. You could get them by using a Set.
let keys = Set (Map.keys m1) + Set (Map.keys m2)
By adding two Sets you get the union of both (Set.union). Once you have them, you can traverse the keys and try to look up each key in both Maps. If you use Map.tryFind you get an option back, so you can pattern match on both results at once.
let result = Map [
    for key in keys do
        match Map.tryFind key m1, Map.tryFind key m2 with
        | Some x, Some y -> key, (x + y) / 2
        | Some x, None   -> key, x
        | None  , Some y -> key, y
        | None  , None   -> failwith "Cannot happen"
]
This creates a new Map data-structure and saves it into result. If both cases are Some then you compute the average, otherwise you just keep the value. As you iterate the keys of both Maps the None,None case cannot happen. A Key always must be in either one or the other.
After all of this, result will be:
Map [(1, 60); (2, 40); (3, 75); (4, 10)]
Again, here is the whole code at once:
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
let keys = Set (Map.keys m1) + Set (Map.keys m2)
let result = Map [
    for key in keys do
        match Map.tryFind key m1, Map.tryFind key m2 with
        | Some x, Some y -> key, (x + y) / 2
        | Some x, None   -> key, x
        | None  , Some y -> key, y
        | None  , None   -> failwith "Cannot happen"
]
You also can inline the keys variable, if you want.
Solution 2
When you have a Map, you can make use of the fact that adding a value to a Map always creates a new Map data structure. This lets you use Map.fold, which traverses one Map while using the other Map as the starting state.
With Map.change you then can read and change a value in one step. If a key is already available you calculate the average, otherwise just add the value.
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
let result =
    (m1,m2) ||> Map.fold (fun state key y ->
        state |> Map.change key (function
            | Some x -> Some ((x + y) / 2)
            | None   -> Some y
        )
    )
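For readers who want to see the fold idea outside F#, here is a rough Python sketch: one mapping is the starting state and the other is folded into it with functools.reduce. The name merge_average is hypothetical. Each step copies the state to imitate an immutable Map, which is simple but not efficient. Note that // floors while F# integer division truncates toward zero; the two agree for non-negative values.

```python
from functools import reduce

def merge_average(m1, m2):
    """Fold m2 into m1, averaging values whose key is already present."""
    def step(state, item):
        key, y = item
        new_state = dict(state)  # copy so the previous state is never mutated
        new_state[key] = (state[key] + y) // 2 if key in state else y
        return new_state
    return reduce(step, m2.items(), m1)

m1 = {1: 100, 2: 50, 3: 150}
m2 = {1: 20, 2: 30, 3: 0, 4: 10}
print(merge_average(m1, m2))  # {1: 60, 2: 40, 3: 75, 4: 10}
```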
Bonus: Adding Functions to Modules
It's sometimes sad that F# has so few functions on Map. But if you need this a lot, you can always add a union function to the module yourself. For example:
module Map =
    let union f map1 map2 =
        let keys = Set (Map.keys map1) + Set (Map.keys map2)
        Map [
            for key in keys do
                match Map.tryFind key map1, Map.tryFind key map2 with
                | Some x, Some y -> key, (f x y)
                | Some x, None   -> key, x
                | None  , Some y -> key, y
                | None  , None   -> failwith "Cannot happen"
        ]
let m1 = Map [(1,100);(2,50);(3,150)]
let m2 = Map [(1,20);(2,30);(3,0);(4,10)]
let result = Map.union (fun x y -> (x + y) / 2) m1 m2
This way you get a Map.union to which you pass a lambda function that is executed when a key is present in both maps; otherwise the value is used unchanged.
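The same "union with a combining function" idea can be sketched in Python as follows. The name union_with is hypothetical and the code is an illustration of the technique, not part of any library.

```python
def union_with(f, map1, map2):
    """Merge two dicts; f combines the values of keys present in both."""
    result = {}
    for key in map1.keys() | map2.keys():  # set union of both key views
        if key in map1 and key in map2:
            result[key] = f(map1[key], map2[key])
        elif key in map1:
            result[key] = map1[key]
        else:
            result[key] = map2[key]
    return result

m1 = {1: 100, 2: 50, 3: 150}
m2 = {1: 20, 2: 30, 3: 0, 4: 10}
averaged = union_with(lambda x, y: (x + y) // 2, m1, m2)
largest = union_with(max, m1, m2)
```

Passing the combining function as a parameter is what makes the helper reusable: the averaging rule and the maximum rule above share all of the key-handling code.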
There have been a couple of useful suggestions:
Group by keys with standard library functions from the Seq module, by user1981
Use a specialized library for dealing with data series, by Tomas Petricek
Use a map instead (a functional data structure based on comparison), by David Raab
To this I'd like to add
An imperative way, filling a combined dictionary by iterating through the keys of the source data structures, and finally
A query expression
An imperative way
The average calculation is hard-coded with the type int. You can still have generic keys, as their type does not figure in the function, except for the equality constraint required for dictionary keys. You could make the function generic for values too, by marking it inline, but that won't be a pretty sight as it will introduce a host of other constraints onto the type of values.
open System.Collections.Generic
let unionAverage (d1 : IDictionary<_,_>) (d2 : IDictionary<_,_>) =
    let d = Dictionary<_,_>()
    for k in Seq.append d1.Keys d2.Keys |> Seq.distinct do
        match d1.TryGetValue k, d2.TryGetValue k with
        | (true, v1), (true, v2) -> d.Add(k, (v1 + v2) / 2)
        | (true, v), _ | _, (true, v) -> d.Add(k, v)
        | _ -> failwith "Key not found"
    d
let d1 = dict[1, 100; 2, 50; 3, 150]
let d2 = dict[1, 20; 2, 30; 3, 0; 4, 10]
unionAverage d1 d2
A query expression
It operates on the same principle as the answer from user1981, but for re-usability the average function has been factored out. It expects an arbitrary number of #seq<KeyValuePair<_,_>> elements, which is just another way to represent dictionaries that are accessed through their enumerators.
As the query expression uses System.Linq.IGrouping under the hood, this is upcast to a regular sequence to reduce confusion. Then there's the conversion to float for Seq.average to operate on, because the type int does not have the required member DivideByInt.
module Dict =
    let unionByMany f src =
        query {
            for KeyValue(k, v) in Seq.concat src do
            groupValBy v k into group
            select (group.Key, f (group :> seq<_>)) }
        |> dict
Dict.unionByMany (Seq.averageBy float >> int) [d1; d2]
Dict.unionByMany Seq.sum [d1; d2]
Dict.unionByMany Seq.min [d1; d2]

How to overwrite a dictionary (list of key-val pairs) in SWI-prolog?

Here's my problem:
I have a key-val pair dictionary similar to this (more pairs in real codes):
args = [(score, 0), (multiplier, 1), (reward, 0), (prior_prob, 1)].
I defined two functions to work with the dictionary:
% look up the Val of a Key in a Dict
lookup(Key,Dict,Val):-
    member((Key,Val), Dict).
and
% update(Key, NewVal, Dict, NewDict) updates the value of a key in the dict
update(Key,Val,[],[(Key,Val)]). % Add new pair to current dict
update(Key,Val,[(Key,_)|Rest], [(Key,Val)|Rest]):- !. % Found the key and update the value
update(Key,Val,[KV|Rest],[KV|Result]) :- % recursively look for the key
    update(Key,Val,Rest,Result).
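For readers who want to check the intended behaviour of lookup and update outside Prolog, here is a rough Python sketch of the same association-list idea. It is an illustration only: unlike member/2, this lookup returns just the first match and raises KeyError instead of failing, and the function names simply mirror the predicates above.

```python
def lookup(key, pairs):
    """Return the value of the first pair whose key matches."""
    for k, v in pairs:
        if k == key:
            return v
    raise KeyError(key)

def update(key, val, pairs):
    """Return a new list with the first matching pair replaced, or the pair appended."""
    result = []
    found = False
    for k, v in pairs:
        if k == key and not found:
            result.append((key, val))  # replace only the first occurrence
            found = True
        else:
            result.append((k, v))
    if not found:
        result.append((key, val))  # key was absent: add it at the end
    return result

args = [("score", 0), ("multiplier", 1), ("reward", 0), ("prior_prob", 1)]
new_args = update("score", 5, args)
```

As in the Prolog version, nothing is overwritten: update builds a new list, and the caller decides which version to pass on.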
The reason I need a dictionary is that I have many functions that need those arguments (such as "score", "multiplier", etc.). Those functions call each other and pass on the arguments. Not all the arguments are needed by every function, but many of them are, and some are updated more often than the others. So this dictionary is basically a list of arguments wrapped up as a package that needs to be passed around and overwritten frequently. For example, without the dictionary, I may have this (made-up) function:
calculate('cond1', 'cond2', S0, S1, Multiplier, Reward, Prior):-
    getscore('cond1', 'cond2', S0, S1, Multiplier, Reward, Prior).
getscore('cond1', 'cond2', S0, S1, Multiplier, Reward, Prior):-
    reward('cond1', 'cond2', Reward), % look up rewards based on conditions
    MultNew is Multiplier*Prior, % calculate the new multiplier
    S1 is (S0+Rewards*MultNew). % update score
But with a dictionary, I can have:
calculate2('cond1', 'cond2', Args, NewArgs):-
    getscore2('cond1', 'cond2', Args, NewArgs).
getscore2('cond1', 'cond2', Args, NewArgs):-
    reward('cond1', 'cond2', Reward),
    lookup(prior, Args, Prior),
    lookup(multiplier, Args, Mult),
    update(reward, Reward, Args),
    MultNew is Multiplier*Prior,
    update(multiplier, MultNew, Args, NewArgs),
    update(score, S0+Reward*MultNew, Args, NewArgs).
(The second way looks longer and slower than the first, but since in reality not all args need to be updated or looked up at once, and since it's more flexible for adding more parameters later, I think it's better to have the dictionary. Please let me know if there's a better design choice.) When I run it, I get:
No permission to modify static procedure `(=)/2'
at the line number where I defined the dictionary args.
I tried :-dynamic(arg/0, update/4, lookup/3)., which did not help.
What does (=)/2 mean here? How to permit overwriting a dictionary in Prolog? Thank you in advance!
Here is a possible solution to your problem (in SWI-Prolog).
The dictionary is implemented using SWI-Prolog's embedded (non-persistent) recorded database.
It is a set of term chains, where each chain of terms is associated with a user-supplied key.
% key: a small integer or an atom
lookup(Key, Value) :-
    current_key( Key ),
    recorded( Key, Value, _ ) ->
        true
    ;
        recordz(Key, Value).
update( Key, OldValue, NewValue ) :-
    same_term( OldValue, NewValue ) ->
        true
    ;
        % erase the old record if there is one, then store the new value
        (   current_key( Key ),
            recorded( Key, OldValue, Ref ) ->
            erase( Ref )
        ;
            true
        ),
        recordz( Key, NewValue, _ ).
EDIT
You can also use global variables, backtrackable or non-backtrackable, e.g. nb_setval/2 and nb_getval/2.
So your first statement could look as follows
:- nb_setval(score, 0),
   nb_setval(multiplier, 1),
   nb_setval(reward, 0),
   nb_setval(prior_prob, 1),
   nb_setval(args, [score, multiplier, reward, prior_prob]).
EDIT2
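(=)/2 is Prolog's built-in unification predicate. A line such as args = [...]. at the top level of a file is read as a clause for (=)/2, which is why you get the permission error.
If by overwriting a dictionary you mean reassigning a group of variables, you can do that with either of the solutions: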
EDIT3
% I think it should be like this:
update_dict(Dict):-
    update(args, Dict).
update(Key, Val):-
    (nb_getval(Key, OldVal),
     exists(Val, OldVal)) -> true;
    nb_setval(Key, Val).
update(Key, Val, Dict):-
    update_dict( Dict ),
    update( Key, Val ).
exists(Val, OldVal) :-
    nonvar(OldVal),
    same_term(Val, OldVal).
%======================================
lookup_dict( Dict):-
    lookup(args, Dict).
lookup(Key, Val):-
    nb_getval(Key, OldVal),
    exists( Val, OldVal ) ->
        true
    ;
        nb_setval(Key, Val).
lookup( Key, Val, Dict ):-
    lookup_dict( Dict ),
    lookup( Key, Val ).

How does one get the first key,value pair from F# Map without knowing the key?

How does one get the first key,value pair from F# Map without knowing the key?
I know that the Map type is used to get a corresponding value given a key, e.g. find.
I also know that one can convert the map to a list and use List.head, e.g.
List.head (Map.toList map)
I would like to do this
1. without a key
2. without knowing the types of the key and value
3. without using a mutable
4. without iterating through the entire map
5. without doing a conversion that iterates through the entire map behind the scenes, e.g. Map.toList, etc.
I am also aware that if one gets the first key,value pair it might not be of use because the map documentation does not note if using map in two different calls guarantees the same order.
If the code can not be written then an existing reference from a site such as MSDN explaining and showing why not would be accepted.
TLDR;
How I arrived at this problem was converting this function:
let findmin l =
    List.foldBack
        (fun (_,pr1 as p1) (_,pr2 as p2) -> if pr1 <= pr2 then p1 else p2)
        (List.tail l) (List.head l)
which is based on list and is used to find the minimum value in the associative list of string * int.
An example list:
["+",10; "-",10; "*",20; "/",20]
The list is used for parsing binary operator expressions that have precedence, where the string is the binary operator and the int is the precedence. Other functions are performed on the data such that using an F# map might be an advantage over a list. I have not decided on a final solution but wanted to explore this problem with map while it was still in the forefront.
Currently I am using:
let findmin m =
    if Map.isEmpty m then
        None
    else
        let result =
            Map.foldBack
                (fun key value (k,v) ->
                    if value <= v then (key,value)
                    else (k,v))
                m ("",1000)
        Some(result)
but here I had to hard code in the initial state ("",1000) when what would be better is just using the first value in the map as the initial state and then passing the remainder of the map as the starting map as was done with the list:
(List.tail l) (List.head l)
Yes this is partitioning the map but that did not work e.g.,
let infixes = ["+",10; "-",10; "*",20; "/",20]
let infixMap = infixes |> Map.ofList
let mutable test = true
let fx k v : bool =
    if test then
        printfn "first"
        test <- false
        true
    else
        printfn "rest"
        false
let (first,rest) = Map.partition fx infixMap
which results in
val rest : Map<string,int> = map [("*", 20); ("+", 10); ("-", 10)]
val first : Map<string,int> = map [("/", 20)]
which are two maps and not a key,value pair for first
("/",20)
Notes about answers
For practical purposes, the precedence parsing prefers to see the + operations before - in the final transformation, so returning + before - is desirable. This variation of the answer by marklam
let findmin (map : Map<_,_>) = map |> Seq.minBy (fun kvp -> kvp.Value)
achieves this, as does this variation by Tomas
let findmin m =
    Map.foldBack (fun k2 v2 st ->
        match st with
        | Some(k1, v1) when v1 < v2 -> st
        | _ -> Some(k2, v2)) m None
Seq.head does return the first item in the map, but one must be aware that the map is constructed with the keys sorted. For my practical example I would like to start with the lowest value, 10, but since the items are sorted by key, the first one returned is ("*",20): the keys are strings, and * sorts first.
To use the answer by marklam in practice, I had to check for an empty map before calling it and convert the output from a KeyValuePair into a tuple using let (a,b) = kvp.Key,kvp.Value
I don't think there is an answer that fully satisfies all your requirements, but:
You can just access the first key-value pair using m |> Seq.head. This is lazy unlike converting the map to list. This does not guarantee that you always get the same first element, but realistically, the implementation will guarantee that (it might change in the next version though).
For finding the minimum, you do not actually need the guarantee that Seq.head returns the same element always. It just needs to give you some element.
You can use other Seq-based functions, as @marklam mentioned in his answer.
You can also use fold with state of type option<'K * 'V>, which you can initialize with None and then you do not have to worry about finding the first element:
m |> Map.fold (fun st k2 v2 ->
    match st with
    | Some(k1, v1) when v1 < v2 -> st
    | _ -> Some(k2, v2)) None
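The fold with an optional state can be sketched in Python, with None playing the role of the empty option, so no artificial initial pair is needed. The name find_min is hypothetical, and sorted() is used only to imitate the key-ordered traversal of an F# Map.

```python
from functools import reduce

def find_min(mapping):
    """Return the (key, value) pair with the smallest value, or None if empty."""
    def step(state, item):
        key, value = item
        # Keep the current best only when it is strictly smaller,
        # matching the `when v1 < v2` guard in the F# fold.
        if state is not None and state[1] < value:
            return state
        return (key, value)
    return reduce(step, sorted(mapping.items()), None)

infixes = {"+": 10, "-": 10, "*": 20, "/": 20}
print(find_min(infixes))  # ('-', 10)
print(find_min({}))       # None
```

Because ties replace the current state, a forward fold returns the last of several equal minima; folding from the back instead, as Map.foldBack does, would return ("+", 10) for this data.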
Map implements IEnumerable<KeyValuePair<_,_>> so you can treat it as a Seq, like:
let findmin (map : Map<_,_>) = map |> Seq.minBy (fun kvp -> kvp.Key)
It's even simpler than the other answers. Map internally uses an AVL balanced tree, so the entries are already ordered by key. As mentioned by @marklam, Map implements IEnumerable<KeyValuePair<_,_>>, so:
let m = Map.empty.Add("Y", 2).Add("X", 1)
let (key, value) = m |> Seq.head
// will return ("X", 1)
It doesn't matter what order the elements were added to the map, Seq.head can operate on the map directly and return the key/value mapping for the min key.
Sometimes it's required to explicitly convert Map to Seq:
let m = Map.empty.Add("Y", 2).Add("X", 1)
let (key, value) = m |> Map.toSeq |> Seq.head
The error message I've seen for this case says "the type 'a * 'b does not match the type Collections.Generic.KeyValuePair<string, int>". It may also be possible to add type annotations rather than use Map.toSeq.

Enumerating an Elixir HashDict structure

I'm a newbie to Elixir and am trying to write a GenServer that stores key, value pairs in a HashDict. Storing a compound key and value is fine. Here's the code:
# Initialise the HashDict (GenServer.start_link)
def init(:ok) do
  {:ok, HashDict.new}
end
# Implement the server callback for GenServer.cast
def handle_cast({:add, event}, dict) do
  {foo, bar, baz, qux} = event
  key = %{key1: foo, key2: bar}
  value = %{val1: baz, val2: qux}
  {:noreply, HashDict.put(dict, key, value) }
end
All good. But I'm having trouble implementing the handle_call behaviour that I want. So here I'd like:
For a given key1 value, retrieve all corresponding value entries in HashDict. This will mean ignoring the value for key2 (so kind of a select all).
Having returned all the val2s, add them all up (assuming they are integers, ignoring val1) to give an overall sum.
So I've got this far:
def handle_call({:get, getKey}, _from, dict) do
  key = %{key1: getKey, key2: _}
  {:reply, HashDict.fetch(dict, key), dict}
end
This doesn't work, as it's not possible to pattern match on _. Presumably I would use some kind of Enumeration over the map such as the following to achieve my second objective:
Enum.map(mymap, fn {k, v} -> v end)|> Enum.sum{}
But I can't seem to quite crack the syntax to achieve my two aims. Thanks for any help!
If I understand your question correctly, the following should accomplish what you are wanting to do:
def handle_call({:get, getKey}, _from, dict) do
  sum = Enum.reduce(dict, 0, fn
    ({%{key1: key1}, %{val2: val2}}, acc)
        when key1 === getKey
        and is_integer(val2) ->
      val2 + acc
    (_, acc) ->
      acc
  end)
  {:reply, sum, dict}
end
See the documentation of Enum.reduce/3 for more information.
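As a cross-check of the logic, the same "scan every entry, add up val2 where key1 matches" reduction can be sketched in Python. The layout is an assumption made for illustration: Python dicts cannot be used as dict keys, so the store is keyed by (key1, key2) tuples with (val1, val2) tuples as values, and the name sum_val2 is hypothetical.

```python
def sum_val2(store, wanted_key1):
    """Sum val2 over all entries whose key1 matches, ignoring key2 and val1."""
    total = 0
    for (key1, _key2), (_val1, val2) in store.items():
        # Skip entries for other keys and entries whose val2 is not an integer.
        if key1 == wanted_key1 and isinstance(val2, int):
            total += val2
    return total

# Hypothetical sample data.
store = {
    ("a", 1): ("x", 10),
    ("a", 2): ("y", 5),
    ("b", 1): ("z", 7),
    ("a", 3): ("w", "n/a"),
}
print(sum_val2(store, "a"))  # 15
```

Like the Elixir version, this visits every entry, so the cost grows with the size of the store rather than with the number of matches.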

Binary trees as nested pairs

I'm trying to represent a generic binary tree as a pair.
I'll use the SML syntax as example. This is my btree type definition:
datatype btree = leaf | branch of btree*btree;
So, I'd like to write a function that, given a btree, prints the following:
bprint leaf = 0
bprint (branch (leaf,leaf)) = (0,0)
bprint (branch (leaf, branch (leaf,leaf))) = (0, (0, 0))
and so on.
The problem is that the return type of this function differs from input to input. This is obviously a problem for SML and maybe for other functional languages.
Any idea?
Since all you want to do is to print the tree structure to the screen, you can just do that and have your function's return type be unit. That is instead of trying to return the tuple (0, (0, 0)) just print the string (0, (0, 0)) to the screen. This way you won't run into any difficulties with types.
If you really do not need a string representation anywhere else, as already mentioned by others, just printing the tree might be the easiest way:
open TextIO
datatype btree = leaf | branch of btree * btree
fun print_btree leaf = print "0"
  | print_btree (branch (s, t)) =
      (print "("; print_btree s; print ", "; print_btree t; print ")")
In case you also want to be able to obtain a string representing a btree, the naive solution would be:
fun btree_to_string leaf = "0"
  | btree_to_string (branch (s, t)) =
      "(" ^ btree_to_string s ^ ", " ^ btree_to_string t ^ ")"
However, I do not really recommend this variant since for big btrees there is a problem due to the many string concatenations.
Something nice to think about is the following variant, which avoids the concatenation problem by a trick (that is for example also used in Haskell's Show class), i.e., instead of working on strings, work on functions from char lists to char lists. Then concatenation can be replaced by function composition
fun btree_to_string' t =
  let
    fun add s t = s @ t
    fun add_btree leaf = add [#"0"]
      | add_btree (branch (s, t)) =
          add [#"("] o add_btree s o add [#",", #" "] o add_btree t o add [#")"]
  in implode (add_btree t []) end
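The goal of avoiding repeated string concatenation can also be illustrated in Python. Note that this sketch swaps in a different technique from the function-composition trick above: fragments are collected in a list and joined once at the end. The tree encoding is an assumption made for the example: None stands for leaf and a 2-tuple for branch.

```python
def btree_to_string(tree):
    """Render a tree as nested pairs, e.g. (None, (None, None)) -> "(0, (0, 0))"."""
    parts = []

    def walk(node):
        if node is None:          # leaf
            parts.append("0")
        else:                     # branch with two subtrees
            left, right = node
            parts.append("(")
            walk(left)
            parts.append(", ")
            walk(right)
            parts.append(")")

    walk(tree)
    return "".join(parts)         # a single concatenation at the end

print(btree_to_string((None, (None, None))))  # (0, (0, 0))
```

Because the function always returns a string, the typing problem from the question does not arise; very deep trees would still hit Python's recursion limit.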
