Optimized version of except for Map - dictionary

I've written a generic except function for Maps that, given a source map and an other map, returns only the items of the source map without corresponding keys in the other map.
module MapExt =
    let getKeys<'k,'v when 'k : comparison> : Map<'k,'v> -> 'k[] =
        Map.toArray >> Array.map fst
    let except<'k,'v when 'k : comparison> (other:Map<'k,'v>) (source:Map<'k,'v>) : ('k * 'v)[] =
        source |> getKeys
        |> Array.except (other |> getKeys)
        |> Array.map (fun k -> (k, source.[k]))
Now, I've seen in the second part of this answer, that an optimized version of the map's keys is obtained via a Map.fold.
Therefore, can I do a similar optimization of my original MapExt module in the following way?
module MapExtOpt =
    let getKeys<'k,'v when 'k : comparison> (m : Map<'k,'v>) : 'k list =
        Map.fold (fun keys key _ -> key::keys) [] m
    let except<'k,'v when 'k : comparison>
        (other : Map<'k,'v>) (source : Map<'k,'v>) : ('k * 'v) list =
        source
        |> Map.fold (fun s k v ->
            if (other.ContainsKey k) then
                s
            else
                (k,v) :: s
            ) []
Or am I reinventing some already existing (and optimized) functions?

I don't think there is a built-in function, but this is a simpler way of doing what you are trying to do. It only goes over the 'to be removed' map once, so it's much more efficient.
let except toRemove source =
    Map.fold (fun m k _ -> if Map.containsKey k m then Map.remove k m else m) source toRemove

Finally, thanks to Loïc Denuzière for his comment on Slack:
The if is not necessary: if m doesn't contain k, Map.remove k m just returns m anyway
I think I can also apply a double eta reduction by considering that it makes sense to speak about the keys to remove (not about a map whose values are ignored), so I would simply redefine it as
let except<'k,'v when 'k : comparison> = List.foldBack Map.remove<'k,'v>

Related

Map a list of options to list of strings

I have the following function in OCaml:
let get_all_parents lst =
  List.map (fun (name,opt) -> opt) lst
That maps my big list of (name, opt) pairs to just a list of opt. An option is either None or Some value, where the value in this case is a string. I want a list of strings with all my values.
I am a beginner learning OCaml.
I don't think filter and map used together is a good solution to this problem. This is because when you apply map to convert your string option to string, you will have the None case to deal with. Even if you know that you won't have any Nones because you filtered them away, the type checker doesn't, and can't help you. If you have non-exhaustive pattern match warnings enabled, you will get them, or you will have to supply some kind of dummy string for the None case. And, you will have to hope you don't introduce errors when refactoring later, or else write test cases or do more code review.
Instead, you need a function filter_map : ('a -> 'b option) -> 'a list -> 'b list. The idea is that this works like map, except filter_map f lst drops each element of lst for which f evaluates to None. If f evaluates to Some v, the result list will have v. You could then use filter_map like so:
filter_map (fun (_, opt) -> opt) lst
You could also write that as
filter_map snd lst
A more general example would be:
filter_map (fun (_, opt) ->
    match opt with
    | Some s -> Some (s ^ "\n")
    | None -> None)
  lst
filter_map can be implemented like this:
let filter_map f lst =
  let rec loop acc = function
    | [] -> List.rev acc
    | v::lst' ->
      match f v with
      | None -> loop acc lst'
      | Some v' -> loop (v'::acc) lst'
  in
  loop [] lst
EDIT For greater completeness, you could also do
let filter_map f lst =
  List.fold_left (fun acc v ->
    match f v with
    | Some v' -> v'::acc
    | None -> acc) [] lst
  |> List.rev
It's a shame that this kind of function isn't in the standard library. It's present in both Batteries Included and Jane Street Core.
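For readers on a newer compiler: since OCaml 4.08 the standard library does provide List.filter_map (and Option.map), so the hand-written version is only needed on older releases. A minimal sketch, assuming OCaml 4.08 or later; the sample list pairs is hypothetical:

```ocaml
(* Hypothetical sample data: (name, parent) pairs. *)
let pairs = [ ("a", Some "x"); ("b", None); ("c", Some "y") ]

(* Drop the None entries and unwrap the Some entries in one pass. *)
let parents = List.filter_map snd pairs

(* The more general example: transform the kept values as well. *)
let with_newlines =
  List.filter_map (fun (_, opt) -> Option.map (fun s -> s ^ "\n") opt) pairs
```

Here parents keeps only the strings that were present, in their original order.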
I'm going to expand on @Carsten's answer. He is pointing you in the right direction.
It's not clear what question you're asking. For example, I'm not sure why you're telling us about your function get_all_parents. Possibly this function was your attempt to get the answer you want, and it's not quite working for you. Or maybe you're happy with this function, but you want to do some further processing on its results?
Either way, List.map can't do the whole job because it always returns a list of the same length as its input. But you need a list that can be different lengths, depending on how many None values there are in the big list.
So you need a function that can extract only the parts of a list that you're interested in. As @Carsten says, the key function for this is List.filter.
Some combination of map and filter will definitely do what you want. Or you can just use fold, which has the power of both map and filter. Or you can write your own recursive function that does all the work.
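The two approaches mentioned above might look roughly like this. It is only a sketch: the sample list pairs is hypothetical, and the names via_map_filter and via_fold are made up for illustration.

```ocaml
(* Hypothetical sample data: (name, parent) pairs. *)
let pairs = [ ("a", Some "x"); ("b", None); ("c", Some "y") ]

(* map + filter: keep the Some cases, then unwrap them. The None branch
   is unreachable after filtering, but the compiler cannot know that. *)
let via_map_filter =
  pairs
  |> List.map snd
  |> List.filter (fun opt -> opt <> None)
  |> List.map (function Some s -> s | None -> assert false)

(* fold: decide per element whether to add it to the accumulator.
   fold_right preserves the original order without a final List.rev. *)
let via_fold =
  List.fold_right
    (fun (_, opt) acc -> match opt with Some s -> s :: acc | None -> acc)
    pairs []
```

Both produce the same list; the fold version avoids the dummy None branch.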
Update
Maybe your problem is in extracting the string from a string option. The "nice" way to do this is to provide a default value to use when the option is None:
let get default xo =
  match xo with
  | None -> default
  | Some x -> x
# get "none" (Some "abc");;
- : string = "abc"
# get "none" None;;
- : string = "none"
type opt = Some of string | None
List.fold_left (fun lres -> function
    (name,Some value) -> value::lres
  | (name,None) -> lres
  ) [] [("s1",None);("s2",Some "s2bis")]
result:
- : string list = ["s2bis"]

Ocaml: weighted graph

I'm working with OCaml, but I'm still a beginner, so I have to ask for a little help.
Following a book's instructions, I created a type that represents a directed graph:
type 'a graph = Gr of ('a * 'a) list;;
let grafo1 = Gr [(1,2);(1,3);(1,4);(2,6);(3,5);(4,6);(6,5);(6,7);(5,4)];;
Then I created a succ function that takes a node as input and returns its successors:
let succ (Gr arcs) n =
  let rec aux = function
      [] -> []
    | (x,y)::rest ->
        if n = x then y::(aux rest)
        else aux rest
  in aux arcs;;
Then I used the succ function to write a modified BFS function that tells me whether a path exists between two nodes:
let bfs graph p start =
  let rec search visited = function
      [] -> raise Nodo_not_reachable
    | n::rest ->
        if List.mem n visited
        then search visited rest
        else if p n then n
        else search (n::visited) (rest @ (succ graph n))
  in search [] [start];;
I call the function using this code:
bfs grafo1 (function x -> x=7) 1;;
The function returns the node that satisfies the predicate (here 7) if a path exists from node 1 to node 7, and raises Nodo_not_reachable otherwise.
Now I want to do the same thing with a WEIGHTED graph, so I created a new type: a list where each element is composed of 3 numbers instead of 2 (start node, weight of the edge, end node):
type 'b graph_w = Grw of ('b * 'b * 'b) list;;
let grafo2 = Grw [(1,3,2);(1,1,5);(2,2,3);(5,5,3);(5,4,6);(3,1,6);(3,7,4);(6,2,7);(4,4,6)];;
So I modified my previous functions to adapt them to this type:
let succ_w (Grw arcs) n =
  let rec aux = function
      [] -> []
    | (x,y,z)::rest ->
        if n = x then z::(aux rest)
        else aux rest
  in aux arcs;;
let bfs_w graph_w p start =
  let rec search visited = function
      [] -> raise Nodo_non_raggiungibile
    | n::rest ->
        if find n visited
        then search visited rest
        else if p n then n
        else search (n::visited) (rest @ (succ_w graph_w n))
  in search [] [start];;
(Since I can't use List.mem on this new type, I declared a function called find that returns true if an element (x,y,z) is contained in a list):
let rec find (x,y,z) = function
[] -> false
| (v,c,p)::rest -> if (x=v) then true else find (x,y,z) rest;;
find (2,3,1) [(2,2,3);(4,5,6);(8,9,0)];;
Now a small problem: can someone tell me how to call my bfs_w function with the new graph type?
Using
bfs_w grafo2 (function x -> x=7) 1;;
I get the following error:
This expression has type int graph_w but an expression was expected of type ('a * 'b * 'c) graph_w
/--------------------------------/
Okay, now the function works correctly, thanks ^^, but there is another problem: since I want to solve the longest path problem using BFS (given a start node and a stop node, say true if a path of weight at least k exists between the nodes), I have to use the (x,y,z) format in my function, so I tried something like this (it is the same function that you suggested, but with (x,y,z) instead of n):
let bfs_w2 graph_w start stop =
  let rec search visited = function
    | [] -> raise Node_not_Reachable
    | (v,c,p) :: rest ->
        if (find (v,c,p) visited) then search visited rest
        else if v = stop then true
        else search ((v,c,p)::visited) (rest @ (succ_w graph_w (v,c,p))) in
  search [] [start];;
When I call the function:
bfs_w2 grafo2 1 4;;
or
bfs_w2 grafo2 (function x -> x=4) 1;;
I get the same error on "grafo2":
This expression has type int graph_w
but an expression was expected of type ('a * 'b * 'c) graph_w
I can't understand where the problem is; the function is almost identical to the one you suggested.
PS: I even tried this, but I got the same result:
let bfs_w2 graph_w p start =
  let rec search visited = function
    | [] -> raise Nodo_not_reachable
    | (x,y,z) :: rest ->
        if (List.mem (x,y,z) visited) then search visited rest
        else if p (x,y,z) then (x,y,z)
        else search ((x,y,z)::visited) (rest @ (succ_w graph_w (x,y,z))) in
  search [] [start];;
First of all, let me define the model of your graph, so that we can speak the same language. You're using an edge list to represent a graph. For a graph of type 'a graph we will say that 'a is a node, and 'a * 'a = node * node is an edge. So the graph itself is represented as an edge list. To represent a weighted graph you're using edges that are labeled with weights, so an edge now has type node * weight * node, where weight has type int.
Now, let's go to your problem. If you look carefully at the bfs implementation, you will notice that visited has type node list (i.e., int list), not edge list. And you're applying your predicate to a value of type node, not to an edge.
The same should be true of your bfs_w function. But here you, for some reason, decided to store edges in the visited list, and used your own find function instead of List.mem, which is fully applicable here.
let bfs_w graph_w p start =
  let rec search visited = function
    | [] -> raise Not_found
    | n :: rest ->
        if List.mem n visited
        then search visited rest
        else if p n then n
        else search (n::visited) (rest @ (succ_w graph_w n)) in
  search [] [start]
This implementation will have a correct type, expecting a user predicate that accepts a value of type node.
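To see the corrected function on the weighted graph from the question, here is a self-contained sketch. The type and grafo2 are copied from the question; succ_w is rewritten with List.filter_map (an assumption that OCaml 4.08 or later is available), and the names found and reachable are made up for illustration.

```ocaml
type 'b graph_w = Grw of ('b * 'b * 'b) list

let grafo2 =
  Grw [ (1,3,2); (1,1,5); (2,2,3); (5,5,3); (5,4,6);
        (3,1,6); (3,7,4); (6,2,7); (4,4,6) ]

(* Successors of node n; the weight in the middle is ignored here. *)
let succ_w (Grw arcs) n =
  List.filter_map (fun (x, _, z) -> if x = n then Some z else None) arcs

(* visited and the queue both hold nodes, so List.mem applies directly. *)
let bfs_w graph_w p start =
  let rec search visited = function
    | [] -> raise Not_found
    | n :: rest ->
      if List.mem n visited then search visited rest
      else if p n then n
      else search (n :: visited) (rest @ succ_w graph_w n)
  in
  search [] [start]

(* The predicate receives a node, not an edge. *)
let found = bfs_w grafo2 (fun x -> x = 7) 1

(* Turning the exception into a boolean answer. *)
let reachable =
  match bfs_w grafo2 (fun x -> x = 100) 1 with
  | _ -> true
  | exception Not_found -> false
```

The call now type-checks because the start value 1 and the predicate argument are both plain nodes of type int.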

F# - Treating a function like a map

Long story short, I came up with this funny function set, which takes a function f : 'k -> 'v, a chosen value k : 'k, and a chosen result v : 'v, and uses f as the basis for a new function g : 'k -> 'v that is exactly the same as f, except that g k = v.
Here is the (pretty simple) F# code I wrote in order to make it:
let set : ('k -> 'v) -> 'k -> 'v -> 'k -> 'v =
    fun f k v x ->
        if x = k then v else f x
My questions are:
Does this function pose any problems?
I could imagine repeat use of the function, like this
let kvs : (int * int) List = ... // A very long list of random int pairs.
List.fold (fun f (k,v) -> set f k v) id kvs
would start building up a long list of functions on the heap. Is this something to be concerned about?
Is there a better way to do this, while still keeping the type?
I mean, I could do stuff like construct a type for holding the original function, f, a Map, setting key-value pairs to the map, and checking the map first, the function second, when using keys to get values, but that's not what interests me here - what interest me is having a function for "modifying" a single result for a given value, for a given function.
Potential problems:
The set-modified function leaks space if you override the same value twice:
let huge_object = ...
let small_object = ...
let f0 = set f 0 huge_object
let f1 = set f0 0 small_object
Even though it can never be the output of f1, huge_object cannot be garbage-collected until f1 can: huge_object is referenced by f0, which is in turn referenced by f1.
The set-modified function has overhead linear in the number of set operations applied to it.
I don't know if these are actual problems for your intended application.
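The layering behind both points can be made concrete with a small sketch. It is written in OCaml rather than F# purely to keep the added examples in one language; set has the same shape as in the question, and the sample pairs kvs are hypothetical.

```ocaml
(* Same shape as the F# set: wrap f in one more closure per override. *)
let set f k v x = if x = k then v else f x

(* Hypothetical overrides; key 1 is set twice. *)
let kvs = [ (1, 10); (2, 20); (1, 30) ]

(* Each step wraps the previous function, giving a chain of three
   closures on top of the identity function. *)
let g = List.fold_left (fun f (k, v) -> set f k v) (fun x -> x) kvs

let at_1 = g 1   (* answered by the outermost layer *)
let at_2 = g 2   (* falls through one layer *)
let at_3 = g 3   (* falls through every layer to the base function *)
```

The first override for key 1 is still held by the chain even though it can no longer be returned, and a lookup for an unset key walks through all the layers.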
If you wish set to have exactly the type ('k -> 'v) -> 'k -> 'v -> 'k -> 'v then I don't see a better way(*). The obvious idea would be to have a "modification table" of functions you've already modified, then let set look up a given f in this table. But function types do not admit equality checking, so you cannot compare f to the set of functions known to your modification table.
(*) Reflection notwithstanding.

Limitations of let rec in OCaml

I'm studying OCaml these days and came across this:
OCaml has limits on what it can put on the right-hand side of a let rec, as in this example:
let memo_rec f_norec =
  let rec f = memoize (fun x -> f_norec f x) in
  f;;
Error: This kind of expression is not allowed as right-hand side of `let rec'
Here, memoize is a function that takes a function and turns it into a memoized version backed by a Hashtable. It's apparent that OCaml has some restrictions on the constructs allowed on the right-hand side of 'let rec', but I don't really get it. Could anyone explain a bit more about this?
The kind of expressions that are allowed to be bound by let rec are described in section 8.1 of the manual. Specifically, function applications involving the let rec defined names are not allowed.
A rough summary (taken from that very link):
Informally, the class of accepted definitions consists of those definitions where the defined names occur only inside function bodies or as argument to a data constructor.
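A few small definitions may help illustrate that summary. This is only a sketch; the names fact, ones, and bad are made up for illustration, and the rejected definition is left in a comment so that the block compiles.

```ocaml
(* Accepted: the defined name occurs only inside a function body. *)
let rec fact = fun n -> if n <= 1 then 1 else n * fact (n - 1)

(* Accepted: the defined name occurs as an argument to a data constructor
   (here the list constructor ::), which builds a cyclic list. *)
let rec ones = 1 :: ones

(* Rejected: the defined name is passed to a function application,
   which could inspect it before it has been fully constructed.
   let rec bad = List.map succ bad *)
```

The memo_rec example falls into the rejected category because the right-hand side is an application of memoize.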
You can use tying-the-knot techniques to define memoizing fixpoints. See for example those two equivalent definitions:
let fix_memo f =
  let rec g = {contents = fixpoint}
  and fixpoint x = f !g x in
  g := memoize !g;
  !g
let fix_memo f =
  let g = ref (fun _ -> assert false) in
  g := memoize (fun x -> f !g x);
  !g
Or using lazy as reminded by Alain:
let fix_memo f =
  let rec fix = lazy (memoize (fun x -> f (Lazy.force fix) x)) in
  Lazy.force fix
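To try one of these out, a memoize function is needed, which the question does not show. The sketch below assumes a simple Hashtbl-based memoize (hypothetical, not the asker's actual code) and uses the reference-based fix_memo; fib and the calls counter are made up for illustration.

```ocaml
(* Assumed implementation: cache results in a hash table keyed by argument. *)
let memoize f =
  let table = Hashtbl.create 16 in
  fun x ->
    match Hashtbl.find_opt table x with
    | Some y -> y
    | None ->
      let y = f x in
      Hashtbl.add table x y;
      y

(* Reference-based knot: g is patched to point at the memoized function,
   and !g is only read when the inner function is actually called. *)
let fix_memo f =
  let g = ref (fun _ -> assert false) in
  g := memoize (fun x -> f !g x);
  !g

(* Counts how often the body runs, to observe the effect of the cache. *)
let calls = ref 0

let fib =
  fix_memo (fun self n ->
      incr calls;
      if n < 2 then n else self (n - 1) + self (n - 2))

let fib_30 = fib 30
```

Because the recursive calls go through the memoized function, the body runs once per distinct argument instead of exponentially often.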

OCaml: Is there a function with type 'a -> 'a other than the identity function?

This isn't a homework question, by the way. It got brought up in class but my teacher couldn't think of any. Thanks.
How do you define the identity functions ? If you're only considering the syntax, there are different identity functions, which all have the correct type:
let f x = x
let f2 x = (fun y -> y) x
let f3 x = (fun y -> y) (fun y -> y) x
let f4 x = (fun y -> (fun y -> y) y) x
let f5 x = (fun y z -> z) x x
let f6 x = if false then x else x
There are even weirder functions:
let f7 x = if Random.bool() then x else x
let f8 x = if Array.length Sys.argv < 5 then x else x
If you restrict yourself to a pure subset of OCaml (which rules out f7 and f8), all the functions you can build satisfy an observational equation that ensures, in a sense, that what they compute is the identity: for every value f : 'a -> 'a and every x, we have f x = x.
This equation does not depend on the specific function, it is uniquely determined by the type. There are several theorems (framed in different contexts) that formalize the informal idea that "a polymorphic function can't change a parameter of polymorphic type, only pass it around". See for example the paper of Philip Wadler, Theorems for free!.
The nice thing about those theorems is that they don't only apply to the 'a -> 'a case, which is not so interesting. You can get a theorem out of the ('a -> 'a -> bool) -> 'a list -> 'a list type of a sorting function, which says that its application commutes with the mapping of a monotone function.
More formally, if you have any function s with such a type, then for all types u, v, functions cmp_u : u -> u -> bool, cmp_v : v -> v -> bool, f : u -> v, and list li : u list, if cmp_u u u' = cmp_v (f u) (f u') for all u, u' (f is monotone, in the sense that it preserves and reflects the comparison), you have:
map f (s cmp_u li) = s cmp_v (map f li)
This is indeed true when s is exactly a sorting function, but I find it impressive to be able to prove that it is true of any function s with the same type.
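The equation can be checked on concrete values. The sketch below is only an illustration, not a proof: sort is a hypothetical insertion sort with the stated type, rev_s is a deliberately non-sorting function of the same type, and float_of_int plays the role of f, which preserves the order of the small integers used here.

```ocaml
(* Insertion sort with a boolean "less or equal" comparison. *)
let rec insert le x = function
  | [] -> [x]
  | y :: rest as l -> if le x y then x :: l else y :: insert le x rest

let sort le li = List.fold_left (fun acc x -> insert le x acc) [] li

(* Another function of the same type that ignores the comparison. *)
let rev_s _le li = List.rev li

let li = [3; 1; 2]
let f = float_of_int

(* map f (s cmp_u li) = s cmp_v (map f li), for s = sort *)
let sort_commutes =
  List.map f (sort ( <= ) li) = sort ( <= ) (List.map f li)

(* The same equation for s = rev_s *)
let rev_commutes =
  List.map f (rev_s ( <= ) li) = rev_s ( <= ) (List.map f li)
```

Both checks hold here, as the theorem predicts for any function of this type.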
Once you allow non-termination, either by diverging (looping indefinitely, as with the let rec f x = f x function given above), or by raising exceptions, of course you can have anything : you can build a function of type 'a -> 'b, and types don't mean anything anymore. Using Obj.magic : 'a -> 'b has the same effect.
There are saner ways to lose the equivalence to identity : you could work inside a non-empty environment, with predefined values accessible from the function. Consider for example the following function :
let counter = ref 0
let f x = incr counter; x
You still have the property that for all x, f x = x: if you only consider the return value, your function still behaves as the identity. But once you consider side effects, you're not equivalent to the (side-effect-free) identity anymore: if I know counter, I can write a separating function that returns true when given this function f, and would return false for pure identity functions.
let separate g =
  let before = !counter in
  g ();
  !counter = before + 1
If counter is hidden (for example by a module signature, or simply let f = let counter = ... in fun x -> ...), and no other function can observe it, then we again can't distinguish f and the pure identity functions. So the story is much more subtle in presence of local state.
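Putting the pieces of this answer together gives a small runnable sketch. The definitions of counter, f, and separate follow the answer (with ignore added around the call so the result type does not matter); the names f_detected and id_detected are made up for illustration.

```ocaml
let counter = ref 0

(* Behaves as the identity on its argument, but bumps the counter. *)
let f x = incr counter; x

(* Observes the shared counter before and after calling g. *)
let separate g =
  let before = !counter in
  ignore (g ());
  !counter = before + 1

let f_detected = separate f
let id_detected = separate (fun x -> x)
```

separate distinguishes f from a pure identity only because it can read counter; without access to that state the two would be indistinguishable.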
let rec f x = f (f x)
This function never terminates, but it does have type 'a -> 'a.
If we only allow total functions, the question becomes more interesting. Without using evil tricks, it's not possible to write a total function of type 'a -> 'a other than the identity, but evil tricks are fun so:
let f (x:'a):'a = Obj.magic 42
Obj.magic is an evil abomination of type 'a -> 'b which allows all kinds of shenanigans to circumvent the type system.
On second thought that one isn't total either because it will crash when used with boxed types.
So the real answer is: the identity function is the only total function of type 'a -> 'a.
Throwing an exception can also give you an 'a -> 'a type:
# let f (x:'a) : 'a = raise (Failure "aaa");;
val f : 'a -> 'a = <fun>
If you restrict yourself to a "reasonable" strongly normalizing typed λ-calculus, there is a single function of type ∀α α→α, which is the identity function. You can prove it by examining the possible normal forms of a term of this type.
Philip Wadler's 1989 article "Theorems for Free" explains how functions having polymorphic types necessarily satisfy certain theorems (e.g. a map-like function commutes with composition).
There are however some nonintuitive issues when one deals with much polymorphism. For instance, there is a standard trick for encoding inductive types and recursion with impredicative polymorphism, by representing an inductive object (e.g. a list) using its recursor function. In some cases, there are terms belonging to the type of the recursor function that are not recursor functions; there is an example in §4.3.1 of Christine Paulin's PhD thesis.

Resources