F#: How to represent a finite collection with strong typing?


I have a finite set of things all of the same type, and I wish to represent them in a strongly-typed way. I'd like to be able to manipulate the complete set and easily extract the elements. Here is one way:
type Planet = Mercury | Venus | Earth
type PlanetInfo = { Diameter: float }
let planets =
    Map [ Mercury, { Diameter = 100. }
          Venus, { Diameter = 200. }
          Earth, { Diameter = 300. } ]
let venusDiameter = planets.[Venus].Diameter
The good points about this method are:
There are exactly three Planets, as defined by the discriminated union.
We have the whole set in the map planets, which can be manipulated, iterated etc..
planets.[Mars] would cause a compile-time error, because Mars is not a Planet.
But on the downside:
There is not necessarily a one-to-one mapping between the union and the map.
The need to mention each planet twice is a shortcoming.
Here is another method which addresses the last point:
type Planet = { Name: string; Diameter: float }
let planets =
    [ { Name = "Mercury"; Diameter = 100. }
      { Name = "Venus"; Diameter = 200. }
      { Name = "Earth"; Diameter = 300. } ]
    |> List.map (fun e -> e.Name, e)
    |> Map
let venusDiameter = planets.["Venus"].Diameter
So now each planet is mentioned in only one place, but planets.["Mars"] fails to cause a compile-time error because the planet identifiers are now "stringly typed".
Is there some way of doing this which has all four good points?

Another option would be to use the Planet type as the Name member in the PlanetInfo type and initialize the Map using a transformation from list:
module Planets =
    type Planet =
        | Mercury
        | Venus
        | Earth
    type PlanetInfo = { Name: Planet; Diameter: float }
    let planets : PlanetInfo list =
        [
            { Name = Mercury; Diameter = 100. }
            { Name = Venus; Diameter = 200. }
            { Name = Earth; Diameter = 300. }
        ]
    let planetsmap = planets |> List.map (fun pi -> pi.Name, pi) |> Map.ofList
    planetsmap.[Mercury].Diameter
This approach doesn't require reflection and offers compile time type checking. So it is pretty much the same as your second approach, Monica.

How about this?
type Planet =
    | Mercury
    | Venus
    | Earth
    member this.Diameter =
        match this with
        | Mercury -> 100.
        | Venus -> 200.
        | Earth -> 300.
open Microsoft.FSharp.Reflection
// Build one value per union case, so the full set never has to be listed by hand.
let planets =
    FSharpType.GetUnionCases(typeof<Planet>)
    |> Array.map (fun case -> FSharpValue.MakeUnion(case, [||]) :?> Planet)


Using `should equal` with sequences in F# and FsUnit

I am using FsUnit.Xunit. I am getting a failure for the following test case:
[<Fact>]
let ``Initialization of DFF`` () =
    dff Seq.empty Seq.empty |> should equal (seq {Zero})
The test failure is:
Message: 
FsUnit.Xunit+MatchException : Exception of type 'FsUnit.Xunit+MatchException' was thrown.
Expected: Equals seq [Zero]
Actual: seq [Zero]
Stack Trace: 
That.Static[a](a actual, IMatcher`1 matcher)
Signal.Initialization of DFF() line 11
I get the same error if the test is:
[<Fact>]
let ``Initialization of DFF`` () =
    dff Seq.empty Seq.empty |> should equal (Seq.singleton Zero)
I have never tested equality of sequences using FsUnit.Xunit, so I am confused about what's going on. I'm not even sure what the failure message is telling me, as it seems to be saying that the expected and actual values are the same. I can get this to work fine by converting the sequences to lists, but it would be nice not to have to do that.
Could someone explain what's going on here? It seems I'm not understanding the error message and thus probably something about Equals and comparing sequence values (literals?). Thanks.
Source code to be able to reproduce (I think this is everything):
type Bit =
    | Zero
    | One
type Signal = seq<Bit>
let Nand a b =
    match a, b with
    | Zero, Zero -> One
    | Zero, One -> One
    | One, Zero -> One
    | One, One -> Zero
let Not input =
    Nand input input
let And a b =
    Not (Nand a b)
let Or a b =
    Nand (Not a) (Not b)
let private liftToSignal1 op (signal: Signal) : Signal =
    Seq.map op signal
let private liftToSignal2 op (signalA: Signal) (signalB: Signal) : Signal =
    Seq.map2 op signalA signalB
let Not' = liftToSignal1 Not
let And' = liftToSignal2 And
let Or' = liftToSignal2 Or
let rec dff data clock : Signal =
    seq {
        yield Zero
        yield! Or' (And' data clock)
                   (And' (dff data clock) (Not' clock))
    }

This is an issue with structural vs. referential equality.
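In F#, seq { 'a' } = seq { 'a' } is false but [ 'a' ] = [ 'a' ] is true. A seq is just an IEnumerable and does not support structural equality (or comparison), so two distinct sequence objects are compared by reference. That is why the failure message prints the same contents for the expected and actual values.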
Lists (and other F# container-like types) are much more 'intelligent', i.e. they support structural equality / comparison if the contained objects support it:
[ {| foo = StringComparison.Ordinal; bar = Some(1.23) |} ] =
[ {| foo = StringComparison.Ordinal; bar = Some(1.23) |} ] // true
but don't, if they contain anything that doesn't:
[ box(fun() -> 3) ] = [ box(fun() -> 3) ] // false
So, to make the test work, add a List.ofSeq:
dff Seq.empty Seq.empty |> List.ofSeq |> should equal [ Zero ]

Detect cycle in undirected graph in Ocaml

Does anyone have an idea how to detect whether there is a cycle in an undirected graph in OCaml?
Here's the type I'm using for graph:
type 'a graph = { nodes : 'a list; edges : ('a * 'a * int) list }
And for example, I would like to check if this graph contains cycles:
let graph = { nodes = ['a'; 'b'; 'c'; 'd'; 'e'; 'f'; 'g'; 'h'; 'j';];
edges = [('c', 'j', 9); ('d', 'e', 8); ('a', 'b', 8); ('b', 'c', 7); ('f', 'g', 6); ('b', 'h', 4); ('a', 'd', 4); ('g', 'h', 2); ('b', 'f', 2); ('e', 'g', 1)]}

In both directed and undirected graphs, the presence of a cycle can be detected using depth-first search. Roughly, you traverse the graph, and if a walk reaches a node it has already visited (in an undirected graph, by some edge other than the one it just arrived by), then there is a cycle.
Commonly, an additional data structure is employed for labeling already visited nodes. For example, we can employ a set data structure (using vanilla OCaml),
module Nodes = Set.Make(struct
    type t = int
    let compare = compare
  end)
let dfs {edges} f x =
  let rec loop visited x = function
    | [] -> x
    | (src,dst,data) :: rest ->
      let x = f src data in
      let visited = Nodes.add src visited in
      if Nodes.mem dst visited
      then loop visited x rest
      else ... in
  loop Nodes.empty x edges
You can also use an imperative hash table instead of a pure functional set. There is also an algorithm, called Iterative Deepening DFS, that can traverse cyclic graphs without labeling all visited nodes, which is useful when your graph is huge (and won't fit into the memory).
Unless you're doing this as an exercise, I would suggest using an existing graph library for OCaml, e.g., OCamlgraph or Graphlib.
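To make the depth-first idea concrete for the graph type from the question, here is a minimal sketch. The names neighbours, has_cycle and came_by are made up for this illustration, and it uses an imperative Hashtbl for the visited marks rather than the Set shown above. Each edge is identified by its position in the edge list so that the traversal does not walk straight back along the edge it arrived by.

```ocaml
type 'a graph = { nodes : 'a list; edges : ('a * 'a * int) list }

(* Neighbours of [v], treating every edge as undirected. Each neighbour is
   paired with the index of the edge that leads to it. *)
let neighbours g v =
  List.concat
    (List.mapi
       (fun i (a, b, _) ->
         if a = v then [ (b, i) ] else if b = v then [ (a, i) ] else [])
       g.edges)

(* [has_cycle g] is true when the undirected graph [g] contains a cycle. *)
let has_cycle g =
  let visited = Hashtbl.create 16 in
  (* [came_by] is the index of the edge we arrived through (-1 at a root). *)
  let rec dfs v came_by =
    Hashtbl.replace visited v ();
    List.exists
      (fun (w, i) -> i <> came_by && (Hashtbl.mem visited w || dfs w i))
      (neighbours g v)
  in
  (* Start a search from every node not reached yet, so that disconnected
     components are covered as well. *)
  List.exists (fun v -> (not (Hashtbl.mem visited v)) && dfs v (-1)) g.nodes

(* A three-node path has no cycle; closing it into a triangle creates one. *)
let path = { nodes = [ 1; 2; 3 ]; edges = [ (1, 2, 0); (2, 3, 0) ] }
let triangle = { path with edges = (3, 1, 0) :: path.edges }
```

Scanning the edge list for neighbours keeps the sketch short but costs O(E) per node; building an adjacency table first would be the usual choice for larger graphs.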

It is also possible to avoid visiting the same edge twice by removing it from the list of available edges; assuming order does not matter among edges, you can remove an edge as follows:
let edges_remove_edge edges edge =
  let (src, dst, _) = edge in
  let rec iter edges res = match edges with
    | [] -> res
    | ((s, d, _) as e)::edges ->
      if (s = src && d = dst) then
        (* found it: keep what was already scanned plus the remaining tail *)
        res @ edges
      else
        iter edges (e::res)
  in iter edges []
Removing an edge from a graph is then done by building a new graph that shares data with the previous graph, but with a modified list of edges:
let graph_remove_edge graph edge =
  { nodes = graph.nodes;
    edges = edges_remove_edge graph.edges edge }
You can then transform the graph along the recursive calls of your graph traversal; the example does nothing interesting here, it is just to demonstrate the structure:
let choose_edge graph = match graph.edges with
  | [] -> None
  | e::_ -> Some e;;
let rec visit graph = match (choose_edge graph) with
  | None -> graph
  | Some e -> visit (graph_remove_edge graph e);;
# visit graph;;
- : char graph =
{nodes = ['a'; 'b'; 'c'; 'd'; 'e'; 'f'; 'g'; 'h'; 'j']; edges = []}
Or, you keep track of the current graph with a ref:
let visit2 graph =
  let g = ref graph in
  let rec v () = match (choose_edge !g) with
    | None -> ()
    | Some e -> begin g := graph_remove_edge !g e; v () end
  in v(); !g

I managed to detect cycle by using union-find data structure.
A structure to represent a subset for union-find:
type uf = { parent : int array; rank : int array }
let create n =
  { parent = Array.init n (fun i -> i);
    rank = Array.make n 0 }
A utility function to find set of an element. It uses path compression technique:
let rec find uf i =
  let pi = uf.parent.(i) in
  if pi = i then
    i
  else begin
    let ci = find uf pi in
    uf.parent.(i) <- ci;
    ci
  end
A function that does union of two sets of x and y. It uses union by rank:
let union ({ parent = p; rank = r } as uf) x y =
  let cx = find uf x in
  let cy = find uf y in
  if cx = cy then raise (Failure "Cycle detected") else begin
    if r.(cx) > r.(cy) then
      p.(cy) <- cx
    else if r.(cx) < r.(cy) then
      p.(cx) <- cy
    else begin
      r.(cx) <- r.(cx) + 1;
      p.(cy) <- cx
    end
  end
I created a function for checking whether adding the edge between c1 and c2 closes a cycle. It uses findIndex to map a node to its position in g.nodes, so findIndex has to be defined first:
let rec findIndex x lst =
  match lst with
  | [] -> raise (Failure "Not Found")
  | h :: t -> if x = h then 0 else 1 + findIndex x t
let thereIsCycle c1 c2 g subset =
  try
    union subset (findIndex c1 g.nodes) (findIndex c2 g.nodes);
    false
  with Failure _ -> true
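For reference, the same union-find idea can be applied to the whole edge list in one pass. The following is only a sketch under a few assumptions: the name has_cycle_uf is made up, nodes are mapped to indices with a Hashtbl instead of a list scan, every edge endpoint is assumed to appear in nodes, and union by rank is left out for brevity (path compression is kept).

```ocaml
type 'a graph = { nodes : 'a list; edges : ('a * 'a * int) list }

(* [has_cycle_uf g] is true when some edge joins two nodes that are already
   connected by earlier edges, i.e. when the undirected graph has a cycle. *)
let has_cycle_uf g =
  let n = List.length g.nodes in
  (* Map each node to its position in [g.nodes]. *)
  let index = Hashtbl.create 16 in
  List.iteri (fun i v -> Hashtbl.replace index v i) g.nodes;
  let parent = Array.init n (fun i -> i) in
  (* Find the representative of [i], compressing the path on the way back. *)
  let rec find i =
    if parent.(i) = i then i
    else begin
      let root = find parent.(i) in
      parent.(i) <- root;
      root
    end
  in
  List.exists
    (fun (a, b, _) ->
      (* Hashtbl.find raises Not_found if an endpoint is missing from nodes. *)
      let ra = find (Hashtbl.find index a) in
      let rb = find (Hashtbl.find index b) in
      if ra = rb then true
      else begin
        parent.(ra) <- rb;
        false
      end)
    g.edges
```

Returning a bool from List.exists avoids using an exception for control flow; the scan stops at the first edge that closes a cycle.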

How to collapse a recursive tree in OCaml

I have a tree type:
type tree = Vertex of int * tree list;;
My recursive equality definition is that two trees are equal if their ints are equal and all of their children are equal.
How do I build the function
topo: tree -> tree list
that creates a list of all of the trees in depth first search order with each tree appearing once and only once (according to the equality definition)? I want to do this in a computationally efficient way. Maybe use lazy or a hashmap?
Here is my attempt, the code blows up when the length is too large:
type tree = Vertex of int * (tree list)
let rec base = function
| 0 -> Vertex (0, [])
| i -> Vertex (i, [base (i - 1)])
let rec range = function
| 0 -> [0]
| i -> i :: range (i - 1)
let agg i = Vertex (-1, List.map base (range i))
let rec equals (a: tree) (b: tree) : bool =
let rec deep_match a_dep b_dep = match a_dep, b_dep with
| [], [] -> true
| [], _
| _, [] -> false
| x::xs, y::ys -> equals x y && deep_match xs ys
in
let Vertex (ai, al) = a in
let Vertex (bi, bl) = b in
ai = bi && deep_match al bl
let rec in_list (a: tree) (l: tree list) : bool = match l with
| [] -> false
| hd::tl -> equals a hd || in_list a tl
let rec topological (pool: tree list) (t: tree) : tree list =
if in_list t pool then pool else
t::match t with
| Vertex(_, []) -> pool
| Vertex(_, deps) -> List.fold_left topological pool deps
let big_agg = agg 100_000
let topo_ordered = topological [] big_agg;;
Printf.printf "len %i\n" (List.length topo_ordered)

To make it efficient you need to implement ordering and hash-consing. With total ordering, you can store your trees in a balanced tree or even a hashtable, thus turning your in_list into O(logN) or even O(1). Adding hash-consing will enable O(1) comparison of your trees (at the cost of less efficient tree construction).
Depending on your design constraints, you can have only one of the two instead of both. For didactic purposes, let's implement hash-consing for your particular representation.
To implement hash-consing you need to make your constructor private and hide the data constructors behind an abstraction wall (to prevent users from breaking your hash-consing properties):
module Tree : sig
  type t = private Vertex of int * t list
  val create : int -> t list -> t
  val equal : t -> t -> bool
end = struct
  type t = Vertex of int * t list
  let repository = Hashtbl.create 64
  let create n children =
    let node = Vertex (n,children) in
    try Hashtbl.find repository node
    with Not_found -> Hashtbl.add repository node node; node
  let equal x y = x == y
end
Since we guaranteed that structurally equal trees are physically equal during the tree creation (i.e., if there exists an equal tree in our repository then we return it), we are now able to substitute structural equality with physical equality, i.e., with pointer comparison.
We got a fast comparison, but at a price: we are now leaking memory, since we need to store every tree ever created, and the create function is now O(N). We can alleviate the first problem by using ephemerons, but the latter problem will persist, of course.
Another issue is that we're not able to put our trees into an ordered structure, like a map or a set. We can of course use the regular polymorphic compare, but since it is O(N), inserting into such a structure will become quadratic. Not an option for us. Therefore we need to add a total ordering on our trees. We can theoretically do this without changing the representation (using ephemerons), but it is easier just to add an order parameter to our tree representation, e.g.,
module Tree : sig
  type order (* = private int *) (* add this for debugging *)
  type t = private Vertex of order * int * t list
  val create : int -> t list -> t
  val equal : t -> t -> bool
  val compare : t -> t -> int
end = struct
  type order = int
  type t = Vertex of order * int * t list
  type tree = t
  module Repository = Hashtbl.Make(struct
      type t = tree
      let max_hash = 16
      (* structural equality that ignores the order field *)
      let rec equal (Vertex (_,p1,x)) (Vertex (_,p2,y)) =
        match compare p1 p2 with
        | 0 -> equal_trees x y
        | _ -> false
      and equal_trees xs ys = match xs, ys with
        | [],[] -> true
        | [],_ | _,[] -> false
        | x :: xs, y::ys -> equal x y && equal_trees xs ys
      (* hash that ignores the order field of the node itself *)
      let rec hash (Vertex (_,p,xs)) =
        hash_trees (Hashtbl.hash p) max_hash xs
      and hash_trees hash depth = function
        | x :: xs when depth > 0 ->
          hash_trees (Hashtbl.hash x) (depth-1) xs
        | _ -> hash
    end)
  let repository = Repository.create 64
  let create n children =
    try Repository.find repository (Vertex (0,n,children))
    with Not_found ->
      let order = Repository.length repository + 1 in
      let node = Vertex (order,n,children) in
      Repository.add repository node node; node
  let equal x y = x == y
  let order (Vertex (order,_,_)) = order
  let compare x y = compare (order x) (order y)
end
We had to manually implement the structural variants of equal and hash for our trees because we need to ignore the order in the comparison when we store a new tree in the repository. It looks like a bit of work, but in real life you can do this using derivers.
Anyways, now we got a comparable version of a tree with a comparison function which is O(1), so we can put our trees in sets and maps, and implement your topo efficiently.
A nice feature of both implementations is a tight representation of a tree, since sharing is guaranteed by the create function. E.g.,
# let t1 = Tree.create 42 [];;
val t1 : Tree.t = Tree.Vertex (1, 42, [])
# let t3 = Tree.create 42 [t1; t1];;
val t3 : Tree.t =
Tree.Vertex (2, 42, [Tree.Vertex (1, 42, []); Tree.Vertex (1, 42, [])])
# let t5 = Tree.create 42 [t1; t3; t1];;
val t5 : Tree.t =
Tree.Vertex (3, 42,
[Tree.Vertex (1, 42, []);
Tree.Vertex (2, 42, [Tree.Vertex (1, 42, []); Tree.Vertex (1, 42, [])]);
Tree.Vertex (1, 42, [])])
#
In this example, t1 in t5 and t3 will be the same pointer.

For optimal performance, one possibility would be to use hash-consing. However, in your current example, both the generation and the uniqueness test are quadratic in n. Fixing both points already seems to improve performance a lot.
First, we can avoid the quadratic tree generation by adding a lot of sharing:
let range max =
  let rec range elt l n =
    if n > max then elt::l
    else
      let next = Vertex(n,[elt]) in
      range next (elt::l) (n+1) in
  range (Vertex(0,[])) [] 1
let agg i = Vertex (-1, range i)
With this change, it becomes reasonable to generate a tree with 10^10 elements (but only 10^5 unique elements).
Then, the uniqueness test can be done with a set (or a hashtable):
module S = Set.Make(struct type t = tree let compare = compare end)
let rec topological (set, pool) t =
  if S.mem t set then (set, pool) else
    let set = S.add t set in
    let set, pool =
      match t with
      | Vertex(_, []) -> set, pool
      | Vertex(_, deps) -> List.fold_left topological (set,pool) deps in
    set, t::pool
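As a small end-to-end illustration of the set-based approach, here is a self-contained sketch that wraps the traversal in a function with the signature asked for in the question. The helper name go and the sample trees are made up for this example, and it relies on the polymorphic compare, which is fine for small inputs but is itself linear in the size of the trees compared.

```ocaml
type tree = Vertex of int * tree list

module S = Set.Make (struct
  type t = tree
  let compare = compare
end)

(* [topo t] lists every distinct subtree of [t] exactly once. A tree is
   consed onto the result after its children, so it ends up before them. *)
let topo (t : tree) : tree list =
  let rec go (seen, acc) t =
    if S.mem t seen then (seen, acc)
    else
      let (Vertex (_, deps)) = t in
      let seen, acc = List.fold_left go (S.add t seen, acc) deps in
      (seen, t :: acc)
  in
  let _, acc = go (S.empty, []) t in
  acc

(* The leaf appears three times and [mid] twice, but each is listed once. *)
let leaf = Vertex (0, [])
let mid = Vertex (1, [ leaf ])
let root = Vertex (-1, [ leaf; mid; mid ])
let distinct = topo root
```

Here distinct holds root, mid and leaf, in that order.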

Why isn't this F# inner function tail-recursive?

If I call this function with a very high initial currentReflection value I get a stack overflow exception, which indicates that the function is not tail-recursive (correct?). My understanding was that as long as the recursive call was the final computation of the function then it should be compiler-optimized as a tail-recursive function to reuse the current stack frame. Anyone know why this isn't the case here?
let rec traceColorAt intersection ray currentReflection =
    // some useful values to compute at the start
    let matrix = intersection.sphere.transformation |> transpose |> invert
    let transNormal = matrix.Transform(intersection.normal) |> norm
    let hitPoint = intersection.point
    let ambient = ambientColorAt intersection
    let specular = specularColorAt intersection hitPoint transNormal
    let diffuse = diffuseColorAt intersection hitPoint transNormal
    let primaryColor = ambient + diffuse + specular
    if currentReflection = 0 then
        primaryColor
    else
        let reflectDir = (ray.direction - 2.0 * norm ((Vector3D.DotProduct(ray.direction, intersection.normal)) * intersection.normal))
        let newRay = { origin=intersection.point; direction=reflectDir }
        let intersections = castRay newRay scene
        match intersections with
        | [] -> primaryColor
        | _ ->
            let newIntersection = List.minBy(fun x -> x.t) intersections
            let reflectivity = intersection.sphere.material.reflectivity
            primaryColor + traceColorAt newIntersection newRay (currentReflection - 1) * reflectivity

The recursive call to traceColorAt appears as part of a larger expression. This prevents tail call optimization because further computation is necessary after traceColorAt returns.
To convert this function to be tail recursive, you could add an additional accumulator parameter for the color computed so far. Because each reflected contribution is scaled by the reflectivity of every surface hit before it, you also need to carry the accumulated reflectivity factor (called weight here). The outermost call to traceColorAt would pass the "zero" value for the color (black?) and a weight of 1.0, and each recursive call would sum in the adjustment it computes, e.g. the code would look something like:
let rec traceColorAt intersection ray currentReflection accColor weight =
    ...
    let newAccColor = accColor + (ambient + diffuse + specular) * weight
    ...
    match intersections with
    | [] -> newAccColor
    | _ ->
        ...
        traceColorAt newIntersection newRay (currentReflection - 1) newAccColor (weight * reflectivity)
If you wish to hide the extra parameter from callers, introduce a helper function that performs the bulk of the work and call that from traceColorAt.

Tail-call optimization only applies when the function simply returns the result of another call. In this case, you have primaryColor + traceColorAt(...), which means that it is not simply returning the value of the function; it is also adding something to it.
You could fix this by passing the current accumulated color as a parameter.
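The shape of that rewrite can be shown on a stripped-down model. The sketch below is written in OCaml rather than F#, and it is not the ray tracer from the question: it assumes each bounce contributes a plain float from a list and that a single reflectivity applies to every surface. The names trace, trace_tail, acc and weight are made up for the illustration.

```ocaml
(* Not tail-recursive: the multiplication and addition still have to run
   after the recursive call returns, so every bounce keeps a stack frame. *)
let rec trace colors reflectivity =
  match colors with
  | [] -> 0.0
  | c :: rest -> c +. (reflectivity *. trace rest reflectivity)

(* Tail-recursive: [acc] carries the color summed so far and [weight] the
   product of the reflectivities met so far, so the recursive call is the
   last thing the function does. *)
let trace_tail colors reflectivity =
  let rec go acc weight = function
    | [] -> acc
    | c :: rest -> go (acc +. (weight *. c)) (weight *. reflectivity) rest
  in
  go 0.0 1.0 colors
```

Both functions compute c0 + r * (c1 + r * (c2 + ...)); only the second does so without pending work after the recursive call.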

Embed a variable inside an F# quotation

I'm writing an F# dsl for SQL (http://github.com/kolosy/furious).
A select statement would look like this:
type person = {
    personId: string
    firstname: string
    lastname: string
    homeAddress: address
    workAddress: address
    altAddresses: address seq
}
and address = {
    addressId: string
    street1: string
    zip: string
}
let (neighbor: person seq) =
    db.Yield <@ Seq.filter (fun p -> p.homeAddress.zip = "60614") @>
The obvious (and silly) question is... How do I parametrize the quotation?
If I just write something like:
let z = "60614"
let (neighbor: person seq) =
    db.Yield <@ Seq.filter (fun p -> p.homeAddress.zip = z) @>
then z gets resolved into a static property accessor (PropertyGet(None, String z, [])). I need something that will let me retrieve the value of the variable/let binding based solely on the quotation. Ideas?

Quotations are not my forte, but check out the difference here:
let z = "60614"
let foo = <@ List.filter (fun s -> s = z) @>
printfn "%A" foo
let foo2 =
    let z = z
    <@ List.filter (fun s -> s = z) @>
printfn "%A" foo2
I think maybe having 'z' be local to the expression means the value is captured, rather than a property reference.

In addition to what Brian wrote - I believe that the encoding of access to global let bound values is also pretty stable and they will quite likely continue to be encoded as PropGet in the future.
This means that you could support this case explicitly in your translator and add a simple pre-processing step to get values of these properties. This can be done using ExprShape (which allows you to fully traverse quotation just using 4 cases). This would allow your DSL to support the general case as well.
The following function traverses quotation and replaces access to global lets with their value:
open Microsoft.FSharp.Quotations
let rec expand e =
    match e with
    // Extract value of global 'let' bound symbols
    | Patterns.PropertyGet(None, pi, []) ->
        Expr.Value(pi.GetValue(null, [| |]), e.Type)
    // standard recursive processing of quotations
    | ExprShape.ShapeCombination(a, b) ->
        ExprShape.RebuildShapeCombination(a, b |> List.map expand)
    | ExprShape.ShapeLambda(v, b) -> Expr.Lambda(v, expand b)
    | ExprShape.ShapeVar(v) -> Expr.Var(v)
Then you can write the following to get a quotation that contains value instead of PropGet:
let z = 5
let eOrig = <@ Seq.filter (fun p -> p = z) [ 1 .. 10 ] @>
let eNice = expand eOrig