F# map and distinct objects - dictionary

I have some nondescript but distinct objects (specifically, unnamed variables in logic expressions) that I want to put in a map that associates them with their values. As I understand it, map needs to distinguish objects by some ordered field, so I can't just have
type Term =
...
| Var
as this would not make different variables distinguishable from each other. Instead I could presumably have
type Term =
...
| Var of int64
and then have a new_var function that increments a global int64 counter and returns a new variable with the incremented value. This seems slightly inelegant, but should work.
Is the global counter the recommended way to handle this, or is there a more idiomatic method?

It's not really a "map having to distinguish objects" thing - when you declare a type like this:
type Term =
| Var
you have a type with a single valid value - Var. If you're saying you want to have objects that are distinct - this is not what you want. You can still use that type as a key in a map - not a particularly useful one though, since it will have at most a single element.
Using a counter is a good enough way to handle it. If you don't want a "global" one, you can roll it into a function using a ref cell to hold it:
type Term =
    | Var of int
let make =
    let counter = ref 0
    fun () ->
        counter := !counter + 1
        Term.Var (!counter)
Or use GUIDs if you don't care about the values and want the counter out of the picture:
type Term =
    | Var of System.Guid
let make () =
    Term.Var (System.Guid.NewGuid())
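For comparison, here is a minimal sketch of the same counter-in-a-closure idea in OCaml, together with a map keyed by the generated variables. The names term, make_var, and TermMap are made up for illustration, and using the polymorphic compare as the key ordering is an assumption that only suits simple types like this one.

```ocaml
type term =
  | Var of int

(* The counter lives inside the closure, so no global state is exposed. *)
let make_var : unit -> term =
  let counter = ref 0 in
  fun () ->
    incr counter;
    Var !counter

(* Map.Make needs a total order on keys; the built-in polymorphic
   compare is enough for this simple variant type. *)
module TermMap = Map.Make (struct
  type t = term
  let compare = compare
end)

let x = make_var ()
let y = make_var ()

(* Each fresh variable is a distinct key. *)
let env =
  TermMap.empty
  |> TermMap.add x "value of x"
  |> TermMap.add y "value of y"
```

Because every call to make_var yields a different integer, x and y never collide as keys.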

Related

Balanced tree for functional symbol table

I'm doing the exercises of "Modern Compiler Implementation in ML" (Andrew Appel). One of them (ex 1.1 d) is to recommend a balanced-tree data structure for a functional symbol table. Appel mentioned that such a data structure should rebalance on insertion but not on lookup. Being totally new to functional programming, I found this confusing. What is the key insight behind this requirement?
A tree that’s rebalanced on every insertion and deletion doesn’t need to rebalance on lookup, because lookup doesn’t modify the structure. If it was balanced before a lookup, it will stay balanced during and after.
In functional languages, insertion and rebalancing can be more expensive than in a procedural one. Because you can’t alter any node in place, you replace a node by creating a new node, then replacing its parent with a new node whose children are the new daughter and the unaltered older daughter, and then replace the grandparent node with one whose children are the new parent and her older sister, and so on up. You finish when you create a new root node for the updated tree and garbage-collect all the nodes you replaced. However, some tree structures have the desirable property that they need to replace no more than O(log N) nodes of the tree on an insertion and can re-use the rest. This means that the rotation of a red-black tree (for example) has not much more overhead than an unbalanced insertion.
Also, you will typically need to query a symbol table much more often than you update it. It therefore becomes less tempting to try to make insertion faster: if you’re inserting, you might as well rebalance.
The question of which self-balancing tree structure is best for a functional language has been asked here, more than once.
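To make the node-replacement argument concrete, here is a small OCaml sketch of insertion into a persistent binary search tree. It deliberately leaves out rebalancing, so it is not a red-black tree; it only shows that an insertion rebuilds the nodes on the search path and re-uses everything else. The type and function names are illustrative.

```ocaml
type tree =
  | Leaf
  | Node of tree * int * tree

(* Only the nodes on the path from the root to the insertion point are
   rebuilt; every subtree off that path is re-used as is. *)
let rec insert k t =
  match t with
  | Leaf -> Node (Leaf, k, Leaf)
  | Node (l, v, r) ->
    if k < v then Node (insert k l, v, r)
    else if k > v then Node (l, v, insert k r)
    else t

(* Lookup never builds a node, so it cannot disturb the tree's shape. *)
let rec mem k = function
  | Leaf -> false
  | Node (l, v, r) ->
    if k < v then mem k l else if k > v then mem k r else true

(* Helper for inspecting sharing between two versions of a tree. *)
let left_of = function
  | Leaf -> Leaf
  | Node (l, _, _) -> l

let t1 = List.fold_left (fun t k -> insert k t) Leaf [5; 2; 8; 1; 3]
let t2 = insert 9 t1
```

Here t1 is still a valid tree after t2 is built, and the left subtree of both roots is physically the same value (left_of t1 == left_of t2). A balanced variant would add its rotations on the same rebuilt path.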
Since Davislor already answered your question extensively, here are mostly some implementation hints. I would add that the choice of data structure for your symbol table is probably not relevant for a toy compiler. Compilation time only starts to become an issue when your compiler is used on a lot of code and the code is recompiled often.
Sticking to an O(n) insert/lookup data structure is fine in practice until it isn't.
Signature-wise, all you want is a key-value mapping, insert, and lookup:
signature SymTab =
sig
  type id
  type value
  type symtab
  val empty : symtab
  val insert : id -> value -> symtab -> symtab
  val lookup : id -> symtab -> value option
end
A simple O(n) implementation with lists might be:
structure ListSymTab : SymTab =
struct
  type id = string
  type value = int
  type symtab = (id * value) list
  val empty = []
  fun insert id value [] = [(id, value)]
    | insert id value ((id',value')::symtab) =
        if id = id'
        then (id,value)::symtab
        else (id',value')::insert id value symtab
  fun lookup _ [] = NONE
    | lookup id ((id',value)::symtab) =
        if id = id' then SOME value else lookup id symtab
end
You might use it like:
- ListSymTab.lookup "hello" (ListSymTab.insert "hello" 42 ListSymTab.empty);
> val it = SOME 42 : int option
Then again, maybe your symbol table doesn't map strings to integers, or you may have one symbol table for variables and one for functions.
You could parameterise the id/value types using a functor:
functor ListSymTabFn (X : sig
                            eqtype id
                            type value
                          end) : SymTab =
struct
  type id = X.id
  type value = X.value
  (* The rest is the same as ListSymTab. *)
end
And you might use it like:
- structure ListSymTab = ListSymTabFn(struct type id = string type value = int end);
- ListSymTab.lookup "world" (ListSymTab.insert "hello" 42 ListSymTab.empty);
> val it = NONE : int option
All you need for a list-based symbol table is that the identifiers/symbols can be compared for equality. For your balanced-tree symbol table, you need identifiers/symbols to be orderable.
Instead of implementing balanced trees from scratch, look e.g. at SML/NJ's RedBlackMapFn:
To create a structure implementing maps (dictionaries) over a type T [...]:
structure MapT = RedBlackMapFn (struct
                                  type ord_key = T
                                  val compare = compareT
                                end)
Try this example with T as string and compare as String.compare:
$ sml
Standard ML of New Jersey v110.76 [built: Sun Jun 29 03:29:51 2014]
- structure MapS = RedBlackMapFn (struct
type ord_key = string
val compare = String.compare
end);
[autoloading]
[library $SMLNJ-BASIS/basis.cm is stable]
[library $SMLNJ-LIB/Util/smlnj-lib.cm is stable]
[autoloading done]
structure MapS : ORD_MAP?
- open MapS;
...
Opening the structure is an easy way to explore the available functions and their types.
We can then create a similar functor to ListSymTabFn, but one that takes an additional compare function:
functor RedBlackSymTabFn (X : sig
                                type id
                                type value
                                val compare : id * id -> order
                              end) : SymTab =
struct
  type id = X.id
  type value = X.value
  structure SymTabX = RedBlackMapFn (struct
                                       type ord_key = X.id
                                       val compare = X.compare
                                     end)
  (* The 'a map type inside SymTabX maps X.id to anything. *)
  (* We are, however, only interested in mapping to values. *)
  type symtab = value SymTabX.map
  (* Use other stuff in SymTabX for empty, insert, lookup. *)
end
Finally, you can use this as your symbol table:
structure SymTab = RedBlackSymTabFn(struct
                                      type id = string
                                      type value = int
                                      val compare = String.compare
                                    end);
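For readers more at home in OCaml, the same functor pattern can be sketched on top of the standard library's Map.Make, which is documented as being implemented with balanced binary trees. This is an illustrative translation rather than part of the SML answer: SYM_TAB, MakeSymTab, and StringSymTab are made-up names, and the comparison function follows OCaml's convention of returning an int.

```ocaml
module type SYM_TAB = sig
  type id
  type value
  type symtab
  val empty : symtab
  val insert : id -> value -> symtab -> symtab
  val lookup : id -> symtab -> value option
end

(* The functor argument supplies the key type, the value type, and an
   ordering on keys, which is all a balanced-tree map needs. *)
module MakeSymTab (X : sig
  type id
  type value
  val compare : id -> id -> int
end) : SYM_TAB with type id = X.id and type value = X.value = struct
  type id = X.id
  type value = X.value

  module M = Map.Make (struct
    type t = X.id
    let compare = X.compare
  end)

  (* 'a M.t maps ids to anything; we fix it to map ids to values. *)
  type symtab = value M.t

  let empty = M.empty
  let insert = M.add
  let lookup = M.find_opt
end

module StringSymTab = MakeSymTab (struct
  type id = string
  type value = int
  let compare = String.compare
end)
```

With this, StringSymTab.(lookup "hello" (insert "hello" 42 empty)) evaluates to Some 42, and looking up a key that was never inserted gives None.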

Pointer to a record in OCaml

I am implementing binary search trees in OCaml, trying to use as much imperative programming as possible.
I have the following data type:
type tKey = Key of int;;
type tBST = Null | Pos of node ref
and node = {mutable key : tKey; mutable left : tBST; mutable right : tBST};;
I am having trouble with this function:
let createNode k tree =
  tree := Pos ({key = k; left = Null; right = Null});;
Error: This record expression is expected to have type node ref
The field key does not belong to type ref
A binary search tree can be either Null (means empty tree) or a Pos. A tree Pos is a pointer to a node, and a node is a structure of a key and 2 other trees (left and right).
My main goal here is to have a tree that is modified after functions are over. Passing tree by reference so when createNode is over, the tBST I passed as parameter is modified.
Question: is actually possible to do what I am trying in OCaml? if so, how could I change my function createNode and/or data type to make this happen?
Thank you very much.
It is possible, but you need to create the Pos node with a reference explicitly:
Pos (ref {key = k; (*...*)})
Whether what you are trying to do is recommended practice in a language like OCaml is a different story, though.
The question has already been answered. I would just like to add a side note: The use of ref seems superfluous in this case.
A value of type tBST is either Null or a mutable pointer. If it is Null it will remain Null. If it is non-Null, it will remain non-Null, but the actual pointer might change. That might well be what you intended, but I have my doubts. In particular, what tBST does not do, is to emulate C-style pointers (which are either null or really point somewhere). I suspect, though, that that was your intention.
The idiomatic way to emulate C-style pointers is to just use the built-in option type, like so:
type tBST = node option
A value of type node option is either None or Some n, where n is a pointer to a value of type node. You use tBST for mutable fields (of the record node), so you would effectively have mutable C-style pointers to nodes.
Here is what you probably had in mind:
type tree = node option ref
and node = {
  mutable left: tree;
  mutable key: int;
  mutable right: tree;
};;
let t0 : tree = ref None;;
let t1 : tree = ref (Some { left = ref None; key = 1; right = ref None; }) ;;
let create_node key tree =
  tree := Some { left = ref None; key; right = ref None; }
There is no need for a separate key type, but you can have one if you want, and with recent OCaml versions there is no runtime overhead for it.
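Building on that representation, a full in-place insertion might look like the sketch below. The functions insert and to_list are not from the answer above; they are illustrative, and duplicate keys are simply ignored, which is an assumption about the desired behaviour.

```ocaml
type tree = node option ref
and node = {
  mutable left : tree;
  mutable key : int;
  mutable right : tree;
}

(* Walk down the tree and overwrite the first empty reference found.
   The caller's tree is modified in place; nothing is returned. *)
let rec insert key (t : tree) =
  match !t with
  | None -> t := Some { left = ref None; key; right = ref None }
  | Some n ->
    if key < n.key then insert key n.left
    else if key > n.key then insert key n.right
    (* equal keys: the tree is left unchanged *)

(* In-order traversal, handy for checking the result. *)
let rec to_list (t : tree) =
  match !t with
  | None -> []
  | Some n -> to_list n.left @ (n.key :: to_list n.right)

let t : tree = ref None
let () = List.iter (fun k -> insert k t) [5; 2; 8; 1]
```

After the List.iter call, t itself has changed, which is the "modified after the function is over" behaviour the question asks for.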

How to return a new type with an updated value

If I want to change a value in a list, I will return a new list with the new value instead of changing the value in the old list.
Now I have four types. I need to update the value location in varEnd; instead of changing the value, I need to return a new type with the updated value.
type varEnd = {
  v: ctype;
  k: varkind;
  l: location;
}
;;
type varStart = {
  ct: ctype;
  sy: sTable;
  n: int;
  stm: stmt list;
  e: expr
}
and sEntry = Var of varEnd | Fun of varStart
and sTable = (string * sEntry) list
type environment = sTable list;;
(a function where environment is the only parameter I can use)
let allocateMem (env:environment) : environment =
I tried to use List.iter, but it changes the value directly, and the type is not mutable anyway. I think List.fold will be a better option.
The biggest issue I have is that there are four different types.
I think you're saying that you know how to change an element of a list by constructing a new list.
Now you want to do this to an environment, and an environment is a list of quite complicated things. But this doesn't make any difference, the way to change the list is the same. The only difference is that the replacement value will be a complicated thing.
I don't know what you mean when you say you have four types. I see a lot more than four types listed here. But on the other hand, an environment seems to contain things of basically two different types.
Maybe (but possibly not) you're saying you don't know a good way to change just one of the four fields of a record while leaving the others the same. This is something for which there's a good answer. Assume that x is something of type varEnd. Then you can say:
{ x with l = loc }
If, in fact, you don't know how to modify an element of a list by creating a new list, then that's the thing to figure out first. You can do it with a fold, but in fact you can also do it with List.map, which is a little simpler. You can't do it with List.iter.
Update
Assume we have a record type like this:
type r = { a: int; b: float; }
Here's a function that takes r list list and adds 1.0 to the b fields of those records whose a fields are 0.
let incr_ll rll =
  let f r = if r.a = 0 then { r with b = r.b +. 1.0 } else r in
  List.map (List.map f) rll
The type of this function is r list list -> r list list.
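The same nested List.map shape carries over to something closer to the question's environment. The sketch below is only an approximation: ctype, varkind, and location are placeholder definitions because the question does not show them, the Fun case carries a string instead of the varStart record (so the nested table in sy is not visited), and relocate takes the new location as an extra parameter, unlike the question's allocateMem.

```ocaml
(* Placeholder definitions; the real ones are not shown in the question. *)
type ctype = Int
type varkind = Local
type location = int

type var_end = { v : ctype; k : varkind; l : location }

type s_entry =
  | Var of var_end
  | Fun of string (* stand-in for the question's varStart record *)

type s_table = (string * s_entry) list
type environment = s_table list

(* Build a new environment in which every Var entry has location new_loc.
   The input environment is not modified. *)
let relocate (new_loc : location) (env : environment) : environment =
  let update_entry (name, entry) =
    match entry with
    | Var x -> (name, Var { x with l = new_loc })
    | Fun _ -> (name, entry)
  in
  List.map (List.map update_entry) env

let env0 = [ [ ("a", Var { v = Int; k = Local; l = 0 }); ("f", Fun "body") ] ]
let env1 = relocate 4 env0
```

The pattern match is what deals with the "several types" worry: each constructor gets its own case, and only the Var case rebuilds its record.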

Can I insert into a map by key in F#?

I'm messing around a bit with F# and I'm not quite sure if I'm doing this correctly. In C# this could be done with an IDictionary or something similar.
type School() =
    member val Roster = Map.empty with get, set
    member this.add(grade: int, studentName: string) =
        match this.Roster.ContainsKey(grade) with
        | true -> // Can I do something like this.Roster.[grade].Insert([studentName])?
        | false -> this.Roster <- this.Roster.Add(grade, [studentName])
Is there a way to insert into the map if it contains a specified key or am I just using the wrong collection in this case?
The F# Map type is a mapping from keys to values just like ordinary .NET Dictionary, except that it is immutable.
If I understand your aim correctly, you're trying to keep a list of students for each grade. The type in that case is a map from integers to lists of names, i.e. Map<int, string list>.
The Add operation on the map actually either adds or replaces an element, so I think that's the operation you want in the false case. In the true case, you need to get the current list, append the new student and then replace the existing record. One way to do this is to write something like:
type School() =
    member val Roster = Map.empty with get, set
    member this.Add(grade: int, studentName: string) =
        // Try to get the current list of students for a given 'grade'
        let studentsOpt = this.Roster.TryFind(grade)
        // If the result was 'None', then use empty list as the default
        let students = defaultArg studentsOpt []
        // Create a new list with the new student at the front
        let newStudents = studentName::students
        // Create & save map with new/replaced mapping for 'grade'
        this.Roster <- this.Roster.Add(grade, newStudents)
This is not thread-safe (because calling Add concurrently might not update the map properly). However, you can access school.Roster at any time, iterate over it (or share references to it) safely, because it is an immutable structure. However, if you do not care about that, then using standard Dictionary would be perfectly fine too - depends on your actual use case.
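The same "find, default to empty, prepend, re-add" steps can be sketched in OCaml without any mutable property, using the standard library's Map.update (this assumes OCaml 4.08 or later for the Int module). The names IntMap, add_student, and roster are illustrative.

```ocaml
module IntMap = Map.Make (Int)

(* Return a new roster with the student added to the given grade.
   A missing grade starts from a one-element list. *)
let add_student grade name roster =
  IntMap.update grade
    (function
      | None -> Some [ name ]
      | Some students -> Some (name :: students))
    roster

let roster =
  IntMap.empty
  |> add_student 2 "Anna"
  |> add_student 2 "Ben"
  |> add_student 3 "Cara"
```

Each call returns a new map and leaves the old one untouched, so earlier versions of the roster remain valid.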

OCaml: Does storing some values to be used later introduce "side effects"?

For a homework assignment, we've been instructed to complete a task without introducing any "side-effects". I've looked up "side-effects" on Wikipedia, and though I get that in theory it means "modifies a state or has an observable interaction with calling functions", I'm having trouble figuring out specifics.
For example, would creating a value that holds a non-compile time result be introducing side effects?
Say I had (might not be syntactically perfect):
val myList = (someFunction x y);;
if List.exists ((=) 7) myList then true else false;;
Would this introduce side-effects? I guess maybe I'm confused on what "modifies a state" means in the definition of side-effects.
No; a side-effect refers to e.g. mutating a ref cell with the assignment operator :=, or other things where the value referred to by a name changes over time. In this case, myList is an immutable value that never changes during the program, thus it is effect-free.
See also
http://en.wikipedia.org/wiki/Referential_transparency_(computer_science)
A good way to think about it is "have I changed anything which any later code (including running this same function again later) could ever possibly see other than the value I'm returning?" If so, that's a side effect. If not, then you can know that there isn't one.
So, something like:
let inc_nosf v = v+1
has no side effects because it just returns a new value which is one more than an integer v. So if you run the following code in the ocaml toplevel, you get the corresponding results:
# let x = 5;;
val x : int = 5
# inc_nosf x;;
- : int = 6
# x;;
- : int = 5
As you can see, the value of x didn't change: since we didn't save the return value, nothing that we can observe got incremented. The function only produces a new return value; it does not modify x itself. So to bind the result to x, we'd have to do:
# let x = inc_nosf x;;
val x : int = 6
# x;;
- : int = 6
That extra step is needed because the inc_nosf function has no side effects (that is, it only communicates with the outside world using its return value, not by making any other changes).
But something like:
let inc_sf r = r := !r+1
has side effects because it changes the value stored in the reference represented by r. So if you run similar code in the top level, you get this, instead:
# let y = ref 5;;
val y : int ref = {contents = 5}
# inc_sf y;;
- : unit = ()
# y;;
- : int ref = {contents = 6}
So, in this case, even though we still don't save the return value, it got incremented anyway. That means there must have been changes to something other than the return value. In this case, that change was the assignment using := which changed the stored value of the ref.
As a good rule of thumb, in OCaml, if you avoid using refs, records with mutable fields, classes with mutable state, arrays, bytes, and hash tables (and input/output such as printing), then you will avoid most sources of side effects. String literals are safe: older OCaml versions allowed modifying a string in place using functions like String.set or String.fill, but since OCaml 4.06 strings are immutable by default and in-place modification is done on the bytes type instead. Basically, any function which can modify a data type in place will cause a side effect.

Resources