How to implement a dictionary as a function in OCaml? - functional-programming

I am learning from Jason Hickey's Introduction to Objective Caml.
Here is an exercise I don't have any clue about.
First of all, what does it mean to implement a dictionary as a function? How can I imagine that?
Do we need an array or something like that? Apparently we can't use an array in this exercise, because arrays haven't been introduced yet in Chapter 3. But how do I do it without some storage?
I don't know how to approach it, so I would appreciate some hints and guidance.

I think the point of this exercise is to get you to use closures. For example, consider the following pair of OCaml functions in a file fun-dict.ml:
let empty (_ : string) : int = 0
let add d k v = fun k' -> if k = k' then v else d k'
Then at the OCaml prompt you can do:
# #use "fun-dict.ml";;
val empty : string -> int = <fun>
val add : ('a -> 'b) -> 'a -> 'b -> 'a -> 'b = <fun>
# let d = add empty "foo" 10;;
val d : string -> int = <fun>
# d "bar";; (* Since our dictionary is a function we simply call with a
string to look up a value *)
- : int = 0 (* We never added "bar" so we get 0 *)
# d "foo";;
- : int = 10 (* We added "foo" -> 10 *)
In this example the dictionary is a function on a string key to an int value. The empty function is a dictionary that maps all keys to 0. The add function creates a closure which takes one argument, a key. Remember that our definition of a dictionary here is function from key to values so this closure is a dictionary. It checks to see if k' (the closure parameter) is = k where k is the key just added. If it is it returns the new value, otherwise it calls the old dictionary.
You effectively have a list of closures which are chained not by cons cells but by closing over the next dictionary (function) in the chain.
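To make the chaining concrete, here is a small sketch that builds on the two definitions above. The names d1, d2 and d3 are only for illustration. Each call to add wraps the previous dictionary, so a later binding for the same key shadows the earlier one while the older dictionaries stay unchanged.

```ocaml
let empty (_ : string) : int = 0
let add d k v = fun k' -> if k = k' then v else d k'

(* Each dictionary closes over the one before it. *)
let d1 = add empty "foo" 10
let d2 = add d1 "bar" 20
let d3 = add d2 "foo" 30   (* shadows the earlier "foo" binding *)

let () =
  (* d3 answers "foo" itself, delegates "bar" to d2, and "baz" falls through to empty *)
  Printf.printf "%d %d %d %d\n" (d3 "foo") (d3 "bar") (d3 "baz") (d1 "foo")
```

Looking up a key walks the chain from the newest closure to the oldest, so lookup time grows with the number of add calls.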
Extra exercise, how would you remove a key from this dictionary?
Edit: What is a closure?
A closure is a function which references variables (names) from the scope it was created in. So what does that mean?
Consider our add function. It returns a function
fun k' -> if k = k' then v else d k'
If you only look at that function there are three names that aren't defined, d, k, and v. To figure out what they are we have to look in the enclosing scope, i.e. the scope of add. Where we find
let add d k v = ...
So even after add has returned a new function that function still references the arguments to add. So a closure is a function which must be closed over by some outer scope in order to be meaningful.
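The same idea can be seen in a smaller setting. The sketch below uses a hypothetical helper, make_adder, which is not part of the exercise; it only shows that the returned function keeps access to the argument of the function that created it.

```ocaml
(* make_adder is a hypothetical helper, used only for illustration. *)
let make_adder n = fun x -> x + n   (* the returned function closes over n *)

let add_five = make_adder 5
let add_ten = make_adder 10

let () =
  (* make_adder has already returned, yet each closure still sees its own n *)
  Printf.printf "%d %d\n" (add_five 1) (add_ten 1)
```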

In OCaml you can use an actual function to represent a dictionary. Non-FP languages usually don't support functions as first-class objects, so if you're used to them you might have trouble thinking that way at first.
A dictionary is a map, which is a function. Imagine you have a function d that takes a string and gives back a number. It gives back different numbers for different strings but always the same number for the same string. This is a dictionary. The string is the thing you're looking up, and the number you get back is the associated entry in the dictionary.
You don't need an array (or a list). Your add function can construct a function that does what's necessary without any (explicit) data structure. Note that the add function takes a dictionary (a function) and returns a dictionary (a new function).
To get started thinking about higher-order functions, here's an example. The function bump takes a function (f: int -> int) and an int (k: int). It returns a new function that returns a value that's k bigger than what f returns for the same input.
let bump f k = fun n -> k + f n
(The point is that bump, like add, takes a function and some data and returns a new function based on these values.)
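As a quick usage sketch (double and double_plus_3 are made-up names for this example), bump can be applied to an existing function to get a new one:

```ocaml
let bump f k = fun n -> k + f n

let double n = 2 * n                (* hypothetical example function *)
let double_plus_3 = bump double 3   (* a new function built from double and 3 *)

let () =
  (* computes 3 + double 5 *)
  Printf.printf "%d\n" (double_plus_3 5)
```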

I thought it might be worth adding that functions in OCaml are not just pieces of code (unlike in C, C++, Java etc.). In those non-functional languages, functions don't have any state associated with them; it would be kind of ridiculous to talk about such a thing. But this is not the case with functions in functional languages. You should start to think of them as a kind of object; a weird kind of object, yes.
So how can we "make" these objects? Let's take Jeffrey's example:
let bump f k =
  fun n ->
    k + f n
Now what does bump actually do? It might help to think of bump as a constructor, a notion you may already be familiar with. What does it construct? It constructs a function object (very loosely speaking here). So what state does the resulting object have? It has two instance variables (sort of), f and k. These two instance variables are bound to the resulting function-object when you invoke bump f k. You can see that the returned function-object:
  fun n ->
    k + f n
utilizes these instance variables f and k in its body. Once this function-object is returned, you can only invoke it; there's no other way for you to access f or k (so this is encapsulation).
It's very uncommon to use the term function-object, they are called just functions, but you have to keep in mind that they can "enclose" state as well. These function-objects (also called closures) are not far separated from the "real" objects in object-oriented programming languages, a very interesting discussion can be found here.

I'm also struggling with this problem. Here's my solution and it works for the cases listed in the textbook...
An empty dictionary simply returns 0:
let empty (k:string) = 0
Find calls the dictionary's function on the key. This function is trivial:
let find (d: string -> int) k = d k
Add extends the function of the dictionary to have another conditional branching. We return a new dictionary that takes a key k' and matches it against k (the key we need to add). If it matches, we return v (the corresponding value). If it doesn't match we return the old (smaller) dictionary:
let add (d: string -> int) k v =
  fun k' ->
    if k' = k then
      v
    else
      d k'
You could alter add to get a remove function. Also, I added a condition to make sure we don't remove a non-existing key. This is just for practice. This implementation of a dictionary is bad anyway:
let remove (d: string -> int) k =
  if find d k = 0 then
    d
  else
    fun k' ->
      if k' = k then
        0
      else
        d k'
I'm not good with the terminology as I'm still learning functional programming. So, feel free to correct me.
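One limitation of the version above is that 0 stands both for "missing" and for a stored value of 0. A possible variation, sketched here with hypothetical names (empty_opt, add_opt, remove_opt, find_opt) and going beyond what the textbook exercise asks for, returns an option so that the two cases can be told apart:

```ocaml
(* Hypothetical variant: a dictionary is a function from a key to an option. *)
let empty_opt _ = None
let add_opt d k v = fun k' -> if k' = k then Some v else d k'
let remove_opt d k = fun k' -> if k' = k then None else d k'
let find_opt d k = d k

let d = add_opt (add_opt empty_opt "zero" 0) "one" 1
let d' = remove_opt d "one"

let () =
  match find_opt d "zero", find_opt d "missing" with
  | Some 0, None -> print_endline "a stored 0 and a missing key are distinguishable"
  | _ -> print_endline "unexpected"
```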

Related

Haskell Map: Transform key and catch errors

I struggle to find a solution (other than using Data.Map.fromList and toList) for the following problem:
Transform the key with a function that returns an Either String k where as usual errors are expressed as Left, and the whole transformation fails if a single key-transformation fails.
import Data.Map.Strict
mapEitherKey :: (k -> Either String k') -> Map k a -> Either String (Map k' a)
For the right-hand side v of Map k v it is easily done with mapM, because
Map k is an instance of Traversable.
But none of the mapping functions offers anything functorial in the key. (I wonder whether there's a deeper reason why Map is not an instance of Bifunctor, for instance. I do see that it's not trivial because key collisions need to be taken into account.)
Any insights would be appreciated.
I'd like to second what @chi said; unless you know that the order is preserved, you'll have to reinsert the key each time. Which means that toList and then reinserting with fromList should be the best you can get asymptotically.
However, since you asked for a way to do it without those functions, I'd like to suggest using foldMapWithKey.
import Data.Monoid (Ap(..))
import Data.Map (Map)
import qualified Data.Map as M
mapEitherKey :: (Ord k, Ord k') => (k -> Either String k') -> Map k a -> Either String (Map k' a)
mapEitherKey f = getAp . M.foldMapWithKey (\k v -> Ap (flip M.singleton v <$> f k))
The idea for this is to lift the reconstruction of the map with key k' (which we do by unioning together a bunch of singletons) into the Either String Applicative, which will short circuit if it encounters a Left.
I wrote it kind of quick and dirty, but you can refactor it to be more readable. It should be asymptotically ideal, but there might be more performant approaches.
Note that you need the Ord constraints on k and k'.
Example usage:
Prelude> f k = if k > 10 then Left "bad number" else Right $ show k
Prelude> mapEitherKey f (M.fromList [(0,0),(1,1),(2,2)])
Right (fromList [("0",0),("1",1),("2",2)])
Prelude> mapEitherKey f (M.fromList [(0,0),(1,1),(2,2),(11,11)])
Left "bad number"

Haskell Data.Map lookup AND delete at the same time

I was recently using the Map type from Data.Map inside a State Monad and so I wanted to write a function, that looks up a value in the Map and also deletes it from the Map inside the State Monad.
My current implementation looks like this:
lookupDelete :: (Ord k) => k -> State (Map k v) (Maybe v)
lookupDelete k = do
  m <- get
  put (M.delete k m)
  return $ M.lookup k m
While this works, it feels quite inefficient. With mutable maps in imperative languages, it is not uncommon to find delete functions, that also return the value that was deleted.
I couldn't find a function for this, so I would really appreciate if someone knows one (or can explain why there is none)
A simple implementation is in terms of alterF:
lookupDelete :: Ord k => k -> State (Map k v) (Maybe v)
lookupDelete = state . alterF (\x -> (x, Nothing))
The x in alterF's argument is the Maybe value stored at the key given to lookupDelete. This anonymous function returns a (Maybe v, Maybe v). (,) (Maybe v) is a functor, and basically it serves as a "context" through which we can save whatever data we want from x. In this case we just save the whole x. The Nothing in the right element specifies that we want deletion. Once fully applied, alterF then gives us (Maybe v, Map k v), where the context (left element) is whatever we saved in the anonymous function and the right element is the mutated map. Then we wrap this stateful operation in state.
alterF is quite powerful: lots of operations can be built out of it simply by choosing the correct "context" functor. E.g. insert and delete come from using Identity, and lookup comes from using Const (Maybe v). A specialized function for lookupDelete is not necessary when we have alterF. One way to understand why alterF is so powerful is to recognize its type:
flip alterF k :: Functor f => (Maybe a -> f (Maybe a)) -> Map k a -> f (Map k a)
Things with types in this pattern
SomeClass f => (a -> f b) -> s -> f t
are called "optics" (when SomeClass is Functor, they're called "lenses"), and they represent how to "find" and "mutate" and "collate" "fields" inside "structures", because they let us focus on part of a structure, modify it (with the function argument), and save some information through a context (by letting us choose f). See the lens package for other uses of this pattern. (As the docs for alterF note, it's basically at from lens.)
There is no function specifically for "delete and lookup". Instead you use a more general tool: updateLookupWithKey is "lookup and update", where update can be delete or modify.
updateLookupWithKey :: Ord k =>
(k -> a -> Maybe a) -> k -> Map k a -> (Maybe a, Map k a)
lookupDelete k = do
  (ret, m) <- gets $ updateLookupWithKey (\_ _ -> Nothing) k
  put m
  pure ret

What is the difference between a writer monad and a list writer monad

I was looking at examples of the writer monad to understand how it works, and almost all of them look like a list writer monad. I know a list writer monad is a type of writer monad. But what really is a writer monad, in layman's terms?
In lay terms, the writer monad is the monad that lets you "write" items to a "log" while you produce a value. When you're done, you end up with the value you produced and the log that contains all the stuff you wrote. To put it another way, it is the monad whose side effects are "writing things to a log".
Let's make this more concrete with examples of both the list writer and the (generic) writer monads. I'll use Haskell here, since it is the original context in which Monads for Functional Programming were described.
The List Writer Monad
I assume that the "list writer" monad is one that logs an item (of some type we'll call w) into a list of items (of type [w], of course). It also produces a value of type a. (See the note at the bottom if you get errors using this code yourself.)
newtype ListWriter w a = ListWriter { runListWriter :: ([w], a) }
instance Monad (ListWriter w) where
  return a = ListWriter ([], a)      -- produce an a, don't log anything
  ListWriter (ws, a) >>= k =
    let ListWriter (xs, a') = k a    -- run the action 'k' on the existing value,
    in ListWriter (ws ++ xs, a')     -- add anything it logs to the existing log,
                                     -- and produce a new result value
-- Add an item to the log and produce a boring value.
-- When sequenced with >>, this will add the item to existing log.
tell :: w -> ListWriter w ()
tell w = ListWriter ([w], ())
ex1 :: ListWriter String Int
ex1 = do
  tell "foo"
  tell "bar"
  return 0
(NB: This is equivalent to ex1 = tell "foo" >> tell "bar" >> return 0, demonstrating the use of tell with >> to add an item to the log.)
If we evaluate runListWriter ex1 in GHCi, we see that it wrote "foo" and "bar" to the log and produced the result value 0.
λ> runListWriter ex1
(["foo","bar"],0)
The (Generic) Writer Monad
Now, let's see how we turn this into the generic writer monad. The writer monad works with any sort of thing that can be combined together, not just a list. Specifically, it works with any Monoid:
class Monoid m where
  mempty :: m             -- an empty m
  mappend :: m -> m -> m  -- combine two m's into a single m
Lists are a Monoid with [] and (++) as mempty and mappend respectively. A non-list example of a Monoid is sums of integers:
λ> Sum 1 <> Sum 2 -- (<>) = mappend
Sum {getSum = 3}
The writer monad is then
newtype Writer w a = Writer { runWriter :: (w, a) }
Instead of a list of w's, we just have a single w. But when we define the Monad, we ensure that w is a Monoid so we can start with an empty log and append a new entry to the log:
instance Monoid w => Monad (Writer w) where
  return a = Writer (mempty, a)   -- produce an a, don't log anything
  Writer (w, a) >>= k =
    let Writer (x, a') = k a      -- we combine the two w's rather than
    in Writer (w <> x, a')        -- (++)-ing two lists
Note the differences here: we use mempty instead of [] and (<>) instead of (++). This is how we generalize from lists to any Monoid.
So the writer monad is really a generalization of the list writer monad to arbitrary things that can be combined rather than just lists. You can use lists with the Writer to get something (almost) equivalent to ListWriter. The only difference is that you have to wrap your logged item in a list when you append it to the log:
ex2 :: Writer [String] Int
ex2 = do
  tell ["foo"]
  tell ["bar"]
  return 0
but you get the same result:
λ> runWriter ex2
(["foo","bar"],0)
This is because instead of logging "an item that will be put in a list", you are logging "a list". (This does mean that you can log multiple items at the same time by passing a list of more than one element.)
For an example of a non-list use of Writer, consider counting the comparisons a sort function makes. Each time your function makes a comparison, you can tell (Sum 1). (You can tell someone. Get it? Is this thing on?) Then, at the end, you'll get back the total count (i.e., the sum) of all of the comparisons along with the sorted list.
NOTE: If you try to use these ListWriter and Writer definitions yourself, GHC will tell you that you are missing Functor and Applicative instances. Once you have the Monad instance, you can write the others in its terms:
import Control.Monad (ap, liftM)
instance Functor (ListWriter w) where
  fmap = liftM
instance Applicative (ListWriter w) where
  pure = return
  (<*>) = ap
And likewise for Writer. I elided them above for clarity.

Extract nth element of a tuple

For a list, you can do pattern matching and iterate until the nth element, but for a tuple, how would you grab the nth element?
TL;DR; Stop trying to access directly the n-th element of a t-uple and use a record or an array as they allow random access.
You can grab the n-th element by unpacking the t-uple with value deconstruction, either by a let construct, a match construct or a function definition:
let ivuple = (5, 2, 1, 1)
let squared_sum_let =
  let (a,b,c,d) = ivuple in
  a*a + b*b + c*c + d*d
let squared_sum_match =
  match ivuple with (a,b,c,d) -> a*a + b*b + c*c + d*d
let squared_sum_fun (a,b,c,d) =
  a*a + b*b + c*c + d*d
The match-construct has here no virtue over the let-construct, it is just included for the sake of completeness.
Do not use t-uples, Don¹
There are only a few cases where using t-uples to represent a type is the right thing to do. Most of the times, we pick a t-uple because we are too lazy to define a type and we should interpret the problem of accessing the n-th field of a t-uple or iterating over the fields of a t-uple as a serious signal that it is time to switch to a proper type.
There are two natural replacements to t-uples: records and arrays.
When to use records
We can see a record as a t-uple whose entries are labelled; as such, they are definitely the most natural replacement to t-uples if we want to access them directly.
type ivuple = {
a: int;
b: int;
c: int;
d: int;
}
We then directly access the field a of a value x of type ivuple by writing x.a. Note that records are easily copied with modifications, as in let y = { x with d = 0 }. There is no natural way to iterate over the fields of a record, mostly because a record does not need to be homogeneous.
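A short sketch of the record version of the earlier squared-sum example; the values x and y and the function squared_sum are illustrative names only.

```ocaml
type ivuple = { a : int; b : int; c : int; d : int }

let x = { a = 5; b = 2; c = 1; d = 1 }
let y = { x with d = 0 }   (* a copy of x with one field changed; x is untouched *)

(* Fields are reached by name instead of by position. *)
let squared_sum r = r.a * r.a + r.b * r.b + r.c * r.c + r.d * r.d

let () = Printf.printf "%d %d\n" (squared_sum x) (squared_sum y)
```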
When to use arrays
A large² homogeneous collection of values is adequately represented by an array, which allows direct access, iterating and folding. A possible inconvenience is that the size of an array is not part of its type, but for arrays of fixed size, this is easily circumvented by introducing a private type — or even an abstract type. I described an example of this technique in my answer to the question “OCaml compiler check for vector lengths”.
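For comparison, a minimal sketch of the same data held in an array (ivarray, third and squared_sum are names chosen for this example), showing direct access and folding:

```ocaml
let ivarray = [| 5; 2; 1; 1 |]

(* Direct access by index; indices start at 0. *)
let third = ivarray.(2)

(* Folding over every element, something a t-uple cannot offer. *)
let squared_sum = Array.fold_left (fun acc v -> acc + v * v) 0 ivarray

let () = Printf.printf "%d %d\n" third squared_sum
```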
Note on float boxing
Floats are stored unboxed in records containing only floats and in float arrays, whereas in t-uples each float is boxed. Moving numeric computations from t-uples to records or arrays should therefore not cost performance.
¹ See the TeXbook.
² Large starts near 4.
Since the length of OCaml tuples is part of the type and hence known (and fixed) at compile time, you get the n-th item by straightforward pattern matching on the tuple. For the same reason, the problem of extracting the n-th element of an "arbitrary-length tuple" cannot occur in practice - such a "tuple" cannot be expressed in OCaml's type system.
You might still not want to write out a pattern every time you need to project a tuple, and nothing prevents you from generating the functions get_1_1...get_i_j... that extract the i-th element from a j-tuple for any possible combination of i and j occurring in your code, e.g.
let get_1_1 (a) = a
let get_1_2 (a,_) = a
let get_2_2 (_,a) = a
let get_1_3 (a,_,_) = a
let get_2_3 (_,a,_) = a
...
Not necessarily pretty, but possible.
Note: Previously I had claimed that OCaml tuples can have at most length 255 and you can simply generate all possible tuple projections once and for all. As @Virgile pointed out in the comments, this is incorrect - tuples can be huge. This means that it is impractical to generate all possible tuple projection functions upfront, hence the restriction "occurring in your code" above.
It's not possible to write such a function in full generality in OCaml. One way to see this is to think about what type the function would have. There are two problems. First, each size of tuple is a different type. So you can't write a function that accesses elements of tuples of different sizes. The second problem is that different elements of a tuple can have different types. Lists don't have either of these problems, which is why you can have List.nth.
If you're willing to work with a fixed size tuple whose elements are all the same type, you can write a function as shown by @user2361830.
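As a hand-written sketch of that restricted case, the function below (nth3 is a hypothetical name) indexes into a triple whose three elements share one type. The single return type 'a is what forces the elements to be homogeneous.

```ocaml
(* nth3 is a hypothetical helper for triples of one element type:
   its type is 'a * 'a * 'a -> int -> 'a. *)
let nth3 (a, b, c) n =
  match n with
  | 0 -> a
  | 1 -> b
  | 2 -> c
  | _ -> invalid_arg "nth3: index out of bounds"

let () = Printf.printf "%d\n" (nth3 (10, 20, 30) 1)
```

A separate function is still needed for each tuple size, which is the first of the two problems described above.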
Update
If you really have collections of values of the same type that you want to access by index, you should probably be using an array.
Here is a function which prints the source of the OCaml function you need to do that ;) Very helpful, I use it frequently.
let tup len n =
  if n >= 0 && n < len then
    let rec rep str nn = match nn < 1 with
      | true -> ""
      | _ -> str ^ (rep str (nn-1)) in
    let txt1 = "let t" ^ (string_of_int len) ^ "_" ^ (string_of_int n) ^ " tup = match tup with |" ^ (rep "_," n) ^ "a" and
    txt2 = "," ^ (rep "_," (len-n-2)) and
    txt3 = "->a" in
    if n = len-1 then
      print_string (txt1^txt3)
    else
      print_string (txt1^txt2^"_"^txt3)
  else raise (Failure "Error") ;;
For example:
tup 8 6;;
prints:
let t8_6 tup = match tup with |_,_,_,_,_,_,a,_->a
and of course:
val t8_6 : 'a * 'b * 'c * 'd * 'e * 'f * 'g * 'h -> 'g = <fun>

Does "Value Restriction" practically mean that there is no higher order functional programming?

I have a problem that each time I try to do a bit of HOP I get caught by a VR error. Example:
let simple (s:string)= fun rq->1
let oops= simple ""
type 'a SimpleType= F of (int ->'a-> 'a)
let get a = F(fun req -> id)
let oops2= get ""
and I would like to know whether it is a problem of a particular implementation of VR or a general problem that has no solution in a type-inferred language with mutation that doesn't include mutation in the type system.
Does “Value Restriction” mean that there is no higher order functional programming?
Absolutely not! The value restriction barely interferes with higher-order functional programming at all. What it does do is restrict some applications of polymorphic functions—not higher-order functions—at top level.
Let's look at your example.
Your problem is that oops and oops2 are both polymorphic: oops has type forall 'a . 'a -> int and oops2 has type forall 'a . 'a SimpleType. In other words each is a polymorphic value. But the right-hand side is not a so-called "syntactic value"; it is a function application. (A function application is not allowed to return a polymorphic value because if it were, you could construct a hacky function using mutable references and lists that would subvert the type system; that is, you could write a terminating function of type forall 'a 'b . 'a -> 'b.)
Luckily in almost all practical cases, the polymorphic value in question is a function, and you can define it by eta-expanding:
let oops x = simple "" x
This idiom looks like it has some run-time cost, but depending on the inliner and optimizer, that can be got rid of by the compiler—it's just the poor typechecker that is having trouble.
The oops2 example is more troublesome because you have to pack and unpack the value constructor:
let oops2 = F(fun x -> let F f = get "" in f x)
This is quite a bit more tedious, but the anonymous function fun x -> ... is a syntactic value, and F is a datatype constructor, and a constructor applied to a syntactic value is also a syntactic value, and Bob's your uncle. The packing and unpacking of F is all going to be compiled into the identity function, so oops2 is going to compile into exactly the same machine code as oops.
Things are even nastier when you want a run-time computation to return a polymorphic value like None or []. As hinted at by Nathan Sanders, you can run afoul of the value restriction with an expression as simple as rev []:
Standard ML of New Jersey v110.67 [built: Sun Oct 19 17:18:14 2008]
- val l = rev [];
stdIn:1.5-1.15 Warning: type vars not generalized because of
value restriction are instantiated to dummy types (X1,X2,...)
val l = [] : ?.X1 list
-
Nothing higher-order there! And yet the value restriction applies.
In practice the value restriction presents no barrier to the definition and use of higher-order functions; you just eta-expand.
I didn't know the details of the value restriction, so I searched and found this article. Here is the relevant part:
Obviously, we aren't going to write the expression rev [] in a program, so it doesn't particularly matter that it isn't polymorphic. But what if we create a function using a function call? With curried functions, we do this all the time:
- val revlists = map rev;
Here revlists should be polymorphic, but the value restriction messes us up:
- val revlists = map rev;
stdIn:32.1-32.23 Warning: type vars not generalized because of
value restriction are instantiated to dummy types (X1,X2,...)
val revlists = fn : ?.X1 list list -> ?.X1 list list
Fortunately, there is a simple trick that we can use to make revlists polymorphic. We can replace the definition of revlists with
- val revlists = (fn xs => map rev xs);
val revlists = fn : 'a list list -> 'a list list
and now everything works just fine, since (fn xs => map rev xs) is a syntactic value.
(Equivalently, we could have used the more common fun syntax:
- fun revlists xs = map rev xs;
val revlists = fn : 'a list list -> 'a list list
with the same result.) In the literature, the trick of replacing a function-valued expression e with (fn x => e x) is known as eta expansion. It has been found empirically that eta expansion usually suffices for dealing with the value restriction.
To summarise, it doesn't look like higher-order programming is restricted so much as point-free programming. This might explain some of the trouble I have when translating Haskell code to F#.
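The same effect can be reproduced in OCaml, which also has a value restriction (in a relaxed form). The sketch below is an OCaml rendering of the revlists example, not F# or SML code; the names revlists_weak, ints_weak, ints and strs are made up for the illustration.

```ocaml
(* Point-free: a partial application is not a syntactic value, so the type
   variable is not generalized. It stays "weak" until the first use fixes it. *)
let revlists_weak = List.map List.rev

(* This first use fixes revlists_weak to int list list -> int list list;
   applying it to lists of strings afterwards would be a type error. *)
let ints_weak = revlists_weak [[1; 2]; [3]]

(* Eta-expanded: a function definition is a syntactic value, so it is
   generalized to 'a list list -> 'a list list. *)
let revlists xs = List.map List.rev xs

let ints = revlists [[1; 2]; [3]]
let strs = revlists [["a"; "b"]]
```

Only the point-free definition is affected; the higher-order functions List.map and List.rev are used freely in both versions.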
Edit: Specifically, here's how to fix your first example:
let simple (s:string)= fun rq->1
let oops= (fun x -> simple "" x) (* eta-expand oops *)
type 'a SimpleType= F of (int ->'a-> 'a)
let get a = F(fun req -> id)
let oops2= get ""
I haven't figured out the second one yet because the type constructor is getting in the way.
Here is the answer to this question in the context of F#.
To summarize, in F# passing a type argument to a generic (=polymorphic) function is a run-time operation, so it is actually type-safe to generalize (as in, you will not crash at runtime). The behaviour of thusly generalized value can be surprising though.
For this particular example in F#, one can recover generalization with a type annotation and an explicit type parameter:
type 'a SimpleType= F of (int ->'a-> 'a)
let get a = F(fun req -> id)
let oops2<'T> : 'T SimpleType = get ""
