I've been thinking about how type inference works in the following OCaml program:
let rec f x = (g x) + 5
and g x = f (x + 5);;
Granted, the program is quite useless (looping forever), but what about the types?
OCaml says:
val f : int -> int = <fun>
val g : int -> int = <fun>
This would exactly be my intuition, but how does the type inference algorithm know this?
Say the algorithm considers "f" first: the only constraint it can get there is that the return type of "g" must be "int", and therefore its own return type is also "int". But it cannot infer the type of its argument by the definition of "f".
On the other hand, if it considers "g" first, it can see that the type of its own argument must be "int". But without having considered "f" before, it can't know that the return type of "g" is also "int".
What is the magic behind it?
Say the algorithm considers "f" first: the only constraint it can get there is that the return type of "g" must be "int", and therefore its own return type is also "int". But it cannot infer the type of its argument by the definition of "f".
It can't infer a concrete type, but it can infer something. Namely: the argument type of f must be the same as the argument type of g. So after looking at f, OCaml knows the following about the types:
for some (to be determined) 'a:
f: 'a -> int
g: 'a -> int
After looking at g it knows that 'a must be int.
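To see that intermediate state on its own, here is a small sketch. The variant of g below is my own, not the one from the question: it no longer adds 5 to its argument, so nothing ever pins 'a down and the shared type variable should survive into the final signatures.

```ocaml
(* Variant of the question's program in which g does no arithmetic on its
   argument. f's body still forces g's result (and f's own result) to be int,
   but the argument type stays unconstrained. *)
let rec f x = g x + 5
and g y = f y

(* Expected signatures: val f : 'a -> int and val g : 'a -> int.
   Both ascriptions type-check because the argument type is still free.
   Neither function is ever called here: they would loop forever. *)
let f_on_strings : string -> int = f
let g_on_bools : bool -> int = g
```

Putting the x + 5 back into g is what adds the final constraint 'a = int.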
For a more in-depth look on how the type inference algorithm works, you can read the Wikipedia article about Hindley-Milner type inference or this blog post, which seems to be much friendlier than the Wikipedia article.
Here is my mental model of what goes on, which may or may not match reality.
let rec f x =
Ok, at this point we know that f is a function that takes argument x. Thus we have:
f: 'a -> 'b
x: 'a
for some 'a and 'b. Next:
(g x)
Ok, now we know g is a function that can be applied to x, so we add
g: 'a -> 'c
to our list of information. Continuing...
(g x) + 5
Aha, the return type of g must be int, so now we have solved 'c=int. At this point we have:
f: 'a -> 'b
x: 'a
g: 'a -> int
Moving on...
and g x =
Ok, there's a different x here, let's assume the original code had y instead, to keep things more obvious. That is, let's rewrite the code as
and g y = f (y + 5);;
Ok, so we are at
and g y =
so now our info is:
f: 'a -> 'b
x: 'a
g: 'a -> int
y: 'a
since y is an argument to g... and we keep going:
f (y + 5);;
and this tells us from y+5 that y has type int, which solves 'a=int. And since this is the return value of g, which we already know must be int, this solves 'b=int. That was a lot in one step, if the code were
and g y =
  let t = y + 5 in
  let r = f t in
  f r;;
then the first line would show y is an int, thus solving for 'a, and then the next line would say that r has type 'b, and then the final line is the return of g, which solves 'b=int.
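The bookkeeping in this mental model can be sketched as a tiny unifier. This is only an illustration under my own assumptions: the names ty, subst, apply and unify are made up for the example, and the real OCaml type checker is considerably more elaborate. The three constraints fed in at the end are the ones collected in the walk-through above.

```ocaml
(* Types: type variables, int, and function arrows. *)
type ty =
  | TVar of string
  | TInt
  | TArrow of ty * ty

(* A substitution maps variable names to types. *)
type subst = (string * ty) list

(* Apply a substitution, following bindings until none is left. *)
let rec apply (s : subst) (t : ty) : ty =
  match t with
  | TInt -> TInt
  | TArrow (a, b) -> TArrow (apply s a, apply s b)
  | TVar v ->
    (match List.assoc_opt v s with
     | Some t' -> apply s t'
     | None -> t)

(* Does variable v occur inside t? Used to reject cyclic types. *)
let rec occurs (v : string) (t : ty) : bool =
  match t with
  | TVar w -> v = w
  | TInt -> false
  | TArrow (a, b) -> occurs v a || occurs v b

(* Unify t1 and t2 under s, returning the extended substitution. *)
let rec unify (s : subst) (t1 : ty) (t2 : ty) : subst =
  match apply s t1, apply s t2 with
  | TInt, TInt -> s
  | TVar v, TVar w when v = w -> s
  | TVar v, t | t, TVar v ->
    if occurs v t then failwith "occurs check" else (v, t) :: s
  | TArrow (a1, b1), TArrow (a2, b2) ->
    let s = unify s a1 a2 in
    unify s b1 b2
  | _ -> failwith "type mismatch"

(* f : 'a -> 'b and g : 'a -> 'c, plus the constraints from both bodies. *)
let solved =
  List.fold_left (fun s (t1, t2) -> unify s t1 t2) []
    [ (TVar "c", TInt);        (* (g x) + 5 : g's result is an int *)
      (TVar "a", TInt);        (* y + 5 : y is an int *)
      (TVar "b", TVar "c") ]   (* f (y + 5) is the result of g *)

let f_type = apply solved (TArrow (TVar "a", TVar "b"))
let g_type = apply solved (TArrow (TVar "a", TVar "c"))
```

Both f_type and g_type come out as TArrow (TInt, TInt), i.e. int -> int.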
Consider for example let rec f x = f x in f 1. Is its signature defined?
If so, what is it?
One could argue, that OCaml doesn't know about the fact that it's not terminating and that its type is simply inferred as 'a. Is that correct?
let a b = let rec f x = f x in f 1;;
is for example val a : 'a -> 'b even though it is very clear that, when a is applied, there won't be a 'b
The requirement for a sound type system, when you have type(E) = T, is that if E evaluates to some value v, then v is a value that belongs to type T. A type is only meaningful when the expression produces a value, and exceptions and infinite loops do not.
The type checker, however, is total: it assigns a type to every expression it accepts, even if that type is just a free type variable.
Here the return type is left unbound, and is printed as 'a.
# let f x = if x then (failwith "A") else (failwith "B");;
val f : bool -> 'a = <fun>
Here the return type of the then branch is unified with the type of the else branch:
# let f x = if x then (failwith "A") else 5;;
val f : bool -> int = <fun>
One way to read function types like unit -> 'a is to remember that the
type variable 'a encompasses empty types.
For example, if I have a function f
let rec f:'a. _ -> 'a = fun () -> f ()
and an empty type
type empty = |
(* using 4.07 empty variants *)
(* or *)
type (_,_) eq = Refl: ('a,'a) eq
type empty = (float,int) eq
then I can restrict the type of f to unit -> empty:
let g: unit -> empty = f
Moreover, the more general type of f can be useful in presence of branches.
For instance, I could define a return that raises an exception in order
to exit early from a for-loop:
let search pred n =
  let exception Return of int in
  let return: 'a. int -> 'a = fun n -> raise (Return n) in
  try
    for i = 0 to n do
      if pred i then return i
    done;
    None
  with Return n -> Some n
Here, the polymorphic type of return makes it possible to use it in a context
where unit was expected.
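A second sketch of the same trick, with a made-up function classify, uses one polymorphic return at two different types inside the same body:

```ocaml
let classify n =
  let exception Return of string in
  (* Explicitly polymorphic: the result type is chosen at each use site. *)
  let return : 'a. string -> 'a = fun s -> raise (Return s) in
  try
    (* Here return is used where unit is expected. *)
    if n < 0 then return "negative";
    (* Here the same return is used where int is expected. *)
    let half = if n mod 2 = 0 then n / 2 else return "odd" in
    Printf.sprintf "even, half is %d" half
  with Return s -> s
```

Without the 'a. annotation, the first use would fix the result type of return to unit and the second use would be rejected.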
The function:
fn : 'a -> 'b
Now, are there any functions which can be defined and have this type?
There are two possible implementations for that function signature in Standard ML. One employs exceptions, the other recursion:
val raises : 'a -> 'b =
fn a => raise Fail "some error";
(* Infinite looping; satisfies the type signature, *)
(* but won't ever produce anything. *)
val rec loops : 'a -> 'b =
fn a => loops a;
The first solution may be useful for defining a helper function, say bug, which saves a few key strokes:
fun bug msg = raise Fail ("BUG: " ^ msg);
The other solution may be useful for defining server loops or REPLs.
In the Basis library, OS.Process.exit is such a function that returns an unknown generic type 'a:
- OS.Process.exit;
val it = fn : OS.Process.status -> 'a
A small echo REPL with type val repl = fn : unit -> 'a:
fun repl () =
  let
    val line = TextIO.inputLine TextIO.stdIn
  in
    case line of
        NONE => OS.Process.exit OS.Process.failure
      | SOME ":q\n" => OS.Process.exit OS.Process.success
      | SOME line => (TextIO.print line ; repl ())
  end
You might also find useful this question about the type signature of Haskell's forever function.
I can think of one example:
fun f a = raise Div;
I can think of several:
One that is recursive,
fun f x = f x
Any function that raises exceptions,
fun f x = raise SomeExn
Any function that is mutually recursive, e.g.,
fun f x = g x
and g x = f x
Any function that uses casting (requires specific compiler support, below is for Moscow ML),
fun f x = Obj.magic x
Breaking the type system like this is probably cheating, but unlike all the other functions with this type, this function actually returns something. (In the simplest case, it's the identity function.)
A function that throws if the Collatz conjecture is false, recurses infinitely if true,
fun f x =
  let fun loop (i : IntInf.int) =
        if collatz i
        then loop (i+1)
        else raise Collatz
  in loop 1 end
which is really just a combination of the first two.
Any function that performs arbitrary I/O and recurses infinitely, e.g.
fun f x = (print "Woohoo!"; f x)
fun repl x =
  let val y = read ()
      val z = eval y
      val _ = print z
  in repl x end
One may argue that exceptions and infinite recursion represent the same theoretical value ⊥ (bottom) meaning "no result", although since you can catch exceptions and not infinitely recursive functions, you may also argue they're different.
If you restrict yourself to pure functions (e.g. no printing or exceptions) and only Standard ML (and not compiler-specific features) and you think of the mutually recursive cases as functionally equivalent in spite of their different recursion schemes, we're back to just fun f x = f x.
The reason why fun f x = f x has type 'a → 'b is perhaps obvious: the type-inference algorithm assumes that the input and output types are 'a and 'b respectively, and the only constraints the body produces are that the argument type of the recursive call f x equals f's input type 'a and that its result type equals f's output type 'b. Both hold trivially, so 'a and 'b are never specialized any further.
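The same reasoning carries over to OCaml. In this sketch (loop and raises are my own names) both definitions should be inferred as 'a -> 'b, which is why they can be ascribed any function type at all:

```ocaml
(* Recursion: the body adds no constraint beyond "same type as myself". *)
let rec loop x = loop x

(* Exceptions: raise has type exn -> 'a, so the result type stays free. *)
let raises _ = raise Exit

(* Any instance of 'a -> 'b is accepted by the type checker. *)
let loop_as_parser : string -> int list = loop
let raises_as_predicate : float -> bool = raises
```

Calling loop_as_parser would never return, and calling raises_as_predicate always raises Exit, so neither ever has to produce a value of the promised type.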
I started learning functional programming (OCaml), but I don't understand one important topic about fp: signatures (I'm not sure if it's a proper name). When I type something and compile with ocaml, I get for example:
# let inc x = x + 1 ;;
val inc : int -> int = <fun>
This is trivial, but I don't know, why this:
let something f g a b = f a (g a b)
gives an output:
val something : ('a -> 'b -> 'c) -> ('a -> 'd -> 'b) -> 'a -> 'd -> 'c = <fun>
I suppose that this topic is absolutely basic FP for many of you, but I ask for help here because I haven't found anything on the Internet about signatures in OCaml (there are some articles about signatures in Haskell, but no explanations).
If this topic somehow survives, I'll post here several functions whose signatures confused me:
# let nie f a b = f b a ;; (* flip *)
val nie : ('a -> 'b -> 'c) -> 'b -> 'a -> 'c = <fun>
# let i f g a b = f (g a b) b ;;
val i : ('a -> 'b -> 'c) -> ('d -> 'b -> 'a) -> 'd -> 'b -> 'c = <fun>
# let s x y z = x z (y z) ;;
val s : ('a -> 'b -> 'c) -> ('a -> 'b) -> 'a -> 'c = <fun>
# let callCC f k = f (fun c d -> k c) k ;;
val callCC : (('a -> 'b -> 'c) -> ('a -> 'c) -> 'd) -> ('a -> 'c) -> 'd = <fun>
Thank you for help and explanation.
There are a couple of concepts you need to understand to make sense of this type signature and I don't know which ones you already do, so I tried my best to explain every important concept:
Currying
As you know, if you have the type foo -> bar, this describes a function taking an argument of type foo and returning a result of type bar. Since -> is right associative, the type foo -> bar -> baz is the same as foo -> (bar -> baz) and thus describes a function taking an argument of type foo and returning a value of type bar -> baz, which means the return value is a function taking a value of type bar and returning a value of type baz.
Such a function can be called like my_function my_foo my_bar, which because function application is left-associative, is the same as (my_function my_foo) my_bar, i.e. it applies my_function to the argument my_foo and then applies the function that is returned as a result to the argument my_bar.
Because it can be called like this, a function of type foo -> bar -> baz is often called "a function taking two arguments" and I will do so in the rest of this answer.
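A minimal sketch of this, using a made-up function add3, applies the arguments one at a time:

```ocaml
(* add3 : int -> int -> int -> int, i.e. int -> (int -> (int -> int)) *)
let add3 x y z = x + y + z

let step1 = add3 1      (* int -> int -> int : still waiting for y and z *)
let step2 = step1 2     (* int -> int        : still waiting for z *)
let result = step2 3    (* int               : all arguments supplied *)

(* Left-associative application: both forms mean the same thing. *)
let same = add3 1 2 3 = ((add3 1) 2) 3
```

Here result is 6 and same is true.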
Type variables
If you define a function like let f x = x, it will have the type 'a -> 'a. But 'a isn't actually a type defined anywhere in the OCaml standard library, so what is it?
Any type that starts with a ' is a so-called type variable. A type variable can stand for any possible type. So in the example above f can be called with an int or a string or a list or anything at all - it doesn't matter.
Furthermore if the same type variable appears in a type signature more than once, it will stand for the same type. So in the example above that means, that the return type of f is the same as the argument type. So if f is called with an int, it returns an int. If it is called with a string, it returns a string and so on.
So a function of type 'a -> 'b -> 'a could take two arguments of any types (which might not be the same type for the first and second argument) and returns a value of the same type as the first argument, while a function of type 'a -> 'a -> 'a would take two arguments of the same type.
One note about type inference: Unless you explicitly give a function a type signature, OCaml will always infer the most general type possible for you. So unless a function uses any operations that only work with a given type (like + for example), the inferred type will contain type variables.
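As a small illustration (first and smaller are hypothetical names), these two definitions should be inferred with exactly the two shapes discussed above:

```ocaml
(* 'a -> 'b -> 'a : the two arguments may have different types,
   and the result has the type of the first one. *)
let first x _ = x

(* 'a -> 'a -> 'a : comparing x and y forces them to share a type,
   but < works on any type, so it stays a type variable. *)
let smaller x y = if x < y then x else y

(* int -> int -> int : + only works on int, so no variables remain. *)
let sum x y = x + y
```

For example, first 1 "ignored" is accepted, while smaller 1 "one" is rejected by the type checker.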
Now to explain the type...
val something : ('a -> 'b -> 'c) -> ('a -> 'd -> 'b) -> 'a -> 'd -> 'c = <fun>
This type signature tells you that something is a function taking four arguments.
The type of the first argument is 'a -> 'b -> 'c. I.e. a function taking two arguments of arbitrary and possibly different types and returning a value of an arbitrary type.
The type of the second argument is 'a -> 'd -> 'b. This is again a function with two arguments. The important thing to note here is that the first argument of the function must have the same type as the first argument of the first function and the return value of the function must have the same type as the second argument of the first function.
The type of the third argument is 'a, which is also the type of the first arguments of both functions.
Lastly, the type of the fourth argument is 'd, which is the type of the second argument of the second function.
The return value will be of type 'c, i.e. the return type of the first function.
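To make the four type variables concrete, here is a sketch with arguments of my own choosing, picked so that 'a, 'b, 'c and 'd are not all the same type:

```ocaml
let something f g a b = f a (g a b)

(* f : int -> bool -> string, so 'a = int, 'b = bool, 'c = string *)
let f n flag = if flag then string_of_int n else "none"

(* g : int -> string -> bool, so 'a = int, 'd = string, 'b = bool *)
let g n s = String.length s > n

(* something f g 3 "abcd" = f 3 (g 3 "abcd") = f 3 true *)
let result = something f g 3 "abcd"
```

The third argument (3) is passed to both functions, the fourth ("abcd") only to g, and the final result is whatever f returns, here the string "3".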
If you're really interested in the subject (and have access to a university library), read Bird and Wadler's excellent (if somewhat dated) "Introduction to Functional Programming". It explains type signatures and type inference in a very nice and readable way.
Two further hints: Note that the -> arrow is right-associative, so you can bracket things from the right, which sometimes helps to understand things, i.e. a -> b -> c is the same as a -> (b -> c). This is connected to the second hint: higher-order functions. You can do things like
let add x y = x + y
let inc = add 1
so in FP, thinking of 'add' as a function that has to take two numerical parameters and returns a numerical value is not generally the right thing to do: It can also be a function that takes one numerical argument and returns a function with type num -> num.
Understanding this will help you understand type signatures, but you can do it without. Here, quick and easy:
# let s x y z = x z (y z) ;;
val s : ('a -> 'b -> 'c) -> ('a -> 'b) -> 'a -> 'c = <fun>
Look at the right hand side. y is given one argument, so it is of type a -> b where a is the type of z. x is given two arguments, the first one of which is z, so the type of the first argument has to be a as well. The type of (y z) , the second argument, is b, and hence the type of x is (a -> b -> c). This allows you to deduce the type of s immediately.
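A quick sanity check of that reading, with arguments I made up for the purpose:

```ocaml
let s x y z = x z (y z)

let add a b = a + b        (* plays the role of x : 'a -> 'b -> 'c *)
let double z = z * 2       (* plays the role of y : 'a -> 'b *)

(* s add double 5 = add 5 (double 5) *)
let sum = s add double 5

(* The three variables need not be the same type:
   here 'a = string, 'b = bool, 'c = string. *)
let kept =
  s (fun z keep -> if keep then z else "")
    (fun z -> String.length z > 2)
    "hello"
```

Here sum is 15 and kept is "hello".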
This isn't a homework question, by the way. It got brought up in class but my teacher couldn't think of any. Thanks.
How do you define the identity function? If you're only considering the syntax, there are different identity functions, which all have the correct type:
let f x = x
let f2 x = (fun y -> y) x
let f3 x = (fun y -> y) (fun y -> y) x
let f4 x = (fun y -> (fun y -> y) y) x
let f5 x = (fun y z -> z) x x
let f6 x = if false then x else x
There are even weirder functions:
let f7 x = if Random.bool() then x else x
let f8 x = if Array.length Sys.argv < 5 then x else x
If you restrict yourself to a pure subset of OCaml (which rules out f7 and f8), all the functions you can build satisfy an observational equation that ensures, in a sense, that what they compute is the identity: for every value f : 'a -> 'a and every x, we have f x = x
This equation does not depend on the specific function, it is uniquely determined by the type. There are several theorems (framed in different contexts) that formalize the informal idea that "a polymorphic function can't change a parameter of polymorphic type, only pass it around". See for example the paper of Philip Wadler, Theorems for free!.
The nice thing with those theorems is that they don't only apply to the 'a -> 'a case, which is not so interesting. You can get a theorem out of the ('a -> 'a -> bool) -> 'a list -> 'a list type of a sorting function, which says that its application commutes with the mapping of an order-preserving function.
More formally, if you have any function s with such a type, then for all types u, v, functions cmp_u : u -> u -> bool, cmp_v : v -> v -> bool, f : u -> v, and list li : u list, if cmp_u u u' = cmp_v (f u) (f u') for all u, u' (f preserves the ordering), you have:
map f (s cmp_u li) = s cmp_v (map f li)
This is indeed true when s is exactly a sorting function, but I find it impressive to be able to prove that it is true of any function s with the same type.
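Here is a small check of that equation on one input. It is only a sketch: sort, below_head, f, le and li are my own choices, and testing one list obviously proves nothing, but it shows the equation holding both for a real sort and for a function that merely has the same type.

```ocaml
(* Insertion sort of type ('a -> 'a -> bool) -> 'a list -> 'a list. *)
let rec insert le x = function
  | [] -> [x]
  | y :: ys as l -> if le x y then x :: l else y :: insert le x ys

let sort le li = List.fold_left (fun acc x -> insert le x acc) [] li

(* Same type, but not a sort: keeps the tail elements that compare
   below the head of the list. *)
let below_head le = function
  | [] -> []
  | h :: t -> List.filter (fun x -> le x h) t

(* f preserves the ordering: a <= b exactly when f a <= f b. *)
let f x = 2 * x + 1
let le (a : int) b = a <= b

let li = [3; 1; 4; 1; 5; 9; 2; 6]

(* map f (s cmp li) = s cmp (map f li), for both choices of s. *)
let sort_commutes = List.map f (sort le li) = sort le (List.map f li)
let below_head_commutes =
  List.map f (below_head le li) = below_head le (List.map f li)
```

Both booleans are true. Here cmp_u and cmp_v are the same comparison on int, which is the simplest case of the statement.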
Once you allow non-termination, either by diverging (looping indefinitely, as with the let rec f x = f x function given above), or by raising exceptions, of course you can have anything : you can build a function of type 'a -> 'b, and types don't mean anything anymore. Using Obj.magic : 'a -> 'b has the same effect.
There are saner ways to lose the equivalence to identity : you could work inside a non-empty environment, with predefined values accessible from the function. Consider for example the following function :
let counter = ref 0
let f x = incr counter; x
You still have the property that for all x, f x = x: if you only consider the return value, your function still behaves as the identity. But once you consider side effects, you're not equivalent to the (side-effect-free) identity anymore: if I know counter, I can write a separating function that returns true when given this function f, and would return false for pure identity functions.
let separate g =
  let before = !counter in
  g ();
  !counter = before + 1
If counter is hidden (for example by a module signature, or simply let f = let counter = ... in fun x -> ...), and no other function can observe it, then we again can't distinguish f and the pure identity functions. So the story is much more subtle in presence of local state.
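Putting the pieces together as a runnable sketch (the definitions are repeated so the snippet stands alone; hidden_f and the three result names are mine):

```ocaml
let counter = ref 0
let f x = incr counter; x

(* Returns true exactly when calling g bumps the global counter once. *)
let separate g =
  let before = !counter in
  ignore (g ());
  !counter = before + 1

(* Same behaviour as f, but the state is local to the closure,
   so nothing outside can read it. *)
let hidden_f =
  let hidden = ref 0 in
  fun x -> incr hidden; x

let visible_detected = separate f              (* f touches counter *)
let pure_detected = separate (fun x -> x)      (* the pure identity does not *)
let hidden_detected = separate hidden_f        (* neither does hidden_f *)
```

Only visible_detected is true. One caveat: because of the value restriction, hidden_f gets a weak (non-generalized) type here, which is fixed to unit -> unit by its single use; that is a separate issue from the one discussed above.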
let rec f x = f (f x)
This function never terminates, but it does have type 'a -> 'a.
If we only allow total functions, the question becomes more interesting. Without using evil tricks, it's not possible to write a total function of type 'a -> 'a other than the identity, but evil tricks are fun so:
let f (x:'a):'a = Obj.magic 42
Obj.magic is an evil abomination of type 'a -> 'b which allows all kinds of shenanigans to circumvent the type system.
On second thought that one isn't total either because it will crash when used with boxed types.
So the real answer is: the identity function is the only total function of type 'a -> 'a.
Throwing an exception can also give you an 'a -> 'a type:
# let f (x:'a) : 'a = raise (Failure "aaa");;
val f : 'a -> 'a = <fun>
If you restrict yourself to a "reasonable" strongly normalizing typed λ-calculus, there is a single function of type ∀α. α→α, which is the identity function. You can prove it by examining the possible normal forms of a term of this type.
Philip Wadler's 1989 article "Theorems for Free" explains how functions having polymorphic types necessarily satisfy certain theorems (e.g. a map-like function commutes with composition).
There are however some nonintuitive issues when one deals with much polymorphism. For instance, there is a standard trick for encoding inductive types and recursion with impredicative polymorphism, by representing an inductive object (e.g. a list) using its recursor function. In some cases, there are terms belonging to the type of the recursor function that are not recursor functions; there is an example in §4.3.1 of Christine Paulin's PhD thesis.
All,
Here is the type expression which I need to convert to a ML expression:
int -> (int*int -> 'a list) -> 'a list
Now I know this is a currying style expression which takes 2 arguments:
1st argument = Type int
and 2nd argument = Function which takes the previous int value twice and return a list of any type
I am having a hard time figuring such a function that would take an int and return 'a list.
I am new to ML and hence this might be trivial to others, but obviously not me.
Any help is greatly appreciated.
You get an int and a function int*int -> 'a list. You're supposed to return an 'a list. So all you need to do is call the function you get with (x,x) (where x is the int you get) and return the result of that. So
fun foo x f = f (x,x)
Note that this is not the only possible function with type int -> (int*int -> 'a list) -> 'a list. For example the functions fun foo x f = f (x, 42) and fun foo x f = f (23, x) would also have that type.
Edit:
To make the type match exactly add a type annotation to restrict the return type of f:
fun foo x (f : int*int -> 'a list) = f (x,x)
Note however that there is no real reason to do that. This version behaves exactly as the one before, except that it only accepts functions that return a list.
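For comparison, here is the same pair of definitions translated to OCaml, as a sketch (foo_exact is my name for the annotated version):

```ocaml
(* Unannotated: OCaml infers the more general 'a -> ('a * 'a -> 'b) -> 'b. *)
let foo x f = f (x, x)

(* Annotated so that the signature is exactly
   int -> (int * int -> 'a list) -> 'a list. *)
let foo_exact (x : int) (f : int * int -> 'a list) : 'a list = f (x, x)

let pair_as_list = foo_exact 3 (fun (a, b) -> [a; b])
let sum_as_strings = foo_exact 2 (fun (a, b) -> [string_of_int (a + b)])
```

Here pair_as_list is [3; 3] and sum_as_strings is ["4"]; the element type of the result is chosen by the function that is passed in.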