How is lazy evaluation working in this example? - functional-programming

If I have this function:
let x = print_endline "foo" in x,x;;
it has this signature: unit * unit = ((), ())
and evaluates to foo
I dont understand why it prints foo once.
On the other hand: let x () = print_endline "foo" in x(),x();; will print foo twice and has the signature:
unit * unit = ((), ())
can someone explain the principle behind this and how it works in this specific example?

OCaml is an eager language unless directed otherwise. To create lazy values we use the lazy keyword, e.g., lazy (print_endline "foo") will create a lazy value of type unit Lazy.t that will print foo, once it is forced with Lazy.force.
All other values in OCaml are eager and are evaluated immediately. You got confused with the let .. in ... syntactic form.
The syntactic form
let <var> = <expr1> in <expr2>
first evaluates <expr1> and then evaluates <expr2> with the value of <expr1> bound to <var>. Finally, the result of <expr2> becomes the value of the whole form.
So when you say, let x = 2 in x + x you first evaluate 2 and bind it to x and then evaluate x + x with x=2, which is 4, so the value of the expression let x = 2 in x + x in 4. This is a pure expression, it doesn't define any functions and doesn't bind x in the global context.
Going back to your example,
let x = print_endline "foo" in x, x
we first evaluate <expr1>, which is print_endline "foo" that has the side effect of printing foo and returns the value () of type unit. Next, () is bound to x and the <expr2> is evaluated, which x,x, and which is equal to (),().
Your second example,
let x () = print_endline "foo" in x (), x ()
binds x to the function fun () -> print_endline "foo" since
let f x = <expr>
is the syntactic sugar for
let f = fun x -> <expr>
i.e., to a definition of a function.
Therefore, your expression first evaluates <expr1>, which is now fun () -> print_endline "foo" and binds it to x. Now you evaluate x (), x () and calls x () twice, both calls return ().
Finally, you said in your question that,
If I have this function: let x = print_endline "foo" in x,x;;
You don't have any functions there. Much like in let x = 2 in x + x, you are not defining any functions but just evaluate x+x with x bound to 2. After it is evaluated, there are no leftovers, x is not bound to anything. In C-like syntax, it is the same as
{
int x = 2;
x + x;
}
So if you want to define a function, you have to use the syntax
let <name> <params> =
<expr>
Where <expr> is the body of the function that is evaluated in a context where each parameter is substituted with its actual value (argument).
E.g.,
let print_foo () =
print_endline "foo"
or,
let sum_twice x y =
let z = x + y in
2 * z
I would suggest reading the OCaml Tutorial to get a better idea of OCaml syntax.

There is no lazy evaluation involved in this code at all. It would be more the other way around: lazy code (in OCaml) often involves having a function of the form () -> 'a instead of a value 'a, but this doesn't mean that any function of this form has to do with lazy evaluation.
In the first snippet, let x = print_endline "foo" in x, x, you are binding x to a value, the result of print_endline "foo", and then returning x, x. However, by doing so you must call print_endline "foo", which will print foo. At this point, x : unit = (). Then, you return x, x, that is, (), ().
In the second snippet, let x () = print_endline "foo" in x (), x (), you are defining a function x. That form is syntax sugar for let x = fun () -> print_endline "foo" in x (), x (). For this reason, before the evaluation of x (), x (), you have x : unit -> unit = <fun>. Then, you call that function twice, each time printing foo, and returning ().
Note that, when you write something like let x () = ..., you could also be writing let x (y: unit) = ..., except that since the type unit has a single constructor (()), OCaml allows you to deconstruct it directly in the function declaration. Which is why you are allowed to write let x () = ....

Related

OCaml lazy evaluation: 'a lazy_t vs unit -> 'a

Both achieve the same thing
# let x = fun () -> begin print_endline "Hello"; 1 end;;
val x : unit -> int = <fun>
# x ();;
Hello
- : int = 1
# let y = lazy begin print_endline "World"; 2 end;;
val y : int lazy_t = <lazy>
# Lazy.force y;;
World
- : int = 2
Is there any reason one should be preferred over the other? Which one is more efficient?
First of all, they do not behave the same, try to do Lazy.force y yet another time, and you will notice the difference, the "World" message is no longer printed, so the computation is not repeated, as the result was remembered in the lazy value.
This is the main difference between lazy computations and thunks. They both defer the computation until the time when they are forced. But the thunk will evaluate its body every time, where the lazy value will be evaluated once, and the result of the computation will be memoized.
Underneath the hood, the lazy value is implemented as a thunk object with a special flag. When runtime first calls the lazy value, it substitutes the body of the thunk with the result of computation. So, after the first call to Lazy.force y, the y object actually became an integer 2. So the consequent calls to Lazy.force do nothing.
Lazy.force returns the same value again without recomputing it.
Exemple
let ra = ref 0 ;;
let y = lazy (ra:= !ra+1);;
Lazy.force y;;
# Lazy.force y;;
- : unit = ()
# !ra;;
- : int = 1
# Lazy.force y;;
- : unit = ()
# !ra;;
- : int = 1

ml function of type fn : 'a -> 'b

The function:
fn : 'a -> 'b
Now, are there any functions which can be defined and have this type?
There are two possible implementations for that function signature in Standard ML. One employs exceptions, the other recursion:
val raises : 'a -> 'b =
fn a => raise Fail "some error";
(* Infinite looping; satisfies the type signature, *)
(* but won't ever produce anything. *)
val rec loops : 'a -> 'b =
fn a => loops a;
The first solution may be useful for defining a helper function, say bug, which saves a few key strokes:
fun bug msg = raise Fail ("BUG: " ^ msg);
The other solution may be useful for defining server loops or REPLs.
In the Basis library, OS.Process.exit is such a function that returns an unknown generic type 'a:
- OS.Process.exit;
val it = fn : OS.Process.status -> 'a
A small echo REPL with type val repl = fn : unit -> 'a:
fun repl () =
let
val line = TextIO.inputLine TextIO.stdIn
in
case line of
NONE => OS.Process.exit OS.Process.failure
| SOME ":q\n" => OS.Process.exit OS.Process.success
| SOME line => (TextIO.print line ; repl ())
end
You might also find useful this question about the type signature of Haskell's forever function.
I can think of one example:
fun f a = raise Div;
I can think of several:
One that is recursive,
fun f x = f x
Any function that raises exceptions,
fun f x = raise SomeExn
Any function that is mutually recursive, e.g.,
fun f x = g x
and g x = f x
Any function that uses casting (requires specific compiler support, below is for Moscow ML),
fun f x = Obj.magic x
Breaking the type system like this is probably cheating, but unlike all the other functions with this type, this function actually returns something. (In the simplest case, it's the identity function.)
A function that throws if the Collatz conjecture is false, recurses infinitely if true,
fun f x =
let fun loop (i : IntInf.int) =
if collatz i
then loop (i+1)
else raise Collatz
in loop 1 end
which is really just a combination of the first two.
Any function that performs arbitrary I/O and recurses infinitely, e.g.
fun f x = (print "Woohoo!"; f x)
fun repl x =
let val y = read ()
val z = eval y
val _ = print z
in repl x end
One may argue that exceptions and infinite recursion represent the same theoretical value ⊥ (bottom) meaning "no result", although since you can catch exceptions and not infinitely recursive functions, you may also argue they're different.
If you restrict yourself to pure functions (e.g. no printing or exceptions) and only Standard ML (and not compiler-specific features) and you think of the mutually recursive cases as functionally equivalent in spite of their different recursion schemes, we're back to just fun f x = f x.
The reason why fun f x = f x has type 'a → 'b is perhaps obvious: The type-inference algorithm assumes that the input type and the output type are 'a and 'b respectively and goes on to conclude the function's only constraint: That f x's input type must be equal to f x's input type, and that f x's output type must be equal to f x's output type, at which point the types 'a and 'b have not been specialized any further.

OCaml: applying second argument first(higher-order functions)

I defined a higher-order function like this:
val func : int -> string -> unit
I would like to use this function in two ways:
other_func (func 5)
some_other_func (fun x -> func x "abc")
i.e., by making functions with one of the arguments already defined. However, the second usage is less concise and readable than the first one. Is there a more readable way to pass the second argument to make another function?
In Haskell, there's a function flip for this. You can define it yourself:
let flip f x y = f y x
Then you can say:
other_func (func 5)
third_func (flip func "abc")
Flip is defined in Jane Street Core as Fn.flip. It's defined in OCaml Batteries Included as BatPervasives.flip. (In other words, everybody agrees this is a useful function.)
The question posed in the headline "change order of parameters" is already answered. But I am reading your description as "how do I write a new function with the second parameter fixed". So I will answer this simple question with an ocaml toplevel protocol:
# let func i s = if i < 1 then print_endline "Counter error."
else for ix = 1 to i do print_endline s done;;
val func : int -> string -> unit = <fun>
# func 3 "hi";;
hi
hi
hi
- : unit = ()
# let f1 n = func n "curried second param";;
val f1 : int -> unit = <fun>
# f1 4;;
curried second param
curried second param
curried second param
curried second param
- : unit = ()
#

Scope/order of evaluation of nested `let .. in ..` in OCaml

I have a little problems here that I don't 100% understand:
let x = 1 in let x = x+2 in let x = x+3 in x
I know the result of this expression is 6, but just want to make sure the order of calculating this expression; which part is calculated first?
You asked about the order of the evaluation in the expression let x=1 in let x=x+2 in .... The order is "left-to-right"! When you have a chain of let a=b in let c=d in ..., the order of evaluation is always left-to-right.
However, in your example there is a confusing part: you used the same variable name, x, in every let construct. This is confusing because you then see things like let x=x+1, and this looks like you are "redefining" x or "changing the value of x". But no "changing" of "x" actually happens here in OCAML! What happens here, as already pointed out above, is that a new variable is introduced every time, so your example is entirely equivalent to
let x = 1 in let y = x+2 in let z = y+3 in z;;
Note that here the order of evaluation is also left-to-right. (It is always left-to-right in every chain of let constructs.) In your original question, you chose to call all these new variables "x" rather than x, y, and z. This is confusing to most people. It is better to avoid this kind of coding style.
But how do we check that we renamed the variables correctly? Why "let x=1 in let y=x+2" and not "let x=1 in let x=y+2"? This x=x+2 business is quite confusing! Well, there is another way of understanding the evaluation of let x=aaa in bbb. The construct
let x=aaa in bbb
can be always replaced by the following closure applied to aaa,
(fun x -> bbb) aaa
Once you rewrite it in this way, you can easily see two things: First, OCAML will not evaluate "bbb" inside the closure until "aaa" is evaluated. (For this reason, the evaluation of let x=aaa in bbb proceeds by first evaluating aaa and then bbb, that is, "left-to-right".) Second, the variable "x" is confined to the body of the closure and so "x" cannot be visible inside the expression "aaa". For this reason, if "aaa" contains a variable called "x", it must be already defined with some value before, and it has nothing to do with the "x" inside the closure. For reasons of clarity, it would be better to call this variable by a different name.
In your example:
let x=1 in let x=x+2 in let x=x+3 in x
is rewritten as
(fun x -> let x=x+2 in let x=x+3 in x) 1
Then the inner let constructs are also rewritten:
(fun x -> (fun x -> let x=x+3 in x) x+2 ) 1
(fun x -> (fun x -> (fun x-> x) x+3) x+2 ) 1
Now let us rename the arguments of functions inside each function, which we can always do without changing the meaning of the code:
(fun x -> (fun y -> (fun z -> z) y+3) x+2 ) 1
This is the same as
let x=1 in let y=x+2 in let z=y+3 in z
In this way, you can verify that you have renamed the variables correctly.
Imagine parens:
let x = 1 in (let x = (x+2) in (let x = (x+3) in x))
Then substitute (x=1) where x it's not covered by another declaration of x and eliminate outermost let:
let x = (1+2) in (let x = (x+3) in x)
Evaluate:
let x = 3 in (let x = (x+3) in x)
Substitute:
let x = (3+3) in x
Evaluate:
let x = 6 in x
Substitute:
6
(This is a little long for a comment, so here's a smallish extra answer.)
As Chuck points out, there is no closure involved in this expression. The only complexity at all is due to the scoping rules. OCaml scoping rules are the usual ones, i.e., names refer to the nearest (innermost) definition. In the expression:
let v = e1 in e2
The variable v isn't visible (i.e., cannot be named) in e1. If (by chance) a variable of that name appears in e1, it must refer to some outer definition of (a different) v. But the new v can (of course) be named in e2. So your expression is equivalent to the following:
let x = 1 in let y = x+2 in let z = y+3 in z
It seems to me this is clearer, but it has exactly the same meaning.

Homework help converting an iterative function to recursive

For an assignment, i have written the following code in recursion. It takes a list of a vector data type, and a vector and calculates to closeness of the two vectors. This method works fine, but i don't know how to do the recursive version.
let romulus_iter (x:vector list) (vec:vector) =
let vector_close_hash = Hashtbl.create 10 in
let prevkey = ref 10000.0 in (* Define previous key to be a large value since we intially want to set closefactor to prev key*)
if List.length x = 0 then
{a=0.;b=0.}
else
begin
Hashtbl.clear vector_close_hash;
for i = 0 to (List.length x)-1 do
let vecinquestion = {a=(List.nth x i).a;b=(List.nth x i).b} in
let closefactor = vec_close vecinquestion vec in
if (closefactor < !prevkey) then
begin
prevkey := closefactor;
Hashtbl.add vector_close_hash closefactor vecinquestion
end
done;
Hashtbl.find vector_close_hash !prevkey
end;;
The general recursive equivalent of
for i = 0 to (List.length x)-1 do
f (List.nth x i)
done
is this:
let rec loop = function
| x::xs -> f x; loop xs
| [] -> ()
Note that just like a for-loop, this function only returns unit, though you can define a similar recursive function that returns a meaningful value (and in fact that's what most do). You can also use List.iter, which is meant just for this situation where you're applying an impure function that doesn't return anything meaningful to each item in the list:
List.iter f x

Resources