Difference between let+, let* and let()? - functional-programming

As the documentation on OCaml is sparse, I would appreciate it if someone could explain the differences between the various flavors of let.
I tried looking into https://dev.realworldocaml.org/toc.html, but there is no easy way to search the website. A Google search led me to some articles, but none of them gave an exact explanation.

The basic form of let expressions is:
let p1 = e1
and p2 = e2
...
and pN = eN
in e
where N is at least 1. In this form, a let expression pattern-matches the values that result from evaluating the RHS expressions against the LHS patterns, then evaluates the body with the new bindings defined by the LHS patterns in scope. For example,
let x, y = 1, 2 in
x + y
evaluates to 3.
When let has an operator name attached, it is the application of what is called a "let operator" or "binding operator" (to give you easier terms to search up). For example:
let+ x, y = 1, 2 in
x + y
desugars to (let+) (1, 2) (fun (x, y) -> x + y). (Similar to how one surrounds the operator + in parentheses, making it (+), to refer to its identifier, the identifier for the let operator let+, as it appears in a let expression, would be (let+).)
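As a minimal sketch of that desugaring, assume a hypothetical let+ defined for options via Option.map (this definition is only an illustration and needs OCaml 4.08 or later). The sugared and the hand-desugared forms should then evaluate to the same value:

```ocaml
(* Hypothetical let+ for options: applies the body to the contents of Some. *)
let ( let+ ) o f = Option.map f o

(* Sugared form: the pattern x, y is matched against the contents. *)
let sugared =
  let+ x, y = Some (1, 2) in
  x + y

(* The same expression written as a plain application of (let+). *)
let desugared = ( let+ ) (Some (1, 2)) (fun (x, y) -> x + y)
```

Both sugared and desugared evaluate to Some 3 under this definition; a different definition of let+ would give the same syntax a different meaning.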
Finally, when a let binding has an operator name attached, all the and bindings must have operator names attached as well.
let* x = 1
and+ y = 2
and* z = 3 in
x + y + z
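desugars to (let*) ((and*) ((and+) 1 2) 3) (fun ((x, y), z) -> x + y + z). The and operators nest to the left, which is why the argument pattern of the function is ((x, y), z).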
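The following sketch makes the nesting of and operators concrete. It assumes hypothetical let* and and* operators on options (these definitions are illustrations only) and writes the same computation in sugared and hand-desugared form:

```ocaml
(* Hypothetical operators on options, defined only for this illustration. *)
let ( let* ) o f = Option.bind o f

let ( and* ) a b =
  match a, b with
  | Some x, Some y -> Some (x, y)
  | _ -> None

(* Sugared form with three bindings. *)
let sugared =
  let* x = Some 1
  and* y = Some 2
  and* z = Some 3 in
  Some (x + y + z)

(* Hand-desugared form: the and* applications nest to the left,
   so the function receives the nested pair ((x, y), z). *)
let desugared =
  ( let* )
    (( and* ) (( and* ) (Some 1) (Some 2)) (Some 3))
    (fun ((x, y), z) -> Some (x + y + z))
```

With these definitions both values are Some 6, and the desugared version only typechecks because the pairs are nested on the left.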
The following program is invalid and has no meaning because the let binding is being used as an operator, but the and binding is not:
let* x = 1
and y = 2 in
x + y
Binding operators are covered in the "language extensions" section of the OCaml documentation.
let () = e is merely the non-operator form of a pattern match, where () is the pattern that matches the only value of the unit type. The unit type is conventionally the type of expressions that don't evaluate to a meaningful value, but exist for their side effects (e.g. print_endline "Hello world!"). Matching against () ensures that the expression has type unit, which catches partial application errors. The following typechecks:
let f x y =
print_endline x;
print_endline y
let () =
f "Hello" "World"
The following does not:
let f x y =
print_endline x;
print_endline y
let () =
f "Hello" (* Missing second argument, so expression has type string -> unit, not unit *)
Note that the binding operators are useful for conveniently using "monads" and "applicatives," so you may hear these words when learning about binding operators. However, binding operators are not inherently related to these concepts. All they do is desugar to the expressions that I describe above, and any other significance (such as relation to monads) results from how the operator was defined.

Consider the following code from the OCaml page on let operators.
let ( let* ) o f =
match o with
| None -> None
| Some x -> f x
let return x = Some x
If we create a very simple map:
module M = Map.Make (Int)
let m = M.(empty |> add 1 4 |> add 2 3 |> add 3 7)
If we wanted to write a function that takes a map and two keys and adds the values at those keys, returning int option, we might write:
let add_values m k1 k2 =
match M.find_opt k1 m with
| None -> None
| Some v1 ->
match M.find_opt k2 m with
| None -> None
| Some v2 ->
Some (v1 + v2)
Now, of course there are multiple ways of defining this. We could match on both lookups at once:
let add_values m k1 k2 =
match (M.find_opt k1 m, M.find_opt k2 m) with
| (None, _) | (_, None) -> None
| (Some v1, Some v2) -> Some (v1 + v2)
Or take advantage of exceptions:
let add_values m k1 k2 =
try
Some (M.find k1 m + M.find k2 m)
with
| Not_found -> None
Let operators let us write:
let add_values m k1 k2 =
let* v1 = M.find_opt k1 m in
let* v2 = M.find_opt k2 m in
return (v1 + v2)
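To see the let* version in action, here is a self-contained sketch that repeats the definitions above and applies add_values to the sample map. The names both_present and one_missing are introduced only for this illustration:

```ocaml
module M = Map.Make (Int)

(* Option-chaining let*: stop at the first None. *)
let ( let* ) o f =
  match o with
  | None -> None
  | Some x -> f x

let return x = Some x

(* Sample map: 1 -> 4, 2 -> 3, 3 -> 7 *)
let m = M.(empty |> add 1 4 |> add 2 3 |> add 3 7)

let add_values m k1 k2 =
  let* v1 = M.find_opt k1 m in
  let* v2 = M.find_opt k2 m in
  return (v1 + v2)

(* Both keys exist, so the values 4 and 3 are added. *)
let both_present = add_values m 1 2

(* Key 5 is absent, so the second lookup short-circuits to None. *)
let one_missing = add_values m 1 5
```

Here both_present is Some 7 and one_missing is None, the same results the nested match version would give.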

Related

Is there an equivalent for map or fmap to replace while loops?

Haskell replaces for loops over iterable objects with map :: (a -> b) -> [a] -> [b] or
fmap :: (a -> b) -> f a -> f b. (This question isn't limited to Haskell, I'm just using the syntax here.)
Is there something similar that replaces a while loop, like
wmap :: ([a] -> b) -> [a] -> ([b] -> Bool) -> [b]?
This function returns a list of b.
The first argument is a function that takes a list and computes a value that will end up in the list returned by wmap (so it's a very specific kind of while loop).
The second argument is the list that we use as our starting point.
The third argument is a function that evaluates the stopping criterion.
And as a functor,
wfmap :: (f a -> b) -> f a -> (f b -> Bool) -> f b
For example, a Jacobi solver would look like this (with b now the same type as a):
jacobi :: ([a] -> [a]) -> [a] -> ([a] -> Bool) -> [a]
What I'm looking for isn't really pure. wmap could have values that mutate internally, but only exist inside the function. It also has nondeterministic runtime, if it terminates at all.
In the case of a Gauss-Seidel solver, there would be no return value, since the [a] would be modified in place.
Something like this:
gs :: ([a] -> [a]) -> [a] -> ([a] -> Bool) -> ???
Does wmap or wfmap exist as part of any language by default, and what is it called?
Answer 1 (thanks to Bergi): Instead of the silly wmap/wfmap signature, we already have until.
Does an in place version of until exist for things like gs?
There is a proverb in engineering which states "Don't generalize before you have at least 3 implementations". There is some truth to it - especially when looking for new functional iteration concepts before doing it by foot a few times.
"Doing it by foot" here means, you should - if there is no friendly helper function you know of - resort to recursion. Write your "special cases" recursively. Preferably in a tail recursive form. Then, if you start to see recurring patterns, you might come up with a way to refactor into some recurring iteration scheme and its "kernel".
Let's, for the sake of clarifying the above, assume you never heard of foldl and you want to accumulate a result from iterating over a list... Then, you would write something like:
myAvg values =
  total / fromIntegral (length values)
  where
    mySum acc [] = acc
    mySum acc (x:xs) = mySum (acc + x) xs
    total = mySum 0 values
And after doing this a couple of times, the pattern might show, that the recursions in those where clauses always look darn similar. You might then come up with a name like "fold" or "reduce" for that inner recursion snippet and end up with:
myAvg values = (foldl (+) 0.0 values) / fromIntegral (length values) :: Float
So, if you are looking for helper functions which help with your use-cases, my advice is you first write a few instances as recursive functions and then look for patterns.
So, with all that said, let's get our fingers wet and see how the Jacobi algorithm could translate to Haskell, just so we have something to talk about. Now, usually I do not use Haskell for anything requiring arrays (containers with O(1) element access), because there are at least 5 array packages I know of and I would have to read for 2 days to decide which one is suitable for my application (TL;DR). So I stick with lists and NO package dependencies beyond the Prelude in the code below. But that is not a bad thing at all, given that the example equations we try to solve are tiny. Plus, the code demonstrates that list comprehensions in lazy Haskell allow for un-imperative and yet performant operations on sets of cells (e.g. in the matrix), without any need for explicit looping.
type Matrix = [[Double]]
-- sorry - my mind went blank while looking for a better name for this...
-- but it is useful nonetheless
idefix nr nc =
  [ [(r,c) | c <- [0..nc-1]] | r <- [0..nr-1]]
matElem m (r,c) = (m !! r) !! c
transpose (r,c) = (c,r)
matrixDim m = (length m, length . head $ m)
-- constructs a Matrix by enumerating the indices and querying
-- 'unfolder' for a value.
-- try "unfoldMatrix 3 3 id" and you see how indices relate to
-- cells in the matrix.
unfoldMatrix nr nc unfolder =
  fmap (\row -> fmap (\cell -> unfolder cell) row) $ idefix nr nc
-- Not really needed for Jacobi problem but good
-- training to get our fingers wet with unfoldMatrix.
transposeMatrix m =
  let (nr,nc) = matrixDim m in
  unfoldMatrix nc nr (matElem m . transpose)
addMatrix m1 m2
  | (matrixDim m1) == (matrixDim m2) =
    let (nr,nc) = matrixDim m1 in
    unfoldMatrix nr nc (\idx -> matElem m1 idx + matElem m2 idx)
subMatrix m1 m2
  | (matrixDim m1) == (matrixDim m2) =
    let (nr,nc) = matrixDim m1 in
    unfoldMatrix nr nc (\idx -> matElem m1 idx - matElem m2 idx)
dluMatrix :: Matrix -> (Matrix,Matrix,Matrix)
dluMatrix m
  | (fst . matrixDim $ m) == (snd . matrixDim $ m) =
    let n = fst . matrixDim $ m in
    (unfoldMatrix n n (\(r,c) -> if r == c then matElem m (r,c) else 0.0)
    ,unfoldMatrix n n (\(r,c) -> if r > c then matElem m (r,c) else 0.0)
    ,unfoldMatrix n n (\(r,c) -> if c > r then matElem m (r,c) else 0.0)
    )
mulMatrix m1 m2
  | (snd . matrixDim $ m1) == (fst . matrixDim $ m2) =
    let (nr, nc) = ((fst . matrixDim $ m1),(snd . matrixDim $ m2))
        -- the sum runs over the shared inner dimension, not over nr
        n = snd . matrixDim $ m1
    in
    unfoldMatrix nr nc
      (\(ro,co) ->
        sum [ matElem m1 (ro,i) * matElem m2 (i,co) | i <- [0..n-1]]
      )
isSquareMatrix m = let (nr,nc) = matrixDim m in nr == nc
jacobi :: Double -> Matrix -> Matrix -> Matrix -> Matrix
jacobi errMax a b x0
  | isSquareMatrix a && (snd . matrixDim $ a) == (fst . matrixDim $ b) =
    approximate x0
    -- We could possibly avoid our hand rolled recursion
    -- with the help of 'loop' from Control.Monad.Extra
    -- according to hoogle. But it would not look better at all.
    -- loop (\x -> let x' = jacobiStep x in if converged x' then Right x' else Left x') x0
  where
    (nra, nca) = matrixDim a
    (d,l,u) = dluMatrix a
    dinv = unfoldMatrix nra nca (\(r,c) ->
      if r == c
        then 1.0 / matElem d (r,c)
        else 0.0)
    lu = addMatrix l u
    converged x =
      let delta = (subMatrix (mulMatrix a x) b) in
      let (nrd,ncd) = matrixDim delta in
      let err = sum (fmap (\idx -> let v = matElem delta idx in v * v)
                          (concat (idefix nrd ncd))) in
      err < errMax
    jacobiStep x =
      (mulMatrix dinv (subMatrix b (mulMatrix lu x)))
    approximate x =
      let x' = jacobiStep x in
      if converged x' then x' else approximate x'
wikiExample errMax =
  let a = [[ 2.0, 1.0],[5.0,7.0]] in
  let b = [[11], [13]] in
  jacobi errMax a b [[1.0],[1.0]]
Function idefix, despite its silly name, IMHO is an eye opener for people coming from non-lazy languages. Their first reflex is to get scared: "What - he creates a list with the indices instead of writing loops? What a waste!" But a waste, it is not in lazy languages. What you see in this function (the list comprehension) produces a lazy list. It is not really created up front. What happens behind the scenes is similar in spirit to what LINQ does in C# - IEnumerator<T> juggling.
We use idefix a second time when we want to sum all elements in our delta. There, we do not care about the concrete structure of the matrix. And so we use the standard prelude function concat to flatten the Matrix into a linear list. Lazy as well, of course. That is the beauty.
The next notable difference from the imperative Wikipedia pseudocode is that using matrix notation is much less complicated than nested looping and operating on single cells. Fortunately, the Wikipedia article shows both. So, instead of a while loop with 2 nested loops, we only need an equivalent of the outermost while loop, which is covered by our 2-line recursive function approximate.
Lessons learned:
Lists and list comprehensions can help simplify code otherwise requiring nested loops. (In lazy languages).
OCaml and Common Lisp have mutability and built-in arrays and loops. That makes for a package that is very convenient when translating algorithms from imperative languages or imperative pseudocode.
Haskell has immutability and no built-in arrays and no loops, but instead it has a similarly powerful set of tools, namely laziness, tail call optimization and a terse syntax. That combination requires more planning (and writing some usually short helper functions) instead of the classical C approach of "Let's write it all in main()."
Sometimes it is easier to write a 2 line long recursive function than to think about how to abstract it.
In FP, you don't usually try to fit everything "inside the loop." You do one step and pass it on to the next function. There are lots of combinations that are useful in different situations. A common replacement for a while loop is a map followed by a takeWhile or a dropWhile, but there are many other possibilities, up to just plain recursion.
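For readers coming from the OCaml threads on this page, here is a minimal sketch of the "plain recursion" replacement for a while loop, written in OCaml rather than Haskell. The function until below is a hand-written analogue of Haskell's until, not a standard library function, and the two example values are names invented for this illustration:

```ocaml
(* until p f x applies f repeatedly, starting from x, and returns the
   first value for which the predicate p holds. It is tail recursive,
   so it runs in constant stack space. *)
let rec until p f x = if p x then x else until p f (f x)

(* Doubling from 1 until the value exceeds 100: 1, 2, 4, ..., 64, 128. *)
let first_above_100 = until (fun x -> x > 100) (fun x -> x * 2) 1

(* A tiny iterative solver in the same style: Newton steps for the
   square root of 2, stopping when the residual is small enough. *)
let sqrt2 =
  until
    (fun x -> abs_float ((x *. x) -. 2.0) < 1e-9)
    (fun x -> (x +. (2.0 /. x)) /. 2.0)
    1.0
```

The stopping criterion and the step function are passed separately, which is the same split the jacobi signature in the question asks for.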

Definition without recursion, by cases, in Isabelle

I'm trying to define a unary operation on a set stalk x, whose typical elements are of the form germ x U s. In this case, there is no way to define an operation on general things of the same type as germ x U s in a way that reduces to what I want, so it seems like I really do have to resort to a definition by cases. I attempted the following
definition stalk_mop2 :: "'a ⇒( ('a set × 'a) set ⇒ ('a set × 'a) set ) " where
"stalk_mop2 x y = ( (λ z . if (∃ U s. y= germ x U s ) then
(germ x U ( -⇩a ⇘objectsmap U⇙ s ) ) else undefined) z ) " ,
and got the error message that U s are extra variables on the RHS. It seems like by using this syntax Isabelle does not make the connection between the if hypothesis and the following term, so that although I did bind U and s in the conditional statement, it apparently interprets the next occurrences of U and s (after then) as free variables.
What I really want is just a function that takes x and something of the form germ x U s and returns germ x U ( -⇩a ⇘objectsmap U⇙ s ). Nothing here is recursive.
Is there a way around this problem, or maybe a better way to make definitions by cases that will allow me to define what I want?
Be aware that this is nothing strange about Isabelle's syntax: there just is no connection between the if-condition and the then- and else-branches. The scope of the existential quantifier naturally ends with then.
If you want to obtain a witness for something you know exists, you can use Hilbert's choice operator, e.g., (SOME (U, s). y = germ x U s) gives you a pair (U, s) that satisfies y = germ x U s if such a pair exists (which you made sure of with your if-condition), and an arbitrary, unspecified pair otherwise.
So how about:
definition stalk_mop2 :: "'a ⇒(('a set × 'a) set ⇒ ('a set × 'a) set)"
where
"stalk_mop2 x y = ((λz .
if ∃U s. y = germ x U s then
let (U, s) = (SOME (U, s). y = germ x U s) in
germ x U (-⇩a ⇘objectsmap U⇙ s)
else undefined) z)"
Update: You can use multiple lets in one of the following ways
let x1 = e1 in let x2 = e2 in ...
or
let x1 = e1; x2 = e2; ... in ...

Memoization in OCaml?

It is possible to improve the "raw" recursive Fibonacci procedure
Fib[n_] := If[n < 2, n, Fib[n - 1] + Fib[n - 2]]
with
Fib[n_] := Fib[n] = If[n < 2, n, Fib[n - 1] + Fib[n - 2]]
in Wolfram Mathematica.
The first version suffers from exponential blow-up, while the second one does not, since Mathematica stores each computed value of Fib[n] and reuses it when the same call is repeated (memoization).
Is it possible to do the same in OCaml?
How to improve
let rec fib n = if n<2 then n else fib (n-1) + fib (n-2);;
in the same manner?
The solution provided by rgrinberg can be generalized so that we can memoize any function. I am going to use associative lists instead of hashtables. But it does not really matter, you can easily convert all my examples to use hashtables.
First, here is a function memo which takes another function and returns its memoized version. It is what nlucaroni suggested in one of the comments:
let memo f =
let m = ref [] in
fun x ->
try
List.assoc x !m
with
Not_found ->
let y = f x in
m := (x, y) :: !m ;
y
The function memo f keeps a list m of results computed so far. When asked to compute f x it first checks m to see if f x has been computed already. If yes, it returns the result, otherwise it actually computes f x, stores the result in m, and returns it.
There is a problem with the above memo in case f is recursive. Once memo calls f to compute f x, any recursive calls made by f will not be intercepted by memo. To solve this problem we need to do two things:
In the definition of such a recursive f we need to substitute recursive calls with calls to a function "to be provided later" (this will be the memoized version of f).
In memo f we need to provide f with the promised "function which you should call when you want to make a recursive call".
This leads to the following solution:
let memo_rec f =
let m = ref [] in
let rec g x =
try
List.assoc x !m
with
Not_found ->
let y = f g x in
m := (x, y) :: !m ;
y
in
g
To demonstrate how this works, let us memoize the naive Fibonacci function. We need to write it so that it accepts an extra argument, which I will call self. This argument is what the function should use instead of recursively calling itself:
let fib self = function
0 -> 1
| 1 -> 1
| n -> self (n - 1) + self (n - 2)
Now to get the memoized fib, we compute
let fib_memoized = memo_rec fib
You are welcome to try it out to see that fib_memoized 50 returns instantly. (This is not so for memo f where f is the usual naive recursive definition.)
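As a sketch of the conversion to hashtables mentioned above, the variant below replaces the association list with a Hashtbl from the standard library. The name memo_rec_tbl and the initial table size of 16 are choices made for this illustration only:

```ocaml
(* Same idea as memo_rec, but results are cached in a hash table,
   so a lookup does not have to walk a list. *)
let memo_rec_tbl f =
  let tbl = Hashtbl.create 16 in
  let rec g x =
    match Hashtbl.find_opt tbl x with
    | Some y -> y
    | None ->
      (* f receives g, so its recursive calls go through the cache. *)
      let y = f g x in
      Hashtbl.add tbl x y;
      y
  in
  g

(* The open-recursive Fibonacci from above, with fib 0 = fib 1 = 1. *)
let fib self = function
  | 0 | 1 -> 1
  | n -> self (n - 1) + self (n - 2)

let fib_memoized = memo_rec_tbl fib
```

With this definition fib_memoized 10 is 89, and larger arguments such as 50 should still return immediately because each value is computed only once.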
You pretty much do what the Mathematica version does, but manually:
let rec fib =
let cache = Hashtbl.create 10 in
begin fun n ->
try Hashtbl.find cache n
with Not_found -> begin
if n < 2 then n
else
let f = fib (n-1) + fib (n-2) in
Hashtbl.add cache n f; f
end
end
Here I choose a hashtable to store already computed results instead of recomputing them.
Note that you should still beware of integer overflow since we are using a normal and not a big int.

When exactly do we use let rec?

I know that let rec is used when I want recursion.
For example,
let rec power i x = if i = 0 then 1.0 else x *. (power (i-1) x);;
Ok, I understand that.
But how about this one:
let x y = y + y in x 2?
Should I use rec inside?
I think I should, because it has x 2 inside, calling itself, but the compiler seems to be fine with it.
So when should I use let rec, and when shouldn't I?
Also, what is the difference between
let (-) x y = y - x in 1-2-3;;
and
let rec (-) x y = y - x in 1-2-3;;
Are they both legal?
You need to understand the scoping rules of OCaml first.
When you write let f XXX = YYY in ZZZ, if you use f in YYY then you need rec. In both cases (i.e. with or without rec), f will be defined in ZZZ.
So:
let x y = y + y in
x 2
is perfectly valid.
For your second question: both are legal, but they are not equivalent. If you try them in the toplevel, the second one loops forever, because inside its body (-) refers to the function being defined, so it behaves like let rec loop x y = loop y x. To understand why it loops forever, you can think of an application of loop as an expansion where the identifier is replaced by its body.
The body of loop is fun x y -> loop y x, which can be expanded to
fun x y -> (fun a b -> loop b a) y x (I've renamed the parameters to avoid ambiguity), which is equivalent to fun x y -> loop x y once you apply the inner function, and so on and so on. So this function never does anything useful: it just loops forever, expanding its body and swapping its arguments.
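A small sketch of the two terminating cases may help. The names doubled and flipped are introduced only for this illustration; the let rec variant is left out because evaluating it would never return:

```ocaml
(* No rec needed: x is used only in the expression after "in",
   not inside its own definition. *)
let doubled =
  let x y = y + y in
  x 2

(* Without rec, the minus sign inside the body still refers to the
   built-in integer subtraction, so this just defines a flipped minus
   for the expression after "in". *)
let flipped =
  let ( - ) x y = y - x in
  1 - 2 - 3
(* 1 - 2 is computed as 2 - 1 = 1, then 1 - 3 is computed as 3 - 1 = 2. *)
```

Here doubled is 4 and flipped is 2, whereas the ordinary 1 - 2 - 3 would be -4.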

How do variables in pattern matching allow parameter omission?

I'm doing some homework but I've been stuck for hours on something.
I'm sure it's really trivial, but I still can't wrap my head around it after digging through all the documentation available.
Can anybody give me a hand?
Basically, the exercise in OCaml programming asks to define the function x^n with the exponentiation by squaring algorithm.
I've looked at the solution:
let rec exp x = function
0 -> 1
| n when n mod 2 = 0 -> let y = exp x (n/2) in y*y
| n when n mod 2 <> 0 -> let y = exp x ((n-1)/2) in y*y*x
;;
What I don't understand in particular is how the parameter n can be omitted from the fun statement and why should it be used as a variable for a match with x, which has no apparent link with the definition of exponentiation by squaring.
Here's how I would do it:
let rec exp x n = match n with
0 -> 1
| n when (n mod 2) = 1 -> (exp x ((n-1)/2)) * (exp x ((n-1)/2)) * x
| n when (n mod 2) = 0 -> (exp x (n/2)) * (exp x (n/2))
;;
Your version is syntactically correct and yields the right answer, but it takes long to execute.
In your code, exp is called recursively twice at each step, so each call spawns two more calls, each of which spawns two more, and so on down to n = 0. In the solution, exp is called only once, the result is stored in the variable y, then y is squared.
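To make the cost difference visible, the sketch below instruments both styles with call counters. The names exp_fast, exp_slow and the two counters are hypothetical, and the code assumes a non-negative exponent:

```ocaml
let fast_calls = ref 0
let slow_calls = ref 0

(* One recursive call per step; the result is bound to y and reused. *)
let rec exp_fast x n =
  incr fast_calls;
  if n = 0 then 1
  else
    let y = exp_fast x (n / 2) in
    if n mod 2 = 0 then y * y else y * y * x

(* Two recursive calls per step, each recomputing the same value. *)
let rec exp_slow x n =
  incr slow_calls;
  if n = 0 then 1
  else if n mod 2 = 0 then exp_slow x (n / 2) * exp_slow x (n / 2)
  else exp_slow x ((n - 1) / 2) * exp_slow x ((n - 1) / 2) * x

let fast_result = exp_fast 2 10
let slow_result = exp_slow 2 10
```

Both results are 1024, but for n = 10 the first version makes 5 calls while the second makes 31, and the gap grows with n: the number of calls is proportional to log n in one case and to n in the other.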
Now, about the syntax,
let f n = match n with
| 0 -> 0
| foo -> foo-1
is equivalent to:
let f = function
| 0 -> 0
| foo -> foo-1
The line let rec exp x = function is the beginning of a function that takes two arguments: x, and an unnamed argument used in the pattern matching. In the pattern matching, the line
| n when n mod 2 = 0 ->
names this argument n. Note that a different name could be used in each case of the pattern matching (even if that would be less clear):
| n when n mod 2 = 0 -> let y = exp x (n/2) in y*y
| p when p mod 2 <> 0 -> let y = exp x ((p-1)/2) in y*y*x
The keyword "function" is not syntactic sugar for
match x with
but for
fun x -> match x with
thus
let rec exp x = function
could be replaced by
let rec exp x = fun y -> match y with
which is of course equivalent to your solution
let rec exp x y = match y with
Note that I wrote "y" and not "n" to avoid confusion. The n variable introduced after the match is a new variable, which is related to the function parameter only because it matches it. For instance, instead of
let y = x in ...
you could write :
match x with y -> ...
In this match expression, "y" is the pattern being matched. Like any pattern, it binds its variables (here y) to the value matched (here the value of x). And like any pattern, the variables it introduces are new variables, which may shadow previously defined ones. In your code:
let rec exp x n = match n with
0 -> 1
| n when (n mod 2) = 1 -> (exp x ((n-1)/2)) * (exp x ((n-1)/2)) * x
| n when (n mod 2) = 0 -> (exp x (n/2)) * (exp x (n/2))
;;
the variable n in each of the two cases shadows the parameter n. This isn't a problem, though, since the two variables with the same name have the same value.
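The equivalences discussed in this answer can be checked side by side. The sketch below writes the same exponentiation in the three forms; the names exp1, exp2, exp3 and the parameter name k are chosen for this illustration, and a non-negative exponent is assumed:

```ocaml
(* Form 1: the keyword function introduces the second, unnamed argument. *)
let rec exp1 x = function
  | 0 -> 1
  | n when n mod 2 = 0 -> let y = exp1 x (n / 2) in y * y
  | n -> let y = exp1 x ((n - 1) / 2) in y * y * x

(* Form 2: the same thing with the argument named k and matched explicitly. *)
let rec exp2 x = fun k ->
  match k with
  | 0 -> 1
  | n when n mod 2 = 0 -> let y = exp2 x (n / 2) in y * y
  | n -> let y = exp2 x ((n - 1) / 2) in y * y * x

(* Form 3: both arguments on the left. The pattern variable n is a new
   binding that holds the same value as the parameter k. *)
let rec exp3 x k =
  match k with
  | 0 -> 1
  | n when n mod 2 = 0 -> let y = exp3 x (n / 2) in y * y
  | n -> let y = exp3 x ((n - 1) / 2) in y * y * x
```

All three have type int -> int -> int and return the same results, for example 243 for 3 raised to the power 5.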

Resources