Comparison in pattern matching in OCaml - functional-programming

I want to write a function set that sets index i of the 'a array a to the value v of type 'a, and raises an Invalid_argument exception if i is greater than the array's length minus 1.
I know that this can be done with if/then/else:
let set i v a =
  let l = Array.length a in
  if i > (l-1) then
    raise (Invalid_argument "index out of bounds")
  else
    a.(i) <- v
However, I want to know if this can be achieved in a pure functional approach, using pattern matching and the OCaml standard library. I don't know how to compare values inside the pattern matching; I get an error at the marked line:
let set i v a =
  let l = Array.length a in
  match i with
>>>>>>  | > l-1 -> raise (Invalid_argument "index out of bounds")
  | _ -> a.(i) <- v
Is there a workaround to achieve this? Perhaps with a helper function?

An if expression is a pure functional approach, and is also the right approach. In general, pattern matching has the purpose of deconstructing values; it's not an alternative to an if.
However, it's still possible to do this with pattern matching:
let set i v a =
  let l = Array.length a in
  match compare l i with
  | 1 -> a.(i) <- v
  | _ -> raise @@ Invalid_argument "index out of bounds"
EDIT: Apparently, compare can return other values than -1, 0 and 1 so this version is not reliable (but you wouldn't use it anyway, would you?)...
Or, more efficiently:
let set i v a =
  let l = Array.length a in
  match l > i with
  | true -> a.(i) <- v
  | false -> raise @@ Invalid_argument "index out of bounds"
But then you realize that matching over a boolean is just an if. Which is why the correct version is still
let set i v a =
  let l = Array.length a in
  if l > i then a.(i) <- v
  else raise @@ Invalid_argument "index out of bounds"

BlackBeans' answer is correct. But also know that pattern-matching in OCaml can take advantage of conditional guards when you want to place a conditional on a pattern.
Consider the following simple example.
type species = Dog | Cat
type weight = int
type pet = Pet of species * weight
let sound = function
  | Pet (Dog, weight) when weight < 10 -> "Yip!"
  | Pet (Dog, _) -> "Woof!"
  | Pet (Cat, weight) when weight > 100 -> "ROAR!!!"
  | _ -> "Meow!"
The patterns Pet (Dog, weight) and Pet (Dog, _) would otherwise match the same values (with the latter not binding a name to the weight).
An equivalent with if/else would look like:
let sound = function
  | Pet (Dog, weight) ->
    if weight < 10 then "Yip!"
    else "Woof!"
  | Pet (Cat, weight) ->
    if weight > 100 then "ROAR!!!"
    else "Meow!"
Which you prefer largely boils down to opinion, and to which form you feel is more expressive.
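Applied to the set function from the question, a guard could look roughly like the sketch below. The extra check for negative indices is my own addition (the question only tests the upper bound), so treat it as an assumption rather than part of the original requirement.

```ocaml
(* Sketch: bounds check expressed as a `when` guard.
   The `i < 0` test is an added assumption, not in the question. *)
let set i v a =
  match i with
  | i when i < 0 || i >= Array.length a ->
    raise (Invalid_argument "index out of bounds")
  | _ -> a.(i) <- v
```

This still amounts to an if in disguise, since the pattern itself deconstructs nothing; guards pay off mostly when combined with real patterns, as in the pet example.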

Related

Errors while iterating and processing the split string in recursive functions

Let's say we have a string
"+x1 +x2 -x3
+x4 +x5 -x6
..."
and a type formula:
type formula =
  | Bot
  | Top
  | Atom of string
  | Imp of (formula * formula)
  | Or of (formula * formula)
  | And of (formula * formula)
  | Not of formula
let atom x = Atom x
(aka predicate logic)
and we want to:
Create a function which takes one line, splits it and turns it into disjunction using the formula type. (sort of like Or(Atom "x1", Atom "x2", Not Atom "x3") if we give the first line as an input)
I've written this:
let string_to_disj st =
  let lst = Str.split (Str.regexp " \t") st in
  let rec total lst =
    match lst with
    | [] -> Or (Bot, Bot) (*Is this correct btw?*)
    | h :: t -> Or (string_to_lit h, total t);;
where
let string_to_lit =
  match String.get s 0 with
  | '+' -> atom (String.sub s 1 (String.length s-1))
  | '-' -> Not(atom(String.sub s 1 (String.length s-1)))
  | _ -> atom(s);;
However, string_to_disj raises a syntax error at line
| h :: t -> Or (string_to_lit h, total t)
What have I done wrong?
You have let rec total lst but you have no matching in. Every let requires a matching in. (Except at the top level of a module where it is for defining exported symbols of the module).
Also note that you are defining a function named total but you have no calls to the function except the one recursive call.
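Putting both points together, a corrected version might look like the sketch below. Several things here are assumptions of mine rather than part of the question: string_to_lit is given an explicit parameter s, the line is split with String.split_on_char on single spaces (instead of Str, so no extra library is needed, but tabs are not handled), and the empty list maps to Bot, which is the neutral element of disjunction.

```ocaml
type formula =
  | Bot
  | Top
  | Atom of string
  | Imp of (formula * formula)
  | Or of (formula * formula)
  | And of (formula * formula)
  | Not of formula

let atom x = Atom x

(* Assumed fix: the function takes the literal as parameter s. *)
let string_to_lit s =
  match s.[0] with
  | '+' -> atom (String.sub s 1 (String.length s - 1))
  | '-' -> Not (atom (String.sub s 1 (String.length s - 1)))
  | _ -> atom s

let string_to_disj st =
  (* Split on spaces and drop empty pieces left by repeated spaces. *)
  let lst =
    String.split_on_char ' ' st |> List.filter (fun w -> w <> "")
  in
  let rec total lst =
    match lst with
    | [] -> Bot (* assumption: empty disjunction is Bot *)
    | h :: t -> Or (string_to_lit h, total t)
  in
  (* The `in` above closes the let rec; here total is actually called. *)
  total lst
```

Note that Or takes a pair, so three literals end up as nested Or nodes rather than a single three-argument Or.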

Recursion in F#, expected type int but got type "int list -> int"

I'm new to F# and want to implement a least common multiple function of a list by doing it recursively, e.g. lcm(a,b,c) = lcm(a, lcm(b,c)), where lcm of two elements is calculated from the gcd.
I have the following code. I try to match the input of the lcm function with a list of two elements, and otherwise a general list, which I split up into its first element and the remaining part. The part "lcm (tail)" gives a compiler error. It says it was expected to have type "int" but has type "int list -> int". It looks like it says that the expression "lcm tail" is itself a function, which I don't understand. Why is it not an int?
let rec gcd a b =
    if b = 0
        then abs a
    else gcd b (a % b)
let lcmSimple a b = a*b/(gcd a b)
let rec lcm list = function
    | [a;b] -> lcmSimple a b
    | head::tail -> lcmSimple (head) (lcm (tail))
Best regards.
When defining the function as let f = function | ..., the argument to the function is implicit, as it is interpreted as let f x = match x with | ....
Thus let rec lcm list = function |... is a function of two variables, which are list and the implicit variable. This is why the compiler claims that lcm tail is a function - only one variable has been passed, where it expected two. A better version of the code is
let rec gcd a b =
    if b = 0
        then abs a
    else gcd b (a % b)
let lcmSimple a b = a*b/(gcd a b)
let rec lcm = function
    | [a;b] -> lcmSimple a b
    | head::tail -> lcmSimple (head) (lcm (tail))
    | [] -> 1
where the last case has been included to complete the pattern.
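The function keyword behaves the same way in OCaml, so the point carries over directly. The following is an OCaml transcription I wrote for illustration (not the F# code above; lcm_simple is just a renamed lcmSimple), with only the empty-list and cons cases, which is enough because the lcm of a single element with 1 is that element.

```ocaml
let rec gcd a b = if b = 0 then abs a else gcd b (a mod b)

let lcm_simple a b = a * b / gcd a b

(* `function` supplies the single, implicit list argument;
   writing `let rec lcm list = function ...` would add a second one. *)
let rec lcm = function
  | [] -> 1
  | head :: tail -> lcm_simple head (lcm tail)
```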

How to create a cached recursive type?

open System
open System.Collections.Generic
type Node<'a>(expr:'a, symbol:int) =
    member x.Expression = expr
    member x.Symbol = symbol
    override x.GetHashCode() = symbol
    override x.Equals(y) =
        match y with
        | :? Node<'a> as y -> symbol = y.Symbol
        | _ -> failwith "Invalid equality for Node."
    interface IComparable with
        member x.CompareTo(y) =
            match y with
            | :? Node<'a> as y -> compare symbol y.Symbol
            | _ -> failwith "Invalid comparison for Node."
type Ty =
    | Int
    | String
    | Tuple of Ty list
    | Rec of Node<Ty>
    | Union of Ty list
type NodeDict<'a> = Dictionary<'a,Node<'a>>
let get_nodify_tag =
    let mutable i = 0
    fun () -> i <- i+1; i
let nodify (dict: NodeDict<_>) x =
    match dict.TryGetValue x with
    | true, x -> x
    | false, _ ->
        let x' = Node(x,get_nodify_tag())
        dict.[x] <- x'
        x'
let d = Dictionary(HashIdentity.Structural)
let nodify_ty x = nodify d x
let rec int_string_stream =
    Union
        [
        Tuple [Int; Rec (nodify_ty (int_string_stream))]
        Tuple [String; Rec (nodify_ty (int_string_stream))]
        ]
In the above example, int_string_stream gives a type error, but it neatly illustrates what I want to do. Of course, I want both sides to get tagged with the same symbol in nodify_ty. When I tried changing the Rec type to Node<Lazy<Ty>>, I found that it does not compare them correctly and each side gets a new symbol, which is useless to me.
I am working on a language, and the way I've dealt with storing recursive types up to now is by mapping Rec to an int and then substituting that with the related Ty in a dictionary whenever I need it. Currently, I am in the process of cleaning up the language, and would like to have the Rec case be Node<Ty> rather than an int.
At this point though, I am not sure what else could I try here. Could this be done somehow?
I think you will need to add some form of explicit "delay" to the discriminated union that represents your types. Without an explicit delay, you'll always end up fully evaluating the types and so there is no potential for closing the loop.
Something like this seems to work:
type Ty =
    | Int
    | String
    | Tuple of Ty list
    | Rec of Node<Ty>
    | Union of Ty list
    | Delayed of Lazy<Ty>
// (rest is as before)
let rec int_string_stream = Delayed(Lazy.Create(fun () ->
    Union
        [
        Tuple [Int; Rec (nodify_ty (int_string_stream))]
        Tuple [String; Rec (nodify_ty (int_string_stream))]
        ]))
This will mean that when you pattern match on Ty, you'll always need to check for Delayed, evaluate the lazy value and then pattern match again, but that's probably doable!
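For comparison, here is a minimal sketch of the same delay idea in OCaml rather than F#. It deliberately leaves out the Node/dictionary tagging, so it only illustrates how the explicit delay closes the loop and how the "check for Delayed, evaluate, match again" step can be wrapped in a helper. The names ty and force are hypothetical.

```ocaml
type ty =
  | Int
  | String
  | Tuple of ty list
  | Union of ty list
  | Delayed of ty Lazy.t

(* Unwrap Delayed layers until a concrete constructor is reached. *)
let rec force = function
  | Delayed l -> force (Lazy.force l)
  | t -> t

(* The lazy body may refer back to the value being defined. *)
let rec int_string_stream =
  Delayed
    (lazy
      (Union
         [ Tuple [ Int; int_string_stream ];
           Tuple [ String; int_string_stream ] ]))
```

Code that inspects a type would then call force first and pattern match on the result, so the Delayed case never has to be repeated in every match.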

Default recursion on recursive types

Idiomatic F# can nicely represent the classic recursive expression data structure:
type Expression =
    | Number of int
    | Add of Expression * Expression
    | Multiply of Expression * Expression
    | Variable of string
together with recursive functions thereon:
let rec simplify_add (exp: Expression): Expression =
    match exp with
    | Add (x, Number 0) -> x
    | Add (Number 0, x) -> x
    | _ -> exp
... oops, that doesn't work as written; simplify_add needs to recur into subexpressions. In this toy example that's easy enough to do, only a couple of extra lines of code, but in a real program there would be dozens of expression types; one would prefer to avoid adding dozens of lines of boilerplate to every function that operates on expressions.
Is there any way to express 'by default, recur on subexpressions'? Something like:
let rec simplify_add (exp: Expression): Expression =
    match exp with
    | Add (x, Number 0) -> x
    | Add (Number 0, x) -> x
    | _ -> recur simplify_add exp
where recur might perhaps be some sort of higher-order function that uses reflection to look up the type definition or somesuch?
Unfortunately, F# does not give you any recursive function for processing your data type "for free". You could probably generate one using reflection - this would be valid if you have a lot of recursive types, but it might not be worth it in normal situations.
There are various patterns that you can use to hide the repetition though. One that I find particularly nice is based on the ExprShape module from standard F# libraries. The idea is to define an active pattern that gives you a view of your type as either leaf (with no nested sub-expressions) or node (with a list of sub-expressions):
type ShapeInfo = Shape of Expression
// View expression as a node or leaf. The 'Shape' just stores
// the original expression to keep its original structure
let (|Leaf|Node|) e =
    match e with
    | Number n -> Leaf(Shape e)
    | Add(e1, e2) -> Node(Shape e, [e1; e2])
    | Multiply(e1, e2) -> Node(Shape e, [e1; e2])
    | Variable s -> Leaf(Shape e)
// Reconstruct an expression from shape, using new list
// of sub-expressions in the node case.
let FromLeaf(Shape e) = e
let FromNode(Shape e, args) =
    match e, args with
    | Add(_, _), [e1; e2] -> Add(e1, e2)
    | Multiply(_, _), [e1; e2] -> Multiply(e1, e2)
    | _ -> failwith "Wrong format"
This is some boilerplate code that you'd have to write. But the nice thing is that we can now write the recursive simplifyAdd function using just your special cases and two additional patterns for leaf and node:
let rec simplifyAdd exp =
    match exp with
    // Special cases for this particular function
    | Add (x, Number 0) -> x
    | Add (Number 0, x) -> x
    // This now captures all other recursive/leaf cases
    | Node (n, exps) -> FromNode(n, List.map simplifyAdd exps)
    | Leaf _ -> exp
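OCaml has no active patterns, but the same "write the traversal once" idea can be sketched there with a plain helper. In the sketch below, map_children is a hypothetical name of mine: it applies a function to the direct sub-expressions and rebuilds the node. Unlike simplifyAdd above, this variant simplifies the children first and then applies the special cases, so the rules also fire inside an x that was matched.

```ocaml
type expression =
  | Number of int
  | Add of expression * expression
  | Multiply of expression * expression
  | Variable of string

(* Written once per type: rebuild a node from transformed children. *)
let map_children f = function
  | Add (a, b) -> Add (f a, f b)
  | Multiply (a, b) -> Multiply (f a, f b)
  | (Number _ | Variable _) as leaf -> leaf

(* Bottom-up: recur into children, then apply the special cases. *)
let rec simplify_add exp =
  match map_children simplify_add exp with
  | Add (x, Number 0) | Add (Number 0, x) -> x
  | e -> e
```

Adding a new constructor then means touching map_children once, instead of every function that walks expressions.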

How to write a pattern match in Ocaml so it is easy to scale?

I am learning Jason Hickey's Introduction to Objective Caml.
There is an exercise like this:
Exercise 4.3 Suppose we have a crypto-system based on the following substitution cipher, where each plain letter is encrypted according to the following table.
Plain     | A B C D
--------------------
Encrypted | C A D B
For example, the string BAD would be encrypted as ACB.
Write a function check that, given a plaintext string s1 and a ciphertext string s2, returns true if, and only if, s2 is the ciphertext for s1. Your function should raise an exception if s1 is not a plaintext string. You may wish to refer to the string operations on page 8. How does your code scale as the alphabet gets larger? [emphasis added]
Basically, I wrote two functions with might-be-stupid-naive ways for this exercise.
I would like to ask for advice on my solutions first.
Then I would like to ask for hints for the scaled solution as highlighted in the exercise.
Using if else
let check_cipher_1 s1 s2 =
  let len1 = String.length s1 in
  let len2 = String.length s2 in
  if len1 = len2 then
    let rec check pos =
      if pos = -1 then
        true
      else
        let sub1 = s1.[pos] in
        let sub2 = s2.[pos] in
        match sub1 with
        | 'A' -> (match sub2 with
                  |'C' -> check (pos-1)
                  | _ -> false)
        | 'B' -> (match sub2 with
                  |'A' -> check (pos-1)
                  | _ -> false)
        | 'C' -> (match sub2 with
                  |'D' -> check (pos-1)
                  | _ -> false)
        | 'D' -> (match sub2 with
                  |'B' -> check (pos-1)
                  | _ -> false)
        | _ -> false;
    in
    check (len1-1)
  else
    false
Using pure match everywhere
let check_cipher_2 s1 s2 =
  let len1 = String.length s1 in
  let len2 = String.length s2 in
  match () with
  | () when len1 = len2 ->
    let rec check pos =
      match pos with
      | -1 -> true
      | _ ->
        let sub1 = s1.[pos] in
        let sub2 = s2.[pos] in
        (*http://stackoverflow.com/questions/257605/ocaml-match-expression-inside-another-one*)
        match sub1 with
        | 'A' -> (match sub2 with
                  |'C' -> check (pos-1)
                  | _ -> false)
        | 'B' -> (match sub2 with
                  |'A' -> check (pos-1)
                  | _ -> false)
        | 'C' -> (match sub2 with
                  |'D' -> check (pos-1)
                  | _ -> false)
        | 'D' -> (match sub2 with
                  |'B' -> check (pos-1)
                  | _ -> false)
        | _ -> false
    in
    check (len1-1)
  | () -> false
Ok. The above two solutions are similar.
I produced these two because here http://www.quora.com/OCaml/What-is-the-syntax-for-nested-IF-statements-in-OCaml, some people say that if else is not preferred.
This is essentially the first time I ever wrote a not-that-simple function in my whole life. So I am really hungry for suggestions here.
For example,
how can I improve these solutions?
should I prefer match over if else?
Am I designing the rec or use the rec correctly?
Is that in check (len1-1) correct?
Scale it
The exercise asks "How does your code scale as the alphabet gets larger?" I really don't have a clue for now. In Java, I would use a map: for each char in s1, I would look at the corresponding char in s2 and check whether it is the value stored in the map.
Any suggestions on this?
Here's a simple solution:
let tr = function
  | 'A' -> 'C'
  | 'B' -> 'A'
  | 'C' -> 'D'
  | 'D' -> 'B'
  | _ -> failwith "not a plaintext"
let check ~tr s1 s2 = (String.map tr s1) = s2
check ~tr "BAD" "ACB"
You can add more letters by composing with tr, e.g.:
let comp c1 c2 x = try (c1 x) with _ -> (c2 x)
let tr2 = comp tr (function | 'X' -> 'Y')
how can I improve these solutions?
You misuse indentation, which makes the program much harder to read. Eliminating unnecessary tabs and moving check to the outer scope improves readability:
let check_cipher_1 s1 s2 =
  let rec check pos =
    if pos = -1 then
      true
    else
      let sub1 = s1.[pos] in
      let sub2 = s2.[pos] in
      match sub1 with
      | 'A' -> (match sub2 with
                |'C' -> check (pos-1)
                | _ -> false)
      | 'B' -> (match sub2 with
                |'A' -> check (pos-1)
                | _ -> false)
      | 'C' -> (match sub2 with
                |'D' -> check (pos-1)
                | _ -> false)
      | 'D' -> (match sub2 with
                |'B' -> check (pos-1)
                | _ -> false)
      | _ -> false in
  let len1 = String.length s1 in
  let len2 = String.length s2 in
  if len1 = len2 then
    check (len1-1)
  else false
should I prefer match over if else?
It depends on the situation. If the pattern matching is superficial, as in your 2nd function (match () with | () when len1 = len2), then it brings no value compared to a simple if/else construct. If you pattern match on values, it is better than if/else and potentially shorter when you make use of advanced constructs. For example, you can shorten the function by matching on tuples:
let check_cipher_1 s1 s2 =
  let rec check pos =
    if pos = -1 then
      true
    else
      match s1.[pos], s2.[pos] with
      | 'A', 'C' | 'B', 'A'
      | 'C', 'D' | 'D', 'B' -> check (pos-1)
      | _ -> false in
  let len1 = String.length s1 in
  let len2 = String.length s2 in
  len1 = len2 && check (len1 - 1)
Here we also use an or-pattern to group patterns that share the same action, and replace an unnecessary if/else block with &&.
Am I designing the rec or use the rec correctly?
Is that in check (len1-1) correct?
Your function looks nice. There's no better way to find out than testing it with a few inputs in the OCaml toplevel.
Scale it
The number of patterns grows linearly with the size of the alphabet. It's pretty nice IMO.
The simplest solution seems to be to just cipher the text and compare the result:
let cipher_char = function
  | 'A' -> 'C'
  | 'B' -> 'A'
  | 'C' -> 'D'
  | 'D' -> 'B'
  | _ -> failwith "cipher_char"
let cipher = String.map cipher_char
let check_cipher s1 s2 = (cipher s1 = s2)
The cipher_char function scales linearly with the size of the alphabet. To make it a bit more compact and generic you could use a lookup table of some form, e.g.
(* Assume that only letters are needed *)
let cipher_mapping = "CADB"
let cipher_char c =
  try cipher_mapping.[Char.code c - Char.code 'A']
  with Invalid_argument _ -> failwith "cipher_char"
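If the plain alphabet is not a contiguous run of letters, a hash table is one possible form of lookup table, close to the "map" idea from the question. This is only a sketch under my own assumptions: make_cipher is a hypothetical helper that takes the plain and encrypted alphabets as two strings of equal length, and a character outside the plain alphabet raises Invalid_argument.

```ocaml
(* Build a char -> char cipher from two equally long alphabets. *)
let make_cipher plain encrypted =
  let table = Hashtbl.create (String.length plain) in
  String.iteri (fun i c -> Hashtbl.replace table c encrypted.[i]) plain;
  fun c ->
    match Hashtbl.find_opt table c with
    | Some e -> e
    | None -> invalid_arg "not a plaintext character"

(* The table is built once and reused for every call. *)
let cipher_char = make_cipher "ABCD" "CADB"

let check s1 s2 = String.map cipher_char s1 = s2
```

Growing the alphabet then only means extending the two strings passed to make_cipher; the checking code stays the same.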
