Invalid_argument "String.sub / Bytes.sub" - recursion

I have a small problem with an exercise I'm doing.
I am trying to recursively count the vowels in a string, but I get this strange error.
Can someone explain why?
let rec nb_voyelle = function chaine ->
  if chaine == "" then
    0
  else
    let length = (String.length chaine)-1 in
    let p_length = String.sub chaine 0 length in
    match chaine.[length] with
    | 'a' | 'e' | 'i' | 'o' | 'u' | 'y' -> 1 + nb_voyelle p_length
    | _ -> 0 + nb_voyelle p_length
;;

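The answer is that I used "==" to compare the strings in my base case, which is not the right operator for testing equality between two values: "==" is physical equality, whereas "=" is structural equality.
As a result, the empty string returned by String.sub does not match the base case, the call String.sub "" 0 (-1) eventually happens, and the function fails by raising this error.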
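A minimal sketch of the fix, assuming the only change needed is swapping physical equality (==) for structural equality (=). The function name is kept from the question; the local names last and rest are my own.

```ocaml
(* Count vowels recursively, peeling off the last character each time. *)
let rec nb_voyelle chaine =
  (* Structural equality: true for any empty string, whereas == only
     checks whether both sides are the very same object in memory. *)
  if chaine = "" then 0
  else
    let last = String.length chaine - 1 in
    let rest = String.sub chaine 0 last in
    match chaine.[last] with
    | 'a' | 'e' | 'i' | 'o' | 'u' | 'y' -> 1 + nb_voyelle rest
    | _ -> nb_voyelle rest

let () = Printf.printf "%d\n" (nb_voyelle "bonjour")  (* prints 3 *)
```

With this version the recursion stops at the empty string, so String.sub is never called with a negative length.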

Related

Comparison in pattern matching in OCaml

I want to write a function set which changes the index i in the 'a array a to the value 'a v and raises an Invalid_argument exception if i is bigger than the length-1 of the array.
I know that this can be done with if/then/else:
let set i v a =
  let l = Array.length a in
  if i > (l-1) then
    raise (Invalid_argument "index out of bounds")
  else
    a.(i) <- v
However, I want to know if this can be achieved in a pure functional approach, using pattern matching and the OCaml standard library. I don't know how to compare values inside the pattern matching; I get an error at the marked line:
let set i v a =
let l = Array.length a in
match i with
>>>>>> | > l-1 -> raise (Invalid_argument "index out of bounds")
| _ -> a.(i) <- v
Is there a workaround to achieve this? perhaps with a helper function?
An if expression is a pure functional approach, and is also the right approach. In general, pattern matching has the purpose of deconstructing values; it's not an alternative to an if.
However, it's still possible to do this with pattern matching:
let set i v a =
  let l = Array.length a in
  match compare l i with
  | 1 -> a.(i) <- v
  | _ -> raise @@ Invalid_argument "index out of bounds"
EDIT: Apparently, compare can return other values than -1, 0 and 1 so this version is not reliable (but you wouldn't use it anyway, would you?)...
Or, more efficiently
let set i v a =
  let l = Array.length a in
  match l > i with
  | true -> a.(i) <- v
  | false -> raise @@ Invalid_argument "index out of bounds"
But then you realize that matching over a boolean is just an if. Which is why the correct version is still
let set i v a =
  let l = Array.length a in
  if l > i then a.(i) <- v
  else raise @@ Invalid_argument "index out of bounds"
BlackBeans' answer is correct. But also know that pattern-matching in OCaml can take advantage of conditional guards when you want to place a conditional on a pattern.
Consider the following simple example.
type species = Dog | Cat
type weight = int
type pet = Pet of species * weight
let sound = function
  | Pet (Dog, weight) when weight < 10 -> "Yip!"
  | Pet (Dog, _) -> "Woof!"
  | Pet (Cat, weight) when weight > 100 -> "ROAR!!!"
  | _ -> "Meow!"
The patterns Pet (Dog, weight) and Pet (Dog, _) would otherwise match the same values (with the latter not binding a name to the weight).
An equivalent with if/else would look like:
let sound = function
  | Pet (Dog, weight) ->
    if weight < 10 then "Yip!"
    else "Woof!"
  | Pet (Cat, weight) ->
    if weight > 100 then "ROAR!!!"
    else "Meow!"
Which one you prefer largely boils down to opinion, and to which form you find more expressive.
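Applied to the original set question, a guard could look like the sketch below. This is only an illustration: the extra check for negative indices is my own assumption and is not part of the question's code.

```ocaml
(* Hypothetical rewrite of the question's [set] using a guard.
   The [i < 0] test is an added assumption; the question only
   checked the upper bound. *)
let set i v a =
  match i with
  | _ when i < 0 || i >= Array.length a ->
      raise (Invalid_argument "index out of bounds")
  | _ -> a.(i) <- v

let () =
  let a = [| 1; 2; 3 |] in
  set 1 9 a;
  Printf.printf "%d\n" a.(1)  (* prints 9 *)
```

Since the pattern itself is just a wildcard, this is still an if in disguise, which is the point the first answer makes.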

Nested functions in OCaml and their parameters

I have a problem with how nested functions should be implemented in OCaml. I need the output (a list) of one function to be the input of another, and both should be recursive. The problem is that I've played around with the parameters and they aren't being passed properly:
let toComb sentence =
  let rec listCleanup sentence =
    match sentence with
    | [] -> []
    | h::t when h = "" -> listCleanup t
    | h::t -> h::listCleanup t
  in
  let rec toString listCleanup sentence =
    match listCleanup sentence with
    | [] -> ""
    | [element] -> element
    | h::t -> h ^ " " ^ toString listCleanup sentence
  in
  toString listCleanup sentence;;
If I use the function and its parameter as a parameter, there's a stack overflow, but if I use just the function without a parameter, I get a mismatch of parameters. What should be the fix here?
To correct your code, here is what would work properly:
let to_comb sentence =
  let rec cleanup s = match s with
    | [] -> []
    | ""::tail -> cleanup tail
    | hd::tail -> hd::cleanup tail in
  let rec to_string s = match s with
    | [] -> ""
    | [x] -> x
    | hd::tail -> hd ^ " " ^ to_string tail in
  to_string (cleanup sentence)
Note that I call cleanup only once, because you only ever need to clean the whole sequence once. However, it turns out that both of these functions can be expressed more simply with predefined OCaml functions:
let to_comb sentence =
sentence
|> List.filter (fun s -> s <> "")
|> String.concat " "
You could almost read this code out loud to get a description of what it does. It starts with a sentence, filters the empty words in it, then concatenates them with spaces in between.
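As a quick sanity check, here is a sketch that runs the explicit recursion and the pipeline side by side. The name to_comb_rec and the sample input are my own, chosen only for illustration.

```ocaml
(* Explicit recursion: clean the list once, then join the words. *)
let to_comb_rec sentence =
  let rec cleanup = function
    | [] -> []
    | "" :: tail -> cleanup tail
    | hd :: tail -> hd :: cleanup tail
  in
  let rec to_string = function
    | [] -> ""
    | [x] -> x
    | hd :: tail -> hd ^ " " ^ to_string tail
  in
  to_string (cleanup sentence)

(* Pipeline version built from standard library functions. *)
let to_comb sentence =
  sentence
  |> List.filter (fun s -> s <> "")
  |> String.concat " "

let () =
  let words = [ "the"; ""; "quick"; ""; "fox" ] in  (* sample input *)
  print_endline (to_comb_rec words);  (* prints: the quick fox *)
  print_endline (to_comb words)       (* prints: the quick fox *)
```

In the question's version, toString called itself with the same, unshrunk sentence on every step, which is why it never terminated.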

fsharp function doesn't return anything

I have the following smaller tokenizer for simple arithmetic expressions. I am new to fsharp and I don't know why this function doesn't return anything when being called. Can someone please help?
let tokenizer s =
    let chars1 = scan s
    let rec repeat list =
        match list with
        | []->[]
        | char::chars ->
            match char with
            | ')' -> RP::repeat chars
            | '(' -> LP::repeat chars
            | '+' -> Plus::repeat chars
            | '*' -> Times::repeat chars
            | '^' -> Pow::repeat chars
            | _ ->
                let (x,y) = makeInt (toInt char) chars
                Int x::repeat chars
    repeat chars1
The implementation of scan, toInt, makeInt and the union type for the expression was not presented, but might be inferred as:
let scan (s:string) = s.ToCharArray() |> Array.toList
let toInt c = int c - int '0'
let makeInt n chars = (n,chars)
type expr = RP | LP | Plus | Times | Pow | Int of int
let tokenizer s =
    let chars1 = scan s
    let rec repeat list =
        match list with
        | []->[]
        | char::chars ->
            match char with
            | ')' -> RP::repeat chars
            | '(' -> LP::repeat chars
            | '+' -> Plus::repeat chars
            | '*' -> Times::repeat chars
            | '^' -> Pow::repeat chars
            | _ ->
                let (x,y) = makeInt (toInt char) chars
                Int x::repeat chars
    repeat chars1
in which case:
tokenizer "1+1"
gives:
val it : expr list = [Int 1; Plus; Int 1]
It's possible the issue is in the implementation of your scan function.

OCaml syntax error on filter

I just started OCaml (and functional programming) today and I'm trying to write a function that counts the number of occurrences of "value" in an array (tab).
I tried:
let rec count_occ tab value =
let rec count_rec idx time = function
| tab.length - 1 -> time
| _ when tab.(indice) == value-> count_rec (idx + 1) (time + 1)
| _ -> count_rec (indice + 1) time
in
count_rec 0 0
;;
Unfortunately, it doesn't compile because of a syntax error, and I can't find the solution.
let rec count_occ tab value =
This rec above is not necessary.
let rec count_rec idx time = function
| tab.length - 1 -> time
You cannot match against an expression. You want to use guards like you did on the next line, or if statements to test something like this. tab.length also does not exist as tab is an array, not a record with a length field. You want Array.length tab.
Really though, you don't want the function at all. function is the same as fun x -> match x with, so it would give count_rec an extra third argument and a type like int -> int -> int -> int.
| _ when tab.(indice) == value-> count_rec (idx + 1) (time + 1)
indice is not declared; let's assume you meant idx. Also, == is physical equality; you really want =.
| _ -> count_rec (indice + 1) time
in
count_rec 0 0
You're off to a good start. The basics of your recursion are correct, although one edge case is wrong; that is a minor issue you should be able to resolve once you have fixed the syntactic issues.
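To make the remark about function concrete, here is a small sketch. The names is_zero_a, is_zero_b and add_if_zero are hypothetical and exist only to illustrate the point.

```ocaml
(* These two definitions are equivalent: [function] introduces an
   anonymous argument and matches on it immediately. *)
let is_zero_a = function
  | 0 -> true
  | _ -> false

let is_zero_b = fun x ->
  match x with
  | 0 -> true
  | _ -> false

(* Writing [function] after named parameters therefore adds one more
   argument: [add_if_zero] takes three arguments, not two. *)
let add_if_zero a b = function
  | 0 -> a + b
  | _ -> 0

let () = Printf.printf "%d\n" (add_if_zero 1 2 0)  (* prints 3 *)
```

This is why count_rec 0 0 in the question would have been a partial application waiting for a third argument rather than a result.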
Finally, here is my final code:
let count_occ tab value =
  let rec count_rec idx time =
    if (Array.length tab) = idx then
      time
    else if (tab.(idx)) = value then
      count_rec (idx + 1) (time + 1)
    else
      count_rec (idx + 1) time
  in
  count_rec 0 0
;;
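For comparison, the same count can be written without explicit index handling. This is a sketch using Array.fold_left from the standard library, not something the original poster wrote.

```ocaml
(* Count how many elements of [tab] are structurally equal to [value].
   The accumulator [n] starts at 0 and is incremented on each match. *)
let count_occ tab value =
  Array.fold_left (fun n x -> if x = value then n + 1 else n) 0 tab

let () = Printf.printf "%d\n" (count_occ [| 1; 2; 1; 3; 1 |] 1)  (* prints 3 *)
```

The fold visits every element exactly once, so there is no boundary condition on the index to get wrong.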

How to write a pattern match in Ocaml so it is easy to scale?

I am learning Jason Hickey's Introduction to Objective Caml.
There is an exercise like this:
Exercise 4.3 Suppose we have a crypto-system based on the following substitution cipher, where each plain letter is encrypted according to the following table.
Plain     | A B C D
--------------------
Encrypted | C A D B
For example, the string BAD would be encrypted as ACB.
Write a function check that, given a plaintext string s1 and a ciphertext string s2, returns true if, and only if, s2 is the ciphertext for s1. Your function should raise an exception if s1 is not a plaintext string. You may wish to refer to the string operations on page 8. How does your code scale as the alphabet gets larger? [emphasis added]
Basically, I wrote two functions for this exercise, in ways that may well be naive.
I would like to ask for advice on my solutions first.
Then I would like to ask for hints for the scaled solution as highlighted in the exercise.
Using if else
let check_cipher_1 s1 s2 =
let len1 = String.length s1 in
let len2 = String.length s2 in
if len1 = len2 then
let rec check pos =
if pos = -1 then
true
else
let sub1 = s1.[pos] in
let sub2 = s2.[pos] in
match sub1 with
| 'A' -> (match sub2 with
|'C' -> check (pos-1)
| _ -> false)
| 'B' -> (match sub2 with
|'A' -> check (pos-1)
| _ -> false)
| 'C' -> (match sub2 with
|'D' -> check (pos-1)
| _ -> false)
| 'D' -> (match sub2 with
|'B' -> check (pos-1)
| _ -> false)
| _ -> false;
in
check (len1-1)
else
false
Using pure match everywhere
let check_cipher_2 s1 s2 =
let len1 = String.length s1 in
let len2 = String.length s2 in
match () with
| () when len1 = len2 ->
let rec check pos =
match pos with
| -1 -> true
| _ ->
let sub1 = s1.[pos] in
let sub2 = s2.[pos] in
(*http://stackoverflow.com/questions/257605/ocaml-match-expression-inside-another-one*)
match sub1 with
| 'A' -> (match sub2 with
|'C' -> check (pos-1)
| _ -> false)
| 'B' -> (match sub2 with
|'A' -> check (pos-1)
| _ -> false)
| 'C' -> (match sub2 with
|'D' -> check (pos-1)
| _ -> false)
| 'D' -> (match sub2 with
|'B' -> check (pos-1)
| _ -> false)
| _ -> false
in
check (len1-1)
| () -> false
Ok. The above two solutions are similar.
I produced these two because, at http://www.quora.com/OCaml/What-is-the-syntax-for-nested-IF-statements-in-OCaml, some people say that if/else is not preferred.
This is essentially the first time I ever wrote a not-that-simple function in my whole life. So I am really hungry for suggestions here.
For example:
How can I improve these solutions?
Should I prefer match over if/else?
Am I designing and using the recursion correctly?
Is that check (len1-1) correct?
Scale it
The exercise asks "How does your code scale as the alphabet gets larger?" I really don't have a clue for now. In Java, I would use a map: for each char in s1, I would look at the corresponding char in s2 and check whether it is the value stored in the map.
Any suggestions on this?
Here's a simple solution:
let tr = function
| 'A' -> 'C'
| 'B' -> 'A'
| 'C' -> 'D'
| 'D' -> 'B'
| _ -> failwith "not a plaintext"
let check ~tr s1 s2 = (String.map tr s1) = s2
check ~tr "BAD" "ACB"
You can add more letters by composing with tr, i.e.:
let comp c1 c2 x = try (c1 x) with _ -> (c2 x)
let tr2 = comp tr (function | 'X' -> 'Y')
how can I improve these solutions?
Your indentation makes the program much harder to read. Here is a version that eliminates the unnecessary tabs and moves check to the outer scope for readability:
let check_cipher_1 s1 s2 =
  let rec check pos =
    if pos = -1 then
      true
    else
      let sub1 = s1.[pos] in
      let sub2 = s2.[pos] in
      match sub1 with
      | 'A' -> (match sub2 with
                |'C' -> check (pos-1)
                | _ -> false)
      | 'B' -> (match sub2 with
                |'A' -> check (pos-1)
                | _ -> false)
      | 'C' -> (match sub2 with
                |'D' -> check (pos-1)
                | _ -> false)
      | 'D' -> (match sub2 with
                |'B' -> check (pos-1)
                | _ -> false)
      | _ -> false in
  let len1 = String.length s1 in
  let len2 = String.length s2 in
  if len1 = len2 then
    check (len1-1)
  else false
should I prefer match over if else?
It depends on the situation. If the pattern matching is superficial, as in your second function (match () with | () when len1 = len2), then it brings no value compared to a simple if/else construct. If you pattern match on values, it is better than if/else and potentially shorter when you make use of advanced constructs. For example, you can shorten the function by matching on tuples:
let check_cipher_1 s1 s2 =
  let rec check pos =
    if pos = -1 then
      true
    else
      match s1.[pos], s2.[pos] with
      | 'A', 'C' | 'B', 'A'
      | 'C', 'D' | 'D', 'B' -> check (pos-1)
      | _ -> false in
  let len1 = String.length s1 in
  let len2 = String.length s2 in
  len1 = len2 && check (len1 - 1)
Here we also use or-patterns to group patterns that share the same action, and replace an unnecessary if/else block with &&.
Am I designing and using the recursion correctly?
Is that check (len1-1) correct?
Your function looks nice. There's no better way to check than testing it with a few inputs in the OCaml top-level.
Scale it
The number of patterns grows linearly with the size of the alphabet. It's pretty nice IMO.
The simplest solution seems to be to just cipher the text and compare the result:
let cipher_char = function
  | 'A' -> 'C'
  | 'B' -> 'A'
  | 'C' -> 'D'
  | 'D' -> 'B'
  | _ -> failwith "cipher_char"
let cipher = String.map cipher_char
let check_cipher s1 s2 = (cipher s1 = s2)
The cipher_char function scales linearly with the size of the alphabet. To make it a bit more compact and generic you could use a lookup table of some form, e.g.
(* Assume that only letters are needed *)
let cipher_mapping = "CADB"
let cipher_char c =
  try cipher_mapping.[Char.code c - Char.code 'A']
  with Invalid_argument _ -> failwith "cipher_char"
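If the plain alphabet is not a contiguous range starting at 'A', a hash table is another form of lookup table, close to the Java map mentioned in the question. The sketch below is only one possible design; make_table and the error messages are hypothetical names of my own.

```ocaml
(* Build a lookup table from two equal-length strings: the i-th plain
   letter maps to the i-th encrypted letter. *)
let make_table plain encrypted =
  if String.length plain <> String.length encrypted then
    invalid_arg "make_table: alphabets differ in length";
  let table = Hashtbl.create (String.length plain) in
  String.iteri (fun i c -> Hashtbl.replace table c encrypted.[i]) plain;
  table

(* Encrypt one character, failing on anything outside the alphabet. *)
let cipher_char table c =
  match Hashtbl.find_opt table c with
  | Some e -> e
  | None -> failwith "cipher_char: not a plaintext letter"

(* s2 is the ciphertext of s1 iff encrypting s1 yields s2. *)
let check_cipher table s1 s2 =
  String.map (cipher_char table) s1 = s2

let table = make_table "ABCD" "CADB"

let () = Printf.printf "%b\n" (check_cipher table "BAD" "ACB")  (* prints true *)
```

Growing the alphabet then only means passing longer strings to make_table; the checking code itself does not change.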

Resources