Why is my Nearley grammar causing a loop? - nearley

I am playing around with nearley.js and something is confusing me.
As a test I am trying to build a parser parsing poker ranks.
Now, this grammar is working as expected:
#{% function nuller() { return null; } %}
main -> _ composition _ {% nuller %}
composition -> _ expression _ {% nuller %}
| composition _ "," _ rank {% nuller %}
expression -> _ rank _ {% nuller %}
rank -> [a, k, q, j, t, A, K, Q, J, T, 2-9] {% nuller %}
_ -> [\s]:* {% nuller %}
However, the second I change | composition _ "," _ rank to | composition _ "," _ expression then I end up with a loop:
#{% function nuller() { return null; } %}
main -> _ composition _ {% nuller %}
composition -> _ expression _ {% nuller %}
| composition _ "," _ expression {% nuller %}
expression -> _ rank _ {% nuller %}
rank -> [a, k, q, j, t, A, K, Q, J, T, 2-9] {% nuller %}
_ -> [\s]:* {% nuller %}
Can somebody explain to me why that is?
Code can quickly be tested at the playground:
https://omrelli.ug/nearley-playground/
The test string I use is: a, k, q, j, t, 9, 8, 7, 6, 5, 4, 3, 2
Thank you very much in advance!

Well, there are multiple possible interpretations for each of the values in your string. This is because you have optional whitespace both before your expression and before your rank.
If you look at the second element k, this can be interpreted as:
space + expression
space + rank
Since you have 2 possibilities for each of 12 items in your string, and the choices are independent of each other, the counts multiply: you will get 2^12 = 4096 possible combinations.
This returns one solution (I have left out the optional whitespace before expression):
main -> _ composition _ {% nuller %}
composition -> _ expression _ {% nuller %}
| composition _ "," expression {% nuller %}
expression -> _ rank _ {% nuller %}
rank -> [a, k, q, j, t, A, K, Q, J, T, 2-9] {% nuller %}
_ -> [\s]:* {% nuller %}

Related

Trying to use match statements in OCaml to write a function that checks if an element is in a list

I am new to OCaml and struggling to work with matches. I want to write a function that takes a list and a value and then returns true if the value is in that list and false if it is not. Here is my idea but I am struggling to get it to work.
let rec contains xs x =
match xs with
| [] -> false
| z :: zs ->
match x with
| z -> true
| _ -> contains zs x
When you use an identifier as a pattern, you will bind the value you match on to that identifier. I.e
match x with
| z -> true
will bind the value of x to the name z. You will also get a warning about z and the _ branch being unused.
You also don't need a second pattern match since it can be folded into the first:
let rec contains xs x =
match xs with
| [] -> false
| z :: _ when z = x -> true
| _ :: zs -> contains zs x
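A quick way to convince yourself that the folded version behaves as intended is to run it on a few inputs. The sketch below repeats the definition so that it is self-contained; the comparison with List.mem (which takes the element first, then the list) is only there as a cross-check against the standard library.

```ocaml
(* Folded version: the guard compares the head of the list with x. *)
let rec contains xs x =
  match xs with
  | [] -> false
  | z :: _ when z = x -> true
  | _ :: zs -> contains zs x

let () =
  assert (contains [1; 2; 3] 2);
  assert (not (contains [1; 2; 3] 5));
  assert (not (contains [] 1));
  (* Cross-check against the standard library. *)
  assert (List.mem 2 [1; 2; 3] = contains [1; 2; 3] 2)
```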

Trouble with Coq type classes when defining recursive dependent fields

I am trying to define a set of descriptions and their interpretation into Coq Types, and this is what I came up with so far:
Class Desc (D : Type) :=
{ denotesInto : D -> Type
; denote : forall (d : D), denotesInto d
}.
Notation "⟦ d ⟧" := (denote d).
Inductive TypeD : Type :=
| ArrowD : TypeD -> TypeD -> TypeD
| ListD : TypeD -> TypeD
| NatD : TypeD
.
Global Instance Desc_TypeD : Desc TypeD :=
{ denotesInto := fun _ => Type
; denote := fix go d :=
match d with
| ArrowD dL dR => (go dL) -> (go dR)
| ListD dT => list (go dT)
| NatD => nat
end
}.
When declaring the Desc_TypeD instance, I initially wanted to define it as:
Global Instance Desc_TypeD : Desc TypeD :=
{ denotesInto := fun _ => Type
; denote :=
match d with
| ArrowD dL dR => ⟦ dL ⟧ -> ⟦ dR ⟧
| ListD dT => list ⟦ dT ⟧
| NatD => nat
end
}.
But Coq would not let me. It seems to me that it tries to resolve those calls to denote as some other instance that it can't find, while they were really meant to be recursive calls to the instance being defined.
Is there any convincing that will let me write this instance without the explicit fix?
Thanks!
It is hard to know why the fix bothers you without knowing more about your context. One way to get something close to what you want is to open up your TypeD; however, this will surely have other downsides:
Class Desc (D : Type) :=
{ denote : forall (d : D), Type }.
Notation "⟦ d ⟧" := (denote d).
Inductive TypeD (D : Type) : Type :=
| ArrowD : D -> D -> TypeD D
| ListD : D -> TypeD D
| NatD : TypeD D
.
Global Instance Desc_TypeD D `{DI : Desc D} : Desc (TypeD D) :=
{ denote := fun d =>
match d with
| ArrowD _ dL dR => ⟦dL⟧ -> ⟦dR⟧
| ListD _ dT => list ⟦dT⟧
| NatD _ => nat
end
}.
Note that we also need to make denote's type more general as we cannot get enough information about the parameter D.

How to shorten this OCaml code?

I am wondering how to shorten this code, as I suspect it is too redundant:
let get ename doc =
try Some (StringMap.find ename doc) with Not_found -> None;;
let get_double ename doc =
let element = get ename doc in
match element with
| None -> None
| Some (Double v) -> Some v
| _ -> raise Wrong_bson_type;;
let get_string ename doc =
let element = get ename doc in
match element with
| None -> None
| Some (String v) -> Some v
| _ -> raise Wrong_bson_type;;
let get_doc ename doc =
let element = get ename doc in
match element with
| None -> None
| Some (Document v) -> Some v
| _ -> raise Wrong_bson_type;;
So, basically, I have different types of values, and I put all those kinds of values into a map.
The code above gets a value of a given type out of the map. For each type, I have a separate get. To get a value of one type, I have to check a) whether it is there at all, and b) whether it really has that type; if not, I raise an exception.
But the code above seems too redundant, as you can see. The only difference between the get functions is the type itself.
How can I shorten this code?
You can do this:
let get_generic extract ename doc =
let element = get ename doc in
match element with
| None -> None
| Some v -> Some (extract v)
let get_double = get_generic (function Double v -> v | _ -> raise Wrong_bson_type)
let get_string = get_generic (function String v -> v | _ -> raise Wrong_bson_type)
let get_doc = get_generic (function Document v -> v | _ -> raise Wrong_bson_type)
EDIT:
To remove the redundant raise Wrong_bson_type (But it is ugly):
let get_generic extract ename doc = try
let element = get ename doc in
match element with
| None -> None
| Some v -> Some (extract v)
with Match_failure _ -> raise Wrong_bson_type
let get_double = get_generic (fun (Double v) -> v)
let get_string = get_generic (fun (String v) -> v)
let get_doc = get_generic (fun (Document v)-> v)
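For what it's worth, on OCaml 4.08 or later the None/Some plumbing in get_generic can be delegated to Option.map, and the try around the lookup to find_opt. The sketch below is self-contained, so the element type and the exception are hypothetical stand-ins for the question's definitions, which are not shown.

```ocaml
module StringMap = Map.Make (String)

(* Hypothetical stand-ins for the question's types. *)
type element =
  | Double of float
  | String of string
  | Document of element StringMap.t

exception Wrong_bson_type

(* find_opt returns None for a missing key; Option.map applies the
   extractor only when a value is present. *)
let get_generic extract ename doc =
  Option.map extract (StringMap.find_opt ename doc)

let get_double = get_generic (function Double v -> v | _ -> raise Wrong_bson_type)
let get_string = get_generic (function String v -> v | _ -> raise Wrong_bson_type)
let get_doc = get_generic (function Document v -> v | _ -> raise Wrong_bson_type)
```

A missing key still yields None, and a value of the wrong type still raises Wrong_bson_type, as in the original functions.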
You can use GADT to do that:
If you define a type expr like this:
type _ expr =
| Document: document -> document expr
| String: string -> string expr
| Double: float -> float expr
You can write a function get like this:
let get : type v. v expr -> v = function
Document doc -> doc
| String s -> s
| Double d -> d
With GADTs:
type _ asked =
| TDouble : float asked
| TString : string asked
| TDocument : document asked
let get : type v. v asked -> string -> doc StringMap.t -> v option =
fun asked ename doc ->
try
Some (match asked, StringMap.find ename doc with
| TDouble, Double f -> f
| TString, String s -> s
| TDocument, Document d -> d)
with Not_found -> None
let get_double = get TDouble
let get_string = get TString
let get_document = get TDocument
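Here is a self-contained sketch of the same idea that compiles on its own. The element type and the Wrong_bson_type exception are assumptions standing in for the question's definitions, and the final catch-all case is an addition so that a value of the wrong type raises Wrong_bson_type instead of Match_failure.

```ocaml
module StringMap = Map.Make (String)

(* Hypothetical stand-ins for the question's types. *)
type element =
  | Double of float
  | String of string
  | Document of element StringMap.t

exception Wrong_bson_type

(* Each witness carries the result type of the lookup it asks for. *)
type _ asked =
  | TDouble : float asked
  | TString : string asked
  | TDocument : element StringMap.t asked

let get : type v. v asked -> string -> element StringMap.t -> v option =
 fun asked ename doc ->
  match StringMap.find_opt ename doc with
  | None -> None
  | Some elt -> (
      match asked, elt with
      | TDouble, Double f -> Some f
      | TString, String s -> Some s
      | TDocument, Document d -> Some d
      | _ -> raise Wrong_bson_type)

let get_double = get TDouble
let get_string = get TString
let get_document = get TDocument
```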
If you can live with these extractor functions:
let extract_double = function
| Double v -> v
| _ -> raise Wrong_bson_type
let extract_string = function
| String v -> v
| _ -> raise Wrong_bson_type
let extract_doc = function
| Document v -> v
| _ -> raise Wrong_bson_type
Then you can use monadic style for the higher-order function, which allows you to keep your original definition of get:
let return x = Some x
let (>>=) mx f =
match mx with
| Some x -> f x
| None -> None
let get_with exf ename doc =
(get ename doc) >>= fun v ->
return (exf v)
let get_double = get_with extract_double
let get_string = get_with extract_string
let get_doc = get_with extract_doc
Less redundant and abstracts the side effect to generic bind and return operations.

How to write a pattern match in Ocaml so it is easy to scale?

I am learning Jason Hickey's Introduction to Objective Caml.
There is an exercise like this:
Exercise 4.3 Suppose we have a crypto-system based on the following substitution cipher, where each plain letter is encrypted according to the following table.
Plain | A B C D
--------------------
Encrypted | C A D B
For example, the string BAD would be encrypted as ACB.
Write a function check that, given a plaintext string s1 and a ciphertext string s2, returns true if, and only if, s2 is the ciphertext for s1. Your function should raise an exception if s1 is not a plaintext string. You may wish to refer to the string operations on page 8. How does your code scale as the alphabet gets larger? [emphasis added]
Basically, I wrote two functions for this exercise, in ways that might be naive.
I would like to ask for advice on my solutions first.
Then I would like to ask for hints for the scaled solution as highlighted in the exercise.
Using if else
let check_cipher_1 s1 s2 =
let len1 = String.length s1 in
let len2 = String.length s2 in
if len1 = len2 then
let rec check pos =
if pos = -1 then
true
else
let sub1 = s1.[pos] in
let sub2 = s2.[pos] in
match sub1 with
| 'A' -> (match sub2 with
|'C' -> check (pos-1)
| _ -> false)
| 'B' -> (match sub2 with
|'A' -> check (pos-1)
| _ -> false)
| 'C' -> (match sub2 with
|'D' -> check (pos-1)
| _ -> false)
| 'D' -> (match sub2 with
|'B' -> check (pos-1)
| _ -> false)
| _ -> false;
in
check (len1-1)
else
false
Using pure match everywhere
let check_cipher_2 s1 s2 =
let len1 = String.length s1 in
let len2 = String.length s2 in
match () with
| () when len1 = len2 ->
let rec check pos =
match pos with
| -1 -> true
| _ ->
let sub1 = s1.[pos] in
let sub2 = s2.[pos] in
(*http://stackoverflow.com/questions/257605/ocaml-match-expression-inside-another-one*)
match sub1 with
| 'A' -> (match sub2 with
|'C' -> check (pos-1)
| _ -> false)
| 'B' -> (match sub2 with
|'A' -> check (pos-1)
| _ -> false)
| 'C' -> (match sub2 with
|'D' -> check (pos-1)
| _ -> false)
| 'D' -> (match sub2 with
|'B' -> check (pos-1)
| _ -> false)
| _ -> false
in
check (len1-1)
| () -> false
Ok. The above two solutions are similar.
I wrote both because at http://www.quora.com/OCaml/What-is-the-syntax-for-nested-IF-statements-in-OCaml, some people say that if else is not preferred.
This is essentially the first time I ever wrote a not-that-simple function in my whole life. So I am really hungry for suggestions here.
For example:
How can I improve these solutions?
Should I prefer match over if else?
Am I designing and using the rec correctly?
Is that in check (len1-1) correct?
Scale it
The exercise asks: How does your code scale as the alphabet gets larger? I really don't have a clue for now. In Java, I would use a map: for each char in s1, I would look at the corresponding char in s2 and check whether it is the value stored in the map for that char.
Any suggestions on this?
Here's a simple solution:
let tr = function
| 'A' -> 'C'
| 'B' -> 'A'
| 'C' -> 'D'
| 'D' -> 'B'
| _ -> failwith "not a plaintext"
let check ~tr s1 s2 = (String.map tr s1) = s2
check ~tr "BAD" "ACB"
You can add more letters by composing with tr, i.e.
let comp c1 c2 x = try (c1 x) with _ -> (c2 x)
let tr2 = comp tr (function | 'X' -> 'Y')
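To see the pieces working together, here is a small self-contained sketch. It repeats the definitions above; the only change is an explicit failure case in the second table, which is my addition so that the compiler does not warn about a non-exhaustive match.

```ocaml
let tr = function
  | 'A' -> 'C'
  | 'B' -> 'A'
  | 'C' -> 'D'
  | 'D' -> 'B'
  | _ -> failwith "not a plaintext"

let check ~tr s1 s2 = String.map tr s1 = s2

(* Try c1 first and fall back to c2 if c1 raises any exception. *)
let comp c1 c2 x = try c1 x with _ -> c2 x

(* Extended table: everything tr knows, plus 'X' -> 'Y'. *)
let tr2 = comp tr (function 'X' -> 'Y' | _ -> failwith "not a plaintext")

let () =
  assert (check ~tr "BAD" "ACB");
  assert (not (check ~tr "BAD" "ACD"));
  assert (check ~tr:tr2 "BAX" "ACY")
```

Note that comp catches every exception, so a chain of n tables costs up to n failed lookups per character; a lookup table scales better for large alphabets.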
how can I improve these solutions?
You misuse indentation, which makes the program much harder to read. Eliminating unnecessary tabs and moving check to the outer scope improves readability:
let check_cipher_1 s1 s2 =
let rec check pos =
if pos = -1 then
true
else
let sub1 = s1.[pos] in
let sub2 = s2.[pos] in
match sub1 with
| 'A' -> (match sub2 with
|'C' -> check (pos-1)
| _ -> false)
| 'B' -> (match sub2 with
|'A' -> check (pos-1)
| _ -> false)
| 'C' -> (match sub2 with
|'D' -> check (pos-1)
| _ -> false)
| 'D' -> (match sub2 with
|'B' -> check (pos-1)
| _ -> false)
| _ -> false in
let len1 = String.length s1 in
let len2 = String.length s2 in
if len1 = len2 then
check (len1-1)
else false
should I prefer match over if else?
It depends on the situation. If the pattern matching is superficial, as in your 2nd function (match () with | () when len1 = len2), then it brings no value compared to a simple if/else construct. If you pattern match on values, it is better than if/else and potentially shorter when you make use of advanced constructs. For example, you can shorten the function by matching on tuples:
let check_cipher_1 s1 s2 =
let rec check pos =
if pos = -1 then
true
else
match s1.[pos], s2.[pos] with
| 'A', 'C' | 'B', 'A'
| 'C', 'D' | 'D', 'B' -> check (pos-1)
| _ -> false in
let len1 = String.length s1 in
let len2 = String.length s2 in
len1 = len2 && check (len1 - 1)
Here we also use Or pattern to group patterns having the same output actions and replace an unnecessary if/else block by &&.
Am I designing and using the rec correctly?
Is that in check (len1-1) correct?
Your function looks nice. There's no better way than testing with a few inputs on OCaml top-level.
Scale it
The number of patterns grows linearly with the size of the alphabet. It's pretty nice IMO.
The simplest solution seems to be to just cipher the text and compare the result:
let cipher_char = function
| 'A' -> 'C'
| 'B' -> 'A'
| 'C' -> 'D'
| 'D' -> 'B'
| _ -> failwith "cipher_char"
let cipher = String.map cipher_char
let check_cipher s1 s2 = (cipher s1 = s2)
The cipher_char function scales linearly with the size of the alphabet. To make it a bit more compact and generic you could use a lookup table of some form, e.g.
(* Assume that only letters are needed *)
let cipher_mapping = "CADB"
let cipher_char c =
try cipher_mapping.[Char.code c - Char.code 'A']
with Invalid_argument _ -> failwith "cipher_char"
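If the plaintext alphabet is not a contiguous range starting at 'A', the index arithmetic above no longer works. A sketch of the map-based approach mentioned in the question could look like the following; make_cipher is a hypothetical helper name, and the table is built from two strings of equal length where plain.[i] is encrypted as encrypted.[i].

```ocaml
(* Hypothetical helper: build a char -> char cipher from two alphabets. *)
let make_cipher plain encrypted =
  if String.length plain <> String.length encrypted then
    invalid_arg "make_cipher";
  let table = Hashtbl.create (String.length plain) in
  String.iteri (fun i c -> Hashtbl.replace table c encrypted.[i]) plain;
  (* The returned function fails on any character outside the table. *)
  fun c ->
    match Hashtbl.find_opt table c with
    | Some e -> e
    | None -> failwith "not a plaintext"

let cipher_char = make_cipher "ABCD" "CADB"

let check_cipher s1 s2 = String.map cipher_char s1 = s2
```

Growing the alphabet then only means passing longer strings to make_cipher; the checking code itself stays the same.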

Suppress exhaustive matching warning in OCaml

I'm having a problem in fixing a warning that OCaml compiler gives to me.
Basically I'm parsing an expression that can be composed by Bool, Int and Float.
I have a symbol table that tracks all the symbols declared with their type:
type ast_type = Bool | Int | Float
and variables = (string, int*ast_type) Hashtbl.t;
where int is the index used later in the array of all variables.
I have then a concrete type representing the value in a variable:
type value =
| BOOL of bool
| INT of int
| FLOAT of float
| UNSET
and var_values = value array
I'm trying to define the behaviour of a variable reference inside a boolean expression so what I do is
check that the variable is declared
check that the variable has type bool
to do this I have this code (s is the name of the variable):
| GVar s ->
begin
try
let (i,t) = Hashtbl.find variables s in
if (t != Bool) then
raise (SemanticException (BoolExpected,s))
else
(fun s -> let BOOL v = Array.get var_values i in v)
with
Not_found -> raise (SemanticException (VarUndefined,s))
end
The problem is that my checks ensure that the element taken from var_values will be of type BOOL of bool, but of course the compiler cannot see this constraint, so it warns me:
Warning P: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
(FLOAT _ |INT _ |UNSET)
How am I supposed to solve this kind of issues? Thanks in advance
This is a problem that you can solve using OCaml's polymorphic variants.
Here is some compilable OCaml code that I infer exhibits your problem:
type ast_type = Bool | Int | Float
and variables = (string, int*ast_type) Hashtbl.t
type value =
| BOOL of bool
| INT of int
| FLOAT of float
| UNSET
and var_values = value array
type expr = GVar of string
type exceptioninfo = BoolExpected | VarUndefined
exception SemanticException of exceptioninfo * string
let variables = Hashtbl.create 13
let var_values = Array.make 13 (BOOL false)
let f e =
match e with
| GVar s ->
begin
try
let (i,t) = Hashtbl.find variables s in
if (t != Bool) then
raise (SemanticException (BoolExpected,s))
else
(fun s -> let BOOL v = Array.get var_values i in v)
with
Not_found -> raise (SemanticException (VarUndefined,s))
end
It generates the warning:
File "t.ml", line 30, characters 42-48:
Warning P: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
(FLOAT _|INT _|UNSET)
Here is the same code transformed to use polymorphic variants. That code compiles without warnings. Note that polymorphic variants have more expressive power than standard types (here they let the type of var_values say that it is an array of BOOL only), but they can lead to puzzling warnings.
type ast_type = Bool | Int | Float
and variables = (string, int*ast_type) Hashtbl.t
type value =
[ `BOOL of bool
| `INT of int
| `FLOAT of float
| `UNSET ]
and var_values = value array
type expr = GVar of string
type exceptioninfo = BoolExpected | VarUndefined
exception SemanticException of exceptioninfo * string
let variables = Hashtbl.create 13
let var_values = Array.make 13 (`BOOL false)
let f e =
match e with
| GVar s ->
begin
try
let (i,t) = Hashtbl.find variables s in
if (t != Bool) then
raise (SemanticException (BoolExpected,s))
else
(fun s -> let `BOOL v = Array.get var_values i in v)
with
Not_found -> raise (SemanticException (VarUndefined,s))
end
Here are the types inferred by OCaml on the above code:
type ast_type = Bool | Int | Float
and variables = (string, int * ast_type) Hashtbl.t
type value = [ `BOOL of bool | `FLOAT of float | `INT of int | `UNSET ]
and var_values = value array
type expr = GVar of string
type exceptioninfo = BoolExpected | VarUndefined
exception SemanticException of exceptioninfo * string
val variables : (string, int * ast_type) Hashtbl.t
val var_values : [ `BOOL of bool ] array
val f : expr -> 'a -> bool
Take a look at this and search for "disable warnings". You should come to a flag -w.
If you want to fix it the "ocamlish" way, then I think you must make the pattern match exhaustive, i.e. cover all cases that might occur.
But if you don't want to match against all possible values, you might consider using wildcard (see here), that covers all cases you do not want to handle explicitly.
In this particular case, polymorphic variants, as explained by Pascal, are a good answer.
Sometimes, however, you're stuck with an impossible case. Then I find it natural to write
(fun s -> match Array.get var_values i with
| BOOL v -> v
| _ -> assert false)
This is much better than using the -w p flag which could hide other, undesired non-exhaustive pattern matches.
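As a self-contained illustration of the assert false idiom, here is a minimal sketch. The array and the bool_at helper are hypothetical stand-ins for the question's var_values and closure. I list the remaining constructors explicitly instead of using a wildcard, on the assumption that you would like the compiler to complain if a new constructor is added later.

```ocaml
type value =
  | BOOL of bool
  | INT of int
  | FLOAT of float
  | UNSET

(* Hypothetical storage, standing in for the question's var_values. *)
let var_values = Array.make 4 UNSET

(* The caller is assumed to have checked the declared type already, so
   any other constructor indicates a bug rather than a user error. *)
let bool_at i =
  match var_values.(i) with
  | BOOL v -> v
  | INT _ | FLOAT _ | UNSET -> assert false

let () =
  var_values.(0) <- BOOL true;
  assert (bool_at 0)
```

If the impossible case does happen, assert false raises Assert_failure with the file and position of the assertion, which makes the broken invariant easy to locate.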
Whoops! Misread your question. Leaving my answer below for posterity.
Updated answer: is there a reason why you are doing the check in the hashtbl, or why you can't have the concrete data types (type value) in the hashtbl? That would simplify things. As it is, you can move the check for bool to the Array.get and use a closure:
| GVar s ->
begin
try
let (i,_) = Hashtbl.find variables s in
match (Array.get var_values i) with BOOL(v) -> (fun s -> v)
| _ -> raise (SemanticException (BoolExpected,s))
with
Not_found -> raise (SemanticException (VarUndefined,s))
end
Alternatively I think it would make more sense to simplify your code. Move the values into the Hashtbl instead of having a type, an index and an array of values. Or just store the index in the Hashtbl and check the type in the array.
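A rough sketch of that simplification, assuming the symbol table maps names directly to values (bool_var is a hypothetical helper, not something from the question):

```ocaml
type value = BOOL of bool | INT of int | FLOAT of float | UNSET

type exceptioninfo = BoolExpected | VarUndefined
exception SemanticException of exceptioninfo * string

(* One table from variable name to its current value. *)
let variables : (string, value) Hashtbl.t = Hashtbl.create 13

(* A single match covers both checks: declared at all, and a bool. *)
let bool_var s =
  match Hashtbl.find_opt variables s with
  | Some (BOOL v) -> v
  | Some _ -> raise (SemanticException (BoolExpected, s))
  | None -> raise (SemanticException (VarUndefined, s))
```

One trade-off to be aware of: without a separately stored declared type, a variable that is declared as a bool but still UNSET is reported here as BoolExpected.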
INCORRECT ANSWER BELOW:
You can replace the if else with a match. Or you can replace the let with a match:
replace if/else:
| GVar s ->
begin
try
let (i,t) = Hashtbl.find variables s in
match t with Bool -> (fun s -> let BOOL v = Array.get var_values i in v)
| _ -> raise (SemanticException (BoolExpected,s))
with
Not_found -> raise (SemanticException (VarUndefined,s))
end
replace let:
| GVar s ->
begin
try
match (Hashtbl.find variables s) with (i, Bool) -> (fun s -> let BOOL v = Array.get var_values i in v)
| _ -> raise (SemanticException (BoolExpected,s))
with
Not_found -> raise (SemanticException (VarUndefined,s))
end

Resources