ANTLR same type arithmetic expression - math

I am trying to write an ANTLR grammar that parses arithmetic expressions only when all operands have the same type. If the left and right sides do not have the same type, the input should not parse. This is what I have:
stat
    : Left = VARIABLE Op = ASSIGMENT Right = expr # Assigment
    ;
expr
    : '(' Exp = expr ')' # Parens
    | MINUS Exp = expr # UnaryMinus
    | Left = expr Op = (TIMES | DIV) Right = expr # MulDiv
    | Left = expr Op = (PLUS | MINUS) Right = expr # AddSub
    | (VARIABLE | CONSTANT) # Element
    ;
ASSIGMENT : '=' ;
PLUS : '+' ;
MINUS : '-' ;
TIMES : '*' ;
DIV : '/' ;
LPAREN : '(' ;
RPAREN : ')' ;
I don't want to accept anything like x = 5 + 'f' or x = c - 5 (if c is a variable that is not an integer).

This is called semantic analysis.
The grammar only describes the shape of the input; it does not know the type of a variable. Once parsing is done, you have to walk the generated tree (AST) and check each expression and variable for correctness.
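As a rough illustration of such a post-parse check, here is a minimal sketch in Python. It does not use ANTLR: the node classes (Const, Var, BinOp, Assign), the symbols table, and the functions type_of and check_assign are hypothetical names invented for this example, standing in for the tree that an ANTLR visitor or listener would traverse. Unary minus and parentheses are left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Const:
    value: object  # a literal such as 5 or 'f'


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


@dataclass
class Assign:
    target: str
    value: "Expr"


Expr = Union[Const, Var, BinOp]


class TypeMismatch(Exception):
    """Raised when operands, or the two sides of '=', differ in type."""


def type_of(node: Expr, symbols: Dict[str, type]) -> type:
    """Return the type of an expression; reject mixed-type operands."""
    if isinstance(node, Const):
        return type(node.value)
    if isinstance(node, Var):
        if node.name not in symbols:
            raise TypeMismatch(f"undeclared variable {node.name!r}")
        return symbols[node.name]
    if isinstance(node, BinOp):
        left = type_of(node.left, symbols)
        right = type_of(node.right, symbols)
        if left is not right:
            raise TypeMismatch(f"{left.__name__} {node.op} {right.__name__}")
        return left
    raise TypeError(f"unexpected node {node!r}")


def check_assign(stmt: Assign, symbols: Dict[str, type]) -> None:
    """Check that the target and the assigned expression have the same type."""
    target_type = type_of(Var(stmt.target), symbols)
    value_type = type_of(stmt.value, symbols)
    if target_type is not value_type:
        raise TypeMismatch(
            f"{stmt.target}: {target_type.__name__} = {value_type.__name__}"
        )
```

With symbols = {"x": int, "c": str}, this sketch accepts x = 5 + 3 and raises TypeMismatch for both x = 5 + 'f' and x = c - 5. Where the variable types come from (declarations, inference) is an assumption the original question does not settle.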

Related

OCaml map not updating before the next step in a sequence

I am trying to implement a basic parser, scanner, and a minimal language in OCaml. I believe the issue is that I'm trying to maintain a map between variables in this toy language and their values, and the language should be able to handle an expression like a=2;a and return 2. The name seems to successfully store the number 2, but by the time the program moves on to evaluating the second expression, it does not find the name in the map. And I can't understand why.
Below is the abstract syntax tree.
Ast.ml
type operator = Add (* for now just one operator *)
type expr =
  | Binop of expr * operator * expr
  | Lit of int (* a number *)
  | Seq of expr * expr (* a sequence, to behave like ";" *)
  | Asn of string * expr (* assignment, to behave like "=" in an imperative language *)
  | Var of string (* a variable *)
Here are the parser and scanner.
parser.mly
%{
open Ast
%}
%token SEQ PLUS ASSIGN EOF
%token <int> LITERAL
%token <string> VARIABLE
%left SEQ PLUS
%start expr
%type <Ast.expr> expr
%%
expr:
  | expr SEQ expr { Seq($1, $3) }
  | expr PLUS expr { Binop($1, Add, $3) }
  | LITERAL { Lit($1) }
  | VARIABLE { Var($1) }
  | VARIABLE ASSIGN expr { Asn($1, $3) }
scanner.mll
{
open Parser
}
rule tokenize = parse
    [' ' '\t' '\r' '\n'] { tokenize lexbuf }
  | '+' { PLUS }
  | ['0'-'9']+ as lit { LITERAL(int_of_string lit) }
  | ['a'-'z']+ as id { VARIABLE(id) }
  | '=' { ASSIGN }
  | ';' { SEQ }
  | eof { EOF }
And here's where I tried to implement a sort of name-space in a basic calculator.
calc.ml
open Ast
module StringMap = Map.Make(String)
let namespace = ref StringMap.empty
let rec eval exp = match exp with
  | Lit(n) -> n
  | Binop(e1, op, e2) ->
    let v1 = eval e1 in
    let v2 = eval e2 in
    v1+v2
  | Asn (name, e) ->
    let v = eval e in
    (namespace := StringMap.add name v !namespace; v)
  | Var(name) -> StringMap.find name !namespace
  | Seq(e1, e2) ->
    (let _ = eval e1 in
     eval e2)
let _ =
  let lexbuf = Lexing.from_channel stdin in
  let expr = Parser.expr Scanner.tokenize lexbuf in
  let result = eval expr in
  print_endline (string_of_int result)
To test it, I compile (it compiles successfully), run $ ./calc in a terminal, enter a=2;a, and press Ctrl+D. It should print 2, but it raises a Not_found exception. Presumably this comes from the StringMap.find line, which is not finding the name in the namespace. I've tried adding print statements, and I think I can confirm that the sequence is being processed correctly and that the first evaluation happens successfully, with the name and value being entered into the string map. But for some reason the binding seems not to be there when the program moves on to the second expression in the sequence.
I'd appreciate any clarification, thanks.
I cannot reproduce your error.
Feeding the AST directly to eval
let () =
  let ast = Seq(Asn ("a", Lit 2),Var "a") in
  let result = eval ast in
  print_endline (string_of_int result)
prints 2 as expected.
After fixing your parser to recognize the end of the stream:
entry:
| expr EOF { $1 }
using it in
let () =
  let s = Lexing.from_string "a=2;a\n" in
  let ast = Parser.entry Scanner.tokenize s in
  let result = eval ast in
  print_endline (string_of_int result)
prints 2 as expected. And without this fix, your code fails with a syntax error.
EDIT:
Rather than using a makefile, I would advise using dune, with the following simple dune file:
(menhir (modules parser))
(ocamllex scanner)
(executable (name calc))
It will at least solve your compilation troubles.
I don't know if you are still looking for a solution, but I was able to reproduce the problem. I solved it by simply adding ASSIGN to the %left declaration in parser.mly:
%left SEQ ASSIGN PLUS
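One plausible reading of why this helps, sketched in Python rather than OCaml: when ASSIGN has no declared precedence, the conflict that arises after a=2 on seeing ; is presumably resolved by shifting, so a=2;a would be grouped as Asn("a", Seq(Lit 2, Var "a")). The right-hand side then reads a before it has been stored. The tuple-encoded trees and eval_expr below are illustrative stand-ins for Ast.ml and calc.ml, not the original code, and the grouping itself is an assumption rather than something verified against the generated parser.

```python
def eval_expr(node, namespace):
    """Evaluate a tuple-encoded tree, mirroring the eval function in calc.ml."""
    kind = node[0]
    if kind == "Lit":
        return node[1]
    if kind == "Var":
        # A KeyError here plays the role of OCaml's Not_found.
        return namespace[node[1]]
    if kind == "Binop":  # only addition, as in the toy language
        return eval_expr(node[1], namespace) + eval_expr(node[2], namespace)
    if kind == "Asn":
        value = eval_expr(node[2], namespace)
        namespace[node[1]] = value
        return value
    if kind == "Seq":
        eval_expr(node[1], namespace)
        return eval_expr(node[2], namespace)
    raise ValueError(f"unknown node kind: {kind}")


# Assumed grouping without a precedence for ASSIGN: a = (2; a)
assign_swallows_seq = ("Asn", "a", ("Seq", ("Lit", 2), ("Var", "a")))

# Grouping wanted by the asker: (a = 2); a
seq_of_assign = ("Seq", ("Asn", "a", ("Lit", 2)), ("Var", "a"))
```

Evaluating seq_of_assign with an empty namespace yields 2, while assign_swallows_seq fails on the lookup of a, which matches the reported symptom. By the same reasoning, putting ASSIGN on the same %left level as PLUS may group a=1+2 as (a=1)+2; giving each of SEQ, ASSIGN and PLUS its own level (lowest first) is a commonly used arrangement, though it has not been tested against this grammar.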

F# function doesn't return anything

I have the following small tokenizer for simple arithmetic expressions. I am new to F# and I don't know why this function doesn't return anything when it is called. Can someone please help?
let tokenizer s =
    let chars1 = scan s
    let rec repeat list =
        match list with
        | []->[]
        | char::chars ->
            match char with
            | ')' -> RP::repeat chars
            | '(' -> LP::repeat chars
            | '+' -> Plus::repeat chars
            | '*' -> Times::repeat chars
            | '^' -> Pow::repeat chars
            | _ ->
                let (x,y) = makeInt (toInt char) chars
                Int x::repeat chars
    repeat chars1
The implementations of scan, toInt, and makeInt and the union type for the expression were not presented, but they might be inferred as:
let scan (s:string) = s.ToCharArray() |> Array.toList
let toInt c = int c - int '0'
let makeInt n chars = (n,chars)
type expr = RP | LP | Plus | Times | Pow | Int of int
let tokenizer s =
    let chars1 = scan s
    let rec repeat list =
        match list with
        | []->[]
        | char::chars ->
            match char with
            | ')' -> RP::repeat chars
            | '(' -> LP::repeat chars
            | '+' -> Plus::repeat chars
            | '*' -> Times::repeat chars
            | '^' -> Pow::repeat chars
            | _ ->
                let (x,y) = makeInt (toInt char) chars
                Int x::repeat chars
    repeat chars1
in which case:
tokenizer "1+1"
gives:
val it : expr list = [Int 1; Plus; Int 1]
It's possible the issue is in the implementation of your scan function.
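One further detail may be worth checking, independently of scan: the code binds the pair returned by makeInt to (x, y) but then recurses on chars rather than on y. With the inferred makeInt above this makes no difference, but if the real makeInt consumes additional digits, those digits would be tokenized a second time. Below is a minimal Python sketch of the flow that was probably intended; make_int, SYMBOLS, and the token representation are hypothetical stand-ins for the F# definitions, which were not shown.

```python
DIGITS = "0123456789"

# Token names mirror the cases of the inferred F# union type.
SYMBOLS = {")": "RP", "(": "LP", "+": "Plus", "*": "Times", "^": "Pow"}


def make_int(first_digit, chars):
    """Consume leading digits of chars; return (value, remaining chars)."""
    value = first_digit
    i = 0
    while i < len(chars) and chars[i] in DIGITS:
        value = value * 10 + int(chars[i])
        i += 1
    return value, chars[i:]


def tokenizer(s):
    """Turn a string such as '12+3' into a list of tokens."""
    tokens = []
    chars = list(s)
    while chars:
        head, rest = chars[0], chars[1:]
        if head in SYMBOLS:
            tokens.append(SYMBOLS[head])
            chars = rest
        elif head in DIGITS:
            value, remaining = make_int(int(head), rest)
            tokens.append(("Int", value))
            # Continue after the whole number, not after its first digit.
            chars = remaining
        else:
            raise ValueError(f"unexpected character {head!r}")
    return tokens
```

For example, tokenizer("12+3") gives [("Int", 12), "Plus", ("Int", 3)]; recursing on rest instead of remaining would emit an extra ("Int", 2) token.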

Pyparsing grammar not parsing correctly

Here is my grammar:
from pyparsing import Combine, Forward, Group, Literal, Optional, Word
from pyparsing import alphas, delimitedList, infixNotation, nums, oneOf, opAssoc, operatorPrecedence, quotedString, removeQuotes
integer = Combine(Optional(oneOf("+ -")) + Word(nums)).setParseAction(lambda t: int(t[0]))
real = Combine(Optional(oneOf("+ -")) + Word(nums) + "." + Optional(Word(nums))).setParseAction(
    lambda t: float(t[0]))
variable = Word(alphas)
qs = quotedString.setParseAction(removeQuotes)
lt_brac = Literal('[').suppress()
rt_brac = Literal(']').suppress()
exp_op = Literal('^')
mult_op = oneOf('* /')
plus_op = oneOf('+ -')
relation = oneOf('== != < >')
regex_compare = Literal('~')
function_call = Forward()
operand = function_call | qs | real | integer | variable
expr = operatorPrecedence(operand,
    [
        (":", 2, opAssoc.LEFT),
        (exp_op, 2, opAssoc.RIGHT),
        (regex_compare, 2, opAssoc.LEFT),
        (mult_op, 2, opAssoc.LEFT),
        (plus_op, 2, opAssoc.LEFT),
        (relation, 2, opAssoc.LEFT)
    ])
bool_operand = expr
bool_expr = infixNotation(bool_operand,
    [
        ("not", 1, opAssoc.RIGHT),
        ("and", 2, opAssoc.LEFT),
        ("or", 2, opAssoc.LEFT),
    ])
function_param = function_call | expr | variable | integer | real
function_call <<= Group(variable + lt_brac + Group(Optional(delimitedList(function_param))) + rt_brac)
final_expr = Group(function_call | bool_expr | expr )
final_expr.enablePackrat()
def parse(expression):
    return final_expr.parseString(expression)
The above grammar is supposed to parse arithmetic expressions, relational statements (<, >, !=, ==) whose operands can be arithmetic expressions, and boolean expressions (or, and, not) whose operands can be arithmetic or relational statements.
The grammar also supports function calls of the form name[params], where the params can be arithmetic expressions.
This works fine in most cases. However, when I use the above grammar to parse
print(parse("abs[abc:sec - abc:sge] > 1"))
I get the following output
[[['abs', [[['abc', ':', 'sec'], '-', ['abc', ':', 'sge']]]]]]
Why is the ' > 1' ignored?
It's ignored because of this definition of final_expr:
final_expr = Group(function_call | bool_expr | expr )
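Why do you define this expression this way? The alternatives joined by | are tried in order, so function_call matches abs[...] first, and parseString does not require the rest of the input to be consumed unless you ask for it. An expr is already a simple bool_expr, and a function_call is a simple expr. Just do this: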
final_expr = bool_expr
And you'll parse your given expression as:
[[['abs', [[['abc', ':', 'sec'], '-', ['abc', ':', 'sge']]]], '>', 1]]

How to create a function or operator using a symbol?

I would like to create a function with a symbol (for instance, ~), which works similarly to the "question mark" function.
You can't do something as "bare" as ?foo without messing with the C code that defines the syntax of R. For example, you can't make [fnord be meaningful.
This comes from the syntax definition in gram.y in the R sources.
| '~' expr %prec TILDE { $$ = xxunary($1,$2); }
| '?' expr { $$ = xxunary($1,$2); }
| expr ':' expr { $$ = xxbinary($2,$1,$3); }
| expr '+' expr { $$ = xxbinary($2,$1,$3); }
The second line above defines the syntax for ?foo. What exactly are you trying to do?
You can make functions and variables with arbitrary names, via use of the backtick `.
`~` <- `+`
y <- 5
x <- 10
y ~ x
# 15
I wouldn't mess with ~ though, unless you don't intend to do any statistical modelling....

How should I understand `let _ as s = "abc" in s ^ "def"` in OCaml?

let _ as s = "abc" in s ^ "def"
So how should I understand this?
I guess it is some kind of let pattern = expression construct?
First, what is the meaning, purpose, or logic of let pattern = expression?
Also, in pattern matching, I know there is the pattern as identifier usage. In let _ as s = "abc" in s ^ "def", _ is the pattern, but what follows as is an expression, s = "abc" in s ^ "def", not an identifier, right?
edit:
finally, how about this: (fun (1 | 2) as i -> i + 1) 2, is this correct?
I know it is wrong, but why? fun pattern -> expression is allowed, right?
I really got lost here.
The grouping is let (_ as s) = "abc" -- which is just a convoluted way of saying let s = "abc", because as with a wildcard pattern _ in front is pretty much useless.
The expression let pattern = expr1 in expr2 is pretty central to OCaml. If the pattern is just a name, it lets you name an expression. This is like a local variable in other languages. If the pattern is more complicated, it lets you destructure expr1, i.e., it lets you give names to its components.
In your expression, behind as is just an identifier: s. I suspect your confusion all comes down to this one thing. The expression can be parenthesized as:
let (_ as s) = "abc" in s ^ "def"
as Andreas Rossberg shows.
Your final example is correct if you add some parentheses. The compiler/toplevel rightly complains that your function is partial; i.e., it doesn't know what to do with most ints, only with 1 and 2.
Edit: here's a session that shows how to add the parentheses to your final example:
$ ocaml
OCaml version 4.00.0
# (fun (1 | 2) as i -> i + 1) 2;;
Error: Syntax error
# (fun ((1 | 2) as i) -> i + 1) 2;;
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
0
- : int = 3
#
Edit 2: here's a session that shows how to remove the warning by specifying an exhaustive set of patterns.
$ ocaml
OCaml version 4.00.0
# (function ((1|2) as i) -> i + 1 | _ -> -1) 2;;
- : int = 3
# (function ((1|2) as i) -> i + 1 | _ -> -1) 3;;
- : int = -1
#

Resources