How to remove left-recursion

I'd like to make a grammar that will allow curried function calls.
That is:
a() /// good
a()() /// good
a()()() /// good
a(a) /// good
a(a()()) /// good
/// etc
My first stab was this:
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
fncall : expr '(' (expr (',' expr)*)? ')';
expr : ID|fncall;
But that fails due to left recursion.

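Assuming (a)() would also be valid, you can remove the left recursion by matching a non-recursive start (an ID or a parenthesized expression) followed by zero or more call, index, or lookup suffixes: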
grammar T;
options {
output=AST;
}
tokens {
EXPR_LIST;
CALL;
INDEX;
LOOKUP;
}
parse
: expr EOF -> expr
;
expr
: add_expr
;
add_expr
: mul_exp (('+' | '-')^ mul_exp)*
;
mul_exp
: atom (('*' | '/')^ atom)*
;
atom
: fncall
| NUM
;
fncall
: (fncall_start -> fncall_start) ( '(' expr_list ')' -> ^(CALL $fncall expr_list)
| '[' expr ']' -> ^(INDEX $fncall expr)
| '.' ID -> ^(LOOKUP $fncall ID)
)*
;
fncall_start
: ID
| '(' expr ')' -> expr
;
expr_list
: (expr (',' expr)*)? -> ^(EXPR_LIST expr*)
;
NUM : '0'..'9'+;
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
The parser generated from the grammar above would parse the input:
(foo.bar().array[i*2])(42)(1,2,3)
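and construct the following AST, written here in LISP-style tree notation as derived from the rewrite rules above:
(CALL (CALL (INDEX (LOOKUP (CALL (LOOKUP foo bar) EXPR_LIST) array) (* i 2)) (EXPR_LIST 42)) (EXPR_LIST 1 2 3))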
Without the tree rewrite rules, the grammar would look like this:
grammar T;
parse
: expr EOF
;
expr
: add_expr
;
add_expr
: mul_exp (('+' | '-') mul_exp)*
;
mul_exp
: atom (('*' | '/') atom)*
;
atom
: fncall
| NUM
;
fncall
: fncall_start ( '(' expr_list ')' | '[' expr ']' | '.' ID )*
;
fncall_start
: ID
| '(' expr ')'
;
expr_list
: (expr (',' expr)*)?
;
NUM : '0'..'9'+;
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
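The same idea can be sketched by hand. The following is a minimal recursive-descent parser in JavaScript, assuming a reduced grammar with only identifiers, parentheses, and call suffixes (no arithmetic, indexing, or lookups). The names parse, parseStart, parseArgs, and parseCall are hypothetical and not part of ANTLR. The while loop in parseCall plays the role of the ( ... )* in the fncall rule, which is what replaces the left recursion.

```javascript
// Sketch only: a hand-written parser for
//   fncall       : fncall_start ( '(' expr_list ')' )* ;
//   fncall_start : ID | '(' fncall ')' ;
//   expr_list    : (fncall (',' fncall)*)? ;
function parse(src) {
  // Identifiers become one token; any other non-space character is its own token.
  const tokens = src.match(/[A-Za-z_][A-Za-z0-9_]*|\S/g) || [];
  let pos = 0;

  const peek = () => tokens[pos];
  const expect = (t) => {
    if (tokens[pos] !== t) {
      throw new SyntaxError(`expected '${t}' but found '${tokens[pos]}'`);
    }
    pos++;
  };

  // fncall_start: never calls parseCall before consuming a token,
  // so there is no left recursion.
  function parseStart() {
    const t = peek();
    if (t === "(") {
      pos++;
      const inner = parseCall();
      expect(")");
      return inner;
    }
    if (t !== undefined && /^[A-Za-z_]/.test(t)) {
      pos++;
      return { type: "id", name: t };
    }
    throw new SyntaxError(`unexpected '${t}'`);
  }

  // expr_list: zero or more comma-separated arguments.
  function parseArgs() {
    const args = [];
    if (peek() !== ")") {
      args.push(parseCall());
      while (peek() === ",") {
        pos++;
        args.push(parseCall());
      }
    }
    return args;
  }

  // fncall: each loop iteration wraps the tree built so far in a new call node.
  function parseCall() {
    let node = parseStart();
    while (peek() === "(") {
      pos++;
      const args = parseArgs();
      expect(")");
      node = { type: "call", callee: node, args };
    }
    return node;
  }

  const tree = parseCall();
  if (pos !== tokens.length) throw new SyntaxError(`unexpected '${peek()}'`);
  return tree;
}
```

For a()() this builds a call node whose callee is itself a call node on a, mirroring how ^(CALL $fncall expr_list) nests the previous result in the ANTLR version.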

Related

What in this Nearley grammar is causing an infinite loop?

I'm attempting to write a Nearley grammar that will parse a .pbtxt file (a protobuf in textual format). I'm very close but seem to be encountering an infinite loop when tested in the Nearley playground (https://omrelli.ug/nearley-playground/). Can someone more comfortable with Nearley grammars spot the issue more readily?
#builtin "number.ne"
#builtin "string.ne"
#builtin "whitespace.ne"
Start -> Field:+
Field -> _ (ScalarField | MessageField) _ "\n":*
ScalarField -> FieldName _ ":" _ (ScalarValue | ScalarList) _
MessageField -> FieldName _ ":" _ (MessageValue | MessageList) _
FieldName -> [A-Za-z0-9_]:+
MessageValue -> "{" Field:+ "}"
MessageList -> "{" Field (_ "\n":+ Field):* "}"
ScalarValue -> String | Float | Integer
ScalarList -> "{" _ ScalarValue (_ "\n":+ ScalarValue):* "}"
String -> sqstring | dqstring | [A-Za-z0-9_]:+
Float -> decimal
Integer -> int
Here's an example .pbtxt that should be parseable:
something: WHATEVER
some_list: {
bins: 32
bins: 64
bins: 128
bins: 256
}
things: {
some_path: "Data/thingything"
weight: 2
other_weights: {
positive_dense_bin: 8
low_heat: 1
}
}

Antlr4 order of token in lexer

Lexer grammar:
DESC: D | D E S C;
.
.
.
INCREMENTOPTION: S | H | M | D;
Parser grammar:
sortExpression: integer? sortFieldList Desc = DESC?;
.
.
.
incrementOption: integer INCREMENTOPTION;
With the input 'd' I have a problem: both DESC and INCREMENTOPTION can match it, so whichever rule comes first in the lexer is matched and the other one never is.
What can I do?
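The lexer tokenizes independently of the parser: it picks the longest match and breaks ties by rule order, so two lexer rules cannot both produce a token for the same input. Give the overlapping character its own token and recombine the alternatives in parser rules, something like this:
sortExpression : integer? sortFieldList desc?;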
incrementOption : integer incrementoption;
desc : DESC | SINGLE_D;
incrementoption : SINGLE_D | SINGLE_S_H_M;
DESC : D E S C;
SINGLE_D : D;
SINGLE_S_H_M : S | H | M;
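To see why the order matters, here is a toy model of the matching rule (longest match wins, ties go to the rule listed first). It is a sketch, not ANTLR's implementation: firstToken and the rule tables are hypothetical, regular expressions stand in for lexer rules, and the case-insensitive flag assumes that the D, E, S, C fragments match both cases.

```javascript
// Toy lexer step: try every rule at the start of the input,
// keep the longest match, and on a tie keep the earlier rule.
function firstToken(rules, input) {
  let best = null;
  for (const [type, pattern] of rules) {
    const m = new RegExp("^(?:" + pattern + ")", "i").exec(input);
    // Strictly greater: an equally long later match never replaces an earlier one.
    if (m && (best === null || m[0].length > best.text.length)) {
      best = { type, text: m[0] };
    }
  }
  return best;
}

// The original rules overlap on "d". Longer alternatives are written first
// because JavaScript alternation is ordered rather than longest-match.
const ambiguous = [
  ["DESC", "desc|d"],
  ["INCREMENTOPTION", "[shmd]"],
];

// The rules from the answer: every input has exactly one possible token type.
const fixed = [
  ["DESC", "desc"],
  ["SINGLE_D", "d"],
  ["SINGLE_S_H_M", "[shm]"],
];

console.log(firstToken(ambiguous, "d").type);                 // DESC
console.log(firstToken([...ambiguous].reverse(), "d").type);  // INCREMENTOPTION
console.log(firstToken(fixed, "d").type);                     // SINGLE_D
console.log(firstToken(fixed, "desc").type);                  // DESC
```

With the fixed rules the lexer always emits SINGLE_D for a lone 'd', and the parser rules desc and incrementoption decide from context what it means.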

DynamoDB ConditionExpression if resulting value is positive?

I'm writing an application that has a function for tipping points. I want to make a conditional update which is only executed if the resulting value of a user's wallet would be 0 or higher. If the value is negative the update should not happen.
The function works without the conditional expression but when I add it, it breaks.
ConditionExpression: 'teleUser.wallet.points -:a > -1',
In the above line :a is a passed in integer. I'll post the context below, but the above line is where my problem occurs.
The error returned is ValidationException: Invalid ConditionExpression: Syntax error; token: "-", near: "points -:a".
Full function for context:
function removeFromWallet(msg, amount) {
console.log("remove");
let params = {
TableName: tableName,
Key: {"id": msg.from.id},
UpdateExpression: 'set teleUser.wallet.points = teleUser.wallet.points -:a',
ExpressionAttributeValues:{
":a": parseInt(amount)
},
ConditionExpression: 'teleUser.wallet.points -:a > -1',
ReturnValues:"UPDATED_NEW"
};
docClient.update(params, function(err, data) {
if (err) {
console.log(err);
} else {
const { Items } = data;
console.log(data.Attributes.teleUser.wallet.points);
addToWallet(msg, amount);
}
});
}
You can't perform arithmetic in a ConditionExpression. Its grammar only allows comparisons, logical operators, and a small set of functions:
condition-expression ::=
operand comparator operand
| operand BETWEEN operand AND operand
| operand IN ( operand (',' operand (, ...) ))
| function
| condition AND condition
| condition OR condition
| NOT condition
| ( condition )
comparator ::=
=
| <>
| <
| <=
| >
| >=
function ::=
attribute_exists (path)
| attribute_not_exists (path)
| attribute_type (path, type)
| begins_with (path, substr)
| contains (path, operand)
| size (path)
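You can compute values in your own code before passing them in ExpressionAttributeValues, but that doesn't help here because the stored attribute value isn't available on the client. Instead, rearrange the comparison so that no arithmetic is needed: for integers, teleUser.wallet.points - :a > -1 is equivalent to teleUser.wallet.points >= :a, so use that as the ConditionExpression.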
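Applied to the function in the question, the parameters could be built roughly as follows. This is a sketch: buildRemoveFromWalletParams is a hypothetical helper that only constructs the parameter object and does not call AWS, and the non-negative integer check is an added assumption about what a valid amount looks like.

```javascript
// Hypothetical helper: returns the params object for docClient.update().
function buildRemoveFromWalletParams(tableName, userId, amount) {
  const delta = parseInt(amount, 10);
  // Assumption: only non-negative whole amounts make sense for a withdrawal.
  if (!Number.isInteger(delta) || delta < 0) {
    throw new RangeError(`invalid amount: ${amount}`);
  }
  return {
    TableName: tableName,
    Key: { id: userId },
    // Arithmetic is allowed in a SET action of an UpdateExpression...
    UpdateExpression: "set teleUser.wallet.points = teleUser.wallet.points - :a",
    // ...but not in a ConditionExpression, so compare the attribute directly.
    ConditionExpression: "teleUser.wallet.points >= :a",
    ExpressionAttributeValues: { ":a": delta },
    ReturnValues: "UPDATED_NEW",
  };
}
```

The result would be passed as docClient.update(buildRemoveFromWalletParams(tableName, msg.from.id, amount), callback). When the wallet holds fewer points than the amount, the update is rejected and the callback should receive a ConditionalCheckFailedException error instead of data.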

ANTLR4 left-recursive error

My ANTLR4 grammar in file power.g4 is this:
assign : id '=' expr ;
id : 'A' | 'B' | 'C' ;
expr : expr '+' term
| expr '-' term
| term ;
term : term '*' factor
| term '/' factor
| factor ;
factor : expr '**' factor
| '(' expr ')'
| id ;
WS : [ \t\r\n]+ -> skip ;
When I run command
antlr4 power.g4
This error occurred:
error(119): power.g4::: The following sets of rules are mutually left-recursive [expr, factor, term]
What can I do?
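To avoid the left-recursion error, put all forms of an expr in one rule. ANTLR4 gives alternatives listed earlier a higher precedence, so put the tightest-binding operators first, keep operators of equal precedence in the same alternative, and mark exponentiation as right-associative:
expr : '(' expr ')'
| <assoc=right> expr '**' expr
| expr ('*' | '/') expr
| expr ('+' | '-') expr
| id
;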
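ANTLR4 can accept a directly left-recursive rule like this because it rewrites it internally into a loop guarded by precedence checks. A rough hand-written analogue is precedence climbing, sketched below in JavaScript. This illustrates the technique rather than ANTLR's generated code: evaluate and OPS are hypothetical names, numbers stand in for the ids A, B, C so that results can be computed, and the precedence table follows the usual arithmetic convention.

```javascript
// Operator table: higher prec binds tighter; "**" is right-associative.
const OPS = new Map([
  ["+", { prec: 1, right: false, apply: (a, b) => a + b }],
  ["-", { prec: 1, right: false, apply: (a, b) => a - b }],
  ["*", { prec: 2, right: false, apply: (a, b) => a * b }],
  ["/", { prec: 2, right: false, apply: (a, b) => a / b }],
  ["**", { prec: 3, right: true, apply: (a, b) => a ** b }],
]);

function evaluate(src) {
  // "**" must be tried before the single-character fallback.
  const tokens = src.match(/\*\*|\d+(?:\.\d+)?|\S/g) || [];
  let pos = 0;

  // Operands: a number or a parenthesized expression.
  function primary() {
    const t = tokens[pos++];
    if (t === "(") {
      const value = expr(1);
      if (tokens[pos++] !== ")") throw new SyntaxError("expected ')'");
      return value;
    }
    if (t !== undefined && /^\d/.test(t)) return Number(t);
    throw new SyntaxError(`unexpected '${t}'`);
  }

  // Consume operators whose precedence is at least minPrec.
  function expr(minPrec) {
    let left = primary();
    for (;;) {
      const op = OPS.get(tokens[pos]);
      if (!op || op.prec < minPrec) return left;
      pos++;
      // Left-associative: the right side may only contain tighter operators.
      // Right-associative: the right side may contain the same operator again.
      const right = expr(op.right ? op.prec : op.prec + 1);
      left = op.apply(left, right);
    }
  }

  const value = expr(1);
  if (pos !== tokens.length) throw new SyntaxError(`unexpected '${tokens[pos]}'`);
  return value;
}

console.log(evaluate("1 + 2 * 3"));   // 7
console.log(evaluate("10 - 4 - 3"));  // 3
console.log(evaluate("2 ** 3 ** 2")); // 512
```

The single loop replaces the expr/term/factor chain: the precedence table does the work that the nesting of rules did in the original grammar.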

Error about left recursion in ANTLR: what do I need to do now?

//Expression
exp: exp1 ASS_OP exp | exp1;
exp1: exp1 OR_OP exp2 | exp2;
exp2: exp2 AND_OP exp3 | exp3;
exp3: exp4 (EQUAL_OP | NOT_EQUAL_OP) exp4 | exp4;
exp4: exp5 (LESS_OP|GREATER_OP|LESS_EQUAL_OP|GREATER_EQUAL_OP) exp5 | exp5;
exp5: exp5 (ADD_OP | SUB_OP) exp6 | exp6;
exp6: exp6 (MUL_OP | DIV_OP | MOD_OP) exp7 | exp7;
exp7: (ADD_OP | SUB_OP | NOT_OP) exp7 | exp8;
exp8: LB exp RB | expl;
expl: invocation_exp | index_exp | ID | INTLIT |FLOATLIT | BOOLEANLIT | STRINGLIT;
index_exp: exp LSB exp RSB;
invocation_exp: ID LB (exp (COMMA exp)*)? RB;
[error] error(119): MC.g4::: The following sets of rules are mutually left-recursive [exp, index_exp, exp1, exp2, exp3, exp4, exp5, exp6, exp7, exp8, expl]
[trace] Stack trace suppressed: run last *:Antlr generates lexer and parser for the full output.
Hi, I'm new. I read in some topics that ANTLR4 supports only direct left recursion, and that the authors don't seem to want to change that. Can anyone help me fix my grammar? Thanks for reading.
One of the great things about ANTLR4 as opposed to some of the older tools is that you don't often have to "chain" your operator precedence like this:
exp: exp1 ASS_OP exp | exp1;
exp1: exp1 OR_OP exp2 | exp2;
exp2: exp2 AND_OP exp3 | exp3;
I remember those days of chaining expression rules for strict BNF grammars. In ANTLR4 we can do better and get a clearer grammar too. Within a left-recursive rule, alternatives listed earlier bind more tightly, so the highest-precedence alternatives are listed first, as in this snippet:
expr
: <assoc=right> expr POW expr #powExpr
| MINUS expr #unaryMinusExpr
| NOT expr #notExpr
| expr op=(MULT | DIV | MOD) expr #multiplicationExpr
| expr op=(PLUS | MINUS) expr #additiveExpr
| expr op=(LTEQ | GTEQ | LT | GT) expr #relationalExpr
| expr op=(EQ | NEQ) expr #equalityExpr
| expr AND expr #andExpr
| expr OR expr #orExpr
| atom #atomExpr
;
This should handle operator precedence without any mutual left recursion, because every recursive alternative refers directly to expr. The highest-precedence alternative is at the top, which in this example (and by mathematical convention) is exponentiation.
