JavaCC conflicting productions

I have two inputs:
(1) a - b
(2) a += b
And I have a production with a choice:
void AssignmentExpression() : {}
{
  LOOKAHEAD(3) ConditionalExpression()
| LOOKAHEAD(3) UnaryExpression() AssignmentOperator() AssignmentExpression()
}
With this production input (1) works, but input (2) does not work.
If I swap the choice in the production so that it becomes
void AssignmentExpression() : {}
{
  LOOKAHEAD(3) UnaryExpression() AssignmentOperator() AssignmentExpression()
| LOOKAHEAD(3) ConditionalExpression()
}
Then input (2) works, but input (1) does not work.
How do I fix this? Increasing the LOOKAHEAD parameter does not help.

See Expression Parsing by Recursive Descent. Follow the "classic solution".
Since you are using JJTree, the answer to the question Make a calculator's grammar that make a binary tree with javacc will be helpful.

You might try
void AssignmentExpression() : {}
{
  LOOKAHEAD(UnaryExpression() AssignmentOperator())
  UnaryExpression() AssignmentOperator() AssignmentExpression()
| ConditionalExpression()
}
Without seeing more of the grammar it is hard to know whether this will work. A lookahead specification suppresses any warnings from JavaCC (JavaCC "assumes" you know what you are doing), so you have to do the analysis yourself.
My other answer is better.
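The syntactic lookahead idea from the answer above can be sketched in plain Java. This is a hand-written toy parser, not JavaCC output, and it assumes a drastically simplified grammar: a unary expression is a single identifier, and a conditional expression is a chain of + and - operations. The class and method names (LookaheadSketch, scanUnaryThenAssignOp) are hypothetical. The point is only that the parser first scans for "unary expression followed by assignment operator" and commits to the assignment alternative only if that scan succeeds.

```java
import java.util.ArrayList;
import java.util.List;

// Toy recursive-descent parser illustrating a syntactic lookahead.
// Assumption: tokens are separated by whitespace and identifiers are the only operands.
class LookaheadSketch {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    LookaheadSketch(String input) {
        for (String t : input.trim().split("\\s+")) {
            tokens.add(t);
        }
    }

    // Parses the whole input and returns a fully parenthesized rendering of the tree.
    static String parse(String input) {
        LookaheadSketch p = new LookaheadSketch(input);
        String tree = p.assignmentExpression();
        if (p.pos != p.tokens.size()) {
            throw new IllegalStateException("unexpected token: " + p.tokens.get(p.pos));
        }
        return tree;
    }

    private String peek(int offset) {
        int i = pos + offset;
        return i < tokens.size() ? tokens.get(i) : "<EOF>";
    }

    private static boolean isAssignOp(String t) {
        return t.equals("=") || t.equals("+=") || t.equals("-=");
    }

    private static boolean isIdentifier(String t) {
        return t.matches("[A-Za-z_]\\w*");
    }

    // Plays the role of LOOKAHEAD(UnaryExpression() AssignmentOperator()):
    // it inspects tokens without consuming them.
    private boolean scanUnaryThenAssignOp() {
        return isIdentifier(peek(0)) && isAssignOp(peek(1));
    }

    String assignmentExpression() {
        if (scanUnaryThenAssignOp()) {
            String left = unaryExpression();
            String op = tokens.get(pos++);
            String right = assignmentExpression();
            return "(" + left + " " + op + " " + right + ")";
        }
        return conditionalExpression();
    }

    // Simplified stand-in for ConditionalExpression: unary (("+" | "-") unary)*
    String conditionalExpression() {
        String left = unaryExpression();
        while (peek(0).equals("+") || peek(0).equals("-")) {
            String op = tokens.get(pos++);
            String right = unaryExpression();
            left = "(" + left + " " + op + " " + right + ")";
        }
        return left;
    }

    String unaryExpression() {
        String t = peek(0);
        if (!isIdentifier(t)) {
            throw new IllegalStateException("expected identifier, got " + t);
        }
        pos++;
        return t;
    }
}
```

With this sketch, LookaheadSketch.parse("a - b") takes the conditional branch and LookaheadSketch.parse("a += b") takes the assignment branch, so both inputs from the question are accepted. In a real grammar the unary expression can be arbitrarily long, which is why a fixed LOOKAHEAD(3) cannot make the same decision.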

Related

if {...} else {...} : Does the line break between "}" and "else" really matter?

I write my if {...} else {...} statement in R in the following way as I find it more readable.
Testifelse = function(number1)
{
  if (number1 > 1 & number1 <= 5)
  {
    number1 = 1
  }
  else if ((number1 > 5 & number1 < 10))
  {
    number1 = 2
  }
  else
  {
    number1 = 3
  }
  return(number1)
}
According to ?Control:
... In particular, you should not have a newline between } and else to avoid a syntax error in entering a if ... else construct at the keyboard or via source ...
the function above should cause a syntax error, but it actually works! What's going on here?
Thanks for your help.
Original question and answer
If we enter the following at the R console:
if (1 > 0) {
  cat("1\n");
}
else {
  cat("0\n");
}
why does it not work?
R is an interpreted language, so R code is parsed line by line. (Remark by @JohnColeman: This judgement is too broad. Any modern interpreter does some parsing, and an interpreted language like Python has no problem analogous to R's here. It is a design decision that the makers of R made, not one forced on them by the fact that R is interpreted, though it doubtless made the interpreter somewhat easier to write.)
Since
if (1 > 0) {
  cat("1\n");
}
makes a complete, legal statement, the parser treats it as a complete code block. The following
else {
  cat("0\n");
}
then raises an error, because it is seen as a new code block, and no control statement starts with else.
Therefore, we really should do:
if (1 > 0) {
  cat("1\n");
} else {
  cat("0\n");
}
so that the parser will have no difficulty in identifying them as a whole block.
In a compiled language like C there is no such issue, because at compile time the compiler can "see" all lines of your code.
Final update related to what's going on inside a function
There is really no magic here! The key is the use of {} to manually indicate a code block. We all know that in R,
{statement_1; statement_2; ...; statement_n;}
is treated as a single expression, whose value is statement_n.
Now, let's do:
{
  if (1 > 0) {
    cat("1\n");
  }
  else {
    cat("0\n");
  }
}
It works and prints 1.
Here, the outer {} is a hint to the parser that everything inside is a single expression, so parsing and interpreting should not terminate till reaching the final }. This is exactly what happens in a function, as a function body has {}.
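The effect of the outer {} can be illustrated with a toy model written in Java. This is not how R's parser is implemented; it is a deliberately crude sketch that only counts braces and treats the input as "complete" at the first line end where every opened brace has been closed. The names TopLevelChunks, split and hasDanglingElse are hypothetical. Under that assumption, a top-level "}" followed by a newline ends the chunk, so a following else starts a new chunk on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reading input until it forms a complete chunk.
// Assumption: completeness is decided by brace depth alone (real parsers do much more).
class TopLevelChunks {
    // Groups lines into chunks; a chunk ends at a line end where brace depth is back to zero.
    static List<String> split(List<String> lines) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (String line : lines) {
            current.append(line).append('\n');
            for (char c : line.toCharArray()) {
                if (c == '{') {
                    depth++;
                } else if (c == '}') {
                    depth--;
                }
            }
            if (depth == 0) {
                chunks.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            chunks.add(current.toString()); // unterminated trailing input
        }
        return chunks;
    }

    // True if some chunk begins with "else", i.e. an else with no matching if in its chunk.
    static boolean hasDanglingElse(List<String> lines) {
        for (String chunk : split(lines)) {
            if (chunk.trim().startsWith("else")) {
                return true;
            }
        }
        return false;
    }
}
```

In this model the console version (with "}" and "else {" on separate lines) splits into two chunks, the second of which starts with else. Writing "} else {" on one line, or wrapping everything in an outer pair of braces as a function body does, keeps the depth above zero and yields a single chunk.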

Avoid common prefixes without changing lookahead

I'm using JavaCC to write a specification that recognizes a language. The problem is that JavaCC gives me a warning because "public" is a common prefix in the Member() declaration. Member() can have Attribute() and/or Method() entries in any order, but must have at least one Method().
The warning JavaCC gives me is:
Choice conflict in (...)+ construct at line 66, column 23.
Expansion nested within construct and expansion following construct have common prefixes, one of which is: "public". Consider using a lookahead of 2 or more for nested expansion.
Line 66 is the only line of Member(). I also need to do this without changing the lookahead value.
Here is the code:
void Member() : {}
{
  (Attribute())* (Method())+ (Attribute() | Method())*
}
void Attribute() : {}
{
  "private" Type() <Id> [":=" Expr()] ";"
}
void Method() : {}
{
  MethodHead() MethodBody()
}
void MethodHead() : {}
{
  ("public")? (<Id> | Type() | "void") <Id> "(" Parameter() ")"
}
Thanks.
The problem is that this regular expression
(Method())+ (Attribute() | Method())*
is ambiguous. Let's abbreviate methods by M and attributes by A. If the input is MAM, there is no problem: (Method())+ matches the first M and (Attribute() | Method())* matches the remaining AM. But if the input is MMA, where should the divide be? Either (Method())+ matches M and (Attribute() | Method())* matches MA, or (Method())+ matches MM and (Attribute() | Method())* matches A. Both parses are possible. JavaCC doesn't know which parse you want, so it complains.
What you can do:
1. Nothing. Ignore the warning. The default behaviour is that as many Methods as possible will be recognized by (Method())+ and only methods after the first attribute will be recognized by (Attribute() | Method())*.
2. Suppress the warning using lookahead. You said you don't want to add lookahead, but for completeness, I'll mention that you could change (Method())+ to (LOOKAHEAD(1) Method())+. That won't change the behaviour of the parser, but it will suppress the warning.
3. Rewrite the grammar.
The offending line could be rewritten as either
(Attribute())* Method() (Attribute() | Method())*
or as
(Attribute())* (Method())+ [Attribute() (Attribute() | Method())*]
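The ambiguity argument above can be checked by brute force. The following Java sketch (hypothetical class MemberSplits, not part of any JavaCC API) takes a string over the letters A and M and counts in how many ways it can be divided among the three parts of the original expression, and among the parts of the first rewritten form. It assumes the input contains only A and M, so the trailing (Attribute() | Method())* part matches any remainder.

```java
// Counts the distinct ways a member sequence matches each form of the expression.
// 'A' stands for an attribute, 'M' for a method.
class MemberSplits {
    // True if every character of s in [from, to) equals c (vacuously true for an empty range).
    private static boolean all(String s, int from, int to, char c) {
        for (int k = from; k < to; k++) {
            if (s.charAt(k) != c) {
                return false;
            }
        }
        return true;
    }

    private static void requireOnlyAM(String s) {
        if (!s.matches("[AM]*")) {
            throw new IllegalArgumentException("only A and M are allowed: " + s);
        }
    }

    // Original form: A* M+ (A|M)*
    // i is where the A* part ends, j is where the M+ part ends (j > i so M+ is non-empty).
    static int countOriginal(String s) {
        requireOnlyAM(s);
        int count = 0;
        for (int i = 0; i <= s.length(); i++) {
            for (int j = i + 1; j <= s.length(); j++) {
                if (all(s, 0, i, 'A') && all(s, i, j, 'M')) {
                    count++;
                }
            }
        }
        return count;
    }

    // Rewritten form: A* M (A|M)*
    // i is where the A* part ends; exactly one M must follow.
    static int countRewritten(String s) {
        requireOnlyAM(s);
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (all(s, 0, i, 'A') && s.charAt(i) == 'M') {
                count++;
            }
        }
        return count;
    }
}
```

For MAM both counts are 1, while for MMA the original form has 2 splits and the rewritten form has 1, which matches the explanation of why JavaCC warns about the original and not about the rewrite.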

LTL Formula with Aorai

I am trying to find an example of the LTL operator _F_ ("finally", i.e. eventually) with Aorai, but I can't figure out exactly what this operator does, and there are no examples in Aorai's "tests" directory.
For example, I wrote this formula
CALL(main) && _X_ (CALL(a) && _X_(RETURN(a) && _F_ (RETURN(b) && _X_ (RETURN(main)) ) ))
which says that in my program, main has to call the function a(). After this I don't understand what happens with the _F_ operator: it seems to accept whatever we call after a(), with no warning or error from Aorai. Could anybody help me or give a correct example of it?
For example, I have the program below, which I would like to test against the formula above:
void a()
{}
void b()
{}
int main()
{
  a();
  a();
  b();
  b();
  a();
  return 0;
}
I type frama-c -aorai-ltl test.ltl test.c
Normally, there should be an error or warning from Aorai. No?
Your question is more about temporal logic than Frama-C/Aorai itself, but the meaning of this formula is that main must call a, then do whatever it wants, before calling b and returning just after that.
NB: Aorai only traces call and return events, so "just after" here means that main cannot call any function after its last call to b, but it can still perform other actions, such as x++;.
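The reading of the formula given above can be made concrete with a small Java sketch that evaluates it over a finite list of call and return events. This is only an illustration: LTL is normally defined over infinite traces, the event strings such as "CALL(a)" are a hypothetical encoding, and nothing here uses or imitates Aorai's actual implementation.

```java
import java.util.List;

// Evaluates, on a finite event trace, the formula
// CALL(main) && X(CALL(a) && X(RETURN(a) && F(RETURN(b) && X(RETURN(main)))))
class LtlTraceSketch {
    static boolean satisfies(List<String> trace) {
        // The first three events are pinned down by the nested X ("next") operators.
        if (trace.size() < 3) {
            return false;
        }
        if (!trace.get(0).equals("CALL(main)")
                || !trace.get(1).equals("CALL(a)")
                || !trace.get(2).equals("RETURN(a)")) {
            return false;
        }
        // F ("finally") is evaluated at position 2: the inner formula must hold at
        // some position i >= 2, i.e. RETURN(b) at i immediately followed by RETURN(main).
        for (int i = 2; i + 1 < trace.size(); i++) {
            if (trace.get(i).equals("RETURN(b)") && trace.get(i + 1).equals("RETURN(main)")) {
                return true;
            }
        }
        return false;
    }
}
```

The program from the question produces the events CALL(main), CALL(a), RETURN(a), CALL(a), RETURN(a), CALL(b), RETURN(b), CALL(b), RETURN(b), CALL(a), RETURN(a), RETURN(main). No RETURN(b) in it is directly followed by RETURN(main), so the sketch rejects it. A trace that ends with CALL(b), RETURN(b), RETURN(main) would be accepted, whatever calls occur between the first return of a and that point.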
Update
I've run your complete example on Frama-C. Indeed, a post-condition is missing in the contract for main generated by Aorai, namely that the state of the generated automaton at the end of main (T0_S4) is supposed to be accepting, which is not the case here. This is a bug. If you explicitly write an equivalent automaton in the ya language, as
%init: S0;
%accept: Sf;
S0: { CALL(main) } -> S1;
S1: { [ a() ] } -> S2;
S2: { RETURN(b) } -> S3
  | other -> S2;
S3: { RETURN(main) } -> Sf;
Sf: -> Sf;
then the generated contract for main contains a requires \false;, which indeed indicates that the function does not conform to the automaton, and Aoraï warns about that.
Please note, however, that in the general case Aoraï will not emit any warning. It generates contracts that, if fulfilled, imply that the whole program conforms to the automaton. The proof of the contracts must be done by another plugin (e.g. WP or Value Analysis).

JavaCC: why 'return jjtThis'?

Why write return jjtThis at the end of the methods?
What effect will it make?
What if I don't write this line?
When should I add this line and when shouldn't I add this line?
Is the returned value used somewhere else?
ASTDirectSQLStatement DirectSQLStatement() :
{}
{
  DirectlyExecutableStatement() <SEMICOLON>
  {
    return jjtThis;
  }
}
ASTDirectlyExecutableStatement DirectlyExecutableStatement() :
{}
{
  (
    LOOKAHEAD(<SELECT> | <DELETE> <FROM> | <INSERT> | <UPDATE> | <DECLARE>)
    DirectSQLDataStatement()
  | LOOKAHEAD(SQLSchemaStatement())
    SQLSchemaStatement()
  )
  {
    return jjtThis;
  }
}
Thank you :)
Quite simply, jjtThis is a special identifier that refers to the tree node (a SimpleNode or a subclass of it) being built for the production it is written inside of. We use or return jjtThis when we generate a parse tree or AST. In your example, the DirectSQLStatement production is a node in the tree. It calls another production, DirectlyExecutableStatement, which becomes DirectSQLStatement's child node; DirectlyExecutableStatement in turn calls other productions, which become its children, and so on.
SimpleNode is the class that makes up your tree. Usually jjtThis is returned only from the first production of the grammar, and in main() you have a SimpleNode object, say root, that receives it. Then root.dump(" ") prints the tree. Hope it helps!
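To make the role of the returned node concrete, here is a minimal Java sketch of the same idea without JJTree. MiniNode and MiniParser are hypothetical stand-ins written for this illustration, not the classes JJTree generates; in particular, the child is attached by hand here, whereas JJTree's generated code attaches children itself. The local variable self plays the part of jjtThis.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a JJTree node class such as SimpleNode.
class MiniNode {
    private final String name;
    private final List<MiniNode> children = new ArrayList<>();

    MiniNode(String name) {
        this.name = name;
    }

    void addChild(MiniNode child) {
        children.add(child);
    }

    // Renders this node and its descendants, one per line,
    // indenting each level by one extra space.
    String dump(String prefix) {
        StringBuilder sb = new StringBuilder(prefix).append(name).append('\n');
        for (MiniNode child : children) {
            sb.append(child.dump(prefix + " "));
        }
        return sb.toString();
    }
}

// Each method plays the role of a production that ends with "return jjtThis;".
class MiniParser {
    static MiniNode directSQLStatement() {
        MiniNode self = new MiniNode("DirectSQLStatement"); // corresponds to jjtThis
        self.addChild(directlyExecutableStatement());
        return self; // hands the node for this production back to the caller
    }

    static MiniNode directlyExecutableStatement() {
        return new MiniNode("DirectlyExecutableStatement");
    }

    public static void main(String[] args) {
        MiniNode root = directSQLStatement();
        System.out.print(root.dump(""));
    }
}
```

The caller of the start production is the only place that needs the returned node, because every other node is reachable from it as a child. That is why returning the node from the first production is usually enough.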

Recovering multiple errors in Javacc

Is there a way in JavaCC to keep parsing an input file after an error has been detected? I know there are several techniques, such as panic-mode recovery and phrase-level recovery, but I can't figure out how to implement them in a JavaCC .jjt file.
For example, assume my input file is
Line 1: int i
Line 2: int x;
Line 3: int k
What I want is, after detecting the missing semicolon on line 1, to continue parsing and find the error on line 3 too.
I found an answer using panic-mode error recovery, but it also has some bugs. I edited my grammar so that, once the parser encounters a missing character in a line of the input file (in the above case a semicolon), it proceeds until it finds that character. Such characters are called synchronizing tokens.
See the example below.
First I replaced all the SEMICOLON tokens in my grammar with this.
Semicolon()
Then add this new production rule.
void Semicolon() :
{}
{
  try
  {
    <SEMICOLON>
  } catch (ParseException e) {
    Token t;
    System.out.println(e.toString());
    // Panic mode: skip tokens up to and including the next semicolon (or EOF).
    do {
      t = getNextToken();
    } while (t != null && t.kind != SEMICOLON && t.kind != EOF);
  }
}
Once the parser encounters a missing character, it searches for the next occurrence of that character. When it finds one, it returns to the rule that called it.
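The skipping behaviour described above can be tried out on the three-line sample input with a small plain-Java sketch. This is a hand-written toy, not JavaCC-generated code: the class PanicModeSketch, its token representation as strings, and the one-rule grammar (a program is a sequence of "int" identifier ";" declarations) are all assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy panic-mode recovery over a list of string tokens.
// Assumed grammar: Program := ( "int" ID ";" )* EOF
class PanicModeSketch {
    private final List<String> tokens;
    private int pos = 0;
    final List<String> errors = new ArrayList<>();
    int cleanDeclarations = 0; // declarations that ended with their own semicolon

    PanicModeSketch(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<EOF>";
    }

    void parseProgram() {
        while (!peek().equals("<EOF>")) {
            if (!peek().equals("int")) {
                errors.add("expected 'int' but found '" + peek() + "'");
                skipPastSemicolon();
                continue;
            }
            pos++; // consume "int"
            if (!peek().equals("<EOF>")) {
                pos++; // consume the identifier (not validated in this sketch)
            }
            semicolon();
        }
    }

    // Counterpart of the Semicolon() production with its catch block.
    private void semicolon() {
        if (peek().equals(";")) {
            pos++;
            cleanDeclarations++;
            return;
        }
        errors.add("missing ';' before '" + peek() + "'");
        skipPastSemicolon();
    }

    // Panic mode: discard tokens up to and including the next ";" (the synchronizing token).
    private void skipPastSemicolon() {
        while (!peek().equals("<EOF>") && !peek().equals(";")) {
            pos++;
        }
        if (peek().equals(";")) {
            pos++;
        }
    }
}
```

Running parseProgram() on the tokens int i int x ; int k records two errors, one for each missing semicolon, so both faulty lines are reported. It also shows the cost of this strategy: the correct declaration int x; is discarded while the parser searches for the synchronizing token, so cleanDeclarations stays at 0.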
Example:
Assume a semicolon is missing in a variable declaration.
int a=10 <--- no semicolon
So the parser searches for a semicolon. At some point it finds one.
___(some code)__; method(param1);
After finding the first semicolon in the above example, it returns to the variable declaration rule (because that is the rule which called the Semicolon() method). But what follows the newly found semicolon is a function call, not a variable declaration.
Can anyone please suggest a way to solve this problem?
