JavaCC: getting a ParseException when lookahead > 0

I'm getting this error when I try to make a simple parser. The parser should accept (01|10|00|11)*(00|11). When I use lookahead = 0, 00100100 triggers an error even though it's valid input, because JavaCC reads it as 00 1 00 1 00, not 00 10 01 00. But when I add lookahead to fix it, I get:
Exception in thread "main" ParseException: Encountered "" at line 1, column 6.
Was expecting one of:
at GS.generateParseException(GS.java:453)
at GS.jj_consume_token(GS.java:337)
at GS.q3(GS.java:50)
at GS.q0(GS.java:17)
at GS.q1(GS.java:32)
at GS.q0(GS.java:14)
at GS.q3(GS.java:43)
at GS.q0(GS.java:17)
at GS.main(GS.java:8)
Can anyone help me to find the cause?
Any help would be much appreciated. Thanks
options{
LOOKAHEAD = 4;
}
PARSER_BEGIN(GS)
public class GS {
public static void main(String args[]) throws ParseException {
GS parser = new GS(System.in);
parser.q0();
}
}
PARSER_END(GS)
TOKEN:
{
<END : (["\n", "\r", "\t"])+>
}
void q0():{}
{
"1" q1() | "0" q2() | "00" q3() | "11" q3()
}
void q1():{}
{
"0" q0()
}
void q2():{}
{
"1" q0()
}
void q3():{}
{
q0() | <END>
}

I think that writing
void q3():{}
{
<END> |q0()
}
instead of
void q3():{}
{
q0() | <END>
}
may fix the problem: if you inline q3, you can see that q0 is left recursive, so you need infinite lookahead to choose between the alternatives. Putting END first gives it higher priority, and the production becomes right recursive.
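The token split reported in the question (00 1 00 1 00) is what a longest-match tokenizer produces for these literals, and JavaCC's token manager picks the longest match regardless of the parser's LOOKAHEAD setting. The following is a minimal sketch that imitates that rule in plain Java; the class and method names are made up for illustration and this is not the code JavaCC generates.

```java
import java.util.ArrayList;
import java.util.List;

class LongestMatchSketch {
    // The string literals that appear as tokens in the q0..q2 productions.
    private static final String[] LITERALS = {"0", "1", "00", "11"};

    // Splits the input greedily: at each position, take the longest literal that matches.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String best = null;
            for (String literal : LITERALS) {
                boolean matches = input.startsWith(literal, pos);
                if (matches && (best == null || literal.length() > best.length())) {
                    best = literal;
                }
            }
            if (best == null) {
                throw new IllegalArgumentException("No token matches at position " + pos);
            }
            tokens.add(best);
            pos += best.length();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("00100100")); // prints [00, 1, 00, 1, 00]
    }
}
```

Because the split happens before the parser sees any token, one way around it is to define only the single-character tokens "0" and "1" and let the grammar pair them up.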

Related

Error Recovery in JavaCC for Qbasic language

I am developing a compiler (with JavaCC) for the QBasic language, and I have an issue related to error recovery (that is, reporting all compiler errors when the program is compiled),
so I have to handle ParseException and skip the line where the ParseException occurs.
Note: QBasic has no semicolons, so every statement is on its own line.
I have tried to catch the ParseException in every statement and handle it by calling getNextToken repeatedly until I get a "\n" token.
Unfortunately, that does not work!
Here is my program method:
void program():
{
Node n =null;
programNode ret = new programNode() ;
boolean canrun=true;
}
{
(< LINE > | < SPACE >)*
(
try {
n = statement()(< SPACE >)* <LINE>
}
catch(ParseException e)
{
canrun=false;
Excep.add(e);
Token t;
do
{
t=CodeParserTokenManager.getNextToken();
}while (t.image!="\n");
}
(< LINE > | < SPACE >)*{
if (n!=null)
ret.addChild(n);
})+ "?"
{
if (canrun)
ret.Start();
}
}
And here is my parser class:
PARSER_BEGIN(CodeParser)
import java.util.ArrayList;
public class CodeParser
{
public static void main(String args[])
{
CodeParser Parser = new CodeParser(System.in);
try {
program() ;
}
catch(ParseException e)
{
}
}
}
PARSER_END(CodeParser)
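I believe the problem is the line:
}while (t.image!="\n");
because 1) you shouldn't compare strings with != in Java, since it compares object references rather than contents, and 2) the image could be different ("\r\n", for instance).
Try comparing the token kind instead: t.kind != LINE.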
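Both points can be illustrated outside JavaCC. The sketch below uses a hypothetical Token class and hypothetical kind constants (EOF, LINE, WORD) standing in for the generated ones; it shows why != on strings is unreliable and how a skip loop can test the kind instead. Stopping at EOF as well is an extra assumption added here so the loop cannot run past the end of the input.

```java
import java.util.Arrays;
import java.util.List;

class SkipLineSketch {
    // Hypothetical token kinds; a generated parser takes these from the grammar.
    static final int EOF = 0, LINE = 1, WORD = 2;

    static final class Token {
        final int kind;
        final String image;

        Token(int kind, String image) {
            this.kind = kind;
            this.image = image;
        }
    }

    // Returns the index just past the next LINE token; stops early at EOF or end of list.
    static int skipToNextLine(List<Token> tokens, int pos) {
        while (pos < tokens.size()) {
            Token t = tokens.get(pos);
            if (t.kind == EOF) {
                break; // never read past the end of the input
            }
            pos++;
            if (t.kind == LINE) {
                break; // the kind is LINE whether the image is "\n" or "\r\n"
            }
        }
        return pos;
    }

    public static void main(String[] args) {
        String runtimeNewline = new String("\n");
        System.out.println(runtimeNewline != "\n");      // true: two distinct objects
        System.out.println(runtimeNewline.equals("\n")); // true: same contents

        List<Token> tokens = Arrays.asList(
                new Token(WORD, "PRINT"), new Token(WORD, "x"),
                new Token(LINE, "\r\n"), new Token(WORD, "y"));
        System.out.println(skipToNextLine(tokens, 0));   // 3
    }
}
```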

JavaCC simple example not working

I am trying JavaCC for the first time with a simple, naive example that is not working. My BNF is as follows:
<exp>:= <num>"+"<num>
<num>:= <digit> | <digit><num>
<digit>:= [0-9]
Based on this BNF, I am writing the SimpleAdd.jj as follows:
options
{
}
PARSER_BEGIN(SimpleAdd)
public class SimpleAdd
{
}
PARSER_END(SimpleAdd)
SKIP :
{
" "
| "\r"
| "\t"
| "\n"
}
TOKEN:
{
< NUMBER: (["0"-"9"])+ >
}
int expr():
{
int leftValue ;
int rightValue ;
}
{
leftValue = num()
"+"
rightValue = num()
{ return leftValue+rightValue; }
}
int num():
{
Token t;
}
{
t = <NUMBER> { return Integer.parseInt(t.toString()); }
}
Using the above file, I generate the Java source classes. My main class is as follows:
public class Main {
public static void main(String [] args) throws ParseException {
SimpleAdd parser = new SimpleAdd(System.in);
int x = parser.expr();
System.out.println(x);
}
}
When I enter the expression via System.in, I get the following error:
11+11^D
Exception in thread "main" SimpleAddTest.ParseException: Encountered "<EOF>" at line 0, column 0.
Was expecting:
<NUMBER> ...
at SimpleAddTest.SimpleAdd.generateParseException(SimpleAdd.java:200)
at SimpleAddTest.SimpleAdd.jj_consume_token(SimpleAdd.java:138)
at SimpleAddTest.SimpleAdd.num(SimpleAdd.java:16)
at SimpleAddTest.SimpleAdd.expr(SimpleAdd.java:7)
at SimpleAddTest.Main.main(Main.java:9)
Any hints on how to solve the problem?
Edit: Note that this answer addresses an earlier version of the question.
When a BNF production uses a nonterminal that returns a result, you can record that result in a variable.
First, declare the variables in the declaration part of the BNF production:
int expr():
{
int leftValue ;
int rightValue ;
}
{
Second, in the main body of the production, record the results in the variables.
leftValue = num()
"+"
rightValue = num()
Finally, use the values of those variables to compute the result of this production.
{ return leftValue+rightValue; }
}
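The three steps above map closely onto an ordinary Java method: declare locals, assign each sub-result to one, then return the combination. The following hand-written sketch mirrors the expr() and num() productions for illustration only; it is not the code JavaCC generates, and the class name and helper methods are assumptions.

```java
class SimpleAddSketch {
    private final String input;
    private int pos = 0;

    SimpleAddSketch(String input) {
        this.input = input;
    }

    // Mirrors expr(): record each num() result in a local, then combine them.
    int expr() {
        int leftValue = num();
        expect('+');
        int rightValue = num();
        return leftValue + rightValue;
    }

    // Mirrors num(): one or more digits, converted to an int.
    private int num() {
        skipWhitespace();
        int start = pos;
        while (pos < input.length() && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalStateException("Expected a number at position " + pos);
        }
        return Integer.parseInt(input.substring(start, pos));
    }

    private void expect(char c) {
        skipWhitespace();
        if (pos >= input.length() || input.charAt(pos) != c) {
            throw new IllegalStateException("Expected '" + c + "' at position " + pos);
        }
        pos++;
    }

    // Plays the role of the SKIP section in the grammar.
    private void skipWhitespace() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
    }

    public static void main(String[] args) {
        System.out.println(new SimpleAddSketch("11+11").expr()); // prints 22
    }
}
```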

For Loop to Recursion Statement - Syntax

I was asked to write a program for an assignment that produces output like the following, as long as the number entered is positive.
Please enter a number: 4
****
***
**
*
**
***
****
This works correctly with the for loop I wrote. However, I was told that no for loop, or loop of any kind, may be used. I was asked to change this to a recursive method and to use the call in two if/else statements. I have read everything I could find on converting a for loop into recursion, but I have been unsuccessful. I would greatly appreciate some help understanding this, with some in-depth clarification.
static void printPattern(int pattern) {
for (int i=0; i<pattern; ++i) {
System.out.print("*");
}
System.out.println();
}
public static void printStars(int lines) {
if (lines<=1) {
printPattern(1);
} else {
printPattern(lines);
printStars(lines-1);
printPattern(lines);
}
}
}
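Try:
static void printPattern(int pattern) {
// Recursive case: print one star, then let the next call print the rest.
if(pattern>0){
System.out.print("*");
printPattern(pattern-1);
}else{
// Base case: no stars left, so end the line.
System.out.println();
}
}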
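Putting the recursive printPattern together with the printStars logic from the question gives a version with no loops at all. The sketch below builds the output as a string instead of printing directly, which is a choice made here so the result is easy to check; the class and method names are made up.

```java
class StarPatternSketch {
    // One row of n stars followed by a newline, built without a loop.
    static String row(int n) {
        if (n <= 0) {
            return "\n";
        }
        return "*" + row(n - 1);
    }

    // Same shape as printStars in the question: a row, the smaller pattern, the row again.
    static String stars(int lines) {
        if (lines <= 1) {
            return row(1);
        }
        return row(lines) + stars(lines - 1) + row(lines);
    }

    public static void main(String[] args) {
        System.out.print(stars(4));
    }
}
```

For an input of 4 this produces the seven rows shown in the question, from four stars down to one and back up to four.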

Do not change a loop variable inside a for-loop block

I want to implement the following coding rule in my JavaCC-generated parser:
Do not change a loop variable inside a for-loop block.
The JavaCC production for the for-loop block is:
void MyMethod () : {}
{
"(" Argument () ")" {}
(Statement ()) *
}
void Statement () : {}
{
expressionFOR()
}
void expressionFOR() :{}
{
<For> <id> "= " 1 <to> 100
int J
int kk =SUM( , J)
......
}
Thank you very much in advance.
Assuming you are using JJTree with MULTI=false and VISITOR=true, you could write a visitor along these lines:
public void visit(SimpleNode node, Object data) {
if( this is a for loop node ) {
push the for loop variable onto a stack of variables
node.childrenAccept(this, null) ;
pop the stack }
else {
if( this is an assignment statement node
and the target variable is on the stack )
report rule violated
node.childrenAccept(this, null) ;
}
}
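The same idea can be tried out without JJTree. The sketch below uses a hypothetical Node class with three kinds (FOR, ASSIGN, OTHER) in place of the generated SimpleNode, so every name in it is an assumption; only the stack-based traversal follows the outline above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class LoopVarRuleSketch {
    enum Kind { FOR, ASSIGN, OTHER }

    // Hypothetical tree node: 'name' is the loop variable (FOR) or the assignment target (ASSIGN).
    static final class Node {
        final Kind kind;
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(Kind kind, String name, Node... kids) {
            this.kind = kind;
            this.name = name;
            children.addAll(Arrays.asList(kids));
        }
    }

    // Returns the names of loop variables that are assigned inside their own loop.
    static List<String> check(Node root) {
        List<String> violations = new ArrayList<>();
        visit(root, new ArrayDeque<>(), violations);
        return violations;
    }

    private static void visit(Node node, Deque<String> loopVars, List<String> violations) {
        boolean isFor = node.kind == Kind.FOR;
        if (isFor) {
            loopVars.push(node.name);      // entering a loop: its variable becomes protected
        } else if (node.kind == Kind.ASSIGN && loopVars.contains(node.name)) {
            violations.add(node.name);     // assignment to a protected variable
        }
        for (Node child : node.children) {
            visit(child, loopVars, violations);
        }
        if (isFor) {
            loopVars.pop();                // leaving the loop: the variable is free again
        }
    }

    public static void main(String[] args) {
        Node program = new Node(Kind.OTHER, null,
                new Node(Kind.FOR, "i",
                        new Node(Kind.ASSIGN, "k"),
                        new Node(Kind.ASSIGN, "i")),
                new Node(Kind.ASSIGN, "i"));
        System.out.println(check(program)); // prints [i]
    }
}
```

Only the assignment to i inside the loop is reported; the one after the loop is fine because the variable has already been popped.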

JavaCC grammar - proper lexing

I have a JavaCC grammar with following definitions:
<REGULAR_IDENTIFIER : (["A"-"Z"])+ > // simple identifier like say "DODGE"
<_LABEL : (["A"-"Z"])+ (":") > // label, eg "DODGE:"
<DOUBLECOLON : "::">
<COLON : ":">
Right now "DODGE::" is lexed as <_LABEL> <COLON> ("DODGE:" ":"),
but I need it to be lexed as <REGULAR_IDENTIFIER> <DOUBLECOLON> ("DODGE" "::").
I think the following will work
MORE: { < (["A"-"Z"])+ :S0 > } // Could be identifier or label.
<S0> TOKEN: { <LABEL : ":" : DEFAULT> } // label, eg "DODGE:"
<S0> TOKEN: { <IDENTIFIER : "" : DEFAULT > } // simple identifier like say "DODGE"
<S0> TOKEN: { <IDENTIFIER : "::" { matchedToken.image = image.substring(0,image.length()-2) ; } : S1 > }
<S1> TOKEN: { <DOUBLECOLON : "" { matchedToken.image = "::" ; } : DEFAULT> }
<DOUBLECOLON : "::">
<COLON : ":">
Note that "DODGE:::" is three tokens, not two.
JavaCC uses the maximal match rule (longest prefix match); see:
http://www.engr.mun.ca/~theo/JavaCC-FAQ/javacc-faq-moz.htm#more-than-one
This means that the _LABEL token will be matched in preference to the REGULAR_IDENTIFIER token, because the _LABEL match contains more characters. So what you are trying to do should not be done in the tokenizer.
I have written a parser that recognizes the grammar correctly. It uses the parser, instead of the tokenizer, to recognize _LABELs:
options {
STATIC = false;
}
PARSER_BEGIN(Parser)
import java.io.StringReader;
public class Parser {
//Main method, parses the first argument to the program
public static void main(String[] args) throws ParseException {
System.out.println("Parsing: " + args[0]);
Parser parser = new Parser(new StringReader(args[0]));
parser.Start();
}
}
PARSER_END(Parser)
//The _LABEL will be recognized by the parser, not the tokenizer
TOKEN :
{
<DOUBLECOLON : "::"> //The double token will be preferred to the single colon due to the maximal munch rule
|
<COLON : ":">
|
<REGULAR_IDENTIFIER : (["A"-"Z"])+ > // simple identifier like say "DODGE"
}
/** Root production. */
void Start() :
{}
{
(
LOOKAHEAD(2) //We need a lookahead of two, to see if this is a label or not
<REGULAR_IDENTIFIER> <COLON> { System.err.println("label"); } //Labels, should probably be put in their own production
| <REGULAR_IDENTIFIER> { System.err.println("reg_id"); } //Regular identifiers
| <DOUBLECOLON> { System.err.println("DC"); }
| <COLON> { System.err.println("C"); }
)+
}
In a real grammar you should of course move the <REGULAR_IDENTIFIER> <COLON> sequence into its own _label production.
Hope it helps.
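To see the effect of this split between tokenizer and parser without running JavaCC, here is a small plain-Java imitation. It is only a sketch: the regex-based lexer and the classify method are stand-ins invented for illustration, with "::" tried before ":" so the longer token wins, and a one-token peek after an identifier playing the role of LOOKAHEAD(2).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LabelSketch {
    // "::" is listed before ":" so the double colon is preferred when both could match.
    private static final Pattern TOKEN = Pattern.compile("(::)|(:)|([A-Z]+)");

    // Turns the input into a list of token kind names.
    static List<String> lex(String input) {
        List<String> kinds = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at position " + pos);
            }
            if (m.group(1) != null) {
                kinds.add("DOUBLECOLON");
            } else if (m.group(2) != null) {
                kinds.add("COLON");
            } else {
                kinds.add("REGULAR_IDENTIFIER");
            }
            pos = m.end();
        }
        return kinds;
    }

    // Imitates the Start production: an identifier followed by COLON is a label.
    static List<String> classify(List<String> kinds) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < kinds.size()) {
            String kind = kinds.get(i);
            if (kind.equals("REGULAR_IDENTIFIER")) {
                boolean colonNext = i + 1 < kinds.size() && kinds.get(i + 1).equals("COLON");
                out.add(colonNext ? "label" : "reg_id");
                i += colonNext ? 2 : 1;
            } else {
                out.add(kind.equals("DOUBLECOLON") ? "DC" : "C");
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(classify(lex("DODGE:")));  // prints [label]
        System.out.println(classify(lex("DODGE::"))); // prints [reg_id, DC]
    }
}
```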
