Print matched token in JavaCC - javacc

I need to print the token that was matched by javacc, but I don't know how to "store it".
Let's say my token definition is:
TOKEN :
{
< BLAH: ["0"-"9"]>
}
and my parser.input() function is:
void Input():
{}
{ (<BLAH> { System.out.println("I recognize BLAH"); } )
}
However what I really want to output, given some input, let's say 5, is:
I recognize that BLAH is 5.
Any tips? Thanks

Basically you declare variables in the first curly braces and use them in the second:
void Input():
{ Token t; }
{
(t=<BLAH> { System.out.println("I recognize BLAH is " + t.image); } )
}

Related

Generate Parser for a File with JavaCC

I am a beginner with JavaCC, and I'm trying to generate a parser for a file.
I have already generated a working parser that interprets a line entered on the keyboard.
For example, when I type "First Name: William", the parser displays the name of the variable and its value.
Now I have a .txt file that contains a large number of names and their values, and I would like to display them on the screen in the same way.
Below is the .jj file I have written so far, which generates a parser for a typed line.
Now I want the same thing, but for a file.
options
{
static = true;
}
PARSER_BEGIN(parser_name)
public class parser_name
{
public static void main(String args []) throws ParseException
{
System.out.println("Waiting for the Input:");
parser_name parser = new parser_name(System.in);
parser.Start();
}
}
PARSER_END(parser_name)
SKIP :
{
" "
| "\r"
| "\t"
| "\n"
}
TOKEN : { < DIGIT : (["0"-"9"])+ > }
TOKEN : { <VARIABLE: (["a"-"z", "A"-"Z"])+> }
TOKEN : { <VALUE: (~["\n",":"])+> }
TOKEN : { <ASSIGNMENT: ":"> }
void Start(): { Token t,t1,t2;}
{
t=<VARIABLE>
t1=<ASSIGNMENT>
t2=<VALUE>
{ System.out.println("The Variable is "+t.image+",and the Value is "+t2.image); }
}
I have already tried replacing System.in in the parser constructor with an object of type File, and then reading the file line by line, but it did not work.
Pass a Reader to the parser's constructor.
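As a rough illustration of that suggestion, the sketch below opens a file as a java.io.Reader and hands it to a consumer. The readPairs method is only a hypothetical stand-in for the generated parser so that the example runs without JavaCC; the class name, the temporary file, and the sample lines are assumptions made for the demo. With the generated class, the call would presumably be new parser_name(reader) followed by parser.Start(), and Start() would need a loop around the three tokens to handle more than one line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ReaderInputDemo {
    /** Opens the file as a character stream, which is what a Reader-based constructor expects. */
    static Reader open(Path file) throws IOException {
        return Files.newBufferedReader(file, StandardCharsets.UTF_8);
    }

    /**
     * Hypothetical stand-in for the generated parser: reads "name: value"
     * lines from any Reader and returns them as "name=value" strings.
     */
    static List<String> readPairs(Reader in) {
        List<String> pairs = new ArrayList<>();
        new BufferedReader(in).lines().forEach(line -> {
            int colon = line.indexOf(':');
            if (colon >= 0) {
                pairs.add(line.substring(0, colon).trim() + "=" + line.substring(colon + 1).trim());
            }
        });
        return pairs;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative input file; a real program would take the path from args.
        Path file = Files.createTempFile("names", ".txt");
        Files.write(file, Arrays.asList("Name: William", "City: Oslo"), StandardCharsets.UTF_8);
        try (Reader in = open(file)) {
            // With the generated parser: new parser_name(in).Start();
            System.out.println(readPairs(in));
        } finally {
            Files.delete(file);
        }
    }
}
```

The point of the sketch is only that the consumer never sees a File object: it sees a stream of characters, regardless of whether they come from the keyboard, a string, or a file.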

JavaCC simple example not working

I am trying javacc for the first time with a simple naive example which is not working. My BNF is as follows:
<exp>:= <num>"+"<num>
<num>:= <digit> | <digit><num>
<digit>:= [0-9]
Based on this BNF, I am writing the SimpleAdd.jj as follows:
options
{
}
PARSER_BEGIN(SimpleAdd)
public class SimpleAdd
{
}
PARSER_END(SimpleAdd)
SKIP :
{
" "
| "\r"
| "\t"
| "\n"
}
TOKEN:
{
< NUMBER: (["0"-"9"])+ >
}
int expr():
{
int leftValue ;
int rightValue ;
}
{
leftValue = num()
"+"
rightValue = num()
{ return leftValue+rightValue; }
}
int num():
{
Token t;
}
{
t = <NUMBER> { return Integer.parseInt(t.toString()); }
}
Using the above file, I generate the Java source classes. My main class is as follows:
public class Main {
public static void main(String [] args) throws ParseException {
SimpleAdd parser = new SimpleAdd(System.in);
int x = parser.expr();
System.out.println(x);
}
}
When I am entering the expression via System.in, I am getting the following error:
11+11^D
Exception in thread "main" SimpleAddTest.ParseException: Encountered "<EOF>" at line 0, column 0.
Was expecting:
<NUMBER> ...
at SimpleAddTest.SimpleAdd.generateParseException(SimpleAdd.java:200)
at SimpleAddTest.SimpleAdd.jj_consume_token(SimpleAdd.java:138)
at SimpleAddTest.SimpleAdd.num(SimpleAdd.java:16)
at SimpleAddTest.SimpleAdd.expr(SimpleAdd.java:7)
at SimpleAddTest.Main.main(Main.java:9)
Any hints on how to solve the problem?
Edit: Note that this answer addresses an earlier version of the question.
When a BNF production uses a nonterminal that returns a result, you can record that result in a variable.
First declare the variables in the declaration part of the BNF production
int expr():
{
int leftValue ;
int rightValue ;
}
{
Second, in the main body of the production, record the results in the variables.
leftValue = num()
"+"
rightValue = num()
Finally, use the values of those variables to compute the result of this production.
{ return leftValue+rightValue; }
}

Semi-colon required, optional, or disallowed in gRPC option value?

I'm seeing one piece of code like the following:
rpc SayFallback (FooRequest) returns (FooResponse) {
option (com.example.proto.options.bar) = {
value : "{ message:\"baz\" }";
};
}
and another like the following:
rpc SayFallback (FooRequest) returns (FooResponse) {
option (com.example.proto.options.bar) = {
value : "{ message:\"baz\" }"
};
}
The first has a ; on the line with value while the second doesn't. Is either OK according to the standard?
Yes, the semicolon is optional there. See this snippet from the protobuf parser source:
while (!TryConsumeEndOfDeclaration("}", NULL)) {
if (AtEnd()) {
AddError("Reached end of input in method options (missing '}').");
return false;
}
if (TryConsumeEndOfDeclaration(";", NULL)) {
// empty statement; ignore
} else {
...
}

JavaCC IntegerLiteral

I am using JavaCC to build a lexer and a parser and I have the following code:
TOKEN:
{
< #DIGIT : [ "0"-"9" ] >
|< INTEGER_LITERAL : (<DIGIT>)+ >
}
SimpleNode IntegerLiteral() :
{
Token t;
}
{
(t=<INTEGER_LITERAL>)
{
Integer n = new Integer(t.image);
jjtThis.jjtSetValue( n );
return jjtThis;
}
}
Hence it should accept only integers but it is also accepting 4. or 4 %%%%%% etc.
Try turning on debugging in your parser spec file, like this:
options {
DEBUG_TOKEN_MANAGER = true;
}
This will print what the token manager is doing while parsing.
"4." and "4%%%%" are not really accepted, because what is read by your parser is always just "4".
If you also set DEBUG_PARSER = true; in the options section, you will see the token currently being read.
I think if you change your grammar like this, you will see that it throws a TokenMgrError when it reads the unhandled character:
SimpleNode IntegerLiteral() :
{
Token t;
}
{
(
t=<DIGIT>
{
Integer n = new Integer(t.image);
jjtThis.jjtSetValue( n );
return jjtThis;
})+
}
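The observation that the parser only ever reads the leading "4" can be illustrated outside JavaCC with a small analogy in plain Java. This is a sketch, not JavaCC code: the class and method names are made up for the demo, and java.util.regex stands in for the token manager. Matching a prefix corresponds to a production that consumes one INTEGER_LITERAL and then stops asking for tokens; matching the whole input corresponds to a production that also insists on the end of input, which in a JavaCC grammar is usually written by ending the start production with <EOF>.

```java
import java.util.regex.Pattern;

final class PrefixVersusWholeInput {
    private static final Pattern INTEGER = Pattern.compile("[0-9]+");

    /** Accepts if the input merely starts with an integer; the rest is never looked at. */
    static boolean acceptsPrefix(String input) {
        return INTEGER.matcher(input).lookingAt();
    }

    /** Accepts only if the entire input is an integer. */
    static boolean acceptsWhole(String input) {
        return INTEGER.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"4", "4.", "4 %%%%%%"}) {
            System.out.println(s + " -> prefix=" + acceptsPrefix(s) + ", whole=" + acceptsWhole(s));
        }
    }
}
```

Only "4" passes both checks; "4." and "4 %%%%%%" pass the prefix check alone, which mirrors the behaviour described in the question.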

JavaCC grammar - proper lexing

I have a JavaCC grammar with following definitions:
<REGULAR_IDENTIFIER : (["A"-"Z"])+ > // simple identifier like say "DODGE"
<_LABEL : (["A"-"Z"])+ (":") > // label, eg "DODGE:"
<DOUBLECOLON : "::">
<COLON : ":">
Right now "DODGE::" is lexed as <_LABEL> <COLON> ("DODGE:" ":"),
but I need it to be lexed as <REGULAR_IDENTIFIER> <DOUBLECOLON> ("DODGE" "::").
I think the following will work:
MORE: { < (["A"-"Z"])+ :S0 > } // Could be identifier or label.
<S0> TOKEN: { <LABEL : ":" : DEFAULT> } // label, eg "DODGE:"
<S0> TOKEN: { <IDENTIFIER : "" : DEFAULT > } // simple identifier like say "DODGE"
<S0> TOKEN: { <IDENTIFIER : "::" { matchedToken.image = image.substring(0,image.length()-2) ; } : S1 > }
<S1> TOKEN: { <DOUBLECOLON : "" { matchedToken.image = "::" ; } : DEFAULT> }
<DOUBLECOLON : "::">
<COLON : ":">
Note that "DODGE:::" is three tokens, not two.
JavaCC uses the maximal munch rule (the longest prefix match wins); see:
http://www.engr.mun.ca/~theo/JavaCC-FAQ/javacc-faq-moz.htm#more-than-one
This means that the _LABEL token will be matched before the REGULAR_IDENTIFIER token, as the _LABEL token will contain more characters. This means that what you are trying to do should not be done in the tokenizer.
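The longest-match behaviour described above can be reproduced in a few lines of plain Java. This is only a sketch of the rule, not of JavaCC's generated token manager: the class name is made up, java.util.regex patterns stand in for the four token definitions from the question, and ties are broken in favour of the earlier definition.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MaximalMunchDemo {
    // Token kinds in declaration order; on equal length the earlier one wins.
    private static final Map<String, Pattern> KINDS = new LinkedHashMap<>();
    static {
        KINDS.put("REGULAR_IDENTIFIER", Pattern.compile("[A-Z]+"));
        KINDS.put("_LABEL", Pattern.compile("[A-Z]+:"));
        KINDS.put("DOUBLECOLON", Pattern.compile("::"));
        KINDS.put("COLON", Pattern.compile(":"));
    }

    /** Splits the input into tokens, always taking the longest match at the current position. */
    static List<String> lex(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestKind = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> kind : KINDS.entrySet()) {
                Matcher m = kind.getValue().matcher(input).region(pos, input.length());
                // Strictly longer matches replace the current best, so ties keep the earlier kind.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestKind = kind.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestKind == null) {
                throw new IllegalArgumentException("Lexical error at position " + pos);
            }
            tokens.add(bestKind + "(" + input.substring(pos, bestEnd) + ")");
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lex("DODGE::")); // prints [_LABEL(DODGE:), COLON(:)]
    }
}
```

At position 0 the label pattern matches six characters and the identifier pattern only five, so the label wins and the remaining ":" becomes a single COLON, which is exactly the unwanted result from the question.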
I have written a parser that recognizes the grammar correctly. It recognizes the _LABELs in the parser instead of in the tokenizer:
options {
STATIC = false;
}
PARSER_BEGIN(Parser)
import java.io.StringReader;
public class Parser {
//Main method, parses the first argument to the program
public static void main(String[] args) throws ParseException {
System.out.println("Parsing: " + args[0]);
Parser parser = new Parser(new StringReader(args[0]));
parser.Start();
}
}
PARSER_END(Parser)
//The _LABEL will be recognized by the parser, not the tokenizer
TOKEN :
{
<DOUBLECOLON : "::"> //The double token will be preferred to the single colon due to the maximal munch rule
|
<COLON : ":">
|
<REGULAR_IDENTIFIER : (["A"-"Z"])+ > // simple identifier like say "DODGE"
}
/** Root production. */
void Start() :
{}
{
(
LOOKAHEAD(2) //We need a lookahead of two, to see if this is a label or not
<REGULAR_IDENTIFIER> <COLON> { System.err.println("label"); } //Labels, should probably be put in their own production
| <REGULAR_IDENTIFIER> { System.err.println("reg_id"); } //Regular identifiers
| <DOUBLECOLON> { System.err.println("DC"); }
| <COLON> { System.err.println("C"); }
)+
}
In a real grammar you should of course move the <REGULAR_IDENTIFIER> <COLON> sequence into its own _label production.
Hope it helps.
