Generate Parser for a File with JavaCC - javacc

I am a beginner with JavaCC, and I'm trying to generate a parser for a file.
I have already generated a parser that successfully interprets a line entered on the keyboard.
For example, when I type "First Name: William", it displays the name of the variable and its value on the screen.
Now I have a .txt file that contains a large number of names and their values, and I would like to display them on the screen in the same way.
Below is the .jj file I wrote to generate a parser for a typed line. I want the same thing, but for a file.
options
{
  static = true;
}
PARSER_BEGIN(parser_name)
public class parser_name
{
  public static void main(String args []) throws ParseException
  {
    System.out.println("Waiting for the Input:");
    parser_name parser = new parser_name(System.in);
    parser.Start();
  }
}
PARSER_END(parser_name)
SKIP :
{
  " "
| "\r"
| "\t"
| "\n"
}
TOKEN : { < DIGIT : (["0"-"9"])+ > }
TOKEN : { <VARIABLE: (["a"-"z", "A"-"Z"])+> }
TOKEN : { <VALUE: (~["\n",":"])+> }
TOKEN : { <ASSIGNMENT: ":"> }
void Start(): { Token t,t1,t2;}
{
  t=<VARIABLE>
  t1=<ASSIGNMENT>
  t2=<VALUE>
  { System.out.println("The Variable is "+t.image+",and the Value is "+t2.image); }
}
I have already tried replacing System.in in the parser constructor with an object of type File and then reading the file line by line, but it did not work.

Pass a Reader (for example a java.io.FileReader opened on the file) to the parser's constructor instead of System.in.
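As a rough sketch of what that looks like on the Java side: the generated parser class cannot be compiled here without running JavaCC, so the example below uses a hand-written stand-in method, parse(Reader), that splits each line at the first colon. The class name NameValueFile and the method parse are made up for illustration. With the real generated class you would hand the same Reader to new parser_name(reader) and call Start(). Note that Start() as written in the question matches a single name/value pair, so the production would probably also need to loop until end of file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class NameValueFile {
    /**
     * Hand-written stand-in for the generated parser (an assumption, not JavaCC output):
     * reads "name: value" lines from any Reader and formats one message per line.
     */
    static List<String> parse(Reader source) throws IOException {
        List<String> results = new ArrayList<>();
        BufferedReader in = new BufferedReader(source);
        String line;
        while ((line = in.readLine()) != null) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // skip lines that contain no assignment
            }
            String name = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            results.add("The Variable is " + name + ", and the Value is " + value);
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java NameValueFile <file.txt>");
            return;
        }
        // Open the file as a Reader; the try block closes it when parsing is done.
        try (Reader reader = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            // With the generated class this would be: new parser_name(reader).Start();
            for (String message : parse(reader)) {
                System.out.println(message);
            }
        }
    }
}
```

The point of the sketch is only that a file is opened as a Reader once and handed over whole; there is no need to read it line by line yourself before giving it to the parser.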

Related

Error Recovery in JavaCC for Qbasic language

I am developing a compiler (with JavaCC) for the QBasic language and I have an issue related to error recovery (that is, reporting all compiler errors when the program is compiled, rather than stopping at the first one),
so I have to handle ParseException and skip the line where it occurs.
Note: QBasic has no semicolons, so every statement is on its own line.
I have tried to catch the ParseException in every statement and handle it by calling getNextToken repeatedly until I reach the "\n" token.
Unfortunately, that does not work.
Here is my program method:
void program():
{
  Node n =null;
  programNode ret = new programNode() ;
  boolean canrun=true;
}
{
  (< LINE > | < SPACE >)*
  (
    try {
      n = statement()(< SPACE >)* <LINE>
    }
    catch(ParseException e)
    {
      canrun=false;
      Excep.add(e);
      Token t;
      do
      {
        t=CodeParserTokenManager.getNextToken();
      }while (t.image!="\n");
    }
    (< LINE > | < SPACE >)*{
      if (n!=null)
        ret.addChild(n);
    })+ "?"
  {
    if (canrun)
      ret.Start();
  }
}
And here is my parser class:
PARSER_BEGIN(CodeParser)
import java.util.ArrayList;
public class CodeParser
{
  public static void main(String args[])
  {
    CodeParser Parser = new CodeParser(System.in);
    try {
      program() ;
    }
    catch(ParseException e)
    {
    }
  }
}
PARSER_END(CodeParser)
I believe the problem is the line:
}while (t.image!="\n");
because (1) you shouldn't use != with strings in Java, since it compares object references rather than contents, and (2) the image could be different ("\r\n", for instance).
Try t.kind!=LINE instead.
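To make both points concrete, here is a small self-contained sketch. The Token class and the EOF, WORD and LINE constants are simplified stand-ins I made up for the example; in a real JavaCC project the Token class and the kind constants are generated for you. The loop compares t.kind and also stops at EOF, so it cannot run past the end of the input.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class SkipToLineEnd {
    // Hypothetical token kinds standing in for the constants JavaCC would generate.
    static final int EOF = 0, WORD = 1, LINE = 2;

    /** Minimal stand-in for JavaCC's Token, with only the two fields used here. */
    static final class Token {
        final int kind;
        final String image;

        Token(int kind, String image) {
            this.kind = kind;
            this.image = image;
        }
    }

    /** Consumes tokens up to and including the next LINE (or EOF) and returns how many were consumed. */
    static int skipLine(Iterator<Token> tokens) {
        int consumed = 0;
        Token t;
        do {
            t = tokens.next();
            consumed++;
        } while (t.kind != LINE && t.kind != EOF);
        return consumed;
    }

    public static void main(String[] args) {
        // A string built at run time is a different object from the literal "\n".
        String runtimeNewline = new String("\n");
        System.out.println(runtimeNewline != "\n");      // true: reference comparison
        System.out.println(runtimeNewline.equals("\n")); // true: content comparison

        List<Token> tokens = Arrays.asList(
                new Token(WORD, "PRINT"), new Token(LINE, "\r\n"),
                new Token(WORD, "END"), new Token(EOF, ""));
        Iterator<Token> it = tokens.iterator();
        System.out.println(skipLine(it));    // 2: PRINT and the line break
        System.out.println(it.next().image); // END
    }
}
```

Comparing kinds also handles the "\r\n" case, because the kind is the same whatever characters the line-break token happened to match.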

JavaCC IntegerLiteral

I am using JavaCC to build a lexer and a parser and I have the following code:
TOKEN:
{
  < #DIGIT : [ "0"-"9" ] >
| < INTEGER_LITERAL : (<DIGIT>)+ >
}
SimpleNode IntegerLiteral() :
{
  Token t;
}
{
  (t=<INTEGER_LITERAL>)
  {
    Integer n = new Integer(t.image);
    jjtThis.jjtSetValue( n );
    return jjtThis;
  }
}
Hence it should accept only integers, but it also accepts "4." or "4 %%%%%%", etc.
Try turning on debugging in your parser spec file, like this:
options {
  DEBUG_TOKEN_MANAGER = true;
}
This will print out what the token manager is doing while parsing.
"4." and "4%%%%" are not really accepted, because what your parser reads is always just "4".
If you also set DEBUG_PARSER = true; in the options section, you will see the token currently being read.
I think that if you change your grammar like this, you will see that it throws a TokenMgrError when it reads the unhandled character:
SimpleNode IntegerLiteral() :
{
  Token t;
}
{
  (
    t=<DIGIT>
    {
      Integer n = new Integer(t.image);
      jjtThis.jjtSetValue( n );
      return jjtThis;
    })+
}
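The underlying effect is the difference between matching a prefix of the input and matching all of it. The sketch below shows that difference with java.util.regex instead of JavaCC (a deliberately swapped-in technique, and the class and method names are made up): lookingAt() behaves like a production that stops after the first integer, while matches() behaves like one that also insists on reaching the end of the input. In a JavaCC grammar the usual way to get the second behaviour is, as far as I know, to follow the top-level production with <EOF>.

```java
import java.util.regex.Pattern;

final class PrefixVersusFullMatch {
    private static final Pattern INTEGER = Pattern.compile("[0-9]+");

    /** True if the input merely starts with an integer; the rest is never inspected. */
    static boolean startsWithInteger(String input) {
        return INTEGER.matcher(input).lookingAt();
    }

    /** True only if the whole input is an integer, with nothing left over. */
    static boolean isInteger(String input) {
        return INTEGER.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String input : new String[] {"4", "4.", "4 %%%%%%"}) {
            System.out.println("\"" + input + "\" prefix=" + startsWithInteger(input)
                    + " whole=" + isInteger(input));
        }
    }
}
```

For "4." and "4 %%%%%%" the prefix check succeeds and the whole-input check fails, which mirrors why the production in the question appears to accept them.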

JavaCC grammar - proper lexing

I have a JavaCC grammar with the following definitions:
<REGULAR_IDENTIFIER : (["A"-"Z"])+ > // simple identifier like say "DODGE"
<_LABEL : (["A"-"Z"])+ (":") > // label, eg "DODGE:"
<DOUBLECOLON : "::">
<COLON : ":">
Right now "DODGE::" is lexed as <_LABEL> <COLON> ("DODGE:" ":"),
but I need it to be lexed as <REGULAR_IDENTIFIER> <DOUBLECOLON> ("DODGE" "::").
I think the following will work:
MORE: { < (["A"-"Z"])+ :S0 > } // Could be identifier or label.
<S0> TOKEN: { <LABEL : ":" : DEFAULT> } // label, eg "DODGE:"
<S0> TOKEN: { <IDENTIFIER : "" : DEFAULT > } // simple identifier like say "DODGE"
<S0> TOKEN: { <IDENTIFIER : "::" { matchedToken.image = image.substring(0,image.length()-2) ; } : S1 > }
<S1> TOKEN: { <DOUBLECOLON : "" { matchedToken.image = "::" ; } : DEFAULT> }
<DOUBLECOLON : "::">
<COLON : ":">
Note that "DODGE:::" is three tokens, not two.
JavaCC uses the maximal match rule (longest prefix match); see:
http://www.engr.mun.ca/~theo/JavaCC-FAQ/javacc-faq-moz.htm#more-than-one
This means that the _LABEL token will be matched in preference to the REGULAR_IDENTIFIER token, because the _LABEL match contains more characters. So what you are trying to do should not be done in the tokenizer.
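To see the rule in action outside JavaCC, here is a toy longest-match tokenizer built on java.util.regex. It is only a sketch of the maximal match idea, not how the JavaCC token manager is implemented; the class name and the tokenize method are made up. The four rules are the ones from the question, in the same order, and on a tie in length the earlier rule wins.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MaximalMunchDemo {
    // Token rules from the question, kept in definition order.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("REGULAR_IDENTIFIER", Pattern.compile("[A-Z]+"));
        RULES.put("_LABEL", Pattern.compile("[A-Z]+:"));
        RULES.put("DOUBLECOLON", Pattern.compile("::"));
        RULES.put("COLON", Pattern.compile(":"));
    }

    /** Splits the input into token kinds, always taking the longest match at each position. */
    static List<String> tokenize(String input) {
        List<String> kinds = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestKind = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // A strictly longer match wins; on a tie the earlier rule is kept.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestKind = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestKind == null) {
                throw new IllegalArgumentException("No token matches at position " + pos);
            }
            kinds.add(bestKind);
            pos = bestEnd;
        }
        return kinds;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("DODGE::")); // [_LABEL, COLON]
        System.out.println(tokenize("DODGE"));   // [REGULAR_IDENTIFIER]
        System.out.println(tokenize("::"));      // [DOUBLECOLON]
    }
}
```

At position 0 of "DODGE::" the _LABEL rule matches six characters against five for REGULAR_IDENTIFIER, so the label wins and only a single colon is left over, which is exactly the behaviour described in the question.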
I have written a grammar that recognizes the input correctly. It uses the parser, instead of the tokenizer, to recognize the labels:
options {
  STATIC = false;
}
PARSER_BEGIN(Parser)
import java.io.StringReader;
public class Parser {
  // Main method, parses the first argument to the program
  public static void main(String[] args) throws ParseException {
    System.out.println("Parsing: " + args[0]);
    Parser parser = new Parser(new StringReader(args[0]));
    parser.Start();
  }
}
PARSER_END(Parser)
// The label will be recognized by the parser, not the tokenizer
TOKEN :
{
  <DOUBLECOLON : "::"> // The double colon is preferred to the single colon due to the maximal munch rule
|
  <COLON : ":">
|
  <REGULAR_IDENTIFIER : (["A"-"Z"])+ > // simple identifier like say "DODGE"
}
/** Root production. */
void Start() :
{}
{
  (
    LOOKAHEAD(2) // We need a lookahead of two to see whether this is a label or not
    <REGULAR_IDENTIFIER> <COLON> { System.err.println("label"); } // Labels, should probably be put in their own production
  | <REGULAR_IDENTIFIER> { System.err.println("reg_id"); } // Regular identifiers
  | <DOUBLECOLON> { System.err.println("DC"); }
  | <COLON> { System.err.println("C"); }
  )+
}
In a real grammar you should of course move the <REGULAR_IDENTIFIER> <COLON> sequence into its own label production.
Hope it helps.

How to perform multiple write operations on a single file?

I have a website in ASP.NET 2.0 that writes to a file. If another user hits the site at the same time, the request does not work until the first user's file operation has completed; only then can the second one operate on the file.
How should I handle this situation?
AppConfiguration appConfiguration = new AppConfiguration();
string LogFile =String.Empty;
string sLogFormat =string.Empty;
string sErrorTime =string.Empty;
StreamWriter sw=null;
public LogManager()
{
    if(!File.Exists(AppConfiguration.LogFilePath))
    {
        Directory.CreateDirectory(AppConfiguration.LogFilePath);
    }
    LogFile = AppConfiguration.LogFilePath+"WAP"+sErrorTime + ".log";
    if(!File.Exists(LogFile))
    {
        File.Create(LogFile);
    }
}
public void closeStream()
{
    if(sw != null)
    {
        sw.Close();
    }
}
public void LogException(string className,string methodName, string errorMessage)
{
    try
    {
        if(!File.Exists(LogFile))
        {
            File.Create(LogFile);
        }
        sw = new StreamWriter(LogFile,true);
        sw.WriteLine(DateTime.Now.ToString() + " | " + className + ":" + methodName + ":"+ errorMessage);
        sw.Flush();
        sw.Close();
    }
    catch(Exception)
    {
        if(sw != null)
        {
            sw.Close();
        }
    }
}
Rather than opening and closing the log file for every entry, you might consider having a single logging process (e.g. syslog or a separate logging thread) that keeps the log file open.
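The single-writer idea is not specific to any language. Here is a minimal sketch of it in Java rather than C#, so treat it as an illustration of the structure and not as drop-in ASP.NET code. The class QueueLogger and its methods are hypothetical names: callers only put lines on a queue, and one background thread keeps the file open and does all the writing.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.LocalDateTime;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical single-writer logger: many threads enqueue, one thread writes. */
final class QueueLogger implements AutoCloseable {
    // An empty Optional on the queue is the signal to stop.
    private final BlockingQueue<Optional<String>> queue = new LinkedBlockingQueue<>();
    private final Thread writer;

    QueueLogger(Path file) {
        writer = new Thread(() -> {
            // The file is opened once and stays open for the lifetime of the thread.
            try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
                while (true) {
                    Optional<String> item = queue.take();
                    if (!item.isPresent()) {
                        break;
                    }
                    out.write(item.get());
                    out.newLine();
                    out.flush();
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "log-writer");
        writer.start();
    }

    /** Safe to call from any thread; never touches the file directly. */
    void log(String className, String methodName, String errorMessage) {
        queue.add(Optional.of(LocalDateTime.now() + " | " + className + ":" + methodName + ":" + errorMessage));
    }

    /** Asks the writer thread to finish the queued lines and waits for it. */
    @Override
    public void close() throws InterruptedException {
        queue.add(Optional.empty());
        writer.join();
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("wap", ".log");
        try (QueueLogger logger = new QueueLogger(file)) {
            logger.log("LogManager", "LogException", "example message");
        }
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```

Because only one thread ever holds the file, concurrent requests no longer wait on each other for the file handle; they only wait for the cheap queue insertion.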
Writing lines to a plain text file is error-prone when multiple users write to it at the same time. Instead, you could store and retrieve the text in a database. You can even provide a button somewhere to export the records to a plain text file if needed.

Print matched token in JavaCC

I need to print the token that was matched by javacc, but I don't know how to "store it".
Let's say my token definition is:
TOKEN :
{
< BLAH: ["0"-"9"]>
}
and my Input() production is:
void Input():
{}
{ (<BLAH> { System.out.println("I recognize BLAH"); } )
}
However what I really want to output, given some input, let's say 5, is:
I recognize that BLAH is 5.
Any tips? Thanks
Basically you declare variables in the first curly braces and use them in the second:
void Input():
{ Token t; }
{
(t=<BLAH> { System.out.println("I recognize BLAH is " + t.image); } )
}
