Recovering from multiple errors in JavaCC - javacc

Is there a way in JavaCC to keep parsing an input file after an error has been detected? I have read that there are several approaches, such as panic-mode recovery and phrase-level recovery, but I can't figure out how to implement them in a JavaCC .jjt file.
For example, assume my input file is:
Line 1: int i
Line 2: int x;
Line 3: int k
What I want is for the parser, after detecting the missing semicolon at line 1, to continue parsing and report the error at line 3 as well.

I found an answer using panic-mode error recovery, but it still has some bugs. I edited my grammar so that when the parser encounters a missing token in the input (in the case above, a semicolon), it skips ahead until it finds the next token of that kind. Such tokens are called synchronizing tokens.
See the example below.
First, I replaced every SEMICOLON token in my grammar with a call to this production:
Semicolon()
Then I added this new production rule:
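void Semicolon() :
{}
{
  try
  {
    <SEMICOLON>
  } catch (ParseException e) {
    Token t;
    System.out.println(e.toString());
    // Panic mode: discard tokens until the next semicolon or end of input.
    // The null check comes first so that t is never dereferenced when null.
    do {
      t = getNextToken();
    } while (t != null && t.kind != SEMICOLON && t.kind != EOF);
  }
}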
When the expected token is missing, the parser searches for the next occurrence of it. When it finds one, it returns to the rule that called Semicolon().
Example:
Assume a semicolon is missing in a variable declaration.
int a=10 <--- no semicolon
So the parser searches for a semicolon. At some point it finds one:
___(some code)__; method(param1);
After finding the first semicolon in the example above, the parser returns to the variable declaration rule (because that is the rule that called Semicolon()). But what follows the newly found semicolon is a function call, not a variable declaration.
Can anyone suggest a way to solve this problem?

Related

How not to create temporary QRegularExpression objects

I am getting a warning (in the QtCreator IDE) regarding the code snippet below. The warning is that I should not create temporary QRegularExpression objects; instead use a static QRegularExpression object.
QRegularExpression re("SEARCHING...",QRegularExpression::CaseInsensitiveOption);
QRegularExpressionMatch match = re.match(frame);
if (match.hasMatch()) {
It's not obvious to me: how should I use the QRegularExpression instead?
That's a clazy warning. It is pointing out that the QRegularExpression is reconstructed every time the function is entered, even though the pattern never changes. Making the object static, so that it is constructed only once, should work:
static QRegularExpression re("SEARCHING...", QRegularExpression::CaseInsensitiveOption);
QRegularExpressionMatch match = re.match(frame);
if (match.hasMatch()) {

if {...} else {...}: Does the line break between "}" and "else" really matter?

I write my if {...} else {...} statement in R in the following way as I find it more readable.
Testifelse = function(number1)
{
  if(number1>1 & number1<=5)
  {
    number1 =1
  }
  else if ((number1>5 & number1 < 10))
  {
    number1 =2
  }
  else
  {
    number1 =3
  }
  return(number1)
}
According to ?Control:
... In particular, you should not have a newline between } and else to avoid a syntax error in entering a if ... else construct at the keyboard or via source ...
the function above should cause a syntax error, but it actually works! What's going on here?
Thanks for your help.
Original question and answer
If we type the following into the R console:
if (1 > 0) {
cat("1\n");
}
else {
cat("0\n");
}
why does it not work?
R is an interpreted language, so R code is parsed line by line. (Remark by @JohnColeman: This judgement is too broad. Any modern interpreter does some parsing, and an interpreted language like Python has no problem analogous to R's problem here. It is a design decision that the makers of R made, but it wasn't a decision that was forced on them by virtue of the fact that it is interpreted (though doubtless it made the interpreter somewhat easier to write).)
Since
if (1 > 0) {
cat("1\n");
}
makes a complete, legal statement, the parser will treat it as a complete code block. Then, the following
else {
cat("0\n");
}
will raise an error, because it is parsed as a new statement and no statement can start with else.
Therefore, we really should do:
if (1 > 0) {
cat("1\n");
} else {
cat("0\n");
}
so that the parser will have no difficulty in identifying them as a whole block.
In a compiled language like C there is no such issue, because at compilation time the compiler can "see" all the lines of your code.
Final update related to what's going on inside a function
There is really no magic here! The key is the use of {} to manually indicate a code block. We all know that in R,
{statement_1; statement_2; ...; statement_n;}
is treated as a single expression, whose value is statement_n.
Now, let's do:
{
if (1 > 0) {
cat("1\n");
}
else {
cat("0\n");
}
}
It works and prints 1.
Here, the outer {} is a hint to the parser that everything inside is a single expression, so parsing and interpreting should not terminate till reaching the final }. This is exactly what happens in a function, as a function body has {}.

Print reverse of a string using recursion

This code works:
void reverse(char *str)
{
    if(*str)
    {
        reverse(str+1);
        printf("%c", *str);
    }
}
But if I replace reverse(str+1) with reverse(++str), it doesn't print the first character.
In: Geeks
Out: skee
I don't know why.
Because ++str modifies the pointer that was passed to each call, starting with the very first one. By the time a call gets around to printing its character after the recursion returns, its pointer has already been advanced to the next character.
In the first case, str+1, str isn't being modified at all, so the very last printf just prints the first character.
Keep in mind that the prefix and postfix ++ actually change the value of the variable.
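++str increments the pointer before the recursive call, so the printf that follows sees the already-advanced pointer. Note that str++ is not a fix either: it passes the unincremented pointer to the recursive call, so the recursion never advances. Use str+1.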
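The difference can be seen in a minimal sketch. The names reverse_plus_one, reverse_pre_increment and append_char are hypothetical and not part of the original question; the functions collect characters into a caller-supplied buffer instead of printing them, so that the two results can be compared. The buffer is assumed to start as an empty string and to be at least two bytes longer than the input.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append one character to a NUL-terminated buffer.
   The caller must make sure the buffer is large enough. */
static void append_char(char *out, char c)
{
    size_t n = strlen(out);
    out[n] = c;
    out[n + 1] = '\0';
}

/* Same shape as the working version: str itself is never modified. */
void reverse_plus_one(const char *str, char *out)
{
    if (*str) {
        reverse_plus_one(str + 1, out);
        append_char(out, *str);   /* still the character this call started at */
    }
}

/* Same shape as the broken version: ++str moves this call's own pointer. */
void reverse_pre_increment(const char *str, char *out)
{
    if (*str) {
        reverse_pre_increment(++str, out);
        append_char(out, *str);   /* one character past where this call started */
    }
}
```

With the input "Geeks", the first function fills the buffer with "skeeG" and the second with "skee", matching the output reported in the question: every call in the second version records the character after its own, so the first character is never recorded and the innermost call records only the terminating NUL.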

If property exists logic in BizTalk message assignment shape

Is the if / else logic valid in a BizTalk Message Assignment shape?
I'm getting some event log errors regarding ErrorReport.FailureTime having no value, so I thought I'd put a guard clause in the Message Assignment shape:
if (ErrorReport.FailureTime exists Msg_Failed)
{
Var_FailureTime = Msg_Failed(ErrorReport.FailureTime);
}
else
{
Var_FailureTime = System.DateTime.Now;
}
... rest of code constructing the error report message ...
But the compiler fails with ...
error X2254: unexpected keyword: 'if'
That is the expected behavior.
'if' is not supported in the Message Assignment shape, but it is supported in the Expression shape. So you will have to do this test/assignment before the Construct shape.

Pass a string value in a recursive bison rule

I'm having some issues with bison (again).
I'm trying to pass a string value through a "recursive rule" in my grammar file using $$,
but when I print the value I have passed, the output looks like a bad reference ( AU�� ) instead of the value I wrote in my input file.
line: tok1 tok2
    | tok1 tok2 tok3
      {
        int len=0;
        len = strlen($1) + strlen($3) + 3;
        char out[len];
        strcpy(out,$1);
        strcat(out," = ");
        strcat(out,$3);
        printf("out -> %s;\n",out);
        $$ = out;
      }
    | line tok4
      {
        printf("line -> %s\n",$1);
      }
This is a simplified part of the code.
Given the input tokens tok1 tok2 tok3, it should assign the out variable to $$ (with the printf I can see that in the first part of the rule the out variable has the correct value).
When tok4 is matched next, I am in the recursive part of the rule. But when I print the value of $1 (which should be equal to out, since I passed it through $$), I don't get the right output.
You cannot set:
$$ = out;
because the string that out refers to is just about to vanish into thin air, as soon as the block in which it was declared ends.
In order to get away with this, you need to malloc the storage for the new string.
Also, you need strlen($1) + strlen($3) + 4; because you need to leave room for the NUL terminator.
It's important to understand that C does not really have strings. It has pointers to char (char*), but those are really pointers. It has arrays (char []), but you cannot use an array as an aggregate. For example, in your code, out = $1 would be illegal, because you cannot assign to an array. (Also because $1 is a pointer, not an array, but that doesn't matter because any reference to an array, except in sizeof, is effectively reduced to a pointer.)
So when you say $$ = out, you are making $$ point to the storage represented by out, and that storage is just about to vanish. So that doesn't work. You can say $$ = $1, because $1 is also a pointer to char; that makes $$ and $1 point to the same character. (That's legal but it makes memory management more complicated. Also, you need to be careful with modifications.) Finally, you can say strcpy($$, out), but that relies on $$ already pointing to a string which is long enough to hold out, something which is highly unlikely, because what it means is to copy the storage pointed to by out into the location pointed to by $$.
Also, as I noted above, when you are using "string" functions in C, they all insist that the sequence of characters pointed to by their "string" arguments (i.e. the pointer-to-character arguments) must be terminated with a 0 character (that is, the character whose code is 0, not the character 0).
If you're used to programming in languages which actually have a string datatype, all this might seem a bit weird. Practice makes perfect.
The bottom line is that what you need to do is to create a new region of storage large enough to contain your string, like this (I removed out because it's not necessary):
$$ = malloc(len + 1); // room for NUL
strcpy($$, $1);
strcat($$, " = ");
strcat($$, $3);
// You could replace the strcpy/strcat/strcat with:
// sprintf($$, "%s = %s", $1, $3)
Note that storing malloc'd data (including the result of strdup and asprintf) on the parser stack (that is, as $$) also implies the necessity to free it when you're done with it; otherwise, you have a memory leak.
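As a minimal sketch of the allocation described above, the concatenation can be moved into a plain C helper that the grammar action calls. The name join_assignment is hypothetical, not part of bison or of the original grammar, and the sketch assumes both arguments are valid NUL-terminated strings.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly malloc'd string "lhs = rhs",
   or NULL if the allocation fails. The caller owns the result and
   must free() it. */
char *join_assignment(const char *lhs, const char *rhs)
{
    /* 3 bytes for " = " plus 1 byte for the NUL terminator */
    size_t size = strlen(lhs) + strlen(rhs) + 4;
    char *result = malloc(size);
    if (result == NULL)
        return NULL;
    snprintf(result, size, "%s = %s", lhs, rhs);
    return result;
}
```

In the grammar action this would be used as $$ = join_assignment($1, $3);. Because the storage comes from malloc rather than from a local array, it is still valid when the recursive alternative later reads it as $1, and whichever rule finally consumes the value is responsible for freeing it.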
I've solved it by changing the $$ = out; line into strcpy($$,out); and now it works properly.

Resources