How to exclude character " in a token JavaCC - bnf

Hello, I'm working with JavaCC and I am writing a token that puts a string between " ". Context:
void literalString(): {} { """ (characteresString())? """ }
void characteresString(): {} { <characterString> | characteresString() <characterString> }
So I made this token to match the string contents:
TOKEN : {<characterString : ~["\", "] >}
The problem is that I don't know how to exclude the " symbol in the token: if I write """ it gives me an error, and if I write a single " I get an error again.
Thank you in advance.

Instead of
void literalString(): {} { """ (characteresString())? """ }
use a token definition
TOKEN : { <STRING : "\"" (<CHAR>)* "\"" >
| <#CHAR : ~["\""] > // Any character that is not "
}
Now this defines a string to be a ", followed by zero or more characters that are not "s, followed by another ".
However, some languages have further restrictions, such as only allowing characters in a certain range. For example, if only printable ASCII characters excluding " were allowed, then you would use
TOKEN : { <STRING : "\"" (<CHAR>)* "\"" >
| <#CHAR: [" ","!","#"-"~"]> // Printable ASCII characters excluding "
}
But say you want to allow " characters if they are preceded by a \, and you want to ban \ characters unless they are followed by a ", another \, or an n. Then you could use
TOKEN : { <STRING : "\"" (<CHAR> | <ESCAPESEQ>)* "\"" >
| <#CHAR: [" ","!","#"-"[","]"-"~"] > // Printable ASCII characters excluding \ and "
| <#ESCAPESEQ: "\\" ["\"","\\","n"] > // 2-character sequences \\, \", and \n
}
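As a rough cross-check of the last definition, the same set of strings can be written as a java.util.regex pattern and tried out in plain Java. This is only a sketch for experimenting outside JavaCC: the class and method names are made up for illustration, and JavaCC does not use java.util.regex internally.

```java
import java.util.regex.Pattern;

public class StringTokenSketch {
    // Mirrors: "\"" (<CHAR> | <ESCAPESEQ>)* "\""
    //   CHAR      = printable ASCII except backslash and double quote
    //   ESCAPESEQ = backslash followed by a double quote, a backslash, or n
    private static final Pattern STRING = Pattern.compile(
            "\"(?:[ !#-\\[\\]-~]|\\\\[\"\\\\n])*\"");

    // True if the whole input is exactly one STRING token under the rules above.
    public static boolean isStringToken(String input) {
        return STRING.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isStringToken("\"abc\""));             // true: plain string
        System.out.println(isStringToken("\"say \\\"hi\\\"\""));  // true: escaped quotes inside
        System.out.println(isStringToken("\"a\\tb\""));           // false: backslash-t is not an allowed escape
        System.out.println(isStringToken("\"a\"b\""));            // false: unescaped quote in the middle
    }
}
```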

Related

I need to parse a string using javacc containing single quotes as part of the string

I have defined grammar rules like
TOKEN : { < SINGLE_QUOTE : " ' " > }
TOKEN : { < STRING_LITERAL : " ' " (~["\n","\r"])* " ' ">
But I am not able to parse sequences like 're'd'. I need the parser to parse re'd as a string literal, but with these rules it parses 're' and 'd' separately.
If you need to lex re'd as STRING_LITERAL token then use the following rule
TOKEN : { < SINGLE_QUOTE : "'" > }
TOKEN : { < STRING_LITERAL : ("'")? (~["\n","\r"])* ("'")? > }
I didn't see a rule for matching "re" separately.
In JavaCC, your lexical specification of STRING_LITERAL requires the token to start with a single quote ('), but your input does not start with one.
The "?" added to STRING_LITERAL makes each single quote optional (zero or one occurrence), so this rule matches your input and lexes it as STRING_LITERAL.
JavaCC decision-making rules:
1.) JavaCC looks for the longest match.
In this case, even if the input starts with "'", the possible matches are SINGLE_QUOTE and STRING_LITERAL; the second input character determines that the longer token, STRING_LITERAL, is chosen.
2.) If several rules match the same longest prefix, JavaCC takes the rule declared first in the grammar.
Here, if the input is only "'", it is lexed as SINGLE_QUOTE even though both SINGLE_QUOTE and STRING_LITERAL match.
Hope this helps.
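The two decision rules can be imitated in plain Java to see which rule wins for a given input. This is a sketch under assumptions: the class and method names are hypothetical, the two rules are transcribed into java.util.regex by hand, and greedy regex matching only happens to find the longest prefix for these particular patterns (it is not a general longest-match guarantee the way JavaCC's lexer is).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LongestMatchSketch {
    // Rules in declaration order, transcribed from the grammar above.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("SINGLE_QUOTE", Pattern.compile("'"));
        RULES.put("STRING_LITERAL", Pattern.compile("'?[^\\n\\r]*'?"));
    }

    // Returns the name of the rule that wins at the start of the input:
    // the longest match, and on a tie the rule declared first.
    public static String firstToken(String input) {
        String winner = null;
        int longest = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // Strictly greater, so an earlier rule keeps a tie.
            if (m.lookingAt() && m.end() > longest) {
                longest = m.end();
                winner = rule.getKey();
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        System.out.println(firstToken("'"));       // SINGLE_QUOTE (tie, declared first)
        System.out.println(firstToken("'re'd'"));  // STRING_LITERAL (longer match)
    }
}
```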
The following should work:
TOKEN : { < SINGLE_QUOTE : "'" > }
TOKEN : { < STRING_LITERAL : "'" (~["\n","\r"])* "'"> }
This is pretty much what you had, except that I removed some spaces.
Now if there are two or more apostrophes on a line (i.e. without an intervening newline or return), then the first and the last of those apostrophes, together with all characters between them, should be lexed as one STRING_LITERAL token. That includes all intervening apostrophes. This is assuming there are no other rules involving apostrophes. For example, if your file is 're'd' that should lex as one token; likewise 'abc' + 'def' should lex as one token.
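To see the "first apostrophe to last apostrophe on the line" behaviour without running JavaCC, the STRING_LITERAL rule can be approximated with a Java regular expression. This is an illustrative sketch only; the class and method names are invented, and the greedy quantifier stands in for the lexer's longest-match rule.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ApostropheSpanSketch {
    // Hand transcription of: "'" (~["\n","\r"])* "'"
    private static final Pattern STRING_LITERAL = Pattern.compile("'[^\\n\\r]*'");

    // Returns the text the rule would consume at the start of the input,
    // or null if the input does not start with a match.
    public static String firstLiteral(String input) {
        Matcher m = STRING_LITERAL.matcher(input);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstLiteral("'re'd'"));         // 're'd'
        System.out.println(firstLiteral("'abc' + 'def'"));  // 'abc' + 'def'
    }
}
```

If 'abc' and 'def' are meant to be separate tokens, the character class has to exclude the apostrophe as well.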

Print Console in Application JavaFX [duplicate]

How can I make Java print "Hello"?
When I type System.out.print("Hello"); the output will be Hello. What I am looking for is "Hello" with the quotes("").
System.out.print("\"Hello\"");
The double quote character has to be escaped with a backslash in a Java string literal. Other characters that need special treatment include:
Carriage return and newline: "\r" and "\n"
Backslash: "\\"
Single quote: "\'"
Horizontal tab and form feed: "\t" and "\f"
The complete list of Java string and character literal escapes may be found in the section 3.10.6 of the JLS.
It is also worth noting that you can include arbitrary Unicode characters in your source code using Unicode escape sequences of the form \uxxxx where the xs are hexadecimal digits. However, these are different from ordinary string and character escapes in that you can use them anywhere in a Java program, not just in string and character literals; see JLS sections 3.1, 3.2 and 3.3 for details on the use of Unicode in Java source code.
See also:
The Oracle Java Tutorial: Numbers and Strings - Characters
In Java, is there a way to write a string literal without having to escape quotes? (Answer: No)
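A small sketch of the escapes listed in this answer may make the character counts concrete. The class and constant names are made up for illustration.

```java
public class EscapeDemo {
    // Each literal below uses one of the escapes listed above.
    public static final String QUOTED = "\"Hello\"";        // 7 characters, including both quotes
    public static final String BACKSLASH = "\\";            // 1 character
    public static final String TAB_AND_NEWLINE = "a\tb\n";  // 4 characters

    // A Unicode escape is translated before the compiler tokenizes the source,
    // so the escape for U+0022 inside a char literal is the same as writing '"'.
    public static final char QUOTE_CHAR = '\u0022';

    public static void main(String[] args) {
        System.out.println(QUOTED);                 // "Hello"
        System.out.println(QUOTED.length());        // 7
        System.out.println(BACKSLASH.length());     // 1
        System.out.println(QUOTE_CHAR == '"');      // true
    }
}
```

Because of that early translation, the same Unicode escape placed inside a string literal would end the literal early, so the ordinary backslash escape is the safer choice for a double quote in a string.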
char ch='"';
System.out.println(ch + "String" + ch);
Or
System.out.println('"' + "ASHISH" + '"');
Escape double-quotes in your string: "\"Hello\""
More on the topic (check 'Escape Sequences' part)
You can do it using a unicode character also
System.out.print('\u0022' + "Hello" + '\u0022');
Adding the actual quote characters is only a tiny fraction of the problem; once you have done that, you are likely to face the real problem: what happens if the string already contains quotes, or line feeds, or other unprintable characters?
The following method will take care of everything:
public static String escapeForJava( String value, boolean quote )
{
    StringBuilder builder = new StringBuilder();
    if( quote )
        builder.append( "\"" );
    for( char c : value.toCharArray() )
    {
        if( c == '\\' )
            builder.append( "\\\\" ); // a backslash must itself be escaped
        else if( c == '\'' )
            builder.append( "\\'" );
        else if ( c == '\"' )
            builder.append( "\\\"" );
        else if( c == '\r' )
            builder.append( "\\r" );
        else if( c == '\n' )
            builder.append( "\\n" );
        else if( c == '\t' )
            builder.append( "\\t" );
        else if( c < 32 || c >= 127 )
            builder.append( String.format( "\\u%04x", (int)c ) ); // other non-printable or non-ASCII characters
        else
            builder.append( c );
    }
    if( quote )
        builder.append( "\"" );
    return builder.toString();
}
System.out.println("\"Hello\"");
System.out.println("\"Hello\"");
There are two easy methods:
Use a backslash \ before each double quote.
Use two single quotes '' in place of each double quote ". Note that this only looks similar: the output contains apostrophes, not a real " character.
For example:
System.out.println("\"Hello\"");
System.out.println("''Hello''");
Also take note of how a backslash behaves in front of specific characters.
System.out.println("Hello\\");
The output above will be:
Hello\
System.out.println(" Hello\" ");
The output above will be:
Hello"
Use Escape sequence.
\"Hello\"
This will print "Hello".
You can use a JSON serialization utility to quote a Java String, like this:
public class Test{
    public static String quote(String a){
        return JSON.toJsonString(a);
    }
}
If the input is hello, the output will be "hello".
If you want to implement the function yourself, it may look like this:
public static String quotes(String origin) {
    // every \ becomes \\ (regex \\, replacement \\\\; each backslash is doubled again inside a Java string literal)
    origin = origin.replaceAll("\\\\", "\\\\\\\\");
    // every " becomes \" (regex ", replacement \\\", written as a Java string literal)
    origin = origin.replaceAll("\"", "\\\\\\\"");
    // newline -> \n
    origin = origin.replaceAll("\\n", "\\\\\\n");
    // tab -> \t
    origin = origin.replaceAll("\\t", "\\\\\\t");
    return origin;
}
The implementation above escapes special characters inside the string but does not add the " at the start and end.
It is also incomplete: if you need other escape sequences, you can add them.
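The stacked backslashes in the version above come from two layers of escaping: the Java string literal and the regex syntax of replaceAll. A possible alternative, sketched here with made-up class and method names, is String.replace, which treats both arguments literally, so only the string-literal layer remains. Like the original, it covers only backslash, double quote, newline and tab.

```java
public class QuoteSketch {
    // Escapes backslash, double quote, newline and tab; other characters pass through.
    public static String escape(String origin) {
        return origin
                .replace("\\", "\\\\")   // backslash first, so later escapes are not doubled
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\t", "\\t");
    }

    // Wraps the escaped text in double quotes.
    public static String quote(String origin) {
        return "\"" + escape(origin) + "\"";
    }

    public static void main(String[] args) {
        System.out.println(quote("hello"));        // "hello"
        System.out.println(quote("say \"hi\""));   // "say \"hi\""
    }
}
```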

Parsing a variable composed of letters and numbers like " JAVAC 1.7.0.XXX"

I'm trying to parse regular expressions using JavaCC, but I encountered a problem with a variable " Y " composed of letters and numbers, for example: " JAVA 1.7.1.XXX". I have already defined the token
<id > = < lettre > | <number>
< #lettre : [ "A"-"Z", "a"-"z"]>
| < #number : [ "0"-"9" ] >
At execution, the parser processes the first part of the variable " Y " as <id>, and then parsing stops. Thanks in advance.
Here is the parseur.jj code:
TOKEN : { <ID2 : (["a"-"z","A"-"Z","0"-"9","_"])+ ( (["0"-"9"])+ "." (["0"-"9"])+ "." (["0"-"9"])+)+ (["a"-"z","A"-"Z","_","."])+ >}
TOKEN : { <ID : ["a"-"z","A"-"Z","_"] (["a"-"z","A"-"Z","0"-"9","_"])* >}
Suppose the remaining input stream starts with: MyFile1_Test 1.2.3.txt
Then the token <ID> is assigned and not <ID2>. Why is this rule not applicable here?
If more than one regular expression describes a prefix, then a regular expression that describes the longest prefix of the input stream is used. (This is called the “maximal munch rule”.)
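One way to investigate is to check how long a prefix each rule can actually describe. The sketch below transcribes the two rules into java.util.regex by hand (an approximation, with invented class and method names) and reports the matched prefix length. Under that transcription, ID2 describes no prefix of the input at all, because the space after MyFile1_Test is not in any of its character classes, so the maximal munch rule only has ID to choose from.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrefixLengthSketch {
    // Hand transcriptions of the ID2 and ID token definitions above.
    public static final Pattern ID2 = Pattern.compile(
            "[a-zA-Z0-9_]+(?:[0-9]+\\.[0-9]+\\.[0-9]+)+[a-zA-Z_.]+");
    public static final Pattern ID = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_]*");

    // Length of the prefix of input that the pattern matches, or -1 if it matches no prefix.
    public static int prefixLength(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.lookingAt() ? m.end() : -1;
    }

    public static void main(String[] args) {
        String withSpace = "MyFile1_Test 1.2.3.txt";
        String withoutSpace = "MyFile1_Test1.2.3.txt";
        System.out.println(prefixLength(ID2, withSpace));     // -1: the space blocks ID2
        System.out.println(prefixLength(ID, withSpace));      // 12: MyFile1_Test
        System.out.println(prefixLength(ID2, withoutSpace));  // 21: the whole input
        System.out.println(prefixLength(ID, withoutSpace));   // 13: MyFile1_Test1
    }
}
```

With the space removed, ID2 describes the longer prefix and would be the one selected.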

How to use the escape character correctly

I am finding it difficult to create the character string: " is entered as \ "
When I type it into R:
" is entered as \"
+
" is entered as \ "
[1] " is entered as "
> " is entered as \\ "
[1] " is entered as \\ "
> " is entered as \\"
[1] " is entered as \\"
How can I get the character string " is entered as \ "?
I am still confused:
cat("is entered as \" )
is entered as >
> "is entered as \\"
[1] "is entered as \\"
> print ("is entered as \\")
[1] "is entered as \\"
"hoge \\" is actually hoge \
print displays a \ as \\, which is why you see \\ where the string actually contains a single \.
Try cat:
> cat("is entered as \\" )
is entered as \
nchar confirms that "\\" is a single character:
> nchar("\\" )
[1] 1
Is this what you were trying to achieve?:
> x <- "START \" is entered as \\\" END"
> cat(x)
Gives:
START " is entered as \" END
You have to escape both the double quote ", and the backslash \, in order to have them display normally.
To clear up any confusion between whether the double quotes printed in the output are part of the string, or just symbols that wrap around the string, I've added START at the start of the string and END at the end.

Escape character for " in ValidationExpression in ASP.NET

I am using a regular expression to filter invalid input entered by the end user.
The acceptable input is word characters, spaces, digits and . / # , # & $ _ : ? ' % ! – ~ " | + ; ” { } - \.
Below is my code.
<asp:RegularExpressionValidator ID="rgVEditTB1" runat="server" ControlToValidate="txtEditTB1"
ValidationExpression="^[\w\s\d\-\.\/\#\,\#\&\$\:\?\"\'\%\!\–\~\|\+\;\”\{\}\-\\]+$" ErrorMessage="Invalid Special Character" />
However, I am having trouble escaping " in the ValidationExpression; it fails with a
"Server Tag is not well formed" error.
I tried to change the escape character to:
\""
\"
""
It also gives me the same error.
What should be the correct escape character to put in the ValidationExpression?
You should be able to pass in the HTML-encoded values. So, passing &quot; would be like passing ". Something like this: ValidationExpression="^[^&quot;]+$". In this regex I am saying: match any character, from the beginning to the end of the string, which is not a quotation mark (").
The same applies to the other special symbols. You can take a look here for more encoding values.
