flex (python PLY) regex for strings - ply

I'm using the python module PLY to write a parser, and I am implementing as I go. I have a simple rule to detect strings:
r'("|\').*("|\')'
When lexer errors are thrown I have this:
import sys

def t_error(t):
    # Report the line number and the first 16 characters of the offending input, then abort.
    print('Illegal lexer input line ' + str(t.lineno) + ' ' + t.value[:16])
    sys.exit(-1)
When I feed my parser the following input:
parse("preg_match('%^[\*\%]+$%', $keywords)")
I get back this in return:
Illegal lexer input line 1 %^[\*\%]+$%', $k
My questions are:
1) Why am I not parsing this string? It seems like my regex should properly handle this string.
2) How can I fix this?
edit:
I have narrowed the problem down a bit. The following strings throw illegal lexer input errors by themselves:
'%'
'^'

Even if this regex were working, it wouldn't quite do what you want. For example, it would accept "this', which isn't really a string. This is also the cause of the "illegal lexer input"...
After having done its job and found the first string in "preg_match(', the lexer is then upset because each of the next 11 characters, %^[\*\%]+$%, is illegal (and not in t_ignore), since none of them even starts with " or '.
Try doing this with two cases for " and ': "Starts with quote, some things which aren't quote, ends with quote." That is:
r'("[^"]*")|(\'[^\']*\')'
Or, if you want to include escaped speech marks:
r'("(\\"|[^"])*")|(\'(\\\'|[^\'])*\')'
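As a quick check outside PLY, the patterns above can be tried with Python's re module. This is only a minimal sketch: the names LOOSE_RE, STRING_RE, and find_strings are made up for illustration, and it says nothing about how PLY combines this rule with the rest of a lexer.

```python
import re

# Original rule from the question: any quote, anything, any quote.
LOOSE_RE = re.compile(r'("|\').*("|\')')

# Stricter rule from the answer, allowing backslash-escaped quotes.
STRING_RE = re.compile(r'("(\\"|[^"])*")|(\'(\\\'|[^\'])*\')')


def find_strings(text):
    """Return every quoted string literal that STRING_RE finds in text."""
    return [m.group(0) for m in STRING_RE.finditer(text)]


# The loose rule accepts mismatched quotes; the strict one does not.
mismatched = '"this\''
loose_accepts = LOOSE_RE.fullmatch(mismatched) is not None
strict_accepts = STRING_RE.fullmatch(mismatched) is not None

# The input from the question, with backslashes doubled for Python.
sample = "preg_match('%^[\\*\\%]+$%', $keywords)"
found = find_strings(sample)
print(loose_accepts, strict_accepts, found)
```

With the stricter pattern, the whole of '%^[\*\%]+$%' is consumed as one string, so none of its inner characters is left over for the lexer to reject.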

Related

"ERROR: syntax: cannot juxtapose string literal" when ending a triple-quoted string literal with a quote

I'm trying to create a string literal representing a CSV file with quoted fields. The intended CSV looks like this:
"a","b"
"1","2"
Triple quotes work if I want a newline character at the end of the string:
julia> """
"a","b"
"1","2"
"""
"\"a\",\"b\"\n\"1\",\"2\"\n"
But if I try to make a string without the newline character at the end, then I get a syntax error:
julia> """
"a","b"
"1","2""""
ERROR: syntax: cannot juxtapose string literal
Is there a simple way to get around this?
As an aside, note that there is no syntax error when you start the string-literal with a quote:
julia> """"a","b"
"1","2"
"""
"\"a\",\"b\"\n\"1\",\"2\"\n"
The issue is that this by itself is a valid string literal:
"""
"a","b"
"1","2"""
When you follow that with another " the parser thinks "woah, you can’t just follow a string with another string". You can force it to not consider the quote after 2 as part of a closing """ sequence by escaping it with \:
"""
"a","b"
"1","2\""""
At the start of the string, there's no such issue since the first three " characters are taken to start a string literal and the following " must just be a quote character inside of the string, which is what you want.
I'm not sure what you would consider the best solution to be. The options are:
Escape the " at the end;
Put the closing """ on a separate line.
The latter seems better to me, but it's your call.
See the Julia Docs for other examples on Triple-Quoted String Literals.

Error while Scilab module is loading

I downloaded the Metanet 0.6.2 module and installed it in Scilab with
atomsInstall
After that I ran
`atomsLoad('metanet')`
but it shows
atomsLoad: An error occurred while loading 'metanet-0.6.2':
error(msprintf(gettext('%s module required."),'graph'));
^^
Error: Heterogeneous string detected, starting with ' and ending with ".
at line 335 of function atomsLoad ( D:\Program Files\scilab-6.0.1\modules\atoms\macros\atomsLoad.sci line 351 )
Why did this happen?
It turns out that the metanet module is not supported by Scilab 6.0.1 yet. I had to install version 5.5.2.
Unfortunately, both the question and the accepted answer on this page are very vague and misleading. Ideally, this kind of post should be blocked or down-voted, but I will try to answer it as well as I can.
Firstly, when you want to run a Scilab command you do not put it in quotation marks, unless you want to use the execstr command. However, the characters you have used are not quotation marks but backticks! I'm not sure why you have done that.
Secondly, the error:
Error: Heterogeneous string detected, starting with ' and ending with "
happens when a double quotation is used inside the single quotation or vice versa:
"This is a' string"
'this is a" string'
To solve the issue, you should change the above strings to
"This is a'' string"
'this is a'" string'
That is, add one single quotation mark before each ' or " character to turn it into a literal ' or ".
Bonus point: if you want to pass a string to Tcl, use curly brackets
TCL_EvalStr("set myVar {Hello World!}")
or
TCL_EvalStr("set myVar '"Hello World!'"")
but for PowerShell
powershell('$myVar= ''Hello World!''')
or
powershell("$myVar= ''Hello World!''")

Teradata remove enclosing single quotes from variable

I need to remove single quotes from a string of numbers and use the result in a WHERE IN clause. For example, I have
WHERE Group_ID IN (''4532','3422','1289'')
The criteria within parenthesis is being passed as a parameter, so I have no control over that. I tried using :
WHERE Group_ID IN (REGEXP_REPLACE(''4532','3422','1289'', '[']', ' ',1,0,i))
also tried using OReplace
WHERE Group_ID IN (OReplace(''4532','3422','1289'', '[']', ' '))
but get the same error:
[Teradata Database] [3707] Syntax error, expected something like ','
between a string or a Unicode character literal and the integer '4532'.
Please suggest how to remove the single enclosing quotes or even removing all single quotes should work as well.
The string ''4532','3422','1289'' you are using is incorrect because it contains non-escaped single quotes. This is a syntax error in SQL. In this particular form, no matter which function you use to fix it or which RDBMS you use, it will result in an error with standard SQL.
Functions in the SQL cannot fix syntax errors. REGEXP_REPLACE and OReplace never get executed because the query never enters the execution state. It never goes past the SQL syntax parser.
To see the error from the perspective of the SQL parser, you can break the string into multiple parts:
'' -- SQL Parser sees this as a starting and ending quote and hence an empty string
4532 -- Now comes what appears to SQL parser as an integer value
',' -- Now this is a pair of quotes containing a single comma
3422 -- Again an integer
',' -- Again a comma
1289 -- Again integer
'' -- Again empty string
This amalgam of strings and numbers will not mean anything to the SQL parser and will result in an error.
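The breakdown above can be reproduced with a small Python sketch. It is a deliberately simplified model of SQL lexing, not Teradata's actual parser: TOKEN_RE and sql_tokens are hypothetical names, and the pattern only recognizes single-quoted literals (where a doubled '' stands for one quote) and runs of digits.

```python
import re

# Simplified model: a string literal is '...' where a doubled '' inside
# stands for one quote character; the only other token kept is an integer.
TOKEN_RE = re.compile(r"'(?:[^']|'')*'|\d+")


def sql_tokens(fragment):
    """Split fragment into string-literal and integer tokens."""
    return TOKEN_RE.findall(fragment)


broken = "''4532','3422','1289''"
tokens = sql_tokens(broken)
print(tokens)
```

The result is seven separate tokens, alternating between short string literals and bare integers, which is the sequence the error message complains about.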
Fix
The fix is to properly escape the data. Each single quote inside the string must be escaped by preceding it with another single quote. So the correct string in this scenario becomes '''4532'',''3422'',''1289'''
Also, the OReplace usage (once the syntax is fixed) looks like OReplace(yourStringValueHere, '''', ' '). Observe the escaped single quote here: the two outer quotes mark the start and end of the string, the first inner quote is the escape character, and the second inner quote is the actual data passed to the function.
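If the value is assembled in application code before it reaches the database, the doubling rule can be applied there. The following Python sketch assumes that setup; sql_quote is a hypothetical helper, not part of any Teradata driver, and bound parameters are usually the safer choice where the driver supports them.

```python
def sql_quote(value):
    """Wrap value in single quotes, doubling every single quote inside it."""
    return "'" + value.replace("'", "''") + "'"


# The raw parameter value, including its own single quotes.
raw = "'4532','3422','1289'"
literal = sql_quote(raw)
print(literal)
```

The printed literal matches the escaped form given above, and the parser now sees it as a single string.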

Remove EOL in variable for comparing strings

I use Run Keyword Unless comparing a variable and a string:
Run Keyword Unless '${text}' == 'HelloWorld' My Keyword ${text}
Sometimes ${text} consists of two lines separated by "\n" (e.g. "One line\ntwo lines"). If so, the test fails with an error:
Evaluating expression ''One line
two lines' == 'HelloWorld'' failed: SyntaxError: EOL while scanning string literal (<string>, line 1)
I solved the problem by removing '\n' with String.Replace String as follows:
${one_line_text}= String.Replace String ${text} \n ${SPACE}
Run Keyword Unless '${one_line_text}' == 'HelloWorld' My Keyword ${text}
Is there a way to do it without explicitly removing the EOL in a separate keyword?
What about ${text.replace("\n", " ")}?
You can use Python's triple-quoted string literals - """ or ''' - and not change the string at all:
Run Keyword Unless '''${text}''' == 'HelloWorld' My Keyword ${text}
They are designed for pretty much this purpose - to hold values having newline characters, plus quotes.
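The error message in the question suggests that the variable is substituted into the expression as text and the result is then evaluated as Python. Under that assumption, the difference between the two quoting styles can be sketched in plain Python; the evaluate helper below is a hypothetical stand-in, not Robot Framework's own code.

```python
def evaluate(template, text):
    """Substitute text for ${text} in template, then evaluate it as Python."""
    return eval(template.replace("${text}", text))


text = "One line\ntwo lines"

# Single quotes: the raw newline ends up inside a one-line string literal.
try:
    evaluate("'${text}' == 'HelloWorld'", text)
    single_quoted_ok = True
except SyntaxError:
    single_quoted_ok = False

# Triple quotes: a literal newline is allowed inside the string.
triple_quoted_result = evaluate("'''${text}''' == 'HelloWorld'", text)
print(single_quoted_ok, triple_quoted_result)
```

Note that the triple-quoted form would still break if the value itself contained three consecutive single quotes.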

How to write a regex for any text except quotes or multiple hyphens?

Can anybody tell me how to write a regular expression for "no quotes (single or double) allowed and only single hyphens allowed"? For example, "good", 'good', and good--looking are not allowed (but good-looking is).
I need to put this regex in a validator like the following:
<asp:RegularExpressionValidator ID="revProductName" runat="server"
ErrorMessage="Can not have " or '." Font-Size="Smaller"
ControlToValidate="txtProductName"
ValidationExpression="^[^'|\"]*$"></asp:RegularExpressionValidator>
The one I have handles double and single quotes. Now I need to add multiple hyphens to it. I tried "^[^'|\"|--]*$", but it is not working.
^(?:-(?!-)|[^'"-]++)*$
should do.
^ # Start of string
(?: # Either match...
-(?!-) # a hyphen, unless followed by another hyphen
| # or
[^'"-]++ # one or more characters except quotes/hyphen (possessive match)
)* # any number of times
$ # End of string
So, the regexp has to fail when there is a ', a ", or a --.
The regexp should therefore check for these at every position and fail if one is found:
^(?:(?!['"]|--).)*$
The idea is to consume the whole line with ., but to check each time before using . that the next character is not ', ", or the beginning of --.
Also, I like the other answer very much. It uses a slightly different approach: it consumes only characters other than quotes and hyphens ([^'"-]), and when it consumes a -, it checks that it is not followed by another -.
There could be one more approach: search for ', ", or -- in the string, and fail the regex if any of them is found. It could be achieved by using a regex conditional expression, but this flavor of regex engine doesn't seem to support that kind of condition.
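The lookahead-based pattern can be tried quickly in Python. This is only a sketch for experimenting with the pattern: NAME_RE and is_valid are made-up names, and Python's re module is not the engine the ASP.NET validator uses, so behavior there should be confirmed separately.

```python
import re

# Every character must not be a quote and must not start a "--" pair.
NAME_RE = re.compile(r"""^(?:(?!['"]|--).)*$""")


def is_valid(name):
    """Return True if name has no quotes and no doubled hyphens."""
    return NAME_RE.match(name) is not None


for candidate in ['good-looking', 'good--looking', '"good"', "'good'"]:
    print(candidate, is_valid(candidate))
```

Only good-looking passes; the other three candidates are rejected.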
