Can I do CloseMatch of Defined Parsing grammar? - pyparsing

I have defined a grammar:
from pyparsing import Word, Literal, Group, delimitedList, alphanums
column = Word(alphanums + '._`')
stmt = column + Literal("(") + Group(delimitedList(column)) + Literal(")")
Now I want to match the query below using a close match:
sql = "seller(food_type,count(sellers),sum(weight),Earned_money)"
I do not want to change the grammar defined above. How do I close-match when functions are given as arguments?
result = stmt.parseString(sql)
print(result.dump())
def Review(sql):
    stmt = GetGrammar(sql)
    result = stmt.parseString(sql, parseAll=False)
    print(result.dump())
Here sql is a stored procedure of 400-500 lines. I am automating part of a code review, and for this purpose I have written a grammar for SQL statements.
But it throws an exception even if only a small string fails to match, and parsing terminates there. I want it not to abort when exceptions occur, because I know that at least the parsable part is useful for reviewing database queries.
GetGrammar returns the grammar for a procedure, all of whose parts are SQL statements.
def GetGrammar(sql):
    # _in, _out, createProcedure, selectStmt, etc. are defined elsewhere
    InputParameters = delimitedList(Optional((_in | _out | _inout), '') + column + DataType)
    DeclarativeSyntax = (_declare + column + DataType + ';')
    # the outer parentheses let the expression span several lines
    createProcedureStmt = (
        createProcedure
        + StoredProcedure.setResultsName("Procedure")
        + lpar
        + Optional(InputParameters.setResultsName("Input"), '')
        + rpar
        + Optional(_sql_security_invoker, '').setResultsName("SQLSECURITY")
        + _begin
        + ZeroOrMore(DeclarativeSyntax).setResultsName("Declare")
        + ZeroOrMore((selectStmt | setStmt | ifStmt.setResultsName("IfStmt")
                      | callStmt | updateStmt | createStmt | dropStmt | alterStmt | insertStmt
                      | deleteStmt | WhileStmt.setResultsName("WhileStmt") | createStmt) + ';')
        + _end + Optional(';', '')
    )
    return createProcedureStmt

Related

Get-values from a html form in a for/do loop

I have a problem with the get-value() function in Progress 4GL.
I am trying to get all values from an HTML form.
My Progress 4GL code looks like this:
for each tt:
    do k = 1 to integer(h-timeframe):
        h-from [k] = get-value(string(day(tt.date)) + "#" + string(tt.fnr) + "#" + string(tt.pnr) + "_von" + string(k)).
        h-to [k] = get-value(string(day(tt.date)) + "#" + string(tt.fnr) + "#" + string(tt.pnr) + "_bis" + string(k)).
        h-code [k] = get-value(string(day(tt.date)) + "#" + string(tt.fnr) + "#" + string(tt.pnr) + "_code" + string(k)).
    end.
end.
h-timeframe is a parameter with a maximum of 10 (range 1-10).
tt is a temp-table and represents a week (always 7 days).
It works perfectly up to a value of 9. If I choose 10 (the maximum), I get a performance problem with the get-value() function.
Example when h-timeframe = 10:
As you can see, it takes a really long time to get from one get-value call to the next (h-timeframe = 10).
Example when h-timeframe = 9:
Here it is much faster than in the other case.
Can anyone explain why? It is really strange and I have no idea.
P.S.: I have this problem only at 10. For 0-9 it works perfectly.
The performance difference is probably something external to your code snippet but, for performance, I would write it more like this:
define variable d as integer no-undo.
define variable n as integer no-undo.
define variable s as character no-undo.
for each tt:
    // avoid recalculating and invoking functions N times per TT record
    assign
        d = day( tt.date )
        n = integer( h-timeframe )
        s = substitute( "&1#&2#&3_&&1&&2", d, tt.fnr, tt.pnr )
    .
    do k = 1 to n:
        // consolidate multiple repeated operations, eliminate STRING() calls
        assign
            h-from [k] = get-value( substitute( s, "von", k ))
            h-to [k] = get-value( substitute( s, "bis", k ))
            h-code [k] = get-value( substitute( s, "code", k ))
        .
    end.
end.
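For readers who do not know SUBSTITUTE, the two-stage template used above can be sketched in Python. This is only an illustration of the idea, not ABL: the function name key_template and the record values are made up, and doubled braces play the role that && plays in the ABL template.

```python
def key_template(day: int, fnr: int, pnr: int) -> str:
    # Stage 1: fill in the per-record parts once. {{kind}} and {{k}} are
    # escaped so they survive as {kind} and {k} for the second stage.
    return f"{day}#{fnr}#{pnr}_{{kind}}{{k}}"

# Hypothetical record values, for illustration only.
template = key_template(day=5, fnr=12, pnr=7)

# Stage 2: only the cheap per-iteration substitution runs inside the loop.
keys = [template.format(kind=kind, k=k)
        for k in range(1, 3)
        for kind in ("von", "bis", "code")]
print(keys)
```

The per-record work (day, fnr, pnr) is done once, and each loop iteration only fills in the suffix.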

pyparsing parse c/cpp enums with values as user defined macros

I have a use case where I need to match enums whose values can be user-defined macros.
Example enum:
typedef enum
{
    VAL_1 = -1,
    VAL_2 = 0,
    VAL_3 = 0x10,
    VAL_4 = TEST_ENUM_CUSTOM(1,2),
}MyENUM;
I am using the code below. It works if I don't use the format shown for VAL_4, but I need to match that format as well. I am new to pyparsing; any help is appreciated.
My code:
LBRACE, RBRACE, EQ, COMMA = map(Suppress, "{}=,")
_enum = Suppress("enum")
identifier = Word(alphas, alphanums + "_")
integer = Word("-"+alphanums)  # I have tried adding "_(,)" to this, but it does not match.
enumValue = Group(identifier("name") + Optional(EQ + integer("value")))
enumList = Group(enumValue + ZeroOrMore(COMMA + enumValue) + Optional(COMMA))
enum = _enum + Optional(identifier("enum")) + LBRACE + enumList("names") + RBRACE + Optional(identifier("typedef"))
enum.ignore(cppStyleComment)
enum.ignore(cStyleComment)
Thanks
-Purna
Simply adding more characters to integer is the wrong way to go. Even this expression:
integer = Word("-"+alphanums)
isn't super-great, since it would match "---", "xyz", "q--10-", and many other non-integer strings.
Better to define integer properly. You could do:
integer = Combine(Optional('-') + Word(nums))
but I've found that for these low-level expressions that occur many places in your parse string, a Regex is best:
integer = Regex(r"-?\d+") # Regex(r"-?[0-9]+") if you like more readable re's
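To see the difference between the loose and the strict definition without running a parser, here is a rough sketch using Python's re module. It assumes that Word("-"+alphanums) behaves like the character class [-A-Za-z0-9]+, and the names loose and strict exist only for this illustration.

```python
import re

# Assumption: Word("-" + alphanums) accepts one or more characters from that
# set, modelled here by a plain character class.
loose = re.compile(r"[-A-Za-z0-9]+")
# The pattern suggested above for a signed decimal integer.
strict = re.compile(r"-?\d+")

for text in ["---", "xyz", "q--10-", "-1", "42"]:
    print(f"{text!r}: loose={bool(loose.fullmatch(text))} "
          f"strict={bool(strict.fullmatch(text))}")

# A prefix match on a hex literal stops after the leading zero.
print(strict.match("0x10").group())
```

The last line shows that the plain integer pattern consumes only the leading 0 of 0x10, so hex literals need their own expression, and it must be tried before the decimal one.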
Then define one for hex_integer as well.
Then to add macros, we need a recursive expression, to handle the possibility of macros having arguments that are also macros.
So at this point, we should just stop writing code for a bit, and do some design. In parser development, this design usually looks like a BNF, where you describe your parser in a sort of pseudocode:
enum_expr ::= "typedef" "enum" [identifier]
                  "{"
                  enum_item_list
                  "}" [identifier] ";"
enum_item_list ::= enum_item ["," enum_item]... [","]
enum_item ::= identifier "=" enum_value
enum_value ::= integer | hex_integer | macro_expression
macro_expression ::= identifier "(" enum_value ["," enum_value]... ")"
Note the recursion of macro_expression: it is used in defining enum_value, but it includes enum_value as part of its own definition. In pyparsing, we use a Forward to set up this kind of recursion.
See how that BNF is implemented in the code below. I build on some of the items you posted, but the macro expression required some rework. The bottom line is "don't just keep adding characters to integer trying to get something to work."
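from pyparsing import (Suppress, Keyword, Word, Forward, Regex, Group, Optional,
                       delimitedList, alphas, alphanums, cppStyleComment)
LBRACE, RBRACE, EQ, COMMA, LPAR, RPAR, SEMI = map(Suppress, "{}=,();")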
_typedef = Keyword("typedef").suppress()
_enum = Keyword("enum").suppress()
identifier = Word(alphas, alphanums + "_")
# define an enumValue expression that is recursive, so that enumValues
# that are macros can take parameters that are enumValues
enumValue = Forward()
# add more types as needed - parse action on hex_integer will do parse-time
# conversion to int
integer = Regex(r"-?\d+").addParseAction(lambda t: int(t[0]))
# or just use the signed_integer expression found in pyparsing_common
# integer = pyparsing_common.signed_integer
hex_integer = Regex(r"0x[0-9a-fA-F]+").addParseAction(lambda t: int(t[0], 16))
# a macro defined using enumValue for parameters
macro_expr = Group(identifier + LPAR + Group(delimitedList(enumValue)) + RPAR)
# use '<<=' operator to attach recursive definition to enumValue
enumValue <<= hex_integer | integer | macro_expr
# remaining enum expressions
enumItem = Group(identifier("name") + Optional(EQ + enumValue("value")))
enumList = Group(delimitedList(enumItem) + Optional(COMMA))
enum = (_typedef
        + _enum
        + Optional(identifier("enum"))
        + LBRACE
        + enumList("names")
        + RBRACE
        + Optional(identifier("typedef"))
        + SEMI
        )
# this comment style includes cStyleComment too, so no need to
# ignore both
enum.ignore(cppStyleComment)
Try it out:
enum.runTests([
"""
typedef enum
{
VAL_1 = -1,
VAL_2 = 0,
VAL_3 = 0x10,
VAL_4 = TEST_ENUM_CUSTOM(1,2)
}MyENUM;
""",
])
runTests is for testing and debugging your parser during development. Use enum.parseString(some_enum_expression) or enum.searchString(some_c_header_file_text) to get the actual parse results.
Using the new railroad diagram feature in the upcoming pyparsing 3.0 release, here is a visual representation of this parser:

How do you access name of a ProtoField after declaration?

How can I access the name property of a ProtoField after I declare it?
For example, something along the lines of:
myproto = Proto("myproto", "My Proto")
myproto.fields.foo = ProtoField.int8("myproto.foo", "Foo", base.DEC)
print(myproto.fields.foo.name)
Where I get the output:
Foo
An alternate method that's a bit more terse:
local fieldString = tostring(field)
local i, j = string.find(fieldString, ": .* myproto")
print(string.sub(fieldString, i + 2, j - (1 + string.len("myproto"))))
EDIT: Or an even simpler solution that works for any protocol:
local fieldString = tostring(field)
local i, j = string.find(fieldString, ": .* ")
print(string.sub(fieldString, i + 2, j - 1))
Of course the 2nd method only works as long as there are no spaces in the field name. Since that's not necessarily always going to be the case, the 1st method is more robust. Here is the 1st method wrapped up in a function that ought to be usable by any dissector:
-- field is the field whose name you want to print.
-- protoStr is the name of the relevant protocol.
function printFieldName(field, protoStr)
    local fieldString = tostring(field)
    local i, j = string.find(fieldString, ": .* " .. protoStr)
    print(string.sub(fieldString, i + 2, j - (1 + string.len(protoStr))))
end
... and here it is in use:
printFieldName(myproto.fields.foo, "myproto")
printFieldName(someproto.fields.bar, "someproto")
Ok, this is janky, and certainly not the 'right' way to do it, but it seems to work.
I discovered this after looking at the output of
print(tostring(myproto.fields.foo))
This seems to spit out the value of each of the members of ProtoField, but I couldn't figure out the correct way to access them. So, instead, I decided to parse the string. This function will return 'Foo', but could be adapted to return the other fields as well.
function getname(field)
    --First, convert the field into a string.
    --This is going to result in a long string with
    --a bunch of info we don't need
    local fieldString = tostring(field)
    -- fieldString looks like:
    -- ProtoField(188403): Foo myproto.foo base.DEC 0000000000000000 00000000 (null)
    --Split the string at the first '.' character
    local a, b = fieldString:match("([^.]*)%.(.*)")
    --Split the first half of the previous result (a) at the ':' character
    a, b = a:match("([^.]*):(.*)")
    --At this point, b will equal " Foo myproto"
    --and we want to strip out the abbreviation part
    --Count the number of times spaces occur in the string
    local spaceCount = select(2, string.gsub(b, " ", ""))
    --Declare a counter
    local counter = 0
    --Declare the name we are going to return
    local constructedName = ''
    --Step through each word in (b) separated by spaces
    for word in b:gmatch("%w+") do
        --If we have reached the last space, go ahead and return
        if counter == spaceCount-1 then
            return constructedName
        end
        --Add the current word to our name
        constructedName = constructedName .. word .. " "
        --Increment counter
        counter = counter+1
    end
end
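The same string-parsing idea can be checked outside Wireshark with a small Python sketch. It assumes the tostring() layout shown in the comment inside getname (number, colon, display name, abbreviation, base, and so on), which is based on that single sample; the function name field_name and the second sample string are hypothetical.

```python
import re
from typing import Optional

# Sample copied from the comment in getname; the general layout of
# tostring(ProtoField) is an assumption based on this one example.
SAMPLE = "ProtoField(188403): Foo myproto.foo base.DEC 0000000000000000 00000000 (null)"

def field_name(field_string: str, proto: str) -> Optional[str]:
    """Return the text between ': ' and ' <proto>.', or None if absent."""
    pattern = r": (.*?) " + re.escape(proto) + r"\."
    match = re.search(pattern, field_string)
    return match.group(1) if match else None

print(field_name(SAMPLE, "myproto"))

# Hypothetical field whose display name contains a space.
SPACED = "ProtoField(1): Foo Bar myproto.foo_bar base.DEC 0000000000000000 00000000 (null)"
print(field_name(SPACED, "myproto"))
```

Anchoring on the protocol abbreviation rather than on the first space is what lets a name such as "Foo Bar" come through intact.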

HtmlProvider parses Fraction As DateTime

Using HtmlProvider to access a web-based table sometimes returns a fraction as a string (correct) and, at other times, returns a DateTime (incorrect).
What am I missing?
module Test =
    open System.IO
    open FSharp.Data
    let [<Literal>] url = "https://www.example.com/fractions"
    type profile = HtmlProvider<url>
    let profile = profile.Load(url)
    let [<Literal>] resultFile = @"C:\temp\data\Profile.csv"
    let CsvResult =
        do
            use writer = new StreamWriter(resultFile, false)
            writer.WriteLine "\"Date\";\"Fraction\""
            for row in profile.Tables.Table1.Rows do
                "\"" + row.``Date``.ToString() + "\"" + ";" |> writer.Write
                "\"" + row.``Fraction``.ToString() + "\"" + ";" |> writer.WriteLine
            writer.Close()
    let csvResult = CsvResult
Without seeing sample data I can't be 100% certain, but I'm guessing that it's parsing fractions as dates if the numbers involved would be valid dates in the culture you're using: e.g., 1/4 would be a valid date in any culture that uses / as a separator, and would be treated either as April 1st or as January 4th, depending on which parsing culture your system defaults to.
Other type providers in FSharp.Data (such as the CSV type provider) allow you to configure how each column will be parsed, but that's not an option the HTML type provider gives you. (Which is a bit of a missing feature, of course.) But since the HTML type provider does allow you to specify the culture info for datetime and number parsing, one way you might be able to work around this is to specify a culture that does not use / as a date separator (but still uses . as the decimal point, since otherwise, if the HTML you're parsing has numbers written like 1,000 for one thousand, that could be interpreted as 1). One such culture is the en-IN culture ("English (India)"), where the date separator is "-" and the decimal point is ".".
So try passing Culture=System.Globalization.CultureInfo.GetCultureInfo("en-IN") in your HtmlProvider options, and see if that helps it stop treating fractions as dates.
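To make the guess above concrete, here is a small Python sketch of why some fractions look like dates and others do not. Python's strptime only stands in for a culture-specific date parser; it is not what FSharp.Data does internally, and the helper name parses_as_date and the sample values are made up.

```python
from datetime import datetime

def parses_as_date(text: str, fmt: str) -> bool:
    """True if text is a valid day/month pair under fmt."""
    try:
        # A fixed, arbitrary year is appended so only day and month matter.
        datetime.strptime(f"{text}/2017", f"{fmt}/%Y")
        return True
    except ValueError:
        return False

for value in ["1/4", "5/9", "13/12", "13/20"]:
    print(value,
          "day-first:", parses_as_date(value, "%d/%m"),
          "month-first:", parses_as_date(value, "%m/%d"))
```

A fraction such as 13/20 is not a valid date in either order, which would explain why only some rows come back as DateTime.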
The following combination of functions worked:
// http://www.fssnip.net/29/title/Regular-expression-active-pattern
module Solution =
    open System
    open System.Text.RegularExpressions
    open FSharp.Data
    let (|Regex|_|) pattern input =
        let m = Regex.Match(input, pattern)
        if m.Success then Some(List.tail [ for g in m.Groups -> g.Value ])
        else None
    let ptrnFraction = @"^([0-9]?[0-9]?)(\/)([0-9]?[0-9]?)$"
    let ptrnDateTime = @"(\d{2})\/(\d{2})\/(\d{4}) (\d{2}):(\d{2}):(\d{2})"
    let ToFraction input =
        match input with
        | Regex ptrnFraction [ numerator; operator; denominator ] ->
            (numerator + operator + denominator).ToString()
        | Regex ptrnDateTime [ day; month; year; hours; minutes; seconds ] ->
            (day + "/" + month).ToString()
        | _ -> "Not valid!"
    let dtInput = @"05/09/2017 00:00:00"
    let frcInput = @"13/20"
    let outDate = ToFraction dtInput
    printfn "Out Date: %s" outDate
    let outFraction = ToFraction frcInput
    printfn "Out Fraction: %s" outFraction
    //Output:> Out Date: 05/09 Out Fraction: 13/20
Thus, I was able to replace:
"\"" + row.``Fraction``.ToString() + "\"" + ";" |> writer.WriteLine
with:
"\"" + ToFraction(row.``Fraction``.ToString()) + "\"" + ";" |> writer.Write
Thanks to @rmunn for the clarity of his explanations and the benefit of his expertise.

crc ip hdr checksum in verilog

I am implementing a task that I can use to obtain the checksum of a modified IP header. This is what I have:
task checksum_calc;
    input [159:0] IP_hdr_data;
    output [15:0] IP_chksum;
    reg [19:0] IP_chksum_temp;
    reg [19:0] IP_chksum_temp1;
    reg [19:0] IP_chksum_temp2;
    begin
        // sum nine 16-bit words of the header (bits [95:80] are left out)
        IP_chksum_temp = IP_hdr_data[15:0] + IP_hdr_data[31:16] + IP_hdr_data[47:32] + IP_hdr_data[63:48] + IP_hdr_data[79:64] + IP_hdr_data[111:96] + IP_hdr_data[127:112] + IP_hdr_data[143:128] + IP_hdr_data[159:144];
        // fold the carries back into the low 16 bits, twice
        IP_chksum_temp1 = IP_chksum_temp[15:0] + IP_chksum_temp[19:16];
        IP_chksum_temp2 = IP_chksum_temp1[15:0] + IP_chksum_temp1[19:16];
        // bitwise NOT (~); logical NOT (!) would reduce the value to a single bit
        IP_chksum = ~IP_chksum_temp2[15:0];
    end
endtask
Is that correct? Or will there be timing problems from using combinational logic?
It looks like all you are doing is a combinational logic calculation, so a function is a better choice. The primary purpose of a function is to return a value that is used in an expression.
This is a large block of combinational logic, which in most scenarios will cause timing trouble.
It is better to run it through synthesis and a timing check to know the exact result.
One suggestion: since
IP_chksum_temp1 = IP_chksum_temp[15:0] + IP_chksum_temp[19:16];
can only carry into bit 16, there is no need for 20 bits in the next addition:
IP_chksum_temp2 = IP_chksum_temp1[15:0] + IP_chksum_temp1[19:16];
So the registers can be declared as follows (the second addition then adds IP_chksum_temp1[16] instead of [19:16]):
reg [16:0] IP_chksum_temp1;
reg [16:0] IP_chksum_temp2;
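For checking the task in a testbench, a software reference model can help. Below is a minimal Python sketch of the usual Internet checksum (16-bit ones' complement sum, as described in RFC 1071). The function name ip_header_checksum and the sample header words are chosen for illustration, and the caller is assumed to pass the checksum field as zero, since which bits of IP_hdr_data hold that field depends on how the header is packed.

```python
from typing import Iterable

def ip_header_checksum(words: Iterable[int]) -> int:
    """Ones' complement of the ones' complement sum of 16-bit words."""
    total = sum(w & 0xFFFF for w in words)
    # Fold any carries above bit 15 back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    # Bitwise complement of all 16 bits.
    return ~total & 0xFFFF

# Sample 20-byte header as ten 16-bit words, checksum field set to zero.
header = [0x4500, 0x0073, 0x0000, 0x4000, 0x4011,
          0x0000, 0xC0A8, 0x0001, 0xC0A8, 0x00C7]
checksum = ip_header_checksum(header)
print(hex(checksum))

# Verification: with the checksum filled in, the result must be zero.
header[5] = checksum
print(ip_header_checksum(header))
```

The largest possible sum of nine 16-bit words is 0x8FFF7, which fits in 20 bits, and the first fold can produce at most 0x1000E, which is the reason 17 bits are enough for the later stages.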
