building a nested pyparsing.Dict from a nested list - pyparsing

My input is growing from simple 2-level nested lists to a complex nested list of lists. I see where pyparsing.nestedExpr() is the bees' knees for this kind of thing, but I'm still wanting to build up a nested Dict structure.
With the basics somewhat squared away I've crafted this:
import pyparsing as pp
input_works = '''
(unitsOfMeasure
(altitudeUnits "m")
(capacitanceUnits "pF")
(designUnits "MIL")
(drawingUnits "MIL")
(drawingAccuracy 2)
(drawingHeight 28000)
)'''
# recursive dict
input_doesnt_work = '''
(parameterFile "out.tf"
(revision "15.6")
(xcoord 1234.567)
(ycoord -3456.890)
(unitsOfMeasure
(altitudeUnits "m")
(capacitanceUnits "pF")
(designUnits "MIL")
(drawingUnits "MIL")
(drawingAccuracy 2)
(drawingHeight 28000)
)
)'''
v_string = pp.Word(pp.alphanums+'_'+'-'+'.')
v_quoted_string = pp.Combine( '"' + v_string + '"')
v_number = pp.Regex(r'[+-]?(?P<float1>\d+)(?P<float2>\.\d+)?(?P<float3>[Ee][+-]?\d+)?')
keyy = v_string
valu = pp.Or( [ v_string, v_quoted_string, v_number])
item = pp.Group( pp.Literal('(').suppress() + keyy + pp.OneOrMore( valu) + pp.Literal(')').suppress() )
# some magic - use Forward to make the dicts self-referential and thus recursive
dicts = pp.Forward()
dicts << pp.Group( pp.Literal('(').suppress() + \
keyy + \
pp.Optional( valu) + \
pp.OneOrMore( pp.Or( item, dicts)) + \
pp.Literal(')').suppress() )
print("dicts_input_works yields: ", dicts.parseString(input_works))
print("dicts_input_doesnt_work yields: ", dicts.parseString(input_doesnt_work))
input_doesnt_work chokes on line 6, col 5, as if the self-reference in
pp.OneOrMore( pp.Or( item, dicts))
isn't being seen.
TIA,
code_warrior

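Spotted my error 3 seconds after posting:
pp.OneOrMore( pp.Or( item, dicts )) # wrong: dicts is passed as Or's second argument, not as an alternative
pp.OneOrMore( pp.Or( [item, dicts] )) # right: the alternatives are passed as one list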
Never mind, nothing to see here, move along.
Thanks,
code_warrior
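For later readers, here is a minimal sketch of the corrected pattern, assuming a reasonably recent pyparsing (2.x or 3.x). It is a simplified grammar, not the exact one from the question: the names LPAR, RPAR, key, value and text are illustrative, QuotedString is swapped in for the Combine expression, and numbers are left as strings. As far as I can tell, the original call failed because the second positional argument of Or is the savelist flag rather than another alternative, so pp.Or(item, dicts) only ever tried item.

```python
import pyparsing as pp

LPAR, RPAR = pp.Suppress("("), pp.Suppress(")")
key = pp.Word(pp.alphanums + "_-.")
quoted = pp.QuotedString('"')      # returns the text without the quotes
value = quoted | key               # assumption: numbers stay as strings here

# a flat entry: (key value [value ...])
item = pp.Group(LPAR + key + pp.OneOrMore(value) + RPAR)

# a nested entry: (key [value] entry [entry ...]), defined recursively
dicts = pp.Forward()
dicts <<= pp.Group(
    LPAR + key + pp.Optional(value)
    + pp.OneOrMore(pp.Or([item, dicts]))   # alternatives passed as ONE list
    + RPAR
)

text = '''
(parameterFile "out.tf"
    (revision "15.6")
    (unitsOfMeasure
        (altitudeUnits "m")
        (drawingAccuracy 2)
    )
)'''

result = dicts.parseString(text).asList()
print(result)
```

The nested list mirrors the nesting of the input, so it can be walked recursively to build whatever dict structure is needed.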

Related

Get-values from a html form in a for/do loop

I have a problem with the get-value() function in Progress 4GL.
I am trying to get all values from an HTML form.
My Progress 4GL code looks like this:
for each tt:
    do k = 1 to integer(h-timeframe):
        h-from [k] = get-value(string(day(tt.date)) + "#" + string(tt.fnr) + "#" + string(tt.pnr) + "_von" + string(k)).
        h-to [k] = get-value(string(day(tt.date)) + "#" + string(tt.fnr) + "#" + string(tt.pnr) + "_bis" + string(k)).
        h-code [k] = get-value(string(day(tt.date)) + "#" + string(tt.fnr) + "#" + string(tt.pnr) + "_code" + string(k)).
    end.
end.
h-timeframe is a parameter with a maximum value of 10 (1-10).
tt is a temp-table that represents one week (always 7 days).
It works perfectly up to a value of 9. If I choose 10 (the maximum), I get a performance problem with the get-value() function.
Example when h-timeframe = 10:
Getting from one get-value call to the next takes a really long time (h-timeframe = 10).
Example when h-timeframe = 9:
Here it is much faster.
Can anyone explain why? It is really strange and I have no idea.
P.S.: I only have this problem with 10. With 0-9 it works perfectly.
The performance difference is probably something external to your code snippet but, for performance, I would write it more like this:
define variable d as integer no-undo.
define variable n as integer no-undo.
define variable s as character no-undo.
for each tt:
    // avoid recalculating and invoking functions N times per TT record
    assign
        d = day( tt.date )
        n = integer( h-timeframe )
        s = substitute( "&1#&2#&3_&&1&&2", d, tt.fnr, tt.pnr )
    .
    do k = 1 to n:
        // consolidate multiple repeated operations, eliminate STRING() calls
        assign
            h-from [k] = get-value( substitute( s, "von", k ))
            h-to [k] = get-value( substitute( s, "bis", k ))
            h-code [k] = get-value( substitute( s, "code", k ))
        .
    end.
end.

pyparsing parse c/cpp enums with values as user defined macros

I have a use case where I need to match enums whose values can be user-defined macros.
Example enum
typedef enum
{
VAL_1 = -1,
VAL_2 = 0,
VAL_3 = 0x10,
VAL_4 = TEST_ENUM_CUSTOM(1,2),
}MyENUM;
I am using the code below. It works as long as no value uses the format shown in VAL_4, but I need to match that format as well. I am new to pyparsing; any help is appreciated.
My code:
LBRACE, RBRACE, EQ, COMMA = map(Suppress, "{}=,")
_enum = Suppress("enum")
identifier = Word(alphas, alphanums + "_")
integer = Word("-"+alphanums) # I have tried adding "_(,)" to this, but it does not match.
enumValue = Group(identifier("name") + Optional(EQ + integer("value")))
enumList = Group(enumValue + ZeroOrMore(COMMA + enumValue) + Optional(COMMA))
enum = _enum + Optional(identifier("enum")) + LBRACE + enumList("names") + RBRACE + Optional(identifier("typedef"))
enum.ignore(cppStyleComment)
enum.ignore(cStyleComment)
Thanks
-Purna
Adding more characters to integer is the wrong way to go. Even this expression:
integer = Word("-"+alphanums)
isn't super-great, since it would match "---", "xyz", "q--10-", and many other non-integer strings.
Better to define integer properly. You could do:
integer = Combine(Optional('-') + Word(nums))
but I've found that for these low-level expressions that occur many places in your parse string, a Regex is best:
integer = Regex(r"-?\d+") # Regex(r"-?[0-9]+") if you like more readable re's
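To make the difference concrete, here is a small sketch comparing the two definitions on the strings mentioned above. The helper accepts and the names loose, strict and samples are made up for this example.

```python
from pyparsing import ParseException, Regex, Word, alphanums

loose = Word("-" + alphanums)    # the expression from the question
strict = Regex(r"-?\d+")         # optional sign followed by digits only

def accepts(expr, text):
    """Return True if expr matches the whole of text."""
    try:
        expr.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False

samples = ["-1", "42", "---", "xyz", "q--10-"]
loose_ok = [s for s in samples if accepts(loose, s)]
strict_ok = [s for s in samples if accepts(strict, s)]
print(loose_ok)    # every sample is accepted
print(strict_ok)   # only the real integers are accepted
```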
Then define a similar expression for hex_integer.
To add macros, we need a recursive expression, to handle the possibility of macros having arguments that are also macros.
So at this point, we should just stop writing code for a bit, and do some design. In parser development, this design usually looks like a BNF, where you describe your parser in a sort of pseudocode:
enum_expr ::= "typedef" "enum" [identifier]
                  "{"
                  enum_item_list
                  "}" [identifier] ";"
enum_item_list ::= enum_item ["," enum_item]... [","]
enum_item ::= identifier "=" enum_value
enum_value ::= integer | hex_integer | macro_expression
macro_expression ::= identifier "(" enum_value ["," enum_value]... ")"
Note the recursion of macro_expression: it is used in defining enum_value, but it includes enum_value as part of its own definition. In pyparsing, we use a Forward to set up this kind of recursion.
See how that BNF is implemented in the code below. I build on some of the items you posted, but the macro expression required some rework. The bottom line is "don't just keep adding characters to integer trying to get something to work."
LBRACE, RBRACE, EQ, COMMA, LPAR, RPAR, SEMI = map(Suppress, "{}=,();")
_typedef = Keyword("typedef").suppress()
_enum = Keyword("enum").suppress()
identifier = Word(alphas, alphanums + "_")
# define an enumValue expression that is recursive, so that enumValues
# that are macros can take parameters that are enumValues
enumValue = Forward()
# add more types as needed - parse action on hex_integer will do parse-time
# conversion to int
integer = Regex(r"-?\d+").addParseAction(lambda t: int(t[0]))
# or just use the signed_integer expression found in pyparsing_common
# integer = pyparsing_common.signed_integer
hex_integer = Regex(r"0x[0-9a-fA-F]+").addParseAction(lambda t: int(t[0], 16))
# a macro defined using enumValue for parameters
macro_expr = Group(identifier + LPAR + Group(delimitedList(enumValue)) + RPAR)
# use '<<=' operator to attach recursive definition to enumValue
enumValue <<= hex_integer | integer | macro_expr
# remaining enum expressions
enumItem = Group(identifier("name") + Optional(EQ + enumValue("value")))
enumList = Group(delimitedList(enumItem) + Optional(COMMA))
enum = (_typedef
+ _enum
+ Optional(identifier("enum"))
+ LBRACE
+ enumList("names")
+ RBRACE
+ Optional(identifier("typedef"))
+ SEMI
)
# this comment style includes cStyleComment too, so no need to
# ignore both
enum.ignore(cppStyleComment)
Try it out:
enum.runTests([
"""
typedef enum
{
VAL_1 = -1,
VAL_2 = 0,
VAL_3 = 0x10,
VAL_4 = TEST_ENUM_CUSTOM(1,2)
}MyENUM;
""",
])
runTests is for testing and debugging your parser during development. Use enum.parseString(some_enum_expression) or enum.searchString(some_c_header_file_text) to get the actual parse results.
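As a rough sketch of that last step, the snippet below repeats the grammar so that it runs on its own and then uses searchString to pull every enum out of a larger text. The header text and the enums dict are made up for illustration. The items are read by position rather than through the results names, which is a choice made for this sketch only.

```python
from pyparsing import (Forward, Group, Keyword, Optional, Regex, Suppress,
                       Word, alphas, alphanums, cppStyleComment, delimitedList)

LBRACE, RBRACE, EQ, COMMA, LPAR, RPAR, SEMI = map(Suppress, "{}=,();")
identifier = Word(alphas, alphanums + "_")

enumValue = Forward()
integer = Regex(r"-?\d+").addParseAction(lambda t: int(t[0]))
hex_integer = Regex(r"0x[0-9a-fA-F]+").addParseAction(lambda t: int(t[0], 16))
macro_expr = Group(identifier + LPAR + Group(delimitedList(enumValue)) + RPAR)
enumValue <<= hex_integer | integer | macro_expr

enumItem = Group(identifier("name") + Optional(EQ + enumValue("value")))
enumList = Group(delimitedList(enumItem) + Optional(COMMA))
enum = (Keyword("typedef").suppress() + Keyword("enum").suppress()
        + Optional(identifier("enum")) + LBRACE + enumList("names") + RBRACE
        + Optional(identifier("typedef")) + SEMI)
enum.ignore(cppStyleComment)

# hypothetical header text, made up for this example
header_text = """
#define LIMIT 10
// status codes
typedef enum
{
    VAL_1 = -1,
    VAL_2 = 0,
    VAL_3 = 0x10,
    VAL_4 = TEST_ENUM_CUSTOM(1,2),
}MyENUM;
int lookup(int code);
"""

enums = {}
for match in enum.searchString(header_text):
    # each item is [name] or [name, value]
    items = match.names.asList()
    enums[match.typedef] = {item[0]: (item[1] if len(item) > 1 else None)
                            for item in items}
print(enums)
```

Macro values come back as nested lists of the form [macro_name, [arguments]], so they still need to be evaluated or expanded separately.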
The new railroad diagram feature in the upcoming pyparsing 3.0 release can generate a visual representation of this parser.

pyparsing: Grouping guidelines

Below is the code I put together, which can parse a nested function call, a logical function call, or a hybrid call that nests both. The dump() output has too many unnecessary levels of nesting because of grouping, but removing the Group() calls produces wrong output. Is there a guideline for when to use Group()?
The pyparsing documentation also doesn't explain how to walk the resulting tree, and not much information is available elsewhere. Please point me to a link or guide that would help me write a tree walker for recursively parsed data for my test cases.
I will be translating this parsed data into valid Tcl code.
from pyparsing import *
from pyparsing import OneOrMore, Optional, Word, delimitedList, Suppress
# parse action maker, from Paul's example
def makeLRlike(numterms):
    if numterms is None:
        # None operator can only be binary op
        initlen = 2
        incr = 1
    else:
        initlen = {0:1,1:2,2:3,3:5}[numterms]
        incr = {0:1,1:1,2:2,3:4}[numterms]
    # define parse action for this number of terms,
    # to convert flat list of tokens into nested list
    def pa(s,l,t):
        t = t[0]
        if len(t) > initlen:
            ret = ParseResults(t[:initlen])
            i = initlen
            while i < len(t):
                ret = ParseResults([ret] + t[i:i+incr])
                i += incr
            return ParseResults([ret])
    return pa
line = Forward()
fcall = Forward().setResultsName("fcall")
flogical = Forward()
lparen = Literal("(").suppress()
rparen = Literal(")").suppress()
arg = Word(alphas,alphanums+"_"+"."+"+"+"-"+"*"+"/")
args = delimitedList(arg).setResultsName("arg")
fargs = delimitedList(OneOrMore(flogical) | OneOrMore(fcall) |
OneOrMore(arg))
fname = Word(alphas,alphanums+"_")
fcall << Group(fname.setResultsName('func') + Group(lparen +
Optional(fargs) + rparen).setResultsName('fargs'))
flogic = Keyword("or") | Keyword("and") | Keyword("not")
logicalArg = delimitedList(Group(fcall.setResultsName("fcall")) |
Group(arg.setResultsName("arg")))
#logicalArg.setDebug()
flogical << Group(logicalArg.setResultsName('larg1') +
flogic.setResultsName('flogic') + logicalArg.setResultsName('larg2'))
#logical = operatorPrecedence(flogical, [(not, 1, opAssoc.RIGHT, makeLRlike(2)),
#                                        (and, 2, opAssoc.LEFT, makeLRlike(2)),
#                                        (or , 2, opAssoc.LEFT, makeLRlike(2))])
line = flogical | fcall #change to logical if operatorPrecedence is used
# Works fine
print(line.parseString("f(x, y)").dump())
print(line.parseString("f(h())").dump())
print(line.parseString("a and b").dump())
print(line.parseString("f(a and b)").dump())
print(line.parseString("f(g(x))").dump())
print(line.parseString("f(a and b) or h(b not c)").dump())
print(line.parseString("f(g(x), y)").dump())
print(line.parseString("g(f1(x), a, b, f2(x,y, k(x,y)))").dump())
print(line.parseString("f(a not c) and g(f1(x), a, b, f2(x,y, k(x,y)))").dump())
#Doesn't work fine yet;
#try changing flogical assignment to logicalArg | flogic
#print line.parseString("a or b or c").dump()
#print line.parseString("f(a or b(x) or c)").dump()

pyparsing delimitedList(..., combine=True) giving inconsistent result

I'm using pyparsing==2.1.5 with Python 3.4, and I'm getting what seems to be an odd result:
word = Word(alphanums)
word_list_no_combine = delimitedList(word, combine=False)
word_list_combine = delimitedList(word, combine=True)
print(word_list_no_combine.parseString('one, two')) # ['one', 'two']
print(word_list_no_combine.parseString('one,two')) # ['one', 'two']
print(word_list_combine.parseString('one, two')) # ['one']: ODD ONE OUT
print(word_list_combine.parseString('one,two')) # ['one,two']
It's not obvious to me why the "combine" option causes one of the parts of the list to be swallowed when a space is present, but not when it's absent. Is this a pyparsing bug or am I missing something obvious?
Rather than modify pyparsing, I suggest you do this work using normal uncombined delimited list with a custom parse action:
word_list_combine_using_parse_action = word_list_no_combine.copy().setParseAction(','.join)
print(word_list_combine_using_parse_action.parseString('one, two'))
This prints ['one,two'].
Looks like it's due to the behaviour of Combine(), specifically its default "adjacent=True" option, which is then used by delimitedList():
class Combine(TokenConverter):
    """Converter to concatenate all matching tokens to a single string.
       By default, the matching patterns must also be contiguous in the input string;
       this can be disabled by specifying C{'adjacent=False'} in the constructor.
    """
    def __init__( self, expr, joinString="", adjacent=True ):
        # ...
def delimitedList( expr, delim=",", combine=False ):
    # ...
    dlName = _ustr(expr)+" ["+_ustr(delim)+" "+_ustr(expr)+"]..."
    if combine:
        return Combine( expr + ZeroOrMore( delim + expr ) ).setName(dlName)
    else:
        return ( expr + ZeroOrMore( Suppress( delim ) + expr ) ).setName(dlName)
So it can be solved with a replacement:
def delimitedListPlus(expr, delim=",", combine=False, combine_adjacent=False):
    dlName = str(expr) + " [" + str(delim) + " " + str(expr) + "]..."
    if combine:
        return Combine(expr + ZeroOrMore(delim + expr),
                       adjacent=combine_adjacent).setName(dlName)
    else:
        return (expr + ZeroOrMore(Suppress(delim) + expr)).setName(dlName)
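The effect of the adjacent flag can be seen in isolation with a small sketch that calls Combine directly instead of going through delimitedList. The names strict and relaxed are just labels for this example.

```python
from pyparsing import Combine, Word, ZeroOrMore, alphanums

word = Word(alphanums)

# adjacent=True (the default): no whitespace is allowed between the pieces
strict = Combine(word + ZeroOrMore("," + word))
# adjacent=False: whitespace between the pieces is skipped as usual
relaxed = Combine(word + ZeroOrMore("," + word), adjacent=False)

print(strict.parseString("one,two").asList())    # ['one,two']
print(strict.parseString("one, two").asList())   # ['one']
print(relaxed.parseString("one, two").asList())  # ['one,two']
```

In the strict case the match simply stops after 'one', because the space after the comma prevents the next word from matching. That is why no error is raised unless parseAll=True is passed.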

BBC Basic Cipher Help Needed

I am doing a school project in which I need to make a "sort of" Vigenere cipher where the user inputs both the keyword and the plaintext. The standard Vigenere cipher assumes a=0, whereas I am to assume a=1, and I have changed my program accordingly. However, my cipher is required to work for both lower and upper case. How can I make it work for lower case as well? It may be a stupid question, but I'm very confused at this point and I'm new to programming. Thanks.
REM Variables
plaintext$=""
PRINT "Enter the text you would like to encrypt"
INPUT plaintext$
keyword$=""
PRINT "Enter the keyword you wish to use"
INPUT keyword$
encrypted$= FNencrypt(plaintext$, keyword$)
REM PRINTING OUTPUTS
PRINT "Key = " keyword$
PRINT "Plaintext = " plaintext$
PRINT "Encrypted = " encrypted$
PRINT "Decrypted = " FNdecrypt(encrypted$, keyword$)
END
DEF FNencrypt(plain$, keyword$)
LOCAL i%, offset%, Ascii%, output$
FOR i% = 1 TO LEN(plain$)
Ascii% = ASCMID$(plain$, i%)
IF Ascii% >= 65 IF Ascii% <= 90 THEN
output$ += CHR$((66 + (Ascii% + ASCMID$(keyword$, offset%+1)) MOD 26))
ENDIF
offset% = (offset% + 1) MOD LEN(keyword$)
NEXT
= output$
DEF FNdecrypt(encrypted$, keyword$)
LOCAL i%, offset%, n%, o$
FOR i% = 1 TO LEN(encrypted$)
n% = ASCMID$(encrypted$, i%)
o$ += CHR$(64 + (n% + 26 - ASCMID$(keyword$, offset%+1)) MOD 26)
offset% = (offset% + 1) MOD LEN(keyword$)
NEXT
= o$
You can always convert from upper to lower case; the STRINGLIB library contains a function for doing this.
First install the library at the top of your program:
INSTALL @lib$+"STRINGLIB"
then convert strings using:
plaintext$ = FN_lower(plaintext$)
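If the case of each letter should be kept rather than folded, another option is to choose the base letter per character. The sketch below shows that idea in Python rather than BBC BASIC, purely to illustrate the arithmetic. The function name vigenere_a1 is made up, the keyword is assumed to contain only letters, and non-letters are assumed to pass through unchanged without using up a keyword letter.

```python
import string

def vigenere_a1(text, keyword, decrypt=False):
    """Vigenere variant where a counts as 1 (not 0); letter case is preserved."""
    # keyword letters become shifts: a=1, b=2, ..., z=26
    shifts = [ord(k) - ord("a") + 1 for k in keyword.lower()]
    out = []
    used = 0  # number of keyword letters consumed so far
    for ch in text:
        if ch not in string.ascii_letters:
            out.append(ch)  # assumption: leave non-letters untouched
            continue
        # pick the base from the case of this character
        base = ord("A") if ch.isupper() else ord("a")
        shift = shifts[used % len(shifts)]
        used += 1
        if decrypt:
            shift = -shift
        out.append(chr(base + (ord(ch) - base + shift) % 26))
    return "".join(out)

secret = vigenere_a1("Hello, World", "key")
print(secret)
print(vigenere_a1(secret, "key", decrypt=True))
```

The same approach should carry over to the BASIC functions by choosing 65 or 97 as the base depending on the range the character code falls in.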
