How do I correctly name ParseResults? - pyparsing

I like to name the entities in my grammar so I can access them using the as_dict() feature of ParseResults. But somehow it is not obvious to me where exactly I should "group" and "name" them. This often results in some kind of trial and error process.
To make more clear what I mean I tried to strip down the problem to a minimal example:
If we define an identifier that is labelled with "I" and holds the name of the identifier:
from pyparsing import *
identifier = Word(alphas,nums)
gid = Group(identifier("I"))
idg = Group(identifier)("I")
t=gid.parseString("x1")
print(t.as_dict(), t.as_list())
t=idg.parseString("x1")
print(t.as_dict(), t.as_list())
results in:
{} [['x1']]
{'I': ['x1']} [['x1']]
which suggests that I should first "Group" then "name" the identifier.
However if I use a sequence of these (named "P") it's vice versa, as this (continued) example shows:
prog= [
Group(ZeroOrMore(gid)).setResultsName("P"),
Group(ZeroOrMore(idg)).setResultsName("P"),
]
s = "x1 x2"
for i in range(0,len(prog)):
t=prog[i].parseString(s)
print(t.as_dict(), t.as_list())
for v in t.P:
print(v.as_dict(), t.as_list())
which outputs:
{'P': [{'I': 'x1'}, {'I': 'x2'}]} [[['x1'], ['x2']]]
{'I': 'x1'} [[['x1'], ['x2']]]
{'I': 'x2'} [[['x1'], ['x2']]]
{'P': {'I': ['x2']}} [[['x1'], ['x2']]]
{} [[['x1'], ['x2']]]
{} [[['x1'], ['x2']]]
Am I doing something wrong? Or did I just misunderstand named results?
Cheers,
Alex

Welcome to pyparsing! Grouping and results names are really important features to get a good understanding of, for making parsers with useful results, so it's great that you are learning these basics.
I had suggested using create_diagram() to better see the structure and the names for these expressions. But they are almost too simple for the diagrams to really show much. As you work with pyparsing further, you might come back to using create_diagram to make parser railroad diagrams for your pyparsing parsers.
Instead, I replicated your steps, but instead of using results.as_dict() and results.as_list() (where results is the pyparsing ParseResults value returned from calling parse_string()), I used another visualizing method, results.dump(). dump() prints out results.as_list(), followed by an indented list of the items by results name, and then by sub-lists. I think dump() will show a little better how names and groups work in your expressions.
One of the main points is that as_dict() will only walk named items. If you had an expression for two identifiers like this (where only one expression has a results name:
two_idents = identifier() + identifier("final")
Then print(two_idents.parse_string("x1 x2").as_list()) will print:
['x1', 'x2']
But print(two_idents.parse_string("x1 x2").as_dict()) will only show:
{"final": "x2"}
because only the second item has a name. (This would even be the case if the unnamed item was a group containing a sub-expression with a results name. as_dict() only walks items with results names, so the unnamed containing group would be omitted.)
Here's how dump() would display these:
['x1', 'x2']
- final: 'x2'
It shows that a list view of the results has 'x1' and 'x2', and there is a top-level results name 'final' that points to 'x2'.
Here is my annotated version of your code, and the corresponding as_dict() and dump() output from each:
from pyparsing import *
identifier = Word(alphas, nums)
# group an expression that has a results name
gid = Group(identifier("I"))
# group an unnamed expression, and put the results name on the group
idg = Group(identifier)("I")
# groups with the results name "P" on the outer group
prog0 = Group(ZeroOrMore(gid)).setResultsName("P")
prog1 = Group(ZeroOrMore(idg)).setResultsName("P")
# pyparsing short-cut for x.set_name("x") for gid, idg, prog0, and prog1
autoname_elements()
s = "x1 x2"
for expr in (gid, idg, prog0, prog1):
print(expr) # prints the expression name
result = expr.parse_string(s)
print(result.as_dict())
print(result.dump())
print()
Gives this output:
gid
{}
[['x1']]
[0]:
['x1']
- I: 'x1'
idg
{'I': ['x1']}
[['x1']]
- I: ['x1']
[0]:
['x1']
prog0
{'P': [{'I': 'x1'}, {'I': 'x2'}]}
[[['x1'], ['x2']]]
- P: [['x1'], ['x2']]
[0]:
['x1']
- I: 'x1'
[1]:
['x2']
- I: 'x2'
[0]:
[['x1'], ['x2']]
[0]:
['x1']
- I: 'x1'
[1]:
['x2']
- I: 'x2'
prog1
{'P': {'I': ['x2']}}
[[['x1'], ['x2']]]
- P: [['x1'], ['x2']]
- I: ['x2']
[0]:
['x1']
[1]:
['x2']
[0]:
[['x1'], ['x2']]
- I: ['x2']
[0]:
['x1']
[1]:
['x2']
Explanations:
gid is an unnamed group containing a named item. Since there is no top-level named item, as_dict() returns an empty dict.
idg is a named group containing an unnamed item. as_dict() returns a dict with the outer with the single item 'x1'
prog0 is 0 or more unnamed groups contained in a named group. Each of the contained groups has a named item.
prog1 is 0 or more named groups contained in a named group. Since the named groups all have the same results name, only the last one is kept in the results - this is similar to creating a Python dict using the same key multiple times. print({'a':100, 'a':200}) will print {'a': 200}. You can override this default behavior in pyparsing by adding list_all_matches=True argument to your call to set_results_name. Using list_all_matches=True makes the result act like a defaultdict(list) instead of a dict.
Please visit the pyparsing docs at https://pyparsing-docs.readthedocs.io/en/latest/ and some additional tips in the pyparsing wiki at https://github.com/pyparsing/pyparsing/wiki .

Related

iterating 2D array in Elixir

I am new to Elixir language and I am having some issues while writing a piece of code.
What I am given is a 2D array like
list1 = [
[1 ,2,3,4,"nil"],
[6,7,8,9,10,],
[11,"nil",13,"nil",15],
[16,17,"nil",19,20] ]
Now, what I've to do is to get all the elements that have values between 10 and 20, so what I'm doing is:
final_list = []
Enum.each(list1, fn row ->
Enum.each(row, &(if (&1 >= 10 and &1 <= 99) do final_list = final_list ++ &1 end))
end
)
Doing this, I'm expecting that I'll get my list of numbers in final_list but I'm getting blank final list with a warning like:
warning: variable "final_list" is unused (there is a variable with the same name in the context, use the pin operator (^) to match on it or prefix this variable with underscore if it is not meant to be used)
iex:5
:ok
and upon printing final_list, it is not updated.
When I try to check whether my code is working properly or not, using IO.puts as:
iex(5)> Enum.each(list1, fn row -> ...(5)> Enum.each(row, &(if (&1 >= 10 and &1 <= 99) do IO.puts(final_list ++ &1) end))
...(5)> end
...(5)> )
The Output is:
10
11
13
15
16
17
19
20
:ok
What could I possibly be doing wrong here? Shouldn't it add the elements to the final_list?
If this is wrong ( probably it is), what should be the possible solution to this?
Any kind of help will be appreciated.
As mentioned in Adam's comments, this is a FAQ and the important thing is the message "warning: variable "final_list" is unused (there is a variable with the same name in the context, use the pin operator (^) to match on it or prefix this variable with underscore if it is not meant to be used)" This message actually indicates a very serious problem.
It tells you that the assignment "final_list = final_list ++ &1" is useless since it just creates a local variable, hiding the external one. Elixir variables are not mutable so you need to reorganize seriously your code.
The simplest way is
final_list =
for sublist <- list1,
n <- sublist,
is_number(n),
n in 10..20,
do: n
Note that every time you write final_list = ..., you actually declare a new variable with the same name, so the final_list you declared inside your anonymous function is not the final_list outside the anonymous function.

SML Create function receives list of tuples and return list with sum each pair

I'm studying Standard ML and one of the exercices I have to do is to write a function called opPairs that receives a list of tuples of type int, and returns a list with the sum of each pair.
Example:
input: opPairs [(1, 2), (3, 4)]
output: val it = [3, 7]
These were my attempts, which are not compiling:
ATTEMPT 1
type T0 = int * int;
fun opPairs ((h:TO)::t) = let val aux =(#1 h + #2 h) in
aux::(opPairs(t))
end;
The error message is:
Error: unbound type constructor: TO
Error: operator and operand don't agree [type mismatch]
operator domain: {1:'Y; 'Z}
operand: [E]
in expression:
(fn {1=1,...} => 1) h
ATTEMPT 2
fun opPairs2 l = map (fn x => #1 x + #2 x ) l;
The error message is: Error: unresolved flex record (need to know the names of ALL the fields
in this context)
type: {1:[+ ty], 2:[+ ty]; 'Z}
The first attempt has a typo: type T0 is defined, where 0 is zero, but then type TO is referenced in the pattern, where O is the letter O. This gets rid of the "operand and operator do not agree" error, but there is a further problem. The pattern ((h:T0)::t) does not match an empty list, so there is a "match nonexhaustive" warning with the corrected type identifier. This manifests as an exception when the function is used, because the code needs to match an empty list when it reaches the end of the input.
The second attempt needs to use a type for the tuples. This is because the tuple accessor #n needs to know the type of the tuple it accesses. To fix this problem, provide the type of the tuple argument to the anonymous function:
fun opPairs2 l = map (fn x:T0 => #1 x + #2 x) l;
But, really it is bad practice to use #1, #2, etc. to access tuple fields; use pattern matching instead. Here is a cleaner approach, more like the first attempt, but taking full advantage of pattern matching:
fun opPairs nil = nil
| opPairs ((a, b)::cs) = (a + b)::(opPairs cs);
Here, opPairs returns an empty list when the input is an empty list, otherwise pattern matching provides the field values a and b to be added and consed recursively onto the output. When the last tuple is reached, cs is the empty list, and opPairs cs is then also the empty list: the individual tuple sums are then consed onto this empty list to create the output list.
To extend on exnihilo's answer, once you have achieved familiarity with the type of solution that uses explicit recursion and pattern matching (opPairs ((a, b)::cs) = ...), you can begin to generalise the solution using list combinators:
val opPairs = map op+

Evaluate expression with local variables

I'm writing a genetic program in order to test the fitness of randomly generated expressions. Shown here is the function to generate the expression as well a the main function. DIV and GT are defined elsewhere in the code:
function create_single_full_tree(depth, fs, ts)
"""
Creates a single AST with full depth
Inputs
depth Current depth of tree. Initially called from main() with max depth
fs Function Set - Array of allowed functions
ts Terminal Set - Array of allowed terminal values
Output
Full AST of typeof()==Expr
"""
# If we are at the bottom
if depth == 1
# End of tree, return function with two terminal nodes
return Expr(:call, fs[rand(1:length(fs))], ts[rand(1:length(ts))], ts[rand(1:length(ts))])
else
# Not end of expression, recurively go back through and create functions for each new node
return Expr(:call, fs[rand(1:length(fs))], create_single_full_tree(depth-1, fs, ts), create_single_full_tree(depth-1, fs, ts))
end
end
function main()
"""
Main function
"""
# Define functional and terminal sets
fs = [:+, :-, :DIV, :GT]
ts = [:x, :v, -1]
# Create the tree
ast = create_single_full_tree(4, fs, ts)
#println(typeof(ast))
#println(ast)
#println(dump(ast))
x = 1
v = 1
eval(ast) # Error out unless x and v are globals
end
main()
I am generating a random expression based on certain allowed functions and variables. As seen in the code, the expression can only have symbols x and v, as well as the value -1. I will need to test the expression with a variety of x and v values; here I am just using x=1 and v=1 to test the code.
The expression is being returned correctly, however, eval() can only be used with global variables, so it will error out when run unless I declare x and v to be global (ERROR: LoadError: UndefVarError: x not defined). I would like to avoid globals if possible. Is there a better way to generate and evaluate these generated expressions with locally defined variables?
Here is an example for generating an (anonymous) function. The result of eval can be called as a function and your variable can be passed as parameters:
myfun = eval(Expr(:->,:x, Expr(:block, Expr(:call,:*,3,:x) )))
myfun(14)
# returns 42
The dump function is very useful to inspect the expression that the parsers has created. For two input arguments you would use a tuple for example as args[1]:
julia> dump(parse("(x,y) -> 3x + y"))
Expr
head: Symbol ->
args: Array{Any}((2,))
1: Expr
head: Symbol tuple
args: Array{Any}((2,))
1: Symbol x
2: Symbol y
typ: Any
2: Expr
[...]
Does this help?
In the Metaprogramming part of the Julia documentation, there is a sentence under the eval() and effects section which says
Every module has its own eval() function that evaluates expressions in its global scope.
Similarly, the REPL help ?eval will give you, on Julia 0.6.2, the following help:
Evaluate an expression in the given module and return the result. Every Module (except those defined with baremodule) has its own 1-argument definition of eval, which evaluates expressions in that module.
I assume, you are working in the Main module in your example. That's why you need to have the globals defined there. For your problem, you can use macros and interpolate the values of x and y directly inside the macro.
A minimal working example would be:
macro eval_line(a, b, x)
isa(a, Real) || (warn("$a is not a real number."); return :(throw(DomainError())))
isa(b, Real) || (warn("$b is not a real number."); return :(throw(DomainError())))
return :($a * $x + $b) # interpolate the variables
end
Here, #eval_line macro does the following:
Main> #macroexpand #eval_line(5, 6, 2)
:(5 * 2 + 6)
As you can see, the values of macro's arguments are interpolated inside the macro and the expression is given to the user accordingly. When the user does not behave,
Main> #macroexpand #eval_line([1,2,3], 7, 8)
WARNING: [1, 2, 3] is not a real number.
:((Main.throw)((Main.DomainError)()))
a user-friendly warning message is provided to the user at parse-time, and a DomainError is thrown at run-time.
Of course, you can do these things within your functions, again by interpolating the variables --- you do not need to use macros. However, what you would like to achieve in the end is to combine eval with the output of a function that returns Expr. This is what the macro functionality is for. Finally, you would simply call your macros with an # sign preceding the macro name:
Main> #eval_line(5, 6, 2)
16
Main> #eval_line([1,2,3], 7, 8)
WARNING: [1, 2, 3] is not a real number.
ERROR: DomainError:
Stacktrace:
[1] eval(::Module, ::Any) at ./boot.jl:235
EDIT 1. You can take this one step further, and create functions accordingly:
macro define_lines(linedefs)
for (name, a, b) in eval(linedefs)
ex = quote
function $(Symbol(name))(x) # interpolate name
return $a * x + $b # interpolate a and b here
end
end
eval(ex) # evaluate the function definition expression in the module
end
end
Then, you can call this macro to create different line definitions in the form of functions to be called later on:
#define_lines([
("identity_line", 1, 0);
("null_line", 0, 0);
("unit_shift", 0, 1)
])
identity_line(5) # returns 5
null_line(5) # returns 0
unit_shift(5) # returns 1
EDIT 2. You can, I guess, achieve what you would like to achieve by using a macro similar to that below:
macro random_oper(depth, fs, ts)
operations = eval(fs)
oper = operations[rand(1:length(operations))]
terminals = eval(ts)
ts = terminals[rand(1:length(terminals), 2)]
ex = :($oper($ts...))
for d in 2:depth
oper = operations[rand(1:length(operations))]
t = terminals[rand(1:length(terminals))]
ex = :($oper($ex, $t))
end
return ex
end
which will give the following, for instance:
Main> #macroexpand #random_oper(1, [+, -, /], [1,2,3])
:((-)([3, 3]...))
Main> #macroexpand #random_oper(2, [+, -, /], [1,2,3])
:((+)((-)([2, 3]...), 3))
Thanks Arda for the thorough response! This helped, but part of me thinks there may be a better way to do this as it seems too roundabout. Since I am writing a genetic program, I will need to create 500 of these ASTs, all with random functions and terminals from a set of allowed functions and terminals (fs and ts in the code). I will also need to test each function with 20 different values of x and v.
In order to accomplish this with the information you have given, I have come up with the following macro:
macro create_function(defs)
for name in eval(defs)
ex = quote
function $(Symbol(name))(x,v)
fs = [:+, :-, :DIV, :GT]
ts = [x,v,-1]
return create_single_full_tree(4, fs, ts)
end
end
eval(ex)
end
end
I can then supply a list of 500 random function names in my main() function, such as ["func1, func2, func3,.....". Which I can eval with any x and v values in my main function. This has solved my issue, however, this seems to be a very roundabout way of doing this, and may make it difficult to evolve each AST with each iteration.

getting values from nested lists in groovy

I have a 3 level nested list in groovy like this
rList = [[[12, name1],[22,name2],[49,name3]],[[33, name5],[22,name6],[21, name7]]]
how can I iterate it to get the name values from the 1st sublist
so I want like rsublist = [name1, name2, name3]
Thanks in advance.
rList = [[[12, 'name1'], [22, 'name2'], [49, 'name3']], [[33, 'name5'], [22, 'name6'], [21, 'name7']]]
names = rList[0]*.getAt(1)
assert names == ['name1', 'name2', 'name3']
First, rList[0] gives you the first sub-list, which is [[12, name1], [22, name2], [49, name3]]. Then, the spread operator, *., is used to apply the same method, getAt(1), to every element of that list, which will return a list with every second element of the sub-lists, which are the values you were looking for :)
You can also use rList[0].collect { it[1] } which is equivalent and might be more familiar if you are not used to the spread operator.
Based on the #epidemian's answer, here's a more readable version:
rList.first()*.last()

Appending data to an AT Field using transmogrifier

I have a CSV file of data like this:
1, [a, b, c]
2, [a, b, d]
3, [a]
and some Plone objects which should be updated like this:
ID, LinesField
a, [1,2,3]
b, [1,2]
c, [1]
d, [2]
So, to clarify, the object with the id a is named on lines 1, 2 and 3 of the CSV, and thus the LinesField property of object a needs to have those line ids (the first number on the line) listed.
Ideally I'd like to use Transmogrifier to import this information (and avoid doing any manipulation in Excel beforehand), and I can see two ways, theoretically of doing this, but I can't work out how to do this in practice. I'd be grateful for some pointers to examples. I think that either I need to transform the entire pipeline so that the items reflect the structure of my Plone objects and then use the ATSchemaUpdater blueprint, but I can't see any examples on how to add items to the pipeline (do I need to write my own blueprint?) Or, alternatively I could loop through the items as they exist and append the value in the left column to the items in the list in the right. For that I need a way of appending values with ATSchemaUpdater rather than overwriting them - again, is there a blueprint for that anywhere?
Here's a few sample csv lines:
"Name","Themes"
"Bessie Brown","cah;cab;cac"
"Fred Blogs","cah;cac"
"Dinah Washington","cah;cab"
The Plone object will be a theme and the lines field a list of names:
cah, ['Bessie Brown', 'Fred Boggs' etc etc]
I'm not pretty sure you want to read the CVS file using transmogrifier, but I think you can create a section to insert these values to the items in the pipeline using a function like this:
def transpose(cvs):
keys = []
[keys.extend(v) for v in cvs.values()]
keys = set(keys)
d = {}
for key in keys:
values = [k for k, v in cvs.iteritems() if key in v]
d[key] = values
return d
In this context, cvs is {1: ['a', 'b', 'c'], 2: ['a', 'b', 'd'], 3: ['a']}; keys will contain all possible values set(['a', 'c', 'b', 'd']); and d will be what you want {'a': [1, 2, 3], 'c': [1], 'b': [1, 2], 'd': [2]}.
Probably there are better ways to do it, but I'm not a Python magician.
The insert section could look like this one:
class Insert(object):
"""Insert new keys into items.
"""
classProvides(ISectionBlueprint)
implements(ISection)
def __init__(self, transmogrifier, name, options, previous):
self.previous = previous
self.new_keys = transpose(cvs)
def __iter__(self):
for item in self.previous:
item.update(self.new_keys)
yield item
After that you can use the SchemaUpdater section.

Resources