pyparsing nested assignment - pyparsing

I have a group of 500-600 files that I want to search through and extract data from. I'm trying to use pyparsing, with very limited success. Each file contains only three kinds of things: (1) comments, (2) simple assignments and (3) nested assignments. The nesting goes about 6 levels deep.
My goal is to check a particular 3rd-level field and, if it has a particular value, pull out the value of another 3rd-level field that belongs to the same 2nd-level field.
First, is pyparsing the proper tool for this? If not, what would you recommend?
I know how to build a list of files and iterate over them. Let me show a sample file and then the code I'm trying.
# TOP_OBJECT ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TOP_OBJECT=
(
obj_fmt=
(
obj_name="foo"
obj_cre_date=737785182 # = Tue May 18 23:19:42 1993
opj_data=
(
a="continue"
b="quit"
)
obj_version=264192 # = Version 4.8.0
)
# LEVEL1_OBJECT ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LEVEL1_OBJECT=
(
OBJ_part=
(
obj_type=1005
obj_size=120
)
# LEVEL2_OBJECT ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LEVEL2_OBJECT_A=
(
OBJ_part=
(
obj_type=3001
obj_size=128
)
Another_part=
(
another_attr=
(
another_style=0
another_param=2
)
)
) ### End of LEVEL2_OBJECT_A ###
# LEVEL2_OBJECT ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LEVEL2_OBJECT_B=
(
OBJ_part=
(
obj_type=3005
obj_size=128
)
Another_part=
(
another_attr=
(
another_style=0
another_param=8
)
)
) ### End of LEVEL2_OBJECT_B ###
) ### End of LEVEL1 OBJECT
) ### End of TOP_OBJECT ###
My code to digest the file looks like this:
from pyparsing import *

def Syntax():
    comment = Group("#" + restOfLine).suppress()
    eq = Literal('=')
    lpar = Literal( '(' ).suppress()
    rpar = Literal( ')' ).suppress()
    num = Word(nums)
    var = Word(alphas + "_")
    simpleAssign = var + eq
    nestedAssign = Group(lpar + OneOrMore(simpleAssign) + rpar)
    expr = Forward()
    atom = nestedAssign | simpleAssign
    expr << atom
    expr.ignore(comment)
    return expr

def main():
    expr = Syntax()
    results = expr.parseFile( "for_show.asc" )
    print results

if __name__ == '__main__':
    main()
My results don't descend: ['TOP_OBJECT', '=']
Right now I'm not handling quoted strings or numbers, just trying to understand parsing nested lists.

Mostly there, just a few gaps in your parser - see the commented-out original code, compared to the current code:
def Syntax():
    comment = Group("#" + restOfLine).suppress()
    eq = Literal('=')
    lpar = Literal( '(' ).suppress()
    rpar = Literal( ')' ).suppress()
    num = Word(nums)
    #~ var = Word(alphas + "_")
    var = Word(alphas + "_", alphanums+"_")
    #~ simpleAssign = var + eq
    expr = Forward()
    simpleAssign = var + eq + (num | quotedString)
    #~ nestedAssign = Group(lpar + OneOrMore(simpleAssign) + rpar)
    nestedAssign = var + eq + Group(lpar + OneOrMore(expr) + rpar)
    atom = nestedAssign | simpleAssign
    expr << atom
    expr.ignore(comment)
    return expr
This gives:
['TOP_OBJECT',
'=',
['obj_fmt',
'=',
['obj_name',
'=',
'"foo"',
'obj_cre_date',
'=',
'737785182',
'opj_data',
'=',
['a', '=', '"continue"', 'b', '=', '"quit"'],
'obj_version',
'=',
'264192'],
'LEVEL1_OBJECT',
'=',
['OBJ_part',
'=',
['obj_type', '=', '1005', 'obj_size', '=', '120'],
'LEVEL2_OBJECT_A',
'=',
['OBJ_part',
'=',
['obj_type', '=', '3001', 'obj_size', '=', '128'],
'Another_part',
'=',
['another_attr',
'=',
['another_style', '=', '0', 'another_param', '=', '2']]],
'LEVEL2_OBJECT_B',
'=',
['OBJ_part',
'=',
['obj_type', '=', '3005', 'obj_size', '=', '128'],
'Another_part',
'=',
['another_attr',
'=',
['another_style', '=', '0', 'another_param', '=', '8']]]]]]
Wrap the expr inside nestedAssign's OneOrMore with Group:
    nestedAssign = var + eq + Group(lpar + OneOrMore(Group(expr)) + rpar)
I think that gives you better structure for your repeated nested assignments:
['TOP_OBJECT',
'=',
[['obj_fmt',
'=',
[['obj_name', '=', '"foo"'],
['obj_cre_date', '=', '737785182'],
['opj_data', '=', [['a', '=', '"continue"'], ['b', '=', '"quit"']]],
['obj_version', '=', '264192']]],
['LEVEL1_OBJECT',
'=',
[['OBJ_part',
'=',
[['obj_type', '=', '1005'], ['obj_size', '=', '120']]],
['LEVEL2_OBJECT_A',
'=',
[['OBJ_part',
'=',
[['obj_type', '=', '3001'], ['obj_size', '=', '128']]],
['Another_part',
'=',
[['another_attr',
'=',
[['another_style', '=', '0'], ['another_param', '=', '2']]]]]]],
['LEVEL2_OBJECT_B',
'=',
[['OBJ_part',
'=',
[['obj_type', '=', '3005'], ['obj_size', '=', '128']]],
['Another_part',
'=',
[['another_attr',
'=',
[['another_style', '=', '0'], ['another_param', '=', '8']]]]]]]]]]]
Also, your originally posted code contained TABs. I find them to be more trouble than they are worth; you're better off using 4-space indents.
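Once the parse produces the grouped structure above, the original goal (test one 3rd-level field, then read a sibling field under the same 2nd-level object) becomes ordinary list and dict handling. The sketch below is one possible approach and not part of pyparsing: to_dict, get_path and find_matches are hypothetical helper names, and the parsed literal is a trimmed copy of the grouped output above, in the shape results.asList() would return. Note that repeated names at the same level would overwrite each other in the dict.

```python
def to_dict(groups):
    """Turn [[name, '=', value], ...] groups into a nested dict."""
    out = {}
    for name, _eq, value in groups:
        # A list value is a nested block of further [name, '=', value] groups.
        out[name] = to_dict(value) if isinstance(value, list) else value
    return out


def get_path(node, path):
    """Follow a sequence of keys; return None if any step is missing."""
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def find_matches(node, test_path, test_value, want_path):
    """Collect want_path values from every object whose test_path equals test_value."""
    found = []
    if isinstance(node, dict):
        if get_path(node, test_path) == test_value:
            found.append(get_path(node, want_path))
        for child in node.values():
            found.extend(find_matches(child, test_path, test_value, want_path))
    return found


# Trimmed copy of the grouped parse output shown above.
parsed = ['TOP_OBJECT', '=', [
    ['LEVEL1_OBJECT', '=', [
        ['OBJ_part', '=', [['obj_type', '=', '1005'], ['obj_size', '=', '120']]],
        ['LEVEL2_OBJECT_A', '=', [
            ['OBJ_part', '=', [['obj_type', '=', '3001'], ['obj_size', '=', '128']]],
            ['Another_part', '=', [['another_attr', '=',
                [['another_style', '=', '0'], ['another_param', '=', '2']]]]],
        ]],
        ['LEVEL2_OBJECT_B', '=', [
            ['OBJ_part', '=', [['obj_type', '=', '3005'], ['obj_size', '=', '128']]],
            ['Another_part', '=', [['another_attr', '=',
                [['another_style', '=', '0'], ['another_param', '=', '8']]]]],
        ]],
    ]],
]]

# The top-level assignment is not wrapped in a Group, so wrap it once here.
tree = to_dict([parsed])
hits = find_matches(
    tree,
    ('OBJ_part', 'obj_type'), '3005',
    ('Another_part', 'another_attr', 'another_param'),
)
print(hits)
```

Values stay as strings here because the grammar returns them that way; converting numbers would need a parse action or a cast after the fact.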

Related

How to flatten a dict type column in a DF

I have a df with a dict-type column named measures, like below:
How can I flatten this column into new columns in the same DF?
I recently had the same problem, wanting to extract and flatten data from JSON. It might be overkill for your issue and a bit obscure, but here it is.
This expects Dicts and ignores missing or malformed data:
function extract_flatten(data::AbstractDict, extract::AbstractDict; cmdchar::AbstractChar='%')
    res = Dict()
    for (key, val) in extract
        temp = Any[data]
        keys = [key]
        for v in val
            if v isa AbstractString
                if v[1] == cmdchar
                    v = split(v[2:end], ':')
                    if v[1] == "all"
                        temp2 = []
                        keys2 = String[]
                        for (t,k) in zip(temp, keys)
                            for (kt,vt) in pairs(t)
                                push!(keys2, join([k; v[2:end]; kt], '_'))
                                push!(temp2, vt)
                            end
                        end
                        temp = temp2
                        keys = keys2
                    elseif v[1] == "name"
                        keys .*= '_' * join(v[2:end], '_')
                    else
                        error("$(repr(v)) is not a valid command")
                    end
                else
                    temp .= getdefnothing.(temp, Ref(v))
                end
            elseif v isa Integer
                temp .= getdefnothing.(temp, Ref(v))
            else
                error("$(repr(v)) is not a valid key")
            end
            nothings = isnothing.(temp)
            deleteat!(temp, nothings)
            deleteat!(keys, nothings)
            isempty(temp) && break
        end
        push!.(Ref(res), keys .=> temp)
    end
    return res
end

getdefnothing(x, y) = nothing
getdefnothing(x::AbstractDict, y) = get(x, y, nothing)
getdefnothing(x::AbstractArray, y) = get(x, y, nothing)
Example use:
using Test
const d = Dict
schema = d(
    "a" => ["b", "c", "d"],
    "b" => ["e"],
    "c" => ["f", "%all:z", "g"]
)
a = d("z" => 3)
@test extract_flatten(a, schema) == d()
b = d("e" => 0.123)
@test extract_flatten(b, schema) == d("b" => 0.123)
c = d("e" => true, "b" => d("c" => d("d" => "ABC")))
@test extract_flatten(c, schema) == d("b" => true, "a" => "ABC")
e = d("f" => d(
    "a" => d("g" => "A"),
    "b" => d("g" => "B")
))
@test extract_flatten(e, schema) == d("c_z_a" => "A", "c_z_b" => "B")
f = d("f" => [
    d("g" => "A"),
    d("g" => "B")
])
@test extract_flatten(f, schema) == d("c_z_1" => "A", "c_z_2" => "B")
g = d("e" => nothing, "f" => [1,2,3])
@test extract_flatten(g, schema) == d()
Assuming that there is only one object in each of those lists, then something like this:
using JSON
using DataFrames
transform(
    df,
    (
        :measures =>
            ByRow(d -> (; JSON.parse(d; dicttype=Dict{Symbol,Any})[1]...)) =>
            AsTable
    )
)
What this does is parse the entries in the measures column as JSON (length-one) lists of dicts, take the first element, convert to a NamedTuple, and then use => AsTable to tell transform to convert that NamedTuple into corresponding columns.
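For readers more at home in Python, the same three steps (parse the JSON text, take the only element of the list, spread its keys into new columns) can be sketched with the standard library alone. This is not the DataFrames.jl API, just the same idea; the rows and field names are invented for illustration because the original data was not shown, and flatten_measures is a hypothetical helper name.

```python
import json


def flatten_measures(rows, column="measures"):
    """Add one new key per field of the single dict stored as JSON in `column`."""
    flat = []
    for row in rows:
        parsed = json.loads(row[column])   # a length-one list of dicts
        first = parsed[0]                  # take the only element
        new_row = dict(row)                # keep the existing columns
        new_row.update(first)              # spread the dict into new columns
        flat.append(new_row)
    return flat


# Invented sample rows; the real column contents were not shown in the question.
rows = [
    {"id": 1, "measures": '[{"height": 10, "width": 4}]'},
    {"id": 2, "measures": '[{"height": 7, "width": 9}]'},
]
flat_rows = flatten_measures(rows)
print(flat_rows[0]["height"], flat_rows[1]["width"])
```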

ASP.NET IQueryable WHERE OR

I am trying to write code that searches a database table across multiple columns. What I have below appears to be equivalent to WHERE column = "this" AND column2 = "this". What I want instead is WHERE column = "this" OR column2 = "this". How would I accomplish this?
query = query.Where(p => (p.ChckNumber.ToString()).Contains(globalSearch.ToString()));
query = query.Where(p => (p.BankAccount.ToString()).Contains(globalSearch.ToString()));
query = query.Where(p => (p.Description.ToString()).Contains(globalSearch.ToString()));
query = query.Where(p => (p.CheckAmount.ToString()).Contains(globalSearch.ToString()));
query = query.Where(p => (p.ClearedDate.ToString()).Contains(globalSearch.ToString()));
query = query.Where(p => (p.SentDate.ToString()).Contains(globalSearch.ToString()));
You should be able to do this in-line using the OR operator:
query = query.Where(p =>
    p.ChckNumber.ToString().Contains(globalSearch.ToString()) ||
    p.BankAccount.ToString().Contains(globalSearch.ToString()) ||
    p.Description.ToString().Contains(globalSearch.ToString()) ||
    p.CheckAmount.ToString().Contains(globalSearch.ToString()) ||
    p.ClearedDate.ToString().Contains(globalSearch.ToString()) ||
    p.SentDate.ToString().Contains(globalSearch.ToString())
);
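The difference between the two versions is not specific to LINQ: each chained filter narrows the previous result (AND), while a single predicate that ORs the column tests keeps a row if any column matches. A minimal Python sketch of that behaviour, with made-up records standing in for the table:

```python
# Made-up records; only two of the columns from the question are modelled.
records = [
    {"ChckNumber": 1001, "Description": "rent"},
    {"ChckNumber": 2002, "Description": "invoice 1001"},
    {"ChckNumber": 3003, "Description": "groceries"},
]
search = "1001"
columns = ("ChckNumber", "Description")

# Chained filters: every column must contain the search text (AND).
chained = records
for col in columns:
    chained = [r for r in chained if search in str(r[col])]

# One predicate: any column may contain the search text (OR).
either = [r for r in records if any(search in str(r[col]) for col in columns)]

print(len(chained), len(either))
```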

How to get score of a registered-query

I'm trying to calculate a score for a complex match query.
For example:
if conditionA and conditionB and (conditionC or conditionD)
then score = 10
else score = 0
This is the solution I've come up with:
let $idReq := cts:register(
cts:and-query((
cts:path-range-query("/person/name", "=", 'val1', ("score-function=linear", "collation=http://marklogic.com/collation//S1")),
cts:path-range-query("/person/country", "=", 'country', ("score-function=linear", "collation=http://marklogic.com/collation//S1")),
cts:or-query((
cts:path-range-query("/person/city", "=", 'city', ("score-function=linear", "collation=http://marklogic.com/collation//S1")),
cts:path-range-query("/person/school", "=", '', ("score-function=linear", "collation=http://marklogic.com/collation//S1"))
))
))
)
return
cts:score(cts:search(fn:doc(), cts:registered-query($idReq, ("unfiltered"), 10)))
All the indexes exist, and so does the collation.
When I execute this registered query, I always get 0 for the score.
EDITED
I've narrowed down the problem, and it can be reproduced by combining cts:register with cts:path-range-query.
let $query := cts:path-range-query("/person/name", "=", "val1", ("score-function=linear", "collation=http://marklogic.com/collation//S1"))
let $idReq := cts:register($query)
return
cts:score(
cts:search(fn:doc(),
cts:registered-query($idReq,("unfiltered"), 10)
(: $query :)
)
)
EDITED
Setup index config for testing:
import module namespace admin = "http://marklogic.com/xdmp/admin" at "/MarkLogic/admin.xqy";
let $dbid := xdmp:database("Documents")
let $config :=
admin:database-add-range-path-index(
admin:get-configuration(), $dbid,
admin:database-range-path-index(
$dbid, "string", "/person/name",
"http://marklogic.com/collation//S1",
fn:false(), "ignore"))
return admin:save-configuration($config)
Sample data:
xdmp:document-insert(
'/test/person1.xml',
<person>
<name>val1</name>
<city>city</city>
<country>country</country>
</person>
)

Update randomly certain number of rows

I'm trying to update a certain number of rows of my entity "Vehicule". I have no idea how it could work.
Specifically, I'm trying to modify only two of the rows where direction = 5. This is the function I used to do the update.
public function ValidAction(\OC\UserBundle\Entity\User $direction) {
$qb = $this->getDoctrine()
->getRepository('CarPfeBundle:Vehicule')
->createQueryBuilder('v');
$q = $qb->update ('CarPfeBundle:vehicule v')
->set('v.direction', '?1')
->where('v.direction = ?2')
->setParameter(1, $direction)
->setParameter(2, 5)
->getQuery();
$p = $q->execute();
return $this->redirect($this->generateUrl('demandeveh_afficher'));
}
But the above code updates all matching rows in my database. I need to update only two rows. Any help, please?
Try this:
public function ValidAction(\OC\UserBundle\Entity\User $direction) {
    $qb = $this->getDoctrine()
        ->getRepository('CarPfeBundle:Vehicule')
        ->createQueryBuilder('v');
    // $ids an array that contains all ids with your condition
    $ids = $qb->select('v.id')
        ->where('v.direction = :direction')
        // setParameters() takes an array; setParameter() needs a key and a value
        ->setParameters(
            array(
                'direction' => $direction
            )
        )
        ->getQuery()
        ->getResult();
    $id1 = $ids[array_rand($ids)];
    $id2 = $ids[array_rand($ids)];
    //To be sure that $id1 is different from id2
    while ($id1 == $id2) {
        $id2 = $ids[array_rand($ids)];
    }
    $q = $qb->update ('CarPfeBundle:vehicule v')
        ->set('v.direction', ':val1')
        ->where('v.direction = :val2')
        ->andWhere('v.id IN (:id1, :id2)')
        ->setParameters(
            array(
                'val1' => $direction ,
                'val2' => 5 ,
                'id1' => $id1,
                'id2' => $id2,
            )
        )
        ->getQuery();
    $p = $q->execute();
    return $this->redirect($this->generateUrl('demandeveh_afficher'));
}
With the above code, you should be able to update only two rows, chosen at random.
Good luck!
While a solution like the one Houssem Zitoun suggested may work, why not use a subquery?
If you get the following error (like I did; if not, just skip the middle SELECT)
Error: #1235 - This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
go with this answer and something like the following (see the doc; untested):
UPDATE CarPfeBundle:Vehicule v
SET v.direction = ?1
WHERE v.id IN
    (SELECT * FROM (
        SELECT v2.id
        FROM CarPfeBundle:Vehicule v2
        WHERE v2.direction = ?2 LIMIT 2
    ) AS sq)
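To see the subquery idea work end to end, here is a small self-contained sketch using Python's built-in sqlite3 module. It is only an illustration under stated assumptions: SQLite rather than MySQL, a hypothetical vehicule table with id and direction columns, and invented values. SQLite accepts LIMIT inside an IN subquery directly, so the extra wrapping SELECT that works around MySQL error #1235 is not needed here.

```python
import sqlite3

# In-memory database with a hypothetical table: six rows with direction 5, two with 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vehicule (id INTEGER PRIMARY KEY, direction INTEGER)")
conn.executemany(
    "INSERT INTO vehicule (direction) VALUES (?)",
    [(5,)] * 6 + [(3,)] * 2,
)

new_direction, old_direction = 9, 5

# Pick two random ids among the matching rows and update only those.
cur = conn.execute(
    """
    UPDATE vehicule
    SET direction = ?
    WHERE id IN (
        SELECT id FROM vehicule
        WHERE direction = ?
        ORDER BY RANDOM()
        LIMIT 2
    )
    """,
    (new_direction, old_direction),
)
conn.commit()

updated = cur.rowcount
remaining = conn.execute(
    "SELECT COUNT(*) FROM vehicule WHERE direction = ?", (old_direction,)
).fetchone()[0]
print(updated, remaining)
```

Selecting ids in the subquery, rather than the direction value itself, is what restricts the update to two rows.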

WordPress query with multiple meta keys

Can someone tell me what's wrong with this query?
if ( isset( $_GET['lokacija'] ) && !empty( $_GET['lokacija'] ) ) {
$lokacija = $_GET['lokacija'];
} else { $lokacija = ''; }
if ( isset( $_GET['tip'] ) && !empty( $_GET['tip'] ) ) {
$tip = $_GET['tip'];
} else { $tip = ''; }
if ( isset( $_GET['sobe'] ) && !empty( $_GET['sobe'] ) ) {
$sobe = $_GET['sobe'];
} else { $sobe = ''; }
$paged = (get_query_var('paged')) ? get_query_var('paged') : 1;
$args2 = array(
'posts_per_page' => 10,
'post_type' => 'nekretnine',
'paged' => $paged,
if ($lokacija != '') {
'meta_query' => array(
array (
'key' => 'lokacija',
'value' => $lokacija.''
),
)
}
);
$wp_query = new WP_Query( $args2 );
This code gives me this error:
Parse error: syntax error, unexpected T_IF, expecting ')' in
*/wp-content/themes/gs/page-nek-pretraga.php on line 23;
Line 23 is the line that starts with if ($lokacija)...
What I want to do is use multiple meta_query clauses built from the PHP GET parameters (www.blabla./com/page1/?lokacija=foo&tip=foo&sobe=3).
But I want to add a clause only if, say, $lokacija is not empty. The same goes for the other two fields (possibly 5-6 later).
You cannot put an if statement inside an array literal. You can achieve what you are after with the following code:
$args2 = array(
    'posts_per_page' => 10,
    'post_type' => 'nekretnine',
    'paged' => $paged,
);
if ($lokacija != '') {
    $args2['meta_query'] = array(
        array (
            'key' => 'lokacija',
            'value' => $lokacija.''
        ),
    );
}
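Since the question mentions three optional fields now and five or six later, the same pattern extends to a loop: start from the base arguments and append one meta_query clause per non-empty field. The following is a rough sketch of that logic in Python, with a plain dict standing in for $_GET and build_args as a hypothetical helper name; in PHP the result would be the same nested arrays passed to WP_Query.

```python
def build_args(params, fields=("lokacija", "tip", "sobe"), paged=1):
    """Build query arguments, adding one meta clause per non-empty field."""
    args = {"posts_per_page": 10, "post_type": "nekretnine", "paged": paged}
    # params.get(name) is falsy for both missing and empty values.
    clauses = [
        {"key": name, "value": params[name]}
        for name in fields
        if params.get(name)
    ]
    if clauses:
        # Only add the key when at least one filter was supplied.
        args["meta_query"] = clauses
    return args


args = build_args({"lokacija": "foo", "tip": "", "sobe": "3"})
print(args.get("meta_query"))
```

WP_Query combines multiple meta_query clauses with AND by default, which matches the intent of narrowing the search with each supplied field.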
To check for multiple custom fields we have to join the meta table twice.
So the copy of the table is joined with a different temporary table name.
global $wpdb;
$query = "
    SELECT * FROM {$wpdb->prefix}posts
    INNER JOIN {$wpdb->prefix}postmeta m1
        ON ( {$wpdb->prefix}posts.ID = m1.post_id )
    INNER JOIN {$wpdb->prefix}postmeta m2
        ON ( {$wpdb->prefix}posts.ID = m2.post_id )
    WHERE
        {$wpdb->prefix}posts.post_type = 'post'
        AND {$wpdb->prefix}posts.post_status = 'publish'
        AND ( m1.meta_key = 'date' AND m1.meta_value > '2010-12-05 00:00:00' )
        AND ( m1.meta_key = 'date' AND m1.meta_value < '2010-12-12 00:00:00' )
        AND ( m2.meta_key = 'some_other_meta_value' AND m2.meta_value != '' )
    GROUP BY {$wpdb->prefix}posts.ID
    ORDER BY {$wpdb->prefix}posts.post_date DESC;
";
For more details visit: http://realtuts.com/write-custom-wordpress-sql-query-multiple-meta-values/

Resources