pyparsing nested assignment - pyparsing
I have a group of 500-600 files I want to search through and extract data from. I'm trying to use pyparsing, with very limited success. Each file contains only three things: (1) comments, (2) simple assignments, and (3) nested assignments. The nesting goes about 6 levels deep.
My goal is to look at a particular third-level field and, if it has a particular value, pull out the value of another third-level field that belongs to the same second-level field.
First, is pyparsing the proper tool for doing this? Other recommendations if not?
I know how to build a list of files and iterate over them. Let me show a sample file and then the code I'm trying.
# TOP_OBJECT ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TOP_OBJECT=
(
obj_fmt=
(
obj_name="foo"
obj_cre_date=737785182 # = Tue May 18 23:19:42 1993
opj_data=
(
a="continue"
b="quit"
)
obj_version=264192 # = Version 4.8.0
)
# LEVEL1_OBJECT ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LEVEL1_OBJECT=
(
OBJ_part=
(
obj_type=1005
obj_size=120
)
# LEVEL2_OBJECT ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LEVEL2_OBJECT_A=
(
OBJ_part=
(
obj_type=3001
obj_size=128
)
Another_part=
(
another_attr=
(
another_style=0
another_param=2
)
)
) ### End of LEVEL2_OBJECT_A ###
# LEVEL2_OBJECT ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LEVEL2_OBJECT_B=
(
OBJ_part=
(
obj_type=3005
obj_size=128
)
Another_part=
(
another_attr=
(
another_style=0
another_param=8
)
)
) ### End of LEVEL2_OBJECT_B ###
) ### End of LEVEL1 OBJECT
) ### End of TOP_OBJECT ###
My code to digest the file looks like this:
from pyparsing import *

def Syntax():
    comment = Group("#" + restOfLine).suppress()
    eq = Literal('=')
    lpar = Literal( '(' ).suppress()
    rpar = Literal( ')' ).suppress()
    num = Word(nums)
    var = Word(alphas + "_")
    simpleAssign = var + eq
    nestedAssign = Group(lpar + OneOrMore(simpleAssign) + rpar)
    expr = Forward()
    atom = nestedAssign | simpleAssign
    expr << atom
    expr.ignore(comment)
    return expr

def main():
    expr = Syntax()
    results = expr.parseFile( "for_show.asc" )
    print(results)

if __name__ == '__main__':
    main()
My results don't descend: ['TOP_OBJECT', '=']
Right now I'm not handling quoted strings or numbers, just trying to understand parsing nested lists.
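Mostly there, just a few gaps in your parser: var didn't allow digits (so a name like LEVEL1_OBJECT can't match), simpleAssign had no right-hand side, and nestedAssign needs its own var + eq in front and has to recurse into expr rather than only simpleAssign. That is why your parse stopped at ['TOP_OBJECT', '=']: nestedAssign failed on the name, simpleAssign matched var + eq, and nothing could consume the rest. See the commented-out original lines, compared to the current code: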
def Syntax():
    comment = Group("#" + restOfLine).suppress()
    eq = Literal('=')
    lpar = Literal( '(' ).suppress()
    rpar = Literal( ')' ).suppress()
    num = Word(nums)
    #~ var = Word(alphas + "_")
    var = Word(alphas + "_", alphanums+"_")
    #~ simpleAssign = var + eq
    expr = Forward()
    simpleAssign = var + eq + (num | quotedString)
    #~ nestedAssign = Group(lpar + OneOrMore(simpleAssign) + rpar)
    nestedAssign = var + eq + Group(lpar + OneOrMore(expr) + rpar)
    atom = nestedAssign | simpleAssign
    expr << atom
    expr.ignore(comment)
    return expr
This gives:
['TOP_OBJECT',
'=',
['obj_fmt',
'=',
['obj_name',
'=',
'"foo"',
'obj_cre_date',
'=',
'737785182',
'opj_data',
'=',
['a', '=', '"continue"', 'b', '=', '"quit"'],
'obj_version',
'=',
'264192'],
'LEVEL1_OBJECT',
'=',
['OBJ_part',
'=',
['obj_type', '=', '1005', 'obj_size', '=', '120'],
'LEVEL2_OBJECT_A',
'=',
['OBJ_part',
'=',
['obj_type', '=', '3001', 'obj_size', '=', '128'],
'Another_part',
'=',
['another_attr',
'=',
['another_style', '=', '0', 'another_param', '=', '2']]],
'LEVEL2_OBJECT_B',
'=',
['OBJ_part',
'=',
['obj_type', '=', '3005', 'obj_size', '=', '128'],
'Another_part',
'=',
['another_attr',
'=',
['another_style', '=', '0', 'another_param', '=', '8']]]]]]
If you wrap the expr inside nestedAssign's OneOrMore with Group:
nestedAssign = var + eq + Group(lpar + OneOrMore(Group(expr)) + rpar)
I think you'll get better structure for your repeated nested assignments:
['TOP_OBJECT',
'=',
[['obj_fmt',
'=',
[['obj_name', '=', '"foo"'],
['obj_cre_date', '=', '737785182'],
['opj_data', '=', [['a', '=', '"continue"'], ['b', '=', '"quit"']]],
['obj_version', '=', '264192']]],
['LEVEL1_OBJECT',
'=',
[['OBJ_part',
'=',
[['obj_type', '=', '1005'], ['obj_size', '=', '120']]],
['LEVEL2_OBJECT_A',
'=',
[['OBJ_part',
'=',
[['obj_type', '=', '3001'], ['obj_size', '=', '128']]],
['Another_part',
'=',
[['another_attr',
'=',
[['another_style', '=', '0'], ['another_param', '=', '2']]]]]]],
['LEVEL2_OBJECT_B',
'=',
[['OBJ_part',
'=',
[['obj_type', '=', '3005'], ['obj_size', '=', '128']]],
['Another_part',
'=',
[['another_attr',
'=',
[['another_style', '=', '0'], ['another_param', '=', '8']]]]]]]]]]]
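With the grouped structure, the original goal (check one third-level field and, on a match, read a sibling field from the same block) becomes a short tree walk. Here's a minimal sketch, assuming the parse results were first converted to plain lists with results.asList(). The helper names to_dict and find_siblings are made up for illustration, and the sketch assumes names are unique within a block (a repeated name would overwrite the earlier one):

```python
def to_dict(triples):
    """Turn [[name, '=', value], ...] into a nested dict."""
    result = {}
    for name, _eq, value in triples:
        # A list value is a nested block; anything else is a leaf string.
        result[name] = to_dict(value) if isinstance(value, list) else value
    return result


def find_siblings(node, key, wanted, sibling):
    """Yield node[sibling] for every block in which node[key] == wanted."""
    if node.get(key) == wanted and sibling in node:
        yield node[sibling]
    for child in node.values():
        if isinstance(child, dict):
            yield from find_siblings(child, key, wanted, sibling)


# Trimmed copy of the grouped output, written out as plain lists.
parsed = ['LEVEL1_OBJECT', '=', [
    ['OBJ_part', '=', [['obj_type', '=', '1005'], ['obj_size', '=', '120']]],
    ['LEVEL2_OBJECT_A', '=', [
        ['OBJ_part', '=', [['obj_type', '=', '3001'], ['obj_size', '=', '128']]],
    ]],
    ['LEVEL2_OBJECT_B', '=', [
        ['OBJ_part', '=', [['obj_type', '=', '3005'], ['obj_size', '=', '128']]],
    ]],
]]

# The top level is a single bare triple, so wrap it in a list first.
tree = to_dict([parsed])

# Values are still strings at this point, so compare against '3005', not 3005.
sizes = list(find_siblings(tree, 'obj_type', '3005', 'obj_size'))
print(sizes)
```

Note that numbers stay as strings and quoted strings keep their quotes unless you add parse actions to convert them.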
Also, your originally posted code contained TABs. I find them to be more trouble than they are worth; you're better off using 4-space indents.
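On the "is pyparsing the proper tool" question: pyparsing handles this fine, but the format is simple enough that a hand-written recursive-descent parser in plain Python is also an option. The following is only a rough sketch, not pyparsing and not tested against your real files. tokenize and parse_assignments are hypothetical names, and it assumes values are either unsigned integers or double-quoted strings without embedded quotes, as in the sample:

```python
import re

# A comment runs from '#' to end of line; real tokens are quoted strings,
# names, integers, and the three punctuation characters.
TOKEN = re.compile(r'(?P<comment>#[^\n]*)|(?P<tok>"[^"\n]*"|[A-Za-z_]\w*|\d+|[=()])')


def tokenize(text):
    """Return the tokens in text, dropping comments and whitespace.

    Characters that match no pattern are skipped silently (sketch only).
    """
    return [m.group('tok') for m in TOKEN.finditer(text) if m.group('tok')]


def parse_assignments(tokens, pos=0):
    """Parse `name = value` items until ')' or end of input.

    Returns (items, next_pos); each item is [name, '=', value], where value
    is a token string or a nested list of items.
    """
    items = []
    while pos < len(tokens) and tokens[pos] != ')':
        if pos + 2 >= len(tokens) or tokens[pos + 1] != '=':
            raise ValueError('expected "name = value" at token %d' % pos)
        name = tokens[pos]
        pos += 2
        if tokens[pos] == '(':
            # Nested block: recurse, then step over the closing ')'.
            value, pos = parse_assignments(tokens, pos + 1)
            if pos >= len(tokens):
                raise ValueError('missing ")" for %s' % name)
            pos += 1
        else:
            value = tokens[pos]
            pos += 1
        items.append([name, '=', value])
    return items, pos


sample = '''
# TOP_OBJECT ~~~~
TOP_OBJECT=
(
obj_fmt=
(
obj_name="foo"
obj_cre_date=737785182 # = Tue May 18 23:19:42 1993
)
) ### End of TOP_OBJECT ###
'''

items, _ = parse_assignments(tokenize(sample))
print(items)
```

The recursive call inside parse_assignments plays the same role as the Forward expr in the pyparsing grammar: a nested block is just another run of assignments.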
Related
How to flatten a dict type column in a DF
I have a df with a dict-type column named measures. How can I flatten this column into new columns in the same DF?
I recently had the same problem, wanting to extract and flatten data from a JSON. It might be overkill for your issue and a bit obscure, but here it is. This expects Dicts and ignores missing or malformed data:
function extract_flatten(data::AbstractDict, extract::AbstractDict; cmdchar::AbstractChar='%')
    res = Dict()
    for (key, val) in extract
        temp = Any[data]
        keys = [key]
        for v in val
            if v isa AbstractString
                if v[1] == cmdchar
                    v = split(v[2:end], ':')
                    if v[1] == "all"
                        temp2 = []
                        keys2 = String[]
                        for (t,k) in zip(temp, keys)
                            for (kt,vt) in pairs(t)
                                push!(keys2, join([k; v[2:end]; kt], '_'))
                                push!(temp2, vt)
                            end
                        end
                        temp = temp2
                        keys = keys2
                    elseif v[1] == "name"
                        keys .*= '_' * join(v[2:end], '_')
                    else
                        error("$(repr(v)) is not a valid command")
                    end
                else
                    temp .= getdefnothing.(temp, Ref(v))
                end
            elseif v isa Integer
                temp .= getdefnothing.(temp, Ref(v))
            else
                error("$(repr(v)) is not a valid key")
            end
            nothings = isnothing.(temp)
            deleteat!(temp, nothings)
            deleteat!(keys, nothings)
            isempty(temp) && break
        end
        push!.(Ref(res), keys .=> temp)
    end
    return res
end

getdefnothing(x, y) = nothing
getdefnothing(x::AbstractDict, y) = get(x, y, nothing)
getdefnothing(x::AbstractArray, y) = get(x, y, nothing)
Example use:
using Test
const d = Dict
schema = d(
    "a" => ["b", "c", "d"],
    "b" => ["e"],
    "c" => ["f", "%all:z", "g"]
)
a = d("z" => 3)
@test extract_flatten(a, schema) == d()
b = d("e" => 0.123)
@test extract_flatten(b, schema) == d("b" => 0.123)
c = d("e" => true, "b" => d("c" => d("d" => "ABC")))
@test extract_flatten(c, schema) == d("b" => true, "a" => "ABC")
e = d("f" => d(
    "a" => d("g" => "A"),
    "b" => d("g" => "B")
))
@test extract_flatten(e, schema) == d("c_z_a" => "A", "c_z_b" => "B")
f = d("f" => [
    d("g" => "A"),
    d("g" => "B")
])
@test extract_flatten(f, schema) == d("c_z_1" => "A", "c_z_2" => "B")
g = d("e" => nothing, "f" => [1,2,3])
@test extract_flatten(g, schema) == d()
Assuming that there is only one object in each of those lists, then something like this:
using JSON
using DataFrames
transform(
    df,
    (
        :measures =>
            ByRow(d -> (; JSON.parse(d; dicttype=Dict{Symbol,Any})[1]...)) =>
            AsTable
    )
)
What this does is parse the entries in the measures column as JSON (length-one) lists of dicts, take the first element, convert it to a NamedTuple, and then use => AsTable to tell transform to convert that NamedTuple into corresponding columns.
ASP.NET IQueryable WHERE OR
I am trying to write this piece of code that will search the database table, and I am trying to search multiple columns. What I have below appears to be the equivalent of
WHERE column = "this" AND column2 = "this"
What I am trying to do is this:
WHERE column = "this" OR column2 = "this"
How would I accomplish this?
query = query.Where(p => (p.ChckNumber.ToString()).Contains(globalSearch.ToString()));
query = query.Where(p => (p.BankAccount.ToString()).Contains(globalSearch.ToString()));
query = query.Where(p => (p.Description.ToString()).Contains(globalSearch.ToString()));
query = query.Where(p => (p.CheckAmount.ToString()).Contains(globalSearch.ToString()));
query = query.Where(p => (p.ClearedDate.ToString()).Contains(globalSearch.ToString()));
query = query.Where(p => (p.SentDate.ToString()).Contains(globalSearch.ToString()));
You should be able to do this in-line using the OR operator:
query = query.Where(p =>
    p.ChckNumber.ToString().Contains(globalSearch.ToString()) ||
    p.BankAccount.ToString().Contains(globalSearch.ToString()) ||
    p.Description.ToString().Contains(globalSearch.ToString()) ||
    p.CheckAmount.ToString().Contains(globalSearch.ToString()) ||
    p.ClearedDate.ToString().Contains(globalSearch.ToString()) ||
    p.SentDate.ToString().Contains(globalSearch.ToString())
);
How to get score of a registered-query
I'm trying to calculate a score for a complex match query. For example:
if conditionA and conditionB and (conditionC or conditionD) then score = 10 else score = 0
This is the solution I've come up with:
let $idReq := cts:register(
  cts:and-query((
    cts:path-range-query("/person/name", "=", 'val1', ("score-function=linear", "collation=http://marklogic.com/collation//S1")),
    cts:path-range-query("/person/country", "=", 'country', ("score-function=linear", "collation=http://marklogic.com/collation//S1")),
    cts:or-query((
      cts:path-range-query("/person/city", "=", 'city', ("score-function=linear", "collation=http://marklogic.com/collation//S1")),
      cts:path-range-query("/person/school", "=", '', ("score-function=linear", "collation=http://marklogic.com/collation//S1"))
    ))
  ))
)
return cts:score(cts:search(fn:doc(), cts:registered-query($idReq, ("unfiltered"), 10)))
All the indexes exist, and the collation too. When I execute this registered query, I always get 0 for the score.
EDITED
I've narrowed down the problem, and it can be reproduced by combining cts:register with cts:path-range-query.
let $query := cts:path-range-query("/person/name", "=", "val1", ("score-function=linear", "collation=http://marklogic.com/collation//S1"))
let $idReq := cts:register($query)
return cts:score(
  cts:search(fn:doc(),
    cts:registered-query($idReq,("unfiltered"), 10)
    (: $query :)
  )
)
EDITED
Setup index config for testing:
import module namespace admin = "http://marklogic.com/xdmp/admin" at "/MarkLogic/admin.xqy";
let $dbid := xdmp:database("Documents")
let $config := admin:database-add-range-path-index(
  admin:get-configuration(),
  $dbid,
  admin:database-range-path-index(
    $dbid, "string", "/person/name", "http://marklogic.com/collation//S1", fn:false(), "ignore"))
return admin:save-configuration($config)
Sample data:
xdmp:document-insert(
  '/test/person1.xml',
  <person>
    <name>val1</name>
    <city>city</city>
    <country>country</country>
  </person>
)
Update randomly certain number of rows
I'm trying to update a certain number of rows of my entity "Vehicule", and I have no idea how it could work. I'm trying to modify only two rows where direction = 5. This is the function I used to do the update:
public function ValidAction(\OC\UserBundle\Entity\User $direction)
{
    $qb = $this->getDoctrine()
        ->getRepository('CarPfeBundle:Vehicule')
        ->createQueryBuilder('v');
    $q = $qb->update('CarPfeBundle:vehicule v')
        ->set('v.direction', '?1')
        ->where('v.direction = ?2')
        ->setParameter(1, $direction)
        ->setParameter(2, 5)
        ->getQuery();
    $p = $q->execute();
    return $this->redirect($this->generateUrl('demandeveh_afficher'));
}
But the above code updates all matching rows in my database. I need to update only two rows. Any help, please?
Try this:
public function ValidAction(\OC\UserBundle\Entity\User $direction)
{
    $qb = $this->getDoctrine()
        ->getRepository('CarPfeBundle:Vehicule')
        ->createQueryBuilder('v');
    // $ids is an array that contains all ids matching your condition
    $ids = $qb->select('v.id')
        ->where('v.direction = :direction')
        ->setParameters(
            array(
                'direction' => $direction
            )
        )
        ->getQuery()
        ->getResult();
    $id1 = $ids[array_rand($ids)];
    $id2 = $ids[array_rand($ids)];
    // To be sure that $id1 is different from $id2
    while ($id1 == $id2) {
        $id2 = $ids[array_rand($ids)];
    }
    $q = $qb->update('CarPfeBundle:vehicule v')
        ->set('v.direction', ':val1')
        ->where('v.direction = :val2')
        ->andWhere('v.id IN (:id1, :id2)')
        // setParameters() takes the whole array; setParameter() takes one key and one value
        ->setParameters(
            array(
                'val1' => $direction,
                'val2' => 5,
                'id1' => $id1,
                'id2' => $id2,
            )
        )
        ->getQuery();
    $p = $q->execute();
    return $this->redirect($this->generateUrl('demandeveh_afficher'));
}
With the above code I hope you can update only two rows, chosen randomly. Good luck!
While a solution like the one Houssem Zitoun suggested may work, why not use a subquery? If you get this error (like I did; if not, just skip the middle SELECT):
#1235 - This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'
go with this answer and something like the following (doc), untested:
UPDATE CarPfeBundle:Vehicule v
SET v.direction = ?1
WHERE v.direction IN
    (SELECT * FROM (
        SELECT v.direction
        FROM CarPfeBundle:Vehicule v2
        WHERE v.direction = ?2
        LIMIT 2
    )) AS sq
WordPress query with multiple meta keys
Can someone tell me what's wrong with this query?
if ( isset( $_GET['lokacija'] ) && !empty( $_GET['lokacija'] ) ) {
    $lokacija = $_GET['lokacija'];
} else {
    $lokacija = '';
}
if ( isset( $_GET['tip'] ) && !empty( $_GET['tip'] ) ) {
    $tip = $_GET['tip'];
} else {
    $tip = '';
}
if ( isset( $_GET['sobe'] ) && !empty( $_GET['sobe'] ) ) {
    $sobe = $_GET['sobe'];
} else {
    $sobe = '';
}
$paged = (get_query_var('paged')) ? get_query_var('paged') : 1;
$args2 = array(
    'posts_per_page' => 10,
    'post_type' => 'nekretnine',
    'paged' => $paged,
    if ($lokacija != '') {
        'meta_query' => array(
            array (
                'key' => 'lokacija',
                'value' => $lokacija.''
            ),
        )
    }
);
$wp_query = new WP_Query( $args2 );
This code gives me the error:
Parse error: syntax error, unexpected T_IF, expecting ')' in */wp-content/themes/gs/page-nek-pretraga.php on line 23
Line 23 is the line that starts with if ($lokacija)...
What I want to do is use multiple meta_query entries that I get from the PHP GET parameters (www.blabla./com/page1/?lokacija=foo&tip=foo&sobe=3), but only when, let's say, $lokacija is not empty. The same goes for the other two fields (possibly 5-6 later).
You cannot put an if statement inside an array definition. You can achieve what you are after by building the array first and adding meta_query afterwards:
$args2 = array(
    'posts_per_page' => 10,
    'post_type' => 'nekretnine',
    'paged' => $paged,
);
if ($lokacija != '') {
    $args2['meta_query'] = array(
        array (
            'key' => 'lokacija',
            'value' => $lokacija.''
        ),
    );
}
To check for multiple custom fields we have to join the meta table twice, so the second copy of the table is joined under a different alias.
global $wpdb;
$query = "
    SELECT *
    FROM {$wpdb->prefix}posts
    INNER JOIN {$wpdb->prefix}postmeta m1
        ON ( {$wpdb->prefix}posts.ID = m1.post_id )
    INNER JOIN {$wpdb->prefix}postmeta m2
        ON ( {$wpdb->prefix}posts.ID = m2.post_id )
    WHERE
        {$wpdb->prefix}posts.post_type = 'post'
        AND {$wpdb->prefix}posts.post_status = 'publish'
        AND ( m1.meta_key = 'date' AND m1.meta_value > '2010-12-05 00:00:00' )
        AND ( m1.meta_key = 'date' AND m1.meta_value < '2010-12-12 00:00:00' )
        AND ( m2.meta_key = 'some_other_meta_value' AND m2.meta_value != '' )
    GROUP BY {$wpdb->prefix}posts.ID
    ORDER BY {$wpdb->prefix}posts.post_date DESC;
";
For more details, visit: http://realtuts.com/write-custom-wordpress-sql-query-multiple-meta-values/