Gremlin: How to reference for loop variable within query? - gremlin

I'm stuck with a Gremlin query that assigns rank values to nodes based on a sorted list of keys passed to the query as a parameter.
Each node identified by "uniqueId" values should be assigned a rank based on order of occurrence in the reranked array.
This works:
reranked = [uniqueId1, uniqueId2, uniqueId3]
v.outE.as('e').inV.filter{it.key == reranked[2]}.back('e').sideEffect{it.rank = 2}
But this doesn't (replacing int with for-loop variable):
reranked = [uniqueId1, uniqueId2, uniqueId3]
for (i in 1..reranked.size())
v.outE.as('e').inV.filter{it.key == reranked[i]}.back('e').sideEffect{it.rank = i}
Do you know why this doesn't work? I'd also be happy for simpler ideas to reach the same goal.

You could do this using Groovy's eachWithIndex like:
reranked = [uniqueId1, uniqueId2, uniqueId3]
reranked.eachWithIndex{uniqueId, idx -> v.outE.as('e').inV.has('key', uniqueId).back('e').sideEffect{it.rank = idx} }
I have used Gremlin's has step above because it's much more efficient than filter in case of simple property look-up.

Well, it looks like I found a solution, maybe clumsy but hey, it works!
c=0
v.as('st').outE.as('e').inV.filter{it.key == reranked[c]}.back('e').sideEffect{
it.rank = reranked.size() - c }.outV.loop('st'){ c++ < so.size() }
I will still accept another answer if there's a fix for the above situation or a more elegant approach to the solution.

Related

Removing the first half of the entries in LinkedHashMap other than looping

I was going to use Hashtable, but an existing answer said that only LinkedHashMap preserves insertion order. So it seems that I can get the insertion order through the entries or keys properties.
My question is, when the map has n elements, if I want to remove the first n/2 elements, is there a better way than looping through the keys and repeatedly calling remove(key)? That is, something like this
val a = LinkedHashMap<Int, Int>();
val n = 10;
for(i in 1 .. n)
{
    a[i] = i*10;
}
a.removeRange(0,n/2);
instead of
val a = LinkedHashMap<Int, Int>();
val n = 10;
for(i in 1 .. n)
{
    a[i] = i*10;
}
var i = 0;
var keysToRemove= ArrayList<Int>();
for(k in a.keys)
{
    if(i >= n/2)
        break;
    else
        i++
    keysToRemove.add(k);
}
for(k in keysToRemove)
{
    a.remove(k);
}
The purpose of this is that I use the map as a cache, and when the cache is full, I want to purge the oldest half of the entries. I do not have to use LinkedHashMap as long as I can:
Find the value using a key, efficiently.
Remove a range of entries at once.
There's no method in the class that makes this possible. The source code doesn't have any operations for ranges of keys or entries. Since the linking is built on top of the HashMap logic, individual entries still have to be found individually by a hashed key lookup to remove them, so removing a range couldn't be done any faster in a LinkedHashMap. This is unlike the relationship between a LinkedList and an ArrayList.
For simpler code that's equivalent to what you're doing:
a.keys.take(a.size / 2).forEach(a::remove)
If you don't want to use a library for a cache, LinkedHashMap is designed so you can easily build your own by subclassing. For instance, a basic one that simply removes the oldest entry when adding an element pushes the collection above a certain size:
class CacheHashMap<K, V>(private var maxSize: Int): LinkedHashMap<K, V>() {
    // Called after each insertion; returning true evicts the eldest entry.
    override fun removeEldestEntry(eldest: MutableMap.MutableEntry<K, V>?): Boolean =
        size > maxSize
}
Also, if you set accessOrder to true in your constructor call, it orders entries from least recently used to most recently used, which might be more apt for your situation than insertion order.
EDIT: sorry I missed the part about using this as an LRU cache, for that use case, TreeMap will not be suitable.
If insertion order is just incidental for you, and what you want is in fact the actual order of comparable keys, you should use a TreeMap instead.
However, the specific use case of removing half the keys might not be supported directly. You will rather find methods to remove keys below/above a certain value, and get the highest/lowest keys.

Runtime error:dictionary changed size during iteration

I iterate through the items of a dictionary, "var_dict".
As I iterate in a for loop, I need to update the dictionary.
I understand that this is not possible and that it triggers the runtime error I experienced.
My question is, do I need to create a different dictionary to store the data? As it is now, I am trying to use the same dictionary with different keys.
I know the problem is related to iterating through the keys and values of a dictionary while attempting to change it. I want to know if the best option in this case is to create a separate dictionary.
for k, v in var_dict.items():
    match = str(match)
    match = match.strip("[]")
    match = match.strip("''")
    result = [index for index, value in enumerate(v) if match in value]
    result = str(result)
    result = result.strip("[]")
    result = result.strip("'")
    #====> IF I print(var_dict), at this point I have no error *********
    if result == "0":
        #It means a match between interface on RP PSE2 model was found; Interface position is on PSE2 architecture
        print (f'PSE-2 Line cards:{v} Interfaces on PSE2:{entry} Interface PortID:{port_id}')
        port_id = int(port_id)
        print(port_id)
        if port_id >= 19:
            #print(f'interface:{entry} portID={port_id} CPU_POS={port_cpu_pos} REPLICATION=YES')
            if_info = [entry,'PSE2=YES',port_id,port_cpu_pos,'REPLICATION=YES']
            var_dict['IF_PSE2'].append(if_info)
            #===> *** This is the point that if i attempt to print var_dict, I get the Error during olist(): dictionary changed size during iteration
        else:
            #print(f'interface:{entry},portID={port_id} CPU_POS={port_cpu_pos} REPLICATION=NO')
            if_info = [entry,'PSE2=YES',port_id,port_cpu_pos,'REPLICATION=NO']
            var_dict['IF_PSE2'].append(if_info)
    else:
        #it means the interface is on single PSE. No replication is applicable. Just check threshold between incoming and outgoing rate.
        if_info = [entry,'PSE2=NO',int(port_id),port_cpu_pos,'REPLICATION=NO']
        var_dict['IF_PSE1'].append(if_info)
I did a shallow copy and that allowed me to iterate a dictionary copy and make modifications to the original dictionary. Problem solved. Thanks.
(...)
temp_var_dict = var_dict.copy()
for k, v in temp_var_dict.items():
(...)
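For anyone landing here later, a minimal sketch of the snapshot approach, assuming the loop may add new keys to the dictionary. The function name, the "seen_" key prefix and the sample data are made up for illustration and are not from the code above. Taking list(var_dict.items()) has the same effect as iterating over var_dict.copy().items().

```python
def add_keys_while_iterating(var_dict):
    """Iterate over a snapshot of the items so the original dict may gain keys."""
    # list(...) copies the (key, value) pairs up front, so inserting into
    # var_dict inside the loop no longer disturbs the iteration.
    for key, values in list(var_dict.items()):
        if values:
            # setdefault may insert a new key; doing this while iterating
            # var_dict.items() directly would raise RuntimeError.
            var_dict.setdefault("seen_" + key, []).append(len(values))
    return var_dict


# Hypothetical sample data, only for demonstration.
demo = add_keys_while_iterating({"LC1": ["Gi0/1", "Gi0/2"], "LC2": []})
print(demo)
```

Note that the copy is shallow: the lists stored as values are still shared with the original dictionary, which is fine when you only append to them.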

how return a new type with an update value

If I want to change a value on a list, I will return a new list with the new value instead of changing the value on the old list.
Now I have four types. I need to update the location value in varEnd, but instead of changing it in place, I need to return a new value of the type with the updated field.
type varEnd = {
  v: ctype;
  k: varkind;
  l: location;
}
;;
type varStart = {
  ct: ctype;
  sy: sTable;
  n: int;
  stm: stmt list;
  e: expr
}
and sEntry = Var of varEnd | Fun of varStart
and sTable = (string * sEntry) list
type environment = sTable list;;
(a function where environment is the only parameter i can use)
let allocateMem (env:environment) : environment =
I tried to use List.iter, but that only works by changing values in place, and the type is not mutable anyway. I think a fold would be a better option.
The biggest issue I have is that there are four different types.
I think you're saying that you know how to change an element of a list by constructing a new list.
Now you want to do this to an environment, and an environment is a list of quite complicated things. But this doesn't make any difference, the way to change the list is the same. The only difference is that the replacement value will be a complicated thing.
I don't know what you mean when you say you have four types. I see a lot more than four types listed here. But on the other hand, an environment seems to contain things of basically two different types.
Maybe (but possibly not) you're saying you don't know a good way to change just one of the four fields of a record while leaving the others the same. This is something for which there's a good answer. Assume that x is something of type varEnd. Then you can say:
{ x with l = loc }
If, in fact, you don't know how to modify an element of a list by creating a new list, then that's the thing to figure out first. You can do it with a fold, but in fact you can also do it with List.map, which is a little simpler. You can't do it with List.iter.
Update
Assume we have a record type like this:
type r = { a: int; b: float; }
Here's a function that takes r list list and adds 1.0 to the b fields of those records whose a fields are 0.
let incr_ll rll =
  let f r = if r.a = 0 then { r with b = r.b +. 1.0 } else r in
  List.map (List.map f) rll
The type of this function is r list list -> r list list.

pyparsing for querying a database of chemical elements

I would like to parse a query for a database of chemical elements.
The database is stored in an XML file. Parsing that file produces a nested dictionary that is stored in a singleton object that inherits from collections.OrderedDict.
Asking for an element will give me an ordered dictionary of its corresponding properties
(i.e. ELEMENTS['C'] --> {'name':'carbon','neutron' : 0,'proton':6, ...}).
Conversely, asking for a property will give me an ordered dictionary of its values for all the elements (i.e. ELEMENTS['proton'] --> {'H' : 1, 'He' : 2} ...).
A typical query could be:
mass > 10 or (nucleon < 20 and atomic_radius < 5)
where each 'subquery' (i.e. mass > 10) will return the set of elements that matches it.
Then, the query will be converted internally to a string that will be evaluated to produce the set of indexes of the elements that match it. In that context the operators and/or are not boolean operators but rather set operators that act upon Python sets.
I recently sent a post for building such a query. Thanks to the useful answers I got, I think that I did more or less the job (I hope on a nice way !) but I still have some questions related to pyparsing.
Here is my code:
import numpy
from pyparsing import *
# This imports a singleton object storing the database dictionary as
# described earlier
from ElementsDatabase import ELEMENTS
and_operator = oneOf(['and','&'], caseless=True)
or_operator = oneOf(['or' ,'|'], caseless=True)
# ELEMENTS.properties is a property getter that returns the list of
# registered properties in the database
props = oneOf(ELEMENTS.properties, caseless=True)
# A property keyword can be quoted or not.
props = Suppress('"') + props + Suppress('"') | props
# When parsed, it must be replaced by the following expression that
# will be eval later.
props.setParseAction(lambda t : "numpy.array(ELEMENTS['%s'].values())" % t[0].lower())
quote = QuotedString('"')
integer = Regex(r'[+-]?\d+').setParseAction(lambda t:int(t[0]))
float_ = Regex(r'[+-]?(\d+(\.\d*)?)?([eE][+-]?\d+)?').setParseAction(lambda t:float(t[0]))
comparison_operator = oneOf(['==','!=','>','>=','<', '<='])
comparison_expr = props + comparison_operator + (quote | float_ | integer)
comparison_expr.setParseAction(lambda t : "set(numpy.where(%s)%s%s)" % tuple(t))
grammar = Combine(operatorPrecedence(comparison_expr, [(and_operator, 2, opAssoc.LEFT), (or_operator, 2, opAssoc.LEFT)]))
# A test query
res = grammar.parseString('"mass " > 30 or (nucleon == 1)',parseAll=True)
print eval(' '.join(res._asStringList()))
My questions are the following:
1 using 'transformString' instead of 'parseString' never triggers any
exception even when the string to be parsed does not match the grammar.
However, it is exactly the functionality I need. Is there a way to do so?
2 I would like to reintroduce white spaces between my tokens so
that my eval does not fail. The only way I found to do so is the one
implemented above. Do you see a better way using pyparsing?
Sorry for the long post, but I wanted to describe the context in more detail. BTW, if you find this approach bad, do not hesitate to tell me!
Thank you very much for your help.
Eric
Do not worry about my concern, I found a workaround. I used the SimpleBool.py example shipped with pyparsing (thanks for the hint, Paul).
Basically, I used the following approach:
1 for each subquery (i.e. mass > 10), using the setParseAction method,
I attached a function that returns the set of elements that matched
the subquery
2 then, I attached the following functions for each logical operator (and,
or and not):
def not_operator(token):
    _, s = token[0]
    # ELEMENTS is the singleton described in my original post
    return set(ELEMENTS.keys()).difference(s)
def and_operator(token):
    s1, _, s2 = token[0]
    # Set intersection; the `and` keyword would only return one of the operands
    return s1 & s2
def or_operator(token):
    s1, _, s2 = token[0]
    # Set union; the `or` keyword would only return one of the operands
    return s1 | s2
# Thanks for Paul for the hint.
grammar = operatorPrecedence(comparison_expr,
    [(not_token, 1, opAssoc.RIGHT, not_operator),
     (and_token, 2, opAssoc.LEFT, and_operator),
     (or_token, 2, opAssoc.LEFT, or_operator)])
Please note that these operators act upon Python sets rather than
on booleans.
And that does the job.
I hope that this approach will help some of you.
Eric
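To illustrate the point about sets versus booleans, here is a small sketch that leaves pyparsing out entirely. The combine helper and the sample sets are hypothetical and only stand in for the results of two subqueries. It shows that the set operators & and | give intersection and union, while the and keyword simply returns one of its operands.

```python
def combine(left, op, right):
    """Combine two sets of matching element symbols with a set operator."""
    if op == "and":
        return left & right  # intersection: elements matching both subqueries
    if op == "or":
        return left | right  # union: elements matching either subquery
    raise ValueError("unknown operator: %r" % op)


# Hypothetical subquery results, e.g. for `mass > 10` and `nucleon < 20`.
heavy = {"C", "N", "O"}
light = {"H", "He", "C"}

both = combine(heavy, "and", light)
either = combine(heavy, "or", light)

# The `and` keyword returns its second operand when the first is non-empty,
# which is not the intersection.
keyword_and = heavy and light

print(sorted(both), sorted(either), keyword_and == light)
```

The same distinction applies to or, which returns its first operand whenever that operand is non-empty.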

Python: get all values associated with key in a dictionary, where the values may be a list or a single item

I'm looking to get all values associated with a key in a dictionary. Sometimes the key holds a single dictionary, sometimes a list of dictionaries.
a = {
    'shelf':{
        'book':{'title':'the catcher in the rye', 'author':'j d salinger'}
    }
}
b = {
    'shelf':[
        {'book':{'title':'kafka on the shore', 'author':'haruki murakami'}},
        {'book':{'title':'atomised', 'author':'michel houellebecq'}}
    ]
}
Here's my method to read the titles of every book on the shelf.
def print_books(d):
    if(len(d['shelf']) == 1):
        print d['shelf']['book']['title']
    else:
        for book in d['shelf']:
            print book['book']['title']
It works, but doesn't look neat or pythonic. The for loop fails on the single value case, hence the if/else.
Can you improve on this?
Given your code will break if you have a list with a single item (and this is how I think it should be), if you really can't change your data structure this is a bit more robust and logical:
def print_books(d):
    if isinstance(d['shelf'], dict):
        print d['shelf']['book']['title']
    else:
        for book in d['shelf']:
            print book['book']['title']
Why not always make 'shelf' map to a list of elements, but in the single element case it's a ... single element list? Then you'd always be able to treat each bookshelf the same.
def print_books(d):
    container = d['shelf']
    # Wrap a single shelf entry in a list so both cases have the same shape
    books = container if isinstance(container, list) else [container]
    books = [ e['book'] for e in books ]
    for book in books:
        print book['title']
I would first get the input consistent, then loop through all the books even if only one.
def print_books(d):
    books = d['shelf'] if type(d['shelf']) is list else [ d['shelf'] ]
    for book in books:
        print book['book']['title']
I think this looks a little neater and pythonic, although some might argue not as efficient as your original code to create an array with one element and loop through it.
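The same idea as a Python 3 sketch that returns the titles instead of printing them, which makes it easier to check. The name book_titles is made up for this example, and it assumes every shelf entry is a dictionary with a 'book' key, as in the question's data.

```python
def book_titles(d):
    """Return the titles on a shelf that holds one entry or a list of entries."""
    shelf = d["shelf"]
    # Wrap a lone entry in a list so both shapes go through the same loop.
    entries = shelf if isinstance(shelf, list) else [shelf]
    return [entry["book"]["title"] for entry in entries]


a = {
    "shelf": {
        "book": {"title": "the catcher in the rye", "author": "j d salinger"}
    }
}
b = {
    "shelf": [
        {"book": {"title": "kafka on the shore", "author": "haruki murakami"}},
        {"book": {"title": "atomised", "author": "michel houellebecq"}},
    ]
}

print(book_titles(a))
print(book_titles(b))
```

Unlike the len check in the question, this also handles a list that happens to contain a single entry.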

Resources