How can I output the Gremlin query from a Java GraphTraversal object? The default output (graphTraversal.toString()) looks like [HasStep([~label.eq(brand), name.eq(Nike), status.within([VALID])])], which is not easy to read.
Gremlin provides the GroovyTranslator class to help with that. Here is an example.
// Simple traversal we can use for testing a few things
Traversal t =
    g.V().has("airport","region","US-TX").
          local(values("code","city").
                fold());
// Generate the text form of the query from a Traversal
String query;
query = GroovyTranslator.of("g").
          translate(t.asAdmin().getBytecode());
System.out.println("\nResults from GroovyTranslator on a traversal");
System.out.println(query);
This is taken from a set of examples located here: https://github.com/krlawrence/graph/blob/master/sample-code/RemoteWriteText.java
You can use the getBytecode() method on a DefaultGraphTraversal (via asAdmin()) to get the Gremlin query as output.
For example, consider the following graph
Graph graph = TinkerGraph.open();
// Property key is lowercase "age" with numeric values so that has("age", P.lt(20)) below can match
Vertex a = graph.addVertex(label, "person", "name", "Alex", "age", 23);
Vertex b = graph.addVertex(label, "person", "name", "Jennifer", "age", 20);
Vertex c = graph.addVertex(label, "person", "name", "Sophia", "age", 22);
a.addEdge("friends_with", b);
a.addEdge("friends_with", c);
Get a graph Traversal as following:
GraphTraversalSource gts = graph.traversal();
GraphTraversal graphTraversal =
gts.V().has("name","Alex").outE("friends_with").inV().has("age", P.lt(20));
Now you can get your traversal as a String as:
String traversalAsString = graphTraversal.asAdmin().getBytecode().toString();
It gives you output as:
[[], [V(), has(name, Alex), outE(friends_with), inV(), has(age, lt(20))]]
This is much more readable and close to the query as you wrote it. If you want the actual query text, you can post-process the string: strip the [ and ] brackets and join the steps with . as in a real query.
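As a rough illustration of that post-processing, here is a minimal Python sketch (the thread itself is Java; Python is used only to keep the example short). The helper names split_top_level and bytecode_to_query are made up for this example, and the sketch assumes the first bracketed group (the source instructions) is empty and that no argument value contains commas or brackets. Note that the bytecode string has already lost the quotes around string arguments, so the result is for reading, not for re-executing.

```python
def split_top_level(text):
    """Split on commas that are not nested inside () or []."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def bytecode_to_query(bytecode_text, source="g"):
    """Turn '[[], [V(), has(name, Alex)]]' into 'g.V().has(name, Alex)'."""
    outer = bytecode_text.strip()[1:-1]          # drop the outermost [ ]
    groups = split_top_level(outer)              # ['[]', '[V(), has(...), ...]']
    steps = split_top_level(groups[-1][1:-1])    # individual steps
    return ".".join([source] + steps)


text = "[[], [V(), has(name, Alex), outE(friends_with), inV(), has(age, lt(20))]]"
print(bytecode_to_query(text))
```

For anything beyond eyeballing a query, the GroovyTranslator approach from the other answer is the safer route, since it works on the bytecode itself rather than on its string form.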
I want to combine the topic to a single field. Currently I am trying this:
data_format = "json_v2"
[[inputs.mqtt_consumer.json_v2]]
[[inputs.mqtt_consumer.topic_parsing]]
topic = "+/+/+/"
tags = "name/id/value"
fields = "name/id/_"
[[inputs.mqtt_consumer.json_v2.tag]]
path = "timestamp"
However, this splits the topic into name and id, whereas I want to combine both into a single new string and store that.
I have performed a union in my query, which gives me the following output:
{
key1 = value1,
key2 = value2
},
{
key3 = value3
}
I need the output as follows:
{
key1 = value1,
key2 = value2,
key3 = value3
}
You have to deconstruct and reconstruct the maps. This operation is described in some detail in Gremlin Recipes if you want to read more.
The following code gets you to the position of your union():
gremlin> x = [[key1:"value1",key2:"value2"],[key3:"value3"]]
==>[key1:value1,key2:value2]
==>[key3:value3]
gremlin> g.inject(x).unfold()
==>[key1:value1,key2:value2]
==>[key3:value3]
Then you just unfold() those maps to map entries (i.e. key/value pairs) and group() them back together:
gremlin> g.inject(x).unfold().unfold().group().by(keys).by(values)
==>[key1:[value1],key2:[value2],key3:[value3]]
gremlin> g.inject(x).unfold().unfold().group().by(keys).by(select(values))
==>[key1:value1,key2:value2,key3:value3]
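If it helps to see the same deconstruct/reconstruct idea outside of Gremlin, here is a small plain-Python sketch. It is only an analogy for the two results shown above, not a description of how Gremlin evaluates by(); in particular, keeping the first value seen for a duplicate key is an assumption of this sketch.

```python
# Same shape as the union() output: a list of maps
maps = [{"key1": "value1", "key2": "value2"}, {"key3": "value3"}]

# Analogous to unfold().unfold(): flatten the maps into (key, value) entries
entries = [(k, v) for m in maps for k, v in m.items()]

# Analogous to group().by(keys).by(values): every key collects a list of values
grouped = {}
for k, v in entries:
    grouped.setdefault(k, []).append(v)

# Analogous to group().by(keys).by(select(values)): one plain value per key
# (assumption: the first value wins if a key appears in more than one map)
merged = {}
for k, v in entries:
    merged.setdefault(k, v)

print(grouped)
print(merged)
```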
Problem description
I'm using sqlalchemy (v1.2) declarative, and I have a simple class Node with an id and a label. I would like to build a self-referencing many-to-many relationship where the association table is not a database table, but a dynamic select statement. This statement selects from two joined aliases of Node and returns rows of the form (left_id, right_id), defining the relationship. The code I have so far works if I access the relationship through an instance object, but when I try to filter by the relationship the joins are messed up.
The "classical" self-referential many-to-many relation
For reference, let's start with the example from the documentation on Self-Referential Many-to-Many Relationship, which uses an association table:
node_to_node = Table(
    "node_to_node", Base.metadata,
    Column("left_node_id", Integer, ForeignKey("node.id"), primary_key=True),
    Column("right_node_id", Integer, ForeignKey("node.id"), primary_key=True)
)

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    label = Column(String, unique=True)
    right_nodes = relationship(
        "Node",
        secondary=node_to_node,
        primaryjoin=id == node_to_node.c.left_node_id,
        secondaryjoin=id == node_to_node.c.right_node_id,
        backref="left_nodes"
    )

    def __repr__(self):
        return "Node(id={}, Label={})".format(self.id, self.label)
Joining Node to itself through this relationship:
>>> NodeAlias = aliased(Node)
>>> print(session.query(Node).join(NodeAlias, Node.right_nodes))
SELECT node.id AS node_id, node.label AS node_label
FROM node JOIN node_to_node AS node_to_node_1
ON node.id = node_to_node_1.left_node_id
JOIN node AS node_1
ON node_1.id = node_to_node_1.right_node_id
Everything looks fine.
The many-to-many relation through an association select statement
As an example, we implement a relationship next_two_nodes which connects a node to the two nodes with id+1 and id+2 (if they exist).
Here is a function which generates the select statement for the "dynamic" association table:
_next_two_nodes = None

def next_two_nodes_select():
    global _next_two_nodes
    if _next_two_nodes is None:
        _leftside = aliased(Node, name="leftside")
        _rightside = aliased(Node, name="rightside")
        _next_two_nodes = select(
            [_leftside.id.label("left_node_id"),
             _rightside.id.label("right_node_id")]
        ).select_from(
            join(
                _leftside, _rightside,
                or_(
                    _leftside.id + 1 == _rightside.id,
                    _leftside.id + 2 == _rightside.id
                )
            )
        ).alias()
    return _next_two_nodes
Note that the function caches the result in a global variable, so that successive calls always return the same object instead of using new aliases. Here is my attempt to use this select in a relationship:
class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    label = Column(String, unique=True)
    next_two_nodes = relationship(
        "Node", secondary=next_two_nodes_select,
        primaryjoin=(lambda: foreign(Node.id)
                     == remote(next_two_nodes_select().c.left_node_id)),
        secondaryjoin=(lambda: foreign(next_two_nodes_select().c.right_node_id)
                       == remote(Node.id)),
        backref="previous_two_nodes",
        viewonly=True
    )

    def __repr__(self):
        return "Node(id={}, Label={})".format(self.id, self.label)
Some test data:
nodes = [
    Node(id=1, label="Node1"),
    Node(id=2, label="Node2"),
    Node(id=3, label="Node3"),
    Node(id=4, label="Node4")
]
session.add_all(nodes)
session.commit()
Accessing the relationship through an instance works as expected:
>>> node = session.query(Node).filter_by(id=2).one()
>>> node.next_two_nodes
[Node(id=3, Label=Node3), Node(id=4, Label=Node4)]
>>> node.previous_two_nodes
[Node(id=1, Label=Node1)]
However, filtering on the relationship does not give the expected result:
>>> session.query(Node).join(NodeAlias, Node.next_two_nodes).filter(NodeAlias.id == 3).all()
[Node(id=1, Label=Node1),
Node(id=2, Label=Node2),
Node(id=3, Label=Node3),
Node(id=4, Label=Node4)]
I would expect only Node1 and Node2 to be returned. And indeed, the SQL statement of the join is wrong:
>>> print(session.query(Node).join(NodeAlias, Node.next_two_nodes))
SELECT node.id AS node_id, node.label AS node_label
FROM node JOIN (SELECT leftside.id AS left_node_id, rightside.id AS right_node_id
FROM node AS leftside JOIN node AS rightside
ON leftside.id + 1 = rightside.id OR leftside.id + 2 = rightside.id) AS anon_1
ON anon_1.left_node_id = anon_1.left_node_id
JOIN node AS node_1 ON anon_1.right_node_id = node_1.id
Comparing with the working example above, instead of ON anon_1.left_node_id = anon_1.left_node_id it should clearly read ON node.id = anon_1.left_node_id. My primaryjoin seems to be wrong, but I cannot figure out how to connect the last dots.
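To double-check that the ON clause really is the culprit, the two statements can be run directly against an in-memory SQLite database, outside of SQLAlchemy. This is only a sketch using the standard-library sqlite3 module: the table mirrors the Node model above, the helper matching_ids is a name made up for this example, and DISTINCT stands in for the entity de-duplication the ORM performs.

```python
import sqlite3

# The "dynamic association table" exactly as it appears in the generated SQL
SUBQUERY = """
    (SELECT leftside.id AS left_node_id, rightside.id AS right_node_id
     FROM node AS leftside JOIN node AS rightside
       ON leftside.id + 1 = rightside.id OR leftside.id + 2 = rightside.id) AS anon_1
"""


def matching_ids(conn, on_clause):
    """Return ids of nodes joined to node 3, using the given first ON clause."""
    sql = (
        "SELECT DISTINCT node.id FROM node JOIN " + SUBQUERY +
        " ON " + on_clause +
        " JOIN node AS node_1 ON anon_1.right_node_id = node_1.id"
        " WHERE node_1.id = 3 ORDER BY node.id"
    )
    return [row[0] for row in conn.execute(sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, label TEXT UNIQUE)")
conn.executemany("INSERT INTO node VALUES (?, ?)",
                 [(i, "Node%d" % i) for i in range(1, 5)])

# ON clause as generated by SQLAlchemy: always true, so every node matches
generated = matching_ids(conn, "anon_1.left_node_id = anon_1.left_node_id")
# ON clause as it should read
expected = matching_ids(conn, "node.id = anon_1.left_node_id")

print(generated)  # [1, 2, 3, 4]
print(expected)   # [1, 2]
```

With the self-comparing ON clause the subquery is effectively cross-joined to node, which explains why all four nodes come back from the filtered query.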
After more debugging I found that "Clause Adaption" is replacing my ON clause. I'm not sure about the details, but for some reason sqlalchemy thinks that I am referring to the node.id from the select rather than from the original Node table. The only way I found to suppress clause adaption was to write the select columns in text form:
select(
    [literal_column("leftside.id").label("left_node_id"),
     literal_column("rightside.id").label("right_node_id")]
)...
This way the relationship to Node is broken and filtering works as expected. It feels like a hack with unforeseeable side effects, maybe someone knows a cleaner way...
So I have a root model that looks like this:
class Contact(ndb.Model):
    first_name = ndb.StringProperty()
    last_name = ndb.StringProperty()
    age = ndb.IntegerProperty()
and a child model that looks like this:
class Address(ndb.Model):
    address_type = ndb.StringProperty(choices=['Home','Office','School'], default='Home')
    street = ndb.StringProperty()
    city = ndb.StringProperty()
    state = ndb.StringProperty()
I want to be able to perform a query similar to this:
Select first_name, last_name, street, city, state WHERE contact.age > 25 and address.city = 'Miami' and address_type = 'School'
I know I could perform searches more easily if I set up the addresses as a structured property within the contact model, but I don't like using Structured Properties because they don't have their own keys, which makes entity maintenance more challenging.
I tried doing a search for contacts first and then feeding the resulting keys into a WHERE IN clause but it didn't work, example:
query1 = Contact.query(Contact.age>25).iter(keys_only = True)
query2 = Address.query(Address.city=='Miami', Address.address_type=='School',Address.ancestor.IN(query1))
Any ideas as to how to go about this would be appreciated.
OK so it looks like my original idea of filtering one query by passing in the keys of another will work. The problem is that you can't perform a WHERE-IN clause against an ancestor property so you have to store the parent key as a standard ndb.KeyProperty() inside of the child entity, then perform the WHERE-IN clause against that KeyProperty field.
Here's an example that will work directly from the interactive console in the Appengine SDK:
from google.appengine.ext import ndb
class Contact(ndb.Model):
    first_name = ndb.StringProperty()
    last_name = ndb.StringProperty()
    age = ndb.IntegerProperty()

class Address(ndb.Model):
    address_type = ndb.StringProperty(choices=['Home','Office','School'], default='Home')
    street = ndb.StringProperty()
    city = ndb.StringProperty()
    state = ndb.StringProperty()
    contact = ndb.KeyProperty()
# Contact 1
contact1 = Contact(first_name='Homer', last_name='Simpson', age=45)
contact1_result = contact1.put()
contact1_address1 = Address(address_type='Home',street='742 Evergreen Terrace', city='Springfield', state='Illinois', contact=contact1_result, parent=contact1_result)
contact1_address1.put()
contact1_address2 = Address(address_type='Office',street=' 1 Industry Row', city='Springfield', state='Illinois', contact=contact1_result, parent=contact1_result)
contact1_address2.put()
# Contact 2
contact2 = Contact(first_name='Peter', last_name='Griffan', age=42)
contact2_result = contact2.put()
contact2_address1 = Address(address_type='Home',street='31 Spooner Street', city='Quahog', state='Rhode Island', contact=contact2_result, parent=contact2_result)
contact2_address1.put()
# This gets the keys of all the contacts that are over the age of 25
qry1 = Contact.query(Contact.age>25).fetch(keys_only=True)
# This query gets all addresses of type 'Home' where the contacts are in the result set of qry1
qry2 = Address.query(Address.address_type=='Home').filter(Address.contact.IN(qry1))
for item in qry2:
    print 'Contact: %s,%s,%s,%s'% (item.contact.get().first_name, item.contact.get().last_name, item.address_type, item.street)
This will render a result that looks kinda like this:
Contact: Peter,Griffan,Home,31 Spooner Street
Contact: Homer,Simpson,Home,742 Evergreen Terrace
Can you use an Ancestor query?
query1 = Contact.query(Contact.age>25).iter(keys_only = True)
for contact in query1:
    query2 = Address.query(Address.city=='Miami',
                           Address.address_type=='School',
                           ancestor=contact)
If that's not efficient enough, how about filtering the addresses?
query1 = Contact.query(Contact.age>25).iter(keys_only = True)
contacts = set(query1)
query2 = Address.query(Address.city=='Miami', Address.address_type=='School')
addresses = [address for address in query2 if address.key.parent() in contacts]
I have two paths in a graph: A-B-C-D and A-B-E-F. I would like to assign identification numbers to those paths, i.e. A-B-C-D would be 1 and A-B-E-F would be 2.
Is it possible? If yes, how?
You mean like a persistent path ID? This isn't directly featured, but you can do it in a Query in Cypher.
If you want something persistent, you can always use an index: create a Relationship index that stores the Relationships of Path 1 under the key/value pair Path:1.
EDIT: After getting more information, here's a use case using the index:
It would be up to you to define this in the Index. Here is what you do:
Node a = db.createNode();
Node b = db.createNode();
Node c = db.createNode();
Node d = db.createNode();
Node e = db.createNode();
Node f = db.createNode();
Relationship aTob = a.createRelationshipTo(b, DynamicRelationshipType.withName("RELATIONSHIP"));
Relationship bToc = b.createRelationshipTo(c, DynamicRelationshipType.withName("RELATIONSHIP"));
Relationship cTod = c.createRelationshipTo(d, DynamicRelationshipType.withName("RELATIONSHIP"));
Relationship bToe = b.createRelationshipTo(e, DynamicRelationshipType.withName("RELATIONSHIP"));
Relationship eTof = e.createRelationshipTo(f, DynamicRelationshipType.withName("RELATIONSHIP"));
Index<Relationship> relationshipIndex = db.index().forRelationships("PathIndex");
String pathRId = UUID.randomUUID().toString();
String pathMId = UUID.randomUUID().toString();
relationshipIndex.add(aTob, "PathId", pathRId);
relationshipIndex.add(bToc, "PathId", pathRId);
relationshipIndex.add(cTod, "PathId", pathRId);
relationshipIndex.add(aTob, "PathId", pathMId);
relationshipIndex.add(bToe, "PathId", pathMId);
relationshipIndex.add(eTof, "PathId", pathMId);
Then, when you want to find a path, you search the index by its ID. You are responsible for maintaining the path IDs in the index; here I use UUIDs, but you can use something more representative of your data. Note that the relationships are not returned from the index in any repeatable order.
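To make the bookkeeping a bit more concrete, here is a plain-Python sketch of the same idea (an in-memory analogy, not the Neo4j API). The names path_index, add_path and relationships_of are made up for this example. Because the index gives no ordering guarantee, the sketch stores each relationship's position next to it, so the path can be put back in order after lookup; that extra position is an assumption of this sketch, not something the Java code above does.

```python
import uuid
from collections import defaultdict

# path ID -> unordered set of (position, relationship name) entries
path_index = defaultdict(set)


def add_path(relationship_names):
    """Register a path under a fresh ID and return that ID."""
    path_id = str(uuid.uuid4())
    for position, name in enumerate(relationship_names):
        path_index[path_id].add((position, name))
    return path_id


def relationships_of(path_id):
    """Look up a path by ID and restore the original order."""
    return [name for _, name in sorted(path_index.get(path_id, ()))]


# A-B-C-D and A-B-E-F share the relationship aTob
path_r = add_path(["aTob", "bToc", "cTod"])
path_m = add_path(["aTob", "bToe", "eTof"])

print(relationships_of(path_r))
print(relationships_of(path_m))
```

As in the index version, a relationship shared by two paths (aTob here) simply appears under both IDs.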