Iterate list of values from traversal A in traversal B (Gremlin) - gremlin

This is my test data:
graph = TinkerGraph.open()
g= graph.traversal()
g.addV('Account').property('id',"0x0").as('a1').
addV('Account').property('id',"0x1").as('a2').
addV('Account').property('id',"0x2").as('a3').
addV('Token').property('address','1').as('tk1').
addV('Token').property('address','2').as('tk2').
addV('Token').property('address','3').as('tk3').
addV('Trx').property('address','1').as('Trx1').
addV('Trx').property('address','1').as('Trx2').
addV('Trx').property('address','3').as('Trx3').
addE('sent').from('a1').to('Trx1').
addE('sent').from('a2').to('Trx2').
addE('received_by').from('Trx1').to('a2').
addE('received_by').from('Trx2').to('a3').
addE('distributes').from('a1').to('tk1').
addE('distributes').from('a1').to('tk2').
addE('distributes').from('a1').to('tk3').
iterate()
I need to first get all the Token addresses using the distributes relationship and then with those values loop through a traversal. This is an example of what I need for one single token
h = g.V().has('Account','id','0x0').next()
token = '1'
g.V(h).
out('sent').has('address',token).as('t1').
out('received_by').as('a2').
out('sent').has('address',token).as('t2').
out('received_by').as('a3').
select('a3','a2'). \
by('id').toList()
This is the output:
[a3:0x2,a2:0x1]
Instead of doing that has('address',token) on each hop I could omit it and just make sure the token address is the same by placing a where('t1',eq('t2')).by('address') at the end of the traversal, but this performs badly given my database design and indexes.
So what I do to iterate is:
tokens = g.V(h).out('distributes').values('address').toList()
finalList = []
for (token in tokens){
finalList.add(g.V(h).
out('sent').has('address',token).
out('received_by').as('a2').
out('sent').has('address',token).
out('received_by').as('a3').
select('a3','a2'). \
by('id').toList())
}
And this is what's stored in finalList at the end:
==>[[a3:0x2,a2:0x1]]
==>[]
==>[]
This works but I was wondering how can I iterate that token list this way without leaving Gremlin and without introducing that for loop. Also, my results contain empty results which is not optimal. The key here for me is to always be able to do that has('address',token) for each hop with the tokens that the Account node has ever sent. Thank you very much.

There is still uncertainty about what you are trying to achieve.
Nevertheless, I think this query does what you need:
g.V().has('Account', 'id', '0x0').as('a').
out('distributes').values('address').as('t').
select('a').
repeat(out('sent').where(values('address').
as('t')).
out('received_by')).
emit()
Example: https://gremlify.com/spwya4itlvd

Related

Union step does not work with multiple elements

The following query returns a user map with an "injected" property called "questions", it works as expected when g.V().has() returns a single user, but not when returns multiple users:
return g.V().has("user", "userId", 1)
.union(
__.valueMap().by(__.unfold()),
__.project('questions').by(
__.outE('response').valueMap().by(__.unfold()).fold()
)
)
.unfold()
.group()
.by(__.select(column.keys))
.by(__.select(column.values));
It works, but when I change the first line to return multiple users:
g.V().hasLabel("user").union(....
I finish the query calling .toList() so I was expecting to get a list of all the users in the same way it works with a single user but instead I still get a single user.
How can I get my query to work for both, multiple users or a single user?
When using Gremlin, you have to think in terms of a stream. The stream contains traversers which travel through the steps you've written. In your case, with your initial test of:
g.V().has("user", "userId", 1)
.union(
__.valueMap().by(__.unfold()),
__.project('questions').by(
__.outE('response').valueMap().by(__.unfold()).fold()
)
)
.unfold()
.group()
.by(__.select(column.keys))
.by(__.select(column.values))
you have one traverser (i.e. V().has("user", "userId", 1) produces one user) that flows to the union() and is split so that it goes to both valueMap() and project() both producing Map instances. You now have two traversers which are unfolded to a stream and grouped together to one final Map traverser.
So with that in mind what changes when you do hasLabel("user")? Well, you now have more than one starting traverser which means you will produce two traversers for each of those users when you get to union(). They will each be flatted to stream by unfold() and then they will just overwrite one another (because they have the same keys) to produce one final Map.
You really want to execute your union() and follow on operations once per initial "user" vertex traverser. You can tell Gremlin to do that with map():
g.V().has("user", "userId", 1)
.map(
.union(
__.valueMap().by(__.unfold()),
__.project('questions').by(
__.outE('response').valueMap().by(__.unfold()).fold()
)
)
.unfold()
.group()
.by(__.select(column.keys))
.by(__.select(column.values))
)
Finally, you can simplify your final by() modulators as:
g.V().has("user", "userId", 1)
.map(
.union(
__.valueMap().by(__.unfold()),
__.project('questions').by(
__.outE('response').valueMap().by(__.unfold()).fold()
)
)
.unfold()
.group()
.by(keys)
.by(values)
)

Sort paths based on Edge properties

Sorting traversal paths based on Edge property and Dedup
Hello,
I'm having a in memory graph and I want to sort paths based on Edge property and also dedup where paths leading to same destination.
E.g.
String NAME = "name";
String id = "id";
g.addV().property(id, 1).property(NAME, "u1").as("u1")
.addV().property(id, 2).property(NAME, "u2").as("u2")
.addV().property(id, 3).property(NAME, "u3").as("u3")
.addV().property(id, 4).property(NAME, "u4").as("u4")
.addE(rel).from("u2").to("u1").property("order", 2)
.addE(rel).from("u3").to("u1").property("order", 1)
.addE(rel).from("u4").to("u2").property("order", 3)
.addE(rel).from("u4").to("u3").property("order", 4)
.iterate();
What I'm trying to achieve is a traversal which gives me only one path i.e.
vertices = [path[u1, u3, u4]].
I tried using below gremlin.
List<Path> maps = g.V()
.has("id", 1)
.repeat(in()
.simplePath())
.until(inE().count().is(0))
.order().by(outE("rel").values("order"),Order.asc)
.path().by("name")
.toList();
However sorting doesn't happen. It gives me two paths :
vertices = [path[u1, u2, u4], path[u1, u3, u4]]
But I'm looking for output as vertices = [path[u1, u3, u4]]
I'm new to gremlin and ran out of options to try.
can someone help ?
g.V()
.has("id", 1)
.repeat(in("rel") .order() .by(outE().values("order"), Order.asc) .simplePath() )
.until(inE().count().is(0))
.dedup()
.path()
.by("name")
.toList() ;
Using toList will give you all the passible traversals. In your case you did order the answers but didn't take only the first one.
You should add limit step:
...
.limit(1).toList()
Or you can use next() instead of toList()

Odd behaviour of values()

In Gremlin console, I see the following:
1) Setup
graph = TinkerGraph.open()
g = graph.traversal()
2) Insert
g.addV('mylabel').property(id, 1).property('name', 'myname').next()
3) This works fine:
g.V(1).values('name')
==>myname // as expected
3) This instead gives unexpected result
g.V().sideEffect{println it.get().values('name')}
org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils$3#12a14b74 // expected: "myname"
Shouldn't it.get() be the vertex and values('name') be "myname"?
Speaking strictly to the API and usage of Vertex.values() in a sideEffect() step, you're getting precisely what you asked for because values() returns an Iterator - see javadocs. You need to instead do:
g.V().sideEffect{println it.get().values('name').next()}
or better (assuming "name" isn't a multi-property):
g.V().sideEffect{println it.get().value('name')}
or better still, avoid the closure/sideEffect() and just do:
g.V().values('name')

ArangoDB copy Vertex and Edges to neighbors

I'm trying to copy a vertex node and retain it's relationships in ArangoDB. I'm getting a "access after data-modification" error (1579). It doesn't like it when I iterate over the source node's edges and insert an edge copy within the loop. This makes sense but I'm struggling to figure out how to do what I'm wanting within a single transaction.
var query = arangojs.aqlQuery`
let tmpNode = (FOR v IN vertices FILTER v._id == ${nodeId} RETURN v)[0]
let nodeCopy = UNSET(tmpNode, '_id', '_key', '_rev')
let nodeCopyId = (INSERT nodeCopy IN 'vertices' RETURN NEW._id)[0]
FOR e IN GRAPH_EDGES('g', ${nodeId}, {'includeData': true, 'maxDepth': 1})
let tmpEdge = UNSET(e, '_id', '_key', '_rev')
let edgeCopy = MERGE(tmpEdge, {'_from': nodeCopyId})
INSERT edgeCopy IN 'edges'
`;
This quesion is somewhat similar to 'In AQL how to re-parent a vertex' - so let me explain this in a similar way.
One should use the ArangoDB 2.8 pattern matching traversals to solve this.
We will copy Alice to become Sally with similar relations:
let alice=DOCUMENT("persons/alice")
let newSally=UNSET(MERGE(alice, {_key: "sally", name: "Sally"}), '_id')
let r=(for v,e in 1..1 ANY alice GRAPH "knows_graph"
LET me = UNSET(e, "_id", "_key", "_rev")
LET newEdge = (me._to == "persons/alice") ?
MERGE(me, {_to: "persons/sally"}) :
MERGE(me, {_from: "persons/sally"})
INSERT newEdge IN knows RETURN newEdge)
INSERT newSally IN persons RETURN newSally
We therefore first load Alice. We UNSET the properties ArangoDB should set on its own. We change the properties that have to be uniq to be uniq for Alice so we have a Sally afterwards.
Then we open a subquery to traverse ANY first level relations of Alice. In this subequery we want to copy the edges - e. We need to UNSET once more the document attributes that have to be autogenerated by ArangoDB. We need to find out which side of _from and _to pointed to Alice and relocate it to Sally.
The final insert of Sally has to be outside of the subquery, else this statement will attempt to insert one Sally per edge we traverse. We can't insert Saly in front of the query as you already found out - no subsequent fetches are allowed after the insert.

Users who mentioned each other in Gremlin

We have a smaller example twitter database:
user -[TWEETED]-> tweet -[MENTIONED]-> user2
and I would like to find out how to write a query in Gremlin, that shows who were the users who mentioned each other. I have already read the docs but I don't know how to do it.
Given this sample data that assume marko and stephen mention each other and marko and daniel mention each other:
g = new TinkerGraph()
vMarko = g.addVertex("marko", [type:"user"])
vStephen = g.addVertex("stephen", [type:"user"])
vDaniel = g.addVertex("daniel", [type:"user"])
vTweetm1s = g.addVertex("m1s", [type:"tweet"])
vTweetm2d = g.addVertex("m2d", [type:"tweet"])
vTweets1m = g.addVertex("s1m", [type:"tweet"])
vTweetd1m = g.addVertex("d1m", [type:"tweet"])
vMarko.addEdge("tweeted",vTweetm1s)
vMarko.addEdge("tweeted",vTweetm2d)
vStephen.addEdge("tweeted",vTweets1m)
vDaniel.addEdge("tweeted",vTweetd1m)
vTweetm1s.addEdge("mentioned", vStephen)
vTweetm2d.addEdge("mentioned", vDaniel)
vTweets1m.addEdge("mentioned", vMarko)
vTweetd1m.addEdge("mentioned", vMarko)
you could handle it with the following:
gremlin> g.V.has("type","user").as('s')
.out("tweeted").out("mentioned").as('m').out("tweeted")
.out("mentioned").as('e').select.filter{it[0]==it[2]}
==>[s:v[daniel], m:v[marko], e:v[daniel]]
==>[s:v[stephen], m:v[marko], e:v[stephen]]
==>[s:v[marko], m:v[stephen], e:v[marko]]
==>[s:v[marko], m:v[daniel], e:v[marko]]
This approach uses select to extract the data from the labelled steps then a final filter to find those where "s" (vertex in the first position) is equal to the "e" (vertex in the final position). This of course means that there is cycle pattern detected where the one user mentioned another and the other mentioned that person back at some point.
If you follow that much then we can clean up the result a little bit so as to get the unique set of pairs:
gremlin> g.V.has("type","user").as('s')
.out("tweeted").out("mentioned").as('m')
.out("tweeted").out("mentioned").as('e')
.select.filter{it[0]==it[2]}
.transform{[it[0].id,it[1].id] as Set}.toList() as Set
==>[daniel, marko]
==>[stephen, marko]
By adding a transform to the previous code, we can convert the result to "id" (the user's name in this case) and flip everything to Set so as to get unique pairs of results.

Resources