How to add multiple edges in a single gremlin query? - gremlin

My scenario is to add multiple edges between vertices in a single query.
Assume the vertices below; these are the labels and ids I have:
Users:
4100
Songs:
4200
4355
4676
I have to establish edges between these vertices
4100 --> 4200,
4100 --> 4355,
4100 --> 4676.
Normally we can do this by creating a single edge between two vertices at a time, but that is not an efficient method if we want to create edges to more than 50 vertices at once. I am using TinkerPop 3.0.1.

I had a similar problem. With the C# SDK I do it like this:
g.V('4100')
.addE('knows').to(g.V('4200')).outV()
.addE('knows').to(g.V('4355')).outV()
.addE('knows').to(g.V('4676'))

Using the latest TinkerPop, you could do the following:
Create a sample graph:
gremlin> graph = TinkerGraph.open();
gremlin> graph.addVertex("User").property("id", 4100);
==>vp[id->4100]
gremlin> graph.addVertex("Song").property("id", 4200);
==>vp[id->4200]
gremlin> graph.addVertex("Song").property("id", 4355);
==>vp[id->4355]
gremlin> graph.addVertex("Song").property("id", 4676);
==>vp[id->4676]
Now add the edges in a single traversal:
gremlin> graph.traversal().V().hasLabel("User").as("a").
V().hasLabel("Song").
addE("edge to song").from("a");
==>e[8][0-edge to song->2]
==>e[9][0-edge to song->4]
==>e[10][0-edge to song->6]
This shows another example of using addE within a traversal as a side effect.

If you have the vertex ids, looking the vertices up by id is very efficient. If you are using Gremlin Server, each request to the server is treated as a single transaction. You can pass multiple statements in one Gremlin script on a single request (with bindings) rather than sending multiple requests. Separate the statements in the script with semicolons.
l=[4200, 4355, 4676]; v=graph.vertices(4100).next(); l.each { v.addEdge("knows", graph.vertices(it).next()) }
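As a rough sketch of the single-request idea, the JSON body for Gremlin Server's HTTP endpoint could be assembled like this in Python. The helper `build_request` and the binding names `userId`, `songIds` and `edgeLabel` are made up for illustration; the script is the one above with its literals replaced by bindings, and nothing is actually sent to a server here.

```python
import json


def build_request(user_id, song_ids, label="knows"):
    """Build one request body that adds all edges in a single script."""
    # Same statements as above, separated by a semicolon. The ids and the
    # edge label arrive as bindings (userId, songIds, edgeLabel are
    # hypothetical names chosen for this sketch).
    script = (
        "v = graph.vertices(userId).next(); "
        "songIds.each { v.addEdge(edgeLabel, graph.vertices(it).next()) }"
    )
    body = {
        "gremlin": script,
        "bindings": {"userId": user_id, "songIds": song_ids, "edgeLabel": label},
    }
    return json.dumps(body)


payload = build_request(4100, [4200, 4355, 4676])
print(payload)
```

Because the script text stays the same for every call and only the bindings change, the same request shape can be reused for 3 or 50 target vertices.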

Try this
gremlin> songs = g.V().has("album","albumname").toList(); user = g.V().has('fullName','arunkumar').next(); songs.each { user.addEdge("in", it) }
gremlin> g.E() //check the edge
Hope this helps :)

Related

Gremlin query combine vertices with unrelated vertices CosmosDB

I would like to get several vertices, e.g. with the label "user", combined with vertices they are not yet related to, e.g. with the label "movie".
I know that the strength of Gremlin is traversing the graph, and that combining objects that are not related is not the best use case for a graph. I am using Azure CosmosDB for my application, so if there is a more performant way to do this, feel free to let me know. If this can be done with Gremlin, I need some help with the query. I provide an example here:
There are 4 users: bob, jose, frank, peter
and 4 movies: movie1, movie2, movie3, movie4
Between the users and movies there can be an edge "watched"
My example data looks as follows:
watched:
[bob, [movie1,movie2]]
[jose, [movie3]]
[frank, []]
[peter, [movie4]]
The result and format I would like to get is following:
not watched:
[bob, movie3]
[bob, movie4]
[jose, movie1]
[jose, movie2]
[jose, movie4]
[frank, movie1]
[frank, movie2]
[frank, movie3]
[frank, movie4]
[peter, movie1]
[peter, movie2]
[peter, movie3]
The script to set up the graph (using /partition_key as partition key):
g.addV("user").property("partition_key", 1).property("id", "bob")
g.addV("user").property("partition_key", 1).property("id", "jose")
g.addV("user").property("partition_key", 1).property("id", "frank")
g.addV("user").property("partition_key", 1).property("id", "peter")
g.addV("movie").property("partition_key", 1).property("id", "movie1")
g.addV("movie").property("partition_key", 1).property("id", "movie2")
g.addV("movie").property("partition_key", 1).property("id", "movie3")
g.addV("movie").property("partition_key", 1).property("id", "movie4")
g.V("bob").addE("watched").to(g.V("movie1"))
g.V("bob").addE("watched").to(g.V("movie2"))
g.V("jose").addE("watched").to(g.V("movie3"))
g.V("peter").addE("watched").to(g.V("movie4"))
Please note that I cannot use lambdas, because Azure CosmosDB doesn't support them.
A join in gremlin can be realized by repeating the V() step. After realizing that, the gremlin query almost reads as an ordinary SQL query, see below.
g.V().has("id", "bob").addE("watched").to(__.V().has("id", "movie1"))
g.V().has("id", "bob").addE("watched").to(__.V().has("id", "movie2"))
g.V().has("id", "jose").addE("watched").to(__.V().has("id", "movie3"))
g.V().has("id", "peter").addE("watched").to(__.V().has("id", "movie4"))
// Keep a (user, movie) pair only when no "watched" edge links that user to that movie.
g.V().hasLabel("user").as("u").
V().hasLabel("movie").as("m").
not(__.in("watched").where(eq("u"))).
select("u", "m").by("id").
order().by("u").by("m")
==>[u:bob,m:movie3]
==>[u:bob,m:movie4]
==>[u:frank,m:movie1]
==>[u:frank,m:movie2]
==>[u:frank,m:movie3]
==>[u:frank,m:movie4]
==>[u:jose,m:movie1]
==>[u:jose,m:movie2]
==>[u:jose,m:movie4]
==>[u:peter,m:movie1]
==>[u:peter,m:movie2]
==>[u:peter,m:movie3]
You are right in saying that this query does not perform well in gremlin and I would advise you to use the SQL API of CosmosDb.
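For cross-checking the expected result, the "not watched" set can also be computed outside the graph. The following is a plain-Python model of the set logic only (a cross product minus the existing edges); it does not use the Gremlin or Cosmos DB APIs, and the function name `not_watched` is made up for this sketch.

```python
from itertools import product

users = ["bob", "jose", "frank", "peter"]
movies = ["movie1", "movie2", "movie3", "movie4"]
# "watched" edges from the setup script, as (user, movie) pairs.
watched = {
    ("bob", "movie1"),
    ("bob", "movie2"),
    ("jose", "movie3"),
    ("peter", "movie4"),
}


def not_watched(users, movies, watched):
    """Return every (user, movie) pair without a "watched" edge, sorted."""
    return sorted(pair for pair in product(users, movies) if pair not in watched)


result = not_watched(users, movies, watched)
for user, movie in result:
    print(f"[{user}, {movie}]")
```

With 4 users and 4 movies there are 16 candidate pairs; removing the 4 watched pairs leaves the 12 rows listed in the question.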

Get all edges between multiple vertices - JanusGraph

I am trying to write a gremlin query to get all the edges between a list of vertices.
Data redacted to protect privacy. User A(42651832) -reports_to-> User B(28729440) -reports_to-> User C(19546208)
ids = [19546208, 28729440, 42651832]
I need to find all the edges between an arbitrary list of vertices.
It seems simple, but I am unable to write a query that gives the results I need.
gremlin> g.V(42651832).outE('reports_to').otherV().id()
==>28729440
gremlin> g.V(28729440).outE('reports_to').otherV().id()
==>19546208
gremlin> ids = [19546208, 28729440, 42651832]
==>19546208
==>28729440
==>42651832
gremlin> g.V(ids)
==>v[19546208]
==>v[28729440]
==>v[42651832]
gremlin> g.V(ids).bothE().where(otherV().hasId(ids))
gremlin> g.V(ids).bothE().where(otherV().hasId(within(ids)))
gremlin> g.V(ids).bothE().where(otherV().hasId(within(19546208, 28729440, 42651832)))
Apparently there is a type conversion issue between the Gremlin console and JanusGraph:
quoting the ids as strings or converting them to Long seems to work.
gremlin> g.V(ids).bothE().where(otherV().hasId(within("19546208", "28729440", "42651832")))
==>e[10r7d8-h3rs0-i6t-bmxy8][28729440-reports_to->19546208]
==>e[10r7d8-h3rs0-i6t-bmxy8][28729440-reports_to->19546208]
==>e[128qvr-pe6d4-i6t-h3rs0][42651832-reports_to->28729440]
==>e[128qvr-pe6d4-i6t-h3rs0][42651832-reports_to->28729440]
gremlin> g.V(ids).bothE().where(otherV().hasId(within(19546208L, 28729440L, 42651832L)))
==>e[10r7d8-h3rs0-i6t-bmxy8][28729440-reports_to->19546208]
==>e[10r7d8-h3rs0-i6t-bmxy8][28729440-reports_to->19546208]
==>e[128qvr-pe6d4-i6t-h3rs0][42651832-reports_to->28729440]
==>e[128qvr-pe6d4-i6t-h3rs0][42651832-reports_to->28729440]
gremlin>
gremlin> g.V(ids).bothE().where(otherV().hasId(within(19546208L, 28729440L, 42651832L))).dedup()
==>e[10r7d8-h3rs0-i6t-bmxy8][28729440-reports_to->19546208]
==>e[128qvr-pe6d4-i6t-h3rs0][42651832-reports_to->28729440]
Any other suggestions? I am not sure why JanusGraph works this way.
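To make the duplicate rows above easier to see, here is a small plain-Python model of what bothE().where(otherV().hasId(within(...))) computes. It is only a sketch of the logic, not JanusGraph code; the helper names `both_e`, `other_v` and `edges_between` are invented for the example.

```python
# Edges from the question, as (out_vertex, label, in_vertex).
edges = [
    (42651832, "reports_to", 28729440),
    (28729440, "reports_to", 19546208),
]
ids = [19546208, 28729440, 42651832]


def both_e(vertex_id, edges):
    """All edges incident to vertex_id in either direction (like bothE())."""
    return [e for e in edges if vertex_id in (e[0], e[2])]


def other_v(vertex_id, edge):
    """The endpoint of edge that is not vertex_id (like otherV())."""
    return edge[2] if edge[0] == vertex_id else edge[0]


def edges_between(ids, edges):
    """Return (raw hits, deduplicated hits) for edges joining two listed ids."""
    wanted = set(ids)
    hits = [e for v in ids for e in both_e(v, edges) if other_v(v, e) in wanted]
    deduped = list(dict.fromkeys(hits))  # order-preserving, like dedup()
    return hits, deduped


hits, deduped = edges_between(ids, edges)
print(len(hits), len(deduped))
```

Each edge is reached once from each of its two endpoints, which is why every edge shows up twice before dedup().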
This isn't quite a full answer, but hopefully it gets you close enough. I used GraphOfTheGods to test it out.
This will get all the paths from the ids list to the quoted ids in hasId() and then output a list of all the edges traversed in each path. I added a limit for readability. You could easily add all the values to a set to get a deduped answer.
// Save all the graph of the gods vertex ids to a variable
ids = [4112,4128,4136,8232,12328,16424,20520,4296,4328,4344,8440,12536]
paths = g.V(ids).until(hasId("8440","12536")).repeat(bothE().aggregate("e").otherV().simplePath()).limit(3).select('e')
==>[e[74v-6ig-9hx-368][8440-battled->4112]]
==>[e[74v-6ig-9hx-368][8440-battled->4112],e[7xb-6ig-9hx-36o][8440-battled->4128],e[1l0-36o-b2t-9o8][4128-lives->12536],e[9vp-co8-bv9-36o][16424-pet->4128]]
==>[e[74v-6ig-9hx-368][8440-battled->4112],e[7xb-6ig-9hx-36o][8440-battled->4128],e[1l0-36o-b2t-9o8][4128-lives->12536],e[9vp-co8-bv9-36o][16424-pet->4128]]
What I was able to originally get was a full path with connecting vertexes which I am including in case it could potentially be helpful.
paths = g.V(ids).until(hasId("8440","12536")).repeat(bothE().otherV().simplePath()).path().limit(5)
==>[v[4112],e[74v-6ig-9hx-368][8440-battled->4112],v[8440]]
==>[v[4128],e[7xb-6ig-9hx-36o][8440-battled->4128],v[8440]]
==>[v[4128],e[1l0-36o-b2t-9o8][4128-lives->12536],v[12536]]
==>[v[4128],e[9vp-co8-bv9-36o][16424-pet->4128],v[16424],e[9hh-co8-b2t-9o8][16424-lives->12536],v[12536]]
==>[v[4128],e[9vp-co8-bv9-36o][16424-pet->4128],v[16424],e[8p1-co8-cnp-3co][16424-brother->4344],v[4344],e[6cf-6ig-7x1-3co][8440-father->4344],v[8440]]
Separately, I did some checking with GraphOfTheGods and the explain() step, and it definitely seems like a bug. If I pass the list as a variable, it performs an eq comparison instead of a within comparison.
paths = g.V(ids).until(hasId(ids)).repeat(out().simplePath()).limit(10).path().explain()
...RepeatStep(until([HasStep([~id.eq([4112, 4128, ...])])]),
whereas listing the ids individually in quotes properly does a within check.
paths = g.V(ids).until(hasId("8440","12536")).repeat(outE().simplePath()).limit(10).path().explain()
...RepeatStep(until([HasStep([~id.within([8440, 12536])])])
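The practical difference between the two explain() outputs can be modelled in a few lines of plain Python. This is only an illustration of eq versus within semantics, not the TinkerPop implementation; `has_id_eq` and `has_id_within` are hypothetical names.

```python
def has_id_eq(vertex_id, value):
    """Model of an eq comparison: the id must equal the value as a whole."""
    return vertex_id == value


def has_id_within(vertex_id, values):
    """Model of a within comparison: the id must be one of the values."""
    return vertex_id in values


ids = [8440, 12536]
print(has_id_eq(8440, ids))      # the id is compared to the whole list
print(has_id_within(8440, ids))  # the id is compared to each member
```

A single id can never equal a whole list, so an eq test against the list matches no vertex, while the within test matches any listed id.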

Gremlin: Adding Edges to Graph Over HTTP using Vertex Variables

I am trying to execute a Gremlin script over HTTPS against a remote JanusGraph instance. I have narrowed my problem down to the part where I am trying to add an edge using vertex variables. I am trying to add two vertices, assign the results to variables and use them to add an edge. I am also trying to avoid a single-line script like g.V().addV(..).addV(..).addE(..), because of the program logic that is behind the script.
The following gremlin works in the gremlin console (remote session)
def graph=ConfiguredGraphFactory.open("ga");
def g = graph.traversal();
v1=g.addV('node1');
v2=g.addV('node2');
v1.addE('test').to(v2);
But when I try to do the same over https (issued against a compose-janusgraph server), I get an error. I did add .iterate() to the addV() and the vertexes are getting added if I remove the addE(..) line. But when I try
{"gremlin":"def graph=ConfiguredGraphFactory.open('ga');
def g = graph.traversal();
v1=g.addV('node16').property('name','testn16').iterate();
v2=g.addV('node17').property('name','testn2').iterate();
v1.addE('test18').to(v2);
g.tx().commit()"}
I get the exception
The traversal strategies are complete and the traversal can no longer
be modulated","Exception-Class":"java.lang.IllegalStateException"
Also note that I am joining the whole gremlin into one single line before sending it over curl. I have split them to newlines here for readability. Any help would be great. -- Thank you
iterate() doesn't return a Vertex...it just iterates the traversal to generate side-effects (i.e. the graph gets a vertex added but no result is returned). You probably just need to do:
{"gremlin":"graph=ConfiguredGraphFactory.open('ga');
g = graph.traversal();
g.addV('node16').property('name','testn16').as('v1').
addV('node17').property('name','testn2').as('v2').
addE('test18').from('v1').to('v2').iterate();
g.tx().commit()"}
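On the point about joining the script into one line by hand: a JSON encoder can take care of that. The sketch below builds the request body in Python from the script above, assuming the server accepts the same {"gremlin": ...} shape shown in the question; it only prints the body and does not contact any server.

```python
import json

# The script from the answer above, kept on separate lines for readability.
script = "\n".join([
    "graph = ConfiguredGraphFactory.open('ga');",
    "g = graph.traversal();",
    "g.addV('node16').property('name','testn16').as('v1').",
    "  addV('node17').property('name','testn2').as('v2').",
    "  addE('test18').from('v1').to('v2').iterate();",
    "g.tx().commit()",
])

# json.dumps escapes the newlines, so the body is a single physical line
# while the script itself keeps its line structure.
body = json.dumps({"gremlin": script})
print(body)
```

The printed body can then be passed to curl or any HTTP client as the POST data.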

Query regarding the functionality of repeat().until()

Data: Tinkerpop Modern sample graph
Query:
gremlin> g.V(1).repeat(both()).until(hasLabel("person")).path().by("name")
==>[marko,vadas]
==>[marko,josh]
==>[marko,lop,marko]
==>[marko,lop,josh]
==>[marko,lop,peter]
What doc says:
...If until() comes after repeat() it is do/while looping. If until()
comes before repeat() it is while/do looping...
Doubt:
Shouldn't until() terminate the query after the first match, marko-vadas? Or am I missing something?
It terminates the traverser on this path, otherwise you would get [marko,vadas,marko] next. The other paths were found by other traversers. If you only care about the first path, add a .limit(1).
gremlin> g.V(1).repeat(both()).until(hasLabel("person")).limit(1).path().by("name")
==>[marko,vadas]
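The per-traverser behaviour can be reproduced with a small plain-Python model of the modern graph. This is a sketch of the do/while semantics only, not TinkerPop code; `both` and `repeat_until` are names invented for the example, and the loop would not terminate on a graph that has a cycle of vertices which never satisfy the stop condition.

```python
from collections import deque

# TinkerPop "modern" toy graph: vertex id -> (label, name), plus its edges.
vertices = {
    1: ("person", "marko"), 2: ("person", "vadas"), 3: ("software", "lop"),
    4: ("person", "josh"), 5: ("software", "ripple"), 6: ("person", "peter"),
}
edges = [(1, 2), (1, 4), (1, 3), (4, 5), (4, 3), (6, 3)]


def both(v):
    """Neighbours of v in either direction (like both())."""
    return [b if a == v else a for a, b in edges if v in (a, b)]


def repeat_until(start, stop):
    """Model of repeat(both()).until(stop): do/while, one traverser per path."""
    queue, finished = deque([[start]]), []
    while queue:
        path = queue.popleft()
        for nxt in both(path[-1]):         # the repeat body runs first...
            new_path = path + [nxt]
            if stop(nxt):                  # ...then until() is checked
                finished.append(new_path)  # only this traverser stops
            else:
                queue.append(new_path)     # the others keep looping
    return [[vertices[v][1] for v in p] for p in finished]


paths = repeat_until(1, lambda v: vertices[v][0] == "person")
for p in paths:
    print(p)
```

The traverser that reached lop is not a person, so it goes around the loop once more and produces the three longer paths, independently of the traversers that already stopped at vadas and josh.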

Printing/Fetching Vertex values from a path

Just getting started with gremlin.
Printing out all the Vertex values worked out fine
gremlin> g.V().values()
==>testing 2
==>Cash Processing
==>Sales
==>Marketing
==>Accounting
I was able to find all the directly connected path between my Vertices.
gremlin> g.V().hasLabel('Process')
.repeat(both().simplePath())
.until(hasLabel('Process'))
.dedup().path()
==>[v[25],v[28]]
==>[v[25],v[26]]
==>[v[26],v[27]]
==>[v[26],v[25]]
Now I am trying to print out the values in the path, like ['Sales', 'Accounting'] instead of [v[25],v[28]].
I have not been able to figure out a way yet.
I already tried and failed with:
unfold(): does not give me a 1-1 mapping
gremlin> g.V().hasLabel('Process').repeat(both().simplePath()).until(hasLabel('Process')).dedup().path().unfold().values()
==>Cash Processing
==>Accounting
==>Cash Processing
==>Sales
==>Sales
==>Marketing
==>Sales
==>Cash Processing
Path seems to be a different data type and does not support the .values() step
gremlin> g.V().hasLabel('Process')
.repeat(both().simplePath())
.until(hasLabel('Process'))
.dedup().path().values()
org.apache.tinkerpop.gremlin.process.traversal.step.util.ImmutablePath cannot be cast to org.apache.tinkerpop.gremlin.structure.Element
I tried the following Google searches and didn't get the answer:
gremlin print a path
gremlin get values in a path
and a few more variations of the wording
I found one answer that was for Java, but it didn't work for me:
l = []; g.V().....path().fill(l)
(but I can't create the list: Cannot set readonly property: list for class: org.apache.tinkerpop.gremlin.structure.VertexProperty$Cardinality
)
I am running it on the Gremlin console (running ./gremlin.sh)
You can use the by step to modulate the elements inside the path. For example by supplying valueMap(true) to by you get the properties of the vertices, together with the vertex labels and their ids:
gremlin> g.V().repeat(both().simplePath()).times(1).dedup().path().by(valueMap(true))
==>[[id:1,name:[marko],label:person,age:[29]],[id:3,name:[lop],lang:[java],label:software]]
==>[[id:1,name:[marko],label:person,age:[29]],[id:2,name:[vadas],label:person,age:[27]]]
==>[[id:1,name:[marko],label:person,age:[29]],[id:4,name:[josh],label:person,age:[32]]]
==>[[id:2,name:[vadas],label:person,age:[27]],[id:1,name:[marko],label:person,age:[29]]]
==>[[id:3,name:[lop],lang:[java],label:software],[id:6,name:[peter],label:person,age:[35]]]
==>[[id:4,name:[josh],label:person,age:[32]],[id:5,name:[ripple],lang:[java],label:software]]
I used the modern graph, which is one of TinkerPop's toy graphs that are often used for such examples. Your output will look a bit different, and you may want to use something other than valueMap(true) for the by modulator. The TinkerPop documentation of the path step itself contains two more advanced examples for path().by() that you might want to check out.
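What a by modulator does to a path can be pictured with a short plain-Python model. It is only an illustration of the round-robin idea, not the TinkerPop implementation; `path_by` is an invented helper and the `names` table simply holds the vertex names of the modern graph.

```python
from itertools import cycle

# Lookup table for this sketch: vertex id -> name in the modern graph.
names = {1: "marko", 2: "vadas", 3: "lop", 4: "josh", 5: "ripple", 6: "peter"}


def path_by(path, *modulators):
    """Apply the modulators to the path elements in round-robin order."""
    return [fn(element) for element, fn in zip(path, cycle(modulators))]


paths = [[1, 3], [1, 2], [4, 5]]
named = [path_by(p, names.get) for p in paths]
print(named)
```

With a single modulator every element of every path is mapped the same way, which gives the 1-1 mapping that unfold() loses; with several modulators they are applied in turn along the path.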
