How to exclude vertices based on edges using Gremlin/IBM Graph

I'm working with IBM Graph and using Gremlin to try to filter certain vertices. My edges have two labels, required and optional. This is the situation:
      V3 -- V5 -- V6
     /  \
   V1    \
     \    \
      V2 -- V4
This is a little like a supply/demand chain. V1 is an object that supplies two types of supplies (V3 and V2). V4 needs both supplies to be able to work. V5 needs both V3 and V6 to be able to work. Since I only have V2 and V3, I need a query that lets me move from V2 and V3 to each outgoing vertex but excludes a vertex if it has a required incoming edge that I have not traversed (while still allowing optional edges).
After much trying this is what a colleague came up with:
def g = graph.traversal(); g.V(1).out().outE().aggregate('edges').inV().where(inE('required').where(not(within('edges'))).count().is(eq(0))).dedup()
Is this the best way to do this? Or is there a smarter way?

Assuming this is your graph:
gremlin> g.addV().property(id,1).as('1').
......1> addV().property(id,2).as('2').
......2> addV().property(id,3).as('3').
......3> addV().property(id,4).as('4').
......4> addV().property(id,5).as('5').
......5> addV().property(id,6).as('6').
......6> addE('supplies').from('1').to('2').
......7> addE('supplies').from('1').to('3').
......8> addE('required').from('2').to('4').
......9> addE('required').from('3').to('4').
.....10> addE('required').from('3').to('5').
.....11> addE('required').from('6').to('5').iterate()
and this is the expected output:
gremlin> g.V(1).
......1> out().
......2> outE().
......3> aggregate('edges').
......4> inV().
......5> where(inE('required').
......6> where(not(within('edges'))).
......7> count().is(eq(0))).
......8> dedup()
==>v[4]
Then aggregating the already traversed edges is probably the best approach. (It's always best to include a sample graph as a Gremlin script in your questions.) It is worth noting that you don't need the eq() in your is(), and that not(within(...)) is just without(...):
gremlin> g.V(1).
......1> out().
......2> outE().
......3> aggregate('edges').
......4> inV().
......5> where(inE('required').
......6> where(without('edges')).
......7> count().is(0)).
......8> dedup()
==>v[4]
Or just do away with counting altogether, since you want those vertices that return no new edges to traverse:
gremlin> g.V(1).
......1> out().
......2> outE().
......3> aggregate('edges').
......4> inV().
......5> not(inE('required').
......6> where(without('edges'))).
......7> dedup()
==>v[4]
The above approach might be better, as the first edge returned by your filter of inE('required').where(not(within('edges'))) will immediately filter the vertex out, so you don't have to wait for a count of all edges.
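To make the filtering rule concrete, here is a minimal plain-Python sketch of what the traversal computes on the sample graph above. It is only a model of the logic, not Gremlin and not a description of how a graph engine evaluates the query. The edge tuples and the function name satisfied_vertices are assumptions made for illustration.

```python
# Edges of the sample graph as (label, from_vertex, to_vertex) tuples.
EDGES = [
    ("supplies", 1, 2),
    ("supplies", 1, 3),
    ("required", 2, 4),
    ("required", 3, 4),
    ("required", 3, 5),
    ("required", 6, 5),
]


def satisfied_vertices(edges, start):
    """Return the vertices two hops from `start` whose incoming
    'required' edges were all traversed on the way there."""
    # Models g.V(start).out(): the suppliers reachable from the start vertex.
    suppliers = {dst for (_, src, dst) in edges if src == start}
    # Models outE().aggregate('edges'): every edge leaving one of those suppliers.
    traversed = [e for e in edges if e[1] in suppliers]
    # Models inV().dedup(): the distinct vertices those edges point to.
    candidates = sorted({dst for (_, _, dst) in traversed})
    result = []
    for v in candidates:
        # Models inE('required').where(without('edges')):
        # required incoming edges that were not traversed.
        untraversed = [
            e for e in edges
            if e[0] == "required" and e[2] == v and e not in traversed
        ]
        # Models not(...): keep the vertex only if no such edge exists.
        if not untraversed:
            result.append(v)
    return result


print(satisfied_vertices(EDGES, 1))  # [4]
```

Vertex 5 is dropped because its required edge from vertex 6 was never traversed, while vertex 4 is kept because both of its required edges come from vertices 2 and 3.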

Related

Using a back step to get two connected vertexes

If I have three vertices A, B, and C, where B has an edge to both A and C, how can I get the values for A and C starting from B?
g.V("b").out("toC").as("c").out("toA").as("a").select("c", "a").next()
This is what I have but it causes an error because I don't think you can go out to A from C since they aren't connected. I need a way to go back to B first but there is no back step that I have seen.
Using this graph
gremlin> g.addV('A').as('a').
......1> addV('B').as('b').
......2> addV('C').as('c').
......3> addE('toA').from('b').to('a').
......4> addE('toC').from('b').to('c')
==>e[42783][42780-toC->42781]
You can find the vertices connected to B using
gremlin> g.V().hasLabel('B').out().elementMap()
==>[id:42774,label:A]
==>[id:42776,label:C]
You can also filter using specific edge labels in cases where there are lots of edges from B and you only want specific ones:
gremlin> g.V().hasLabel('B').out('toA','toC').elementMap()
==>[id:42774,label:A]
==>[id:42776,label:C]
If you really do need to write the query so that it works the way you showed in the question, then this is one way:
gremlin> g.V().hasLabel('B').as('b').
......1> out('toA').as('a').
......2> select('b').
......3> out('toC').as('c').
......4> select('a','c').
......5> by(elementMap())
==>[a:[id:42779,label:A],c:[id:42781,label:C]]
You can also try:
gremlin> g.V().hasLabel('B').
......1> outE().hasLabel('toA','toC').
......2> inV().elementMap()
==>[id:0,label:A]
==>[id:2,label:C]

Gremlin drop deletes only the first vertex of the sub-graph

I have a graph where A is connected to B,C,D with edges.
I'm using the following to delete A and its connected vertexes.
g.V('A').
  union(__(),
        repeat(out()).emit()).drop()
But it deletes only A.
When I run it without the drop() it returns a list of all the sub-graph vertexes.
I'm using the tinkerpop console/server version 3.4.9.
You need to collect the results of the union step before calling drop. Here is the query slightly modified.
gremlin> g.addV('A').as('a').
......1> addV('B').as('b').
......2> addV('C').as('c').
......3> addV('D').as('d').
......4> addE('knows').from('a').to('b').
......5> addE('knows').from('b').to('c').
......6> addE('knows').from('c').to('d')
==>e[41][37-knows->38]
gremlin> g.V().hasLabel('A').
......1> union(identity(),
......2> repeat(out()).emit()).fold()
==>[v[35],v[36],v[37],v[38]]
gremlin> g.V().hasLabel('A').
......1> union(identity(),
......2> repeat(out()).emit()).fold().unfold().drop()
gremlin> g
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
Note also that this can be done without the need for a union step by moving the emit before the repeat and using store.
g.V().hasLabel('A').
  emit().
  repeat(store('v').out()).
  cap('v').
  unfold().
  drop()
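The "collect first, then delete" pattern used by both queries can be sketched in plain Python. This is only an illustrative model with a dictionary standing in for the graph. It does not explain TinkerPop's internal behaviour, and the names collect_subgraph and drop_subgraph are hypothetical.

```python
def collect_subgraph(out_edges, start):
    """Return `start` plus every vertex reachable from it along outgoing edges."""
    collected = []
    frontier = [start]
    while frontier:
        vertex = frontier.pop(0)
        if vertex in collected:
            continue  # cycle guard added for this sketch only
        collected.append(vertex)
        frontier.extend(out_edges.get(vertex, []))
    return collected


def drop_subgraph(out_edges, start):
    """Delete the collected vertices and any edges that point at them."""
    doomed = collect_subgraph(out_edges, start)  # collect everything first
    for vertex in doomed:                        # only then start deleting
        out_edges.pop(vertex, None)
    for source in out_edges:
        out_edges[source] = [t for t in out_edges[source] if t not in doomed]
    return doomed


# Same shape as the sample graph: A -> B -> C -> D.
graph = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}
print(drop_subgraph(graph, "A"))  # ['A', 'B', 'C', 'D']
print(graph)                      # {}
```

The point of the sketch is the ordering: the full list of vertices exists before anything is removed, so deleting A cannot cut off the path to B, C, and D.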

Is there an efficient way to see how many edges and vertices were created in a Gremlin upsert?

Using a query like the following
g.V().has('person','name','marko').
  fold().
  coalesce(unfold(),
           addV('person').property('name','marko')).
  property('age',29)
Is there an efficient way to also return how many vertices were created?
I.e. 0 in the case that the marko vertex exists, 1 in the case that the marko vertex does not exist.
I think I'd go with union():
gremlin> g.V().has('person','name','marko').
......1> fold().
......2> union(coalesce(unfold(),
......3> addV('person').property('name','marko')).
......4> property('age',29),
......5> count(local))
==>v[1]
==>1
gremlin> g.V().has('person','name','darko').
......1> fold().
......2> union(coalesce(unfold(),
......3> addV('person').property('name','darko')).
......4> property('age',29),
......5> count(local))
==>v[13]
==>0
This is the reverse of what you asked for: it returns 1 when the vertex already exists and 0 when it had to be created. That would be easy enough to remedy if you really needed to, but it would add complexity to the traversal around count(local), which is pretty easy to read as it is.
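As a rough illustration of the bookkeeping, here is a plain-Python model of the upsert that reports the "created" count the question asked for. It is a sketch over an in-memory list, not Gremlin, and the function name upsert_person is hypothetical.

```python
def upsert_person(people, name, age):
    """Upsert a person by name and return (record, created_count)."""
    # Models V().has('person','name',name).fold(): a list of zero or more matches.
    matches = [p for p in people if p["name"] == name]
    # count(local) on that list is non-zero when the person exists,
    # so the "created" count is the flipped value.
    created = 0 if matches else 1
    if matches:
        person = matches[0]          # models the unfold() branch of coalesce()
    else:
        person = {"name": name}      # models the addV() branch of coalesce()
        people.append(person)
    person["age"] = age              # models property('age', age)
    return person, created


people = [{"name": "marko"}]
print(upsert_person(people, "marko", 29)[1])  # 0
print(upsert_person(people, "darko", 29)[1])  # 1
```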

How to use less than inside option of CHOOSE step in gremlin

.choose(values('points')
.option(lt(500), updating the vertex properties)
.option(gt(500), updating and creating new properties)
.option(none,creating new vertex ))
points is vertex property of type integer
I want to perform less-than and greater-than operations inside option(). I am trying it the above way. Please correct me if I am wrong.
If there is any other way to perform this please let me know.
Thank you
This feature hasn't been implemented as of Apache TinkerPop 3.4.2. It is a long-standing open issue: TINKERPOP-1084. I think that if you want this functionality, you're going to have to do a nested choose():
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V().hasLabel('person').
......1> choose(values('age').is(lt(29)),
......2> constant('lt29'),
......3> choose(values('age').is(gt(29)),
......4> constant('gt29'),
......5> constant('29')))
==>29
==>lt29
==>gt29
==>gt29
Note that you aren't restricted to constant() as a result of these nested choose() operations. You can add any anonymous traversal you like there as in:
gremlin> g.V().hasLabel('person').
......1> choose(values('age').is(lt(29)),
......2> constant('lt29'),
......3> choose(values('age').is(gt(29)),
......4> math('_ - 29').by('age'),
......5> constant('29')))
==>29
==>lt29
==>3.0
==>6.0
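The nested choose() reads like an if / else-if / else chain. A plain-Python equivalent of the second traversal, given only as a sketch of the branching logic, looks like this. The function name classify is hypothetical, and the ages are those of the four person vertices in the modern toy graph, listed in the order of the output above.

```python
def classify(age):
    """Mirror the nested choose() as an if / else-if / else chain."""
    if age < 29:                 # outer choose: values('age').is(lt(29))
        return "lt29"
    if age > 29:                 # inner choose: values('age').is(gt(29))
        return float(age - 29)   # math('_ - 29').by('age') produces a double
    return "29"                  # neither predicate matched


ages = [29, 27, 32, 35]
print([classify(a) for a in ages])  # ['29', 'lt29', 3.0, 6.0]
```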

Add different values to each of multiple vertices

I know that I can select multiple specific vertices by id in a single step, for example (using gremlin-python syntax):
g.V([1, 2, 3]).toList()
>>> [v[1], v[2], v[3]]
How can I then add a property with different values to each of these selected items? For example, I would like that:
Vertex 1 has property 'x' with value 'a'
Vertex 2 has property 'x' with value 'b'
Vertex 3 has property 'x' with value 'c'
So I'd like to be able to say something like:
g.V([1, 2, 3]).<add property 'x' with values ['a', 'b', 'c']>.next()
Is it possible?
You could use choose() to accomplish something like that which will give you if-then-else type branching. Using the modern toy graph that ships with TinkerPop distributions:
gremlin> g.V(1,2,3).
......1> choose(id()).
......2> option(1, property('switch','a')).
......3> option(2, property('switch','b')).
......4> option(3, property('switch','c')).iterate()
gremlin> g.V(1,2,3).
......1> project('id','switch').
......2> by(id).
......3> by('switch')
==>[id:1,switch:a]
==>[id:2,switch:b]
==>[id:3,switch:c]
As of TinkerPop 3.3.3, you could also do something like this which uses a side-effect:
gremlin> g.withSideEffect('x',[1:'a',2:'b',3:'c']).
......1> V(1,2,3).as('v').
......2> property('switch', select('x').
......3> select(select('v').id()))
==>v[1]
==>v[2]
==>v[3]
gremlin> g.V(1,2,3).
......1> project('id','switch').
......2> by(id).
......3> by('switch')
==>[id:1,switch:a]
==>[id:2,switch:b]
==>[id:3,switch:c]
You need 3.3.3+ because you need the ability to select() runtime generated keys which was only added in that version.
I don't think you can use the keys of the Map "x" to handle the lookup of the vertices without a lambda (maybe I'm not thinking of something simple), so unfortunately you have to specify the ids twice. You could parameterize that pretty easily, though, because you just need to construct the Map and then grab the .keys() from it to give to V().
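The parameterization idea can be sketched in plain Python: build the map once and derive the id list from its keys, so the ids are only written down in one place. This is a model over an in-memory dictionary rather than a gremlin-python traversal, and the names values_by_id and assign_property are hypothetical.

```python
# The single source of truth: vertex id -> value for the new property.
values_by_id = {1: "a", 2: "b", 3: "c"}

# The ids that would be handed to V(...) come straight from the map's keys.
ids = list(values_by_id.keys())


def assign_property(vertices, key, values):
    """Set `key` on each vertex whose id appears in `values`."""
    for vertex_id in values:
        vertices[vertex_id][key] = values[vertex_id]
    return vertices


# Stand-in for the three selected vertices.
vertices = {vertex_id: {} for vertex_id in ids}
print(assign_property(vertices, "switch", values_by_id))
# {1: {'switch': 'a'}, 2: {'switch': 'b'}, 3: {'switch': 'c'}}
```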
