Choose between 2 available edges during traversal - gremlin

I'm relatively new to Gremlin, and the company I'm with is looking to implement a graph database with some temporary edges in it. Each vertex could have one or more edges, and each edge would have a property that is essentially isTemporary true/false.
When traversing the graph, if an edge with "isTemporary" = true exists we should follow that edge; otherwise we should follow the edge where "isTemporary" = false.
I.e.,
A-[isTemporary:true, value 1] -> B
A-[isTemporary:false, value 2] -> C
B-[isTemporary: false, value 3] -> D
Running a single gremlin query should return A->B->D in this case.
I've looked through TinkerPop3 documentation, and it seems like "choose" may be what I want to use here, but all the examples seem to return a value, when what I want is a traversal to come back so I can repeatedly act on the traversal.
Any help would be appreciated.

You could be looking for the coalesce step.
Considering this graph:
g.addV().as('a').property('name', 'A').
addV().as('b').property('name', 'B').
addV().as('c').property('name', 'C').
addV().as('d').property('name', 'D').
addE('someLink').from('a').to('b').
property('isTemporary', true).property('value', 1).
addE('someLink').from('a').to('c').
property('isTemporary', false).property('value', 2).
addE('someLink').from('b').to('d').
property('isTemporary', false).property('value', 3)
The following query will return all paths from A to D, attempting to traverse via isTemporary: true edges if present, or via isTemporary: false edges otherwise (coalesce step), iteratively.
g.V().has('name', 'A').
repeat(
coalesce(
outE().has('isTemporary', true).inV(),
outE().has('isTemporary', false).inV()
)
).
until(has('name', 'D')).
path().by('name')
Result:
==>[A,B,D]
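To make the fallback behaviour of coalesce() inside repeat() more concrete, here is a minimal plain-Python model of the traversal above. It is a sketch of the semantics, not Gremlin itself: the edge list mirrors the sample graph, while the names `next_vertices`, `follow_preferred` and the `max_hops` guard are assumptions added for illustration.

```python
# Plain-Python model of repeat(coalesce(temporary, permanent)).until(name == target).
# EDGES mirrors the sample graph; the function names and max_hops are illustrative.
EDGES = [
    ("A", "B", {"isTemporary": True, "value": 1}),
    ("A", "C", {"isTemporary": False, "value": 2}),
    ("B", "D", {"isTemporary": False, "value": 3}),
]


def next_vertices(vertex, edges):
    """Return targets of temporary out-edges if any exist, else of permanent ones."""
    temporary = [dst for src, dst, props in edges
                 if src == vertex and props["isTemporary"]]
    if temporary:
        return temporary
    return [dst for src, dst, props in edges
            if src == vertex and not props["isTemporary"]]


def follow_preferred(start, target, edges, max_hops=10):
    """Collect every path from start to target, choosing edges as coalesce() would."""
    finished = []
    frontier = [[start]]
    for _ in range(max_hops):
        next_frontier = []
        for path in frontier:
            for vertex in next_vertices(path[-1], edges):
                new_path = path + [vertex]
                if vertex == target:
                    finished.append(new_path)   # until() condition met
                else:
                    next_frontier.append(new_path)
        frontier = next_frontier
    return finished


print(follow_preferred("A", "D", EDGES))  # [['A', 'B', 'D']]
```

The point of the model is that the second branch is only consulted when the first yields nothing, which is why C is never visited from A.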

Related

ArangoDB: Find last node in path

I'm pretty new to ArangoDB and I'm trying to get the last/leaf node (I guess vertex) in a graph. Given the following graph:
I want to start the traversal at 6010142. The query should return 6010625 because it is the last node that can be reached via 6010145. What would the query look like?
I already tried:
FOR v, e, p IN 1..5 OUTBOUND {_id: 'nodes/6010142'} GRAPH 'test' RETURN v
But it also returns 6010145. Furthermore, it is limited to a maxDepth of 5, but my graph can exceed that limit, so I also need a solution that works for any depth. Hopefully someone can help me :-)
I'm also just starting out with AQL but maybe this can help.
FOR v IN 1..5 OUTBOUND {_id: 'nodes/6010142'} GRAPH 'test' OPTIONS {uniqueVertices: 'global', bfs: true}
FILTER LENGTH(FOR vv IN OUTBOUND v GRAPH 'test' LIMIT 1 RETURN 1) == 0
RETURN v
This approach follows an older ArangoDB cookbook (p. 39) for finding leaf nodes. The FILTER line takes each connected node found by the first line and runs a second traversal to check whether it is actually a leaf, i.e. has no outbound edges.
The OPTIONS {uniqueVertices: 'global', bfs: true} part is an optimization if you are only interested in unique leaf nodes and not all the different paths to those nodes.
Regarding maxDepth I would just use a sufficiently high number. The worst case would be the number of nodes in your graph.
(The graph you posted and your description seem to disagree about the direction of the edges. Maybe you need to use INBOUND.)
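For readers who want to see the leaf test without a database, here is a small plain-Python sketch of the same idea: walk breadth-first, visit each vertex once, and keep only vertices with no outbound edges. The adjacency map is an assumption loosely based on the IDs mentioned in the question (the actual graph image is not available), and `reachable_leaves` is a made-up name.

```python
from collections import deque

# Assumed adjacency based on the IDs in the question; the real graph is not shown.
OUTBOUND = {
    "nodes/6010142": ["nodes/6010145"],
    "nodes/6010145": ["nodes/6010625"],
    "nodes/6010625": [],
}


def reachable_leaves(start, outbound):
    """Breadth-first walk that visits each vertex once and keeps those with no out-edges."""
    seen = {start}
    queue = deque([start])
    leaves = []
    while queue:
        vertex = queue.popleft()
        for neighbour in outbound.get(vertex, []):
            if neighbour in seen:
                continue  # comparable to uniqueVertices: 'global'
            seen.add(neighbour)
            queue.append(neighbour)
            if not outbound.get(neighbour):
                leaves.append(neighbour)  # no outbound edges, so it is a leaf
    return leaves


print(reachable_leaves("nodes/6010142", OUTBOUND))  # ['nodes/6010625']
```

Like the AQL query, this skips the start vertex itself (minimum depth 1) and has no depth limit, since the loop simply ends when nothing new is reachable.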

Repeat in gremlin

Two questions related to Gremlin:
I want to stop the traversal when a condition is satisfied during the repeated condition check.
g.V().has('label_','A').emit().repeat(inE().outV()).until(has('stop',1)).project('depth','values').by(valueMap('label_','stop'))
I want the query to stop returning further values once a node with stop equal to 1 is encountered during the repeat step. Instead, the query doesn't stop and returns all the records.
Output required:
=>{label_='A',stop=0}
=>{label_='B',stop=0}
=>{label_='C',stop=1}
The second is a query that returns traversal values in the following format when an edge exists between two vertices. Given the graph A->E1->B->E2->C, the output must be as follows:
=> A,E1,B
=> B,E2,C
A, B, C, E1 and E2 represent the properties of the respective elements, where A is the starting node.
For the first part, it seems you are traversing the in edges rather than the out edges. Is this on purpose? If so, replace the out() in the repeat with in():
g.V().has(label, 'A').emit().
repeat(out()).until(has('stop', 1)).
project('label', 'stop').
by(label).
by(values('stop'))
example: https://gremlify.com/ma2xkkszkzr/1
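As a rough mental model of emit().repeat(out()).until(has('stop', 1)), the following plain-Python sketch emits the start vertex, then keeps stepping out and emitting, and stops expanding a branch once it reaches a vertex whose stop value is 1. The graph data is hypothetical (the question does not give one), `walk_until_stop` is a made-up name, and the sketch assumes the graph has no cycles.

```python
# Hypothetical chain A -> B -> C -> D; STOP maps each label to its 'stop' property.
STOP = {"A": 0, "B": 0, "C": 1, "D": 0}
OUT = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}


def walk_until_stop(start, out, stop):
    """Emit the start vertex and every vertex reached, halting a branch at stop == 1."""
    results = [{"label": start, "stop": stop[start]}]  # emit() placed before repeat()
    frontier = [start]
    while frontier:  # assumes an acyclic graph, otherwise this would not terminate
        next_frontier = []
        for vertex in frontier:
            for neighbour in out.get(vertex, []):
                results.append({"label": neighbour, "stop": stop[neighbour]})
                if stop[neighbour] != 1:
                    next_frontier.append(neighbour)  # until() not yet satisfied
        frontier = next_frontier
    return results


for row in walk_until_stop("A", OUT, STOP):
    print(row)
```

With this data the walk yields A, B and C and never reaches D, which matches the output requested in the question.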
For the second part, I'm still not sure what you meant. If you just want to get all edges with their out and in vertices, you can use elementMap():
g.E().elementMap()
example: https://gremlify.com/ma2xkkszkzr/4
If elementMap() is not supported, you can maybe do something like this:
g.E().local(union(
outV(),
identity(),
inV()
).label().fold())
example: https://gremlify.com/ma2xkkszkzr/2
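The shape that the local(union(outV(), identity(), inV()).label().fold()) query is meant to produce can be sketched in plain Python as one three-element list per edge. The vertex ids, labels and the `edge_triples` helper below are hypothetical and only mirror the A->E1->B->E2->C example from the question.

```python
# Hypothetical data for A -E1-> B -E2-> C; ids are arbitrary.
VERTEX_LABELS = {1: "A", 2: "B", 3: "C"}
EDGES = [
    {"label": "E1", "out": 1, "in": 2},
    {"label": "E2", "out": 2, "in": 3},
]


def edge_triples(edges, vertex_labels):
    """For each edge, list the out-vertex label, the edge label and the in-vertex label."""
    return [
        [vertex_labels[edge["out"]], edge["label"], vertex_labels[edge["in"]]]
        for edge in edges
    ]


for triple in edge_triples(EDGES, VERTEX_LABELS):
    print(triple)
# ['A', 'E1', 'B']
# ['B', 'E2', 'C']
```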

How do I produce output even when there is no edge and when using select for projection

Can someone help me please with this simple query...Many thanks in advance...
I am using the following gremlin query and it works well giving me the original vertex (v) (with id-=12345), its edges (e) and the child vertex (id property). However, say if the original vertex 'v' (with id-12345) has no outgoing edges, the query returns nothing. I still want the properties of the original vertex ('v') even if it has no outgoing edges and a child. How can I do that?
g.V().has('id', '12345').as('v').
outE().as('e').
inV().
as('child_v').
select('v', 'e', 'child_v').
by(valueMap()).by(id).by(id)
There are a couple of things going on here, but the main change the traversal needs is a project() step instead of select().
The select() and project() steps are similar in that both let you format the results of a traversal, but they differ in (at least) one significant way. select() lets you access previously traversed and labeled elements (via as()). project() takes the current traverser and branches it to shape the output moving forward.
In your original traversal, when the original v has no outgoing edges, all the traversers are filtered out at the outE() step. With no traversers left after outE(), the remainder of the traversal has no input stream, so there is no data to return. If you use a project() step after the original v, you can return the original traverser as well as the edges and the incident vertices. This leads to a slight complication when no out edges exist: Gremlin does not handle null values, such as no out edges existing, so you need to return some constant value for those by() modulators using a coalesce() step.
Here is a functioning version of this traversal:
g.V().hasId(3).
project('v', 'e', 'child_v').
by(valueMap()).
by(coalesce(outE().id(), constant(''))).
by(coalesce(out().id(), constant('')))
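The effect of project() with coalesce(..., constant('')) can be modelled in plain Python as "always build the row, and substitute a default when a lookup comes back empty". This is only a sketch of the semantics with made-up data and a hypothetical `project_vertex` helper. It takes the first edge only because, as far as I understand, a by() modulator consumes just the first result of its inner traversal.

```python
# Hypothetical data: vertex 12345 has one out-edge, vertex 99999 has none.
PROPERTIES = {
    "12345": {"name": ["parent"]},
    "67890": {"name": ["child"]},
    "99999": {"name": ["loner"]},
}
OUT_EDGES = {"12345": [{"id": "e1", "to": "67890"}]}


def project_vertex(vertex_id, properties, out_edges, default=""):
    """Always return a row for the vertex, using a default when it has no out-edges."""
    edges = out_edges.get(vertex_id, [])
    first = edges[0] if edges else None  # by() only uses the first result
    return {
        "v": properties[vertex_id],
        "e": first["id"] if first else default,        # coalesce(outE().id(), constant(''))
        "child_v": first["to"] if first else default,  # coalesce(out().id(), constant(''))
    }


print(project_vertex("12345", PROPERTIES, OUT_EDGES))
print(project_vertex("99999", PROPERTIES, OUT_EDGES))
```

The second call still returns the vertex properties, with empty strings standing in for the missing edge and child.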
The original query returns a lot of duplicate data: you get the vertex properties once per edge. It will probably be better to use project():
g.V('12345').project('v', 'children').
by(valueMap()).
by(outE().as('e').
inV().as('child').
select('e', 'child').by(id).fold())
example: https://gremlify.com/a1
You can get the original data format if you do something like this:
g.V('12345').as('v').
coalesce(
outE().as('e').
inV().
as('child_v').
select('v', 'e', 'child_v').
by(valueMap()).by(id).by(id),
project('v').by(valueMap())
)
example: https://gremlify.com/a2

Add edge if not exist using gremlin

I'm using the Cosmos DB graph database in Azure.
Does anyone know if there is a way to add an edge between two vertices only if it doesn't already exist (using a Gremlin graph query)?
I can do that when adding a vertex, but not with edges. I took the code to do so from here:
g.inject(0).coalesce(__.V().has('id', 'idOne'), addV('User').property('id', 'idOne'))
Thanks!
It is possible to do with edges. The pattern is conceptually the same as vertices and centers around coalesce(). Using the "modern" TinkerPop toy graph to demonstrate:
gremlin> g.V().has('person','name','vadas').as('v').
V().has('software','name','ripple').
coalesce(__.inE('created').where(outV().as('v')),
addE('created').from('v').property('weight',0.5))
==>e[13][2-created->5]
Here we add an edge between "vadas" and "ripple", but only if it doesn't already exist. The key here is the check in the first argument to coalesce().
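Stripped of Gremlin syntax, the coalesce() pattern is a "get or create": look for the edge first and only add it when the lookup returns nothing. Here is a minimal plain-Python sketch of that logic over an in-memory edge list. The dictionary layout and the `add_edge_if_missing` helper are assumptions for illustration, not part of any graph API.

```python
def add_edge_if_missing(edges, source, label, target, **properties):
    """Return (edge, created): reuse a matching directed edge or append a new one."""
    for edge in edges:
        # First coalesce() branch: an edge source -[label]-> target already exists.
        if edge["from"] == source and edge["label"] == label and edge["to"] == target:
            return edge, False
    # Second coalesce() branch: nothing found, so create the edge.
    edge = {"from": source, "label": label, "to": target, **properties}
    edges.append(edge)
    return edge, True


edges = []
_, created_first = add_edge_if_missing(edges, "vadas", "created", "ripple", weight=0.5)
_, created_second = add_edge_if_missing(edges, "vadas", "created", "ripple", weight=0.5)
print(created_first, created_second, len(edges))  # True False 1
```

Running the call twice leaves a single edge, which is the idempotent behaviour the question asks for.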
The performance of the accepted answer isn't great since it uses inE(...), which is an expensive operation.
This query is what I use for my work in CosmosDB:
g.E(edgeId).
fold().
coalesce(
unfold(),
g.V(sourceId).
has('pk', sourcePk).
as('source').
V(destinationId).
has('pk', destinationPk).
addE(edgeLabel).
from('source').
property(T.id, edgeId)
)
This uses the id and partition keys of each vertex for cheap lookups.
I have been working on similar issues, trying to avoid duplication of vertices or edges. The first is a rough example of how I check to make sure I am not duplicating a vertex:
"g.V().has('word', 'name', '%s').fold()"
".coalesce(unfold(),"
"addV('word')"
".property('name', '%s')"
".property('pos', '%s')"
".property('pk', 'pk'))"
% (re.escape(category_),re.escape(category_), re.escape(pos_))
The second one is how I make sure there isn't already an edge in either direction. I make use of two coalesce statements, one nested inside the other:
"x = g.V().has('word', 'name', '%s').next()\n"
"y = g.V().has('word', 'name', '%s').next()\n"
"g.V(y).bothE('distance').has('weight', %f).fold()"
".coalesce("
"unfold(),"
"g.addE('distance').from(x).to(y).property('weight', %f)"
")"
% (word_1, word_2, weight, weight)
So, if the edge y -> x exists, it skips producing another one. If y -> x doesn't exist, it tests whether x -> y exists. If not, it falls through to the final option of creating x -> y.
Let me know if anyone here knows of a more concise solution. I am still very new to Gremlin and would love a cleaner answer, though this one appears to suffice.
When I implemented the previous solutions and ran my code twice, each run produced a new edge, because they only test one direction before creating a new edge.
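The "either direction" check can be modelled in plain Python by comparing the unordered pair of endpoints instead of (from, to). The sketch below is an illustration with a hypothetical `add_undirected_edge_once` helper. Note that it also verifies the other endpoint of the existing edge, which is an assumption about the intent: the string-built query above only looks for any 'distance' edge with the given weight on y.

```python
def add_undirected_edge_once(edges, x, y, label, weight):
    """Return (edge, created): skip creation if x-y or y-x already exists."""
    wanted = frozenset((x, y))  # unordered pair, so direction is ignored
    for edge in edges:
        same_pair = frozenset((edge["from"], edge["to"])) == wanted
        if same_pair and edge["label"] == label and edge["weight"] == weight:
            return edge, False
    edge = {"from": x, "to": y, "label": label, "weight": weight}
    edges.append(edge)
    return edge, True


edges = []
_, created_forward = add_undirected_edge_once(edges, "cat", "dog", "distance", 0.25)
_, created_reverse = add_undirected_edge_once(edges, "dog", "cat", "distance", 0.25)
print(created_forward, created_reverse, len(edges))  # True False 1
```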

Use Gremlin to find the shortest path in a graph avoiding a given list of vertices?

I need to use Gremlin to find the shortest path between two nodes (vertices) while avoiding a list of given vertices.
I already have:
v.bothE.bothV.loop(2){!it.object.equals(y)}.paths>>1
To get my shortest path.
I was attempting something like:
v.bothE.bothV.filter{it.name!="ignored"}.loop(3){!it.object.equals(y)}.paths>>1
but it does not seem to work.
Please HELP!!!
The second solution you have looks correct. However, to be clear about what you are trying to accomplish: if x and y are the vertices you want the shortest path between, and a vertex should be ignored during the traversal when it has the property name:"ignored", then the query is:
x.both.filter{it.name!="ignored"}.loop(2){!it.object.equals(y)}.paths>>1
If the "list of given vertices" you want filtered is actually a list, then the traversal is described as such:
list = [ ... ] // construct some list
x.both.except(list).loop(2){!it.object.equals(y)}.paths>>1
Moreover, I tend to use a range filter just to be safe as this will go into an infinite loop if you forget the >>1 :)
x.both.except(list).loop(2){!it.object.equals(y)}[1].paths>>1
Also, if there is a potential for no path, then to avoid an infinitely long search, you can do a loop limit (e.g. no more than 4 steps):
x.both.except(list).loop(2){!it.object.equals(y) & it.loops < 5}.filter{it.object.equals(y)}.paths>>1
Note why the last filter step before paths is needed: the loop can be broken out of for two reasons, so you might not be at y when you break out of it (instead, you may have broken out because it.loops < 5 no longer held).
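The same combination of ideas (skip excluded vertices, cap the number of hops, and only accept a path that actually ends at y) can be sketched as a breadth-first search in plain Python. This is an illustration rather than a translation of the Gremlin above: the adjacency map is made up, `shortest_path_avoiding` is a hypothetical name, and unlike the loop step it tracks visited vertices.

```python
from collections import deque

# Hypothetical undirected graph: x-a-y is the short route, x-b-c-y the detour.
ADJACENCY = {
    "x": ["a", "b"],
    "a": ["x", "y"],
    "b": ["x", "c"],
    "c": ["b", "y"],
    "y": ["a", "c"],
}


def shortest_path_avoiding(adjacency, start, goal, avoid, max_hops=4):
    """Breadth-first search that never enters a vertex in `avoid`; None if no path."""
    blocked = set(avoid)
    seen = {start}
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path  # first hit in BFS order is a shortest path
        if len(path) - 1 >= max_hops:
            continue  # hop limit reached without arriving at the goal
        for neighbour in adjacency.get(path[-1], []):
            if neighbour in blocked or neighbour in seen:
                continue
            seen.add(neighbour)
            queue.append(path + [neighbour])
    return None


print(shortest_path_avoiding(ADJACENCY, "x", "y", avoid=[]))     # ['x', 'a', 'y']
print(shortest_path_avoiding(ADJACENCY, "x", "y", avoid=["a"]))  # ['x', 'b', 'c', 'y']
```

Returning None when the hop limit is hit plays the role of the final filter step: a search that merely ran out of hops is not reported as a path.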
Here is your solution implemented over the Grateful Dead graph distributed with Gremlin. First some setup code, where we load the graph and define two vertices x and y:
gremlin> g = new TinkerGraph()
==>tinkergraph[vertices:0 edges:0]
gremlin> g.loadGraphML('data/graph-example-2.xml')
==>null
gremlin> x = g.v(89)
==>v[89]
gremlin> y = g.v(100)
==>v[100]
gremlin> x.name
==>DARK STAR
gremlin> y.name
==>BROWN EYED WOMEN
Now your traversal. Note that there is no name:"ignored" property in this graph, so I altered the filter to use the number of performances of each song along the path. Thus, this finds the shortest path through songs played more than 10 times in concert:
gremlin> x.both.filter{it.performances > 10}.loop(2){!it.object.equals(y)}.paths>>1
==>v[89]
==>v[26]
==>v[100]
If you use Gremlin 1.2+, then you can use a path closure to provide the names of those vertices (for example) instead of just the raw vertex objects:
gremlin> x.both.filter{it.performances > 10}.loop(2){!it.object.equals(y)}.paths{it.name}>>1
==>DARK STAR
==>PROMISED LAND
==>BROWN EYED WOMEN
I hope that helps.
Good luck!
Marko.
