Gremlin query unintentionally iterates whole graph - azure-cosmosdb

I'm trying to upsert two vertices in a single query. With the following query, the lookup of the second vertex somehow iterates the entire graph (according to the executionProfile() output).
What am I misunderstanding here?
g
.V('1')
.fold()
.coalesce(
unfold(),
addV('Company').property('id', '1').property('pk','1')
)
.aggregate('a')
.V('2')
.fold()
.coalesce(
unfold(),
addV('Person').property('id', '2').property('pk', '2')
)
.as('b')

Eventually, I managed to figure out the solution, albeit not the reason.
I converted to (roughly) this:
g
.V('1')
.fold()
.coalesce(
g.V('1'),
addV('Company').property('id', '1').property('pk','1')
)
.aggregate('a')
.coalesce(
g.V('2'),
addV('Person').property('id', '2').property('pk', '2')
)
.as('b')
So basically, I removed the second .V('2') and put it inside the coalesce instead. I guess there must be some way to remove the very first V('1') as well, but I haven't quite figured that out yet.

Related

Gremlin traversal referencing previous edges

I have a fairly simple graph, with the caveat that all edges are marked with 'to' and 'from' dates stored as timestamps.
I'm trying to write a traversal that will only follow edges that overlap with the first traversed edge's To and From dates.
Note that the To-date can be non-existent, indicating that the edge is still active.
I've tried various things, but the closest I've come so far is:
g.V('1').outE().as('firstEdge')
.sideEffect(values('To').store('eTo'))
.sideEffect(values('From').store('eFrom'))
.outV()
.repeat(outE().where(
lte('firstEdge')).by('From') // The edge has a from-date less than firstEdge's from-date.
.and(
or(
hasNot('To'), // The edge is still active, OR
has('To', gte('eFrom')) // the edge has a to date after the firstEdge's from-date.
)
)
.and(
or(
select('firstEdge').hasNot('To'), // The first edge does not have a to-date, OR
has('From', lte('eTo')) // The edge has a from date that is less than the first edge's to-date.
)
)
.inV()
)
.emit()
.times(2) // Limit to two edges away.
.where(hasLabel('Person'))
I'm pretty sure the problem is related to the store commands and subsequent usage, as the following query does not return anything either (It should return any inactive edges created after the first edge).
g.V('1').outE('is_board')
.sideEffect(values('From').store('eFrom'))
.outV()
.repeat(outE().has('To', gte('eFrom')).inV())
.emit()
.times(2)
.where(hasLabel('Person'))

Tinkerpop Gremlin Get Edges that go to vertices within a list

I'm trying to query for edges that go to vertices within an aggregated list. It sounds quite simple, and it should be, but I seem to be writing my queries wrong, and I just can't figure out why. Anyway, I'll use the Modern Toy Graph to make an example, that won't necessarily make much sense in this context, but still illustrates what I wish to do:
graph = TinkerFactory.createModern()
g = graph.traversal()
g.V().
hasLabel('person').
aggregate('x').
outE().
where(inV().is(within('x')))
What I'm doing is traversing to all 'person' vertices, aggregating them, then trying to get all the outgoing edges that lead to another vertex within that aggregated list. I expect the above query to return the edge labelled "knows" that goes between vertex 1 and 2, and the one between 1 and 4; however, nothing is returned. If I simply want to get the vertices on the other end of those edges, rather than the edges themselves, the following works fine, returning vertex 2 and 4:
g.V().
hasLabel('person').
aggregate('x').
out().
where(within('x'))
So how can I get edges that lead to vertices already aggregated in a list?
(Once again, I'm aware this example doesn't make much sense within this particular graph, and I could easily query outE('knows'), but this query is relevant to a different graph.)
Thanks.
You can't use is() quite that way. An easy fix would be to just combine your "working" traversal with the one that doesn't:
gremlin> g.V().hasLabel('person').
......1> aggregate('x').
......2> outE().
......3> where(inV().where(within('x')))
==>e[7][1-knows->2]
==>e[8][1-knows->4]
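As a hedged note on why this works: is() compares each traverser with the literal values inside the predicate, so within('x') there means "equal to the string 'x'", whereas where() with a string argument treats 'x' as a key and resolves it to the aggregated side effect. A minimal sketch against the same Modern graph, for comparison:
// is(): 'x' is a literal value, so no vertex should match it.
g.V().hasLabel('person').aggregate('x').out().is(within('x'))
// where(): 'x' is looked up as the side effect filled by aggregate('x').
g.V().hasLabel('person').aggregate('x').out().where(within('x'))
The first traversal should return nothing, for the same reason the original edge query does; the second is the "working" traversal from the question.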

With Gremlin, when using a groupCount() with more than one property, how can I order the results?

g.V().as('a').
outE().as('r').
inV().as('b').
select('a', 'r', 'b').
project('source','label','target').
by(select('a').label()).
by(select('r').label()).
by(select('b').label()).
groupCount().
order(local).by(values, desc).
unfold()
This query should work on any gremlin dataset.
sample output:
==>{source=Node, label=POINTS_AT, target=Node}=63
==>{source=Node, label=IS_RELATED_TO, target=Other}=14
==>{source=Other, label=POINTS_AT, target=Other}=3
I am just starting to learn Gremlin. This query does a groupCount on the out vertex label, edge label, and in vertex label and orders the results by the values, which are just the counts.
The keys are Map entries with multiple values 'source', 'label', and 'target'. I would like to order the results by the edge label, then by the source label, and by the target label last. How can I accomplish this?
Also, is there a better way to approach this than the query I came up with?
Without your data set I am not sure if this is exactly what you want but does this do what you need?
g.V().as('a').
outE().as('r').
inV().as('b').
project('source','label','target').
by(select('a').label()).
by(select('r').label()).
by(select('b').label()).
groupCount().
order(local).
by(select(keys).select('source')).
by(select(keys).select('target')).
by(select(keys).select('label')).
unfold()
You can add desc to any of the by modulators if that is what you need.
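For the order the question asks for (edge label first, then source label, then target label), the by modulators can be rearranged. The following is a sketch of that variant, untested against the asker's data; the desc on the first key is only there to illustrate the syntax and can be dropped:
g.V().as('a').
outE().as('r').
inV().as('b').
project('source','label','target').
by(select('a').label()).
by(select('r').label()).
by(select('b').label()).
groupCount().
order(local).
by(select(keys).select('label'), desc).
by(select(keys).select('source')).
by(select(keys).select('target')).
unfold()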

TinkerPop: Filter by Edge count

Sample data: TinkerPop Modern
Summary: I want to find people who have created two pieces of software.
I started with the basics, and got the count properly
g.V().hasLabel("Person").as("from" ,"to1" )
.repeat(bothE().as("e1").otherV().as("to1").dedup("from", "to1")).times(1)
.emit(filter(hasLabel("Software"))).hasLabel("Software")
.group().by(select("from").by("name")).by(count()).as("c")
Result:
>> {'Marko': 1, 'Peter': 1, 'Josh': 2}
So I tried to apply a filter, but it's not working (i.e. the result is incorrect). What I tried:
g.V().hasLabel("Person").as("from")
.repeat(bothE().as("e1").otherV().as("to1").dedup("from", "to1")).times(1)
.filter(bothE().otherV().hasLabel("Software").count(local).is(eq(1)))
.dedup()
.values("name")
Any idea what I am doing wrong?
If you just need "person" vertices by edge count I don't really see why you need all that repeat() infrastructure. It's just:
gremlin> g.V().hasLabel('person').
......1> filter(outE('created').limit(2).count().is(2))
==>v[4]
You only need to count outgoing edges because the schema is such that the "created" label only connects to "software", so you don't need to check the "software" vertex label. You limit(2) to exit the edge iteration as soon as possible but not before you have the 2 edges you are trying to count.
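Note that limit(2).count().is(2) matches people with at least two "created" edges. If exactly two is what is wanted, one possible variation (a sketch, assuming the same Modern graph and labels as above) is to let the count see a third edge before stopping:
g.V().hasLabel('person').
filter(outE('created').limit(3).count().is(2))
On the Modern graph this should still return only v[4], since no person there has created more than two pieces of software.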

How to get all Graph Nodes in ArangoDB without Start-Node

In OrientDB, you can get the whole graph with just 'Select From v'.
So far, I have used AQL in ArangoDB with a start node:
for v,e,p IN 2 ANY 'user/188802' graph 'a' return p
Now I want to get all graph nodes in ArangoDB without specifying a start node. How can I do that?
Graphs are a grouping of Edge collections. Each Edge collection references _from and _to documents which are stored in Document collections.
The graph traversal queries expect you to have a starting position and it returns the results for that single starting position.
It is possible to identify all possible starting positions, and then run graph traversals over those positions.
You'll need to know the names of the document collections that make up your graph. You can then insert them into an AQL query like this:
FOR vertex IN UNION(
(FOR v IN document_collection_1 RETURN v._id),
(FOR v IN document_collection_2 RETURN v._id),
(FOR v IN document_collection_3 RETURN v._id)
)
FOR v, e IN 1..5 OUTBOUND vertex GRAPH 'my_graph_name' OPTIONS { uniqueVertices: true }
RETURN DISTINCT [
{
_from: e._from,
_to: e._to
}
]
Remember that in ArangoDB it is possible for a document collection to be bound to more than one graph, so you'll need to ensure you identify all document collections that are part of the graph.
This query will then extract an array of objects that contain all links defined in the graph. This query focuses only on vertices with edges that are part of the graph. If the vertex has no edge on it, it won't appear in the output as it is not part of the graph.
