Gremlin: print connected components one by one

I'm new to the Gremlin query language. My requirement is to generate the connected components of a huge graph. I tried the query below, but it prints each component as a group of values, whereas I need to print the values one by one.
Connected components Query:
g.V().emit(cyclicPath().or().not(both())).repeat(both()).until(cyclicPath()).path().aggregate("p").unfold().dedup().map(__.as("v").select("p").unfold().filter(unfold().where(eq("v"))).unfold().dedup().order().by(id).fold()).dedup()
[v[89826185]]
[v[89826188], v[89826189], v[89826190], v[89826191], v[89826192], v[89826193], v[89826194]]
[v[89826195], v[89826196], v[89826198]]
I need to print the values in the following way:
the min id of each group (list), paired with each element of that group (list).
Example:
89826188 89826189
89826188 89826190
89826188 89826191
89826188 89826192
89826188 89826193
89826188 89826194
89826188 89826188 (self)

You could do that in your application's code. Doing it at the query level will only blow up the result size, but here you go:
g.V().
  emit(cyclicPath().or().not(both())).
  repeat(both()).
    until(cyclicPath()).
  path().aggregate("p").
  unfold().dedup().
  map(__.as("v").select("p").unfold().
         filter(unfold().where(eq("v"))).
         unfold().dedup().
         order().
           by(id).
         fold()).
  dedup().as("list").
  unfold().
  map(union(select("list").
              by(limit(local, 1)),
            identity()).
      id().fold())
It's basically the same query, I only added the final map() step to reformat the result.
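The application-side alternative mentioned above could look roughly like the following Python sketch. It assumes the components have already been fetched and reduced to lists of vertex ids; the function name expand_components and the sample data are illustrative only and not part of any Gremlin API.

```python
def expand_components(components):
    """Return (min_id, member_id) pairs for each connected component.

    `components` is assumed to be an iterable of lists of vertex ids,
    e.g. the ids taken from the folded result of the original query.
    """
    pairs = []
    for members in components:
        if not members:
            continue  # skip empty groups defensively
        root = min(members)  # the smallest id labels the component
        for member in sorted(members):
            pairs.append((root, member))  # includes the (root, root) self pair
    return pairs


# Hypothetical sample: two components, already reduced to ids.
components = [
    [89826185],
    [89826188, 89826189, 89826190],
]
for root, member in expand_components(components):
    print(root, member)
```

Doing the expansion on the client keeps the query result at one list per component, which is the size concern raised in the answer.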

Related

Gremlin: how to find other edges with the same property

I have a graph with two vertices having id(s) 'a' and 'b'.
gremlin> g.V()
==>v[b]
==>v[a]
There are two edges from 'a' to 'b'.
gremlin> g.E()
==>e[a6b4bead-c161-5a61-d232-abfa2bfad54e][a-LIKES->b]
==>e[10b4bead-a0fc-8d2c-d69f-26b3e9e4c5d8][a-KNOWS->b]
gremlin> g.E().valueMap(true)
==>{id=a6b4bead-c161-5a61-d232-abfa2bfad54e, semantics=social, label=LIKES}
==>{id=10b4bead-a0fc-8d2c-d69f-26b3e9e4c5d8, semantics=social, label=KNOWS}
My question: given an id for one of the edges, I would like to find all other edges with the same value for the property "semantics". For example, given a.LIKES.id, I would like to execute a query that will return a.KNOWS using the value a.LIKES.semantics.
I started with:
g.E('a6b4bead-c161-5a61-d232-abfa2bfad54e')
.property('semantics').as('semantics')...this is where I am stuck
Thanks,
Joel
where() in conjunction with a by() modulator will do the job:
g.E('a6b4bead-c161-5a61-d232-abfa2bfad54e').as('e').
  outV().outE(). // both edges leave 'a', so step to the source vertex and take its outgoing edges
  where(eq('e')).by('semantics'). // return edges with the same semantics property value
  where(neq('e')) // ... except the one we started with

In Gremlin, how can one query vertices where one property value is greater than another property value?

I'm sure this is straightforward, but I'm not sure how to do it. I have vertices, with a certain label, which have two integer properties. Let's call them integer1 and integer2. I simply want to query for all vertices where integer2 is greater than integer1.
I have tried the following:
g.V().hasLabel("myLabel").has("integer2", P.gt(values("integer1"))).toList();
but this results in an exception, understandably, as the "values" method call returns a traversal whereas the predicate expects a number.
Exception in thread "main" java.lang.ClassCastException: org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.DefaultGraphTraversal cannot be cast to java.lang.Integer
at java.lang.Integer.compareTo(Integer.java:52)
at org.apache.tinkerpop.gremlin.process.traversal.Compare$3.test(Compare.java:92)
at org.apache.tinkerpop.gremlin.process.traversal.P.test(P.java:72)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.HasContainer.testValue(HasContainer.java:118)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.HasContainer.test(HasContainer.java:94)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.HasContainer.testAll(HasContainer.java:180)
at org.apache.tinkerpop.gremlin.tinkergraph.process.traversal.step.sideEffect.TinkerGraphStep.iteratorList(TinkerGraphStep.java:116)
at org.apache.tinkerpop.gremlin.tinkergraph.process.traversal.step.sideEffect.TinkerGraphStep.vertices(TinkerGraphStep.java:88)
at org.apache.tinkerpop.gremlin.tinkergraph.process.traversal.step.sideEffect.TinkerGraphStep.lambda$new$0(TinkerGraphStep.java:59)
at org.apache.tinkerpop.gremlin.tinkergraph.process.traversal.step.sideEffect.TinkerGraphStep$$Lambda$23/1123629720.get(Unknown Source)
...
Any help would be greatly appreciated. Thanks.
One way to do it is with a where() step. To demonstrate, I first modified the "modern" graph so that each person vertex has a "k" property with an integer value:
g = TinkerFactory.createModern().traversal()
g.V().hasLabel('person').property('k',30)
and then:
gremlin> g.V().hasLabel('person').as('a').
......1> where('a', gt('a')).by('age').by('k').
......2> valueMap('age','k')
==>[k:[30],age:[32]]
==>[k:[30],age:[35]]

Trouble simultaneously fetching filtered vertices and unfiltered vertices count

I'm trying to return a limited number of vertices matching a pattern, as well as the total (non-limited) count of vertices matching that pattern.
g.V()
  .hasLabel("PersonPublic")
  .has('partitionKey', "Q2r1NaG6KWdScX4RaeZs")
  .has('docId', "Q2r1NaG6KWdScX4RaeZs")
  .out("CONTACT_LIST")
  .out("SUBSCRIBER")
  .dedup()
  .order()
    .by("identifier")
    .by("docId")
  .fold()
  .project('people','total')
    .by(
      unfold()
        .has('docId', gt("23"))
        .limit(2)
        .project('type','id')
          .by(label())
          .by(values('docId'))
    )
    .by(unfold().count())
In plain English, I'm finding a person, finding all the contact lists of that person, finding all the subscribers to those contact lists, de-duplicating the subscribers, ordering the subscribers, pausing there to collect everything and then projecting the results in the form
{
people: [{type: string, id: string}],
total: number,
}
The "people" part of the projection is unfolded, filtered to only contain results with a "docId" greater than "23", limited to 2, and then projected again.
The "total" part of the projection is unfolded (no-limit) and counted.
My goal is to allow paging through a pattern while still retrieving the total number of vertices associated with the pattern.
Unfortunately, this query does not work on Cosmos DB. The results come back in the form
{
people: {type: string, id: string},
total: number,
}
And only the first person result is returned (rather than an array).
Any help would be greatly appreciated!
You need to fold() the projected values again; otherwise the by() modulator only takes the first result of the inner traversal. Also, you don't need to unfold() for the total; that's just a waste of resources.
g.V()
  .hasLabel("PersonPublic")
  .has('partitionKey', "Q2r1NaG6KWdScX4RaeZs")
  .has('docId', "Q2r1NaG6KWdScX4RaeZs")
  .out("CONTACT_LIST")
  .out("SUBSCRIBER")
  .dedup()
  .order()
    .by("identifier")
    .by("docId")
  .fold()
  .project('people','total')
    .by(
      unfold()
        .has('docId', gt("23"))
        .limit(2)
        .project('type','id')
          .by(label)
          .by('docId')
        .fold()
    )
    .by(count(local))
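For readers less familiar with project(), the intended result shape (one page of items plus the unpaged total) can be mirrored on an in-memory list. This is only a Python sketch of the semantics, not a Cosmos DB or Gremlin client call; the function name page_with_total and the dict keys "label" and "docId" are assumptions made for illustration.

```python
def page_with_total(subscribers, after_doc_id, page_size):
    """Return one page of subscribers plus the total, unpaged count.

    `subscribers` is assumed to be a list of dicts with "label" and "docId"
    keys, already de-duplicated and ordered (hypothetical in-memory data).
    """
    page = [
        {"type": s["label"], "id": s["docId"]}
        for s in subscribers
        if s["docId"] > after_doc_id  # string comparison, as with gt("23")
    ][:page_size]
    # "people" is always a list, even when it holds zero or one item;
    # "total" counts every subscriber, not just the ones on this page.
    return {"people": page, "total": len(subscribers)}
```

The inner fold() in the corrected query plays the role of the list comprehension here: it is what turns the page into a list rather than a single item.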

Get array's first item as object in TinkerPop3 Gremlin query and JanusGraph

I faced this issue during a migration of gremlin queries from v2 to v3.
V2-way: inE().has(some condition).outV().map().toList()[0] will return an object. This is wrapped in transform{label: it./etc/} step.
V3-way, still WIP: inE().has(some condition).outV().fold() will return an array. This is wrapped in project(...).by(...) step.
V3 works fine, I just have to unwrap an item from the array manually. I wonder if there is a more sane approach (anyway, this feels like non-graph-friendly step).
Environment: JanusGraph, TinkerPop3+. For v2: Titan graph db and TinkerPop2+.
Update: V3 query sample
inE('edge1').
  has('cond1').outV(). // one vertex left
  project('items', 'count'). // pagination
    by(
      order().
        by('field1', decr).
      project('vertex_itself', 'vertex2', 'vertices3').
        by(identity()).
        by(outE('edge2').has('type', 'type1').limit(1).inV().fold()). // now this is empty array or single-element array, can we return element itself?
        by(inE('edge2').has('type', 'type2').outV().fold()).
      fold()).
    by(count())
Desired result shape:
[{
items: [
{vertex_itself: Object, vertex2: Object/null/empty, vertices3: Array},
{}...
],
count: Number,
}]
Problem: vertex2 property is always an array, empty or single-element.
Expected: vertex2 to be object or null/empty.
Update 2: it turns out my query is not finished yet. It returns many objects if the has('cond1').outV() step does not yield a single element, e.g. [{items, count}, {items, count}...]
It looks like your main issue is getting a single item from the traversal.
You can do this with next(), which retrieves the next element from the current traversal iteration:
inE().has(some condition).outV().next()
The structure of the returned value is, I think, implementation-specific. For example, in JavaScript you can access the item through the value property:
const result = await inE().has(some condition).outV().next();
const item = result.value;
I may not fully understand, but it sounds like from this:
inE().has(some condition).outV().fold()
you want to just grab the first vertex you come across. If that's right, then is there a reason to fold() at all? Maybe just do:
inE().has(some condition).outV().limit(1)

How to iterate XQuery Path?

<Response xmlns="https://**********/token">
<rows>
<e>
<e>1,4507</e>
<e>1</e>
<e>2017-02-28T05:10:32.0606654Z</e>
<e>2017-02-28T05:10:32.2267838Z</e>
<e>Depart</e>
<e>Depart</e>
<e>acct: xxx on 10/2/2015</e>
<e>16.666131292069704</e>
<e null="true"/>
<e>Open</e>
<e null="true"/>
<e null="true"/>
<e null="true"/>
</e>
<e>
<e>1,4507</e>
<e>2</e>
<e>2017-02-28T05:10:32.0606654Z</e>
<e>2017-02-28T05:10:32.2267838Z</e>
<e>Depart</e>
<e>Depart</e>
<e>acct: xxx on 10/2/2015</e>
<e>16.666131292069704</e>
<e null="true"/>
<e>Open</e>
<e null="true"/>
<e null="true"/>
<e null="true"/>
</e>
</rows>
<skip>0</skip>
<take>126</take>
</Response>
The above is my XML response.
I want to verify that the status of each row is Open.
Open
declare namespace ns1='https://**********/token';
<Result>
for $x in //ns1:Response[1]/ns1:rows[1]
return data($x/ns1:e[1]/ns1:e[10]/text())
</Result>
but it does not return the value. Instead, the result just contains the expression itself:
<Result>for $x in //ns1:Response[1]/ns1:rows[1]
return data($x/ns1:e[1]/ns1:e[2]/text())</Result>
Is this a correct XQuery expression? It would be helpful if someone could check it.
You should wrap your XQuery expression in {}; otherwise it will be treated as a string literal:
declare namespace ns1='https://**********/token';
<Result>
{
for $x in //ns1:Response[1]/ns1:rows[1]
return data($x/ns1:e[1]/ns1:e[10]/text())
}
</Result>
Output:
<Result>Open</Result>
It isn't clear what the expected output is exactly, though, since you've been using an exact index on every path step, which causes the expression to match only one element at a time...
If you just want to know that every row has e[10] = "Open" then you can do:
<Result>{empty(/ns1:Response/ns1:rows/*/*[10][. != "Open"])}</Result>
(Your question title talks of iterating. That's procedural thinking. In XQuery, just like in SQL, you generally operate on sets of items directly: you don't need to iterate over them to process them one at a time.)
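The same "every row has status Open" check can be sketched outside XQuery with Python's standard xml.etree.ElementTree module. The namespace URI below is a placeholder, since the real one is redacted in the question, and the function name all_rows_open is illustrative.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI; substitute the real one from the response.
NS = {"ns1": "https://example.invalid/token"}


def all_rows_open(xml_text):
    """Return True if no row has a 10th <e> child whose text is not "Open"."""
    root = ET.fromstring(xml_text)
    for row in root.findall("ns1:rows/ns1:e", NS):
        cells = list(row)  # the inner <e> elements of this row
        # Mirrors empty(...[10][. != "Open"]): rows without a 10th cell pass.
        if len(cells) >= 10 and (cells[9].text or "") != "Open":
            return False
    return True
```

As in the XQuery version, the check is expressed over the whole set of rows; the loop only exists because Python has no path predicate for this.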
