Gremlin - Using an OR step to get different types of connected vertices

So I have a graph schema where vertex type A can connect inwards to vertex type B or type C in a one to many relationship. I'm trying to write a query that outputs any of those relationships if they exist, for instance a sample output would be:
Type A | Type B | Type C
Sample1A | '' | Sample1C
Sample2A | Sample2B | ''
Sample3A | Sample3B | Sample3C
Sample4A | 'Sample4Ba, Sample4Bb' | Sample4C
The fourth row shows the case where A is connected to multiple B vertices. If neither B nor C exists, nothing is output.
So far I have the query:
g.V().hasLabel('A').as('A').in('connect').hasLabel('B').as('B').or().in('connect').hasLabel('C').as('C').select('A','B','C')
But this query only returns the A vertices without any B's or C's.
Using AWS Neptune if that matters.

As kevin mentioned in the comment, you can use the project() step for this scenario. Note that a by() modulator only takes the first result of its traversal, so fold() is needed to collect every connected B or C:
g.V().hasLabel("A").as("A").project("a", "b", "c")
.by(select("A"))
.by(choose(in("connect").hasLabel("B").count().is(0), constant("NO_B"), in("connect").hasLabel("B").fold()))
.by(choose(in("connect").hasLabel("C").count().is(0), constant("NO_C"), in("connect").hasLabel("C").fold()));
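To make the intended result shape concrete, here is a minimal plain-Python model of the per-A projection. It is not Gremlin and does not talk to Neptune; the dictionary `labels`, the edge list `connect`, the vertex `Sample5A` and the function `project_rows` are all hypothetical names invented for this sketch. It assumes B and C vertices point into A over 'connect' edges, as in the question.

```python
# Hypothetical in-memory graph, only for illustration (not Gremlin).
labels = {
    "Sample1A": "A", "Sample1C": "C",
    "Sample4A": "A", "Sample4Ba": "B", "Sample4Bb": "B", "Sample4C": "C",
    "Sample5A": "A",  # an A vertex with no incoming 'connect' edges
}

# 'connect' edges as (source, target): B and C vertices point into A.
connect = [
    ("Sample1C", "Sample1A"),
    ("Sample4Ba", "Sample4A"),
    ("Sample4Bb", "Sample4A"),
    ("Sample4C", "Sample4A"),
]


def project_rows(labels, connect):
    """Build one row per A vertex, listing its connected B and C vertices."""
    rows = []
    for a, label in labels.items():
        if label != "A":
            continue
        # Equivalent in spirit to in('connect') from the A vertex.
        incoming = [src for src, dst in connect if dst == a]
        b = [v for v in incoming if labels[v] == "B"]
        c = [v for v in incoming if labels[v] == "C"]
        # The question asks for no output when neither B nor C exists.
        if b or c:
            rows.append({"a": a, "b": b, "c": c})
    return rows


rows = project_rows(labels, connect)
for row in rows:
    print(row)
```

Each row keeps B and C as lists, so an A with several B vertices yields all of them and an A with none yields an empty list.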

Your or() step is not returning a result as written. You could simplify the query as follows:
g.V().hasLabel('A').as('A').in('connect').hasLabel(within('B','C')).as('B').select('A','B')
This avoids using select('A','B','C'), since only one of 'B' or 'C' will have a result in the or() case.
Here is a version that still uses or():
g.V().hasLabel('A').as('A').in().or(hasLabel('B'),hasLabel('C')).as('B').select('A','B')

Related

Unpack vertices' properties returned from a select()

I would like to write a MATCH query in gremlin to find a subgraph matching a particular pattern. The pattern of interest contains 4 different types/labels of nodes: c, p, r, and s. And 4 different types of edges as shown below:
(c)-[affecting]->(p)
(c)-[c_found_in_release]->(r)
(p)-[p_found_in_release]->(r)
(s)-[severity]->(c)
So far I have the query below, which works fine; however, the results do not show the properties of the vertices returned. Since the vertices returned from the select() step belong to different types of nodes, I cannot use something like value() or valueMap().
g.V().match(
__.as('c').out('affecting').as('p'), \
__.as('c').out('cve_found_in_release').as('r'), \
__.as('p').out('pack_found_in_release').as('r'), \
__.as('s').both('severity').as('c') \
). \
select('c', 'p', 'r', 's').limit(10)
Current result:
==>[c:v[0],p:v[3],r:v[6],s:v[10]]
How to get something more detailed like this instead:
Desired result:
==>[
c:[cve_id:[CVE-2021-3618],publishedOn:[2022-03-23]],
p:[name:[vsftpd],version:[3.0.3]],
r:[sourceBranch:[1.0],detectedOn:[2022-04-05],status:[Upgraded]],
s:[severity:[High]],
]
You can simply add by() modulators to the select() step. On the TinkerGraph modern example graph:
g = TinkerFactory.createModern().traversal()
g.V().match(
__.as('v').hasLabel('software').as('s'),
__.as('s').both().hasLabel('person').as('p')
).select('s', 'p')
.by(values('name', 'lang').fold())
.by(values('name', 'age').fold())
==>[s:[lop,java],p:[marko,29]]
==>[s:[lop,java],p:[josh,32]]
==>[s:[lop,java],p:[peter,35]]
==>[s:[ripple,java],p:[josh,32]]

Gremlin continue traversal only if 2 vertices are not the same

I have a query which looks at 2 different vertices and I want to stop traversing if they don't both roll up to the same root ancestor via a path of "contains" edges.
g.V('node1')
.until(hasLabel('root')).repeat(in('contains')).as('node1Root')
.V('node2')
.until(hasLabel('root')).repeat(in('contains')).as('node2Root')
//FILTER|WHERE clause
I'd like to confirm that node1Root and node2Root are the same vertex before continuing the traversal, but for the life of me I cannot figure out how to do this.
I've tried the following:
g.V('node1')
.until(hasLabel('root')).repeat(in('contains')).as('node1Root')
.V('node2')
.until(hasLabel('root')).repeat(in('contains')).as('node2Root')
//.where('node1Root', P.eq('node2Root'))
//.where(select("node1Root").is(P.eq("node2Root")))
//.where(select("node1Root").is("node2Root"))
What's interesting is that the following query does work to filter appropriately.
g.V('node1').as('1')
.V('node2').as('2')
.where('1', P.eq('2'))
I'm not sure if there's something up with the until/repeat that screws it up or if I'm just doing something blatantly wrong. Any help would be much appreciated.
Thanks!
I found "How to check equality with nodes from an earlier part of query in Gremlin?"
and it seems that you use as() with the same key as the previous as(), and if they match, they are considered equal.
So here's the winner (I think):
g.V('node1')
.until(hasLabel('root')).repeat(in('contains')).as('node1Root')
.V('node2')
.until(hasLabel('root')).repeat(in('contains')).as('node2Root')
.where(select('node1Root').as('node2Root'))
//.not(select('node1Root').as('node2Root')) //OR this to determine they aren't the same
//continue traversal
I also found that my original issue was that the .until().repeat() steps could return a LIST, but in my case I know that my graph model will always return a single 'root' so to make it work, I can use 'unfold'
g.V('node1')
.until(hasLabel('root')).repeat(in('contains')).unfold().as('node1Root')
.V('node2')
.until(hasLabel('root')).repeat(in('contains')).unfold().as('node2Root')
.where('node1Root', P.eq('node2Root'))
I think I'll be going with the second solution because I'm much more confident in it, unless I hear otherwise.
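For readers who want to see the intended check outside Gremlin, here is a minimal plain-Python sketch of "walk up the 'contains' edges until a root is reached, then compare the two roots". The `parent` mapping, the `roots` set, the vertex names and both helper functions are hypothetical and only model the idea. Like the answer above, it assumes every vertex rolls up to exactly one root.

```python
# Hypothetical containment tree: child -> the vertex that 'contains' it.
parent = {
    "node1": "folderA",
    "folderA": "root1",
    "node2": "folderB",
    "folderB": "root1",
    "node3": "root2",
}

# Vertices that would carry the 'root' label in the graph.
roots = {"root1", "root2"}


def find_root(vertex):
    """Follow 'contains' edges upward until a root-labelled vertex is hit."""
    while vertex not in roots:
        vertex = parent[vertex]
    return vertex


def same_root(a, b):
    """True when both vertices roll up to the same root ancestor."""
    return find_root(a) == find_root(b)


print(same_root("node1", "node2"))
print(same_root("node1", "node3"))
```

The comparison at the end plays the role of the where() filter: the traversal would only continue when `same_root` holds.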
You can try this Gremlin query:
g.V(node1-id)
.map(until(hasLabel('root')).repeat(in().aggregate('x')).cap('x')).as("array")
.V(node2-id)
.until(
as("i").select("array").unfold().as("j")
.where("i", eq("j"))
).repeat(in())
Here we collect all the vertices on the path from node1 to the root into an array, and then, while walking up from node2, check whether the current vertex exists in that array.
This query only works for a traversal with a single iteration, because the aggregate() step collects into a variable that is global to the traversal, which means every iteration shares the same array. To fix this, if you are running on the JVM, use lambdas/Groovy closures:
g.V(node-start-id-1,node-start-id-2)
.map(
{ x->
var v = x.get()
var g = getGraph().get().traversal();
g.V(v.id()).until(hasLabel('root')).repeat(in().aggregate('x')).cap('x').next()
}
)
.as("array")
.V(node2-id)
.until(
as("i").select("array").unfold().as("j")
.where("i", eq("j"))
).repeat(in())

Xtext - Get cross-referenced child

I have a grammar that looks like:
A:
myField=[B]
B:
C | D | E
I have a function that gets A (let's say a) as a parameter and I want to access C, for example.
I did a.myField, which returns a B object (let's say b). Then I used
EcoreUtil2.getAllContentsOfType(b,C) - but it returns an empty list.
Maybe the reason is that B is not really parsed again, but cross-referenced. If so, is there any function that allows me to access C/D/E in the above example?
Thank you.
Update
Apparently b is null, so of course getAllContentsOfType() returns an empty list. How do I access B (which is cross-referenced from A)?
Had to check that a.myField isn't null.

PostScript forall on dictionaries

According to the PLRM it doesn't matter in which order you execute a forall on a dict:
(p. 597) forall pushes a key and a value on the operand stack and executes proc for each key-value pair in the dictionary
...
(p. 597) The order in which forall enumerates the entries in the dictionary is arbitrary. New entries put in the dictionary during the execution of proc may or may not be included in the enumeration. Existing entries removed from the dictionary by proc will not be encountered later in the enumeration.
Now I was executing some code:
/d 5 dict def
d /abc 123 put
d { } forall
My output (operand stack) is:
--------top-
/abc
123
-----bottom-
The output of ghostscript and PLRM (operand stack) is:
--------top-
123
/abc
-----bottom-
Does it really not matter in what order you process the key-value pairs of the dict?
On the stack, do you need to push the value first and then the key, or do you need to push the key first? (The PLRM only talks about "a key and a value", but doesn't tell you anything about the order.)
Thanks in advance
It would probably help if you quoted the page number when you quote sections from the PLRM; it's hard to see where you are getting this from.
When executing forall, the order in which forall enumerates the dictionary pairs is arbitrary, and you have no influence over it. However, forall always pushes the key and then the value. Even if this is only implied in the text you (didn't quite) quote, you can see from the example in the forall operator description that this is the case.
When you say 'my output', do you mean you are writing your own PostScript interpreter? If so, then your output is incorrect: when pushing a key/value pair, the key is pushed first.
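If you are indeed writing your own interpreter, the push order can be sketched as follows. This is a toy Python model, not Ghostscript's implementation; `forall_dict`, the list used as operand stack and the string "/abc" standing in for a name object are all assumptions made for illustration.

```python
def forall_dict(operand_stack, dictionary, proc):
    """Toy model of `dict proc forall` for a dictionary operand."""
    # Snapshot the entries first. The PLRM leaves the enumeration order
    # arbitrary, so any iteration order over the pairs is acceptable.
    for key, value in list(dictionary.items()):
        operand_stack.append(key)    # the key is pushed first
        operand_stack.append(value)  # the value ends up on top
        proc(operand_stack)          # then the procedure is executed


stack = []                                  # last element is the top of stack
d = {"/abc": 123}                           # d /abc 123 put
forall_dict(stack, d, lambda s: None)       # d { } forall (empty procedure)

print(stack)        # listed bottom to top
print(stack[-1])    # the element on top of the stack
```

With an empty procedure the pair simply stays on the stack, with the value above the key, which matches the Ghostscript output shown in the question.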

Use aggregate functions in the WHERE clause (Neo4j)

How do I select all nodes that are connected to node(2) [from] with more than one path?
START from=node(2)
MATCH p=from-->to
where count(p) > 1
return from,to
To the Neo4j team: any plans to implement COUNT/HAVING functions?
Great job so far with the product!
Actually, I found the solution by using the WITH keyword:
START from=node(*)
MATCH p=from-->to
WITH from as from , to as to, count(p) as paths
WHERE paths >1
RETURN to,paths
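The WITH clause aggregates first, and the WHERE that follows filters on the aggregate, which is what HAVING does in SQL. A minimal plain-Python model of that two-stage shape is below. It is not Cypher; the edge list `edges` and the function `multiply_connected` are hypothetical, with repeated pairs standing for parallel relationships between the same two nodes.

```python
from collections import Counter

# Hypothetical relationships as (from, to) node ids.
# A repeated pair means two distinct paths between the same nodes.
edges = [(2, 5), (2, 5), (2, 7), (3, 7)]


def multiply_connected(edges, min_paths=2):
    """Count paths per (from, to) pair, then keep pairs with enough paths."""
    # Stage 1: WITH from, to, count(p) AS paths
    paths = Counter(edges)
    # Stage 2: WHERE paths > 1
    return {pair: n for pair, n in paths.items() if n >= min_paths}


result = multiply_connected(edges)
print(result)
```

Only the pair (2, 5) survives the filter here, because it is the only one that occurs more than once in the sample data.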

Resources