Get neighborhood of a starting node ArangoDB - graph

I'm using ArangoDB 3.2.25. I want to extract neighbors from a starting node.
Here is what I tried:
FOR x IN 1..1
ANY "vert1/5001" Col_edge_L
RETURN x
but I'm getting missing vert2.
Here is the schema of the collection
{"_from":"vert1/560","_to":"vert2/5687768","id":771195,"score":218}

What you do in your query is to start at the vertex with key 5001 from the collection vert1 and follow all edges stored in collection Col_edge_L in any direction (so _from or _to equal to vert1/5001).
If there are edges in Col_edge_L like
{ "_from": "vert1/5001", "_to": "vert1/789" }
{ "_from": "vert2/44", "_to": "vert1/5001" }
then the result should be:
[
{ "_id": "vert2/44", ... },
{ "_id": "vert1/789", ... }
]
Exception: if the vertex collections exist, but not the vertices referenced in the _from and _to properties of the edges, the traversal will work but return null for the missing vertices (x variable).
The edge you posted in your question does not reference the starting vertex vert1/5001, so it wouldn't be followed and no vertex returned from this edge. If you miss vertices in the result, there might simply be no edges that link the starting vertex to another document.

Related

How to traverse a graph and pattern-match a subgraph in Gremlin

I have a graph which is made of many instances of the same pattern (or subgraph).
The subgraph of interest is pictured below.
The relationship cardinality between the nodes are:
s -> c (one-many)
c -> p (many-many)
p -> aid (one-many)
p -> rA (one-one)
p -> rB (one-one)
p -> o (many-one)
The goal is to return a list of all instances of this subgraph or pattern as shown below
[
{
s-1,
c-1,
p-1,
aid-1,
o-1,
rA-1,
rB-1
},
{
s-2,
c-2,
p-2,
aid-2,
o-2,
rA-2,
rB-2
},
{
... so on and so forth
}
]
How do I query my graph to return this response?
I have tried using a combination of and() and or() as shown below, but that did not capture the entire subpattern as desired.
g.V().hasLabel('severity').as('s').out('severity').as('c').out('affecting').as('p')
.and(
out('ownedBy').as('o'),
out('rA').as('rA'),
out('rB').as('rB'),
out('package_to_aid').as('aid')
)
.project('p', 'c', 's', 'o', 'rA', 'r', 'aid').
by(valueMap()).
by(__.in('affecting').values('cve_id')).
by(__.in('affecting').in('severity').values('severity')).
by(out('ownedBy').values('name')).
by(out('rA').valueMap()).
by(out('rB').valueMap()).
by(out('package_to_aid').values('aid')).
I know I can use a series of out() and in() steps to traverse a non-branching path (for example the nodes: s->c->p), however I am struggling with capturing/traversing paths that branch out (for example, the node p and its 3 children nodes: rA, rB, and o)
I looked at Union() but I am unable to make it work either.
I am unable to find examples of similar queries online. Does Gremlin allow this sort of traversal, or do I have to remodel my graph as a Linked-list for this to work?
ps. I am doing this on Cosmos where Match() step is not supported

Multiple add if doesn't exist steps Gremlin

I have an injected array of values. I'm I want to add vertices if they don't exist. I use the fold and coalesce step, but it doesn't work in this instance since I'm trying to do it for multiple vertices. Since 1 vertex exists I can no longer get a null value, and the the unfold inside the coalesce step returns a value from there on. This leads to vertices that don't exist yet not to be added.
This is my current traversal:
const traversal = await g
?.inject([
{ twitterPostId: 'kay', like: true, retweet: false },
{ twitterPostId: 'fay', like: true, retweet: false },
{ twitterPostId: 'nay', like: true, retweet: false },
])
.unfold()
.as('a')
.aggregate('ta')
.V()
.as('b')
.where('b', p.eq('a'))
.by(__.id())
.by('twitterPostId')
.fold()
.coalesce(__.unfold(), __.addV().property(t.id, __.select('ta').unfold().select('twitterPostId')))
.toList();
Returns:
[Bn { id: 'kay', label: 'vertex', properties: undefined }]
Without using coalesce you can do conditional upserts using what we often refer to as "map injection". The Gremlin does get a little advanced, but here is an example
g.withSideEffect('ids',['3','4','xyz','abc']).
withSideEffect('p',['xyz': ['type':'dog'],'abc':['type':'cat']]).
V('3','4','xyz','abc').
id().fold().as('found').
select('ids').
unfold().
where(without('found')).as('missing').
addV('new-vertex').
property(id,select('missing')).
property('type',select('p').select(select('missing')).select('type'))
That query will look for a set of vertices, figure out which ones exist, and for the rest use the ID values and properties from the map called 'p' to create the new vertices. You can build on this pattern a great many ways and I find it very useful until mergeV and mergeE are more broadly available
You can also use the list of IDs in the query to check which ones exist. However, this may lead to inefficient query plans depending on the given implementation:
g.withSideEffect('ids',['3','4','xyz','abc']).
withSideEffect('p',['xyz': ['type':'dog'],'abc':['type':'cat']]).
V().
where(within('ids')).
by(id).
by().
id().fold().as('found').
select('ids').
unfold().
where(without('found')).as('missing').
addV('new-vertex').
property(id,select('missing')).
property('type',select('p').select(select('missing')).select('type'))
This is trickier than the first query, as the V step cannot take a traversal. So you cannot do V(select('ids')) in Gremlin today.

Gremlin: how to find other edges with the same property

I have a graph with two vertices having id(s) 'a' and 'b'.
gremlin> g.V()
==>v[b]
==>v[a]
There are two edges from 'a' to 'b'.
gremlin> g.E()
==>e[a6b4bead-c161-5a61-d232-abfa2bfad54e][a-LIKES->b]
==>e[10b4bead-a0fc-8d2c-d69f-26b3e9e4c5d8][a-KNOWS->b]
gremlin> g.E().valueMap(true)
==>{id=a6b4bead-c161-5a61-d232-abfa2bfad54e, semantics=social, label=LIKES}
==>{id=10b4bead-a0fc-8d2c-d69f-26b3e9e4c5d8, semantics=social, label=KNOWS}
My question: given an id for one of the edges, I would like to find all other edges with the same value for the property "semantics". For example, given a.LIKES.id, I would like to execute a query that will return a.KNOWS using the value a.LIKES.semantics.
I started with:
g.E('a6b4bead-c161-5a61-d232-abfa2bfad54e')
.property('semantics').as('semantics')...this is where I am stuck
Thanks,
Joel
where() in conjunction with a by() modulator will do the job:
g.E('a6b4bead-c161-5a61-d232-abfa2bfad54e').as('e').
outV().inE().
where(eq('e')).by('semantics'). // return edges with the same semantics property value
where(neq('e')) // ... except the one we started with

Using Parameters in Neo4j Relationship Queries

I'm struggling to work around a small limitation of Neo4j in that I am unable to use a parameter in the Relationship section of a Cypher query.
Christophe Willemsen has already graciously assisted me in working my query to the following:
MATCH (n1:Point { name: {n1name} }),
(n2:Point { name: {n2name} }),
p = shortestPath((n1)-[r]->(n2))
WHERE type(r) = {relType}
RETURN p
Unfortunately as r is a Collection of relationships and not a single relationship, this fails with an error:
scala.collection.immutable.Stream$Cons cannot be cast to org.neo4j.graphdb.Relationship
Removing the use of shortestPath() allows the query to run successfully but returns no results.
Essentially my graph is a massive collection of "paths" that link "points" together. It is currently structured as such:
http://console.neo4j.org/r/rholp
I need to be able to provide a starting point (n1Name), an ending point (n2Name), and a single path to travel along (relType). I need a list of nodes to come out of the query (all the ones along the path).
Have I structured my graph incorrectly / not optimally? I am open to advice on whether the overall structure is not optimal as well as advice on how best to structure the query!
EDIT
Regarding your edit, the nodes() function returns you the nodes along the path :
MATCH p=allShortestPaths((n:Point { name:"Point5" })-[*]->(n2:Point { name:"Point8" }))
WHERE ALL (r IN rels(p) WHERE type(r)={relType})
RETURN nodes(p)
In the console link, it is returning nodes Points 5,6,7,8
I guess in your case that using a common relationship type name for connecting your Point nodes would be more efficient.
If having a Path1, Path2, .. is for knowing the distance between two points, you can easily know the distance by asking for the length of the path, like this query related to your console link :
MATCH (n:Point { name:"Point1" })
WITH n
MATCH (n2:Point { name:"Point4" })
WITH n, n2
MATCH p=shortestPath((n)-[]->(n2))
RETURN length(p)
If you need to return only paths having a defined relationship length, you can use it without the shortestPath by specifying a strict depth :
MATCH (n:Point { name:"Point1" })
WITH n
MATCH (n2:Point { name:"Point4" })
WITH n, n2
MATCH p=(n)-[*3..3]->(n2)
RETURN length(p)
LIMIT1
As you can see here, the need to specify the relationship is not mandatory, you can just omit it or add the :NEXT type if you have other relationship types in your graph
If you need to match on the type, for e.g. the path from point 5 to point 8 in your console link, and the path can only have a PATH_TWO relationship, then you can do this :
MATCH (n:Point { name:"Point5" })
WITH n
MATCH (n2:Point { name:"Point8" })
WITH n, n2
MATCH p=(n)-[r*]->(n2)
WHERE type(r[0])= 'PATH_TWO'
WITH p, length(p) AS l
ORDER BY l
RETURN p, l
LIMIT 1
If you really NEED to have the Path1, Path2 style, maybe a short explanation on the need could help us find the more appropriate query
MATCH p=shortestpath((n1:Point{name:{n1name}})-[:relType *]->(n2:Point {name:{n2name}}))
RETURN p

Generic traversal of a directed tree with Neo4J

I modelled a directed tree structure using the graph database Neo4J. So I have something like this: http://ouwarovite.net/YAPC/220px-Binary_tree.svg.png (not mandatory binary)
Users of my database can add child nodes of existing nodes at will, so the height of the tree and the degree of the single nodes is unknown.
Now, I want to query my tree like this: Starting with node x, give me all leaves that are descendants of leave x.
Is this kind of query performable with Gremlin or Cypher and if so, how to do this achieving the maximum of performance? (I haven't found a possibility to perform queries on 'generic' trees because you alway have to specify a maximum depth)
I know, that it's possible with the REST / JSON framework and the JAVA API like this:
POST /db/data/node/51/traverse/node
{
"return_filter" :
{
"body" : "position.endNode().hasProperty('leave')" ,
"language" : "javascript"
},
"relationships" : [ { "type" : "_default", "direction" : "out" } ] ,
"prune_evaluator" : { "name" : "none" , "language" : "builtin" }
}
(my leaves have the property 'leave', my edges have no type -> so _default)
Is there a simpler / better way to do this maybe with a better performance?
Cypher could look like that:
start root=node({rootId})
match root-[*]->child
where child.leave
return child
rootId being a parameter to be passed in.

Resources