I am looking at a Kryo file with the following vertices:
# Tree Vertices
V(label=tree, properties={treeId:1, treeName:treeA})
V(label=tree, properties={treeId:2, treeName:treeB})
# Root Node Vertices
V(label=node, properties={treeId:1, nodeId:111, nodeType:root})
V(label=node, properties={treeId:2, nodeId:222, nodeType:root})
There are no edges between the vertices labeled tree and the vertices labeled node. Further nodes are connected to the root nodes by edges, but they are irrelevant to this question. I do not want to add any edges, as this graph file gets vended to me and I am treating it as read-only.
Now I want to join/project the treeNames into a traversal over the root nodes:
g.V()
.hasLabel('node').has('nodeType', 'root')
.project('nodeId', 'treeId', 'treeName') # return nodeId, treeId, treeName for each root node
.by(values('nodeId'))
.by(values('treeId'))
.by(""" # pseudo-sqlish gremlin to clarify my intent
select treeName
from V().hasLabel('tree')
.where(values('treeId'), eq($thisNode.values('treeId')))
"""
)
In SQL terms I'd say: I want to run a subquery (a fully independent sub-traversal starting from scratch) and then join it with my outer traversal on a given property. And again: there is no edge between trees and roots.
WITH
trees as (SELECT treeId, treeName FROM vertices v WHERE v.label = 'tree'),
roots as (SELECT nodeId, treeId FROM vertices v where v.label = 'node')
SELECT roots.nodeId, roots.treeId, trees.treeName
FROM roots
JOIN trees ON roots.treeId = trees.treeId
So I am looking for a way to perform a projection based on another traversal plus one of the returned vertex properties.
How abusive is this?
How to do it?
You can do it by starting a new traversal inside the project like this:
g.V().hasLabel('node').
  has('nodeType', 'root').as('root').
  project('nodeId', 'treeId', 'treeName').
    by(values('nodeId')).
    by(values('treeId')).
    by(coalesce(
      V().hasLabel('tree').where(eq('root')).
        by('treeId').
        values('treeName'),
      constant('tree not exist')
    ))
See the example here: https://gremlify.com/bybp7s9mdia
How abusive is this: Very.
Starting a sub-query for each node vertex can be very heavy performance-wise, and you lose the advantages of a graph database when your graph schema does not fit your requirements.
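If the per-node sub-traversal proves too slow, one possible alternative is to fetch the trees and the root nodes separately and join them on the client. Below is a minimal Python sketch of that idea. It assumes both result sets have already been fetched as lists of dictionaries (for example with two separate traversals); the function and variable names are illustrative and not part of any Gremlin API.

```python
def join_roots_with_trees(roots, trees, missing="tree not exist"):
    """Hash join: attach treeName to every root node via treeId."""
    # Build the lookup table once, so each root costs a single dictionary lookup.
    tree_name_by_id = {t["treeId"]: t["treeName"] for t in trees}
    return [
        {
            "nodeId": r["nodeId"],
            "treeId": r["treeId"],
            # Mirrors the coalesce(..., constant(...)) fallback of the traversal.
            "treeName": tree_name_by_id.get(r["treeId"], missing),
        }
        for r in roots
    ]


# Sample data mirroring the vertices from the question.
trees = [
    {"treeId": 1, "treeName": "treeA"},
    {"treeId": 2, "treeName": "treeB"},
]
roots = [
    {"treeId": 1, "nodeId": 111},
    {"treeId": 2, "nodeId": 222},
]
rows = join_roots_with_trees(roots, trees)
```

This trades one sub-traversal per root for two full scans plus an in-memory lookup, which only makes sense when both result sets fit comfortably in client memory.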
Related
I want to return a sub-graph that is attached to a certain vertex (Vertex 3). It should not matter how this sub-graph is structured in detail. Ultimately, all I want is to get the vertices and edges that make up the sub-graph.
Simple graph with sub-graph
Using the TinkerPop-modern sample graph you can do this with the following query:
graph = TinkerFactory.createModern()
g = graph.traversal()
g.E(8).drop() // Edge interferes with OP example graph
g.V(1).bothE().as('e').otherV().where(id().not(is(2))).repeat(
bothE().where(neq('e')).subgraph('subGraph').otherV().simplePath()
).emit().times(4).cap('subGraph').
next().traversal().V()
==>v[3]
==>v[4]
==>v[5]
==>v[6]
Explanation:
v(1) is the vertex connected to the subgraph to be found
v(2) is your start vertex, to be excluded from the subgraph
The edge between v(1) and the subgraph also needs to be excluded from the subgraph and gets a reference 'e'
The repeat(...).emit().times(4) does the looping starting from the subgraph's initial edge(s)
The where(neq('e')) makes sure v(1) is not included in the subgraph
cap('subGraph') makes the traversal hold a TinkerGraph object, created from all the edges referenced by subgraph('subGraph')
next().traversal().V() returns the subgraph, creates a GraphTraversalSource from it and shows all vertices in the subgraph
It does not seem possible to start the traversal from v(2), your start vertex, because Gremlin's subgraph mechanism does not create separate subgraphs for different branches in the total graph but rather accumulates them in the global side effect referenced by 'subGraph'.
There is an undirected cyclic graph with n nodes, one of which is a root node. Each node in the graph has a weight. Given an integer k, select k nodes from the graph subject to the two conditions below:
• Sum of weights on the selected nodes should be minimum
• All the selected nodes should have a path to root node
The weights are on the nodes.
First split the graph into its connected components and keep only the component containing the root node. From that component, select the first k vertices in non-decreasing order of weight. Note that neither existence nor uniqueness of a solution is guaranteed: the component may contain fewer than k vertices, and ties in weight allow several optimal selections.
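The procedure above can be sketched in Python as follows. This is a minimal illustration, assuming the graph is given as an adjacency dictionary and the weights as a separate dictionary; the function name and the sample data are made up for the example.

```python
import heapq
from collections import deque


def select_k_lightest(adjacency, weights, root, k):
    """Return k minimum-weight nodes reachable from root, or None if fewer than k exist."""
    # Breadth-first search collects the connected component containing the root.
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    if len(seen) < k:
        return None  # no solution: the component is too small
    # Pick the k lightest nodes of the component, lightest first.
    return heapq.nsmallest(k, seen, key=lambda n: weights[n])


# Hypothetical sample graph: x and y are not connected to the root r.
adjacency = {
    "r": ["a", "b"],
    "a": ["r", "b"],
    "b": ["r", "a", "c"],
    "c": ["b"],
    "x": ["y"],
    "y": ["x"],
}
weights = {"r": 5, "a": 2, "b": 7, "c": 1, "x": 0, "y": 0}
picked = select_k_lightest(adjacency, weights, "r", 2)
```

Here x and y are the lightest nodes overall but are ignored because they have no path to the root.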
Caveat
Something may be missing from the problem description, as the cyclicity of the graph does not figure in the solution.
Possibly the sum of weights of all vertices of the tree composed of the paths from the solution nodes to the root is to be minimized? The question heading suggests this but the body says otherwise.
If this surmise is true, the task to solve is the node-weighted Steiner Tree problem, restricted to instances with k being the number of terminals.
I am new to Neptune DB. I have created vertices, connected pairs of vertices with edges, and given some properties to both the edges and the vertices.
I want to retrieve the property values of both the edges and the vertices.
Can someone provide me a sample query for this?
Thanks in advance.
Eg:
Vertices: P1, P2, P3
Edges: E1 connecting P1 and P2, E2 connecting P2 and P3
Vertices property: name
Edge property: relation
Now I need to retrieve the name and relation for all the vertices connected to P1.
The path step is what you are looking for. Using the by modulator you can select properties in a round-robin fashion, i.e. alternating between vertex and edge.
Start by locating the p1 vertex:
g.V().hasLabel("testV").has("name","p1")
Repeat traversal along edges with "relation" property:
.repeat(outE("testE").has("relation").inV()).until(__.not(outE("testE")))
Get the traversal path (or tree), and select "name" for vertices, and "relation" for edges using the by modulator:
.path().by("name").by("relation")
To see results in arrays of strings:
.local(unfold().fold())
Note that this traversal doesn't handle cycles, but that's another question.
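To illustrate what "round-robin" means for the by modulators, here is a small plain-Python imitation. It does not talk to a graph database; the path is a hand-written list of dictionaries, and the relation values "r1" and "r2" are hypothetical since the question does not give them.

```python
from itertools import cycle


def apply_round_robin(path, extractors):
    """Apply the extractors to the path elements in turn, wrapping around."""
    return [extract(element) for element, extract in zip(path, cycle(extractors))]


# A path alternates vertex, edge, vertex, edge, vertex.
path = [
    {"name": "p1"},
    {"relation": "r1"},  # hypothetical edge property value
    {"name": "p2"},
    {"relation": "r2"},  # hypothetical edge property value
    {"name": "p3"},
]
labels = apply_round_robin(
    path,
    [lambda vertex: vertex["name"], lambda edge: edge["relation"]],
)
```

The first extractor handles elements 0, 2, 4 (the vertices) and the second handles elements 1, 3 (the edges), which is why two by modulators suffice for a path of any length.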
If you need only first level neighbors, you can take a different approach:
g.V().hasLabel("testV").has("name","p2").bothE()
.project("relation","name")
.by(values("relation"))
.by(otherV().values("name"))
Assume I have graph relations defined as A->B, A->D, C->A, B->C, B->D. I need to get the following subgraph in Neo4j:
Get all 1st degree connections (indegree or outdegree) of a node i.e. For node A, it would be B, C, D
Get all the edges between these 1st degree nodes. Since B,C,D are 1st degree connections, edges would be B->C, B->D
For the 1st part, I have the following query:
MATCH (s:Node)-->(d:Node)
WHERE s.name = 'A' OR d.name = 'A'
RETURN s, d
I'm not able to get the 2nd part of the data in the same query. Do I need to iterate through all the nodes?
This is a simple pattern: start the path at the node, go to one of its neighbours, then to another neighbour from which you can return to the starting node:
MATCH (A:Node {name:'A'}) WITH A
MATCH (A)--(FD1:Node)-[r]-(FD2:Node)--(A)
WHERE ID(FD1) > ID(FD2)
RETURN FD1, r, FD2
P.S. Remember that you can leave the direction of a relationship unspecified in the pattern.
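To check what result to expect from such a query, the two steps from the question can be worked through in plain Python on the example edges. This is only a sketch of the logic, not of how Neo4j evaluates the pattern; the function name is made up for the example.

```python
def neighbour_subgraph(edges, start):
    """Return the first-degree neighbours of start and the edges among them."""
    neighbours = set()
    for src, dst in edges:
        # Direction is ignored: both incoming and outgoing edges count.
        if src == start:
            neighbours.add(dst)
        elif dst == start:
            neighbours.add(src)
    neighbours.discard(start)  # ignore a possible self-loop on the start node
    inner_edges = [(s, d) for s, d in edges if s in neighbours and d in neighbours]
    return neighbours, inner_edges


# The relations from the question: A->B, A->D, C->A, B->C, B->D.
edges = [("A", "B"), ("A", "D"), ("C", "A"), ("B", "C"), ("B", "D")]
neighbours, inner_edges = neighbour_subgraph(edges, "A")
```

For node A this yields the neighbours B, C and D and the edges B->C and B->D, matching the expected result stated in the question.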
In OrientDB, to get the whole graph you only need 'Select From v'.
So far, I have used AQL in ArangoDB with a start node:
for v,e,p IN 2 ANY 'user/188802' graph 'a' return p
Now I want to get all graph nodes in ArangoDB without specifying a start node. How can I do that?
Graphs are a grouping of Edge collections. Each Edge collection references _from and _to documents which are stored in Document collections.
Graph traversal queries expect a starting position and return the results for that single starting position.
It is possible to identify all possible starting positions, and then run graph traversals over those positions.
You'll need to know the names of the document collections that make up your graph. You can then insert them into an AQL query like this:
FOR vertex IN UNION(
(FOR v IN document_collection_1 RETURN v._id),
(FOR v IN document_collection_2 RETURN v._id),
(FOR v IN document_collection_3 RETURN v._id)
)
FOR v, e IN 1..5 OUTBOUND vertex GRAPH 'my_graph_name' OPTIONS { uniqueVertices: true }
RETURN DISTINCT [
{
_from: e._from,
_to: e._to
}
]
Remember that in ArangoDB it is possible for a document collection to be bound to more than one graph, so you'll need to ensure you identify all document collections that are part of the graph.
This query will then extract an array of objects that contain all links defined in the graph. This query focuses only on vertices with edges that are part of the graph. If the vertex has no edge on it, it won't appear in the output as it is not part of the graph.