I have a tree structure that is defined across two tables that I need to use to work out, starting from the leaf object, the folder path in which the leaf object exists.
FOLDER_DETAILS contains the object and the folder details; FOLDER_HIERARCHY contains each folder's association with its parent folder, if there is one.
Each leaf object could be 1 folder level deep, or it could be 100 folder levels deep.
At the moment, I'm using recursive C# code to perform a query-per-level, which feels expensive and is definitely time-consuming over my connection to the database. I'm wondering if there is a way to do this via a single recursive query?
Current method:
Because I start this process already knowing the leaf object, it is easy for me to get its OBJECT_ID and OWNER_ID values. The OWNER_ID value for the leaf object is the id of the leaf folder.
From there, the basic outline of the query I execute at each level is:
SELECT FD.FOLDER_NAME, FD.FOLDER_ID, FH.OWNER_ID
FROM FOLDER_DETAILS FD
INNER JOIN FOLDER_HIERARCHY FH ON FD.FOLDER_ID = FH.ID
WHERE FD.OBJECT_ID = :object_id
AND FH.OWNER_ID = :owner_id;
From each query I collect the "FOLDER_NAME" value as the folder name for that level, and the "FOLDER_ID" (becomes the OBJECT_ID) and "OWNER_ID" are fed into the next iteration of the recursion.
Do this enough times, and I get a path made up of each of the folder names, from the leaf object back to the root folder.
But again, I would love to do this recursively in a single query...
The link provided by @TonyAndrews was the solution: a hierarchical (aka "CONNECT BY") query.
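For anyone landing here without the link, here is a minimal sketch of the single-query idea. It is not the Oracle CONNECT BY syntax the link describes: it swaps in a standard recursive common table expression, run through Python's built-in sqlite3 module so it can be tried anywhere. The schema is an assumed simplification of the two tables in the question (OWNER_ID is taken to be the parent folder's id), and the names folder_path and demo_connection are hypothetical.

```python
import sqlite3

# Assumed, simplified versions of the two tables from the question.
SCHEMA = """
CREATE TABLE folder_details (folder_id INTEGER PRIMARY KEY, folder_name TEXT);
CREATE TABLE folder_hierarchy (id INTEGER PRIMARY KEY, owner_id INTEGER);
"""

PATH_QUERY = """
WITH RECURSIVE chain(folder_id, depth) AS (
    SELECT ?, 0                       -- start at the leaf folder
    UNION ALL
    SELECT fh.owner_id, chain.depth + 1
    FROM chain
    JOIN folder_hierarchy fh ON fh.id = chain.folder_id
    WHERE fh.owner_id IS NOT NULL     -- the root has no parent
)
SELECT fd.folder_name
FROM chain
JOIN folder_details fd ON fd.folder_id = chain.folder_id
ORDER BY chain.depth DESC             -- root first, leaf last
"""


def folder_path(conn, leaf_folder_id):
    """Return the path from the root folder down to the leaf folder."""
    rows = conn.execute(PATH_QUERY, (leaf_folder_id,)).fetchall()
    return "/".join(name for (name,) in rows)


def demo_connection():
    """Build an in-memory database holding a three-level sample tree."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO folder_details VALUES (?, ?)",
                     [(1, "root"), (2, "projects"), (3, "reports")])
    conn.executemany("INSERT INTO folder_hierarchy VALUES (?, ?)",
                     [(2, 1), (3, 2)])
    return conn
```

Calling folder_path(demo_connection(), 3) walks the whole chain in one round trip instead of one query per level, which is the saving the question is after.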
Related
When implementing a recursive data structure like TREE, I need a common attribute per TREE, and I wonder how to implement it:
Adding the attribute to a TREE node, replicates the attribute for every node, not once per TREE
Using a once attribute, I get only one shared attribute for all TREEs, not one per TREE.
Is there any elegant Eiffel-style solution for that?
A tree has a distinguished root node that can be used to store information related to the whole tree rather than to a specific node. In order to retrieve this information, there should be a possibility to reach the root node from any other node of the tree. One possible solution is to have a feature parent that would return the parent node of the specified node (or Current for the root). Then, the feature that obtains the root node can look like
    root: TREE
            -- The root of the tree.
        local
            p: TREE
        do
            from
                Result := Current
                p := parent
            until
                Result = p -- `Result = Result.parent` when `Result` is root.
            loop
                Result := p
                p := p.parent
            end
        ensure
            Result.parent = Result -- `Result` has no other parent.
        end
Then the tree-specific attribute value can be retrieved from an arbitrary tree node n with n.root.my_attribute.
EDIT:
Another possibility is to have a dedicated CELL with the required data, with all nodes in the tree simply referring to this cell. The benefit is that no reference to the parent node is needed and access to the data is immediate.
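To make the shared-cell idea concrete, here is a small sketch. It is written in Python rather than Eiffel purely for illustration, and the class names Cell and TreeNode and the feature add_child are hypothetical, not part of any Eiffel library.

```python
class Cell:
    """Mutable box holding the per-tree data (a stand-in for Eiffel's CELL)."""

    def __init__(self, item):
        self.item = item


class TreeNode:
    """Hypothetical node type; every node of one tree shares one Cell."""

    def __init__(self, value, shared):
        self.value = value
        self.shared = shared      # same object for all nodes of the tree
        self.children = []

    def add_child(self, value):
        # The child receives the reference, not a copy of the data.
        child = TreeNode(value, self.shared)
        self.children.append(child)
        return child
```

Each tree gets its own cell when its root is created, so an update through any node is visible from every other node of the same tree and from no other tree. One trade-off of this design: moving a subtree into a different tree would mean re-pointing the cell reference in each moved node.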
We often use GraphBLAS for graph processing so we need to use the incidence matrix. I haven't been able to find a way to export this from Grakn to a csv or any file. Is this possible?
There isn't a built-in way to dump data to CSV in Grakn right now. However, we do highly encourage our community to contribute open source tooling for these kinds of tasks! Feel free to chat to us about it on our Discord.
As to how it can be done, conceptually it's pretty easy:
A query to stream all hyper-relations out:
match $r isa relation;
and then, for each relation, we can pipeline another query (possibly in a new transaction if you wish to keep memory usage lower):
match $r iid <iid of $r from previous query>; $r ($x); get $x;
which will get you everything in this particular hyper relation $r playing a role.
If you also wish to extract attributes that are attached to the hyper relation, you can use the following:
match $r iid <iid of $r from first query>; $r has $a; get $a;
In effect we can use these steps to build up each column in the A incidence matrix.
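As a rough sketch of that last step, suppose the results of the per-relation queries have already been collected into a plain dictionary mapping each relation iid to the iids of its role players. The code below makes no Grakn client calls at all; the input shape, the function names, and the sample iids in the usage note are assumptions made for illustration.

```python
import csv
import io


def incidence_matrix(players_by_relation):
    """Build a vertex-by-hyperedge 0/1 matrix.

    players_by_relation maps a relation iid to the iids of its role players,
    i.e. the collected results of the per-relation queries.
    """
    edges = sorted(players_by_relation)
    vertices = sorted({v for players in players_by_relation.values() for v in players})
    row_of = {v: i for i, v in enumerate(vertices)}
    matrix = [[0] * len(edges) for _ in vertices]
    for col, edge in enumerate(edges):
        for v in players_by_relation[edge]:
            matrix[row_of[v]][col] = 1
    return vertices, edges, matrix


def to_csv(vertices, edges, matrix):
    """Serialise the matrix with a header row of relation iids."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["iid"] + edges)
    for v, row in zip(vertices, matrix):
        writer.writerow([v] + row)
    return buf.getvalue()
```

For example, incidence_matrix({"r1": ["a", "b"], "r2": ["b", "c"]}) yields one row per thing and one column per relation. A dense list of lists is only reasonable for small graphs; for anything sizeable you would more likely emit (row, column) pairs and let the consumer build a sparse matrix.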
There are a couple of important caveats I should bring up:
The result will exclude all type information: the types of the hyper relations, the types of the role players in the relations, the actual role that each role player plays, and the types of the attributes owned.
==> It would be interesting to hear/discuss how one could encode type information for use in GraphBLAS
In Graql, it's entirely possible to have relations participating in relations. In the worst case, this means all hyper-edges E will also be present in the set V. In practice only a few relations will play a role in other relations, so only a subset of E may be in V.
So the incidence matrix is equivalent to the nodes/edges array used in force graph visualisation. In this case it is pretty straightforward.
My approach would be slightly different from the above, as all I need to do is pull all of the things in the db (entities, relations, attributes) with
match $ting isa thing;
Now, when I get my transaction back, for each $ting I would pull all of the available properties using both local and remote methods if I were building a force graph viz. For your incidence matrix, though, I really only care about pulling 3 bits of data:
The iid of the thing
The attributes the thing may own.
The roles the thing owns if it is a relation
Essentially, one tests each returned object to find out its type (e.g. entity, attribute, relation), and then uses some of the local and remote methods to get the data one wants. In Python, the code for pulling the data for relations looks like this:
# pull relation data
elif thing.is_relation():
    rel = {}
    rel['type'] = 'relation'
    rel['symbol'] = key
    rel['G_id'] = thing.get_iid()
    rel['G_name'] = thing.get_type().get_label().name()
    att_obj = thing.as_remote(r_tx).get_has()
    att = []
    for a in att_obj:
        att.append(a.get_iid())
    rel['has'] = att
    links = thing.as_remote(r_tx).get_players_by_role_type()
    logger.debug(f' links are -> {links}')
    edges = {}
    for edge_key, edge_thing in links.items():
        logger.debug(f' edge key is -> {edge_key}')
        logger.debug(f' edge_thing is -> {list(edge_thing)}')
        edges[edge_key.get_label().name()] = [e.get_iid() for e in list(edge_thing)]
    rel['edges'] = edges
    res.append(rel)
    layer.append(rel)
    logger.debug(f'rel -> {rel}')
This then gives us a node array, which we can easily process to build an edges array (i.e. the links joining an object and the attributes it owns, or the links joining a relation to its role players). Thus, exporting your incidence matrix is pretty straightforward.
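Processing the node array into an edges array could look roughly like the sketch below. It assumes each node is a dictionary shaped like the rel dictionary built above, with a 'G_id', a 'has' list of attribute iids and, for relations, an 'edges' mapping from role name to player iids. The function name build_edges and the tuple layout are my own choices for illustration.

```python
def build_edges(nodes):
    """Derive (source, target, label) links from the node array."""
    edges = []
    for node in nodes:
        # Ownership links: thing -> each attribute it owns.
        for attr_iid in node.get("has", []):
            edges.append((node["G_id"], attr_iid, "has"))
        # Role links: relation -> each role player, labelled by role name.
        for role, player_iids in node.get("edges", {}).items():
            for player_iid in player_iids:
                edges.append((node["G_id"], player_iid, role))
    return edges
```

Entities and attributes simply have no 'edges' key, so they only contribute ownership links. Keeping the role name as the label preserves a little of the type information that a bare incidence matrix would drop.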
Say I want to find the shortest path between some node of a specific type (say "central production unit") and a defined end node (say "consumer" with an id). How do I calculate this in Neo4j with Cypher?
With such queries I'd like to answer questions like: "Which production unit feeds this customer over the shortest distance?"
I tried with queries like:
match p=AllShortestPaths((source:Asset)-[:LINKS_TO*]-(destination:Asset))
where source.type = 'central production unit' and destination.id = '1234'
return extract(n in nodes(p)| n.type) as type_path,
extract(n in nodes(p)| n.id) as id_path,
length(p) as path_length;
Queries like the one above run into an out-of-memory error.
Using the same query but instead of a type of node, entering a specific id works perfectly fine.
Searching Stack Overflow, I've found several examples going from one specific node to one other specific node, but none going from a yet-to-be-determined node of a certain type to one specific node.
I think I've found a solution using the spanningTree procedure.
This works pretty fast, though I do not understand why. I also do not know how to include link properties so that the search minimizes a physical property instead of the number of hops.
// first match to and collect end nodes
MATCH (m:Asset {type:'central production unit'})
WITH collect(m) as endNodes
MATCH (n:Asset {id:'1234'})
// proc call will be executed per n node, finding the first shortest path found from n to one of the end nodes
CALL apoc.path.spanningTree(n, {endNodes:endNodes, limit:1}) YIELD path
RETURN path
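One plausible reading of why this is fast, which the post itself does not confirm: the expansion starts once from the single known node and can stop at the first end node it reaches, instead of enumerating all shortest paths from every candidate source. The sketch below shows that idea outside Neo4j, using Dijkstra's algorithm so that a link weight is minimized rather than the hop count. It makes no APOC calls; the graph layout, the weights, and the function name nearest_of_type are assumptions.

```python
import heapq


def nearest_of_type(graph, node_type, start, wanted_type):
    """Return (cost, path) from start to the closest node of wanted_type.

    graph maps node -> list of (neighbour, weight); links are treated as
    undirected, so each link is listed under both endpoints.
    """
    queue = [(0, start, [start])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node in settled:
            continue
        settled.add(node)
        if node_type.get(node) == wanted_type:
            return cost, path          # first hit is the cheapest one
        for neighbour, weight in graph.get(node, []):
            if neighbour not in settled:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return None
```

With every weight set to 1 this degenerates to a hop-count search, which is what the query above minimizes. With real weights (a length or resistance property on the link, say) the nearest unit by hops and the nearest unit by cost can differ.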
I have designed a model in Neo4j to get paths from one station to another, including the platforms and legs involved. The model is depicted below. Basically, I need a query that takes me from NBW to RD and also shows the platforms and legs involved. I am struggling with the query: I get no result. I would appreciate any help.
Here is my Cypher statement:
MATCH p = (a:Station)-[r:Goto|can_board|can_alight|has_platfrom*0..]->(c:Station)
WHERE (a.name='NBW')
AND c.name='RD'
RETURN p
Model:
As mentioned in the comments, in Cypher you can't use a directed variable-length relationship that uses differing directions for some of the relationships.
However, APOC Procedures just added the ability to expand based on sequences of relationships. You can give this a try:
MATCH (start:station), (end:station)
WHERE start.name='NBW' AND end.name='THT'
CALL apoc.path.expandConfig(start, {terminatorNodes:[end], limit:1,
relationshipFilter:'has_platform>, can_board>, goto>, can_alight>, <has_platform'}) YIELD path
RETURN path
I added a limit so that only the first (and shortest) path to your end station will be returned. Removing the limit isn't advisable, since this will continue to repeat the relationships in the expansion, going from station to station, until it finds all possible ways to get to your end station, which could hang your query.
EDIT
Regarding the new model changes, the reason the above will not work is because relationship sequences can't contain a variable-length sequence within them. You have 2 goto> relationships to traverse, but only one is specified in the sequence.
Here's an alternative that doesn't use sequences, just a whitelisting of allowed relationships. The spanningTree() procedure uses NODE_GLOBAL uniqueness so there will only be a single unique path to each node found (paths will not backtrack or revisit previously-visited nodes).
MATCH (start:station), (end:station)
WHERE start.name='NBW' AND end.name='RD'
CALL apoc.path.spanningTree(start, {terminatorNodes:[end], limit:1,
relationshipFilter:'has_platform>|can_board>|goto>|can_alight>|<has_platform'}) YIELD path
RETURN path
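To illustrate why the mixed directions in the filter matter, here is a small sketch of a breadth-first search that only follows whitelisted (type, direction) steps and never revisits a node. It runs on plain Python data, not through APOC. The intermediate node names (P1, L1, L2, P2) are invented, since the model image is not reproduced here, and first_path is a hypothetical helper.

```python
from collections import deque


def first_path(rels, allowed, start, end):
    """Breadth-first search that only follows whitelisted (type, direction) steps.

    rels is a list of (source, rel_type, target) triples; allowed is a set of
    (rel_type, direction) pairs, direction being "out" or "in".
    """
    visited = {start}                 # global visited set: no node is revisited
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == end:
            return path
        for source, rel_type, target in rels:
            if source == node and (rel_type, "out") in allowed:
                nxt = target
            elif target == node and (rel_type, "in") in allowed:
                nxt = source
            else:
                continue
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None
```

With has_platform pointing from each station to its platform, the last hop into the destination station has to be taken against the relationship direction. Dropping the ("has_platform", "in") entry from the whitelist makes the search return nothing, which mirrors the empty result of the purely directed pattern in the question.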
Your query is directed (-->), and not all of the relationships between your two stations run in the same direction. If you remove the relationship direction, you will get a result.
Once you have a result, something like the query below could point you in the right direction for extracting particular details from the returned path.
Essentially, I am assuming that everything you are interested in is in the returned path, so you just need to filter out the different pieces.
As @InverseFalcon points out, this query should be limited in a larger graph or it could easily run away.
MATCH p = (a:Station)-[r:Goto|can_board|can_alight|has_platfrom*0..]-(c:Station)
WHERE (a.name='NBW')
AND c.name='THT'
RETURN filter( n in nodes(p) WHERE 'Platform' in labels(n)) AS Platforms
I have the following graph in a Neo4j graph database and, using the Cypher query language, I want to retrieve all of the data that is connected to the root node and its child nodes.
For example:
Please see the graph image below.
[As per the image, node 1 has two children, and those children in turn have many children connected by the same relationship. What I want is to query node 1 with Cypher and get back all of the data for its child nodes, their child nodes, and so on. The relationship between the nodes is "Parent_of".]
Can anyone help me with this?
start n=node(1) // use the id, or find it using an index
match n-[:parent_of*0..]->m
return m
will get you all the graph nodes in m. You could also take m.some_property instead of m if you don't want the node itself, but some property that is stored in your nodes.
Careful though, as the path has no limit, this query could become pretty huge in a large graph.
You can see an example of *0.. here: http://gist.neo4j.org/?6608600
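To show what the *0.. pattern collects, and what putting an upper bound on it changes, here is a small sketch on plain Python data. It assumes the parent_of links form a tree, so each node is reached by exactly one path. The adjacency mapping and the function name descendants are hypothetical.

```python
def descendants(children, root, max_depth=None):
    """Collect root and every node reachable via parent_of links.

    children maps a node to the nodes it is parent_of. max_depth=None mirrors
    the unbounded *0.. pattern; an integer caps the depth like *0..N.
    """
    found = []
    stack = [(root, 0)]
    seen = set()
    while stack:
        node, depth = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        found.append(node)
        if max_depth is None or depth < max_depth:
            # Reverse so children are visited in their listed order.
            for child in reversed(children.get(node, [])):
                stack.append((child, depth + 1))
    return found
```

The lower bound of 0 is why the root itself is included in the result. Passing max_depth plays the role of an upper bound on the pattern, which is the usual way to keep such a traversal from growing out of hand on a large graph.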