Get path lengths for every relationship neo4j - graph

I have a graph that looks like this (starting from the rightmost side), with relationships that have a unique number attribute called Isnad. I want to write a query that returns the length of every Isnad from the start node to the end node, but I can't figure out how to traverse the path for each Isnad separately. Any help?

I don't know if it is the most elegant solution, but I think it works. First, I get all unique Isnad values of the relationships outgoing from the rightmost node, which I look up by an identifier. Then I use variable-length pattern matching in which all relationships must have the same value for the Isnad property. Finally, the Isnad value and the path length are returned.
match ({id:'unique-identifier-of-rightmost-side-node'})-[r]->()
with distinct r.Isnad as Isnad
match p = ()-[*{Isnad : Isnad}]->()
return Isnad, length(p) as Length
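One thing to keep in mind with the query above: an unanchored variable-length pattern also matches every sub-path, so it can return several rows per Isnad, and the largest Length for each value corresponds to the full chain. As a cross-check outside the database, the same idea can be sketched in plain Python. This is only an illustration: the edge tuples, the node names, and the function name isnad_lengths are made up, and it assumes each Isnad value labels a single unbranched, acyclic chain.

```python
from collections import defaultdict


def isnad_lengths(edges):
    """Return {isnad: number of relationships in that chain}.

    edges is an iterable of (source, target, isnad) tuples (hypothetical data).
    Assumes every Isnad value labels one simple, unbranched, acyclic chain.
    """
    next_node = defaultdict(dict)  # isnad -> {source: target}
    for source, target, isnad in edges:
        next_node[isnad][source] = target

    lengths = {}
    for isnad, step in next_node.items():
        targets = set(step.values())
        # The chain starts at the only source that is never a target.
        starts = [node for node in step if node not in targets]
        if len(starts) != 1:
            raise ValueError(f"Isnad {isnad} is not a single chain")
        node, hops = starts[0], 0
        while node in step:
            node = step[node]
            hops += 1
            if hops > len(step):  # guard against a cycle in the data
                raise ValueError(f"Isnad {isnad} contains a cycle")
        lengths[isnad] = hops
    return lengths


# Two chains sharing the same start and end nodes (sample data)
edges = [
    ("a", "b", 1), ("b", "c", 1), ("c", "d", 1),
    ("a", "x", 2), ("x", "d", 2),
]
print(isnad_lengths(edges))  # {1: 3, 2: 2}
```

If the chain assumption does not hold for your data, the function raises instead of guessing a length.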

Related

Why can't I MATCH (v:<tag>)-[e:<edge>]-(v2:<tag>) RETURN v LIMIT 10 in the NebulaGraph database

The Nebula Graph docs say that "When traversing all vertices of the specified Tag or edge of the specified Edge Type, such as MATCH (v:player) RETURN v LIMIT N, there is no need to create an index, but you need to use LIMIT to limit the number of output results." But when I run the statement in the preceding screenshot, it told me that I did not have a limit number, which I did.
What is the correct way to RETURN v without creating indexes?
I ran into the same issue before. When you specify both a tag and an edge type in the same query, you need to create an index for the tag or the edge type first.
Create an index for the tag company first and then try to execute it again.

How to properly use MATCH inside UNWIND for a Nebula query

I’m currently working with the Nebula graph database for the first time and I’m running into some issues with a query. In terms of the schema, I have “Person” nodes, which have a “name” property, as well as Location nodes also with a name property. These node types can be connected by a relationship edge, called HAS_LIVED (to signify whether a person has lived in a certain location). Now for the query, I have a list of names (strings). The query looks like:
UNWIND ["Anna", "Emma", "Zach"] AS n
MATCH (p:Person {name: n})-[:HAS_LIVED]->(loc)
RETURN loc.Location.name
This should return a list of three places, i.e. [“London”, “Paris”, “Berlin”]. However, I am getting nothing as a result from the query. When I get rid of the UNWIND and write three separate MATCH queries with each name, it works individually. Not sure why.
Try this instead. It moves the name filter into a WHERE clause.
UNWIND ["Anna", "Emma", "Zach"] AS n
MATCH (p:Person)-[:HAS_LIVED]->(loc)
WHERE p.Person.name == n
RETURN loc.Location.name

Gremlin query: Find all the related vertices till end which match with edge properties

I need to start from one vertex and follow all related vertices to the end. The criterion is that at least one of the edge's properties (attributes) must match the edge's inV vertex: if an edge attribute's value doesn't match an attribute name on the inV vertex, I should skip that vertex. The attribute value of the edge is propagated as an attribute name in the inV vertex.
I am using the query below; however, it gives me JSON output of the parent node, the next node, and the edges between them. From that output I apply my own logic to pick only the next vertices whose attributes match the edge attributes. If the attribute matching could be done in the Gremlin query itself, that would be great.
var graphQuery = "g.V().has('sPath', '/Assisted/CSurvey/CSurvey.ss').as('ParentStream').outE.inV().as('Edges').map(select('Edges').inV().fold()).as ('NextStream').select('ParentStream', 'NextStream','Edges')";
In the attached image, I need to get vertex1 and vertex2 and skip vertex3, as it has no attributes matching the edge.
image link
Use graph traversal and filter
Example in Scala:
graph.traversal().V().has().bothE('sPath').filter{it:Edge =>
(it.property('sPath').value() == '/Assisted/CSurvey/CSurvey.ss')}.toList()
Hope this helps
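The filtering rule from the question can be written out in plain Python to make it concrete before translating it into a traversal. This is a sketch of the logic only, not Gremlin: the dictionaries, the property names, and the function name matching_next_vertices are invented for illustration, and it assumes an edge "matches" when at least one of its property values is the name of a property on the in-vertex.

```python
def matching_next_vertices(out_edges):
    """Keep in-vertices that have a property named after some edge property value.

    out_edges is a list of (edge_properties, in_vertex) pairs, where both
    items are plain dicts (hypothetical stand-ins for graph elements).
    """
    kept = []
    for edge_props, vertex in out_edges:
        vertex_keys = set(vertex["properties"])
        # An edge matches if any of its values is a property name on the vertex.
        if any(value in vertex_keys for value in edge_props.values()):
            kept.append(vertex["name"])
    return kept


# Sample data: vertex3 has no property named after an edge value
out_edges = [
    ({"attr1": "color"}, {"name": "vertex1", "properties": {"color": "red"}}),
    ({"attr1": "size", "attr2": "shape"}, {"name": "vertex2", "properties": {"shape": "round"}}),
    ({"attr1": "weight"}, {"name": "vertex3", "properties": {"height": 10}}),
]
print(matching_next_vertices(out_edges))  # ['vertex1', 'vertex2']
```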

CYPHER storing count()-value for further usage

I want to create a relationship between nodes that have one or multiple things in common and want to set the count of the common things as a property inside the relationship.
For example: in the movie tutorial graph I want to create a relationship between actors that have acted in the same movie(s) together, and set the count of the movies they played in together as a property on that relationship.
For the basic counting, the tutorial provides a query:
MATCH (n)-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors)
RETURN n.name, coActors.name, count(*) AS Strength ORDER BY Strength DESC
This gives me a list of 2 names and the number of times they played in movies together (e.g. "Keanu Reeves", "Carrie-Anne Moss", Strength: 3, as there are 3 Matrix movies in the graph).
Now I want to create a relationship (ACTED_WITH) between these pairs and set the strength-value as a property inside it.
I can create a relationship like this:
MATCH (a)-[:ACTED_IN]->(p)<-[:ACTED_IN]-(b) MERGE (a)-[r:ACTED_WITH]->(b)
MERGE ensures that there is only one relationship created, but I just can't get the counting-stuff to work with the creation.
I'm not sure I understand what you want, but maybe something like this:
MATCH (a)-[:ACTED_IN]->(m)<-[:ACTED_IN]-(b)
WITH a,b, collect(m) AS movies ORDER BY size(movies) DESC
MERGE (a)-[r:ACTED_WITH]-(b)
ON CREATE SET r.Strength=SIZE(movies)
You can use the SET Cypher clause to set properties on a matched node or relationship. If you put the property in the MERGE pattern instead, MERGE will treat it as part of the unique identifier and will create a new relationship whenever one with that specific value doesn't exist yet.
MATCH (a)-[:ACTED_IN]->(p)<-[:ACTED_IN]-(b)
MERGE (a)-[r:ACTED_WITH]->(b)
// reduce matched set to one row of data
WITH DISTINCT a, b, r, COUNT(p) as strength, COLLECT(p) as movies
// set r
SET r.strength = strength
// Return everything to verify above results
RETURN *
SET will overwrite any previous value. If you want to only set it if you created the relationship, you can use ON CREATE or ON MATCH.
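To sanity-check the strength values that end up on the relationships, the per-pair count can be reproduced outside the database. The following is a small Python sketch on made-up sample data; the function name co_actor_strength and the cast_by_movie dictionary are assumptions for illustration. Note one difference: the Cypher pattern matches each pair in both orders, (a, b) and (b, a), while this sketch counts each unordered pair once.

```python
from collections import Counter
from itertools import combinations


def co_actor_strength(cast_by_movie):
    """Return a Counter mapping (actor_a, actor_b) to the number of shared movies."""
    strength = Counter()
    for cast in cast_by_movie.values():
        # sorted() gives every unordered pair one canonical key
        for pair in combinations(sorted(set(cast)), 2):
            strength[pair] += 1
    return strength


# Sample data only, not read from the movie graph
cast_by_movie = {
    "The Matrix": ["Keanu Reeves", "Carrie-Anne Moss"],
    "The Matrix Reloaded": ["Keanu Reeves", "Carrie-Anne Moss"],
    "The Matrix Revolutions": ["Keanu Reeves", "Carrie-Anne Moss"],
    "Johnny Mnemonic": ["Keanu Reeves"],
}
strength = co_actor_strength(cast_by_movie)
print(strength[("Carrie-Anne Moss", "Keanu Reeves")])  # 3
```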

Referencing graph nodes by integer ID

As a bit of a learning project, I am working to replace a somewhat slow program in perl with a Chapel implementation. I've got the algorithms down, but I'm struggling with the best way to reference the data in Chapel. I can do a direct translation, but it seems likely I'm missing a better way.
Details of existing program:
I have a graph with ~32000 nodes and ~2.1M edges. State is saved in
data files, but it's run as a daemon that keeps data in memory.
Each node has a numeric ID (assigned by another system) and has a variety
of other attributes defined by string, integer, and boolean values.
The edges are directional and have a couple of boolean values
attributed to them.
I have an external system that interacts with this daemon that I cannot change. It makes requests, such as "Add node (int) with these attributes", "find shortest path from node (int) to node (int)", or "add edges from node (int) to node(s) (int, int, int)"
In Perl, the program uses hashes with common integer IDs for node and edge attributes. I can certainly replicate this in Chapel with associative arrays.
Is there a better way to bundle this all together? I've been trying to wrap my head around ways to have opaque node and edge with each item defined, but struggling with how to reference them with the integer IDs in an easy fashion.
If somebody can provide an ideal way to do the following, it would get me the push I need.
Create two nodes with xx attributes identified by integer ID.
Create an edge between the two with xx attributes
Respond to request "show me the xx attribute of node (int)"
Cheers, and thanks.
As you might expect, there are a number of ways to approach this in Chapel, though I think given your historical approach and your external system's interface, associative domains and arrays are definitely an appropriate way to go. Specifically, your desire to refer to nodes by integer IDs makes associative domains/arrays a natural match.
For Chapel newbies: associative domains are essentially sets of arbitrary values, like the set of integer node IDs in this case. Associative arrays are mappings from the indices of an associative domain to elements (variables) of a given type. Essentially, the domain represents the keys and the array the values in a key-value store or hash table.
To represent the nodes and edges themselves, I'm going to take the approach of using Chapel records. Here's my record for a node:
record node {
  var id: int;
  var str: string,
      i: int,
      flag: bool;
  var edges: [1..0] edge;
}
As you can see, it stores its id as an integer, arbitrary attribute fields of various types (a string str, an integer i, and a boolean flag — you can probably come up with better names for your program), and an array of edges which I'll return to in a second. Note that it may or may not be necessary for each node to store its ID... perhaps in any context where you'd have the node, you would already know its ID, in which case storing it could be redundant. Here I stored it just to show you could, not because you must.
Returning to the edges: In your question, it sounded as though edges might have their own integer IDs and get stored in the same pool as the nodes, but here I've taken a different approach: In my experience, given a node, I typically want the set of edges leading out of it, so I have each node store an array of its outgoing edges. Here, I'm using a dense 1D array of edges which is initially empty (1..0 is an empty range in Chapel since 1 > 0). You could also use an associative array of edges if you wanted to give them each a unique ID. Or you could remove the edges from the node data structure altogether and store them globally. Feel free to ask follow-up questions if you'd prefer a different approach.
Here's my record for representing an edge:
record edge {
  var from, to: int,
      flag1, flag2: bool;
}
The first two fields (from and to) indicate the nodes that the edge connects. As with the node ID above, it may be that the from field is redundant / unnecessary, but I've included it here for completeness. The two flag fields are intended to represent the data attributes you'd associate with an edge.
Next, I'll create my associative domain and array to represent the set of node IDs and the nodes themselves:
var NodeIDs: domain(int),
Nodes: [NodeIDs] node;
Here, NodeIDs is an associative domain (set) of integer IDs representing the nodes. Nodes is an associative array that maps from those integers to values of type node (the record we defined above).
Now, turning to your three operations:
Create two nodes with xx attributes identified by integer ID.
The following declaration creates a node variable named n1 with some arbitrary attributes using the default record constructor/initializer that Chapel provides for records that don't define their own:
var n1 = new node(id=1, "node 1", 42, flag=true);
I can then insert it into the array of nodes as follows:
Nodes[n1.id] = n1;
This assignment effectively adds n1.id to the NodeIDs domain and copies n1 into the corresponding array element in Nodes. Here's an assignment that creates a second anonymous node and adds it to the set:
Nodes[2] = new node(id=2, "node 2", i=133);
Note that in the code above, I've assumed that you want to choose the IDs for each node explicitly (e.g., perhaps your data file establishes the node IDs?). Another approach (not shown here) might be to have them be automatically determined as the nodes are created using a global counter (maybe an atomic counter if you're creating them in parallel).
Having populated our Nodes, we can then iterate over them serially or in parallel (here I'm doing it in parallel; replacing forall with for will make them serial):
writeln("Printing all node IDs (in an arbitrary order):");
forall nid in NodeIDs do
  writeln("I have a node with ID ", nid);
writeln("Printing all nodes (in an arbitrary order):");
forall n in Nodes do
  writeln(n);
The order in which these loops print the IDs and nodes is arbitrary for two reasons: (1) they're parallel loops; (2) associative domains and arrays store their elements in an arbitrary order.
Create an edge between the two with xx attributes
Since I associated the edges with nodes, I took the approach of creating a method on the node type that will add an edge to it:
proc node.addEdge(to: int, flag1: bool, flag2: bool) {
  edges.push_back(new edge(id, to, flag1, flag2));
}
This procedure takes the destination node ID, and the attributes as its arguments, creates an edge using that information (and supplying the originating node's ID as the from field), and uses the push_back() method on rectangular arrays to add it to the list of edges.
I then call this routine three times to create some edges for node 2 (including redundant and self-edges since I only have two nodes so far):
Nodes[2].addEdge(n1.id, true, false);
Nodes[2].addEdge(n1.id, false, true);
Nodes[2].addEdge(2, false, false);
And at this point, I can loop over all of the edges for a given node as follows:
writeln("Printing all edges for node 2: (in an arbitrary order):");
forall e in Nodes[2].edges do
  writeln(e);
Here, the arbitrary printing order is only due to the use of the parallel loop. If I'd used a serial for loop, I'd traverse the edges in the order they were added due to the use of a 1D array to represent them.
Respond to request "show me the xx attribute of node (int)"
You've probably got this by now, but I can get at arbitrary attributes of a node simply by indexing into the Nodes array. For example, the expression:
...Nodes[2].str...
would give me the string attribute of node 2. Here's a little helper routine I wrote to get at (and print) a few attributes:
proc showAttributes(id: int) {
  if (!NodeIDs.member(id)) {
    writeln("No such node ID: ", id);
    return;
  }
  writeln("Printing the complete attributes for node ", id);
  writeln(Nodes[id]);
  writeln("Printing its string field only:");
  writeln(Nodes[id].str);
}
And here are some calls to it:
showAttributes(n1.id);
showAttributes(2);
showAttributes(3);
I am working to replace a somewhat slow program in perl with a Chapel implementation
Given that speed is one of your reasons for looking at Chapel, once your program is correct, re-compile it with the --fast flag to get it running quickly.
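For comparison with the Perl hashes mentioned in the question, the same integer-ID keyed layout can be sketched in Python. This is not Chapel and not part of the answer's code: the field names loosely mirror the records above, and shortest_path (a breadth-first search over outgoing edges that counts every edge as one hop) is an added illustration of one of the requests the external system makes, under the assumption that edges are unweighted.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Edge:
    src: int
    dst: int
    flag1: bool = False
    flag2: bool = False


@dataclass
class Node:
    id: int
    str_attr: str = ""
    i: int = 0
    flag: bool = False
    edges: list = field(default_factory=list)  # outgoing edges only


nodes = {}  # integer ID -> Node, playing the role of the associative array


def add_node(node_id, **attrs):
    nodes[node_id] = Node(id=node_id, **attrs)


def add_edge(src, dst, flag1=False, flag2=False):
    nodes[src].edges.append(Edge(src, dst, flag1, flag2))


def shortest_path(start, goal):
    """Breadth-first search; returns a list of node IDs, or None if unreachable."""
    if start not in nodes or goal not in nodes:
        return None
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for edge in nodes[current].edges:
            if edge.dst in nodes and edge.dst not in previous:
                previous[edge.dst] = current
                queue.append(edge.dst)
    return None


add_node(1, str_attr="node 1", i=42, flag=True)
add_node(2, str_attr="node 2", i=133)
add_node(3, str_attr="node 3")
add_edge(1, 2, True, False)
add_edge(2, 3)
print(nodes[2].str_attr)    # node 2
print(shortest_path(1, 3))  # [1, 2, 3]
print(shortest_path(3, 1))  # None
```

Because edges are stored on their source node, looking up a node's outgoing edges is a single dictionary access, which is what the search relies on.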
