NebulaGraph Database: when I filter edges by their property values, it shows incorrect results - nebula-graph

Here are some details:
NebulaGraph version is 3.2.1
Deployment way is distributed
Installation way is RPM
Disk is Non-SSD
CPU and memory information are Single node 4core8g
Specific description of the problem
I use the following nGQL to query the 2-hop path data with vid of 76517618765348
MATCH p=(v)-[e:`belong`*2]-(v2)
WHERE id(v) IN ["76517618765348"]
AND ALL(l IN e WHERE l.create_long == 1665368234) RETURN p LIMIT 100
The results are as shown in the figure below
Among them, the edge belong has the property of which type is long and name create_long. When I use the == to query the edge with a value of 1665368234 to filter. Why can’t I find the corresponding data?
The real data relationship is as shown in the figure below

Are you sure you would like to filter all edges with this WHERE clause: WHERE l.create_long == 1665368234?
MATCH p=(v)-[e:`belong`*2]-(v2)
WHERE id(v) IN ["76517618765348"]
AND ALL(l IN e WHERE l.create_long == 1665368234) RETURN p LIMIT 100
As you are using == here, I guess maybe you actually would like to express one of the edge in list of e fulfills create_long == 1665368234?
If it’s the case as I guessed, you should instead use ANY instead of ALL here:
And that is:
MATCH p=(v)-[e:`belong`*2]-(v2)
WHERE id(v) IN ["76517618765348"]
AND ANY(e_ IN e WHERE e_.create_long == 1665368234)
RETURN p LIMIT 100

Related

Neo4j - cypher - variable length pattern slow

I'm playing around with neo4j and cypher and trying to find out why my query is slow.
Here is my original query that is marked as deprecated:
match (h:Hierarchy)-[r:PARENT_OF*1..4]->(s:SoldTo)
where h.id='0100001709'
and all(x in r
where x.to>=date('2022-02-15')
and x.from <= date('2022-02-15')
and x.hvkorg='S000'
and x.hvtweg='D1'
)
and s.loevm=""
and s.aufsd=""
and s.cassd=""
and s.faksd=""
and s.lifsd=""
and s.sperr=""
and s.sperz=""
r>eturn distinct h.id,s.id
This one works fine and returns a result quite quickly: Started streaming 60 records after 1 ms and completed after 17 ms. However neo4j gives the below warning:
This feature is deprecated and will be removed in future versions.
Binding relationships to a list in a variable length pattern is deprecated. (Binding a variable length relationship pattern to a variable ('r') is deprecated and will be unsupported in a future version. The recommended way is to bind the whole path to a variable, then extract the relationships:
MATCH p = (...)-[...]-(...)
WITH *, relationships(p) AS r)
Now, i've tried this:
match p=(h:Hierarchy)-[:PARENT_OF*1..4]->(s:SoldTo)
with *, relationships(p) as r
where h.id='0100001709'
and all(x in r
where x.to>=date('2022-02-15')
and x.from <= date('2022-02-15')
and x.hvkorg='S000'
and x.hvtweg='D1'
)
and s.loevm=""
and s.aufsd=""
and s.cassd=""
and s.faksd=""
and s.lifsd=""
and s.sperr=""
and s.sperz=""
return distinct h.id,s.id
But, this is very slow: Started streaming 60 records in less than 1 ms and completed after 36931 ms.
17ms vs 36931ms
Would any of you have any recommendation to speed things up using relationships()?
There is a typo error on the query. Instead of doing a scan on the first line, put the where clause (found in line #3 to line#2). This will enable a starting node at h and will make your query faster.
Original:
match p=(h:Hierarchy)-[:PARENT_OF*1..4]->(s:SoldTo)
with *, relationships(p) as r
where h.id='0100001709'
Updated:
match p=(h:Hierarchy)-[:PARENT_OF*1..4]->(s:SoldTo)
where h.id='0100001709'
with h, s, relationships(p) as r
...
...
return distinct h.id,s.id

Gremlin : How do you find vertex and edges when some edges does not exists

I am new to gremlin.
I am facing issue in fetching the vertex and edges when sometimes edge from a vertex does not exists.
for example bellow query works fine if it gets all the vertex and edges.
but for one use case edge
`.outE("PRODUCES").`as`("produces"))`
does not exists in db.
in that case bellow query doesnt return any result.
I need your help to resolve this issue.
when edges does not exit then i want input_entity and processed_by in result.
janusGraph.traversal().V()
.has("isActive", "true")
.hasLabel("ENTITY").`as`("input_entity")
.outE("PROCESSED_BY").`as`("processed_by")
.inV().`as`("job")
.outE("PRODUCES").`as`("produces")
.select<String>("job").outE("HAS_STATE")
.`as`("job_state_edge").inV().hasLabel("JOB_STATE").`as`("job_state")
.select<String>("input_entity").outE("HAS_STATE")
.`as`("input_entity_state_edge").inV().hasLabel("ENTITY_STATE").`as`("input_entity_state")
.select<String>("input_entity").outE("CONNECTS_TO").`as`("connects_to")
.inV().hasLabel("ENTITY").has("entityName", TextP.startingWith(rootNamespace))
.`as`("output_entity").outE("HAS_STATE")
.`as`("output_entity_state_edge").inV().hasLabel("ENTITY_STATE").`as`("output_entity_state")
.select<String>("input_entity","output_entity","processed_by","produces","job","job_state","input_entity_state","output_entity_state","input_entity_state_edge","output_entity_state_edge","job_state_edge","connects_to")
.by(elementMap<Element, Any>()).toList()
with optional
janusGraph.traversal().V()
.has("isActive", "true")
.hasLabel("ENTITY").`as`("input_entity")
.outE("HAS_STATE").`as`("input_entity_state_edge").inV().hasLabel("ENTITY_STATE").`as`("input_entity_state")
.select<String>("input_entity").outE("PROCESSED_BY").`as`("processed_by")
.inV().`as`("job").outE("HAS_STATE").`as`("job_state_edge").inV().hasLabel("JOB_STATE").`as`("job_state")
.select<String>("job")
.optional(
outE("PRODUCES").`as`("produces")
.select<String>("input_entity").outE("CONNECTS_TO").`as`("connects_to")
.inV().hasLabel("ENTITY").has("entityName", TextP.startingWith(rootNamespace))
.`as`("output_entity").outE("HAS_STATE").`as`("output_entity_state_edge").inV().hasLabel("ENTITY_STATE").`as`("output_entity_state"))
.select<String>("input_entity","output_entity","processed_by","produces","job","job_state","input_entity_state","output_entity_state","input_entity_state_edge","output_entity_state_edge","job_state_edge","connects_to")
.by(elementMap<Element, Any>()).toList()
There are two Gremlin steps that can help in cases like this. When you have a part of a query that may or may not exist, but you either want the results up to that point if it does not exist or the results afterwards if it does exist, you can wrap that part of the query in an optional step.
For example :
g.V('3').optional(out())
Will either return V['3'] or the adjacent vertices if out yields results.
In cases where you want to select a value that may not exist, you can do something like this:
coalesce(select('a'),constant('No results'))
EDITED to add:
If you need to return multiple results, rather than just using select try a project('a','b,',c') type of approach where each by modulator for the project can contain its own coalesce step.

RedisGraph: Merge nodes and move all the reletionships?

In RedisGraph using Cypher/python is there a way to
Merge two nodes and move all the relationships from the old node to the new node ?
I suspect there are no pure Cypher solution... in that case what will be the equivalent atomic operations and how to combine them to achieve MERGE-nodes+rel
neo4j have apoc.refactor.mergeNodes(nodes, options), apoc.refactor.mergeRelationships(rels, options), but that doesn't help me !:( because I'm using RedisGraph.
the problem is that in RG I dont have lower level access to do enumeration/iteration to do this programmatically !!
this worked in one direction I have to apply -> the reverse <- second time.
MATCH (old)-[r:q]->(from_to)
WHERE old.val = $old
MATCH (new) WHERE new.val = $new
MERGE (new)-[nr2:q]->(from_to)
SET nr2.val = r.val
DELETE r
any way to combine it in single query ?
I think this can be accomplished in pure Cypher:
MATCH (old {val: 'old'})-[e:E]->(old_to)
MERGE (new {val: 'new'})
CREATE (new)-[e2:E]->(old_to)
SET e2.prop1 = e.prop1, [...]
DELETE e
The chief annoyance here is that all of the edge properties (and node properties, if those are also to be migrated) must be set explicitly, as RedisGraph does not currently support setting property maps.

Iterate over nodes in neo4j

I have a graph with 2137 different paths, not connected to each other. The nodes in each path are numbered by a property, "Isnad", where Isnad = 1,2,3....2137. I want to return the lengths of all 2137 paths.
match(n:Narrator)
where n.Isnad = "1"
return count(n)
Now this works for me but I want to iterate for "Isnad" so I get the count for all 2137 paths instead of just one. Any suggestions?
In pseudocode, it would be like:
for(int i = 1 to 2137)
match(n:Narrator)
where n.Isnad = i
return count(n)
Use range and UNWIND:
unwind range(1, 2137) AS i
match p=(n:Narrator)-[*]->(m)
where toInteger(n.Isnad) = i
and not ()-[]->(n)
and not (m)-[]->()
return n.Isnad, length(p)
limit 10
Note that we are looking for the longest paths. This is expressed in the WHERE condition: the first node of the path should not have a preceding node (not ()-[]->(n)), while the last node of the path should not have a succeeding node (not (m)-[]->()) -- see this answer.
I included the limit 10 line so that the computation and returning the result will not freeze the server or the web user interface. If the query seems okay, remove limit 10 and see if it still works. When embedded to a client application, it can scale to more results as the rendering of the web UI will no longer be a bottleneck.

RNeo4j cypher - retrieving paths

I'm trying to extract a sub-graph from a global network (sub-networks of specific nodes to a specific depth).
The network is composed of nodes labeled as Account with a property of iban and relationships of TRANSFER_TO_AGG.
The cypher syntax is as followed:
MATCH (a:Account { iban :'FR7618206004274157697300156' }),(b:Account),
p = allShortestPaths((a)-[:TRANSFER_TO_AGG*..3]-(b))
RETURN p limit 250
This works perfectly on the Neo4J web interface. However, when trying to save the results to an R object using the command cypher I get the following error:
"Error in as.data.frame.list(value, row.names = rlabs) :
supplied 92 row names for 1 rows"
I believe this is due to the fact that if returning data, you can only query for tabular results. That is, this method has no current functionality for Cypher results containing array properties, collections, nodes, or relationships.
Can anyone offer a solution ?
I've recently added functionality for returning pathways as R objects. First, uninstall / reinstall RNeo4j. Then, see:
?getSinglePath
?getPaths
?shortestPath
?allShortestPaths
?nodes
?rels
?startNode
?endNode
For your query specifically, you would use getPaths():
library(RNeo4j)
graph = startGraph("http://localhost:7474/db/data/")
query = "
MATCH (a:Account { iban :'FR7618206004274157697300156' }),(b:Account),
p = allShortestPaths((a)-[:TRANSFER_TO_AGG*..3]-(b))
RETURN p limit 250
"
p = getPaths(graph, query)
p is a list of path objects. See the docs for examples of using the apply family of functions with a list of path objects.

Resources