Cypher Order By Number of Paths - graph

Let's say I have a graph of movies and directors, where movies are connected to each other by co-viewership. I want to find similar directors, i.e. directors whose films tend to be watched together.
START n=node:index(Name="Steven Spielberg") MATCH n-->m--l<--o RETURN o;
This gets me all of the related directors, but how do I order them by the number of paths that connect them? Bonus points if I can also take weight of the tie between films into consideration.

count(*) is the number of paths that start at n and end at o:
START n=node:index(Name="Steven Spielberg")
MATCH n-->m--l<--o
RETURN o,count(*)
order by count(*) desc;
With weights on the relationships:
START n=node:index(Name="Steven Spielberg")
MATCH path=n-->m--l<--o
RETURN o,sum(reduce(sum=0,r in rels(path) : sum+r.weight)) as weight
ORDER BY weight desc;
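To make the grouping behaviour of the two queries above concrete, here is a minimal Python sketch of what they compute. The director names and weights are hypothetical toy data, and each path is reduced to its end node plus the weights of its relationships; it only models the aggregation, not Neo4j itself.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each path is (end_director, [relationship weights along the path]).
paths = [
    ("Lucas", [1, 3, 1]),
    ("Lucas", [1, 2, 1]),
    ("Scorsese", [1, 5, 1]),
]

# Counterpart of RETURN o, count(*): one count per distinct end node.
path_counts = Counter(director for director, _ in paths)

# Counterpart of sum(reduce(...)): add up every relationship weight on every path.
weight_totals = defaultdict(int)
for director, weights in paths:
    weight_totals[director] += sum(weights)

# Counterpart of ORDER BY ... DESC.
by_count = path_counts.most_common()
by_weight = sorted(weight_totals.items(), key=lambda kv: kv[1], reverse=True)
print(by_count)
print(by_weight)
```

Because the aggregation key is the end node, every additional path to the same director raises both its count and its summed weight.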

START n=node:index(Name="Steven Spielberg")
MATCH path=n-->m--l<--o
RETURN o
ORDER BY length(path);

Related

Neo4j expand path and return distinct nodes and relationships

I am given one or more start nodes (by ID) and I need to expand over one or more hops and return one result with an array of distinct nodes and an array of distinct relationships.
I can do this either via apoc.path.expand() or apoc.path.subgraphAll(), but either way it produces one row per expanded path, so the combined result may contain duplicate nodes. To reduce the multiple rows to one row, I have used collect() with apoc.coll.toSet() and apoc.coll.flatten() to remove duplicates from the node and relationship arrays:
apoc.path.subgraphAll:
MATCH (n) WHERE id(n) IN $ids
CALL apoc.path.subgraphAll(n, { minLevel: 1, maxLevel: 2 }) YIELD nodes, relationships
WITH collect(nodes) as nodes, collect(relationships) as relationships
RETURN apoc.coll.toSet(apoc.coll.flatten(nodes)) as nodes, apoc.coll.toSet(apoc.coll.flatten(relationships)) as relationships
apoc.path.expand:
MATCH (n) WHERE id(n) IN $ids
CALL apoc.path.expand(n, null, null, 1, 2) YIELD path
WITH collect(nodes(path)) as nodes, collect(relationships(path)) as relationships
RETURN apoc.coll.toSet(apoc.coll.flatten(nodes)) as nodes, apoc.coll.toSet(apoc.coll.flatten(relationships)) as relationships
Is there another way to remove the duplicates from the two arrays or to query the nodes and relationships?
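For readers unfamiliar with the flatten-then-deduplicate step used in both queries, the following Python sketch models it on plain integer ids. The rows are hypothetical, and keeping first-seen order is a choice of this sketch rather than a statement about how the APOC functions order their output.

```python
from itertools import chain

# Hypothetical rows: one (nodes, relationships) pair per expanded path, using plain ids.
rows = [
    ([1, 2, 3], [10, 11]),
    ([1, 2, 4], [10, 12]),
    ([5, 2], [13]),
]

def flatten_distinct(lists):
    """Flatten an iterable of lists and drop duplicates, keeping first-seen order."""
    return list(dict.fromkeys(chain.from_iterable(lists)))

nodes = flatten_distinct(row_nodes for row_nodes, _ in rows)
relationships = flatten_distinct(row_rels for _, row_rels in rows)
print(nodes)
print(relationships)
```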

Get relationship node from 2 edges in 1 cypher query

I need a cypher query that retrieves the weight of two edges at the same time. This is my attempt:
MATCH (n:User)-[r:VIEWED|UPDATED]->(f:File) WHERE f.id IN ['some_id','another_id'] RETURN n, r.weight ORDER BY r.weight DESC
The result contains two lines for each user who updated and viewed the file. However, I want the result in one line. For example: user: x - updated: 12 - viewed:15
How can I do this?
Updated: try
MATCH (f:File)
// filter the files up front; a WHERE attached to an OPTIONAL MATCH would not restrict f
WHERE f.id IN ['some_id','another_id']
OPTIONAL MATCH (n:User)-[r1:VIEWED]->(f)
OPTIONAL MATCH (n)-[r2:UPDATED]->(f)
RETURN n,
sum(r1.weight) AS totalViewedWeight,
sum(r2.weight) AS totalUpdatedWeight
Alternatively, collect both relationship types into one row per user:
MATCH (n:User)-[r:VIEWED|UPDATED]->(f:File)
WHERE f.id IN ['some_id','another_id']
RETURN n, collect(type(r)), collect(r.weight)
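The underlying idea in both answers is a pivot: group the per-relationship rows by user and aggregate the weights by relationship type. A small Python sketch of that pivot, using hypothetical rows shaped like the output of the original attempt:

```python
from collections import defaultdict

# Hypothetical rows as the original query would return them: (user, relationship type, weight).
rows = [
    ("x", "UPDATED", 12),
    ("x", "VIEWED", 15),
    ("y", "VIEWED", 4),
]

# Group by user and sum the weights per relationship type.
per_user = defaultdict(lambda: {"VIEWED": 0, "UPDATED": 0})
for user, rel_type, weight in rows:
    per_user[user][rel_type] += weight

for user, totals in per_user.items():
    print(f"user: {user} - updated: {totals['UPDATED']} - viewed: {totals['VIEWED']}")
```

A user with only one of the two relationship types still gets a single line, with 0 for the missing type.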

Queries to be tested in Neo4j

Why can't I get any results with these queries? What am I doing wrong here?
QUERY 1
MATCH (person:Person)-[:PRS_knows_PRS*1..2]-(friend:Person),
(friend)<-[:CMT_hasCreator_PRS]-(friendPost:Post)-[:PST_hasTag_TAG]->(knownTag:Tag {nameTag:2})
WHERE not(person=friend)
MATCH (friendPost)-[:PST_hasTag_TAG]->(commonTag:Tag)
WHERE not(commonTag=knownTag)
WITH DISTINCT commonTag, knownTag, friend
MATCH (commonTag)<-[:PST_hasTag_TAG]-(commonPost:Post)-[:PST_hasTag_TAG]->(knownTag)
WHERE (commonPost)-[:CMT_hasCreator_PRS]->(friend)
RETURN
commonTag.nameTag AS tagName,
count(commonPost) AS postCount
ORDER BY postCount DESC, tagName ASC
LIMIT 3
QUERY 2
MATCH (person:Person)-[:PRS_knows_PRS*1..2]-(friend:Person)
WHERE not(person=friend)
WITH DISTINCT friend
MATCH (friend)-[worksAt:PRS_worksAt_ORG]->(company:Organisation)-[:ORG_isLocatedIn_PLC]->(:Country {name:{3}})
WHERE worksAt.workFromPWAO < {2}
RETURN
friend.idPerson AS friendId,
friend.firstNamePerson AS friendFirstName,
friend.lastNamePerson AS friendLastName,
worksAt.workFromPWAO AS workFromYear,
company.nameOrganisation AS companyName
ORDER BY workFromYear ASC, friendId ASC, companyName DESC
LIMIT 4
I have these nodes, property keys and relationship types:
Could you please help me with this problem?
In your query you use this relationship type:
worksAt:PRS_worksAt_ORG
but in your screenshot of the properties the relationship type is called 'PRS_workAt_ORG', without the 's' in workAt.
Maybe that is what is wrong with the second query?
Usually it helps to build up the queries step by step.
Then you see where they stop returning data.
A PROFILE (prefix) of your queries should also help to see where it starts to return ZERO rows.
Could also be a case-typo in one of the labels and rel-types.

Neo4j find nodes with the same properties

I have a Neo4j graph where each node is a person. Each person has two properties: Name and City. And the relationships are: friend_of and love.
I'm trying to get the nodes that have friends ONLY in their same city, ordered by City first and then by Name. For example: A lives in Paris, B in Paris, C in Madrid and D in Madrid, with A-[friend_of]->B, B-[friend_of]->C, A-[friend_of]->C and D-[friend_of]->C. I only need to get A and D, because their friends live in their same city and only there.
I have tried the following:
MATCH (n)-[r:FRIEND_OF]-(n1) WHERE (n.City = n1.City) RETURN n,n1 ORDER BY n.City, n.Name
That gives me the nodes wanted, but some of them are wrong too (they have friends in other cities).
Thank you!
Based on the following Neo4j console, http://console.neo4j.org/r/5c2n8h, this query returns only D, as wanted:
MATCH (user:User)-[:FRIEND_OF]-(friend)
WITH user, collect(friend) AS friends
WHERE ALL (f IN friends WHERE f.City = user.City)
RETURN user
ORDER BY user.City, user.name
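The collect-then-ALL pattern in this query can be checked by hand with a short Python sketch. It uses the sample data from the question and treats friendships as undirected, as the MATCH pattern does; the variable names are only illustrative. With this data it yields D alone, which agrees with the result reported above.

```python
from collections import defaultdict

# Example data taken from the question; friendships are treated as undirected,
# as in the MATCH pattern of the query.
city = {"A": "Paris", "B": "Paris", "C": "Madrid", "D": "Madrid"}
friendships = [("A", "B"), ("B", "C"), ("A", "C"), ("D", "C")]

friends = defaultdict(set)
for a, b in friendships:
    friends[a].add(b)
    friends[b].add(a)

# Keep a person only if every friend lives in that person's city (the ALL predicate),
# then order by city and name.
same_city_only = sorted(
    (p for p in friends if all(city[f] == city[p] for f in friends[p])),
    key=lambda p: (city[p], p),
)
print(same_city_only)
```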
Try this:
MATCH (u:User {name:"carlos"})-[:FRIEND_OF]-(f:User)
WITH u, collect(f) as friends
WHERE ALL(f in friends WHERE f.City = u.City)
UNWIND friends as friend
RETURN friend
ORDER BY friend.City, friend.Name

Neo4j - querying N items per group

The following is my query:
MATCH (u:User{id:1})-[r:FOLLOWS]->(p:Publisher)<-[:PUBLISHED]-(i:Item)-[:TAGGED]->(t:Tag)<-[f:FOLLOWS]-(u)
RETURN i, count(t) ORDER BY count(t) DESC LIMIT 100
A User can follow a Publisher and a Tag. The query finds the items that the user may like by counting matching tags.
Suppose there are two properties, MIN and MAX, on the relationship u-r->p. These properties specify how many items the user wants to see from each publisher. How can I rewrite the query to allow this?
Here is one thought. Say for instance that the FOLLOWS relationship has a min value and a max value set. You could use the following query to limit the data that is returned by the query based on those values. I have not rewritten the entire query to include the tags and a limit there either.
// find the user and the publisher and the relationship
// which has the min/max parameters
match (u:User {id: 1})-[r:FOLLOWS]->(p:Publisher)
with u, p, r
// match the items that the publisher published
match p-[:PUBLISHED]-(i:Item)
// order them just because we can
with u, p, r, i
order by i.name
// collect the ordered items as the total list of items
with u, p, r, collect(i.name) as items
// make sure the collection is >= the minimum size of the list
// if so then return the items in the collection up to the max length
// otherwise return an empty collection
// you might want to do something else
with u, p, r, case
when length(items) >= r.min then items[..r.max]
else []
end as items
return u.name, p.name, r.min, r.max, items
The unfortunate thing about this is that you have already run the query to get the items and are just filtering them for display. It would be nice to know the person's preference beforehand so you could apply the max limit in the items query using LIMIT and a parameter. This would eliminate unnecessary database hits. Depending on the publisher there could be many, many items, and limiting them up front might be advantageous.
Here are a couple of variations to experiment with too. You could also do something like this...
// slight variation where the minimum is enforced with where instead of case
match (u:User {id: 1})-[r:FOLLOWS]->(p:Publisher)
with u, p, r
match p-[:PUBLISHED]-(i:Item)
with u, p, r, i
order by i.name
with u, p, r, collect(i.name) as items
where length(items) >= r.min
return u.name, p.name, items[..r.max]
or even this...
// only results actually between the min and max are returned
match (u:User {id: 1})-[r:FOLLOWS]->(p:Publisher)
with u, p, r
match p-[:PUBLISHED]-(i:Item)
with u, p, r, i
order by i.name
with u, p, r, collect(i.name) as items
where length(items) >= r.min
and length(items) <= r.max
return u.name, p.name, items[..r.max]
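As a cross-check of the first, CASE-based query, here is a minimal Python sketch of the same per-publisher logic: sort the items, return at most max of them, and return an empty list when fewer than min exist. The publisher names, item names and preference values are hypothetical.

```python
# Hypothetical data: items per publisher and the user's min/max preference
# stored on the FOLLOWS relationship.
published = {
    "pub_a": ["delta", "alpha", "charlie", "bravo"],
    "pub_b": ["zulu"],
}
follows = {"pub_a": {"min": 2, "max": 3}, "pub_b": {"min": 2, "max": 5}}

def items_for(publisher):
    """Return up to `max` items sorted by name, or [] if fewer than `min` exist."""
    prefs = follows[publisher]
    items = sorted(published[publisher])
    return items[: prefs["max"]] if len(items) >= prefs["min"] else []

result = {publisher: items_for(publisher) for publisher in follows}
print(result)
```

The WHERE-based variants differ only in that a publisher below the minimum is dropped from the result instead of appearing with an empty list.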

Resources