Order neo4j cypher query by node depth - graph

I have the following graph:
I want to return all users which have the CAN_DISTRIBUTE Credits permission attached through a Role which APPLIES_ON a Group.
The following query returns both sara and admin as user names:
MATCH (users)-[:IS]->()<-[:CHILD_OF*0..]-(roles)-[:CAN_DISTRIBUTE]->(asset:Asset{name:"Credits"}),
(roles)-[:APPLIES_ON]->(group:Group{name:"Digital"})
WITH DISTINCT users
RETURN collect(users.name)
Now, I'm having a really hard time ordering the returned users by their Role relationship depth. I want sara to be returned first, as the Manager role is a child of SuperManager.
In English, it's like saying: give me all the users who can distribute credits on group X, ordered by their role hierarchy.
Do you guys have any ideas?
Here is the query to create this graph:
CREATE (admin:User{name:"admin"})
CREATE (sara:User{name:"sara"})
CREATE (c:Asset{name:"Credits"})
CREATE (marketing:Group{name:"Marketing"})
CREATE (digital:Group{name:"Digital"})
CREATE (super_manager:Role{name:"SuperManager"})
CREATE (manager:Role{name:"Manager"})
CREATE (manager)-[:CAN_DISTRIBUTE]->(c)
CREATE (admin)-[:IS]->(super_manager)
CREATE (sara)-[:IS]->(manager)
CREATE (super_manager)-[:APPLIES_ON]->(marketing)
CREATE (super_manager)-[:APPLIES_ON]->(digital)
CREATE (manager)-[:APPLIES_ON]->(marketing)
CREATE (manager)-[:APPLIES_ON]->(digital)
CREATE (manager)-[:CHILD_OF]->(super_manager)

You can do it using the length of the entire path, this way:
MATCH p = (users)-[:IS]->()<-[:CHILD_OF*0..]-(roles)-[:CAN_DISTRIBUTE]->(asset:Asset{name:"Credits"}),
(roles)-[:APPLIES_ON]->(group:Group{name:"Digital"})
WITH DISTINCT users, length(p) as pathLength
RETURN users.name
ORDER BY pathLength
The output for the given data set:
╒════════════╕
│"users.name"│
╞════════════╡
│"sara" │
├────────────┤
│"admin" │
└────────────┘
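To see why ordering by path length puts sara first, here is a minimal Python sketch that reproduces the same ordering in memory. It is not the Cypher engine: the dictionaries are transcribed from the CREATE statements in the question, the function name users_by_role_depth is made up for illustration, and it assumes each user has exactly one role.

```python
from collections import deque

# Sample data transcribed from the CREATE statements in the question.
IS = {"admin": "SuperManager", "sara": "Manager"}   # user -> role (one role per user assumed)
CHILD_OF = {"Manager": "SuperManager"}              # child role -> parent role
CAN_DISTRIBUTE = {"Manager": {"Credits"}}
APPLIES_ON = {
    "SuperManager": {"Marketing", "Digital"},
    "Manager": {"Marketing", "Digital"},
}


def users_by_role_depth(asset, group):
    """Return user names ordered by CHILD_OF hops to a role granting `asset` on `group`."""
    # Invert CHILD_OF so we can walk from a role down to its descendants.
    children = {}
    for child, parent in CHILD_OF.items():
        children.setdefault(parent, []).append(child)

    depths = {}
    for user, role in IS.items():
        queue = deque([(role, 0)])
        seen = {role}
        while queue:
            current, depth = queue.popleft()
            grants = asset in CAN_DISTRIBUTE.get(current, set())
            applies = group in APPLIES_ON.get(current, set())
            if grants and applies:
                depths[user] = depth  # breadth-first, so this is the shortest depth
                break
            for child in children.get(current, []):
                if child not in seen:
                    seen.add(child)
                    queue.append((child, depth + 1))
    return sorted(depths, key=depths.get)


print(users_by_role_depth("Credits", "Digital"))  # ['sara', 'admin']
```

sara reaches the Manager role with zero CHILD_OF hops, while admin needs one hop down from SuperManager, which matches the shorter path length in the query above.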

Related

Cypher Query - Excluding certain relationships

I am querying my graph where it has the following nodes:
Customer
Account
Fund
Stock
With the following relationships:
HAS (a customer HAS an account)
PURCHASED (an account PURCHASES a fund or stock)
HOLDS (a fund HOLDS a stock)
The query I am trying to achieve is returning all Customers that have accounts that hold Microsoft through a fund. The following is my query:
MATCH (c:Customer)-[h:HAS]->(a:Account)-[p:PURCHASED]-(f:Fund)-[holds:HOLDS]->(s:Stock {ticker: 'MSFT'})
WHERE exists((f)-[:HOLDS]->(s:Stock))
AND exists ((f:Fund)-[holds]->(s:Stock))
AND NOT exists((a:Account {account_type: 'Individual'})-[p:PURCHASED]->(s:Stock))
RETURN *
This almost gets me the desired results but I keep getting 2 relationships out of the Microsoft stock that is tied to an Individual account where I do not want those included.
Any help would be greatly appreciated!
There are duplications in your query. Lines 2 and 3 are the same, and line 2 is a subgraph of line 1. You are also using the variables a, p and s more than once, in line 1 and line 4. The query below is not tested, but give it a try. Please tell me whether it works for you.
MATCH (c:Customer)-[h:HAS]->(a:Account)-[p:PURCHASED]-(f:Fund)-[holds:HOLDS]->(s:Stock {ticker: 'MSFT'})
WHERE NOT exists((:Account{account_type: 'Individual'})-[:PURCHASED]->(:Stock))
RETURN *
It seems to me that you should just uncheck the "Connect result nodes" option in the Neo4j Browser.

First get list of users from table-1, then compare with current user by specific field from table-2

What is the procedure to do if I have 2 tables: From table-1 I get all the users I want, and then, after I have the list of user ids, I want to compare each user with current user with a field found on table-2. The first task is easy, I just have an onDataChange that populates the users list with their ids. But now that I have this list, how to iterate each user, and compare it with the current user based on a specific field from table-2.
What I currently try is to use a for loop to iterate each user on the list with each having onDataChange call to table-2, and then I populate the necessary dataset. But when this for loop ends, this dataset is no longer visible.
I hope what I try to achieve in this post is understandable.
I'll try to demonstrate with tables:
Assuming I get user list from table-1 based on data1:
table-1
|
|_____data1
|____uid20
|____uid30
|____uid44
Now I have list of 3 users: uid20, uid30, uid44. Then, I need to compare the list of users, with current user, call it user1, from table-1, based on another field (timestamp). What I mean is, after I have list of users, I want to filter these to have a timestamp that's close to the current user, for up to certain amount of time. So in my example, I want to have only users that are within 2 minutes of the current user timestamp.
table-2
|
|______uid1
| |____timestamp: <some_timestamp>
|
|______uid20
| |____timestamp: <some_timestamp>
|
|______uid30
| |____timestamp: <some_timestamp>
|
|______uid44
|____timestamp: <some_timestamp>
But every time there is something that's out of scope of the new listener, and also it looks like it's not the right procedure. Maybe I first need to save what's found on table-1 locally ? Or, it can be done somehow purely with Firebase calls?
This is some code:
Getting the current user, is easy:
final FirebaseUser user = FirebaseAuth.getInstance().getCurrentUser();
final String userId = user.getUid();
So I always have it visible at any scope
First, as I understand it, you can use startAt and endAt to get the range of values within two minutes of your timestamp. In other words, instead of fetching every value from your table-2, you query only the values that match your use case: those within 2 minutes of the current timestamp.
For example, in your table 2, I would query like this:
ref.orderByChild("timestamp").startAt(yourCurrentTimeStamp).endAt(yourCurrentTimeStamp + 120000);
where 120000 is 2 minutes in milliseconds
Then, when looping through these elements, I would use getKey to get the key of each value returned by this query. That gives only the users within 2 minutes of difference, which you can then compare with the list from your first for loop to see if they match.
To compare two user IDs you can use equals, since each ID is a String:
if(snapshotUserTable1.getKey().equals(snapshotUserTable2.getKey())){
// ...
}
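The filtering step can also be sketched outside Firebase. The following Python snippet is only an illustration of the comparison logic, not Firebase API code: the function name users_within_window and the timestamp values are hypothetical (the question elides the real ones). Unlike the startAt/endAt query above, which only covers timestamps after the current one, this sketch assumes "within 2 minutes" means in either direction.

```python
WINDOW_MS = 2 * 60 * 1000  # 120000 ms = 2 minutes


def users_within_window(candidate_ids, timestamps, current_uid, window_ms=WINDOW_MS):
    """Keep candidates whose timestamp is within window_ms of the current user's."""
    current_ts = timestamps[current_uid]
    return [
        uid
        for uid in candidate_ids
        # Skip candidates with no entry in table-2; compare in both directions.
        if uid in timestamps and abs(timestamps[uid] - current_ts) <= window_ms
    ]


# Hypothetical timestamps in milliseconds, standing in for <some_timestamp>.
table_1_ids = ["uid20", "uid30", "uid44"]
table_2 = {"uid1": 1_000_000, "uid20": 1_060_000, "uid30": 1_500_000, "uid44": 940_000}

print(users_within_window(table_1_ids, table_2, "uid1"))  # ['uid20', 'uid44']
```

Collecting the ids from table-1 first and applying the filter once all timestamps are available avoids relying on variables that are only visible inside a nested listener.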

Neo4j cypher to return all nodes of children

I'm just starting out with Neo4j and I struggle a bit with the following case. Given the following graph:
As "theo", I want to return the list of other user names who can also manage glossaries. Being a member of a parent group should give you access to the same permissions as your children.
For example, for "theo", we should return sara and bob, as sara is a member of PoleManager, which is a parent of the ProjectManager group. Bob is a member of the ProjectManager group, which has permission to manage glossaries.
So far I have the following query but it does not return sara as a candidate:
MATCH (me:User{name:"theo"})-[:MEMBER_OF]->(g:Group),
(g)-[:CAN_MANAGE]->(Asset{name:"Glossaries"}),
(users:User)-[:MEMBER_OF]->(g)
RETURN me.name AS Me, collect(users.name) AS Users
UNION MATCH (me:User{name:"theo"})-[:MEMBER_OF]->(Group)<-[:CHILD_OF*]-(children:Group),
(children)-[:CAN_MANAGE]->(Asset{name:"Glossaries"}),
(users:User)-[:MEMBER_OF]->(children)
RETURN me.name AS Me, collect(users.name) AS Users
I'm also open to better ideas to represent this graph.
The latter half of the query is almost right:
MATCH p = (me:User{name:$Me})-[:MEMBER_OF]->()<-[:CHILD_OF*0..]-()-[:CAN_MANAGE]->(:Asset{name:"Glossaries"})
UNWIND nodes(p)[1..-1] as group
MATCH (users:User)-[:MEMBER_OF]->(group)
RETURN $Me, collect(users.name) AS Users
If you're starting from the :User named "theo", then it should be enough to parameterize the name input, and return the name parameter in the RETURN instead of accessing it on the node, as above.
I'm also wondering if the match from Theo is really necessary, since it seems like all you want are users that are members of groups that can manage glossaries. If so, you can remove the Theo part from your query:
MATCH (users)-[:MEMBER_OF]->(group)<-[:CHILD_OF*0..]-()-[:CAN_MANAGE]->(:Asset{name:"Glossaries"})
RETURN $name, collect(users.name) AS Users
For this one, I removed the labels for users and group, but you can only do this if your graph structure is well-defined enough that the relationships present here can only connect to :User and :Group nodes. If other types of nodes can also be connected by these relationships, you'll need to add the :User and :Group labels back on.

Neo4j - How to build proper multi-dimensional query for my graph

I have a simple social-networking like graph w/ users, friends, comments, likes etc. Users can "own" items, comment on "items", like "items". I am trying to write a cypher query that returns "items" along w/ extra information to display them in my stream.
I have tried using optional match and collect and stuff, but there is always some part of the result that doesn't work.
Specifically, for a given user(say user1), I want to return "items" that:
a specific user + his friends own
show number of likes,
also show number of comments,
Know if the item is already owned by me (so I can hide "own" button in the UI)
If the item is owned by friends, I want to show name, image of up to 2 friends (but not more than 2 friends if, say, 5 friends own that item)
You can copy-paste below to get the graph
// 3 users
CREATE (u1:USER{name:"USER1", image: "image1"})
CREATE (u2:USER{name:"USER2", image: "image2"})
CREATE (u3:USER{name:"USER3", image: "image3"})
//3 items
CREATE (i1:ITEM{name:"ITEM1"})
CREATE (i2:ITEM{name:"ITEM2"})
CREATE (i3:ITEM{name:"ITEM3"})
// OWNERSHIP ..
//user1 owns 2 items
CREATE (u1)-[:OWNS]->(i1)
CREATE (u1)-[:OWNS]->(i2)
// user2 owns i2 and i3
CREATE (u2)-[:OWNS]->(i2)
CREATE (u2)-[:OWNS]->(i3)
// user3 also owns i2 and i3 (so i2 is owned by u1, u2 and u3; and i3 is owned by u2 and u3)
CREATE (u3)-[:OWNS]->(i2)
CREATE (u3)-[:OWNS]->(i3)
// FRIENDSHIP..
// user1 is friend of both user2 and user3
CREATE (u1)-[:FRIEND_OF]->(u2)
CREATE (u1)-[:FRIEND_OF]->(u3)
// COMMENTS ..
//user1 has commented on all those items he owns
CREATE (u1i1:COMMENT{text:"user1 comment on item1"})
CREATE (u1)-[:COMMENTED]->(u1i1)-[:COMMENT_FOR]->(i1)
CREATE (u1i2:COMMENT{text:"user1 comment on item2"})
CREATE (u1)-[:COMMENTED]->(u1i2)-[:COMMENT_FOR]->(i2)
//user 2 has also commented on all those he owns
CREATE (u2i2:COMMENT{text:"user2 comment on item2"})
CREATE (u2)-[:COMMENTED]->(u2i2)-[:COMMENT_FOR]->(i2)
CREATE (u2i3:COMMENT{text:"user2 comment on item3"})
CREATE (u2)-[:COMMENTED]->(u2i3)-[:COMMENT_FOR]->(i3)
// LIKES ..
//user1 has liked user2's and user3's items
CREATE (u1)-[:LIKED]->(i2)
CREATE (u1)-[:LIKED]->(i3)
//user2 has liked user1's items
CREATE (u2)-[:LIKED]->(i1)
Let's build your query up step by step:
Specifically, for a given user(say user1), I want to return "items" that:
a specific user + his friends own
MATCH (u:USER {name: "USER1"})-[:FRIEND_OF*0..1]-(friend:USER)-[:OWNS]-(i:ITEM)
WITH u,i,
// Know if the item is already owned by me (so I can hide "own" button in the UI)
sum(size((u)-[:OWNS]->(i))) > 0 as user_owns,
// If the item is owned by friends, I want to show name, image of up to 2 friends
collect({name:friend.name, image:friend.image})[0..2] as friends
RETURN u,i, user_owns, friends,
// show number of likes,
sum(size(()-[:LIKED]->(i))) as like,
// also show number of comments,
sum(size(()-[:COMMENT_FOR]->(i))) as comments
Actually because it is such a good question, I sat down and created a GraphGist documenting each step here.
Fairly easy. First you need a variable path length match from 0..1 on FRIEND_OF, returning either yourself or your friends. Then follow to all items owned by those users.
Use OPTIONAL MATCH for likes and comments since there might or might not exist any.
Since there are potentially multiple paths to a single item, you need to count the distinct likes and comments.
To check if you already own the item, check the endpoint of the variable path match from above if its name is yours.
For getting up to two images of the friends owning the item filter the list for your friends and return the image property. Last step is to slice the collection for the first two elements using subscript operator.
MATCH (:USER { name:'USER1' })-[:FRIEND_OF*0..1]->(me_or_friend)-[:OWNS]->(item)
OPTIONAL MATCH (item)<-[l:LIKED]-()
OPTIONAL MATCH (item)<-[c:COMMENT_FOR]-()
WITH item, count(distinct l) AS likes, count(distinct c) AS comments,
collect(DISTINCT me_or_friend) AS me_or_friends
RETURN item, likes, comments,
ANY (x IN me_or_friends WHERE x.name='USER1') AS i_already_own,
[x IN me_or_friends WHERE x.name<>'USER1' | x.image][0..2] as friendImages
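As a cross-check of what this query should return for the sample graph, here is a small Python sketch that performs the same aggregation in memory. It is not Cypher: the lists are transcribed from the CREATE statements in the question, and the function name stream_for is made up for illustration.

```python
# Sample data transcribed from the CREATE statements in the question.
USERS = {"USER1": "image1", "USER2": "image2", "USER3": "image3"}  # name -> image
OWNS = [("USER1", "ITEM1"), ("USER1", "ITEM2"), ("USER2", "ITEM2"),
        ("USER2", "ITEM3"), ("USER3", "ITEM2"), ("USER3", "ITEM3")]
FRIEND_OF = [("USER1", "USER2"), ("USER1", "USER3")]
LIKED = [("USER1", "ITEM2"), ("USER1", "ITEM3"), ("USER2", "ITEM1")]
COMMENT_FOR = [("user1 comment on item1", "ITEM1"), ("user1 comment on item2", "ITEM2"),
               ("user2 comment on item2", "ITEM2"), ("user2 comment on item3", "ITEM3")]


def stream_for(me, max_friends=2):
    """Build one row per item owned by `me` or by a direct friend of `me`."""
    # Me plus direct friends, mirroring the FRIEND_OF*0..1 hop.
    me_or_friends = [me] + [b for a, b in FRIEND_OF if a == me]

    owners_by_item = {}
    for owner, item in OWNS:
        if owner in me_or_friends:
            owners_by_item.setdefault(item, []).append(owner)

    rows = []
    for item in sorted(owners_by_item):
        owners = owners_by_item[item]
        rows.append({
            "item": item,
            "likes": sum(1 for _, liked in LIKED if liked == item),
            "comments": sum(1 for _, target in COMMENT_FOR if target == item),
            "i_already_own": me in owners,
            # Images of friends who own the item, capped at max_friends.
            "friendImages": [USERS[o] for o in owners if o != me][:max_friends],
        })
    return rows


for row in stream_for("USER1"):
    print(row)
```

For USER1 this yields three rows: ITEM1 (already owned, no friend images), ITEM2 (already owned, two comments) and ITEM3 (not owned by USER1), each with one like.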
Final comment:
On SO we appreciate it if you show in your question what you've already tried yourself to solve the problem. Questions like "solve that problem for me" are not that welcome.

Missing values in google.com:analytics-bigquery:LondonCycleHelmet.ga_sessions_20130910

I am working with the practice repository in preparation for doing upcoming work with a large enterprise client using BQ. The repository link is: google.com:analytics-bigquery:LondonCycleHelmet.ga_sessions_20130910
I have 3 questions to ask in relation to the sample repository and a query that was run (please see the bottom of this post for the query that motivated the questions):
1) What is the difference between customDimensions.index, customDimensions.value and hits.customDimensions.index, hits.customDimensions.value?
2) If a single hit has multiple custom dimensions/metrics how is that returned/queried? I only see single dimensions matching at the hit level in the sample data.
3) There are no custom metric values passed in the example data, what will those values look like?
Here is the query that motivated the previous 3 questions:
SELECT hits.page.pagePath AS urls,
hits.time,
customDimensions.index,
customDimensions.value,
hits.customMetrics.index,
hits.customMetrics.value,
trafficSource.medium,
hits.customVariables.index,
hits.customVariables.customVarName,
hits.customVariables.customVarValue
FROM [google.com:analytics-bigquery:LondonCycleHelmet.ga_sessions_20130910]
Every record in that table represents one Google Analytics Session. Big Query has this concept of nested fields and that's how individual hits are defined. They are nested into the hits record.
Answering your questions:
1) customDimensions.index and customDimensions.value are the index and value for user or session scoped custom dimensions. hits.customDimensions.index and hits.customDimensions.value are custom dimensions set at hit scope level. The scope is defined when you create the custom dimension through the GA interface. Indexes are integers from 1 to 20 (as defined in the Admin section) and value is the string passed as the value for that custom dimension. More info about Custom Dimensions/Metrics
2) Both hits and hits.customDimensions are REPEATED RECORDs in BigQuery. So in essence every row in that BQ table looks like this:
|- date
|- (....)
+- hits
|- time
+- customDimensions
|- index
|- value
But when you query the data, it should be flattened by default. Because it is flattened, a single hit with multiple custom dimensions and metrics should show up as multiple rows, one for each.
3) They should look the same as customDimensions, but the values are INTEGERs instead of STRINGs.
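To make the flattening in answer 2 concrete, here is a minimal Python sketch of the idea. It is not BigQuery code: the session record and its values are hypothetical, the function name flatten_hit_dimensions is made up, and hits without any custom dimensions are not modeled.

```python
def flatten_hit_dimensions(session):
    """Yield one flat row per (hit, custom dimension) pair in a nested session."""
    for hit in session["hits"]:
        for dim in hit["customDimensions"]:
            yield {
                "date": session["date"],
                "hits.time": hit["time"],
                "hits.customDimensions.index": dim["index"],
                "hits.customDimensions.value": dim["value"],
            }


# Hypothetical session: the first hit carries two custom dimensions, the second one.
session = {
    "date": "20130910",
    "hits": [
        {"time": 0, "customDimensions": [{"index": 1, "value": "a"},
                                         {"index": 2, "value": "b"}]},
        {"time": 1500, "customDimensions": [{"index": 1, "value": "c"}]},
    ],
}

rows = list(flatten_hit_dimensions(session))
print(len(rows))  # 3
```

The hit with two custom dimensions contributes two rows that share the same hits.time, which is the "multiple rows, one for each" behaviour described above.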
For a simpler and more educational dataset I suggest that you create a brand new BQ table and load the data provided on this developer document page.
PS: Tell my good friends at Cardinal Path that Eduardo said Hello!
