NEO4J CYPHER Query to Group By a Sorted Query - collections

I am having a hard time coming up with the right Cypher query for a use case:
I have User and Hobby as nodes. Comment is a relationship on a User's Hobbies. A User can have many Hobbies, and other Users can comment on a User's Hobbies.
Problem: Show the Hobbies and Comments for a User, sorted by the most recent comments.
The query I am trying to perform is:
MATCH (u:user)-[r:comment]->(n:Hobby)
WHERE has(r.`comment`) and u.id = 'Test'
RETURN
DISTINCT n.id,n.name, n.description, COLLECT(r.id) as `comment`
ORDER BY
r.creationdate DESC
Apparently, r.creationdate can't be used to sort within a grouped result. Is there an alternative?

Try imposing order before aggregation. How about
MATCH (u:user {id:'Test'})-[r:comment]->(n:Hobby)
WHERE has(r.comment)
WITH n, r
ORDER BY r.creationdate DESC
RETURN n.id, n.name, n.description, COLLECT(r.id) as comment
(Sidenote: I'd recommend 'idiomatic' or at least consistent style for your queries, i.e. type labels like you would your name ((:User)) and relationship types as if you were shouting on the internet ([:COMMENT]). This makes no difference for validity or performance, but it's just nice and it helps avoid silly mistakes to follow even arbitrary conventions.)

You don't need DISTINCT in your case; aggregation already makes the result distinct.
r doesn't exist anymore at the point where you want to sort by it.
1. Either you don't collect, then you can sort by it,
2. or you sort it upfront with WITH,
3. or you sort by min(r.creationdate) or max(r.creationdate).
for number 2:
MATCH (u:user)-[r:comment]->(n:Hobby)
WHERE has(r.`comment`) and u.id = 'Test'
WITH
n, r.id as rid
ORDER BY
r.creationdate DESC
RETURN n.id,n.name, n.description, collect(rid) as comment
for number 3:
MATCH (u:user)-[r:comment]->(n:Hobby)
WHERE has(r.`comment`) and u.id = 'Test'
RETURN
n.id,n.name, n.description, COLLECT(r.id) as `comment`, max(r.creationdate)
ORDER BY
max(r.creationdate) DESC
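To make options 2 and 3 concrete, here is a minimal plain-Python sketch of the same idea: sort the individual rows first, collect per hobby, then order the groups by their most recent comment. It is only an analogy for the Cypher queries above, not Neo4j code; the rows, hobby names and numeric creation dates are made up for illustration.

```python
from collections import defaultdict

# Hypothetical (hobby, comment id, creationdate) rows, standing in for the
# matches that the MATCH clause would produce.
rows = [
    ("chess", "c1", 10),
    ("hiking", "c2", 30),
    ("chess", "c3", 20),
    ("hiking", "c4", 5),
]

def comments_by_hobby(rows):
    # Step 1: sort the individual rows first (the WITH ... ORDER BY step).
    ordered = sorted(rows, key=lambda row: row[2], reverse=True)
    # Step 2: collect per hobby; each list keeps the order from step 1.
    grouped = defaultdict(list)
    latest = {}
    for hobby, comment_id, created in ordered:
        grouped[hobby].append(comment_id)
        latest[hobby] = max(latest.get(hobby, created), created)
    # Step 3: order the groups by their most recent comment (the max() step).
    return sorted(grouped.items(), key=lambda item: latest[item[0]], reverse=True)

result = comments_by_hobby(rows)
print(result)
```

Sorting before collecting fixes the order inside each list; ordering by the maximum date fixes the order of the groups themselves.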

Related

How do I return a subset of sessions that meet specific conditions?

I guess everything is in the title. How do I return a subset of sessions that meet specific conditions? Or, to ask my question differently: how can I return sessions that meet specific conditions without un-nesting them?
So for example, return all the hits (nested) from sessions during which a purchase occurred?
Is this possible? Does it make sense? It probably has to do with STRUCT or ARRAY, but I still don't really understand this.
Without specific code to go on, a general query pattern for this type of issue might look like the following:
with selected_sessions as (
select distinct session_id
from dataset.sessions
left join unnest(hits) h
where h.event = 'purchase' -- insert your own logic here
)
select *
from dataset.sessions
inner join selected_sessions using(session_id)
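The two steps of that pattern (pick the qualifying session ids by looking inside the nested hits, then return the original rows with the nesting intact) can be sketched in plain Python. This is an illustration of the logic only, not BigQuery code; the session ids and hit records are invented sample data.

```python
# Hypothetical sessions, each with a nested list of hits.
sessions = [
    {"session_id": "s1", "hits": [{"event": "view"}, {"event": "purchase"}]},
    {"session_id": "s2", "hits": [{"event": "view"}]},
    {"session_id": "s3", "hits": []},
]

def sessions_with_event(sessions, event):
    # Step 1: find qualifying session ids by inspecting the nested hits
    # (the role of the selected_sessions CTE).
    selected = {
        s["session_id"]
        for s in sessions
        if any(h["event"] == event for h in s["hits"])
    }
    # Step 2: return the original rows untouched, hits still nested
    # (the role of the final inner join).
    return [s for s in sessions if s["session_id"] in selected]

purchased = sessions_with_event(sessions, "purchase")
print(purchased)
```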

Can't navigate into a specific column in SQLite

The first link shows the problems.
I am very inexperienced in SQLite and needed some help.
Thanks in advance!
https://imgur.com/v7BdVe3
These are all the tables displayed open
https://imgur.com/a/VMOxAuc
https://imgur.com/a/LrDcCBQ
The schema
https://imgur.com/a/bv5KTHN
This is as far as I could get, which is close, but I couldn't figure out how to also sort it by marina 1.
SELECT BOAT_NAME, OWNER.OWNER_NUM, LAST_NAME, FIRST_NAME from OWNER inner join MARINA_SLIP on OWNER.OWNER_NUM = MARINA_SLIP.OWNER_NUM;
If you know anything else about the other questions, feel free to help me with those too. Thanks!
I believe that you want
SELECT BOAT_NAME, OWNER.OWNER_NUM, LAST_NAME, FIRST_NAME
FROM OWNER INNER JOIN MARINA_SLIP ON OWNER.OWNER_NUM = MARINA_SLIP.OWNER_NUM
WHERE MARINA_NUM = 1
ORDER BY BOAT_NAME;
The second question involves multiple joins.
The third question asks you to use the count(*) function. This is an aggregate function: it returns the number of rows in each group defined by the GROUP BY clause (with no GROUP BY clause there is just one group, i.e. all resultant rows).
The fourth question progresses a little further asking you to extend the GROUP BY clause with the HAVING clause (see link above for GROUP BY).
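As a rough illustration of count(*) with GROUP BY and HAVING, here is a small sketch using Python's built-in sqlite3 module. The actual assignment questions and schema are only visible in the linked images, so everything here is an assumption: the table layout is guessed from the column names in the query above, SLIP_ID is a made-up key, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and sample data, for illustration only.
conn.executescript("""
CREATE TABLE OWNER (OWNER_NUM TEXT PRIMARY KEY, LAST_NAME TEXT, FIRST_NAME TEXT);
CREATE TABLE MARINA_SLIP (SLIP_ID INTEGER PRIMARY KEY, MARINA_NUM INTEGER,
                          BOAT_NAME TEXT, OWNER_NUM TEXT);
INSERT INTO OWNER VALUES ('A1', 'Adams', 'Ann'), ('B2', 'Brown', 'Bill');
INSERT INTO MARINA_SLIP (MARINA_NUM, BOAT_NAME, OWNER_NUM) VALUES
    (1, 'Anchor', 'A1'), (1, 'Breeze', 'A1'), (2, 'Coral', 'B2');
""")

# count(*) returns one number per group defined by GROUP BY.
boats_per_owner = conn.execute("""
    SELECT OWNER.OWNER_NUM, LAST_NAME, COUNT(*) AS boats
    FROM OWNER INNER JOIN MARINA_SLIP ON OWNER.OWNER_NUM = MARINA_SLIP.OWNER_NUM
    GROUP BY OWNER.OWNER_NUM
    ORDER BY OWNER.OWNER_NUM
""").fetchall()

# HAVING filters whole groups after aggregation.
multi_boat_owners = conn.execute("""
    SELECT OWNER.OWNER_NUM, COUNT(*) AS boats
    FROM OWNER INNER JOIN MARINA_SLIP ON OWNER.OWNER_NUM = MARINA_SLIP.OWNER_NUM
    GROUP BY OWNER.OWNER_NUM
    HAVING COUNT(*) > 1
""").fetchall()

print(boats_per_owner)
print(multi_boat_owners)
```

WHERE filters rows before grouping, while HAVING filters the groups afterwards, which is why a condition on count(*) has to go in HAVING.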

Efficient insertion of row and foreign table row if it does not exist

Similar to this question and this solution for PostgreSQL (in particular "INSERT missing FK rows at the same time"):
Suppose I am making an address book with a "Groups" table and a "Contact" table. When I create a new Contact, I may want to place them into a Group at the same time. So I could do:
INSERT INTO Contact VALUES (
'Bob',
(SELECT group_id FROM Groups WHERE name = 'Friends')
)
But what if the "Friends" Group doesn't exist yet? Can we insert this new Group efficiently?
The obvious thing is to do a SELECT to test if the Group exists already; if not do an INSERT. Then do an INSERT into Contacts with the sub-SELECT above.
Or I can constrain Group.name to be UNIQUE, do an INSERT OR IGNORE, then INSERT into Contacts with the sub-SELECT.
I can also keep my own cache of which Groups exist, but that seems like I'm duplicating functionality of the database in the first place.
My guess is that there is no way to do this in one query, since INSERT does not return anything and cannot be used in a subquery. Is that intuition correct? What is the best practice here?
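For reference, the UNIQUE plus INSERT OR IGNORE variant described in the question could look roughly like the sketch below, using Python's built-in sqlite3 module. The column layout and the add_contact helper are assumptions made for illustration; it is two statements in one transaction, not a single query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema based on the names used in the question.
conn.executescript("""
CREATE TABLE Groups (group_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE Contact (name TEXT, group_id INTEGER REFERENCES Groups(group_id));
""")

def add_contact(conn, contact_name, group_name):
    # Hypothetical helper: both statements run in one transaction.
    with conn:
        # Does nothing if the group already exists (name is UNIQUE).
        conn.execute("INSERT OR IGNORE INTO Groups (name) VALUES (?)", (group_name,))
        # The sub-SELECT now always finds a matching group.
        conn.execute(
            "INSERT INTO Contact VALUES (?, (SELECT group_id FROM Groups WHERE name = ?))",
            (contact_name, group_name),
        )

add_contact(conn, "Bob", "Friends")
add_contact(conn, "Alice", "Friends")

group_count = conn.execute("SELECT COUNT(*) FROM Groups").fetchone()[0]
members = conn.execute("""
    SELECT Contact.name FROM Contact JOIN Groups USING (group_id)
    WHERE Groups.name = 'Friends' ORDER BY Contact.name
""").fetchall()
print(group_count, members)
```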
"My guess is that there is no way to do this in one query, since INSERT does not return anything and cannot be used in a subquery. Is that intuition correct?"
You could use a trigger and a small modification of the tables, and then you could do it with a single query.
For example, consider the following.
Purely for convenience of producing the demo:-
DROP TRIGGER IF EXISTS add_group_if_not_exists;
DROP TABLE IF EXISTS contact;
DROP TABLE IF EXISTS groups;
One-time setup SQL :-
CREATE TABLE IF NOT EXISTS groups (id INTEGER PRIMARY KEY, group_name TEXT UNIQUE);
INSERT INTO groups VALUES(-1,'NOTASSIGNED');
CREATE TABLE IF NOT EXISTS contact (id INTEGER PRIMARY KEY, contact TEXT, group_to_use TEXT, group_reference INTEGER DEFAULT -1 REFERENCES groups(id));
CREATE TRIGGER IF NOT EXISTS add_group_if_not_exists
AFTER INSERT ON contact
BEGIN
INSERT OR IGNORE INTO groups (group_name) VALUES(new.group_to_use);
UPDATE contact SET group_reference = (SELECT id FROM groups WHERE group_name = new.group_to_use), group_to_use = NULL WHERE id = new.id;
END;
SQL that would be used on an ongoing basis :-
INSERT INTO contact (contact,group_to_use) VALUES
('Fred','Friends'),
('Mary','Family'),
('Ivan','Enemies'),
('Sue','Work colleagues'),
('Arthur','Fellow Rulers'),
('Amy','Work colleagues'),
('Henry','Fellow Rulers'),
('Canute','Fellow Ruler')
;
The number of values and the actual values would vary.
SQL Just for demonstration of the result
SELECT * FROM groups;
SELECT contact,group_name FROM contact JOIN groups ON group_reference = groups.id;
Results
This results in :-
1) The groups (noting that the group "NOTASSIGNED", is intrinsic to the working of the above and hence added initially) :-
You have to be careful regarding mistakes like 'Fellow Ruler' instead of 'Fellow Rulers'; each spelling creates its own group.
-1 is used because it would not be a normal automatically generated value.
2) The contacts with the respective group :-
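The setup and the single INSERT above can be tried end to end with Python's built-in sqlite3 module. This is a sketch that reuses the SQL from this answer against an in-memory database; the only assumption is that group_reference is declared INTEGER here so that it matches groups.id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE groups (id INTEGER PRIMARY KEY, group_name TEXT UNIQUE);
INSERT INTO groups VALUES (-1, 'NOTASSIGNED');
CREATE TABLE contact (
    id INTEGER PRIMARY KEY,
    contact TEXT,
    group_to_use TEXT,
    -- declared INTEGER in this sketch (an assumption) to match groups.id
    group_reference INTEGER DEFAULT -1 REFERENCES groups(id)
);
CREATE TRIGGER add_group_if_not_exists
AFTER INSERT ON contact
BEGIN
    -- create the group if it is new, then point the contact at it
    INSERT OR IGNORE INTO groups (group_name) VALUES (new.group_to_use);
    UPDATE contact
    SET group_reference = (SELECT id FROM groups WHERE group_name = new.group_to_use),
        group_to_use = NULL
    WHERE id = new.id;
END;
INSERT INTO contact (contact, group_to_use) VALUES
    ('Fred', 'Friends'), ('Mary', 'Family'), ('Ivan', 'Enemies'),
    ('Sue', 'Work colleagues'), ('Arthur', 'Fellow Rulers'),
    ('Amy', 'Work colleagues'), ('Henry', 'Fellow Rulers'),
    ('Canute', 'Fellow Ruler');
""")

group_names = [row[0] for row in conn.execute("SELECT group_name FROM groups ORDER BY id")]
contact_groups = dict(conn.execute(
    "SELECT contact, group_name FROM contact JOIN groups ON group_reference = groups.id"
))
leftover = conn.execute(
    "SELECT COUNT(*) FROM contact WHERE group_to_use IS NOT NULL"
).fetchone()[0]

print(group_names)
print(contact_groups)
```

Note that 'Fellow Ruler' and 'Fellow Rulers' end up as two separate groups, which is the typo risk mentioned above.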
Efficient insertion
That could likely be debated from here to eternity so I leave it for the fence sitters/destroyers to decide :). However, some considerations:-
It works and appears to do what is wanted.
It's a little wasteful due to the additional wasted column.
It tries to minimise the waste by clearing the column once the group has been resolved (the trigger above sets it to NULL, which some may find confusing; an empty string would be an alternative).
There will obviously be an overhead BUT in comparison to the alternatives probably negligible (perhaps important if you were extracting every Facebook user) but if it's user input driven likely irrelevant.
What is the best practice here?
Fences again. :)
Note: Hopefully obvious, but the DROP statements are purely for convenience, and all the other SQL up to the INSERT is run once to set up the tables and trigger in preparation for the single INSERT that adds a group if necessary.

Using Cosmos DB how do I query just on the partition key

We have a group of related documents all sharing the same partition key. The thinking is simply grouping these up should be a case of querying on the partition key and stitching them together. What am I missing?
So
Select * from c where c.CustomerId = "500"
Would return, say, 3 documents (Address, Sales and Invoices), which all have a property named CustomerId with a value of 500.
I appreciate it's not the primary key, and I am purposely omitting a row key.
Perhaps not splitting the documents is the answer, but the different documents have different TTLs, and that would then become problematic, wouldn't it?
CustomerId is the partition key.
The MS docs say this is possible (citing a city = seattle example, where their partition key is city).
So, what am I missing? A complete misunderstanding of querying in Cosmos? (I know a partition key is used to break up related data into partitions.) I didn't know this made it an unqueryable aspect.
Also I can query with partition key and rowkey no problem.
EDIT 2:
This works:
SELECT * FROM c WHERE c.CustomerId > "499" AND c.CustomerId < "501"
Ok,
So the range query working was a bit of a lead.
Custom indexing on the collection was causing issues.
At this moment, I have removed the custom indexing entirely and will build back up and then post a more specific answer.
What I did read was that the PartitionKey is implicitly indexed anyway. There was an index on this ALSO so maybe this was causing funnies.
Indexing Policies CosmosDB
Maybe I'm not getting it at all, but you have to be explicit about the type of the value you are looking for. I think these are not the same:
c.CustomerId = "500"
VS
c.CustomerId = 500
because one looks for text and the other for a number. Check how your data is stored; the query value has to be of the same type if you want to query by that value (bearing in mind that CustomerId is the partition key).
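The string-versus-number point can be illustrated with a few lines of plain Python. This is only a simulation of a type-strict equality filter, not the Cosmos DB SDK or its query engine; the documents and the where_customer_id helper are hypothetical.

```python
# Two hypothetical documents: one stores CustomerId as text, one as a number.
documents = [
    {"id": "address-1", "CustomerId": "500"},
    {"id": "sales-1", "CustomerId": 500},
]

def where_customer_id(docs, value):
    # Match only when both the type and the value are equal,
    # so the text "500" never matches the number 500.
    return [
        d["id"]
        for d in docs
        if type(d["CustomerId"]) is type(value) and d["CustomerId"] == value
    ]

print(where_customer_id(documents, "500"))
print(where_customer_id(documents, 500))
```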

SQLite: SELECT from grouped and ordered result

I'm new to SQL(ite), so I'm sorry if there is a simple answer that I was just too stupid to find the right search terms for.
I have 2 tables: one for user information and another holding the points a user achieved. It's a simple one-to-many relation (a user can achieve points multiple times).
table1 contains "userID" and "Username" ...
table2 contains "userID" and "Amount" ...
Now I want to get a highscore rank for a given username.
To get the highscore i did:
SELECT Username, SUM(Amount) AS total FROM table2 JOIN table1 USING (userID) GROUP BY Username ORDER BY total DESC
How could I select a single Username and get its position in the grouped and ordered result? I have no idea what a subselect would have to look like for my goal. Is it even possible in a single query?
You cannot calculate the position of the user without referencing the other data. At the time of writing, SQLite does not have a ranking function, which would be ideal for your use case, nor a row number feature that would serve as an acceptable substitute (window functions such as RANK() and ROW_NUMBER() were only added later, in SQLite 3.25).
I suppose the closest you could get would be to drop this data into a temp table that has an incrementing ID, but I think you'd get very messy there.
It's best to handle this within the application. Get all the users and calculate rank. Cache individual user results as necessary.
Without knowing anything more about the operating context of the app/DB it's hard to provide a more specific recommendation.
For a specific user, this query gets the total amount:
SELECT SUM(Amount)
FROM Table2
WHERE userID = ?
You have to count how many users have a total at least as high as that single user's; since this includes the user themselves, the count is the user's rank:
SELECT COUNT(*)
FROM table1
WHERE (SELECT SUM(Amount)
FROM Table2
WHERE userID = table1.userID)
>=
(SELECT SUM(Amount)
FROM Table2
WHERE userID = ?);
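A quick way to check the counting query is to run it against a small in-memory database with Python's built-in sqlite3 module. The table and column names follow the question; the sample users, the amounts and the rank_of helper are made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample data: alice totals 15, bob 30, carol 20.
conn.executescript("""
CREATE TABLE table1 (userID INTEGER PRIMARY KEY, Username TEXT);
CREATE TABLE table2 (userID INTEGER, Amount INTEGER);
INSERT INTO table1 VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO table2 VALUES (1, 10), (1, 5), (2, 30), (3, 20);
""")

# Count the users whose total is at least the given user's total.
RANK_SQL = """
SELECT COUNT(*)
FROM table1
WHERE (SELECT SUM(Amount) FROM table2 WHERE userID = table1.userID)
      >=
      (SELECT SUM(Amount) FROM table2 WHERE userID = ?)
"""

def rank_of(conn, user_id):
    # Hypothetical helper wrapping the query above.
    return conn.execute(RANK_SQL, (user_id,)).fetchone()[0]

for user_id in (1, 2, 3):
    print(user_id, rank_of(conn, user_id))
```

Users with no rows in table2 have a NULL total, so the comparison is NULL for them and they are not counted.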
