Cypher: storing a count() value for further use

I want to create a relationship between nodes that have one or more things in common, and I want to store the count of those common things as a property on the relationship.
For example: in the movie tutorial graph, I want to create a relationship between actors that have acted in the same movie(s) together, and set the count of the movies they played in together as a property on the relationship.
For the basic counting, the tutorial provides a query:
MATCH (n)-[:ACTED_IN]->(m)<-[:ACTED_IN]-(coActors)
RETURN n.name, coActors.name, count(*) AS Strength ORDER BY Strength DESC
This gives me a list of name pairs and the number of times they played in movies together (e.g. "Keanu Reeves", "Carrie-Anne Moss", Strength: 3, since the three Matrix movies are in the graph).
Now I want to create a relationship (ACTED_WITH) between these pairs and set the strength-value as a property inside it.
I can create a relationship like this:
MATCH (a)-[:ACTED_IN]->(p)<-[:ACTED_IN]-(b) MERGE (a)-[r:ACTED_WITH]->(b)
MERGE ensures that only one relationship is created, but I just can't get the counting to work together with the creation.

I'm not sure I understand what you want, but maybe something like this:
MATCH (a)-[:ACTED_IN]->(m)<-[:ACTED_IN]-(b)
WITH a, b, collect(m) AS movies ORDER BY size(movies) DESC
MERGE (a)-[r:ACTED_WITH]-(b)
ON CREATE SET r.Strength = size(movies)

You can use the SET clause to set properties on a matched node or relationship. If you try to set the property inside the MERGE pattern instead, MERGE will treat the key(s) as part of the unique identity, and will create a new relationship if none exists with that specific value yet.
MATCH (a)-[:ACTED_IN]->(p)<-[:ACTED_IN]-(b)
MERGE (a)-[r:ACTED_WITH]->(b)
// reduce the matched rows to one row per (a, b) pair
WITH DISTINCT a, b, r, COUNT(p) as strength, COLLECT(p) as movies
// set r
SET r.strength = strength
// Return everything to verify above results
RETURN *
SET will overwrite any previous value. If you only want to set the property when the relationship is created, you can use ON CREATE SET or ON MATCH SET.
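For example, here is a minimal sketch that aggregates first and then keeps the value current on repeated runs (names are taken from the question; the lowercase strength property is an assumption):
MATCH (a)-[:ACTED_IN]->(m)<-[:ACTED_IN]-(b)
WITH a, b, count(m) AS strength
// undirected MERGE, so each pair gets exactly one ACTED_WITH relationship
MERGE (a)-[r:ACTED_WITH]-(b)
ON CREATE SET r.strength = strength
ON MATCH SET r.strength = strength
Because the strength is computed per pair before the MERGE, the property is correct whether the relationship is being created for the first time or updated.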

Related

How to properly use MATCH inside UNWIND for a Nebula query

I’m currently working with the Nebula graph database for the first time and I’m running into some issues with a query. In terms of the schema, I have “Person” nodes, which have a “name” property, as well as Location nodes also with a name property. These node types can be connected by a relationship edge, called HAS_LIVED (to signify whether a person has lived in a certain location). Now for the query, I have a list of names (strings). The query looks like:
UNWIND ["Anna", "Emma", "Zach"] AS n
MATCH (p:Person {name: n})-[:HAS_LIVED]->(loc)
RETURN loc.Location.name
This should return a list of three places, i.e. ["London", "Paris", "Berlin"]. However, I am getting nothing as a result from the query. When I get rid of the UNWIND and write three separate MATCH queries with each name, it works individually. Not sure why.
Try this instead. It uses a WHERE clause:
UNWIND ["Anna", "Emma", "Zach"] AS n
MATCH (p:Person)-[:HAS_LIVED]->(loc)
WHERE p.name = n
RETURN loc.Location.name

Neo4j - how to include start node in my query?

I'm attempting to build a recommendation engine for a library system.
This is my db schema:
My starting point is a LoanerCard. The flow is then supposed to look like this: Get all copies -> get the material -> get all copies of the material (including the original) -> get LoanerCard from copy -> get all loaned copies -> return the material name of the copies + an aggregated count to indicate the strength of the recommendation.
My best attempt so far has resulted in this query:
MATCH (L:LoanerCard {Barcode:"10007"})-[:LOANED]->(myLoans)-[:COPY_OF]-(masterMaterial),
      (masterMaterial)<-[:COPY_OF]-(allCopies),
      (allCopies)<-[:LOANED]-(coLoaners),
      (coLoaners)-[r:LOANED]->(theirCopies),
      (theirCopies)-[:COPY_OF]-(materials)
RETURN materials.Title as Recommended, count(*) as Strength ORDER BY Strength DESC
My issue here is that when I traverse the graph it doesn't include the original copy or the LoanerCards adjacent to it, so essentially it only traverses the area circled in red and never reaches LoanerCards 10817 and 10558.
How can I design my query so it includes these?
A MATCH clause automatically filters out paths that would use the same relationship more than once. Therefore, in order to traverse the same relationships twice, you need to split your MATCH clause in two.
Try this:
MATCH (:LoanerCard {Barcode:"10007"})-[:LOANED]->()-[:COPY_OF]-(masterMaterial)
MATCH (masterMaterial)<-[:COPY_OF]-()<-[:LOANED]-()-[:LOANED]->()-[:COPY_OF]-(materials)
RETURN materials.Title as Recommended, count(*) as Strength ORDER BY Strength DESC

DynamoDB shows Item Count = 0, not being populated, and not working in AppSync query

I have added an index to my DynamoDB table in order to order the results, but it doesn't appear to be doing anything. In the DynamoDB dashboard it shows with 0 size and 0 item count.
There are several hundred items in the table and they all have an id (the primary key) and a created value. I didn't set a range property when I created the table. The items in the picture below are in the correct order, but the response via AppSync is not.
I have added the index to the query which returns all the items, and it does not seem to do anything; the order of the items is the same with or without the index:
"version" : "2017-02-28",
"operation" : "Scan",
"index" : "id-created-index",
"limit": $util.defaultIfNull(${ctx.args.limit}, 20),
"nextToken": $util.toJson($util.defaultIfNullOrBlank($ctx.args.nextToken, null))
What am I missing? Has the index not been built or is there something else I need to do to use it in a query?
Update:
The index now shows the correct item_count although it is still not ordering the results:
Your base table has a partition key of id and no sort key. By definition this means each item in your table has a unique id.
Your GSI has a partition key of id and a sort key of created. Data is sorted by the created attribute within each partition key. As each of your ids is unique, the sort key is basically not doing anything.
Scan operations against a table or index return the results in a random order. In order to have results sorted coming from DynamoDB, you'll need to run a Query operation, where the partition/hash key is fixed and results are sorted according to the sort key. However, since your table/GSI always has unique ids, there's never more than one record within a single partition (the id).
So yes, if you wanted results ordered by created, you'd need a fixed attribute on your table set as the partition key for your index. The caveat here is that all your records in the index would belong to a single partition, which would be a bottleneck. There are a few ways around this; one would be to see if there's a different access pattern where you can keep a different attribute fixed to query against (e.g. owner_id). If the number of records is low enough, filtering on the client side is probably the best option.
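For illustration, here is a sketch of the resolver rewritten as a Query instead of a Scan, assuming a hypothetical constant attribute type (the same value on every item) as the GSI partition key and created as its sort key:
{
    "version" : "2017-02-28",
    "operation" : "Query",
    "index" : "type-created-index",
    "query" : {
        "expression" : "#t = :t",
        "expressionNames" : { "#t" : "type" },
        "expressionValues" : { ":t" : $util.dynamodb.toDynamoDBJson("Item") }
    },
    "scanIndexForward" : false,
    "limit" : $util.defaultIfNull(${ctx.args.limit}, 20),
    "nextToken" : $util.toJson($util.defaultIfNullOrBlank($ctx.args.nextToken, null))
}
Setting "scanIndexForward" to false returns the newest created values first.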

Do I need a sort key or should I use AWS DAX

I have a dilemma, and I know I should have used an SQL DB from the beginning.
I am unsure if I can use a sort key for my particular use case. I have a table that contains multiple attributes: brand, model ref, reference... What I am trying to do is let the user select the brand, then the model, then the reference, etc., and then get all products that match those criteria and return the mean of the prices of those items.
Now, doing a scan operation of the whole DB, which has 300K+ items, is not very cost-effective to say the least, but this is the situation I am in.
My question is how can I most cost effectively do what I want to do?
Let the table T have only a partition key: ID.
For the sake of simplicity, say you let your client choose n = 3 attributes: brand, model-ref, reference.
Now, define a Global Secondary Index (GSI) with partition key brand_model-ref_reference and sort key ID. I suggest using Projection: ALL.
Thus, when your client has chosen its 3 values a, b, c, all you have to do is query the GSI with brand_model-ref_reference = "a#b#c". You will efficiently fetch all and only the items you need to compute your average. The size of the table no longer matters.
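A minimal boto3 sketch of the write and read sides of this pattern; the table name T comes from the answer, while the GSI name and the price attribute are illustrative assumptions:
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("T")

# On write, store the concatenated criteria as a single attribute
# so the GSI can use it as its partition key.
table.put_item(Item={
    "ID": "42",
    "brand": "a", "model-ref": "b", "reference": "c",
    "brand_model-ref_reference": "a#b#c",
    "price": 100,  # illustrative attribute
})

# On read, query the GSI: DynamoDB fetches only the matching items.
resp = table.query(
    IndexName="brand_model-ref_reference-index",  # assumed GSI name
    KeyConditionExpression=Key("brand_model-ref_reference").eq("a#b#c"),
)
prices = [item["price"] for item in resp["Items"]]
mean_price = sum(prices) / len(prices) if prices else None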
Notes:
With this solution you have to fix the number of criteria in advance, and the client must choose a value for all of them. Not so nice.
If your constraints are more varied than this, the solution no longer applies as-is. Use it as a hint. :)

How can I implement a junction index in DynamoDB?

Given two DynamoDB tables: Books and Words, how can I create an index that associates the two? Specifically, I'd like to query to get all Books that contain a certain Word, and query to get all Words that appear in a specific Book.
The objective is to avoid scanning an entire table for these queries.
Based on your question I can't tell if you only care about unique words or if you want every word including duplicates. I'll assume unique words.
This can be done with a single table and a Global Secondary Index.
Create a table called BookWords with a Hash key of bookId and a Sort key of word. If you Query this table with a bookId you will get all of the unique words in that book.
Create a Global Secondary Index with a Hash key of word and a Sort key of bookId. If you Query this index with a word you will get all of the bookIds of books that contain that word.
Depending on your use case, you will probably want to normalize the words. For example, is "Word" the same as "word"?
If you want all words, not just unique words, you can use a similar approach with a few small changes. Let me know.
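A short boto3 sketch of both lookups, assuming the table and GSI described above (the index name word-bookId-index is an assumption):
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")
book_words = dynamodb.Table("BookWords")

# All unique words in one book: query the base table on its hash key.
words_in_book = book_words.query(
    KeyConditionExpression=Key("bookId").eq("book-42")
)["Items"]

# All books containing one word: query the GSI on its hash key.
books_with_word = book_words.query(
    IndexName="word-bookId-index",  # assumed GSI name
    KeyConditionExpression=Key("word").eq("whale"),
)["Items"]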
