Storing Partition Keys on CosmosDB - azure-cosmosdb

The concept is very straightforward and not difficult to understand, but I can't find any practical examples... I have this JSON:
{
  "agendaId": "688c99bc-c756-4b8f-9c39-60246c0ea66e",
  "postId": "f07f16e5-e7d6-441e-bc1d-82ac671e0f63",
  "name": "john doe"
}
And I defined this partition key: /agendaId/postId
And then I inserted the item.
But I can't see any data in my CosmosDB Container Browser in the partition key column - I can only see the autogenerated id... is this expected?

/agendaId/postId means that the value is nested. The JSON to match that definition should be:
{
  "agendaId": {
    "postId": "f07f16e5-e7d6-441e-bc1d-82ac671e0f63"
  },
  "name": "john doe"
}
The / in the definition marks a level of nesting, so /attr1/attr2/attr3 would mean that the value is in attr3, which exists in attr2, which exists in attr1:
{
  "attr1": {
    "attr2": {
      "attr3": "the value"
    }
  }
}
In your case, both attributes are at the first level, so you could use, for example, /agendaId or /postId. Either would pick up the value of that attribute in your document.
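For illustration, here is a minimal Node.js sketch using the @azure/cosmos SDK, assuming a hypothetical database mydb and container agendas. The container is created with /agendaId (a first-level attribute) as its partition key, so after inserting the document the agendaId value should appear in the partition key column:

const { CosmosClient } = require("@azure/cosmos");

async function main() {
  const client = new CosmosClient(process.env.COSMOS_CONNECTION_STRING);
  const { database } = await client.databases.createIfNotExists({ id: "mydb" });

  // The partition key path points at a first-level attribute of the documents.
  const { container } = await database.containers.createIfNotExists({
    id: "agendas",
    partitionKey: { paths: ["/agendaId"] }
  });

  // agendaId sits at the first level, so it matches the /agendaId path.
  await container.items.create({
    agendaId: "688c99bc-c756-4b8f-9c39-60246c0ea66e",
    postId: "f07f16e5-e7d6-441e-bc1d-82ac671e0f63",
    name: "john doe"
  });
}

main().catch(console.error);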
Reference: https://learn.microsoft.com/azure/cosmos-db/partitioning-overview#choose-partitionkey

Related

ConditionExpression for PutItem not evaluating to false

I am trying to guarantee uniqueness in my DynamoDB table, across the partition key and other attributes (but not the sort key). Something is wrong with my ConditionExpression, because it is evaluating to true and the same values are getting inserted, leading to data duplication.
Here is my table design:
email: partition key (String)
id: sort key (Number)
firstName (String)
lastName (String)
Note: The id (sort key) holds a randomly generated unique number. I know... this looks like bad design, but that is the use case I have to support.
Here is the Node.js code with PutItem:
const AWS = require('aws-sdk')

const dynamodb = new AWS.DynamoDB({apiVersion: '2012-08-10'})
const params = {
  TableName: <table-name>,
  Item: {
    "email": { "S": "<email>" },
    "id": { "N": "<someUniqueRandomNumber>" },
    "firstName": { "S": "<firstName>" },
    "lastName": { "S": "<lastName>" }
  },
  ConditionExpression: "attribute_not_exists(email) AND attribute_not_exists(firstName) AND attribute_not_exists(lastName)"
}

dynamodb.putItem(params, function(err, data) {
  if (err) {
    console.error("Put failed")
  } else {
    console.log("Put succeeded")
  }
})
The documentation https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.OperatorsAndFunctions.html says the following:
attribute_not_exists (path)
True if the attribute specified by path does not exist in the item.
Example: Check whether an item has a Manufacturer attribute.
attribute_not_exists (Manufacturer)
It specifically says "item", not "items" or "any item", so I think it really means that it checks only the item being overwritten - the item with the same primary key as the one being put. As you have a random sort key, each put addresses a new item, so the condition will always be true.
Any implementation that checked a non-indexed attribute across all records would require a scan of every item, and that would not perform well.
Here is an interesting article which covers how to deal with unique attributes in DynamoDB: https://advancedweb.hu/how-to-properly-implement-unique-constraints-in-dynamodb/ - single-table design together with transactions would be a possible solution for you, if you can allow the additional partition keys in your table. Any other solution may be challenging under your current schema. DynamoDB has its own way of doing things, and it may be frustrating to try to push it to do things it is not designed for.
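To sketch the transaction idea from that article: write the real item and a "marker" item that reserves the unique attribute combination in one transaction, with a condition on the marker. This is a minimal sketch, not the article's exact code, and the marker key format is a hypothetical choice:

const params = {
  TransactItems: [
    {
      // The real item, keyed as before.
      Put: {
        TableName: <table-name>,
        Item: {
          "email": { "S": "<email>" },
          "id": { "N": "<someUniqueRandomNumber>" },
          "firstName": { "S": "<firstName>" },
          "lastName": { "S": "<lastName>" }
        }
      }
    },
    {
      // A marker item that reserves the (email, firstName, lastName) combination.
      Put: {
        TableName: <table-name>,
        Item: {
          "email": { "S": "<email>#<firstName>#<lastName>" },
          "id": { "N": "0" }
        },
        ConditionExpression: "attribute_not_exists(email)"
      }
    }
  ]
}

// If the marker already exists, the whole transaction is cancelled
// and neither item is written.
dynamodb.transactWriteItems(params, function(err, data) {
  if (err) console.error("Transaction failed", err)
  else console.log("Put succeeded")
})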

AppSync query resolver: are expressionNames and expressionValues necessary?

https://docs.aws.amazon.com/appsync/latest/devguide/resolver-mapping-template-reference-dynamodb.html#aws-appsync-resolver-mapping-template-reference-dynamodb-query
The AppSync docs say that expressionNames and expressionValues are optional fields, but they are always populated by code generation. First question: should they be included as a best practice when working with DynamoDB? If so, why?
AppSync resolver for a query on the partition key:
{
  "version": "2017-02-28",
  "operation": "Query",
  "query": {
    "expression": "#partitionKey = :partitionKey",
    "expressionNames": {
      "#partitionKey": "partitionKey"
    },
    "expressionValues": {
      ":partitionKey": {
        "S": "${ctx.args.partitionKey}"
      }
    }
  }
}
Second question: what exactly is the layman's translation of the expression field in the code above? What exactly is that statement telling DynamoDB to do? What is the use of the # in "expression": "#partitionKey = :partitionKey", and are the expression names and values just formatting safeguards?
Let me answer your second question first:
expressionNames
expressionNames are used for interpolation. What this means is that after interpolation, this filter expression object:
"expression": "#partitionKey = :value",
"expressionNames": {
"#partitionKey": "id"
}
will be transformed to:
"expression": "id = :value",
The #partitionKey acts as a placeholder for your column name id; '#' happens to be the delimiter.
But why?
expressionNames are necessary because certain keywords are reserved by DynamoDB, meaning you can't use these words inside a DynamoDB expression.
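For example, name is on DynamoDB's reserved words list, so an expression written directly against a name attribute would be rejected. With a placeholder it works (a minimal sketch):
"expression": "#name = :value",
"expressionNames": {
  "#name": "name"
}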
expressionValues
When you need to compare anything in a DynamoDB expression, you will also need to use a placeholder instead of the actual value, because the DynamoDB typed value is a complex object.
In the following example:
"expression": "myKey = :partitionKey",
"expressionValues": {
":partitionKey": {
"S": "123"
}
}
:partitionKey is the placeholder for the complex value
{
  "S": "123"
}
':' is the delimiter that tells DynamoDB to look the placeholder up in the expressionValues map.
Why are expressionNames and expressionValues always used by code generation?
It is just simpler for the code generation logic to always use expressionNames and expressionValues, because then there is no need to maintain two code paths for reserved and non-reserved DynamoDB words. Using expressionNames will always prevent collisions!

$elemMatch support (or alternative)

I need to do a find operation on an array of objects in order to determine a permission - basically, define a permission based on whether an object exists in an array of objects defined as an attribute on the subject.
In MongoDB I would have used $elemMatch. Is it supported, or do you recommend an alternative?
Example:
department = {
  name: "name",
  employees: [
    {
      Name: "John Doe",
      title: "assistant"
    },
    {
      Name: "Jane Doe",
      title: "Manager"
    }
  ]
};
I need to define an ability to only allow someone with the name "Jane Doe" and title "Manager" to update the name of the department.
Please do not focus on the very awful data model here; it's just an example of what I am trying to achieve. The main focus is basing an ability on the existence of an object within a field that is an array of objects.
Fantastic library btw!
$elemMatch is supported. You can use it.
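For illustration, assuming the library in question is CASL (the "ability" terminology suggests it) and a hypothetical Department subject type, a sketch could look like this:

import { defineAbility, subject } from '@casl/ability';

// Allow updating a department's name only when its employees array
// contains an object matching both Name and title.
const ability = defineAbility((can) => {
  can('update', 'Department', ['name'], {
    employees: { $elemMatch: { Name: 'Jane Doe', title: 'Manager' } }
  });
});

// department is the object from the example above.
ability.can('update', subject('Department', department)); // true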

How do I make a Hasura data API query to fetch rows based on the length of their array relationship's value?

Referring to the default sample schema mentioned in https://hasura.io/hub/project/hasura/hello-world/data-apis, i.e. the following two tables:
1) author: id, name
2) article: id, title, content, rating, author_id
where article:author_id has an array relationship to author:id.
How do I make a query to select authors who have written at least one article? Basically, something like select author where len(author.articles) > 0
TL;DR:
There's no length function you can use in the Hasura data API syntax right now. Workarounds: 1) filter on a property that is guaranteed to be true for every row, like id > 0, or 2) build a view and expose data APIs on the view.
Option 1:
Use an 'always true' attribute as a filter.
{
  "type": "select",
  "args": {
    "table": "author",
    "columns": ["*"],
    "where": {
      "articles": {
        "id": {
          "$gt": "0"
        }
      }
    }
  }
}
This reads as: select all authors where ANY article has id > 0
This works because id is an auto-incrementing int.
Option 2:
Create a view and then expose data APIs on it.
Head to the Run SQL window in the API console and run a migration:
CREATE VIEW author_article_count AS (
  SELECT au.*, ar.no_articles
  FROM
    author au,
    (SELECT author_id, COUNT(*) no_articles FROM article GROUP BY author_id) ar
  WHERE
    au.id = ar.author_id
);
Make sure you mark this as a migration (a checkbox below the Run SQL window) so that this gets added to your migrations folder.
Now add data APIs to the view, by hitting "Track table" on the API console's schema page.
Now you can make select queries using no_articles as the length attribute:
{
  "type": "select",
  "args": {
    "table": "author_article_count",
    "columns": ["*"],
    "where": {
      "no_articles": {
        "$gt": "0"
      }
    }
  }
}

Validate multi-column property in Firebase Database Rules

I have a multi-column index property filter_active in this structure:
"books": {
"435085rfddsfiou4r80": {
"name": "Harry Potter 1"
}
}
"review": {
"540398fsdo9043": {
"filter_active": "true|435085rfddsfiou4r80|false"
"active": true,
"archived": false,
"book_id": "435085rfddsfiou4r80"
"review": "good book"
}
}
Now I want to use security rules to validate the filter_active property. I need to check that the book exists in the books node, and the book id in the filter must be equal to the book_id in the review object.
There is no such thing as a split method in Firebase Database Rules, and I tried to create a dynamic regex, but I believe that is not possible either.
Is there any way I can fix this problem?
If the unwanted text in filter_active is always a small set of known words or characters, it can be eliminated by repeated use of replace(). For example, to eliminate true, false, and |:
newData.child('filter_active').val().replace('true','')
.replace('false','').replace('|','')
You can then check for the existence of a book with the resulting key:
root.child('books').child(newData.child('filter_active').val()
.replace('true','').replace('false','').replace('|','')).exists()
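Putting it together, a sketch of the full validation rule could look like this, assuming the rule lives under review/$reviewId and the structure shown in the question (the replace() chain is the same trick as above):

{
  "rules": {
    "review": {
      "$reviewId": {
        "filter_active": {
          // The book id extracted from the filter must exist under /books
          // and must equal this review's book_id.
          ".validate": "root.child('books').child(newData.val().replace('true','').replace('false','').replace('|','')).exists() && newData.val().replace('true','').replace('false','').replace('|','') === newData.parent().child('book_id').val()"
        }
      }
    }
  }
}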
