DynamoDB and get item sorted - amazon-dynamodb

I started using DynamoDB about 2 weeks ago, and now I need to get items based on a sort column. I've created my table with Token (PK) and CreateTimestamp (sort), but I've found no way of sorting on the CreateTimestamp column. I need this because I have to read the highest attribute that's defined on the last inserted row. How can I do this?
Thanks

I think you need to use a composite key; let's call its parts the Partition Key (PK) and the Sort Key (SK).
To get all the items in order, first put them in the same partition (same PK value), then build the SK from the value you want to sort by. For example:
If you want to sort by date, oldest to newest, you can use a KSUID:
{
"PK": {
"S": "PK"
},
"SK": {
"S": "272XuKFxQEa1VTj5dHEaKXWXBdc"
}
}
272XuKFxQEa1VTj5dHEaKXWXBdc was created from 2022-03-29 09:02:55.460868.
By default, a query returns the oldest item first; to get the newest first, set the ScanIndexForward parameter to false.
Otherwise, if you want to sort by an incrementing number (0, 1, 2, 3, etc.), you need to zero-pad the number as below, because string sort keys are compared as strings and "10" would otherwise sort before "2":
{
"PK": {
"S": "PK"
},
"SK": {
"S": "0000000"
}
},
{
"PK": {
"S": "PK"
},
"SK": {
"S": "0000001"
}
}
The query is the same as above.
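As a rough sketch of the above applied to the table from the question, the following builds the Query parameters that return only the newest item for one Token, plus a helper for zero-padded numeric sort keys. The table name Tokens, the function names and the padding width of 7 are assumptions for illustration. The object has the shape expected by the AWS SDK DocumentClient query call, which is not invoked here.

```javascript
// Build Query parameters that return the most recent item of one partition.
// Assumes a table with partition key "Token" and sort key "CreateTimestamp".
function buildLatestItemQuery(tableName, token) {
  return {
    TableName: tableName,
    // The key name is aliased in case it collides with a reserved word.
    KeyConditionExpression: "#pk = :token",
    ExpressionAttributeNames: { "#pk": "Token" },
    ExpressionAttributeValues: { ":token": token },
    ScanIndexForward: false, // descending sort key order: newest first
    Limit: 1, // only the last inserted item is needed
  };
}

// Zero-pad a non-negative integer so string order matches numeric order.
// The width of 7 is an arbitrary choice for this sketch.
function padSortKey(n, width = 7) {
  return String(n).padStart(width, "0");
}

const latestParams = buildLatestItemQuery("Tokens", "abc123");
console.log(JSON.stringify(latestParams, null, 2));
console.log(padSortKey(1), padSortKey(42));

// With the SDK loaded, the parameters would be passed on like this:
//   docClient.query(latestParams, (err, data) => {
//     // data.Items[0] is the item with the highest CreateTimestamp
//   });
```

Limit: 1 keeps the read small, since only the first item in descending order is of interest.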

Related

ConditionExpression for PutItem not evaluating to false

I am trying to guarantee uniqueness in my DynamoDB table, across the partition key and other attributes (but not the sort key). Something is wrong with my ConditionExpression, because it is evaluating to true and the same values are getting inserted, leading to data duplication.
Here is my table design:
email: partition key (String)
id: sort key (Number)
firstName (String)
lastName (String)
Note: The id (sort key) holds a randomly generated unique number. I know... this looks like a bad design, but that is the use case I have to support.
Here is the Node.js code with PutItem:
const dynamodb = new AWS.DynamoDB({apiVersion: '2012-08-10'})
const params = {
TableName: "<table-name>",
Item: {
"email": { "S": "<email>" },
"id": { "N": "<someUniqueRandomNumber>" },
"firstName": { "S": "<firstName>" },
"lastName": { "S": "<lastName>" }
},
ConditionExpression: "attribute_not_exists(email) AND attribute_not_exists(firstName) AND attribute_not_exists(lastName)"
}
dynamodb.putItem(params, function(err, data) {
if (err) {
console.error("Put failed")
}
else {
console.log("Put succeeded")
}
})
The documentation https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.OperatorsAndFunctions.html says the following:
attribute_not_exists (path)
True if the attribute specified by path does not exist in the item.
Example: Check whether an item has a Manufacturer attribute.
attribute_not_exists (Manufacturer)
It specifically says "item", not "items" or "any item", so I think it really means that it checks only the item being overwritten, that is, the item with the same primary key. As you have a random sort key, every put creates a new item and the condition is always true.
Any implementation that checked a non-indexed attribute against all records would have to scan all items, and that would not perform very well.
Here is an interesting article which covers how to deal with unique attributes in DynamoDB https://advancedweb.hu/how-to-properly-implement-unique-constraints-in-dynamodb/ - a single-table design together with transactions would be a possible solution for you if you can allow the additional partition keys in your table. Any other solution may be challenging under your current schema. DynamoDB has its own way of doing things, and it can be frustrating to push it to do things it was not designed for.
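A minimal sketch of the transaction idea, under stated assumptions: each write puts the user item and, in the same transaction, a marker item whose key is built from the values that must be unique. The marker table, its string partition key pk, the "#" separator and the function name are all hypothetical; only the parameter shape (TransactItems with Put entries and a ConditionExpression) follows the DynamoDB TransactWriteItems API.

```javascript
// Build a TransactWriteItems request that writes the user item together with
// a marker item for the (email, firstName, lastName) combination.
// "uniqueTable" and its partition key "pk" are hypothetical names.
function buildUniqueUserTransaction(userTable, uniqueTable, user) {
  // Assumes the values themselves never contain the "#" separator.
  const uniqueKey = [user.email, user.firstName, user.lastName].join("#");
  return {
    TransactItems: [
      {
        Put: {
          TableName: uniqueTable,
          Item: { pk: { S: uniqueKey } },
          // Fails when a marker with this key already exists,
          // which cancels the whole transaction.
          ConditionExpression: "attribute_not_exists(pk)",
        },
      },
      {
        Put: {
          TableName: userTable,
          Item: {
            email: { S: user.email },
            id: { N: String(user.id) },
            firstName: { S: user.firstName },
            lastName: { S: user.lastName },
          },
        },
      },
    ],
  };
}

const txParams = buildUniqueUserTransaction("users", "users-unique", {
  email: "a@example.com",
  id: 12345,
  firstName: "Ada",
  lastName: "Lovelace",
});
console.log(JSON.stringify(txParams, null, 2));

// With the SDK loaded:
//   dynamodb.transactWriteItems(txParams, (err, data) => { ... });
```

Because the marker's key is derived from the unique values, the condition is checked against exactly one item and no scan is involved.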

Add to list only if string doesn't already exist in DynamoDB table

I'm trying with the following code
{
ExpressionAttributeNames: {
"#items": "items"
},
ExpressionAttributeValues: {
":item": [slug]
},
Key: {
listId: listId,
userId: userData.userId,
},
UpdateExpression: "SET #items = list_append(#items,:item)",
ConditionExpression: "NOT contains (#items, :item)",
TableName: process.env.listsTableName,
}
but the item is still added even if string already exists in the list. What am I doing wrong?
The list structure is like so:
{
Item: {
userId: userData.userId,
listId: crypto.createHash('md5').update(Date.now() + userData.userId).digest('hex'),
listName: 'Wishlist',
items: [],
},
TableName: process.env.listsTableName,
};
Later Edit: I know I should use SS as it does the condition for me but SS doesn't work in my context because SS can't be empty.
As the documentation explains, the contains() function only works on a string value (checking for a substring) or a set value (checking for membership). But in your case, you don't have a set but rather a list, which is a different thing in DynamoDB.
If all the items which you want to add to this list are strings, and you don't want duplicates in the list anyway, the most efficient way would be to stop using a list and instead use the set-of-strings (a.k.a. SS) type. To add an item to the set (without duplicates), you would simply use "ADD #items :item" (no need for any additional condition - duplicates will not be added).
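A minimal sketch of that ADD update, assuming listId and userId are string keys and using the low-level attribute-value format (the function name is made up for illustration). ADD also creates the set when the attribute does not exist yet, so the list item could be created without an items attribute at all instead of with an empty one; it must not already hold a list, or the update fails with a type mismatch.

```javascript
// Build UpdateItem parameters that add one string to a string set.
// ADD creates the set if the attribute is missing and ignores duplicates.
function buildAddToSetParams(tableName, listId, userId, slug) {
  return {
    TableName: tableName,
    Key: {
      listId: { S: listId },
      userId: { S: userId },
    },
    UpdateExpression: "ADD #items :item",
    ExpressionAttributeNames: { "#items": "items" },
    // A one-element string set; DynamoDB merges it into the stored set.
    ExpressionAttributeValues: { ":item": { SS: [slug] } },
  };
}

const addParams = buildAddToSetParams("lists", "list-1", "user-1", "blue-kettle");
console.log(JSON.stringify(addParams, null, 2));

// With the SDK loaded:
//   dynamodb.updateItem(addParams, (err, data) => { ... });
```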

How do I make a Hasura data API query to fetch rows based on the length of their array relationship's value?

Referring to the default sample schema mentioned in https://hasura.io/hub/project/hasura/hello-world/data-apis i.e. to the following two tables:
1) author: id,name
2) article: id, title, content, rating, author_id
where article:author_id has an array relationship to author:id.
How do I make a query to select authors who have written at least one article? Basically, something like select author where len(author.articles) > 0
TL;DR:
There's no length function that you can use in the Hasura data API syntax right now. Workarounds: 1) filter on a property that is guaranteed to be true for every row, like id > 0; 2) build a view and expose APIs on your view.
Option 1:
Use an 'always true' attribute as a filter.
{
"type": "select",
"args": {
"table": "author",
"columns": [
"*"
],
"where": {
"articles": {
"id": {
"$gt": "0"
}
}
}
}
}
This reads as: select all authors where ANY article has id > 0
This works because id is an auto-incrementing int.
Option 2:
Create a view and then expose data APIs on it.
Head to the Run SQL window in the API console and run a migration:
CREATE VIEW author_article_count as (
SELECT au.*, ar.no_articles
FROM
author au,
(SELECT author_id, COUNT(*) no_articles FROM article GROUP BY author_id) ar
WHERE
au.id = ar.author_id)
Make sure you mark this as a migration (a checkbox below the RunSQL window) so that this gets added to your migrations folder.
Now add data APIs to the view, by hitting "Track table" on the API console's schema page.
Now you can make select queries using no_articles as the length attribute:
{
"type": "select",
"args": {
"table": "author_article_count",
"columns": [
"*"
],
"where": {
"no_articles": {
"$gt": "0"
}
}
}
}

How to update a nested object inside an array in DynamoDB

Consider the following document item / syntax in a DynamoDB table:
{
"id": "0f00b15e-83ee-4340-99ea-6cb890830d96",
"name": "region-1",
"controllers": [
{
"id": "93014cf0-bb05-4fbb-9466-d56ff51b1d22",
"routes": [
{
"direction": "N",
"cars": 0,
"sensors": [
{
"id": "e82c45a3-d356-41e4-977e-f7ec947aad46",
"light": true
},
{
"id": "78a6883e-1ced-4727-9c94-2154e0eb6139"
}
]
}
]
}
]
}
My goal is to update a single attribute in this JSON representation, in this case cars.
My approach
I know all the sensor IDs. So, the easiest way to reach that attribute is to find, in the array, the route which has a sensor with any of those IDs. Having found that sensor, DynamoDB should know which object in the routes array it has to update. However, I cannot run this code without my condition being rejected.
In this case, update attribute cars, where the route has a sensor with id e82c45a3-d356-41e4-977e-f7ec947aad46 or 78a6883e-1ced-4727-9c94-2154e0eb6139.
var params = {
TableName: table,
Key:{
"id": "0f00b15e-83ee-4340-99ea-6cb890830d96",
"name": "region-1"
},
UpdateExpression: "set controllers.intersections.routes.cars = :c",
ConditionExpression: "controllers.intersections.routes.sensors.id = :s",
ExpressionAttributeValues:{
":c": 1,
":s": "e82c45a3-d356-41e4-977e-f7ec947aad46"
},
ReturnValues:"UPDATED_NEW"
};
docClient.update(params, ...);
How can I achieve this?
Unfortunately, you can't achieve this in DynamoDB without knowing the array index. You have a very complex nested structure, and the DynamoDB API doesn't have a feature to handle this scenario.
I think you need the array indexes for controllers, routes and sensors to get the update to work.
Your approach may work in other databases like MongoDB. However, it wouldn't work in DynamoDB. Generally, it is not recommended to have such a complex structure in DynamoDB, especially if your use case involves updates.
TableName : 'tablename',
Key : { id: id},
ReturnValues : 'ALL_NEW',
UpdateExpression : 'set someitem[' + index + '].somevalue = :reply_content',
ExpressionAttributeValues : { ':reply_content' : updateddata }
To update a nested array element, you need to find out its array index. Then you can update the nested array element in DynamoDB.
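One way to obtain the indexes, sketched under assumptions, is to read the item first, locate the route on the client, and then issue an update with explicit indexes. The function names below are made up for illustration, and the ConditionExpression is an added safeguard (not part of the answers above) that rejects the update if the arrays changed between the read and the write.

```javascript
// Find the first route containing one of the given sensor ids.
// Returns the array indexes needed for a document path, or null.
function findRouteBySensor(item, sensorIds) {
  for (const [c, controller] of item.controllers.entries()) {
    for (const [r, route] of controller.routes.entries()) {
      const s = route.sensors.findIndex((sensor) => sensorIds.includes(sensor.id));
      if (s !== -1) {
        return { c, r, s, sensorId: route.sensors[s].id };
      }
    }
  }
  return null;
}

// Build DocumentClient-style update parameters using explicit array indexes.
function buildCarsUpdate(tableName, key, loc, cars) {
  const route = `#controllers[${loc.c}].#routes[${loc.r}]`;
  return {
    TableName: tableName,
    Key: key,
    UpdateExpression: `SET ${route}.#cars = :c`,
    // Only update if the sensor is still at the position that was read.
    ConditionExpression: `${route}.#sensors[${loc.s}].#id = :s`,
    ExpressionAttributeNames: {
      "#controllers": "controllers",
      "#routes": "routes",
      "#cars": "cars",
      "#sensors": "sensors",
      "#id": "id",
    },
    ExpressionAttributeValues: { ":c": cars, ":s": loc.sensorId },
    ReturnValues: "UPDATED_NEW",
  };
}

// A trimmed copy of the item from the question, as returned by a prior read.
const sampleItem = {
  id: "0f00b15e-83ee-4340-99ea-6cb890830d96",
  name: "region-1",
  controllers: [{
    id: "93014cf0-bb05-4fbb-9466-d56ff51b1d22",
    routes: [{
      direction: "N",
      cars: 0,
      sensors: [
        { id: "e82c45a3-d356-41e4-977e-f7ec947aad46", light: true },
        { id: "78a6883e-1ced-4727-9c94-2154e0eb6139" },
      ],
    }],
  }],
};

const loc = findRouteBySensor(sampleItem, ["78a6883e-1ced-4727-9c94-2154e0eb6139"]);
const carsUpdate = buildCarsUpdate(
  "regions", { id: sampleItem.id, name: sampleItem.name }, loc, 1);
console.log(carsUpdate.UpdateExpression);

// With the SDK loaded: docClient.update(carsUpdate, (err, data) => { ... });
```

The extra read costs one request, but it keeps the write itself a single conditional update.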

How to improve a query based on a nested object

I'm using AWS DynamoDB to store data in JSON format. The Partition key is "device" and sort key is "timestamp". I can query the table for a specific device in a range of dates. I can then filter the content by the specific endpoint (in the nested "reports" object) the application is interested in.
{
"device": "AAA111",
"attr1": "bbb",
"reports": [
{
"endpoint": 1,
"value": "23"
},
{
"endpoint": 3,
"value": "26"
},
{
"endpoint": 4,
"value": "20"
}
],
.........
............
...........
"timestamp": "2017-11-30T03:50:30z"
}
The problem I have is when, for example, I want to retrieve the latest value of a specific "endpoint". I can retrieve the latest record for a "device" based on the latest "timestamp", but there is no guarantee that this record contains a value for that particular endpoint (not all records contain all endpoints). To solve this I basically have to scan the latest records (in descending order) and return the first object where the endpoint is found. Also, I don't know how many records I have to retrieve to find one...
I'm wondering if there is a better way of doing this... I tried secondary indexes, but this would require duplicating the data, creating an object for each endpoint value (duplicating the common data). I would like to avoid this...
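The current approach can be sketched roughly as follows: query one page of the device's records in descending timestamp order, look for the endpoint in that page, and fetch the next page if it is not there. The function names, table name and page size are assumptions for illustration, and the paging loop is only outlined in comments because it needs a live table.

```javascript
// Query parameters for one page of a device's records, newest first.
function buildNewestFirstQuery(tableName, device, pageSize, exclusiveStartKey) {
  const params = {
    TableName: tableName,
    KeyConditionExpression: "#d = :device",
    ExpressionAttributeNames: { "#d": "device" },
    ExpressionAttributeValues: { ":device": device },
    ScanIndexForward: false, // descending timestamp order
    Limit: pageSize,
  };
  // Continue after the previous page when a start key is supplied.
  if (exclusiveStartKey) params.ExclusiveStartKey = exclusiveStartKey;
  return params;
}

// Return the first report for the endpoint in a page of items, or null.
function findEndpointInPage(items, endpoint) {
  for (const item of items) {
    const report = (item.reports || []).find((r) => r.endpoint === endpoint);
    if (report) return { timestamp: item.timestamp, value: report.value };
  }
  return null;
}

// Outline of the paging loop with the DocumentClient:
//   1. data = result of docClient.query(buildNewestFirstQuery(table, device, 25, startKey))
//   2. hit = findEndpointInPage(data.Items, endpoint); stop if hit is not null
//   3. startKey = data.LastEvaluatedKey; stop if it is undefined, else repeat from 1
console.log(buildNewestFirstQuery("readings", "AAA111", 25));
```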
I would appreciate any hints on how to solve this issue.
Thanks
Gus
