DynamoDB: re-order row priority field values and also make them sequential

I have a table in DynamoDB whose records have a priority field, 1-N.
The records are shown to the user in a form, and the user can update a record's priority, which means I need to change the value of that field.
One solution is that when the priority of a record changes, I re-order all the records whose priority is greater than it.
For example, if I change a record's priority from 5 to 10, I need to re-order all records whose priority field is greater than 5.
What do you recommend?

DynamoDB stores all items (records) in order of the table's sort attribute. However, you cannot update a key value; you would need to delete and re-insert the item every time you update it.
One way to overcome this is to create a GSI and keep the priority in its sort key. Depending on the throughput required for your table you may need to artificially shard the partition key; if you expect to consume less than 1000 WCU per second, you won't need to.
gsipk | gsisk | data
------+-------+-----
1     | 001   | data
1     | 002   | data
1     | 007   | data
1     | 009   | data
Now, to get all the data in order of priority, you simply Query your index where gsipk = 1.
You can also update the order attribute gsisk without having to delete and re-put the item.
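As a minimal sketch, assuming the AWS SDK for JavaScript v3 Document Client and placeholder table, index, and key names:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, QueryCommand, UpdateCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Read every record in priority order (gsisk holds the zero-padded priority).
const { Items } = await ddb.send(new QueryCommand({
  TableName: "mytable",             // placeholder table name
  IndexName: "priority-index",      // placeholder GSI name
  KeyConditionExpression: "gsipk = :pk",
  ExpressionAttributeValues: { ":pk": 1 },
}));

// Move one record to a new priority by rewriting its GSI sort attribute in place.
await ddb.send(new UpdateCommand({
  TableName: "mytable",
  Key: { pk: "record-5" },          // placeholder base-table key
  UpdateExpression: "SET gsisk = :p",
  ExpressionAttributeValues: { ":p": "010" },
}));

Note the gaps in the sample gsisk values above (001, 002, 007, 009): spacing the sort keys out means a record can usually be moved between two neighbours by updating just that one item, rather than renumbering everything after it.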

Related

Update data with filter on DynamoDB with PHP SDK

I have this DynamoDB table:
ID | customer_id | product_code | date_expire
---+-------------+--------------+------------
3  | 12          | TRE65GF      | 2023-11-15
5  | 12          | WDD2         | 2023-11-15
4  | 44          | BT4D         | 2023-06-23
What is the best way, in DynamoDB, to update the date_expire field for all records with the same customer_id?
For example, I want to set date_expire to "2023-04-17" on all items with customer_id = "12".
Should I do a Scan of the table to extract all the IDs and then a WriteRequestBatch?
Or is there a quicker way, like a normal SQL query ("update table set field=value where condition=xx")?
If this is a common use-case, then I would suggest creating a GSI with a partition key of customer_id:
customer_id | product_code | date_expire | ID
------------+--------------+-------------+---
12          | TRE65GF      | 2023-11-15  | 3
12          | WDD2         | 2023-11-15  | 5
44          | BT4D         | 2023-06-23  | 4
Retrieving all of a customer's items is then as simple as this PartiQL statement:
SELECT * FROM "mytable"."myindex" WHERE customer_id = 12
First you do a Query on customer_id to get back all of the customer's data; then you have a choice of how to update it:
UpdateItem
Depending on how many items are returned, it may be best to just iterate over them and call UpdateItem on each one. UpdateItem is better than PutItem or BatchWriteItem here because it is an upsert rather than an overwrite, which means you are less likely to corrupt your data through conflicts/consistency issues.
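For illustration, a sketch of that query-then-update loop with the AWS SDK for JavaScript v3 Document Client (the PHP SDK exposes the same Query and UpdateItem operations; table and index names here are assumptions):

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, QueryCommand, UpdateCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// 1. Query the GSI for all of the customer's items.
const { Items = [] } = await ddb.send(new QueryCommand({
  TableName: "mytable",              // assumed table name
  IndexName: "myindex",              // assumed GSI name (customer_id as PK)
  KeyConditionExpression: "customer_id = :c",
  ExpressionAttributeValues: { ":c": 12 },
}));

// 2. Upsert each item via its base-table key; only date_expire changes.
for (const item of Items) {
  await ddb.send(new UpdateCommand({
    TableName: "mytable",
    Key: { ID: item.ID },
    UpdateExpression: "SET date_expire = :d",
    ExpressionAttributeValues: { ":d": "2023-04-17" },
  }));
}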
BatchWriteItem
If you have a large number of items for a customer, BatchWriteItem may be best for speed; you can write batches of up to 25 items. But, as mentioned above, you are overwriting data, which can be dangerous when all you want to do is update.
TransactWriteItems
Transactions give you the ability to update batches of up to 100 items at a time, with the caveat that the batch is ACID-compliant: if one item's update fails for any reason, they all fail. Depending on your use-case, that may be exactly what you intend to happen.
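A sketch of the transactional variant, reusing ddb and Items from the previous snippet (all-or-nothing, up to 100 items per transaction):

import { TransactWriteCommand } from "@aws-sdk/lib-dynamodb";

// Every update succeeds or the whole batch is rolled back together.
await ddb.send(new TransactWriteCommand({
  TransactItems: Items.map((item) => ({
    Update: {
      TableName: "mytable",
      Key: { ID: item.ID },
      UpdateExpression: "SET date_expire = :d",
      ExpressionAttributeValues: { ":d": "2023-04-17" },
    },
  })),
}));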
Examples
PHP examples are available here.

DynamoDB Limit on query

I have a question about Limit on Query/Scan in DynamoDB.
My table has 1000 records, and a query over all of them returns 50 values. But if I set a Limit of 5, that doesn't mean the query returns the first 5 of those values; it just examines 5 items of the table (in any order, so they could be very old items or new ones), so it's possible the query returns 0 items. How can I actually get the latest 5 items of a query? I need to set a Limit of 5 (the numbers are examples) because it would be too expensive to query/scan more items than that.
The query has this input
{
  TableName: 'transactionsTable',
  IndexName: 'transactionsByUserId',
  ProjectionExpression: 'origin, receiver, #valid_status, createdAt, totalAmount',
  KeyConditionExpression: 'userId = :userId',
  ExpressionAttributeValues: {
    ':userId': 'user-id',
    ':payment_gateway': 'payment_gateway'
  },
  ExpressionAttributeNames: {
    '#valid_status': 'status'
  },
  FilterExpression: '#valid_status = :payment_gateway',
  Limit: 5
}
Should I use a second index or something to sort them by the createdAt field? But then, how can I be sure the query will look at all the items?
"If I put a Limit of 5, that doesn't mean that the query will return the first 5 values, it just says to query for 5 items on the table (in any order, so they could be very old items or new ones), so it's possible that I get 0 items on the query. How can I actually get the latest 5 items of a query?"
You are correct in your observation, and unfortunately there is no Query option, or any other operation, that can guarantee 5 items in a single request. To understand why this is the case (it's not just laziness on Amazon's side), consider the following extreme case: you have a huge database with one billion items, but you run a very selective query that matches just 5 items, and you make the request you wished for: "give me back 5 items". Such a request would need to read the entire database of a billion items before it could return anything, and the client would surely give up by then. So this is not how DynamoDB's Limit works: it limits the amount of work DynamoDB does before responding. If Limit = 100, DynamoDB internally reads 100 items, which takes a bounded amount of time. But you are right that you have no idea whether it will respond with 100 items (if all of them matched the filter) or 0 items (if none of them matched).
So to do what you want efficiently, you'll need to think of a different way to model your data, i.e., how to organize the partition and sort keys. There are different ways to do it, each with its own benefits and downsides, and you'll need to weigh the options for yourself. Since you asked about GSIs, I'll give you some hints about how to use that option:
The pattern you are looking for is called filtered data retrieval. As you noted, if you create a GSI with createdAt as the sort key, you can retrieve the newest items first. But you still need to filter, and you still don't know how to stop after 5 post-filtering results (not 5 pre-filtering results). The solution is to ask DynamoDB to only put items in the GSI in the first place if they pass the filter. In your example, it seems you always use the same filter: status = payment_gateway. DynamoDB doesn't have an option to run a generic filter function when building a GSI, but it has a different trick up its sleeve to achieve the same thing: any time you set status = payment_gateway, also set another attribute status_payment_gateway, and when status is set to anything else, delete status_payment_gateway. Now create the GSI with status_payment_gateway as the partition key. DynamoDB only puts items in a GSI if they have the index's key attributes, thereby achieving exactly the filtering you want.
You can also have multiple mutually-exclusive filtering criteria in one GSI by setting the partition key attribute to multiple different values, and you can then do a Query on each of these values separately (using KeyConditionExpression).
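A sketch of that sparse-GSI pattern (index and attribute names are assumptions; one variant stores the userId as the value of the sparse attribute, so each user gets their own partition in the index, with createdAt as its sort key):

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, UpdateCommand, QueryCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// When status becomes payment_gateway, mirror it into the sparse attribute.
await ddb.send(new UpdateCommand({
  TableName: "transactionsTable",
  Key: { userId: "user-id", createdAt: 1700000000 },  // assumed key shape
  UpdateExpression: "SET #s = :s, status_payment_gateway = :u",
  ExpressionAttributeNames: { "#s": "status" },
  ExpressionAttributeValues: { ":s": "payment_gateway", ":u": "user-id" },
}));
// When status changes to anything else: REMOVE status_payment_gateway,
// which drops the item out of the sparse GSI.

// Latest 5 matching items, newest first, with no post-query filtering.
const { Items } = await ddb.send(new QueryCommand({
  TableName: "transactionsTable",
  IndexName: "status_payment_gateway-createdAt-index",  // assumed GSI name
  KeyConditionExpression: "status_payment_gateway = :u",
  ExpressionAttributeValues: { ":u": "user-id" },
  ScanIndexForward: false,  // descending by createdAt
  Limit: 5,
}));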

Keep a maximum of 5 rows in a table (Room/Sqlite)

I want to store the last 5 searches a user performed in a SQLite table using Room. How can I always delete the oldest entry when there are more than 5 entries?
I don't want to add a date column and sort by date, as for privacy reasons I don't want to store the time when a user performed the search
I don't want to use an autoincrement id column, as it's theoretically limited at some given maximum that the ID can be
Could I maybe use the rowid? So checking if the number of entries in the table is larger than 5, then sort by rowid ascending and delete the first entry? Any other ideas?
"I don't want to use an autoincrement id column, as it's theoretically limited at some given maximum that the ID can be"
By that logic, one shouldn't use autoincrement values in a database at all. I doubt you really have data for which a Long's maximum value (9223372036854775807) wouldn't be enough, or could ever be reached: even inserting one row every nanosecond, it would take roughly 292 years to overflow.
Well, as one more alternative: if there are 5 rows in your table and you have an int id field, for example, you can use the following scheme:
1. Delete the row with the minimal id (I guess it would be 0).
2. Update all the rows in one query, decreasing their id by 1:
@Query("UPDATE search_table SET id = id - 1")
fun reorderData()
3. Insert the new row.

DynamoDB query to select all items that match a set of values

In a DynamoDB table I would like to query by selecting all items where an attribute's value matches one of a set of values. For example, my table has a current_status attribute, so I would like all items that have either a 'NEW' or 'ASSIGNED' value.
If I apply a GSI to the current_status attribute, it looks like I have to do this in two queries? Or should I instead do a scan?
AWS does not recommend using Scan; use it only when there is no other option and you have a fairly small amount of data.
You need to use a GSI here, but putting current_status in the PK of the GSI would result in a hot-partition issue.
The right solution is to put a random number in the PK of the GSI, ranging over 0..N-1, where N is the number of shards, and to put the status in the SK of the GSI, along with a timestamp or some other unique information to keep the PK-SK pair unique. When you want to query by current_status, execute N queries in parallel with the PK ranging over 0..N-1 and SK begins_with current_status (see the sketch after the links below). N should be chosen based on the amount of data you have. If each item is less than 4 KB, this parallel query operation would consume N read units without hot-partition issues. The links below provide detailed information on this:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-indexes-gsi-sharding.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-modeling-nosql-B.html
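A sketch of the write-side sharding and the parallel fan-out read (shard count, index, and attribute names are assumptions):

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, PutCommand, QueryCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));
const SHARDS = 10; // assumed shard count N

// On write: pick a random shard and build the composite sort key.
await ddb.send(new PutCommand({
  TableName: "mytable",                              // assumed table name
  Item: {
    id: "task-123",
    current_status: "NEW",
    gsi_pk: Math.floor(Math.random() * SHARDS),
    gsi_sk: `NEW#${Date.now()}#task-123`,            // status + timestamp + id keeps PK-SK unique
  },
}));

// On read: query all shards in parallel for one status, then merge.
async function queryStatus(status: string) {
  const results = await Promise.all(
    Array.from({ length: SHARDS }, (_, shard) =>
      ddb.send(new QueryCommand({
        TableName: "mytable",
        IndexName: "sharded-status-index",           // assumed GSI name
        KeyConditionExpression: "gsi_pk = :p AND begins_with(gsi_sk, :s)",
        ExpressionAttributeValues: { ":p": shard, ":s": status },
      }))
    )
  );
  return results.flatMap((r) => r.Items ?? []);
}

// 'NEW' or 'ASSIGNED': run both fan-outs and concatenate.
const items = [...(await queryStatus("NEW")), ...(await queryStatus("ASSIGNED"))];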

How to design DynamoDB table to facilitate searching by time ranges, and deleting by unique ID

I'm new to DynamoDB - I already have an application where the data gets inserted, but I'm getting stuck on extracting the data.
Requirements:
1. There must be a unique table per customer
2. Insert documents into the table (each doc has a unique ID and a timestamp)
3. Get X number of documents based on timestamp (ordered ascending)
4. Delete individual documents based on unique ID
So far I have created a table with a composite key (S:id, N:timestamp). However, when I come to query it, I realise that since my id is unique and I can't do a wildcard search on it, I won't be able to extract a range of items...
So, how should I design my table to satisfy this scenario?
Edit: Here's what I'm thinking:
The primary index will be composite: (S:customer_id, N:timestamp), where the customer ID will be the same within a table. This will enable me to extract data based on a time range.
The secondary index will be a hash (S:unique_doc_id), by which I will be able to delete items.
Does this sound like the correct solution? Thank you in advance.
You can satisfy the requirements like this:
Your primary key will be h:customer_id and r:unique_id. This makes sure all the elements in the table have different keys.
You will also have an attribute for the timestamp, with a Local Secondary Index on it.
You will use the LSI for requirement 3 and the BatchWriteItem API call to batch-delete for requirement 4, as sketched below.
This solution doesn't require (1); all the customers can stay in the same table. (Heads up: there is a limit-before-contact-us of 256 tables per account.)
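A sketch of the LSI query (requirement 3) and the batch delete (requirement 4); the table name, LSI name, and X are assumptions:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, QueryCommand, BatchWriteCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Requirement 3: first X docs for a customer, ascending by timestamp via the LSI.
const { Items = [] } = await ddb.send(new QueryCommand({
  TableName: "documents",               // assumed table name
  IndexName: "timestamp-lsi",           // assumed LSI name
  KeyConditionExpression: "customer_id = :c",
  ExpressionAttributeValues: { ":c": "cust-1" },
  ScanIndexForward: true,               // ascending order
  Limit: 10,                            // X
}));

// Requirement 4: delete selected docs by primary key, up to 25 per batch.
await ddb.send(new BatchWriteCommand({
  RequestItems: {
    documents: Items.slice(0, 25).map((doc) => ({
      DeleteRequest: { Key: { customer_id: doc.customer_id, unique_id: doc.unique_id } },
    })),
  },
}));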
