I have a DynamoDB schema that looks quite similar to the one described in the AWS docs: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-adjacency-graphs.html
PK = invoice-id, SK = bill-id
I have an invoice with more than 10,000 bills, which exceeds the 400 KB item size limit. I use an ImmutableSet to hold the bills (connected nodes), which helps with deduplication.
What's the best way to work around this limit? If I only want to keep the latest ~100 bills in an invoice, is it possible to implement this with minimal effort?
Currently, because of the item size limit, the database has stopped writing any bill data for the invoice. I tried setting a TTL on the bills, but the TTL doesn't remove them from the invoice entry.
We use Cosmos DB to track all our devices. Data that is related to a device (but not stored in the device document itself) is stored in the same container under the same partition key.
Both the device document and the related documents have /deviceId as the partition key. When a device is removed, I remove the device document. I actually want to remove the entire partition, but this doesn't seem to be possible, so I resort to a query that fetches all items with this partition key and removes them from the database.
This works fine, but it may consume a lot of RUs if there is a lot of related data (which may be true in some cases). I would rather just remove the device and schedule all related data for removal later (it doesn't hurt to have it in the database for a while). When RU utilization is low, I would start removing these items. Is there a standard solution for this?
The best solution would be to schedule this so that Cosmos DB processes these commands when it has spare RUs, just like TTL deletion. Is this even possible?
A feature is now in preview to delete all items by partition key using a fire-and-forget background processing model that consumes only a limited share of the available throughput. There's a signup link on the feature request page to get access to the preview.
Currently, the API looks like a new DeleteAllItemsByPartitionKey method in the SDK.
It definitely is possible to set a TTL and then let Cosmos handle expiring data out of the container when it is idle. However, the cost to update the document in the first place is about what it costs to delete it anyway so you're not gaining much.
One approach, as you suggest, is to have a separate container (or even a queue) where you insert a new item with the deviceId to retire. Then, in the evenings or at a time when you know the system is idle, run a job that reads the next deviceId in the queue, queries for all the items with that partition key, and then deletes the data or sets the TTL to expire it.
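A minimal sketch of that retire-queue job, assuming a hypothetical InMemoryContainer stand-in so it runs without an Azure account. With the real azure-cosmos SDK you would swap the stand-in's two methods for the container's query_items (scoped to the partition key) and delete_item calls. The max_deletes cap is my own addition, meant as a rough way to bound how many RUs a single run can spend.

```python
from collections import deque


class InMemoryContainer:
    """Hypothetical stand-in for a Cosmos DB container (illustration only)."""

    def __init__(self, items):
        self.items = list(items)

    def query_by_partition(self, device_id):
        # Real SDK: a query restricted to partition_key=device_id.
        return [i for i in self.items if i["deviceId"] == device_id]

    def delete_item(self, item_id, device_id):
        self.items = [
            i for i in self.items
            if not (i["id"] == item_id and i["deviceId"] == device_id)
        ]


def drain_retire_queue(queue, container, max_deletes):
    """Delete related items for queued deviceIds, stopping after max_deletes."""
    deleted = 0
    while queue and deleted < max_deletes:
        device_id = queue[0]
        remaining = container.query_by_partition(device_id)
        if not remaining:
            queue.popleft()  # nothing left in this partition, device is retired
            continue
        for item in remaining:
            if deleted >= max_deletes:
                break  # budget spent, the device stays queued for the next run
            container.delete_item(item["id"], device_id)
            deleted += 1
    return deleted
```

Because a deviceId only leaves the queue once its partition is empty, a run that hits the cap simply resumes with the same device the next time the job is scheduled.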
There is a feature to delete an entire partition in the works that would be perfect for this scenario (in fact, it's designed for it) but no ETA on availability.
My application needs to support lookups for invoices by invoice ID and by customer. For that reason I created two collections in which I store the (exact) same invoice documents:
InvoicesById, with partition key /InvoiceId
InvoicesByCustomerId, with partition key /CustomerId
Apparently you should use partition keys when doing queries and since there are two queries I need two collections. I guess there may be more in the future.
Updates are primarily done to the InvoicesById collection, but then I need to replicate the change to InvoicesByCustomer (and others) as well.
Are there any best practices or sane approaches for keeping collections in sync?
I'm thinking of change feeds and whatnot. I want to avoid writing this sync code and risking inconsistencies due to missing transactions between collections (etc.). Or maybe I'm missing something crucial here.
Change feed will do the trick, though I would suggest taking a step back before brute-forcing the problem.
You can find a detailed article describing the split issue here: Azure Cosmos DB. Partitioning.
Based on the Microsoft recommendation for maintainable data growth, you should select the partition key with the highest cardinality (in your case I assume it will be InvoiceId). The main reason is to:
Spread request unit (RU) consumption and data storage evenly across all logical partitions. This ensures even RU consumption and storage distribution across your physical partitions.
You don't need to create a separate container with a CustomerId partition key: it won't give you the desired and, most importantly, maintainable performance in the future, and it might result in physical partition data skew when too many invoices are linked to the same customer.
To get optimal and scalable query performance, you most probably need InvoiceId as the partition key and an indexing policy that covers CustomerId (and other fields in the future).
There will be a slight RU overhead (definitely not a multiplication of RUs, but rather a couple of additional RUs per request) when the data you're querying is distributed across a number of physical partitions (PPs), but it will be negligible compared to the issues that occur when data starts growing beyond 50, 100, or 150 GB.
Why might CustomerId not be the best partition key for data sets that are expected to grow beyond 50 GB?
The main reason is that Cosmos DB is designed to scale horizontally, and the provisioned throughput per PP is limited to [total provisioned per container (or DB)] / [number of PPs].
Once a PP split occurs because it exceeded 50 GB in size, the maximum throughput for the existing PPs as well as the two newly created PPs will be lower than it was before the split.
So imagine the following scenario (consider days as a measure of time between actions):
[Day 1] You've created a container with 10k provisioned RUs and a CustomerId partition key (which will generate one underlying PP1). Maximum throughput per PP is 10k/1 = 10k RUs
[Day 2] Gradually adding data to the container, you end up with 3 big customers with C1[10GB], C2[20GB] and C3[10GB] of invoices
[Day 3] When another customer is onboarded to the system with C4[15GB] of data, Cosmos DB has to split the PP1 data into two newly created partitions, PP2 (30GB) and PP3 (25GB). Maximum throughput per PP is 10k/2 = 5k RUs
[Day 4] Two more customers, C5[10GB] and C6[15GB], are added to the system and both end up in PP2, which leads to another split -> PP4 (20GB) and PP5 (35GB). Maximum throughput per PP is now 10k/3 = 3.333k RUs
IMPORTANT: As a result, on [Day 2] C1 data was queried with up to 10k RUs,
but on [Day 4] with a maximum of only 3.333k RUs, which directly impacts the execution time of your query.
This is the main thing to remember when designing partition keys in the current version of Cosmos DB (12.03.21).
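The arithmetic in that scenario can be checked with a few lines of Python. This is only a sketch of the even-split rule stated above (provisioned RUs divided by the number of physical partitions); the function name is hypothetical and the day labels follow the scenario.

```python
def max_ru_per_partition(provisioned_ru, physical_partitions):
    """Per-partition throughput ceiling, assuming an even split across PPs."""
    return provisioned_ru / physical_partitions


# Number of physical partitions at each point of the scenario.
timeline = [("Day 1", 1), ("Day 2", 1), ("Day 3", 2), ("Day 4", 3)]

for day, partitions in timeline:
    ceiling = max_ru_per_partition(10_000, partitions)
    print(f"{day}: {partitions} PP(s), up to {ceiling:.0f} RUs per PP")
```

The customer's data did not change between Day 2 and Day 4; only the number of physical partitions did, which is why the ceiling for a single-partition query dropped.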
What you are doing is a good solution. Different queries require different partition keys on different Cosmos DB containers holding the same data.
How to sync the two containers: use triggers from the first container.
Cassandra has a feature called Materialized Views for this exact problem, which abstracts away the sync problem. Maybe someday the same feature will be included in Cosmos DB.
In DynamoDB I have a table like the example data below:
pk       sk                        name        price
product  cat#phone#name#iPhone11   iPhone 11   500
product  cat#phone#name#Nokia1100  Nokia 1100  100
product  cat#phone#name#iPhone11   iPhone 11   500
In one case I have to search by name. So I first created a global secondary index for name, where in the index pk = pk and sk = name. The search works fine.
Then I changed my mind and created a local secondary index for name, where name is the sk. It also works fine. My questions: if I use a local index here, is there any benefit? When should I not use a local index? And if a global index is not required here but I have used one, are there any performance issues?
This AWS doc explains LSI and GSI very well and in detail.
Now to answer your questions:
- An LSI has no provisioned throughput of its own, so you don't pay for separate RCUs and WCUs as you would for a GSI; it consumes the base table's capacity. You do, however, pay for the index storage, as depicted here in another AWS doc.
- Do not use an LSI if you expect that a single partition (i.e. one pk value) of your main table can grow beyond 10 GB (the pk remains the same in the LSI). This is also discussed in the link shared above.
- There is no performance difference between LSI and GSI in terms of query latency. However, reads from a GSI are always eventually consistent, whereas an LSI supports strongly consistent reads.
Edit: adding an excerpt from the AWS doc to explain strongly and eventually consistent reads.
Strongly Consistent Reads - When you request a strongly consistent read, DynamoDB returns a response with the most up-to-date data, reflecting the updates from all prior write operations that were successful.
Eventually Consistent Reads - When you read data from a DynamoDB table, the response might not reflect the results of a recently completed write operation. The response might include some stale data. If you repeat your read request after a short time, the response should return the latest data.
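To make the consistency difference concrete, here is a rough sketch that builds the arguments for a DynamoDB Query against a name index. The table name, index names and the index_kind flag are hypothetical; the parameter names (IndexName, ConsistentRead and so on) are those of the low-level Query API. Note that name is a DynamoDB reserved word, so it is aliased as #n.

```python
def build_name_query(table, index_name, index_kind, name, consistent=False):
    """Build keyword arguments for a DynamoDB Query on a name index.

    index_kind is "LSI" or "GSI" (a hypothetical flag for this sketch).
    """
    if consistent and index_kind == "GSI":
        # DynamoDB rejects ConsistentRead=True on a global secondary index.
        raise ValueError("a GSI only supports eventually consistent reads")
    return {
        "TableName": table,
        "IndexName": index_name,
        "KeyConditionExpression": "pk = :pk AND #n = :name",
        "ExpressionAttributeNames": {"#n": "name"},
        "ExpressionAttributeValues": {
            ":pk": {"S": "product"},
            ":name": {"S": name},
        },
        "ConsistentRead": consistent,
    }


# Strongly consistent read through the local index (hypothetical names).
params = build_name_query("products", "name-lsi", "LSI", "iPhone 11", consistent=True)
# With boto3 installed: boto3.client("dynamodb").query(**params)
```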
Refer to this AWS doc for tips on minimising the propagation delay of data from the main table to GSIs.
I've been thinking a lot about the possible strategies for querying an unbounded number of items.
For example, think of a forum - you could have any number of forum posts categorized by topic. You need to support at least 2 access patterns: post details view and list of posts by topic.
// legend
PK = partition key, SK = sort key
While it's easy to get a single post, you can't effectively query a list of posts without a scan.
PK = postId
Great for querying all the posts for a given topic, but all of them are in the same partition ("hot partition").
PK = topic and SK = postId#addedDateTime
Store items in buckets, e.g new bucket for each day. This would push a lot of logic to the application layer and add latency. E.g. if you need to get 10 posts, you'd have to query today's bucket and, if that bucket contains fewer than 10 items, query yesterday's bucket, etc. Don't even get me started on pagination. That would probably be a nightmare if it crosses buckets.
PK = topic#date and SK = postId#addedDateTime
So my question is: how do you store and query an unbounded list of items the "DynamoDB way"?
I think you've got a good understanding about your options.
I can't profess to know the One True Way™ to solve this particular problem in DynamoDB, but I'll throw out a few thoughts for the sake of discussion.
While it's easy to get a single post, you can't effectively query a list of posts without a scan.
This would definitely be the case if your Primary Key consists solely of the postId (I'll use POST#<postId> to make it easier to read). That table would look something like this:
This would be super efficient for the "fetch post details view" (aka fetch post by ID) access pattern. However, we haven't built in any way to access a group of Posts by topic. Let's give that a shot next.
There are a few ways to model the one-to-many relationship between Posts and topics. The first thing that comes to mind is creating a secondary index on the topic field. Logically, that would look like this:
Now we can get an item collection of Posts by topic using the efficient query operation. Pagination will help you if your number of Posts per topic grows larger. This may be enough for your application. For the sake of this discussion, let's assume it creates a hot partition and consider what strategies we can introduce to reduce the problem.
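As a rough illustration of that pagination, here is a sketch of a loop that keeps querying until it has enough posts. The query argument is any callable shaped like boto3's query call (keyword arguments in, a dict with Items and an optional LastEvaluatedKey out); fetch_posts and the wanted parameter are hypothetical names.

```python
def fetch_posts(query, wanted, **params):
    """Follow LastEvaluatedKey until `wanted` items are collected or pages run out."""
    items = []
    while len(items) < wanted:
        page = query(**params)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            break  # no more pages in this item collection
        # Resume the next request where the previous page stopped.
        params["ExclusiveStartKey"] = last_key
    return items[:wanted]
```

With boto3 this could be called as fetch_posts(table.query, 10, IndexName=..., KeyConditionExpression=...), with the index name and key condition matching your own secondary index.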
One Option
You said
Store items in buckets, e.g new bucket for each day.
This is a great idea! Let's update our secondary index partition key to be <topic>#<truncated_timestamp> so we can group posts by topic for a given time frame (day/week/month/etc).
I've done a few things here:
Introduced two new attributes to represent the secondary index PK and SK (GSIPK and GSISK respectively).
Introduced a truncated timestamp into the partition key to represent a given month. For example, POST#1 and POST#2 both have a posted_at timestamp in September. I truncated both of those timestamps to 2020-09-01 to represent the entire month of September (or whatever time boundary that makes sense for your application).
This will help distribute your data across partitions, reducing the hot key issue. As you correctly note, this will increase the complexity of your application logic and increase latency, since you may need to make multiple requests to retrieve enough results for your application's needs. However, this might be a reasonable trade-off in this situation. If the increased latency is a problem, you could pre-populate a partition to contain the results of the prior N months' worth of a topic discussion (e.g. PK = TOPIC_CACHE#<topic> with a list attribute that contains a list of postIds from the prior N months).
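The "multiple requests" part might look something like the sketch below, which walks monthly buckets from newest to oldest until enough posts are found. It assumes the <topic>#<truncated_timestamp> partition key described above with month-start dates; fetch_bucket is a hypothetical callable standing in for a Query against the secondary index, and max_buckets is my own safety limit.

```python
from datetime import date


def month_buckets(start, count):
    """Yield `count` month-start dates, newest first, beginning with start's month."""
    year, month = start.year, start.month
    for _ in range(count):
        yield date(year, month, 1)
        month -= 1
        if month == 0:
            year, month = year - 1, 12


def latest_posts(fetch_bucket, topic, wanted, today, max_buckets=12):
    """Collect up to `wanted` posts by querying one monthly bucket at a time."""
    posts = []
    for bucket in month_buckets(today, max_buckets):
        key = f"{topic}#{bucket.isoformat()}"
        # fetch_bucket(key, limit) returns at most `limit` posts, newest first.
        posts.extend(fetch_bucket(key, wanted - len(posts)))
        if len(posts) >= wanted:
            break
    return posts[:wanted]
```

A quiet topic costs one request per empty bucket, which is the latency trade-off mentioned above.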
If the TOPIC_CACHE ends up being a hot partition, you could always shard the partition using a calculated suffix:
Your application could randomly select a TOPIC_CACHE between 1..N when retrieving the topic cache.
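A small sketch of that random selection. The exact key layout is an assumption on my part: I append the shard number as TOPIC_CACHE#<topic>#<n>, and both function names are hypothetical. Readers pick one shard at random, while whatever populates the cache has to write the same list to all N shards.

```python
import random


def topic_cache_key(topic, shard_count, rng=random):
    """Pick one of the N cache shards at random for a read."""
    shard = rng.randint(1, shard_count)  # randint is inclusive on both ends
    return f"TOPIC_CACHE#{topic}#{shard}"


def all_topic_cache_keys(topic, shard_count):
    """Every shard key, for the writer that refreshes the cache."""
    return [f"TOPIC_CACHE#{topic}#{n}" for n in range(1, shard_count + 1)]
```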
There are numerous ways to approach this access pattern, and these options represent only a few possibilities. If it were my application, I would start by creating a secondary index using the Post topic as the partition key. It's the easiest to implement and would give me an opportunity to see how my application access patterns performed in a production environment. If the hot key issue started to become a problem, I'd dive deeper into some sort of caching solution.
I have a table with close to 2 billion rows already created in DynamoDB.
Due to a query requirement, I had to create a Global Secondary Index (GSI) on it. The GSI creation started 36 hours ago but still hasn't completed. The portal shows the item count to be around 100 million, so there's a long way to go.
Why does it take such a long time when sufficient WCUs and RCUs are allotted (30k, in fact)?
The GSI partition key I've used is something whose values are repetitive. Could that be the reason why GSI creation is taking more time? (The ideal scenario is to select a partition key that doesn't repeat, so that items span multiple partitions.)
Is there a way to abort the creation of a GSI while the process is running? The AWS console doesn't allow it.
A GSI has its own WCUs and RCUs, distinct and separate from the primary index. Could this be because you don't have enough WCUs on your GSI?
If your global secondary index is taking too long to create (common when adding indexes on an existing large table), you can provision additional write capacity by following these steps:
Open the DynamoDB console.
From the navigation pane, choose Tables, and then select your table from the list.
Choose the Indexes tab.
Increase the write capacity of the index, and then choose Save.
After about a minute, check the OnlineIndexPercentageProgress metric from the Metrics tab to see if the creation of your global secondary index is progressing satisfactorily.
EDIT: Above from the AWS Knowledge Center
'OnlineIndexPercentageProgress' instructions:
Creation of your global secondary index will begin. You can monitor the progress on the Metrics tab:
Choose the Metrics tab.
Choose View all CloudWatch metrics.
In the CloudWatch console, choose DynamoDB.
In the Search Metrics box, enter OnlineIndexPercentageProgress. Note: If the search returns an empty list, wait about a minute for metrics to populate.
Choose the name of the index to see the progress.
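If you'd rather script both console procedures, the sketch below builds the request arguments for the two corresponding API calls: UpdateTable (to raise the index's provisioned capacity) and CloudWatch GetMetricStatistics (to read OnlineIndexPercentageProgress). The helper names, the 15-minute lookback and the table and index names in the comments are hypothetical; actually sending the requests assumes boto3 and AWS credentials.

```python
from datetime import datetime, timedelta, timezone


def gsi_capacity_update(table, index, read_units, write_units):
    """Arguments for UpdateTable that change one GSI's provisioned capacity."""
    return {
        "TableName": table,
        "GlobalSecondaryIndexUpdates": [
            {
                "Update": {
                    "IndexName": index,
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": read_units,
                        "WriteCapacityUnits": write_units,
                    },
                }
            }
        ],
    }


def index_progress_request(table, index, now=None, lookback_minutes=15):
    """Arguments for GetMetricStatistics on the index build progress metric."""
    now = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/DynamoDB",
        "MetricName": "OnlineIndexPercentageProgress",
        "Dimensions": [
            {"Name": "TableName", "Value": table},
            {"Name": "GlobalSecondaryIndexName", "Value": index},
        ],
        "StartTime": now - timedelta(minutes=lookback_minutes),
        "EndTime": now,
        "Period": 60,
        "Statistics": ["Maximum"],
    }


# With boto3 installed (table and index names are placeholders):
# boto3.client("dynamodb").update_table(**gsi_capacity_update("my-table", "my-index", 100, 40000))
# boto3.client("cloudwatch").get_metric_statistics(**index_progress_request("my-table", "my-index"))
```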