DocumentDb Cross Partition Querying Strategy - azure-cosmosdb

Based on this article, I have a question of strategy:
https://learn.microsoft.com/en-us/azure/cosmos-db/partition-data
A) Should I be structuring my partition keys so that my queries (ideally) end up at one partition? E.g. PartitionKey = CustomerId
OR
B) Does DocumentDB still handle queries that cross multiple (many) partitions efficiently? E.g. PartitionKey = "CustomerId+ContextName+TypeName"
We currently have "A" implemented, but have discussed "B" because the article has this quote in it:
It is a best practice to have a partition key with many distinct
values (100s-1000s at a minimum).
Emphasis on "at a minimum". Our CustomerIds will not be of a volume that produces more than 200-300 partition keys. Should we add more information to the key ("B"), knowing that one query may then hit 30-50 partitions (i.e. because of the "TypeName" addition specifically)?
SELECT * FROM c
WHERE (c.MyPartition = "1+ContextA+TypeA"
    OR c.MyPartition = "1+ContextA+TypeB"
    OR c.MyPartition = "1+ContextA+TypeC"
    ...)
AND <some other conditions>
The scenarios laid out in the article seem to presume that a customer or user will generate plenty of keys. This isn't going to be true for us.

The DocumentDB SDK makes parallel calls when you run a cross-partition query.
If you check the network traffic, you will notice that it first queries the physical partition key ranges and then makes individual calls to each partition key range.
It does this in parallel, and it lets you control MaxDegreeOfParallelism, etc.
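To make that concrete, here is a rough sketch of such a fan-out query using the .NET DocumentDB SDK (Microsoft.Azure.Documents.Client, v2); the database, collection and query text are placeholders, not anything from your setup:

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

class CrossPartitionQuerySample
{
    public static async Task RunAsync(DocumentClient client)
    {
        Uri collectionUri = UriFactory.CreateDocumentCollectionUri("mydb", "mycoll");

        var options = new FeedOptions
        {
            // Let the SDK fan the query out to every partition key range.
            EnableCrossPartitionQuery = true,
            // Query the partition key ranges in parallel; -1 lets the SDK decide.
            MaxDegreeOfParallelism = -1,
            MaxItemCount = 100
        };

        var query = client.CreateDocumentQuery<dynamic>(
                collectionUri,
                "SELECT * FROM c WHERE c.CustomerId = '1'",
                options)
            .AsDocumentQuery();

        while (query.HasMoreResults)
        {
            foreach (var doc in await query.ExecuteNextAsync<dynamic>())
            {
                Console.WriteLine(doc);
            }
        }
    }
}

Under the covers the SDK fetches the collection's partition key ranges first and then issues the per-range calls in parallel, up to the configured degree of parallelism.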
Having said that, there are two aspects to consider:
Volume of the data
If your volume is, say, 1 TB, that would require at least 100 physical partitions (each partition being 10 GB), hence the SDK would make at least 100 calls.
If your data volume grows larger, making more calls may start to hurt performance.
Querying aggregations
DocumentDB currently supports the aggregations SUM/AVG/COUNT/MIN/MAX, but these cannot be performed across partitions.

Related

How the partition limit of DynamoDB works for small databases?

I have read that a single partition of DynamoDB has a size limit of 10GB. Does this mean that if all my data is smaller than 10GB, I have only one partition?
There is also a limit of 3000 RCUs or 1000 WCUs on a single partition. Does this mean that this is also the limit for a small database which has only one partition?
I use the billing mode PAY_PER_REQUEST. On the database there are short usage peaks of approximately 50MB of data, and then there is nothing for hours. How can I design the database to get the best peak performance? Or is DynamoDB a bad option for this use case?
How to design a database for the best performance, and how to pick the right database... these are deep questions.
DynamoDB works well for a wide variety of use cases. On the back end it uses partitions. You rarely have to think about partitions until you're at the high-end of scale. Are you?
Partition keys are used as a way to map data to partitions, but the mapping is not 1-to-1. If you don't follow best practice guidance and use one PK value, the database may still split the items across back-end partitions to spread the load. Just don't use a Local Secondary Index (LSI), as that prohibits this ability. The details of the mapping depend on your usage pattern.
One physical partition will be 10 GB or less, and has the 3,000 Read units and 1,000 Write units limit, which is why the database will spread load across partitions. If you use a lot of PK values you make it more straightforward for the database to do this.
If you're at a high enough scale to hit the performance limits, you'll have an AWS account manager you can ask to hook you up with a DynamoDB specialist.
A given partition key can't receive more than 3k RCUs / 1k WCUs worth of requests at any given time, or store more than 10GB in total if you're using an LSI (if not using an LSI, you can store more than 10GB, assuming you're using a sort key). If your data definitely fits within those limits, there's no reason you can't use DDB with a single partition key value (and thus a single partition). It'd still be better to plan on a design that could scale.
The right design for you will depend on what your data model and access patterns look like. Given what you've described of some kind of periodic job, a timestamp could be used (although it has issues with hotspots you should be careful of). If you've got some kind of other unique id, like user_id or device_id, etc. that would be a better choice. There is some great documentation on that here.
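For illustration only (the table and attribute names here are made up, not taken from the question), a table along those lines could be declared with the AWS SDK for .NET like this:

using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;

class CreateTableSample
{
    public static async Task RunAsync(IAmazonDynamoDB client)
    {
        // On-demand capacity (PAY_PER_REQUEST), matching the question's billing mode.
        var request = new CreateTableRequest
        {
            TableName = "Readings",
            BillingMode = BillingMode.PAY_PER_REQUEST,
            AttributeDefinitions = new List<AttributeDefinition>
            {
                new AttributeDefinition("device_id", ScalarAttributeType.S),
                new AttributeDefinition("reading_ts", ScalarAttributeType.S)
            },
            KeySchema = new List<KeySchemaElement>
            {
                // Many distinct partition key values let DynamoDB spread load.
                new KeySchemaElement("device_id", KeyType.HASH),
                // Sort key keeps each device's readings ordered by time.
                new KeySchemaElement("reading_ts", KeyType.RANGE)
            }
        };

        await client.CreateTableAsync(request);
    }
}

With device_id as the partition key and a timestamp sort key, writes from many devices spread across partitions while each device's readings stay together and ordered.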

How do you synchronize related collections in Cosmos Db?

My application needs to support lookups for invoices by invoice id and by the customer. For that reason I created two collections in which I store the (exact) same invoice documents:
InvoicesById, with partition key /InvoiceId
InvoicesByCustomerId, with partition key /CustomerId
Apparently you should use partition keys when doing queries, and since there are two queries I need two collections. I guess there may be more in the future.
Updates are primarily done to the InvoicesById collection, but then I need to replicate the change to InvoicesByCustomerId (and others) as well.
Are there any best practices or sane approaches for keeping collections in sync?
I'm thinking change feeds and what not. I want to avoid writing this sync code and risking inconsistencies due to missing transactions between collections (etc.). Or maybe I'm missing something crucial here.
Change feed will do the trick, though I would suggest taking a step back before brute-forcing the problem.
Please find a detailed article describing the split issue here: Azure Cosmos DB. Partitioning.
Based on the Microsoft recommendation, for maintainable data growth you should select the partition key with the highest cardinality (in your case I assume it will be InvoiceId), for the main reason:
Spread request unit (RU) consumption and data storage evenly across all logical partitions. This ensures even RU consumption and storage distribution across your physical partitions.
You don't need to create a separate container with CustomerId as the partition key, as it won't give you the desired, and most importantly maintainable, performance in the future, and it might result in physical partition data skew when too many invoices are linked to the same customer.
To get optimal and scalable query performance you most probably need InvoiceId as the partition key and an indexing policy that covers CustomerId (and others in the future).
There will be a slight RU consumption overhead (definitely not a multiplication of RUs, but rather a couple of additional RUs per request) when the data you're querying is distributed across a number of physical partitions (PPs), but it will be negligible compared to the issues that occur once data grows beyond 50, 100, 150 GB.
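As a rough sketch of the two access paths against a single container (v2 .NET SDK; the database/container names are placeholders, and it assumes the document id is the InvoiceId):

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

class InvoiceLookups
{
    static readonly Uri CollectionUri =
        UriFactory.CreateDocumentCollectionUri("billing", "invoices");

    // Lookup by InvoiceId: a single-partition point read.
    public static Task<ResourceResponse<Document>> GetByInvoiceIdAsync(
        DocumentClient client, string invoiceId)
    {
        return client.ReadDocumentAsync(
            UriFactory.CreateDocumentUri("billing", "invoices", invoiceId),
            new RequestOptions { PartitionKey = new PartitionKey(invoiceId) });
    }

    // Lookup by CustomerId: a cross-partition query served by the index.
    public static IDocumentQuery<dynamic> GetByCustomerId(
        DocumentClient client, string customerId)
    {
        return client.CreateDocumentQuery<dynamic>(
                CollectionUri,
                new SqlQuerySpec(
                    "SELECT * FROM c WHERE c.CustomerId = @customerId",
                    new SqlParameterCollection
                    {
                        new SqlParameter("@customerId", customerId)
                    }),
                new FeedOptions { EnableCrossPartitionQuery = true })
            .AsDocumentQuery();
    }
}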
Why CustomerId might not be the best partition key for the data sets which are expected to grow beyond 50GB?
Main reason is that Cosmos DB is designed to scale horizontally and provisioned throughput per PP is limited to the [total provisioned per container (or DB)] / [number of PP].
Once a PP split occurs due to exceeding the 50GB size, your max throughput for the existing PPs as well as the two newly created PPs will be lower than it was before the split.
So imagine the following scenario (consider days as the measure of time between actions):
[Day 1] You've created a container with 10k RUs provisioned and CustomerId as the partition key (which generates one underlying physical partition, PP1). Maximum throughput per PP is 10k/1 = 10k RUs.
[Day 2] Gradually adding data to the container, you end up with 3 big customers: C1 [10GB], C2 [20GB] and C3 [10GB] of invoices.
[Day 3] Another customer, C4 [15GB], is onboarded, so Cosmos DB has to split PP1's data into two newly created partitions, PP2 (30GB) and PP3 (25GB). Maximum throughput per PP is 10k/2 = 5k RUs.
[Day 4] Two more customers, C5 [10GB] and C6 [15GB], are added and both end up in PP2, which leads to another split -> PP4 (20GB) and PP5 (35GB). Maximum throughput per PP is now 10k/3 = 3.333k RUs.
IMPORTANT: As a result, on [Day 2] C1's data could be queried with up to 10k RUs,
but on [Day 4] with at most 3.333k RUs, which directly impacts the execution time of your queries.
This is the main thing to remember when designing partition keys in the current version of Cosmos DB (as of 12.03.21).
What you are doing is a good solution. Different queries require different partition keys on different Cosmos DB containers with the same data.
How to sync the two containers: use Triggers from the first container.
https://devblogs.microsoft.com/premier-developer/synchronizing-azure-cosmos-db-collections-for-blazing-fast-queries/
Cassandra has a feature called Materialized Views for this exact problem, abstracting away the sync problem. Maybe some day the same feature will be included in Cosmos DB.
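If you do go the multi-container route, a minimal change-feed-based sync could look roughly like this with the newer Microsoft.Azure.Cosmos (v3) SDK; the container names, the lease container and the Invoice shape are assumptions based on the question:

using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

class Invoice
{
    public string id { get; set; }
    public string InvoiceId { get; set; }
    public string CustomerId { get; set; }
}

class InvoiceSync
{
    // Copies every change from InvoicesById into InvoicesByCustomerId.
    public static async Task<ChangeFeedProcessor> StartAsync(CosmosClient client)
    {
        Database db = client.GetDatabase("billing");
        Container source = db.GetContainer("InvoicesById");
        Container target = db.GetContainer("InvoicesByCustomerId");
        Container leases = db.GetContainer("leases");

        ChangeFeedProcessor processor = source
            .GetChangeFeedProcessorBuilder<Invoice>(
                processorName: "invoiceSync",
                onChangesDelegate: async (IReadOnlyCollection<Invoice> changes, CancellationToken ct) =>
                {
                    foreach (Invoice invoice in changes)
                    {
                        // Upsert so replays of the feed stay idempotent.
                        await target.UpsertItemAsync(invoice, new PartitionKey(invoice.CustomerId), cancellationToken: ct);
                    }
                })
            .WithInstanceName("invoice-sync-worker")
            .WithLeaseContainer(leases)
            .Build();

        await processor.StartAsync();
        return processor;
    }
}

Note that the lease container (here simply named "leases") has to exist before the processor starts; it is where the processor tracks its progress through the feed.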

DocumentDB partitions sizes

According to the docs, documents with different partition keys may end up in the same partition, but documents with the same partition key are guaranteed to end up in the same partition.
Now, let's consider a case where you have a partition key with cardinality = 100 (for example, 100 tenants).
Initially, all data is roughly equally distributed across partitions.
Let's say you end up with partitions of about 50GB in size. I would assume in that case you might have a few partition keys contained within the same partition. Then, all of a sudden, 2 of your tenants grow exponentially and go to 200GB in size.
Since a partition has a 250GB limit, now you're in trouble.
Questions:
How is this being solved?
Is DocumentDB partitioning handling this moving to separate partitions?
Should we (and are we even able to) view data/storage consumption per partitionKey (not partition)?
I'd appreciate it if someone could shed a bit of light on these dilemmas, as I couldn't find answers to these specific questions in the docs.
Currently, the logical partition for a single partition key value cannot exceed 10GB. That means you have to ensure that at any given point in time your logical partition does not exceed 10GB.
Source: MSDN
A logical partition is a partition within a physical partition that stores all the data associated with a single partition key value. A logical partition has a 10 GB max.
On your questions:
How is this being solved?
Choose an appropriate partition key and ensure it is well balanced. If you anticipate that a tenant's data might grow beyond 10GB, then having the tenant id as the partition key is not an option. You have to have something else as the partition key that can scale.
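One common workaround is a synthetic partition key that combines the tenant id with a computed suffix, so a single tenant's data spreads across several logical partitions. A minimal sketch, where the bucket count and key format are arbitrary choices:

using System;

static class SyntheticPartitionKey
{
    const int BucketCount = 16; // arbitrary; pick based on expected per-tenant volume

    // e.g. "tenant42-07" — documents for one tenant spread over 16 logical partitions.
    public static string For(string tenantId, string documentId)
    {
        // Stable hash of the document id decides which bucket it lands in.
        int bucket = (StableHash(documentId) & int.MaxValue) % BucketCount;
        return $"{tenantId}-{bucket:D2}";
    }

    // string.GetHashCode is randomized per process in .NET Core, so use a stable hash.
    static int StableHash(string value)
    {
        unchecked
        {
            int hash = 23;
            foreach (char c in value) hash = hash * 31 + c;
            return hash;
        }
    }
}

A query for one tenant then has to fan out over at most BucketCount logical partitions instead of a single oversized one.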
Is DocumentDB partitioning handling this moving to separate partitions?
Yes, CosmosDB will take care of Physical Partition handling.
Should we (and are we even able to) view data/storage consumption per partitionKey (not partition)?
Yes. In the Azure portal, go to your Azure Cosmos DB account, click on Metrics in the Monitoring section, and then in the right pane click on the Storage tab to see how your data is partitioned across the different physical partitions.

Partition key for DocumentDB

I have a question about DocumentDB partition key choice.
I have data with UserId, DeviceId and WhateverId. The UserId parameter will always be in queries, so I have chosen UserId as the partition key. But I have a lot of data for one user (millions of entities), and when I run a query like "SELECT * FROM c WHERE c.DeviceId = #DeviceId" with the partition key specified, it takes a long time (about 6 minutes for about 220,000 returned entities).
Maybe it would be more efficient to choose, for example, DeviceId as the partition key and make queries against a few partitions in parallel
(specifying EnableCrossPartitionQuery = true and MaxDegreeOfParallelism = partition count)?
Or maybe it is a good idea to use a separate collection for every user?
It might help a little, but I don't think a partition for each user will solve your problem, because you essentially have that under the covers already.
You could experiment with the partition key to improve the parallelism, but at best that would give you a 2x to 5x improvement in my experience. Is that enough?
For more dramatic improvements you usually have to resort to selective denormalization and/or caching.
I know this is a bit old, but for the benefit of others coming to this topic...
From your description I assume that the devices are mostly unique to the user. It is often advised to partition on something like UserId, which is good if you have, say, a call centre application with many queries for a given UserId that look up no more than a few hundred entries. In such cases the data can be quickly extracted from a single partition without the overhead of having to collate data across partitions. However, if you have millions of records per user, then partitioning on UserId is perhaps the worst option, as extracting large volumes of data from a single partition will soon exceed the overhead of collation. In such cases you want to distribute user data as evenly as possible over all partitions. Unless each user has 25+ devices with similar usage, DeviceId is probably not a good choice either.
In cases such as yours, I generally find a system generated incrementing key (e.g. Event Id or Transaction Id) to be the best choice.
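A sketch of that idea (v2 .NET SDK; the database/collection names and document shape are invented, and a GUID stands in here for a generated Event Id):

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;

class ReadingWriter
{
    // Assumes the collection's partition key path is /EventId.
    public static Task WriteAsync(DocumentClient client, string deviceId, object payload)
    {
        string eventId = Guid.NewGuid().ToString();

        var document = new
        {
            id = eventId,
            EventId = eventId,   // unique per document, so writes spread evenly across partitions
            DeviceId = deviceId, // still queryable via the index
            Payload = payload
        };

        return client.CreateDocumentAsync(
            UriFactory.CreateDocumentCollectionUri("telemetry", "readings"),
            document);
    }
}

The DeviceId query then becomes a cross-partition query (EnableCrossPartitionQuery = true), which the SDK fans out in parallel as described in the first answer above.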

How to strike a performance balance with documentDB collection for multiple tenants?

Say I have:
My data is stored in a DocumentDB collection for all of my tenants (i.e. multiple tenants).
I configured the collection in such a way that all of my data is distributed uniformly across all partitions.
But partitions are NOT by each tenant. I use some other scheme.
Because of this, data for a particular tenant is distributed across multiple partitions.
Here are my questions:
Is this the right thing to do to maximize performance for both reading and writing data?
What if I want to query for a particular tenant? What are the caveats in writing this query?
Any other things that I need to consider?
I would avoid queries across partitions; they come with quite a cost (basically, multiply the index and parsing costs by the number of partitions, which defaults to 25). It's fairly easy to try out.
I would prefer a solution where one can query on a specific partition, typically partitioning by tenant ID.
Remember that with partitioned collections, there are still limits on each partition (10K RU and 10GB). I have written about it here: http://blog.ulriksen.net/notes-on-documentdb-partitioning/
It depends upon your usage patterns as well as the variation in tenant size.
In general for multi-tenant systems, 99% of all operations are within a single tenant. If you make the tenantID your partition key, then those operations will only touch a single partition. This won't make a single operation any faster (latency) but could provide huge throughput gains when under load by multiple tenants. However, if you only have 5 tenants and 1 of them is 10x bigger than all the others, then using the tenantID as your key will lead to a very unbalanced system.
We use the tenantID as the partition key for our system and it seems to work well. We've talked about what we would do if it became very unbalanced, and one idea is to make the partition key be the tenantID plus a suffix, to split the large tenants up. We haven't had to do that yet, though, so we haven't worked out all of those details to know whether it would actually be possible and performant, but we think it would work.
What you have described is a sensible solution, where you avoid data skews and load-balance across partitions well. Since the query for a particular tenant needs to touch all partitions, please remember to set FeedOptions.EnableCrossPartitionQuery to true (x-ms-documentdb-query-enablecrosspartition in the REST API).
The DocumentDB site also has an excellent article on partitioned collections and tips for choosing a partition key in general: https://azure.microsoft.com/en-us/documentation/articles/documentdb-partition-data/
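To make the two query modes concrete (v2 .NET SDK; all names here are placeholders): a tenant-scoped query can pin a single partition when the data is partitioned by tenant, while the scheme described in the question needs a cross-partition fan-out.

using System;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

class TenantQueries
{
    static readonly Uri CollectionUri =
        UriFactory.CreateDocumentCollectionUri("appdb", "orders");

    // Single-partition query: only possible when the partition key is the tenant id.
    public static IDocumentQuery<dynamic> ForTenant(DocumentClient client, string tenantId)
    {
        return client.CreateDocumentQuery<dynamic>(
                CollectionUri,
                "SELECT * FROM c WHERE c.Status = 'open'",
                new FeedOptions { PartitionKey = new PartitionKey(tenantId) })
            .AsDocumentQuery();
    }

    // Cross-partition query: required when tenant data is spread by some other scheme.
    public static IDocumentQuery<dynamic> AcrossPartitions(DocumentClient client, string tenantId)
    {
        return client.CreateDocumentQuery<dynamic>(
                CollectionUri,
                new SqlQuerySpec(
                    "SELECT * FROM c WHERE c.TenantId = @tenantId",
                    new SqlParameterCollection { new SqlParameter("@tenantId", tenantId) }),
                new FeedOptions { EnableCrossPartitionQuery = true })
            .AsDocumentQuery();
    }
}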
