Is it a problem if I choose my hash key and range key so that the number of unique hash keys is very low (maximum: 1000), while there are many more unique range keys?
Does the ratio between the number of unique hash and range keys affect the performance of retrieval of information?
It should not be a problem to have few hash keys with many range keys for each if:
The number of hash keys is not too low
Your access is randomly spread across the hash keys
You don't need to scale to extreme levels
According to the AWS Developer Guidelines for Working with Tables:
Provisioned throughput is dependent on the primary key selection, and
the workload patterns on individual items. When storing data, DynamoDB
divides a table's items into multiple partitions, and distributes the
data primarily based on the hash key element. The provisioned
throughput associated with a table is also divided evenly among the
partitions, with no sharing of provisioned throughput across
partitions.
Essentially, each hash key resides on a single node (i.e. server). Actually, it is redundantly stored to prevent data loss, but that can be ignored for this discussion. When you provision throughput you are indirectly determining the number of nodes to spread the hash keys across. However, no matter how much throughput you provision, it is limited for a single hash key by what a single node can handle.
To explain my three caveats:
1. The number of hash keys is not too low
You mention a max of 1000 hash keys, but the concern is what the minimum is. If for example there were only 10 hash keys then you would quickly reach the throughput limit for each key and would not actually realize the provisioned throughput.
2. Your access is randomly spread across the hash keys
It doesn't matter how many hash keys you have if there are a small number of keys that are "hot". That is if you are frequently reading or writing to only a small subset of the hash keys then you will reach the throughput limit of the nodes those keys are stored on.
3. You don't need to scale to extreme levels
Even assuming you have 1000 distinct hash keys and your access is randomly spread across them, if you need to scale to extreme levels you will eventually reach a point where each hash key is on a separate node. That is, if you provision enough throughput that each hash key is allocated to a separate node (i.e. you have 1000+ nodes), then any throughput provisioned beyond that level will not be realized because you will reach the limit of each node for each key.
The ratio of range keys to hash keys should have little to no affect on get, scan and query performance.
It is my understanding that the range keys for each hash key are efficiently stored in some kind of index that will scale well. However, remember that all the rows for a given hash key are stored together on the same node, so you can reach a point where there is too much data for a given hash key. The AWS Limits in DynamoDB states:
For a table with local secondary indexes, there is a limit on item
collection sizes: For every distinct hash key value, the total sizes
of all table and index items cannot exceed 10 GB. Depending on your
item sizes, this may constrain the number of range keys per hash
value.
As far as I know, this doesn't matter. The load distribution depends on the "frequency" of access and not on the "possible combinations". If your access is uniformly distributed across the 1000 keys you are taking about, then it is OK - This means the probability of fetching by key1 should me similar to probability of fetching key10 or key100. Internally I guess they would be bucketing your 1000 keys into say 3 groups and each of these groups "might" be served by 3 machines. You need to ensure that your access is nearly uniform so that all 3 machines get uniform load share.
Related
The documentation to the partitionKeyPath for the Cosmos DB only point to large data and scaling. But what is with small data which frequently changed. For example with a container with a TTL of some seconds. Is the frequently creating and removing of logical partitions an overhead?
Should I use a static partition key value in this case for best performance?
Or should I use the /id because this irrelevant if all is in one physical partition?
TLDR: Use as granular LP key as possible. document id will do the job.
There are couple factors which affect performance and results you get from logical partition (LP) selection. When assessing your partitioning strategy you should bear in mind some limitations on the Logical and Physical Partition (PP) sizing.
LP limitation:
Max 20GB documents
PP limitations:
Max 10k RU per one physical partition
Max 50GB documents
Going beyond the PP limits will cause partition split - skewed PP will be replaced and data split equally between two newly provisioned PPs. It has an effect on max RU per PP as max throughput is calculated based on [provisioned throughput]/[number of PPs]
I definitely wouldn't suggest using static LP key. Smaller logical partitions - more maintainable and predictable performance of your container.
Very specific and unique data consumption patterns may benefit from larger LPs but only if you're trying to micro-optimize queries for better performance and majority of queries you will be running will filter data by LP key. Moreover even for this scenario there is a high risk of a major drawback - hot partitions and partition data skew for containers/DBs with more than 50GB in size.
I'm scanning a huge table (> 1B docs) so I'm using a parallel scan (using one segment per worker).
The table has a hash key and a sort key.
Intuitively a segment should contain a set of hash keys (including all their sort keys), so one hash key shouldn't appear in more than one segment, but I haven't found any documentation indicating this.
Does anyone know how does DynamoDB behave in this scenario?
Thanks
This is an interesting question. I thought it would be easy to find a document stating that each segment contains a disjoint range of hash keys, and the same hash key cannot appear in more than one segment - but I too failed to find any such document. I am curious if anyone else can find such a document. In the meantime, I can try to offer additional intuitions on why your conjecture is likely correct - but also might be wrong:
My first intuition would be that you are right:
DynamoDB uses the hash key, also known as a partition key to decide on which of the many storage nodes to store copy of this data. All of the items sharing the same partition key (with different sort key values) are stored together, in sort-key order, so they can be Queryed together in order. DynamoDB uses a hash function on the partition key to decide the placement of each item (hence the name "hash key").
Now, if DynamoDB needs to divide the task of scanning all the data into "segments", the most sensible thing for it to do is to divide the space of hash values (i.e., hash function of the hash keys) to different equal-sized pieces. This division is easy to do (just a numeric division by TotalSegments), it ensures roughly the same amount of items in each segment (assuming there are many different partitions), and it ensures that the scanning of each segment involves a different storage node, so the parallel scan can proceed faster than what a single storage node is capable of.
However, there is one indication that this might not be the entire story.
The DynamoDB documentation claims that
In general, there is no practical limit on the number of distinct sort key values per partition key value.
This means that in theory at least, your entire database, perhaps one petabyte of it, may be in a single partition with billions of different sort keys. Since Amazon's single storage node do have a size limit, it means DynamoDB must (unless the above statement is false) support splitting of a single huge partition into multiple storage nodes. This means that when GetItem is looking for a particular item, DynamoDB needs to know which sort key is on which storage node. It also means that a parallel scan might - possibly - divide this huge partition into pieces, all scanning the same partition but different sort-key ranges in it. I am not sure we can completely rule out this possibility. I am guessing it will never happen when you only have smallish partitions.
Every DynamoDB table has a "hashspace" and data is partitioned as per the hash value of the partition key. When a ParallelScan is intended and the TotalSegments and Segment values are provided, the table's complete hashspace is logically divided into these "Segments" such that TotalSegments cover the complete hash space, without overlapping. It is quite possible some segments here do not actually have any data corresponding to them, since there may not be any data in the hashspace allocated to the segment. This can be observed if the TotalSegments value chosen is very high for instance.
And for each Segment value passed in the Scan request (with TotalSegments value being constant), each Segment would return distinct items without any overlap.
FAQs
Q. Ideal Number for TotalSegments ?
-> You might need to experiment with values, find the sweet spot for your table, and the number of workers you use, until your application achieves its best performance.
Q. One or more segments do not return any records. Why?
-> This is possible if the hash range that is allocated as per the TotalSegments value does not have any items. In this case, the TotalSegments value can be decreased, for better performance.
Q. Scan for a segment failed midway. Can a Scan for that segment alone be retried now ?
-> As long as the TotalSegments value remains the same, a Scan for one of the segments can be re-run, since it would have the same hash range allocated at any given time.
Q. Can I perform a Scan for a single segment, without performing the Scan for other segments as per TotalSegments value?
-> Yes. Multiple Scan operations for different Segments are not linked/do not depend on previous/other Segment Scans.
I have a bunch of documents. Right now only about 100,000. But I could potentially have millions. These documents are each about 15KB each.
Right now the way I'm calculating the partition key is to take the Id field from Sql, which is set to autoincrement by 1, and dividing that number by 1000. I think this is not a good idea.
Sometimes I have to hit the CosmosDB very hard with parallel writes. When I do this, the documents usually have very closely grouped SQL Ids. For example, like this:
12000
12004
12009
12045
12080
12090
12102
As you can see, all of these documents would be written at the same time to the same partition because they would all have a partition key of 12. And from the documentation I've read, this is not good. I should be spreading my writes across partitions.
I'm considering changing this so that the PartitionKey is the Sql Id divided by 10,000 plus the last digit. Assuming that the group of Ids being written at the same time are randomly distributed (which they pretty much are).
So like this:
(12045 / 10000).ToString() + (12045 % 10).ToString()
This means, given my list above, the partition keys would be:
12000: 10
12004: 14
12009: 19
12045: 15
12080: 10
12090: 10
12102: 12
Instead of writing all 7 to a single partition, this will write all 7 to partitions 10, 12, 14, 15, and 19 (5 total). Will this result in faster write times? What are the effects on read time? Am I doing this right?
Also, is it better to have the first part of the key be the Id / 1000 or Id / 1000000? In other words, is it better to have lots of small partitions or should I aim to fill up the 10 GB limit of single partitions?
you should aim at evenly distributing load between your partitions. 10gb is the limit,you shouldn't aim to hit that limit (because that would mean you wont be able to add documents to the partition anymore).
Creating a synthetic partition key is a valid way to distribute your documents evenly between partitions. Its up to you to find\invent a key that would fit your load pattern.
You could simply take the last digit of your Id, thus nicely spreading the documents over exactly 10 partitions.
In regards to your comment on max partitions: the value of the partitionKey is hashed and THAT hash determines the physical partitions. So when your partitionKey has 1.000 possible values, it does not mean you have 1.000 partitions.
I have a DynamoDB table with:
Timestamp (HASH)
Text (String)
I want to be able to get the latest item via a query, but doing so requires that I sort by Timestamp rather than partition by it. I was considering doing this instead:
Partition (HASH, hard-coded as whatever)
Timestamp (RANGE)
Text (String)
That way I can query and pass a hard-coded partition in.
But is this bad practice?
It depends.
The main thing to consider is that partitions have a finite throughput for both reads and writes. This is independent from the provisioned throughput for the table. Partition throughput is constrained by the hard disk's read and write speeds. Remember that all items with the same hash value will live on the same partition and therefore will be written to the same disk (discounting replication).
So, it depends on your scale. It will work for a small scale, low throughput use case but it won't be able to scale beyond a single disk.
It is usually bad practice to use a single, hard coded value for your hash key. Rather than a hard coded hash key value, you should consider using year_month_day (or some variation) as your hash key for this use case. It's still not great, but it's much better than a single value.
If you do want to use hard coded hash key values, consider using multiple hard coded values to shard your data across partitions.
My application requires a key value store. Following are some of the details regarding key values:
1) Number of keys (data type: string) can either be 256, 1024 or 4096.
2) Data type of values against each key is a list of integers.
3) The list of integers (value) against each key can vary in size
4) The largest size of the value can be around 10,000,000 integers
5) Some keys might contain very small list of integers
The application needs fast access to the list of integers against a specified key . However, this step is not frequent in the working of the application.
I need suggestions for best Key value stores for my case. I need fast retrieval of values against key and value size can be around 512 MB or more.
I checked Redis but it requires the store to be stored in memory. However, in the given scenario I think I should look for disk based key value stores.
LevelDB can fit your use case very well, as you have limited number of keys (given you have enough disk space for your requirements), and might not need a distributed solution.
One thing you need to specify is if (and how) you wish to modify the lists once in the db, as levelDB and many other general key-val stores do not have such atomic transactions.
If you are looking for a distributed db, cassandra is good, as it will also let you insert/remove individual list elements.