Azure Cosmos DB physical partition unbalanced with no hot partition key indicated - azure-cosmosdb

I have this problem:
New physical partitions keep being created, and requests often land on the same partition. I wonder how this is possible, since no hot partition key is indicated?
My only partition key is an id composed as follows:
user id _ year _ travel number
I store the geographical positions of vehicles.
This obviously generates several 429 errors.
I hope I was clear.
EDIT ----------------------------------------
I set up logging and investigated:
Is it possible that a 400 KB JSON document produces a write charge of 2,900 RUs?
The document is this:
I don't understand why the console output reports a count of 4 when I have loaded only one document.

The graph shows you have a hot partition. You need to select a partition key with higher cardinality to spread writes more evenly across partitions, and migrate your data into a new container. This is best done using Change Feed to read from your existing container and write the data into the new container. When Change Feed has caught up, you can change the connection string in your app and restart it so that it writes to the new container, with very little downtime.
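A minimal sketch of that migration, assuming the Python SDK (azure-cosmos) and hypothetical account, database, container, and partition key names; the .NET Change Feed Processor offers a checkpointed, restartable equivalent for production use:

```python
from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
db = client.get_database_client("telemetry")

source = db.get_container_client("positions")        # existing container, low-cardinality key
target = db.create_container_if_not_exists(
    id="positions_v2",
    partition_key=PartitionKey(path="/vehicleId"),   # higher-cardinality key (assumed name)
)

# Read the change feed from the beginning and copy every document across;
# keep this running until the feed has caught up, then switch the app over.
for doc in source.query_items_change_feed(is_start_from_beginning=True):
    target.upsert_item(doc)
```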

Related

Checking millions of IDs in Cosmos DB

Given a potentially large (up to 10^7) set of IDs (together with associated partition keys), I need to verify that there is no document in a Cosmos DB collection with an ID that is in the given set.
There are two obvious ways to achieve this:
Check the existence for each ID/partition key pair individually using parallel point reads, with AllowBulkExecution = true, and abort as soon as a read comes back successfully.
Group the IDs by partition key, and for each group, issue parallel queries of the following form (such that each query is smaller than the maximum query size 256 kB), and abort as soon as any query returns with a non-empty result:
SELECT TOP 1 c.id FROM c
WHERE c.partitionkey = 'partition123' AND ARRAY_CONTAINS(['id1', 'id2', ...], c.id)
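For concreteness, a rough sketch of option 1 under assumed account, database, and container names: AllowBulkExecution is a .NET SDK setting, so asyncio.gather over the async Python SDK stands in for it here, and for 10^7 IDs you would batch the reads rather than launch them all at once.

```python
import asyncio
from azure.cosmos import exceptions
from azure.cosmos.aio import CosmosClient

async def any_id_exists(ids_with_pk):   # iterable of (id, partition_key) pairs
    async with CosmosClient("https://<account>.documents.azure.com:443/", "<key>") as client:
        container = client.get_database_client("<db>").get_container_client("<coll>")

        async def exists(doc_id, pk):
            try:
                await container.read_item(item=doc_id, partition_key=pk)
                return True                                  # point read succeeded: ID exists
            except exceptions.CosmosResourceNotFoundError:
                return False                                 # 404: ID not present

        results = await asyncio.gather(*(exists(i, pk) for i, pk in ids_with_pk))
        return any(results)
```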
Is it possible to say, without trying it out, which one is faster?
Here is a bit more context:
The client is an Azure App Service located in the same region as the Cosmos DB instance.
The Cosmos DB collection contains about 10^7 documents and has a throughput of 4000 RU/s.
The IDs are actually GUID strings of length 36, so the number of IDs per query in Solution 2 would be limited to about 6500 in order to not exceed the maximum query size. In other words, the number of required queries in Solution 2 is about n/6500 where n is the number of IDs in the set.
The number of different partition keys is small (< 10).
The average document size is about 500 B.
Default indexing policy.
A bit more background: The check is part of an import/initial load operation. More precisely, it is part of the validation of an import set so an error can be returned before the write operations begin. So the expected (non-error) case is that none of the IDs in the set already exists. The import operation is not expected to be executed frequently (though certainly more than once), so managing auxiliary processes/data just to optimize for this check would not be a good tradeoff.
Not quite sure I understand the need for this but... queries will cost more than a point-read, in terms of RU cost (and given your doc size, those point reads are going to cost 1 RU).
I don't see how you would be able to abandon the remaining parallel point reads once one of them finds a particular ID within a given partition. Also remember that an ID is only unique within a partition, so it's possible to have the same ID in multiple partitions.
It is likely more efficient to just attempt to write a given ID to a given partition, and see if it succeeds (it'll fail if there's an ID collision).
Lastly: For all practical purposes, you won't have a duplicate ID if you're generating a new GUID for every document you're saving.
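As a sketch of the "just attempt the write" suggestion above with the Python SDK: create_item fails with a 409 when the (id, partition key) pair already exists, so the existence check and the write collapse into one call. Names are assumptions.

```python
from azure.cosmos import exceptions

def try_insert(container, doc):
    """Insert doc; return False if an item with the same id already exists in its partition.

    `container` is a ContainerProxy (obtained as in the earlier snippet); `doc` must
    include the partition key property.
    """
    try:
        container.create_item(body=doc)
        return True                                   # no collision, document written
    except exceptions.CosmosResourceExistsError:      # HTTP 409 Conflict
        return False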

DynamoDB partition key design with On-Demand

How much do I need to care about partition key design with DynamoDB On-Demand and Adaptive Capacity? What would happen if I tried to write to single partition key 40,000 times in one second? Does the per-partition write request unit cap of 1,000 still exist such that it would throttle those 40,000 requests, or is there some magic that boosts that single partition temporarily up to the table limit?
It's not an arbitrary question, as I'd like to use incrementing integers for all our entities in DynamoDB via the method suggested within this SO post, but that would require maintaining the latest id for an entity on a single partition key. Every new item created would get its ID by writing to that partition key and inspecting the new value returned in the response. If I were writing something like a chat app and using this method to get the new ID for each message, would my app only be able to create 1,000 new messages a second?
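To make the concern concrete, a minimal sketch of that counter pattern with boto3 (table and attribute names are assumptions): every new entity performs one UpdateItem against the same counter item, so that single item's partition bounds your creation rate.

```python
import boto3

counters = boto3.resource("dynamodb").Table("Counters")   # hypothetical table

def next_id(entity):                                       # e.g. entity="message"
    resp = counters.update_item(
        Key={"pk": f"counter#{entity}"},                   # one item per entity type
        UpdateExpression="ADD seq :one",                   # atomic increment
        ExpressionAttributeValues={":one": 1},
        ReturnValues="UPDATED_NEW",                        # read back the new value
    )
    return int(resp["Attributes"]["seq"])
```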

DynamoDB Streams with Lambda, how to process the records in order (by logical groups)?

I want to use DynamoDB Streams + AWS Lambda to process chat messages. Messages regarding the same conversation user_idX:user_idY (a room) must be processed in order. Global ordering is not important.
Assuming that I feed DynamoDB in the correct order (room:msg1, room:msg2, etc), how to guarantee that the Stream will feed AWS Lambda sequentially, with guaranteed ordering of the processing of related messages (room) across a single stream?
For example, considering I have 2 shards, how do I make sure a logical group goes to the same shard?
I must accomplish this:
Shard 1: 12:12:msg3 12:12:msg2 12:12:msg1 ==> consumer
Shard 2: 13:24:msg2 51:91:msg3 13:24:msg1 51:92:msg2 51:92:msg1 ==> consumer
And not this (messages are respecting the order that I saved in the database, but they are being placed in different shards, thus incorrectly processing different sequences for the same room in parallel):
Shard 1: 13:24:msg2 51:92:msg2 12:12:msg2 51:92:msg2 12:12:msg1 ==> consumer
Shard 2: 51:91:msg3 12:12:msg3 13:24:msg1 51:92:msg1 ==> consumer
This official post mentions this, but I couldn't find anywhere in the docs how to implement it:
The relative ordering of a sequence of changes made to a single
primary key will be preserved within a shard. Further, a given key
will be present in at most one of a set of sibling shards that are
active at a given point in time. As a result, your code can simply
process the stream records within a shard in order to accurately track
changes to an item.
Questions
1) How to set a partition key in DynamoDB Streams?
2) How to create Stream shards that guarantee partition key consistent delivery?
3) Is this really possible after all? Since the official article mentions: a given key will be present in at most one of a set of sibling shards that are active at a given point in time so it seems that msg1 may go to shard 1 and then msg2 to shard 2, as my example above?
EDITED: In this question, I found this:
The amount of shards that your stream has, is based on the amount of
partitions the table has. So if you have a DDB table with 4
partitions, then your stream will have 4 shards. Each shard
corresponds to a specific partition, so given that all items with the
same partition key should be present in the same partition, it also
means that those items will be present in the same shard.
Does this mean that I can achieve what I need automatically? "All items with the same partition key will be present in the same shard." Does Lambda respect this?
EDIT 2: From the FAQ:
The ordering of records across different shards is not guaranteed, and
processing of each shard happens in parallel.
I don't care about global ordering, just the logical ordering per the example. Still, it's not clear from that FAQ answer whether the shards group records logically.
In-order processing for updates on the same key will happen automatically. As described in this presentation, one Lambda function per active shard is run. Because all the updates for a particular partition/sort key appear in exactly one shard lineage, they are processed in order.
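A minimal sketch of such a consumer, assuming a NEW_IMAGE stream whose items carry hypothetical room and msg attributes: within one invocation the records already arrive in stream order, and all records for a given partition key stay in the same shard lineage.

```python
def handler(event, context):
    """Lambda consumer for a DynamoDB stream; records arrive in per-shard order."""
    for record in event["Records"]:
        if record["eventName"] != "INSERT":
            continue                                   # only new messages matter here
        image = record["dynamodb"]["NewImage"]         # DynamoDB-typed JSON
        room = image["room"]["S"]                      # e.g. "12:12" (assumed attribute)
        message = image["msg"]["S"]
        process_room_message(room, message)            # your per-room logic (hypothetical)
```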

DynamoDB table structure

We are looking to use AWS DynamoDB for storing application logs. Logs from multiple components in our system would be stored here. We are expecting a lot of writes and only minimal number of reads.
The client that we use for writing into DynamoDB generates a UUID for the partition key, but using this makes it difficult to actually search.
Most prominent search cases are,
Search based on Component / Date / Date time
Search based on JobId / File name
Search based on Log Level
From what I have read so far, using a UUID for the partition key is not suitable for our case. I am currently thinking about using either Component / JobId for our partition key and an ISO 8601 timestamp as our sort key. Does this sound like a reasonable / widely used setup for such a use case?
If not, kindly suggest alternatives that can be used.
Using a UUID as the partition key will distribute the data evenly amongst internal partitions, so you will be able to utilize all of the provisioned capacity.
Using a sortable (ISO format) timestamp as the range/sort key will store the data in order, so it will be possible to retrieve it in order.
However for retrieving logs by anything other than timestamp, you may have to create indexes (GSI) which are charged separately.
Hope your logs are precious enough to store in DynamoDB instead of CloudWatch ;)
In general DynamoDB seems like a bad solution for storing logs:
It is more expensive than CloudWatch
It has poor querying capabilities, unless you start utilising global secondary indexes which will double or triple your expenses
Unless you use a random UUID for the hash key, you risk creating hot partitions/keys in your db (for example, using a component ID as a primary or global secondary key might result in throttling if some component writes much more often than others)
But assuming you already know these drawbacks and you still want to use DynamoDB, here is what I would recommend:
Use JobId or Component name as hash key (one as primary, one as GSI)
Use timestamp as a sort key
If you need to search by log level often, then you can create another local sort key, or you can combine level and timestamp into single sort key. If you only care about searching for ERROR level logs most of the time, then it might be better to create a sparse GSI for that.
Create a new table each day (let's call it the "hot table"), and only store that day's logs in that table. This table will have high write throughput. Once the day finishes, significantly reduce its write throughput (maybe to 0) and only leave some read capacity. This way you will reduce the risk of running into the 10 GB per-hash-key limit that DynamoDB has.
This approach also has an advantage in terms of log retention. It is very easy and cheap to remove logs older than X days this way. By keeping old table capacity very low you will also avoid very high costs. For more complicated ad-hoc analysis, use EMR.
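A minimal sketch of the table shape recommended above, using boto3; the table, index, and attribute names are assumptions, and the throughput numbers are placeholders for the current day's "hot table".

```python
import boto3

dynamodb = boto3.client("dynamodb")

dynamodb.create_table(
    TableName="AppLogs-2024-01-01",                             # one "hot table" per day (example name)
    AttributeDefinitions=[
        {"AttributeName": "JobId", "AttributeType": "S"},
        {"AttributeName": "Timestamp", "AttributeType": "S"},   # ISO 8601 sort key
        {"AttributeName": "Component", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "JobId", "KeyType": "HASH"},
        {"AttributeName": "Timestamp", "KeyType": "RANGE"},
    ],
    GlobalSecondaryIndexes=[{
        "IndexName": "Component-Timestamp-index",               # search by component + time range
        "KeySchema": [
            {"AttributeName": "Component", "KeyType": "HASH"},
            {"AttributeName": "Timestamp", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 100},
    }],
    ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 100},
)
```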

Model daily game ranking in DynamoDB

I have a question. I'm pretty new to DynamoDB but have been working on large-scale aggregation on SQL databases for a long time.
Suppose you have a table called GamePoints (PlayerId, GameId, Points) and would like to create a ranking table Rankings (PlayerId, Points) sorted by points.
This table needs to be updated on an hourly basis but keeping the previous version of its contents is not required. Just the current Rankings.
The query will always be: give me the ranking table (with paging).
The GamePoints table will get very very large over time.
Questions:
Is this the best-practice schema for DynamoDB?
How would you do this kind of aggregation?
Thanks
You can enable a DynamoDB Stream on the GamePoints table. You can read stream records from the stream to maintain materialized views, including aggregations, like the Rankings table. Set StreamViewType=NEW_IMAGE on your GamePoints table, and set up a Lambda function to consume stream records from your stream and update the points per player using atomic counters (UpdateItem, HK=player_id, UpdateExpression="ADD Points :stream_record_points", ExpressionAttributeValues={":stream_record_points": [put the value from the stream record here]}). As the hash key of the Rankings table would still be the player ID, you could do full table scans of the Rankings table every hour to get the n highest players, or get all the players and sort.
However, considering the size of fields (player_id and number of points probably do not take more than 100 bytes), an in memory cache updated by a Lambda function could equally well be used to track the descending order list of players and their total number of points in real time. Finally, if your application requires stateful processing of Stream records, you could use the Kinesis Client Library combined with the DynamoDB Streams Kinesis Adapter on your application server to achieve the same effect as subscribing a Lambda function to the Stream of the GamePoints table.
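A minimal sketch of the Stream-to-Lambda flow described above, using boto3; it assumes the Rankings table is keyed on PlayerId and that the GamePoints stream uses StreamViewType=NEW_IMAGE.

```python
import boto3

rankings = boto3.resource("dynamodb").Table("Rankings")

def handler(event, context):
    """Consume GamePoints stream records and maintain per-player point totals."""
    for record in event["Records"]:
        if record["eventName"] != "INSERT":
            continue
        image = record["dynamodb"]["NewImage"]              # requires NEW_IMAGE view type
        rankings.update_item(
            Key={"PlayerId": image["PlayerId"]["S"]},
            UpdateExpression="ADD Points :p",                # atomic counter
            ExpressionAttributeValues={":p": int(image["Points"]["N"])},
        )
```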
An easy way to do this is by using DynamoDB's hash key and sort key. For example, the hash key is the GameId and the sort key is the Score. You then query the table with a descending sort and a limit to get the real-time top players in O(1).
To get the rank of a given player, you can use the same technique as above: you get the top 1000 scores in O(1) and you then use BinarySearch to find the player's rank amongst the top 1000 scores in O(log n) on your application server.
If the user is not within the top 1000, you can report that this user has a rank of 1000+. You can also obviously change 1000 to a greater number (100,000 for example).
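A minimal sketch of that top-N query with boto3, assuming a table (or GSI) keyed on GameId with Score as the sort key; the table and game names are placeholders:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("Rankings")        # hypothetical table name

top_1000 = table.query(
    KeyConditionExpression=Key("GameId").eq("game-42"),     # one game's leaderboard
    ScanIndexForward=False,                                  # descending by Score
    Limit=1000,
)["Items"]
```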
Hope this helps.
Henri
PutItem can be helpful to implement the persistence logic for your use case:
PutItem Creates a new item, or replaces an old item with a new item.
If an item that has the same primary key as the new item already
exists in the specified table, the new item completely replaces the
existing item. You can perform a conditional put operation (add a new
item if one with the specified primary key doesn't exist), or replace
an existing item if it has certain attribute values. Source:
http://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_PutItem.html
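For example, a hedged sketch of such a conditional put with boto3 (table and key names are assumptions): the write only succeeds if no item with the same key exists yet.

```python
import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("Rankings")

def put_if_absent(item):
    try:
        table.put_item(
            Item=item,
            ConditionExpression="attribute_not_exists(PlayerId)",   # add, don't replace
        )
        return True
    except ClientError as err:
        if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False                                             # key already present
        raise
```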
In terms of querying the data, if you know for sure that you are going to be reading the entire Ranking table, I would suggest doing it through several read operations with minimum acceptable page size so you can make the best use of your provisioned throughput. See the guidelines below for more details:
Instead of using a large Scan operation, you can use the following
techniques to minimize the impact of a scan on a table's provisioned
throughput.
Reduce Page Size
Because a Scan operation reads an entire page (by default, 1 MB), you
can reduce the impact of the scan operation by setting a smaller page
size. The Scan operation provides a Limit parameter that you can use
to set the page size for your request. Each Scan or Query request that
has a smaller page size uses fewer read operations and creates a
"pause" between each request. For example, if each item is 4 KB and
you set the page size to 40 items, then a Query request would consume
only 40 strongly consistent read operations or 20 eventually
consistent read operations. A larger number of smaller Scan or Query
operations would allow your other critical requests to succeed without
throttling.
Isolate Scan Operations
DynamoDB is designed for easy scalability. As a result, an application
can create tables for distinct purposes, possibly even duplicating
content across several tables. You want to perform scans on a table
that is not taking "mission-critical" traffic. Some applications
handle this load by rotating traffic hourly between two tables – one
for critical traffic, and one for bookkeeping. Other applications can
do this by performing every write on two tables: a "mission-critical"
table, and a "shadow" table.
SOURCE: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScanGuidelines.html#QueryAndScanGuidelines.BurstsOfActivity
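A minimal sketch of such a reduced-page-size scan with boto3; the page size and pause are assumptions you would tune against your provisioned throughput.

```python
import time
import boto3

table = boto3.resource("dynamodb").Table("Rankings")

def scan_all(page_size=40, pause=0.1):
    """Scan the whole table in small pages, pausing between requests."""
    kwargs = {"Limit": page_size}
    while True:
        page = table.scan(**kwargs)
        yield from page["Items"]
        if "LastEvaluatedKey" not in page:
            break
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
        time.sleep(pause)                                     # spread the read load out
```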
You can also segment your tables by GameId (e.g. Ranking_GameId) to distribute the data more evenly and give you more granularity in terms of provisioned throughput.
