I have a DynamoDB transaction that writes more than one record at a time to a single DynamoDB table using transactWrite. For example, in a single transaction, I can append records A, B, and C. Note that in my case the operations are always append-only (inserts only).
The records are then passed to a DynamoDB stream and on to a Lambda for processing. However, the Lambda sometimes receives the events out of order. I think I understand that behavior: from DynamoDB's point of view, all 3 events were written at the same timestamp, so there is no ordering. But if these events are part of the same batch, I can always reorder them in the Lambda before processing.
However, that is where the problem is. Even though these records are written in a single transaction, they don't always appear together in the same batch in the Lambda. Sometimes I receive C as the only event, and then A and B arrive in a later batch. I think that behavior is somewhat reasonable. Is there a way to guarantee that I receive all the records written in a transaction in one single batch?
Your items may be written in a single transaction, but each item could be in a separate stream shard. Streams have shards, so even if all the items are written at the same time, their stream records may land on different shards. Ordering is by time and item, not overall keyspace and time: "For each item that is modified in a DynamoDB table, the stream records appear in the same sequence as the actual modifications to the item." It is possible to ensure ordered updates to each item, but if you need consistency across all updates in the keyspace, this would need to be designed on the reader side.
All that said, I wonder if there is an opportunity to denormalize these three items into one item on the base table and skip using TransactWriteItems altogether.
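A reader-side design of the kind mentioned above might look roughly like the sketch below. It assumes the application tags every item in a transaction with three hypothetical attributes that are not part of the original question: `txn_id` (shared identifier), `txn_size` (number of items in the transaction) and `txn_seq` (position within the transaction). The buffer is in-memory only for illustration; since records from different shards can reach different Lambda invocations, a real version would probably need a shared durable store.

```python
from collections import defaultdict


class TransactionAssembler:
    """Buffers stream records until every item of a transaction has arrived.

    Assumes the hypothetical attributes `txn_id`, `txn_size` and `txn_seq`
    are present in each record's NewImage. Illustration only: the buffer
    lives in memory and redelivered records are not deduplicated.
    """

    def __init__(self):
        self._pending = defaultdict(list)

    def add(self, record):
        """Add one record; return the ordered group once complete, else None."""
        image = record["dynamodb"]["NewImage"]
        txn_id = image["txn_id"]["S"]
        expected = int(image["txn_size"]["N"])
        self._pending[txn_id].append(record)
        if len(self._pending[txn_id]) < expected:
            return None
        group = self._pending.pop(txn_id)
        # Restore the order in which the application wrote the items.
        group.sort(key=lambda r: int(r["dynamodb"]["NewImage"]["txn_seq"]["N"]))
        return group
```

Each call to `add` returns `None` until the last missing record of a transaction shows up, at which point the whole group can be processed together.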
Let us say we have a situation where, instead of getting the total count of records in a table, we need the count of records with a particular status.
We know DynamoDB is schemaless and still has to count each record one by one to get the total count.
Given that, how can we meet this need using DynamoDB queries?
While normally "Query" or "Scan" requests return all the matching items, you can pass the Select=COUNT parameter and ask to retrieve only the number of matching items, instead of the actual items. But before you go doing that, there are a few things you should know:
DynamoDB will still be reading - and you will still be paying for - all the data, even if just for being counted. Doing a "Scan" with a filter is in almost all cases out of the question, because it will read the entire data set every time. With a "Query" you can ask to read just one partition, or a contiguous range of sort-keys in one partition, which in some cases may be reasonable enough (but please think if it is, in your use case).
Even if you're not actually reading the data, and just counting, DynamoDB still does Scan and Query with "paging", i.e., your read request will read just 1MB of data from disk, return the partial count, and ask you to submit another request to resume the scan. Your DynamoDB library probably has a way to automate this resumption, so for example it can run thousands or whatever number of queries are needed until it finally finishes the scan and calculates the total sum.
In some cases, it may make sense to maintain a counter in addition to the data. Writes will be more expensive (e.g., each write adds data and increments the counter), but reads that need this counter will be hugely cheaper - so it all depends on how much of each your workload needs.
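The paging described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `query_fn` is assumed to behave like the `query` (or `scan`) method of a boto3 DynamoDB client, i.e. it accepts keyword arguments and returns a dict containing `Count` and, when more pages remain, `LastEvaluatedKey`.

```python
def count_matching(query_fn, **params):
    """Sum the partial counts of a paged Query or Scan that uses Select=COUNT.

    `query_fn` is assumed to behave like a boto3 DynamoDB client's `query`.
    `params` are passed through unchanged (table name, key condition, ...).
    """
    total = 0
    request = dict(params, Select="COUNT")
    while True:
        page = query_fn(**request)
        total += page["Count"]
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return total
        # Resume where the previous page stopped.
        request["ExclusiveStartKey"] = last_key
```

With boto3 this could be called as `count_matching(boto3.client("dynamodb").query, TableName=..., KeyConditionExpression=..., ExpressionAttributeValues=...)`. Every page is still read and billed; the loop only saves transferring the items themselves.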
Assume we're using a trigger on a DynamoDB table, and that the trigger runs a Lambda function whose job is to update the corresponding entry in CloudSearch (to keep DynamoDB and CloudSearch in sync).
I'm not so clear on how Lambda would always keep the data in sync with the data in DynamoDB. Consider the following flow:
1) Application updates a DynamoDB table's Record A (say to A1)
2) Very shortly after that, the application updates the same table's same record A (to A2)
3) Trigger for 1 causes Lambda of 1 to start executing
4) Trigger for 2 causes Lambda of 2 to start executing
5) Step 4 completes first, so CloudSearch sees A2
6) Now Step 3 completes, so CloudSearch sees A1
Lambda invocations are not guaranteed to start ONLY after the previous invocation is complete (correct me if I am wrong, and please provide a link)
As we can see, the thing goes out of sync.
The closest approach I can think of that would work is to use AWS Kinesis Streams, but only with a single shard (which limits ingestion to 1 MB per second). If that restriction is acceptable, the consumer application can be written so that records are processed sequentially, i.e., the next record is processed only after the previous record has been put into CloudSearch. Assuming that is true, how can the sync be kept correct if so much data is ingested into DynamoDB that more than one shard is needed in Kinesis?
You may achieve that using DynamoDB Streams:
DynamoDB Streams
"A DynamoDB stream is an ordered flow of information about changes to items in an Amazon DynamoDB table."
DynamoDB Streams guarantees the following:
Each stream record appears exactly once in the stream.
For each item that is modified in a DynamoDB table, the stream records appear in the same sequence as the actual modifications to the item.
Another cool thing about DynamoDB Streams: if your Lambda fails to handle a batch (for example, any error when indexing in CloudSearch), the event will keep being retried, and the later records in that shard will wait until the invocation succeeds.
We use Streams to keep our Elasticsearch indexes in sync with our DynamoDB tables.
AWS Lambda FAQ link
Q: How does AWS Lambda process data from Amazon Kinesis streams and Amazon DynamoDB Streams?
The Amazon Kinesis and DynamoDB Streams records sent to your AWS Lambda function are strictly serialized, per shard. This means that if you put two records in the same shard, Lambda guarantees that your Lambda function will be successfully invoked with the first record before it is invoked with the second record. If the invocation for one record times out, is throttled, or encounters any other error, Lambda will retry until it succeeds (or the record reaches its 24-hour expiration) before moving on to the next record. The ordering of records across different shards is not guaranteed, and processing of each shard happens in parallel.
So that means Lambda picks up the records in one shard one by one, in the order they appear in the shard, and does not process a new record until the previous record has been processed!
However, the other problem that remains is: what if the entries for the same record are spread across different shards? Thankfully, DynamoDB Streams ensures that a given primary key always resides in one particular shard. (Essentially, I think, the primary key is what is hashed to pick the shard.) AWS Slide Link. See more from the AWS Blog below:
The relative ordering of a sequence of changes made to a single primary key will be preserved within a shard. Further, a given key will be present in at most one of a set of sibling shards that are active at a given point in time. As a result, your code can simply process the stream records within a shard in order to accurately track changes to an item.
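Given those guarantees, a handler can simply apply each record of a batch in the order received and let any error propagate, so that Lambda retries the batch rather than skipping ahead. The sketch below is only an illustration under stated assumptions: `index` is a hypothetical dict standing in for the search index (a real handler would call the search service instead), and the stream is assumed to include new images (NEW_IMAGE or NEW_AND_OLD_IMAGES).

```python
import json


def apply_batch(records, index):
    """Apply DynamoDB stream records to `index` in the order received.

    `index` is a hypothetical dict standing in for the search index.
    Exceptions are deliberately not caught: letting one propagate makes
    Lambda retry the batch instead of moving on to later records.
    """
    for record in records:
        change = record["dynamodb"]
        # Serialise the primary key so it can be used as a dict key.
        doc_id = json.dumps(change["Keys"], sort_keys=True)
        if record["eventName"] == "REMOVE":
            index.pop(doc_id, None)
        else:  # INSERT or MODIFY
            index[doc_id] = change["NewImage"]
    return index


def handler(event, context):
    # Hypothetical Lambda entry point using a throwaway in-memory index.
    return apply_batch(event["Records"], {})
```

Because A1 and A2 for the same key arrive in the same shard in write order, the last write applied here is A2, which is the scenario the question was worried about.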
I want to use DynamoDB Streams + AWS Lambda to process chat messages. Messages regarding the same conversation user_idX:user_idY (a room) must be processed in order. Global ordering is not important.
Assuming that I feed DynamoDB in the correct order (room:msg1, room:msg2, etc), how to guarantee that the Stream will feed AWS Lambda sequentially, with guaranteed ordering of the processing of related messages (room) across a single stream?
Example, considering I have 2 shards, how to make sure the logical group goes to the same shard?
I must accomplish this:
Shard 1: 12:12:msg3 12:12:msg2 12:12:msg1 ==> consumer
Shard 2: 13:24:msg2 51:91:msg3 13:24:msg1 51:92:msg2 51:92:msg1 ==> consumer
And not this (messages are respecting the order that I saved in the database, but they are being placed in different shards, thus incorrectly processing different sequences for the same room in parallel):
Shard 1: 13:24:msg2 51:92:msg2 12:12:msg2 51:92:msg2 12:12:msg1 ==> consumer
Shard 2: 51:91:msg3 12:12:msg3 13:24:msg1 51:92:msg1 ==> consumer
This official post mentions this, but I couldn't find anywhere in the docs how to implement it:
The relative ordering of a sequence of changes made to a single
primary key will be preserved within a shard. Further, a given key
will be present in at most one of a set of sibling shards that are
active at a given point in time. As a result, your code can simply
process the stream records within a shard in order to accurately track
changes to an item.
Questions
1) How to set a partition key in DynamoDB Streams?
2) How to create Stream shards that guarantee partition key consistent delivery?
3) Is this really possible after all? Since the official article mentions: a given key will be present in at most one of a set of sibling shards that are active at a given point in time so it seems that msg1 may go to shard 1 and then msg2 to shard 2, as my example above?
EDITED: In this question, I found this:
The amount of shards that your stream has, is based on the amount of
partitions the table has. So if you have a DDB table with 4
partitions, then your stream will have 4 shards. Each shard
corresponds to a specific partition, so given that all items with the
same partition key should be present in the same partition, it also
means that those items will be present in the same shard.
Does this mean that I can achieve what I need automatically? "All items with the same partition will be present in the same shard". Does Lambda respect this?
EDIT 2: From the FAQ:
The ordering of records across different shards is not guaranteed, and
processing of each shard happens in parallel.
I don't care about global ordering, just logical one as per example. Still, not clear if the shards group logically with this answer from the FAQ.
In-order processing for updates on the same key will happen automatically. As described in this presentation, one Lambda function per active shard is run. Because all the updates for a particular partition/sort key appear in exactly one shard lineage, they are processed in order.
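To make that concrete, here is a small sketch of how one batch from a shard could be split per room while keeping the arrival order inside each room. It assumes, hypothetically, that the table's partition key is a string attribute named `room`; nothing in the question fixes that name.

```python
def messages_by_room(records, room_attribute="room"):
    """Group one batch of stream records by room, keeping arrival order.

    Assumes the partition key is a string attribute whose name is given
    by `room_attribute` (hypothetical). A batch may interleave several
    rooms, but records of one room stay in the order they were written.
    """
    rooms = {}
    for record in records:
        room = record["dynamodb"]["Keys"][room_attribute]["S"]
        rooms.setdefault(room, []).append(record)
    return rooms
```

Each list in the result can then be processed front to back; different rooms are independent of each other, which matches the "no global ordering needed" requirement.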
I have a question. I'm pretty new to DynamoDB but have been working on large-scale aggregation on SQL databases for a long time.
Suppose you have a table called GamePoints (PlayerId, GameId, Points) and would like to create a ranking table Rankings (PlayerId, Points) sorted by points.
This table needs to be updated on an hourly basis but keeping the previous version of its contents is not required. Just the current Rankings.
The query will always be "give me the ranking table" (with paging).
The GamePoints table will get very very large over time.
Questions:
Is this the best practice schema for DynamoDB ?
How would you do this kind of aggregation?
Thanks
You can enable a DynamoDB Stream on the GamePoints table. You can read stream records from the stream to maintain materialized views, including aggregations, like the Rankings table. Set StreamViewType=NEW_IMAGE on your GamePoints table, and set up a Lambda function to consume stream records from your stream and update the points per player using atomic counters (UpdateItem, HK=player_id, UpdateExpression="ADD Points :stream_record_points", ExpressionAttributeValues={":stream_record_points":[put the value from stream record here.]}). Note that value placeholders start with ":"; the "#" prefix is reserved for attribute name placeholders. As the hash key of the Rankings table would still be the player ID, you could do full table scans of the Rankings table every hour to get the top n players, or all the players, and sort.
However, considering the size of fields (player_id and number of points probably do not take more than 100 bytes), an in memory cache updated by a Lambda function could equally well be used to track the descending order list of players and their total number of points in real time. Finally, if your application requires stateful processing of Stream records, you could use the Kinesis Client Library combined with the DynamoDB Streams Kinesis Adapter on your application server to achieve the same effect as subscribing a Lambda function to the Stream of the GamePoints table.
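The stream-to-Rankings update described above could be sketched as follows. This is a hedged illustration: it assumes the stream uses NEW_IMAGE, that GamePoints items carry `PlayerId` (string) and `Points` (number) attributes as in the question's schema, and that GamePoints is insert-only (a MODIFY event would otherwise add the full new value rather than the difference). The function only builds the arguments for a boto3 client `update_item` call; it does not call AWS itself.

```python
def rankings_update(record, table_name="Rankings"):
    """Translate one GamePoints stream record into UpdateItem arguments.

    Returns None for records without a new image (for example REMOVE
    events). Attribute names are taken from the question's schema; the
    table name default is an assumption.
    """
    image = record["dynamodb"].get("NewImage")
    if image is None:
        return None
    return {
        "TableName": table_name,
        "Key": {"PlayerId": image["PlayerId"]},
        # "#" marks an attribute name placeholder, ":" a value placeholder.
        "UpdateExpression": "ADD #points :delta",
        "ExpressionAttributeNames": {"#points": "Points"},
        "ExpressionAttributeValues": {":delta": image["Points"]},
    }
```

In a Lambda this could be used as `boto3.client("dynamodb").update_item(**rankings_update(record))` for every record that yields a non-None result. Retried batches would add the same points twice, so a production version would need some form of deduplication.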
An easy way to do this is by using DynamoDB's hash key and sort key. For example, the hash key is the GameId and the sort key is the Score. You then query the table in descending sort order with a limit to get the real-time top players in O(1).
To get the rank of a given player, you can use the same technique as above: you get the top 1000 scores in O(1) and you then use binary search to find the player's rank amongst the top 1000 scores in O(log n) on your application server.
If the user is not within the top 1000, you can report that this user has a rank of 1000+. You can also obviously change 1000 to a greater number (100,000 for example).
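The application-side rank lookup could be sketched like this. It assumes the top scores have already been fetched in descending order (for example by a Query with a descending sort and a limit); the function name and return convention are illustrative choices, not part of the answer above.

```python
def rank_in_top(top_scores_desc, score):
    """Return the 1-based rank of `score` within a descending score list.

    Uses binary search to find the first position whose score is not
    greater than `score`. Returns None when `score` is lower than every
    listed score, i.e. the rank lies beyond the fetched top list.
    """
    lo, hi = 0, len(top_scores_desc)
    while lo < hi:
        mid = (lo + hi) // 2
        if top_scores_desc[mid] > score:
            lo = mid + 1
        else:
            hi = mid
    return lo + 1 if lo < len(top_scores_desc) else None
```

A `None` result corresponds to the "1000+" case: the caller knows only that the player is outside the fetched list. Players with equal scores share the best rank for that score.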
Hope this helps.
Henri
PutItem can help you implement the persistence logic for your use case:
PutItem Creates a new item, or replaces an old item with a new item.
If an item that has the same primary key as the new item already
exists in the specified table, the new item completely replaces the
existing item. You can perform a conditional put operation (add a new
item if one with the specified primary key doesn't exist), or replace
an existing item if it has certain attribute values. Source:
http://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_PutItem.html
In terms of querying the data, if you know for sure that you are going to be reading the entire Ranking table, I would suggest doing it through several read operations with minimum acceptable page size so you can make the best use of your provisioned throughput. See the guidelines below for more details:
Instead of using a large Scan operation, you can use the following
techniques to minimize the impact of a scan on a table's provisioned
throughput.
Reduce Page Size
Because a Scan operation reads an entire page (by default, 1 MB), you
can reduce the impact of the scan operation by setting a smaller page
size. The Scan operation provides a Limit parameter that you can use
to set the page size for your request. Each Scan or Query request that
has a smaller page size uses fewer read operations and creates a
"pause" between each request. For example, if each item is 4 KB and
you set the page size to 40 items, then a Query request would consume
only 40 strongly consistent read operations or 20 eventually
consistent read operations. A larger number of smaller Scan or Query
operations would allow your other critical requests to succeed without
throttling.
Isolate Scan Operations
DynamoDB is designed for easy scalability. As a result, an application
can create tables for distinct purposes, possibly even duplicating
content across several tables. You want to perform scans on a table
that is not taking "mission-critical" traffic. Some applications
handle this load by rotating traffic hourly between two tables – one
for critical traffic, and one for bookkeeping. Other applications can
do this by performing every write on two tables: a "mission-critical"
table, and a "shadow" table.
SOURCE: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScanGuidelines.html#QueryAndScanGuidelines.BurstsOfActivity
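The "reduce page size" guideline quoted above might be applied along these lines. It is a sketch under assumptions: `scan_fn` is expected to behave like the `scan` method of a boto3 DynamoDB client, and the pause between pages is an illustrative knob rather than a value recommended by the documentation.

```python
import time


def scan_in_small_pages(scan_fn, page_size, pause_seconds=0.0, **params):
    """Yield items from a Scan issued in pages of at most `page_size` items.

    `scan_fn` is assumed to behave like a boto3 DynamoDB client's `scan`.
    The optional pause between pages leaves throughput for other traffic.
    """
    request = dict(params, Limit=page_size)
    while True:
        page = scan_fn(**request)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return
        # Continue from where the previous page ended.
        request["ExclusiveStartKey"] = last_key
        time.sleep(pause_seconds)
```

For example, `for item in scan_in_small_pages(boto3.client("dynamodb").scan, 40, pause_seconds=0.2, TableName="Rankings"): ...` would read the table 40 items at a time; the table name here follows the question's schema.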
You can also segment your tables by GameId (e.g. Ranking_GameId) to distribute the data more evenly and give you more granularity in terms of provisioned throughput.
We are building a conversation system that will support messages between 2 users (and eventually between 3+ users). Each conversation will have a collection of users who can participate/view the conversation as well as a collection of messages. The UI will display the most recent 10 messages in a specific conversation with the ability to "page" (progressive scrolling?) the messages to view messages further back in time.
The plan is to store conversations and the participants in MSSQL and then only store the messages (which represents the data that has the potential to grow very large) in DynamoDB. The message table would use the conversation ID as the hash key and the message CreateDate as the range key. The conversation ID could be anything at this point (integer, GUID, etc) to ensure an even message distribution across the partitions.
In order to avoid hot partitions one suggestion is to create separate tables for time series data because typically only the most recent data will be accessed. Would this lead to issues when we need to pull back previous messages for a user as they scroll/page because we have to query across multiple tables to piece together a batch of messages?
Is there a different/better approach for storing time series data that may be infrequently accessed, but available quickly?
I guess we can assume that there are many "active" conversations in parallel, right? Meaning - we're not dealing with the case where all the traffic is regarding a single conversation (or a few).
If that's the case, and you're using a random number/GUID as your HASH key, your objects will be evenly spread throughout the nodes and, as far as I know, you shouldn't be afraid of skew. Since the CreateDate is only the RANGE key, all messages for the same conversation will be stored on the same node (based on their ConversationID), so it actually doesn't matter if you query for the latest 5 records or the earliest 5. In both cases it's a query using the index on CreateDate.
I wouldn't break the data into multiple tables. I don't see what benefit it gives you (considering the previous section) and it will make your administrative life a nightmare (just imagine changing throughput for all tables, or backing them up, or creating a CloudFormation template to create your whole environment).
I would be concerned with the number of messages that will be returned when you pull the history. I guess you'll implement that with a query command using the ConversationID as the HASH key and ordering results by CreateDate descending. In that case, I'd return only the first page of results (I think it returns up to 1MB of data, so depending on the average message length, it might be enough or not) and fetch the next page only if the user keeps scrolling. Otherwise, you might use a lot of your throughput on really long conversations, and anyway the client doesn't really want to get stuck for a long time waiting for megabytes of data to appear on screen.
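As a rough sketch of that page-at-a-time approach: the function below fetches one page of the newest messages and returns the key needed to continue. It assumes `query_fn` behaves like a boto3 DynamoDB client's `query`; the table name `Messages` and the attribute name `ConversationId` are hypothetical, chosen only to match the schema discussed in the question.

```python
def latest_messages(query_fn, conversation_id, page_size=10, start_key=None):
    """Fetch one page of a conversation's messages, newest first.

    `query_fn` is assumed to behave like a boto3 DynamoDB client's `query`.
    Table and attribute names are hypothetical. Returns (items, next_key);
    pass `next_key` back as `start_key` to load older messages.
    """
    request = {
        "TableName": "Messages",
        "KeyConditionExpression": "ConversationId = :c",
        "ExpressionAttributeValues": {":c": {"S": conversation_id}},
        "ScanIndexForward": False,  # descending by the range key (CreateDate)
        "Limit": page_size,
    }
    if start_key is not None:
        request["ExclusiveStartKey"] = start_key
    page = query_fn(**request)
    return page.get("Items", []), page.get("LastEvaluatedKey")
```

The UI would call this once for the initial 10 messages and again with the returned key each time the user scrolls further back; a `next_key` of `None` means the start of the conversation has been reached.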
Hope this helps