Is it ok to build architecture around regular creation/deletion of tables in DynamoDB? - amazon-dynamodb

I have a messaging app, where all messages are arranged into seasons by creation time. There could be billions of messages each season. I have a task to delete messages of old seasons. I thought of a solution, which involves DynamoDB table creation/deletion like this:
Each table contains messages of only one season
When season becomes 'old' and messages no longer needed, table is deleted
Is it a good pattern and does it encouraged by Amazon?
ps: I'm asking, because I'm afraid of two things, met in different Amazon services -
In Amazon S3 you have to delete each item before you can fully delete bucket. When you have billions of items, it becomes a real pain.
In Amazon SQS there is a notion of 'unwanted behaviour'. When using SQS api you can act badly regarding SQS infrastructure (for example not polling messages) and thus could be penalized for it.

Yes, this is an acceptable design pattern, it actually follows a best practice put forward by the AWS team, but there are things to consider for your specific use case.
AWS has a limit of 256 tables per region, but this can be raised. If you are expecting to need multiple orders of magnitude more than this you should probably re-evaluate.
You can delete a table a DynamoDB table that still contains records, if you have a large number of records you have to regularly delete this is actually a best practice by using a rolling set of tables
Creating and deleting tables is an asynchronous operation so you do not want to have your application depend on the time it takes for these operations to complete. Make sure you create tables well in advance of you needing them. Under normal circumstances tables create in just a few seconds to a few minutes, but under very, very rare outage circumstances I've seen it take hours.
The DynamoDB best practices documentation on Understand Access Patterns for Time Series Data states...
You can save on resources by storing "hot" items in one table with
higher throughput settings, and "cold" items in another table with
lower throughput settings. You can remove old items by simply deleting
the tables. You can optionally backup these tables to other storage
options such as Amazon Simple Storage Service (Amazon S3). Deleting an
entire table is significantly more efficient than removing items
one-by-one, which essentially doubles the write throughput as you do
as many delete operations as put operations.

It's perfectly acceptable to split your data the way you describe. You can delete a DynamoDB table regardless of its size of how many items it contains.
As far as I know there are no explicit SLAs for the time it takes to delete or create tables (meaning there is no way to know if it's going to take 2 seconds or 2 minutes or 20 minutes) but as long your solution does not depend on this sort of timing you're fine.
In fact the idea of sharding your data based on age has the potential of significantly improving the performance of your application and will definitely help you control your costs.

Related

Storing and querying for announcements between two datetimes

Background
I have to design a table to store announcements in DynamoDB. Each announcement has the following structure:
{
"announcementId": "(For the frontend to identify an announcement to the backend)",
"author": "(id of author)",
"displayStartDatetime": "",
"displayEndDatetime": "",
"title": "",
"description": "",
"image": "(A url to an image)",
"link": "(A single url to another page)"
}
As we are still designing the table, alterations to the structure are permitted. In particular, announcementId, displayStartDatetime and displayEndDatetime can be changed.
The main access pattern is to find the current announcements. Users have a webpage which they can see all current announcements and their details.
Every announcement has a date for when to start showing it (displayStartDatetime) and when to stop showing it (displayEndDatetime). The announcement is should still be kept in the table after the current datetime is past displayEndDatetime for reference for admins.
The start and end datetime are precise to the minute.
Problem
Ideally, I would like a way to query the table for all the current announcements in one query.
However, I have come to the conclusion that it is impossible to fuse two datetimes in one sort key because it is impossible to order two pieces of data of equal importance (e.g. storing the timestamps as a string will mean one will be more important/greater than the other).
Hence, as a compromise, I would like to sort the table values by displayEndDatetime so that I can filter out past announcements. This is because, as time goes on, there will be more past announcements than future announcements, so it will be more beneficial to optimise that.
Compromised Solution
Currently, my (not very good) solutions are:
Use one "hot" partition key and use the displayEndDatetime as the sort key.
This allows me to filter out past announcements, but it also means that all the data is in a single partition. I could run a scheduled job every now and then to move the past announcements to a different spaced out partitions.
Scan through the table
I believe Scan will look at every item in the table before it performs any filtering. This solution doesn't seem as good as 1. but it would be the simplest to implement and it would allow me to keep announcementId as the partition key.
Scan a GSI of the table
Since Scan will look through every item, it may be more efficient to create a GSI (announcementId (PK), displayEndDatetime (SK)) and scan through that to retrieve all the announcementIds which have not passed. After that, another request could be made to get all the announcements.
Question
What is the most optimised solution for storing all announcements and then finding current announcements when using DynamoDB?
Although I have listed a few possible solutions for sorting the displayEndDatetime, the main point is still finding announcements between the start and end datetime.
Edit
Here are the answers to #tugberk's questions on the background:
What is the rate of writes you anticipate receiving (i.e. peak writes per second you need to handle)?
I am uncertain of how the admins will use this system, announcements can be very regular (about 3/day) or very infrequent (about 3/month).
How much new data do you anticipate storing daily, and how do you think this will grow?
As mentioned above, this could be about 3 announcements a day or 3 a month. This is likely to remain the same for as long as I should be concerned about.
What is the rate of reads (e.g. peak reads per second)?
I would expect the peak reads per second to be around 500-1000 reads/s. This number is expected to grow as there are more users.
How many announcements a user can see at a time (i.e. what's avg/max number of announcements will be visible at any point in time)? Practically thinking, this shouldn't be more than a few (e.g. 10-20 at most).
I would expect the maxmimum number of viewable announcements to be up to 30-40. This is because there could be multiple long-running announcements along with short-term announcements. On average, I would expect about 5-10 announcements.
What is the data inconsistency gap you are happy to have here (i.e. do you need seconds level precision, or would you be happy to have ~1min delay on displaying and hiding announcements)?
I think the speed which the announcement starts showing is important, especially if the admins decide that this is a good platform for urgent announcements (likely urgent to the minute). However, when it stops showing is less important, but to avoid confusing the users the announcement should stop display at most 4 hours after it is past its display end datetime.
This type of questions are always hard to answer here as there is so many assumptions on the answer as it's really hard to have all the facts. But I will try to give you so ideas, which may help you think about your data storage choice as well as giving you further options.
I know what I am doing, and really need to use DynamoDB
Edited this answer based on the OP's answers to my original questions.
As you really need to us DynamoDB for this for internal reasons, I think it's more suitable to store the data in two DynamoDB tables for both serving reads and writes as nearly all access patterns I can think of will hit multiple partitions if you have one table. You can get away with a GSI, but it's not too straight forward how to do it, and I am not sure whether there is any advantage to doing it that way.
The core thing you need to optimize for is the reads as you mentioned it can go up to 2K/rps which is big enough to make this the part where you optimize your architecture against. Based on your assumptions of having 3 announcements a day, it's nothing to worry about as far as the writes are concerned.
General idea is this:
I would consider using one DynamoDB table to handle writes where you can configure author identifier as the partition key, and announcement identifier as the sort key (and make your primary key as the combination of both). This will allow you to query all the announcements for a given author easily.
I would also have a second DynamoDB table to handle reads, where you will only store active announcements which your application can query and retrieve all of it with a Scan query (i.e. O(N)), which is not a concern as you mentioned there will only be 30-40 active announcments at any point in time. Let's imagine this to be even 500, you are still OK with this structure. In terms of partition and sort key, I would just have an active boolean field as the partition key, which you will always have it as true, you can have the announcement id as the sort key, and make the combination of both as the primary key. If you care about the sort of these announcements, you can adjust the sort key accordingly but make sure it's unique (i.e. consider concatenating the announcement identifier, e.g. {displayBeginDatetime-in-yyyyMMddHHmmss-format}-{announcementId}. With this way you will guarantee that you will only hit one partition. However, you can actually simplify this and have the announcement identifier as the partition key and primary key as I am nearly sure that DynamoDB will store all your data in one partition as it's going to be so small. Better to confirm this though as I am not 100% sure. The point here is that you are much better of ensuring hitting one partition with this query.
Here is how this may work, where there are some edge cases I am overlooking:
record the write inside the first DynamoDB for an announcement. When an announcement is written, configure displayEndDatetime as the TTL of that row, with the assumption that you don't need this record in this table when an announcement expires.
have a job running for N minute (one or more, depending on the data inconsistency gap you can handle), which will Scan the entire DynamoDB table across partitions (do it in a paginated way), and makes decisions on which announcements are currently visible. Then, write your data into the second DynamoDB table, which will handle the reads, in the structure we have established above so that your consumer can read from this w/o worrying about any filtering as the data is already filtered (e.g. all the announcements here are visible ones). Note that Scan is fine here as you are running this once every N minutes, with the assumption that you are ok with at least 1 minute + processing time data inconsistency gap. I would suggest running this every 10 minutes or so, if you don't have strong data consistency requirements.
On the read storage system, also configure displayEndDatetime as the TTL for the row so that it gets automatically deleted.
Configure DynamoDB streams on the first DynamoDB table, which has 24 hours retention and exactly once delivery guarantee, and have a lambda consumer of this stream, which to handle when an item is deleted (will happen when TTL kicks in for a particular row) to keep a record of this announcements somewhere else, for longer retention reasons, and will need to expose it through different access pattern (e.g. show all the announcements per author so that they can reenable old announcements), as you mentioned in you question. You can configure a lambda event sourcing with DynamoDb streams, which will allow you to handle failures with retries, etc. Make sure that your logic in these lambdas are idempotent so that you can retry safely.
The below is the parts from my original question, which are still relevant to anyone who might be trying to achieve the same. So, I will leave them here but they are less relevant as the OP needs to use DynamoDB.
Why DynamoDB?
First of all, I would question why you need DynamoDB for this, as it seems like your requirements are more read heavy than it's being write heavy, where I think DynamoDB shines the most due to its partitioned out of the box nature.
Below questions would help you understand whether you really need DynamoDB for this, or can you get away with a more flexible data storage system:
what is the rate of writes you anticipate receiving (i.e. peak writes per second you need to handle)?
how much new data do you anticipate storing daily, and how do you think this will grow?
what is the rate of reads (e.g. peak reads per second)?
How many announcements a user can see at a time (i.e. what's avg/max number of announcements will be visible at any point in time)? Practically thinking, this shouldn't be more than a few (e.g. 10-20 at most). This will help you understand whether you need will be OK pulling all the visible announcements in one go, or need a pagination system.
What is the data inconsistency gap you are happy to have here (i.e. do you need seconds level precision, or would you be happy to have ~1min delay on displaying and hiding announcements)?
Actually, I don't need DynamoDB
Based on my assumptions on your consumption and admin needs for this use case, I believe you don't need DynamoDB for this with the assumption of not having high number of writes for this (which might be wrong), and if these assumptions are correct, the above is a super over engineered solution for you. Let's say it's correct, I think you are better of using PostgreSQL for this, which can give you easy ability to change your access pattern as you see fit with further indexing, and for the current access pattern you have, you can have a range query over the start and end times.

how to create dynamoDB efficiently with my table?

If each of my database's an overview has only two types (state: pending, appended), is it efficient to designate these two types as partition keys? Or is it effective to index this state value?
It would be more effective to use a sparse index. In your case, you might add an attribute called isPending. You can add this attribute to items that are pending, and remove it once they are appended. If you create a GSI with tid as the hash key and isPending as the sort key, then only items that are pending will be in the GSI.
It will depend on how would you search for these records!
For example, if you will always search by record ID, it never minds. But if you will search every time by the set of records pending, or appended, you should think in use partitions.
You also could research in this Best practice guide from AWS: https://docs.aws.amazon.com/en_us/amazondynamodb/latest/developerguide/best-practices.html
Updating:
In this section of best practice guide, it recommends the following:
Keep related data together. Research on routing-table optimization
20 years ago found that "locality of reference" was the single most
important factor in speeding up response time: keeping related data
together in one place. This is equally true in NoSQL systems today,
where keeping related data in close proximity has a major impact on
cost and performance. Instead of distributing related data items
across multiple tables, you should keep related items in your NoSQL
system as close together as possible.
As a general rule, you should maintain as few tables as possible in a
DynamoDB application. As emphasized earlier, most well designed
applications require only one table, unless there is a specific reason
for using multiple tables.
Exceptions are cases where high-volume time series data are involved,
or datasets that have very different access patterns—but these are
exceptions. A single table with inverted indexes can usually enable
simple queries to create and retrieve the complex hierarchical data
structures required by your application.
Use sort order. Related items can be grouped together and queried
efficiently if their key design causes them to sort together. This is
an important NoSQL design strategy.
Distribute queries. It is also important that a high volume of
queries not be focused on one part of the database, where they can
exceed I/O capacity. Instead, you should design data keys to
distribute traffic evenly across partitions as much as possible,
avoiding "hot spots."
Use global secondary indexes. By creating specific global secondary
indexes, you can enable different queries than your main table can
support, and that are still fast and relatively inexpensive.
I hope I could help you!

DynamoDB tables per customer considering DynamoDB's advanced recovery abilities

I am deciding whether or not I have tables per a customer, or a customer shares a table with everybody else. Creating a table for every customer seems problematic, as it is just another thing to manage.
But then I thought about backing up the database. There could be a situation where a customer does not have strong IT security, or even a disgruntled employee, and that this person goes and deletes a whole bunch of crucial data of the customer.
In this scenario if all the customers are on the same table, one couldn't just restore from a DynamoDB snapshot 2 days ago for instance, as then all other customers would lose the past 2 days of data. Before cloud this really wasn't such a prevalent consideration IMO because backups were not as straight forward offering such functionality to your customers who are not tier 1 businesses wasn't really on the table.
But this functionality could be a huge selling point for my SAAS application so now I am thinking it will be worth the hassle for me to have table per customer. Is this the right line of thinking?
Sounds like a good line of thinking to me. A couple of other things you might want to consider:
Having all customer data in one table will probably be cheaper as you can distribute RCUs and WCUs more efficiently. From your customer point of view this might be good or bad because one customer can spend any customers RCUs/WCUs (if you want to think about like that). If you split customer data into separate tables your can provision them independently.
Fine grained security isn't great in DynamoDB. You can only really implement row (item) level security if the partition key of the table is an Amazon uid. If this isn't possible you are relying on application code to protect customer data. Splitting customer data into separate tables will improve security (if you cant use item level security).
On to your question. DynamoDB backups don't actually have to be restored into the same table. So potentially you could have all your customer data in one table which is backed up. If one customer requests a restore you could load the data into a new table, sync their data into the live table and then remove the restore table. This wouldn't necessarily be easy, but you could give it a try. Also you could be paying for all the RCUs/WCUs as you perform your sync - a cost you don't incur on a restore.
Hope some of that is useful.
Separate tables:
Max number of tables. It's probably a soft limit but you'd have to contact support rather often - extra overhead for you because they prefer to raise limits in small (reasonable) bits.
A lot more things to manage, secure, monitor etc.
There's probably a lot more RCU and WCU waste.
Just throwing another idea up in the air, haven't tried it or considered every pro and con.
Pick up all the write ops with Lambda and write them to backup table(s). Use TTL (for how long can users restore their stuff) to delete old entries for free. You could even modify TTL per costumer basis if you e.g provide longer backups for different price classes of your service.
You need a good schema to avoid hot keys.
customer-id (partition ID) | time-of-operation#uuid (sort key) | data, source table etc
-------------------------------------------------------------------------------------------
E.g this example might be problematic if some of your costumers are a lot more active than others.
Possible solution: use known range of int-s to suffix IDs, e.g customer-id#1, customer-id#2 ... customer-id#100 etc. This will spread the writes and your app knows the range - able to query.
Anyway, this is just a quick and dirty example off the top of my head.
Few pros and cons that come to my mind:
Probably more expensive unless separate tables have big RCU/WCU headroom.
Restoring from regular backup might be a huge headache, e.g which items to sync?
This is very cranual, users can pick any moment in your TTL range to restore.
Restore specific items, revert specific ops w/ very low cost if your schema allows it.
Could use that backup data to e.g show history of items in front-end.

Riak and time-sorted records

I'd like to sort some records, stored in riak, by a function of the each record's score and "age" (current time - creation date). What is the best way do do a "time-sensitive" query in riak? Thus far, the options I'm aware of are:
Realtime mapreduce - Do the entire calculation in a mapreduce job, at query-time
ETL job - Periodically do the query in a background job, and store the result back into riak
Punt it to the app layer - Don't sort at all using riak, and instead use an application-level layer to sort and cache the records.
Mapreduce seems the best on paper, however, I've read mixed-reports about the real-world latency of riak mapreduce.
MapReduce is a quite expensive operation and not recommended as a real-time querying tool. It works best when run over a limited set of data in batch mode where the number of concurrent mapreduce jobs can be controlled, and I would therefore not recommend the first option.
Having a process periodically process/aggregate data for a specific time slice as described in the second option could work and allow efficient access to the prepared data through direct key access. The aggregation process could, if you are using leveldb, be based around a secondary index holding a timestamp. One downside could however be that newly inserted records may not show up in the results immediately, which may or may not be a problem in your scenario.
If you need the computed records to be accurate and will perform a significant number of these queries, you may be better off updating the computed summary records as part of the writing and updating process.
In general it is a good idea to make sure that you can get the data you need as efficiently as possibly, preferably through direct key access, and then perform filtering of data that is not required as well as sorting and aggregation on the application side.

Storing messages and threads in Windows Azure Table Storage

I am designing a simple messaging service using ASP.NET MVC / Windows Azure Table Storage. I have two kinds of entities - messages and message threads. Relation between them is simple - each thread can have multiple messages but the message can only be assigned to one thread.
Table storage is not a relational DB, so representing relations is always a bit tricky. I need to decide between 2 approaches:
Having one big table for threads and one for messages. And having threadId as a partition key of message entity so that messages are partitioned by threads.
Dynamically creating a special table for each message thread and having threadId as a name of the table.
I tend to prefer the second because it fits better into architecture of the rest of the service. But there will obviously be large number of tables created in a storage account.
Do you think this may be a problem?
You could also consider having just one table, that stores both Thread and Message entities. This would give you transaction support, and you could use Lucifure's hybrid approach on this table.
Creating a large number of tables may be an issue, depending on how you want to manage them. The underlying REST API for listing tables works like a query for table entities. It only returns the first 1000 tables, after that you have to use a continuation token. All of the storage explorers I've seen don't allow you to query tables based on name, they simply like the first 1000 tables. If you end up with 20000 threads, it could take you a while to get to the table you want.
One way you could mitigate this is to put your message table in its own storage account. This way your storage account with all of your other tables won't get crowded out by all of these dynamic tables that you will be creating and possibly deleting.
Deleting is actually one of the ways in which using a separate table for each thread would be easier. To delete all of the related messages you simply have to delete one table rather than iterating over each message and deleting it.
Everything else however will be more complicated than keeping all of the messages in one table. If this is core functionality to your app and you can dedicate enough time to develop it this way, one table per thread is probably a good idea. Otherwise the easy way to do things is with one big table.
You may consider a hybrid approach to keep the number of tables to a manageable level, depending on your scalability needs.
My experience has been that date based partitioning at the table level is a very effective approach and can be leverage across the board.
For example you could partition tables based on date and with a granularity of day or month. So a table name like “Thread201202” could be used for all threads started in February 2012.
Your thread id would implicitly include the “201202” and be something like “201202-myid01” although you would not need to explicitly store it in the partition key since it would be implied in the table name.
Aged threads could then be easily disposed by deleting tables say more than a year old.

Resources