Integration testing a DynamoDB client which uses inconsistent reads? - amazon-dynamodb

Situation:
A web service with an API to read records from DynamoDB. It uses eventually consistent reads (the GetItem default mode).
An integration test consisting of two steps:
create test data in DynamoDB
call the service to verify that it is returning the expected result
I worry that this test is bound to be fragile due to eventual consistency of the data.
If I attempt to verify the data immediately after writing, using GetItem with ConsistentRead=true, it only guarantees that the data has been written to the majority of DB copies, but not all, so the service under test still has a chance to read from a non-updated copy in the next step.
Is there a way to ensure that the data has been written to all DynamoDB copies before proceeding?

The data usually reaches all replicas within a second. My suggestion is to wait (in Java terms, sleep for a couple of seconds) after inserting the data into the DynamoDB table and before calling the web service; that should produce the desired result.
Eventually Consistent Reads (Default) – the eventual consistency
option maximizes your read throughput. However, an eventually
consistent read might not reflect the results of a recently completed
write. Consistency across all copies of data is usually reached within
a second. Repeating a read after a short time should return the
updated data.
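For illustration, a minimal sketch of such a test in Java with the AWS SDK v2 (the table name, key, and the commented-out service call are assumptions, not from the question):

    import java.util.Map;
    import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
    import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
    import software.amazon.awssdk.services.dynamodb.model.PutItemRequest;

    public class ReadAfterWriteTest {
        public static void main(String[] args) throws InterruptedException {
            DynamoDbClient dynamo = DynamoDbClient.create();

            // Step 1: create the test data ("Records" and "id" are assumed names).
            dynamo.putItem(PutItemRequest.builder()
                    .tableName("Records")
                    .item(Map.of("id", AttributeValue.builder().s("test-1").build()))
                    .build());

            // Give the replicas a couple of seconds to converge before reading
            // through the eventually consistent service.
            Thread.sleep(2000);

            // Step 2: call the web service under test and verify the result,
            // e.g. response = serviceClient.getRecord("test-1"); assert on it.
        }
    }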

Related

DDD: persisting domain objects into two databases. How many repositories should I use?

I need to persist my domain objects into two different databases. This use case is purely write-only. I don't need to read back from the databases.
Following Domain Driven Design, I typically create a repository for each aggregate root.
I see two alternatives. I can create one single repository for my aggregate root (AR), and implement it so that it persists the domain object into the two databases.
The second alternative is to create two repositories, one for each database.
From a domain driven design perspective, which alternative is correct?
My requirement is that it must persist the AR in both databases, all or nothing. So if the first write goes through and the second fails, I would need to remove the AR from the first database.
If you had a transaction manager that were to span across those two databases, you would use that manager to automatically roll back all of the transactions if one of them fails. A transaction manager like that would necessarily add overhead to your writes, as it would have to ensure that all transactions succeeded, and while doing so, maintain a lock on the tables being written to.
If you consider what the transaction manager is doing, it is effectively writing to one database and ensuring that write is successful, writing to the next, and then committing those transactions. You could implement the same type of flow yourself using a two-phase commit. Unfortunately, this can be complicated, because keeping two databases in sync is inherently complex.
You would use a process manager or saga to manage the process of ensuring that the databases are consistent:
Write to the first database and leave the record in a PENDING status (not visible to user reads).
Make a request to the second database to write the record in a PENDING status.
Make a request to the first database to set the record to a VALID status (visible to user reads).
Make a request to the second database to set the record to a VALID status.
The issue with this approach is that the process can fail at any point. In this case, you would need to account for those failures. For example,
You can have a process that comes through and finds records in PENDING status that are older than X minutes and continues pushing them through the workflow.
You can have a process that cleans up any PENDING records after X minutes and purges them from the database.
Ideally, you are using something like a queue based workflow that allows you to fire and forget these commands and a saga or process manager to identify and react to failures.
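As a rough sketch of that PENDING/VALID workflow in Java (the repository interface and the compensation step are illustrative assumptions, not a full saga implementation):

    // Hypothetical per-database repositories exposing the status transitions.
    interface RecordRepository {
        void insertPending(String id);   // write the record with status = PENDING
        void markValid(String id);       // flip the status to VALID (visible to reads)
        void delete(String id);          // compensation: remove the record again
    }

    class TwoDatabaseProcessManager {
        private final RecordRepository first;
        private final RecordRepository second;

        TwoDatabaseProcessManager(RecordRepository first, RecordRepository second) {
            this.first = first;
            this.second = second;
        }

        void persist(String id) {
            first.insertPending(id);          // step 1
            try {
                second.insertPending(id);     // step 2
                first.markValid(id);          // step 3
                second.markValid(id);         // step 4
            } catch (RuntimeException e) {
                // In practice a background sweeper would retry or purge old
                // PENDING records; here we just compensate and rethrow.
                first.delete(id);
                throw e;
            }
        }
    }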
The second alternative is to create two repositories, one for each database.
Based on the above, hopefully you can understand why this is the correct option.
If you don't need to read back, why not build some sort of command log?
The log acts as a queue: you write each operation to it, and two processes pull new commands from it, each one updating one database. If you can accept that in the worst case the two DBs may temporarily hold different versions of the data, with the guarantee that they will eventually be consistent, this seems much easier to me than transactions spanning two different databases.
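A minimal sketch of that command-log idea, with a thread-safe in-memory list standing in for the durable log and one consumer per database (all names are illustrative):

    import java.util.List;
    import java.util.concurrent.CopyOnWriteArrayList;

    class CommandLogExample {
        // A durable command log would live in a table or a broker; a thread-safe
        // list stands in for it here.
        static final List<String> log = new CopyOnWriteArrayList<>();

        public static void main(String[] args) {
            // The write-only side appends commands to the log and returns immediately.
            log.add("INSERT order 42");
            log.add("UPDATE order 42 status=PAID");

            // Each database has its own consumer reading the same log from its own
            // offset, so both eventually apply every command.
            new Thread(() -> consume("db1")).start();
            new Thread(() -> consume("db2")).start();
        }

        static void consume(String dbName) {
            int offset = 0; // in practice the offset would be persisted per consumer
            while (true) {
                while (offset < log.size()) {
                    System.out.println("applying to " + dbName + ": " + log.get(offset));
                    offset++;
                }
                try {
                    Thread.sleep(100); // poll for new commands
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }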
I'm not sure how well DDD fits your use case: if you don't need to read back, you don't have any state to manage, so there is no real need for entities/aggregates.

DynamoDB: Time frame to avoid stale read

I am writing to DynamoDB using AWS Lambda and reading the data through the AWS Console. But I have seen instances of stale reads, where the latest records do not show up when I pull data within a few minutes of the write. What is a safe time interval for a data pull that would ensure the latest data is available on read? Would 30 minutes be a safe interval?
The below is from the AWS site; I just want to understand how recent "recent" is here: "When you read data from a DynamoDB table, the response might not reflect the results of a recently completed write operation. The response might include some stale data"
If you must have a strongly consistent read, you can specify that in your read statement. That way the client will always read from the leader storage node for that partition.
In order for DynamoDB to acknowledge a write, the write must be durable on the leader storage node for that partition and one other storage node for the partition.
If you do an eventually consistent read (which is the default), you have a roughly 1-in-3 chance of that read being served by the storage node that was not part of the write acknowledgement, and an even smaller chance that the item has not yet been updated on that third storage node.
So, if you need a strongly consistent read, ask for one and you'll get the newest of that item. There is no real performance degradation for doing a strongly consistent read.
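For example, with the AWS SDK for Java v2 a strongly consistent read is just a flag on the request (the table and key names below are assumptions):

    import java.util.Map;
    import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
    import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
    import software.amazon.awssdk.services.dynamodb.model.GetItemRequest;
    import software.amazon.awssdk.services.dynamodb.model.GetItemResponse;

    public class StrongReadExample {
        public static void main(String[] args) {
            DynamoDbClient dynamo = DynamoDbClient.create();

            GetItemResponse response = dynamo.getItem(GetItemRequest.builder()
                    .tableName("Orders")                       // assumed table name
                    .key(Map.of("id", AttributeValue.builder().s("42").build()))
                    .consistentRead(true)                      // read from the leader node
                    .build());

            System.out.println(response.item());
        }
    }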

DynamoDB table uses more Read/Write capacity than expected

Background: I have a DynamoDB table that I interact with exclusively through a DAO class. This DAO class logs metrics on the number of calls to insert/update/delete operations made through the boto library.
I noticed that the number of operations I logged in my code does correlate with the consumed read/write capacity in AWS monitoring, but the AWS consumption measurements are 2-15 times the number of operations I logged in my code.
I know for a fact that the only other process interacting with the table is my manual queries on the AWS UI (which is insignificant in capacity consumption). I also know that the size of each item is < 1 KB, which would mean each call should only consume 1 read.
I use strongly consistent reads, so I do not enjoy the 2x benefit of eventually consistent reads.
I am aware that boto auto-retries at most 10 times when throttled, but my throttling threshold is seldom reached, so that should not be the cause.
With that said, I wonder if anyone knows of any factor that may cause such a discrepancy between the number of calls to boto and the actual consumed capacity.
While I'm not sure of the support with the boto AWS SDK, in other languages it is possible to ask DynamoDB to return the capacity that was consumed as part of each request. It sounds like you are logging actual requests and not this metric from the API itself. The values returned by the API should accurately reflect what is consumed.
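For instance, with the AWS SDK for Java v2 the consumed capacity can be requested per call (a sketch; the table, key, and attribute names are assumptions):

    import java.util.Map;
    import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
    import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
    import software.amazon.awssdk.services.dynamodb.model.QueryRequest;
    import software.amazon.awssdk.services.dynamodb.model.QueryResponse;
    import software.amazon.awssdk.services.dynamodb.model.ReturnConsumedCapacity;

    public class ConsumedCapacityExample {
        public static void main(String[] args) {
            DynamoDbClient dynamo = DynamoDbClient.create();

            QueryResponse response = dynamo.query(QueryRequest.builder()
                    .tableName("MyTable")                                 // assumed name
                    .keyConditionExpression("id = :id")
                    .expressionAttributeValues(
                            Map.of(":id", AttributeValue.builder().s("some-id").build()))
                    .returnConsumedCapacity(ReturnConsumedCapacity.TOTAL) // ask for the metric
                    .build());

            // This is the number CloudWatch aggregates, so logging it next to your own
            // call counts makes any discrepancy visible per request.
            System.out.println("Consumed RCUs: " + response.consumedCapacity().capacityUnits());
        }
    }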
One possible source for this discrepancy is if you are doing query/scan requests where you are performing server side filtering. DynamoDB will consume the capacity for all of the records scanned and not just those returned.
Another possible cause of a discrepancy are the actual metrics you are viewing in the AWS console. If you are viewing the CloudWatch metrics directly make sure you are looking at the appropriate SUM or AVERAGE value depending on what metric you are interested in. If you are viewing the metrics in the DynamoDB console the interval you are looking at can dramatically affect the graph (ex: short spikes that appear in a 5 minute interval would be smoothed out in a 1 hour interval).

Can we make a PeopleSoft Application Engine wait for asynchronous messages to complete?

I am processing a data upload where I publish messages through PeopleSoft over Integration Broker asynchronously in an Application Engine. The whole point is to be able to send several messages and consume them in the same node. Before I send the messages, I store the data in a table (say T1) holding all the field values from the upload file.
While consuming I am trying to expose each message to the Component Interface and the exceptions are logged onto the same table T1. Let's say for each transaction we are flagging the table field (say Processed_flag ='Y').
I need a mechanism where I could just wait for all the asynchronous messages to complete. I am thinking of checking the T1 table, if there are any rows on the T1 table where Processed_flag is 'N', just make the thread sleep for more time. While all the messages are not processed keep it sleeping and don't let the application engine complete.
The only benefit I get is that I don't have to wait on multiple instances at once or make synchronous calls. The whole idea is to use the component from different transactions (as if it were used by, say, 100 people -> 100 transactions).
Until those 100 transactions are complete, we make sure our T1 table keeps a record of what goes on. If something goes wrong, it can log the exceptions caught by the CI.
Any comments on this approach would be appreciated. Thanks in advance!
We are taking a different approach. Even if we are able to validate the data in those tables before the Application Engine completes, the whole point of sending the messages asynchronously is lost. In that case, using synchronous messages and running the processes in parallel would be better.
So, we decided to let the application engine complete and publish all the chunks of data through messages and make sure the messages are completely consumed in the same node.
We will be updating the table T1, for all the processed / successful / failed rows as we keep consuming the messages and use them as needed.
We will keep an audit or counter for all the rows published and consumed, since exposing the same component to multiple transactions can have a significant performance impact. We want to make sure it performs well when, say, 50 users are updating the same tables behind the component using the same CI (different instances, of course). I will be completing my proof of concept and hopefully it will perform much better than running the processes in parallel.
I hope this can help any one who is dealing with these upload issues. Thanks!

How do I prevent SQLite database locks?

From sqlite FAQ I've known that:
Multiple processes can have the same database open at the same time.
Multiple processes can be doing a SELECT at the same time. But only
one process can be making changes to the database at any moment in
time, however.
So, as far as I understand I can:
1) Read db from multiple threads (SELECT)
2) Read db from multiple threads (SELECT) and write from single thread (CREATE, INSERT, DELETE)
But I read that Write-Ahead Logging provides more concurrency, as readers do not block writers and a writer does not block readers; reading and writing can proceed concurrently.
Finally, I got completely muddled when I found this, which specifies:
Here are other reasons for getting an SQLITE_LOCKED error:
Trying to CREATE or DROP a table or index while a SELECT statement is
still pending.
Trying to write to a table while a SELECT is active on that same table.
Trying to do two SELECT on the same table at the same time in a
multithread application, if sqlite is not set to do so.
fcntl(3, F_SETLK) call on the DB file fails. This could be caused by an NFS locking
issue, for example. One solution for this issue is to mv the DB away,
and copy it back so that it has a new Inode value.
So I would like to clarify for myself: when should I avoid locks? Can I read and write at the same time from two different threads? Thanks.
For those who are working with Android API:
Locking in SQLite is done on the file level, which guarantees locking
of changes from different threads and connections. Thus multiple
threads can read the database, however only one can write to it.
More on locking in SQLite can be read in the SQLite documentation, but we are most interested in the API provided by Android.
Writing from two concurrent threads can be attempted either from a single database connection or from multiple connections. Since only one thread can write to the database at a time, there are two cases:
If you write from two threads over one connection, one thread will wait for the other to finish writing.
If you write from two threads over different connections, an error occurs: your data will not be written to the database and the application will be interrupted with a SQLiteDatabaseLockedException. It becomes evident that the application should always have only one instance of SQLiteOpenHelper (i.e. a single open connection), otherwise SQLiteDatabaseLockedException can occur at any moment.
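A common sketch of keeping a single SQLiteOpenHelper instance for the whole application (the class, database, and table names are illustrative):

    import android.content.Context;
    import android.database.sqlite.SQLiteDatabase;
    import android.database.sqlite.SQLiteOpenHelper;

    public class DatabaseHelper extends SQLiteOpenHelper {
        private static volatile DatabaseHelper instance;

        public static DatabaseHelper getInstance(Context context) {
            if (instance == null) {
                synchronized (DatabaseHelper.class) {
                    if (instance == null) {
                        // Use the application context so no Activity is leaked.
                        instance = new DatabaseHelper(context.getApplicationContext());
                    }
                }
            }
            return instance;
        }

        private DatabaseHelper(Context context) {
            super(context, "app.db", null, 1);
        }

        @Override
        public void onCreate(SQLiteDatabase db) {
            db.execSQL("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)");
        }

        @Override
        public void onUpgrade(SQLiteDatabase db, int oldVersion, int newVersion) {
            db.execSQL("DROP TABLE IF EXISTS items");
            onCreate(db);
        }
    }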
Different Connections At a Single SQLiteOpenHelper
Everyone is aware that SQLiteOpenHelper has two methods providing access to the database, getReadableDatabase() and getWritableDatabase(), to read and write data respectively. However, in most cases there is one real connection. Moreover, it is one and the same object:
SQLiteOpenHelper.getReadableDatabase()==SQLiteOpenHelper.getWritableDatabase()
This means that in practice there is no difference between the two methods for reading data. However, there is another, more important undocumented issue: the class SQLiteDatabase has its own internal locks (the variable mLock). Writes are locked at the level of the SQLiteDatabase object, and since there is only one SQLiteDatabase instance for both reads and writes, reads are blocked as well. This is most noticeable when writing a large volume of data in a transaction.
Let's consider an example: an application that, on first launch, has to download a large volume of data (approx. 7000 rows containing BLOBs) in the background and save it to the database. If the data is saved inside a single transaction, saving takes approx. 45 seconds, but the user cannot use the application because all reading queries are blocked. If the data is saved in small portions, the update process drags out for a rather lengthy period (10-15 minutes), but the user can use the application without any restrictions or inconvenience. A double-edged sword: either fast or convenient.
Google has already fixed some of the issues related to SQLiteDatabase functionality by adding the following methods:
beginTransactionNonExclusive() - creates a transaction in IMMEDIATE mode.
yieldIfContendedSafely() - temporarily yields the transaction in order to allow other threads to complete their work.
isDatabaseIntegrityOk() - checks database integrity.
Please read in more details in the documentation.
However for the older versions of Android this functionality is required as well.
The Solution
First, internal locking should be turned off so that data can be read in any situation.
SQLiteDatabase.setLockingEnabled(false);
This disables the internal query locking at the level of the Java class (it is not related to locking in terms of SQLite itself).
SQLiteDatabase.execSQL("PRAGMA read_uncommitted = true;");
This allows reading data from the cache; in effect, it changes the isolation level. The parameter has to be set anew for each connection; if there are several connections, it only affects the connection that issues the command.
SQLiteDatabase.execSQL("PRAGMA synchronous=OFF");
This changes the way data is written to the database, skipping "synchronization". With this option active, the database can be damaged if the system unexpectedly fails or the power supply is cut off. However, according to the SQLite documentation, some operations can be up to 50 times faster when synchronization is turned off.
Unfortunately, not all PRAGMA options are supported on Android; e.g. "PRAGMA locking_mode = NORMAL" and "PRAGMA journal_mode = OFF", among others, are not supported. Attempting to call an unsupported PRAGMA makes the application fail.
The documentation for setLockingEnabled says that this method is recommended only when you are sure that all work with the database is done from a single thread. We should guarantee that only one transaction is held at a time. Also, instead of the default (exclusive) transactions, immediate transactions should be used. In older versions of Android (below API 11) there is no way to create an immediate transaction through the Java wrapper, however SQLite supports this functionality. To start a transaction in immediate mode, the following SQLite statement should be executed directly against the database, for example through the method execSQL:
SQLiteDatabase.execSQL("begin immediate transaction");
Since the transaction is started by a direct query, it should be finished the same way:
SQLiteDatabase.execSQL("commit transaction");
The only thing left to implement is a TransactionManager, which starts and finishes transactions of the required type. The purpose of the TransactionManager is to guarantee that all queries that change data (insert, update, delete, and DDL queries) originate from the same thread.
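A minimal sketch of such a TransactionManager, funnelling all writes through one single-threaded executor and wrapping them in an immediate transaction (this is an illustration of the idea, not the author's original implementation):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import android.database.sqlite.SQLiteDatabase;

    public class TransactionManager {
        private final SQLiteDatabase db;
        // Every write goes through this single thread, so only one transaction runs at a time.
        private final ExecutorService writeThread = Executors.newSingleThreadExecutor();

        public TransactionManager(SQLiteDatabase db) {
            this.db = db;
        }

        public void executeInImmediateTransaction(Runnable writeWork) {
            writeThread.execute(() -> {
                db.execSQL("begin immediate transaction");
                try {
                    writeWork.run();
                    db.execSQL("commit transaction");
                } catch (RuntimeException e) {
                    // Roll back on failure; error handling/reporting is left to the caller.
                    db.execSQL("rollback transaction");
                    throw e;
                }
            });
        }
    }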
Hope this helps future visitors!
Not specific to SQLite:
1) Write your code to gracefully handle the situation where you get a locking conflict at the application level, even if you wrote your code so that this is 'impossible'. Use transactional retries (i.e. SQLITE_LOCKED could be one of many codes that you interpret as "try again" or "wait and try again"), and coordinate this with application-level code (a small retry sketch follows this list). If you think about it, getting a SQLITE_LOCKED is better than simply having the attempt hang because it's locked, because you can go do something else.
2) Acquire locks. But you have to be careful if you need to acquire more than one. For each transaction at the application level, acquire all of the resources (locks) you will need in a consistent (ie: alphabetical?) order to prevent deadlocks when locks get acquired in the database. Sometimes you can ignore this if the database will reliably and quickly detect the deadlocks and throw exceptions; in other systems it may just hang without detecting the deadlock - making it absolutely necessary to take the effort to acquire the locks correctly.
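A small sketch of the retry idea from point 1, using Android's SQLiteDatabaseLockedException as the "try again" signal (the back-off values are arbitrary):

    import android.database.sqlite.SQLiteDatabase;
    import android.database.sqlite.SQLiteDatabaseLockedException;

    public class RetryingWriter {
        // Retries a write a few times when the database reports it is locked,
        // instead of letting the failure bubble up immediately.
        public static void writeWithRetry(SQLiteDatabase db, String sql, int maxAttempts)
                throws InterruptedException {
            for (int attempt = 1; ; attempt++) {
                try {
                    db.execSQL(sql);
                    return;
                } catch (SQLiteDatabaseLockedException locked) {
                    if (attempt >= maxAttempts) {
                        throw locked;
                    }
                    Thread.sleep(50L * attempt); // simple linear back-off before retrying
                }
            }
        }
    }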
Besides the facts of life with locking, you should try to design the data and in-memory structures with concurrent merging and rolling back planned in from the beginning. If you can design data such that the outcome of a data race gives a good result for all orders, then you don't have to deal with locks in that case. A good example is to increment a counter without knowing its current value, rather than reading the value and submitting a new value to update. It's similar for appending to a set (ie: adding a row, such that it doesn't matter which order the row inserts happened).
A good system is supposed to transactionally move from one valid state to the next, and you can think of exceptions (even in in-memory code) as aborting an attempt to move to the next state; with the option to ignore or retry.
You're fine with multithreading. The page you link lists what you cannot do while you're looping on the results of your SELECT (i.e. your select is active/pending) in the same thread.

Resources