Can AWS DAX prevent DynamoDB provisioned throughput errors? - amazon-dynamodb-dax

I am using DynamoDB and I am getting read and write ProvisionedThroughputExceededException errors.
How can I solve this?
Can using DAX ensure that I do not get this error?

DAX is a write-through cache, not write-back. This means that on a cache miss, DAX makes the call to DynamoDB on your behalf to fetch the data. In this model, you are still responsible for managing DynamoDB table capacity.
You may also want to consider using DynamoDB auto scaling alongside DAX, but whether it helps depends on your access patterns.
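Independently of DAX, the usual mitigation for ProvisionedThroughputExceededException is retrying with exponential backoff and jitter. The AWS SDKs already do this internally, but if you need an extra retry layer, a minimal sketch (the names `call_with_retries` and `is_throttle_error` are illustrative, not part of any AWS API):

```python
import random
import time

def call_with_retries(operation, is_throttle_error, max_retries=5,
                      base=0.05, cap=2.0):
    """Retry `operation` on throttling errors with full-jitter
    exponential backoff; re-raise anything else (or the last error)."""
    for attempt in range(max_retries):
        try:
            return operation()
        except Exception as exc:
            if not is_throttle_error(exc) or attempt == max_retries - 1:
                raise
            # sleep a random amount in [0, min(cap, base * 2^attempt))
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

With boto3, `is_throttle_error` would typically inspect the exception's error code for `ProvisionedThroughputExceededException`; backoff smooths bursts, but sustained throttling still means the table needs more capacity.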

Related

Is Cloudflare Workers KV a database like DynamoDB in AWS?

I can't find a clear comparison between Cloudflare Workers KV and DynamoDB in AWS. Can anyone explain the difference in simpler terms?
Although there are some similarities (e.g. both DynamoDB and Workers KV are offered as managed services), I would say they are more different than they are alike.
Workers KV is always eventually consistent, whereas DynamoDB can be strongly consistent for read-after-write operations.
DynamoDB has additional capabilities such as local and global secondary indexes allowing you to have different access patterns for the same underlying data.
Workers KV is heavily optimized for reads with infrequent writes, whereas DynamoDB doesn't have the same limitation (though DynamoDB also handles reads better than writes in terms of throughput).
DynamoDB also has other features such as stream processing which allows you to do out of band processing in reaction to changes to the data stored in the database.
I'm not sure about the security model for Workers KV but DynamoDB allows you to configure strict access policies to the tables.

Verify dynamodb is healthy

I would like to verify in my service's /health check that I have a connection to my DynamoDB table.
I am looking for something like SELECT 1 in MySQL (it only pings the DB and returns 1), but for DynamoDB.
I saw this post, but searching for a nonexistent item is an expensive operation.
Any ideas on how to just ping my DB?
I believe the SELECT 1 equivalent in DDB is a Scan with a Limit of 1 item. You can read more here.
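As a sketch of that approach (the `ping_dynamodb` helper and the injected `client` are illustrative; with boto3 the client would come from `boto3.client("dynamodb")`):

```python
def ping_dynamodb(client, table_name):
    """Cheap liveness probe: a 1-item Scan round-trips network and
    auth without depending on any particular item existing.
    Returns True on success, False on any error."""
    try:
        client.scan(TableName=table_name, Limit=1)
        return True
    except Exception:
        return False
```

Note that even an empty Scan with Limit=1 consumes a small amount of read capacity, so keep the probe interval modest.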
DynamoDB is a managed service from AWS and is highly available anyway. Instead of querying to verify the health of DynamoDB, why not set up CloudWatch metrics on your table and check for recent DynamoDB alarms in CloudWatch? This will also prevent you from spending read units.
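A sketch of that alarm-based check (the helper name and alarm names are illustrative; with boto3, `cw_client` would be `boto3.client("cloudwatch")` and `describe_alarms` is its real API):

```python
def dynamodb_alarm_ok(cw_client, alarm_names):
    """Report healthy only if none of the named CloudWatch alarms
    (e.g. throttle alarms configured on the table) is in ALARM state."""
    resp = cw_client.describe_alarms(AlarmNames=alarm_names)
    return all(a["StateValue"] != "ALARM" for a in resp["MetricAlarms"])
```

This costs CloudWatch API calls rather than table read units, at the price of alarm-evaluation latency.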
The question is perhaps too broad to answer as stated. There are many ways you could set this up, depending on your concerns and constraints.
My recommendation would be not to over-think or over-do it in terms of verifying connectivity from your service host to DynamoDB: for example, just performing a periodic GetItem should be sufficient to establish basic network connectivity.
Instead of going about the problem from this angle, perhaps you might want to consider a different approach:
a) set up canary tests that exercise all your service features periodically -- these should be "fail-fast" lightweight tests that run constantly, and in the event of consistent failure you can take action
b) set up error metrics from your service and alarm on those metrics: for example, CloudWatch allows you to take action on metrics -- you will likely get more mileage out of this approach than narrowly focusing on a single failure mode (i.e. DynamoDB, which, as others have stated, is a managed service with a very good availability SLA)

Throttling requests in Cosmos DB SQL API

I'm running a simple adf pipeline for storing data from data lake to cosmos db (sql api).
After setting database throughput to Autopilot 4,000 RU/s, the run took ~11 min and I saw 207 throttled requests. With database throughput set to Autopilot 20,000 RU/s, the run took ~7 min and I saw 744 throttled requests. Why is that? Thank you!
Change the indexing policy from Consistent to None for the ADF copy activity, then change it back to Consistent when done.
Azure Cosmos DB supports two indexing modes:
Consistent: The index is updated synchronously as you create, update or delete items. This means that the consistency of your read queries will be the consistency configured for the account.
None: Indexing is disabled on the container. This is commonly used when a container is used as a pure key-value store without the need for secondary indexes. It can also be used to improve the performance of bulk operations. After the bulk operations are complete, the index mode can be set to Consistent and then monitored using the IndexTransformationProgress until complete.
How to modify the indexing policy: Modifying the indexing policy
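The mode is controlled by the `indexingMode` field of the container's indexing policy. A minimal sketch of the policy JSON for the bulk-load phase (for the SQL API, `automatic` must be `false` when the mode is `none`):

```json
{
  "indexingMode": "none",
  "automatic": false
}
```

After the ADF copy completes, set the policy back to `"indexingMode": "consistent"` with `"automatic": true`, and monitor the index transformation progress until it reaches 100%.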

Azure Mongo Cosmos Ping Test Equivalent

I want to do a ping-like operation on a Cosmos/Mongo/DocumentDB in Azure.
The collection has zero documents in it.
I am using the Microsoft.Azure.Documents.Client (Microsoft.Azure.Documents.Client.dll).
I want to do something that exercises a full round trip and auth cycle to the Cosmos DB to prove the general integrity of the config before any documents are in the collection.
I was looking for an operation on DocumentClient that would prove or disprove that all the configuration is correct at runtime, like a ping.
You could call _client.OpenAsync(cancellationToken), which will validate your configuration and throw an exception if there is a problem connecting to the database.
In fact, it is recommended that you call this on service/app startup to avoid latency on your first request.
Reference:
https://learn.microsoft.com/en-us/dotnet/api/microsoft.azure.documents.client.documentclient.openasync?view=azure-dotnet
https://learn.microsoft.com/en-us/azure/cosmos-db/performance-tips

Serverless Titan graph stack with AWS DynamoDB and Lambda

As announced here, it is possible to use Titan with DynamoDB as its backend.
Is it possible to build a serverless Titan Graph DB stack that is accessed via AWS Lambda functions?
Theoretically there should be nothing stopping this implementation, but I couldn't find any example. There has been a discussion on the issue under the code repository, but it has not yielded anything concrete yet.
It is possible, but I have not measured the latency involved in starting Titan in a Lambda function. For high request rates, write loads may not be appropriate: each Lambda container will try to secure its own range of IDs from the titan_ids table, and you may run out of IDs quickly. If your requests are read-only, one way to reduce Titan launch time is to open the graph in read-only mode; in read-only mode, Titan does not need to get an ID range lease from titan-ids either.
