I am trying to implement DynamoDB autoscaling using Terraform, but I am having a bit of difficulty understanding the difference between aws_appautoscaling_target and aws_appautoscaling_policy.
Do we need both specified for the autoscaling group? Can someone kindly explain what each is meant for?
Thanks a ton!!
The aws_appautoscaling_target ties your policy to the DynamoDB table. You can define a policy once and use it over and over (e.g. build a standard set of scaling policies for your organization to use); the target is what binds a policy to a concrete resource.
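As a minimal sketch (the table name, capacities, and the 70% utilization target are placeholder assumptions): the target registers the table's read capacity as scalable, and the policy then attaches to that target.

resource "aws_appautoscaling_target" "table_read" {
  max_capacity       = 100
  min_capacity       = 5
  resource_id        = "table/my-table"                    # assumed table name
  scalable_dimension = "dynamodb:table:ReadCapacityUnits"
  service_namespace  = "dynamodb"
}

resource "aws_appautoscaling_policy" "table_read" {
  name               = "dynamodb-read-utilization"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.table_read.resource_id
  scalable_dimension = aws_appautoscaling_target.table_read.scalable_dimension
  service_namespace  = aws_appautoscaling_target.table_read.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "DynamoDBReadCapacityUtilization"
    }
    target_value = 70.0   # keep consumed/provisioned reads near 70%
  }
}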
An auto scaling group doesn't have to have either of these. An ASG can scale EC2 instances in/out based on other triggers, such as instance health (defined by EC2 health checks or LB health checks) or desired capacity. This allows a load-balanced application to replace bad instances when they are unable to respond to traffic, and to recover from failures to keep your cluster at the right size. You could add additional scaling policies to react better to demand: for example, if your cluster has 2 instances and they're at max capacity, a scaling policy can watch those instances, add more when needed, and then remove them when demand falls (see the sketch below).
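For contrast, a hedged sketch of a demand-based policy attached directly to an EC2 ASG; the group name and the 60% CPU target are assumptions, and the ASG itself is assumed to be defined elsewhere in your configuration.

resource "aws_autoscaling_policy" "cpu_tracking" {
  name                   = "cpu-target-tracking"
  autoscaling_group_name = aws_autoscaling_group.app.name   # assumed ASG resource
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 60.0   # add instances above ~60% average CPU, remove below
  }
}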
The Cosmos DB documentation seems to suggest that if we configure a strongly consistent Cosmos DB account with >= 3 regions, we get availability similar to eventual consistency (per the SLAs).
But according to the CAP theorem, how can this be the case? Suppose we have 3 regions, and there is a network partition isolating 1 read region from the remaining two (1 write and 1 read region). If a write request comes to the write region, there are two options:
Fail the request
Commit the write to the write region and the reachable read region; the isolated read region cannot be reached.
If Cosmos DB goes with option 2, then if a read request were to come to the region that could not be reached, Cosmos DB would return stale data (because it uses a local quorum), which violates the consistency it guarantees.
Therefore, Cosmos DB must fail the write request in the face of network partitions.
This is accomplished by the use of a dynamic quorum over the regions when using 3+ regions. When one of the secondary read regions is impacted by a network partition, the service will remove it from the quorum, allowing writes to commit and replicate to the other online region for an RTO of 0.
The primary region periodically gets health signals from all regions that have been created for the account. It also keeps track of the commits that all regions have caught up to. Until the read region that was previously unavailable has caught up to commits it missed out on, it is not marked online. Once fully caught up, it starts accepting new writes from the primary region at steady state and is simultaneously marked available for serving read traffic.
The system I'm working on has multiple environments, each running in separate Azure regions. Our CosmosDB is replicated to these regions and multi-region writes are enabled. We're using the default consistency model (Session).
We have Azure Functions that use the Cosmos DB trigger deployed in all three regions. Currently these use the same lease prefix, which means that only one function processes changes at any given time. I know that we can set each region to have a different lease prefix to enable concurrent processing, but I'd like to solidify my understanding before taking this step.
My question is about the behaviour of the change feed with regard to replication in this scenario. According to this link https://github.com/MicrosoftDocs/azure-docs/issues/42248#issuecomment-552207409 data is first converged on the primary region and then the change feed is updated.
Other resources I've read seem to suggest that each region has its own change feed which is updated upon replication. Also, the previous link recommends only running a change feed processor in the primary region in multi-master setups.
In an ideal world, I'd like change feed processors in each region to handle local writes quickly. These functions will make updates to Cosmos DB, and I also want to avoid issues with replication. My question is: what is the actual behaviour in a multi-master configuration (and, by extension, the correct architecture)? Is it "safe" to use per-region change feed processors, or should we use a single processor in the primary region?
You cannot have per-region Change Feed Processors that only process the local changes, because the Change Feed in each region contains the local writes plus the replicated writes from every other region.
Technically you can use a single Change Feed Processor deployment, connected to one of the regions, to process the events from all regions.
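To illustrate the lease prefix mechanics you mention, here is a hedged sketch assuming the Azure Functions Cosmos DB trigger extension; the database, collection, and connection setting names are placeholders. Giving each regional deployment its own LeaseCollectionPrefix makes the processors track their progress independently, but each deployment still receives every write in the account, local and replicated:

using System.Collections.Generic;
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class ChangeFeedFunction
{
    // LeaseCollectionPrefix would differ per regional deployment
    // (e.g. "east", "west") so each tracks its own position, but
    // every deployment still sees all writes in the account.
    [FunctionName("ProcessChanges")]
    public static void Run(
        [CosmosDBTrigger(
            databaseName: "mydb",                       // placeholder
            collectionName: "mycoll",                   // placeholder
            ConnectionStringSetting = "CosmosConnection",
            LeaseCollectionName = "leases",
            LeaseCollectionPrefix = "east")]
        IReadOnlyList<Document> changes,
        ILogger log)
    {
        foreach (var doc in changes)
        {
            log.LogInformation($"Processing change for document {doc.Id}");
        }
    }
}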
Is it an anti-pattern if multiple microservices read/write to/from the same DynamoDB table?
Please note that:
We have a fixed schema which will not change anytime soon.
All reads/writes will go through REST services via API Gateway + Lambda
Microservices are deployed on Kubernetes or Lambda
When you write microservices, it is advised not to share databases. That advice exists because separate databases allow easy scaling and give each service its own say over its set of data. It also lets each service decide how its data is kept and change that at will. This gives services flexibility.
Even if your schema is not changing, you can be sure that one service being throttled will not impact the others.
If all your CRUD calls are channeled through a REST service, you do have a service layer in front of the database, and yes, you can do that under microservices guidelines.
I will still not call the approach an anti-pattern. It's just that collective experience says it's better not to share one database, for some of the reasons I mentioned above.
I have a different opinion here, and I'll call this approach an anti-pattern. A few of the key principles being violated here:
Each microservice should have its own bounded context; here, if all of them share the same data set, their business boundaries are blurred.
The DB is a single point of failure: if the DB goes down, all your services go down.
If you try to scale a single service and spawn multiple instances, it will impact DB performance and eventually affect the performance of the other microservices.
Solution, IMO:
First, analyze whether you have a case for MSA; if your data is very tightly coupled and interdependent, you may not need this architecture.
Explore the CQRS pattern: you might like to have different DBs for read and write and synchronize them via events (see the sketch below).
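To make that concrete, a hypothetical sketch of event-based synchronization on AWS: a Lambda tails the write table's DynamoDB Stream and projects changes into a separate read-model table. The table name and attribute keys are assumptions, not anything from the question.

using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;
using Amazon.Lambda.DynamoDBEvents;

public class ReadModelProjector
{
    private readonly AmazonDynamoDBClient _dynamo = new AmazonDynamoDBClient();

    // Invoked by the write table's DynamoDB Stream.
    public async Task Handler(DynamoDBEvent dynamoEvent)
    {
        foreach (var record in dynamoEvent.Records)
        {
            if (record.EventName.ToString() == "REMOVE")
                continue; // this sketch ignores deletes

            var image = record.Dynamodb.NewImage;

            // Project only the fields the read side needs.
            await _dynamo.PutItemAsync(new PutItemRequest
            {
                TableName = "orders-read-model", // assumed read-model table
                Item = new Dictionary<string, AttributeValue>
                {
                    ["orderId"] = new AttributeValue { S = image["orderId"].S }, // assumed keys
                    ["status"]  = new AttributeValue { S = image["status"].S }
                }
            });
        }
    }
}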
Hope this helps!!
I currently have 2 separate microservices monitoring the same unpartitioned CosmosDB collection (let's call it MasterCollection).
I only ever want one microservice to process MasterCollection's change feed at any given time - the rationale for having 2 separate hosts monitor the same feed basically boils down to redundancy.
This is what I'm doing in code (note that only the first hostName param differs - every other param is identical):
microservice #1:
ChangeFeedEventHost host = new ChangeFeedEventHost("east", monitoredCollInfo, leaseCollInfo, feedOptions, feedHostOptions);
microservice #2:
ChangeFeedEventHost host = new ChangeFeedEventHost("west", monitoredCollInfo, leaseCollInfo, feedOptions, feedHostOptions);
My testing seems to indicate that this works (only one of them processes the changes), but is this good practice?
The Change Feed Processor library has a load balancing mechanism that will share the load between them as explained here.
Sharing the load means that they will distribute the Leases among themselves. Each Lease represents a Partition Key Range in the collection.
In your scenario, where you are creating 2 hosts for the same monitored collection using the same lease collection, the effect is that they will share the load: each will hold half of the leases and process changes only for those Partition Key Ranges. So half of the partitions will be processed by the host named west and half by the one named east. If the collection is single-partition, one of them will process all the changes while the other sits doing nothing.
If what you want is for both to process all the changes independently, you have multiple options:
Use a different lease collection for each Host.
Use the LeasePrefix option in the ChangeFeedHostOptions that can be set on Host creation. That lets both hosts share the lease collection while tracking their progress independently; just keep in mind that the RU usage in the lease collection will rise depending on the amount of activity in your main collection (see the sketch below).
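A hedged sketch of option 2, reusing the variables from your snippet; LeasePrefix comes from ChangeFeedHostOptions, and the prefix values here are arbitrary:

// Each host gets its own lease prefix, so both process all changes
// independently while sharing a single lease collection.
var eastOptions = new ChangeFeedHostOptions { LeasePrefix = "east-" };
var westOptions = new ChangeFeedHostOptions { LeasePrefix = "west-" };

var eastHost = new ChangeFeedEventHost("east", monitoredCollInfo, leaseCollInfo, feedOptions, eastOptions);
var westHost = new ChangeFeedEventHost("west", monitoredCollInfo, leaseCollInfo, feedOptions, westOptions);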
In the Riak documentation, there are often examples suggesting you could model your e-commerce datastore in a certain way. But here it is written:
In a production Riak cluster being hit by lots and lots of concurrent writes,
value conflicts are inevitable, and Riak Data Types
are not perfect, particularly in that they do not guarantee strong
consistency and in that you cannot specify the rules yourself.
From http://docs.basho.com/riak/latest/theory/concepts/crdts/#Riak-Data-Types-Under-the-Hood, last paragraph.
So, is it safe enough to use Riak as the primary datastore in an e-commerce app, or is it better to use another database with stronger consistency?
Riak out of the box
In my opinion, out of the box Riak is not safe enough to use as the primary datastore in an e-commerce app. This is because of the eventually consistent nature of Riak (and of many NoSQL solutions).
According to the CAP theorem, distributed datastores (Riak being one of them) can guarantee at most 2 of:
Consistency (all nodes see the same data at the same time)
Availability (a guarantee that every request receives a response about whether it succeeded or failed)
Partition tolerance (the system continues to operate despite arbitrary partitioning due to network failures)
Riak specifically errs on the side of availability and partition tolerance, making the data held in its datastore eventually consistent.
What Riak can do for an e-commerce app
Using Riak out of the box, it would be a good store for the content about the items being sold in your e-commerce app (content that is generally written once and read a lot is a great use case for Riak). However, values such as:
the count of how many items are left
the money in a user's account
need to be handled carefully in a distributed datastore.
Implementing consistency in an eventually consistent datastore
There are several methods you can use, including:
Implement a serialization method when writing updates to values that need to be consistent (i.e. go through a single, controlled service that guarantees it will only update a given item sequentially); this would need to be done outside of Riak, in your API layer
Change the replication properties of your consistent buckets so that you can 'guarantee' you never retrieve out-of-date data:
At the bucket level, you can choose how many copies of data you want to store in your cluster (N, or n_val), how many copies you wish to read from at one time (R, or r), and how many copies must be written to be considered a success (W, or w).
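To make the arithmetic concrete (the values here are illustrative, not from the question): with the default n_val of 3, setting r = 2 and w = 2 gives R + W > N, so every read quorum overlaps every successful write quorum and a read cannot silently miss the latest acknowledged write. Setting r = 1 and w = 1 favours latency instead, but allows stale reads.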
The above method is similar to using the strong consistency model available in the latest versions of Riak.
Important note: in all of these datastore systems (distributed or not) you will generally:
Read the current data
Make a decision based on the current value
Change the data (e.g. decrement the item count)
If these three actions cannot be done atomically (either by locking, or by failing the third step when the value was changed by something else in between), an e-commerce app is open to abuse. This issue exists in traditional SQL storage solutions too, which is why you have SQL transactions. A sketch of such a conditional-write guard follows.
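For contrast, here is a hypothetical sketch of closing that race in a store that supports atomic check-and-set, using DynamoDB (which appears elsewhere in this thread set, not in the Riak docs); the table name and attribute names are assumptions:

using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;

public static class Inventory
{
    private static readonly AmazonDynamoDBClient Client = new AmazonDynamoDBClient();

    public static async Task DecrementStockAsync(string itemId)
    {
        await Client.UpdateItemAsync(new UpdateItemRequest
        {
            TableName = "inventory", // assumed table name
            Key = new Dictionary<string, AttributeValue>
            {
                ["itemId"] = new AttributeValue { S = itemId }
            },
            UpdateExpression = "SET stock = stock - :one",
            // Reject the update atomically if there is no stock left.
            ConditionExpression = "stock >= :one",
            ExpressionAttributeValues = new Dictionary<string, AttributeValue>
            {
                [":one"] = new AttributeValue { N = "1" }
            }
        }); // throws ConditionalCheckFailedException when the guard fails
    }
}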