I want to know performance-related data for Google Cloud Datastore, such as QPS and maximum throughput.
Any information on these figures, and on the factors they depend on, would also be appreciated.
No, there is not a performance SLA for Cloud Datastore, only an availability SLA.
It is unlikely that QPS would be an issue for you, as there are almost certainly projects running larger QPS and higher throughput against it than you are likely to have. A correctly designed system can run millions of QPS against Datastore. (If you're seriously looking at a 1M+ QPS load on Datastore, we should talk.)
As a historical reference, the team announced handling 4.5 trillion transactions per month back in mid 2013 (that's a sustained average of about 1.7M QPS over every second of the month). You can assume that it has grown significantly since then.
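How far QPS scales in practice depends mostly on how evenly writes are spread across entity groups (the classic guidance is on the order of one write per second per entity group). As a minimal sketch, assuming the Python google-cloud-datastore client, here is the standard sharded-counter pattern for spreading writes; the kind name, shard count and property names are illustrative, not something from the answer above:

    import random
    from google.cloud import datastore

    NUM_SHARDS = 20  # illustrative; more shards allow more sustained write QPS
    client = datastore.Client()

    def increment_counter(name: str) -> None:
        # Pick a random shard so writes land on many entity groups instead of one.
        shard_key = client.key("CounterShard", f"{name}-{random.randint(0, NUM_SHARDS - 1)}")
        with client.transaction():
            shard = client.get(shard_key) or datastore.Entity(key=shard_key)
            shard["count"] = shard.get("count", 0) + 1
            client.put(shard)

    def get_total(name: str) -> int:
        # Reading the counter means summing all shards.
        keys = [client.key("CounterShard", f"{name}-{i}") for i in range(NUM_SHARDS)]
        return sum(e["count"] for e in client.get_multi(keys) if e)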
This document on Cosmos DB consistency levels and latency says Cosmos DB writes have a latency of 10 ms at the 99th percentile. Does this include the time it takes for the write to reach Cosmos DB? I suspect not, since if I issue a request far away from my configured Azure regions, I don't see how it can take < 10 ms.
The SLA is for the latency involved in performing the operations and returning results. As you mention, it does not include time taken to reach the Cosmos endpoint, which depends on the client's distance.
As indicated in the performance guidance:
You can get the lowest possible latency by ensuring that the calling application is located within the same Azure region as the provisioned Azure Cosmos DB endpoint.
In my experience latency <10ms is typical for an app located in the same region as the Cosmos endpoint it works against.
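If you want to see how much of your observed latency is network distance rather than server-side work, you can time point reads from a VM in the same region and compare with your remote client. A rough sketch with the Python azure-cosmos SDK, where the account endpoint, key, database, container and item id are all placeholders:

    import time
    from azure.cosmos import CosmosClient

    client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
    container = client.get_database_client("appdb").get_container_client("items")

    samples = []
    for _ in range(100):
        start = time.perf_counter()
        container.read_item(item="item-1", partition_key="item-1")  # point read
        samples.append((time.perf_counter() - start) * 1000)  # milliseconds

    samples.sort()
    print(f"p50={samples[49]:.1f} ms  p99={samples[98]:.1f} ms")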
Started a GCP free trial, migrated two WordPress sites with almost zero traffic to test the service. Here's what I'm running for each of the two sites:
VM: g1-small (1 vCPU, 1.7 GB memory), 10 GB SSD
Package: bitnami-wordpress-5-2-4-1-linux-debian-9-x86-64
After about 1-2 months it seems to show that $46 has been deducted from the $300 free trial credit. Is this accurate / typical? Am I looking at paying $20+ per month to process perhaps 100 hits to the site from myself, plus any normal bot crawling that happens? This is roughly 10 times more expensive than a shared hosting multi domain account available from other web hosts.
Overall, how can I tell how much it will actually cost, when it looks to me that GCP reports about $2 of resource consumption per month, a $2 credit, and somehow a $254 balance from $300? Also GCP says average monthly cost is 17 cents on one of the billing pages, which is different from the $2 and the $46 figures. I can't find any entry that would explain all the other resources that were paid/credited.
Does anyone else have experience how much it should cost to run the Bitnami WordPress package provided on GCP marketplace?
Current Usage:
Running 2x g1-small (1 vCPU, 1.7 GB memory) instances with 10 GB SSD 24x7 should have deducted around ~$26* USD from your free trial.
I presume you also need MySQL, which would cost you a minimum of $7.67* per instance.
Assuming you used 2x MySQL instances, that would have cost you ~$15.
So $26 compute + $15 DB + $5 (other network, DNS costs, etc.) comes to about $46; a rough sketch of that arithmetic follows the footnote below. Note that the effective price goes up if you use the compute for less than a full month.
* As you can see from the image, you can get a sustained use discount if you run the instance for a full month; if you plan to use it for even longer, you can get a bigger discount with committed use.
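As mentioned above, here is that arithmetic as a small Python sketch, using the approximate per-instance figures from this answer rather than official GCP rates:

    # Approximate monthly figures from the breakdown above, not official pricing.
    compute_per_instance = 13.0   # ~$26 for two g1-small instances with sustained use
    mysql_per_instance = 7.67     # minimum Cloud SQL MySQL instance
    network_and_dns = 5.0         # other network, DNS, etc.

    instances = 2
    estimate = instances * (compute_per_instance + mysql_per_instance) + network_and_dns
    print(f"~${estimate:.0f}/month")  # ~$46/month, matching the deducted credit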
Optimise for Cost
Have a look at the cost calculator link to plan your usage.
https://cloud.google.com/products/calculator/
Compute and relational storage are the most cost-prohibitive factors for you. If you are tech-savvy and open to experimentation, you can try Cloud Run, which should reduce your cost significantly but might add extra latency when serving requests. The link below shows how to set this up:
https://medium.com/acadevmy/how-to-install-a-wordpress-site-on-google-cloud-run-828bdc0d0e96
Currently there is no way around using a database. A serverless database could help bring down your cost, but GCP does not offer this at this point. AWS has such an offering, so GCP might add one in the future.
Scalability
When your user base grows you might want to use a CDN, which would help with your network cost.
Serving images from Cloud Storage would also help bring down your cost, as disks are more expensive, less scalable, and require more maintenance.
Hope this helps.
We had a period of latency in our application that was directly correlated with latency in DynamoDB and we are trying to figure out what caused that latency.
During that time, the consumed reads and consumed writes for the table were normal (much below the provisioned capacity) and the number of throttled requests was also 0 or 1. The only thing that increased was the SuccessfulRequestLatency.
The high latency occurred during a period where we were doing a lot of automatic writes. In our use case, writing to DynamoDB also includes some reading (to get any existing records). However, we often write the same quantity of data in the same period of time without causing any increased latency.
Is there any way to understand what contributes to an increase in SuccessfulRequestLatency when it seems that we have provisioned enough read capacity? Is there any way to diagnose the latency caused by this set of writes to DynamoDB?
You can dig deeper by checking the Get Latency and Put Latency in CloudWatch.
As you have already mentioned, there was no throttling, your writes involve some reading as well, and the same writes at other periods of time don't cause any latency, so you should check what exactly in the read operations is causing this.
Check the SuccessfulRequestLatency metric while including the Operation dimension as well. Start with GetItem and BatchGetItem; if that doesn't help, include Scan and Query as well.
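A quick way to pull those per-operation numbers is the CloudWatch API via boto3; in this sketch the table name and time window are placeholders you would replace with your own:

    from datetime import datetime, timedelta, timezone
    import boto3

    cloudwatch = boto3.client("cloudwatch")
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=6)  # cover the window where latency spiked

    for operation in ["GetItem", "BatchGetItem", "Query", "Scan", "PutItem", "UpdateItem"]:
        stats = cloudwatch.get_metric_statistics(
            Namespace="AWS/DynamoDB",
            MetricName="SuccessfulRequestLatency",
            Dimensions=[
                {"Name": "TableName", "Value": "my-table"},  # placeholder table name
                {"Name": "Operation", "Value": operation},
            ],
            StartTime=start,
            EndTime=end,
            Period=300,
            Statistics=["Average", "Maximum"],
        )
        datapoints = stats["Datapoints"]
        if datapoints:
            worst = max(p["Maximum"] for p in datapoints)
            print(f"{operation}: worst 5-minute maximum = {worst:.1f} ms")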
High request latency can sometimes happen when DynamoDB is doing an internal failover of one of its storage nodes.
Internally within Dynamo each storage partition has to be replicated across multiple nodes to provide a high level of fault tolerance. Occasionally one of those nodes will fail and a replacement node has to be introduced, and this can result in elevated latency for a subset of affected requests.
The advice I've had from AWS is to use a short timeout and a fast retry (e.g. 100ms) if your use-case is latency-sensitive. It's my understanding that only requests that hit the affected node experience increased latency, so within one or two retries you'll hit a different node and get a successful response, with minimal impact on your overall latency. Obviously it's hard to verify this, because it's not a scenario you can reproduce!
If you've got a support contract with AWS, it's well worth submitting a support ticket from the AWS console when events like this happen. They are usually able to provide an insight into what actually happened.
Note: If you're doing retries, remember to use exponential backoff to reduce the risk of throttling.
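As a sketch of that short-timeout, fast-retry advice with boto3 (the exact timeout values are assumptions to tune against your own latency profile, not numbers from AWS):

    import boto3
    from botocore.config import Config

    latency_sensitive = Config(
        connect_timeout=0.5,
        read_timeout=0.2,  # give up quickly on a request stuck on a slow storage node
        retries={"max_attempts": 4, "mode": "standard"},  # standard mode backs off exponentially
    )

    table = boto3.resource("dynamodb", config=latency_sensitive).Table("my-table")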
I am seeing some throttles on my updates to a DynamoDB table. I know that throttling works on a per-second basis and that peaks above provisioned capacity can sometimes be absorbed, but this is not guaranteed. I also know that one is supposed to evenly distribute the load, which I have not done.
BUT please look at the 1-minute average graphs from the metrics (attached). The utilized capacity is way below the provisioned capacity. Where are these throttles coming from? Is it because all writes went to a particular shard?
There are no batch writes. The workload distribution is something I cannot easily control.
DynamoDB is built on the assumption that to get the full potential out of your provisioned throughput your reads and writes must be uniformly distributed over space (hash/range keys) and time (not all coming in at the exact same second).
Based on the allocated throughput in your graphs you are still most likely at one shard, but it is possible that there are two or more shards if you have previously raised the throughput above the current level and lowered it back down to where it is now. While this is something to be mindful of, it likely is not what is causing this throttling behavior directly. If you have a lot of data in your table, over 10 GB, then you will definitely have multiple shards. That would mean you likely have a lot of cold data in your table, which could be causing this issue, but that seems less likely.
The most likely issue is that you have some hot keys. Specifically, you have one or just a few records that are receiving a very high number of read or write requests, and this is resulting in throttling. Essentially, DynamoDB can support massive IOPS for both writes and reads, but you can't apply all of those IOPS to just a few records; ideally they need to be distributed uniformly across all of the records.
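If your access pattern allows it, one common mitigation is write sharding: append a random (or calculated) suffix to the hot partition key so the load lands on several partitions, and fan reads out over the shards. A sketch with boto3, where the table name, key schema and shard count are assumptions for illustration:

    import random
    import time
    import boto3
    from boto3.dynamodb.conditions import Key

    table = boto3.resource("dynamodb").Table("events")  # assumes hash key "pk" and range key "ts"
    WRITE_SHARDS = 10  # illustrative shard count

    def put_event(hot_id: str, payload: dict) -> None:
        # Spread writes for one hot logical key across several partition key values.
        sharded_pk = f"{hot_id}#{random.randint(0, WRITE_SHARDS - 1)}"
        table.put_item(Item={"pk": sharded_pk, "ts": int(time.time() * 1000), **payload})

    def get_events(hot_id: str) -> list:
        # Reads have to fan out across all shards and merge the results.
        items = []
        for shard in range(WRITE_SHARDS):
            resp = table.query(KeyConditionExpression=Key("pk").eq(f"{hot_id}#{shard}"))
            items.extend(resp["Items"])
        return items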
Since the number of throttles you were showing is on the order of tens to hundreds, it may not be something to worry about. As long as you are using the official AWS SDK, it will automatically take care of retries with exponential backoff, retrying requests several times before completely giving up.
While it is difficult in many circumstances to control the distribution of reads and writes to a table, it may be worth taking another look at your hash/range key design to make sure it is really optimal for your pattern of reads and writes. Also, for reads you may employ caching through Memcached or Redis, even if the cache expires after just a few minutes or seconds, to help reduce the impact of hot keys. For writes you would need to look at the logic in the application to make sure there are not any unnecessary writes being performed that could be causing this issue.
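For the read side, that caching suggestion can be as small as a read-through wrapper with a short TTL; a sketch using redis-py, where the table name, key attribute and TTL are illustrative:

    import json
    import boto3
    import redis

    table = boto3.resource("dynamodb").Table("users")  # placeholder table with key "user_id"
    cache = redis.Redis(host="localhost", port=6379)
    CACHE_TTL_SECONDS = 30  # even a short TTL blunts a hot read key

    def get_user(user_id: str) -> dict:
        # Read-through cache: serve hot reads from Redis, fall back to DynamoDB on a miss.
        cached = cache.get(f"user:{user_id}")
        if cached is not None:
            return json.loads(cached)
        item = table.get_item(Key={"user_id": user_id}).get("Item", {})
        if item:
            # default=str keeps the sketch simple; DynamoDB numbers come back as Decimal.
            cache.setex(f"user:{user_id}", CACHE_TTL_SECONDS, json.dumps(item, default=str))
        return item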
One last point related to batch writes: A batch operation in DynamoDB does not reduce the consumed amount of read or writes the different child requests consume, it simply reduces the overhead of making multiple HTTP requests. While batch requests generally help with throughput, they are not useful at reducing the likelihood of throttling in DynamoDB.
I have a collection of SOA components that can handle a series of business processes. For example one SOA component imports user data, another runs analytics on it.
I'm familiar with business process modeling for manufacturing, i.e. calculating WIP, throughput, cycle times, utilization etc. for each process. Little's Law, theory of constraints, etc.
Can I apply this approach to capacity planning for my SOA architecture, or is there a more rigorous / more widely accepted approach?
A bit of a broad question. Here are some guidelines, but there is no single perfect answer.
What you are looking for is Business Activity Monitoring used together with performance metrics reported from your servers.
BAM (Business Activity Monitoring) will allow you to measure how many orders per second you are processing, how many sales you have made today, and so on. You then monitor and collect information such as CPU usage, network bandwidth, disk I/O performance, memory usage, and other technical performance metrics. On Windows you can use performance counters for this; in the Linux world there are various tools and techniques you can use.
Using the number of orders placed you can then look at the performance statistics of the systems used by the order placing software to give you some indication of what is happening.
For example: we process 10 orders a second on average, using roughly 8 GB of RAM on the ESB server where the orders service is hosted. We are seeing an average increase of 25% per month in the orders coming through, and we have noticed several alerts about swapping to disk when orders are at their peak. To ensure that we can keep up with demand, we will need to add roughly another 8 GB of memory to the server every 4 months; thus in a year we will need 3x8 GB of extra memory, or 32 GB in total. Now you can decide on the implementation: do you create a cluster of 4 machines with 8 GB of RAM each, or do you load balance?
Using this information you can start to get a good idea of where your limits are and what you need to budget for in the future.
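To tie this back to the manufacturing-style modelling in the question, Little's Law (L = λW) carries over directly to a single SOA component. A rough sketch with illustrative numbers; the service time, worker count and saturation model are assumptions, and only the 10 orders/second and 25% monthly growth come from the example above:

    import math

    arrival_rate = 10.0   # requests per second entering the component (lambda)
    service_time = 0.25   # seconds of work per request (W); assumed for illustration
    workers = 4           # concurrent workers/threads provisioned; assumed

    in_flight = arrival_rate * service_time  # L: average requests inside the component
    utilization = in_flight / workers        # how close the component is to saturation

    monthly_growth = 0.25  # 25% growth per month, as in the example
    months_headroom = math.log(workers / in_flight) / math.log(1 + monthly_growth)

    print(f"in flight: {in_flight:.1f}, utilization: {utilization:.0%}, "
          f"workers saturated in ~{months_headroom:.1f} months")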
Go look at some BAM tools and some monitoring tools and see what suits you.