Each time my ECS Fargate service scales in or out, the CloudWatch graph shows CPU utilization at a very low percentage (about 2-3%), and the same for memory. It then increases gradually, although before that time, it's quite high (policies: 80% for scaling out, 40% for scaling in).
I just worry if there is any unavailable period (or break-time) when it's scaling?
I just worry if there is any unavailable period (or break-time) when it's scaling?
Technically, at the Fargate level, the answer is no as long as you have your service set with a minimum task count >= 1.
The wiggle room in saying "no", though, is that if your app spikes from 70% CPUUtilization to 100%, the application itself may become unresponsive before CloudWatch is able to trigger an alarm, which in turn triggers the service to scale out.
although before that time, it's quite high
Keep in mind, scaling actions aren't instantaneous. If you're using a CloudWatch alarm on CPUUtilization with a period of 60 seconds and 2 evaluation periods, your task would have to be above 80% utilization for 2 consecutive minutes before autoscaling was triggered.
In addition to this, Fargate startup time is slower than EC2 launch type startup time, because AWS has to do some magic behind the scenes - specifically download the image and also attach an ENI - to make it "serverless".
So, if your application goes over 80% utilization, you wouldn't see it autoscale immediately. This might explain what you're seeing in CloudWatch: the utilization is high enough to trigger scaling, and it drops off to 2% only after the scaling has already been triggered.
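The alarm timing described above can be sketched in a few lines of Python. This is a simplified model, not CloudWatch's actual evaluation logic: it assumes one datapoint per period and that the alarm fires only when `evaluation_periods` consecutive datapoints exceed the threshold. The function name and the sample values are hypothetical.

```python
def first_alarm_index(datapoints, threshold=80.0, evaluation_periods=2):
    """Return the index of the datapoint at which the alarm would fire, or None.

    Simplified assumption: the alarm needs `evaluation_periods` consecutive
    datapoints strictly above `threshold`.
    """
    consecutive = 0
    for i, value in enumerate(datapoints):
        if value > threshold:
            consecutive += 1
            if consecutive >= evaluation_periods:
                return i
        else:
            # A single datapoint at or below the threshold resets the count.
            consecutive = 0
    return None


# Hypothetical CPUUtilization values, one per 60-second period.
cpu = [70, 85, 78, 90, 95, 100]
period_seconds = 60

idx = first_alarm_index(cpu)
if idx is not None:
    print(f"alarm fires about {(idx + 1) * period_seconds} seconds into the series")
```

Note that the short spike to 85% does not count, because the next datapoint falls back below the threshold. Task startup time (image pull, ENI attachment) is then added on top of the alarm delay.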
I have been going over a lot of IIS tuning for my web application and have applied every setting I found online. But I still had some lag after the site sat idle for a while, even after setting the idle time-out to 0 and the idle time-out action to Suspend in the advanced app pool settings. If I also set the CPU Limit to 0 and the Limit Interval to 0, this seems to take care of the problem. What would be the drawbacks of doing so, besides no more CPU monitoring? Could this cause other problems? Also, it seems strange: my Limit was already 0 and I was experiencing lag on pages left idle, yet once I set the Limit Interval to 0 the problem was solved. Why?
Setting the idle time-out to 0 means w3wp.exe keeps running. The fact that disabling the CPU limit works indicates that recovering from idle still consumes CPU.
But disabling the CPU limit is obviously not a good idea: the CPU limit protects the operating system from crashing. You would rather crash a website than the entire system. I recommend giving it a higher limit instead.
You can also try setting Start Mode to AlwaysRunning in the application pool's advanced settings to see if it works for you. Another thing to consider is session state, which has its own idle time-out configuration. Set a higher value in case the application does CPU-intensive work when session state is lost.
Finally, you may consider finding the cause with the performance profiler in Visual Studio.
Say my ADX cluster is set to Optimized Autoscale in the range [10,50] and the cluster is currently running at 30 nodes. Let's say cache utilization for my ADX cluster is 80% as of now. Considering these factors, if I now change the cache policy for one of the tables and my cache utilization suddenly goes to 120%, how much time will the Autoscale feature take to start a scale-up operation on the cluster? And what is the cache utilization threshold for autoscale to kick in?
It will wait for an hour before adding more nodes (to ensure that it is not a transient issue); see the docs for more. Note that it takes a few additional minutes for the nodes to be added to the cluster.
The target cache utilization is 80%.
I have a cluster of Elastic Beanstalk instances running WordPress. RAM utilization is high, around 93%, because of some plugins that we're running (a different issue altogether). However, every time one of the instances hits 90+% RAM utilization, Elastic Beanstalk puts the instance into a "degraded" state and kicks it out of the cluster.
I can't for the life of me find a way to modify this RAM utilization check. Ideally I'd be able to bump the RAM utilization threshold up to 95% or so. How do I change this metric, disable it, or remove it altogether?
EDIT
Here is a screenshot of the Elastic Beanstalk Health screen. I may have jumped to conclusions in assuming it's a CloudWatch metric, but the concept is still the same: how do I raise the threshold?
Of particular note, the instance status and the message beneath it.
Should I take into consideration CPU utilization, network traffic, or HTTP response time checks? I've run some tests with ApacheBench (from the same server, e.g. ab -k -n 500000 -c 100 http://192.XXX.XXX.XXX/) and monitored the load average. Even when the load was between 1.0 and 1.50 (one-core server), the mean "time per request" was pretty solid: 140 ms for a simple dynamic page with one set/get Redis operation. Anyway, I'm confused, as the general advice is to launch a new instance when you pass the 70% CPU utilization threshold.
70% CPU utilization is a good rule of thumb for CPU-bound applications like nginx. CPU time is kind of like body temperature: it actually hides a lot of different things, but is a good general indicator of health. Load average is a separate measure of how many processes are waiting to be scheduled. The reason the rule is 70% (or 80%) utilization is that, past this point, CPU-bound applications tend to suffer contention-induced latency and non-linear performance.
You can test this yourself by plotting throughput and latency (median and 90th percentile) against CPU utilization on your setup. Finding the inflection point for your particular system is important for capacity planning.
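As a rough sketch of that analysis, the Python below groups load-test samples into CPU utilization buckets and reports the median and 90th-percentile latency per bucket, which you could then plot or eyeball for the inflection point. The sample data, the function name `latency_by_cpu_bucket`, and the 10-point bucket width are all assumptions for illustration, not measurements from a real system.

```python
from collections import defaultdict
from statistics import median, quantiles


def latency_by_cpu_bucket(samples, bucket_width=10):
    """Group (cpu_percent, latency_ms) samples into CPU buckets.

    Returns {bucket_start: (median_latency, p90_latency)}.
    """
    buckets = defaultdict(list)
    for cpu_percent, latency_ms in samples:
        # Clamp so that 100% falls into the top bucket (e.g. 90-100).
        bucket = min(int(cpu_percent // bucket_width) * bucket_width,
                     100 - bucket_width)
        buckets[bucket].append(latency_ms)

    summary = {}
    for bucket in sorted(buckets):
        values = buckets[bucket]
        # quantiles() needs at least two points; the last decile cut is p90.
        p90 = quantiles(values, n=10)[-1] if len(values) >= 2 else values[0]
        summary[bucket] = (median(values), p90)
    return summary


# Hypothetical samples: (CPU utilization in %, request latency in ms).
samples = [(65, 100), (68, 120), (85, 300), (88, 500), (89, 400)]

for bucket, (med, p90) in latency_by_cpu_bucket(samples).items():
    print(f"{bucket}-{bucket + 10}% CPU: median={med:.0f} ms, p90={p90:.0f} ms")
```

With real data you would want many samples per bucket; percentile estimates from a handful of points are not meaningful.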
A very good writeup of this phenomenon is given in Facebook's original paper on Dyno, their system for measuring throughput of PHP under load.
https://www.facebook.com/note.php?note_id=203367363919
We have a requirement to handle 10000 concurrent users.
Let me explain the system. The machine has two processors. processModel in machine.config is set to autoConfig = true, which makes maxWorkerThreads = 20.
When I load test my case with 30 users and watch CPU usage, it maxes out at 100%, and the number of threads in w3wp.exe is more than 70, whereas my default should be 20 * 2 (CPUs) = 40.
Once the CPU touches 100%, most of the transactions fail or take the maximum time to respond.
Now the questions:
1. How do I get 30 more threads assigned to the same worker process?
2. How can I reduce CPU usage here?
You have a bit of an issue here. Increasing the # of threads will further increase CPU usage. (Your 2 goals are incompatible.) Not only are you asking each CPU to do more, but you'll have additional overhead with context switching.
You could investigate using a non-blocking IO library, which would essentially mean only 1-2 threads per CPU. However, this could be a significant architecture change to your project (probably not feasible) - and what you might actually find is that most of the CPU was actually spent due to the work your code is performing, and not because of anything threading-related.
It sounds like you need to do some performance tuning and optimization of your application.
You should take a look at making async calls so that your threads are not remaining active while the caller is waiting for a response.
http://msdn.microsoft.com/en-us/magazine/cc163463.aspx
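To illustrate the idea of async calls, here is a minimal sketch in Python's asyncio rather than ASP.NET, so it shows the concept and not the API the linked article covers. `handle_request`, `serve_many`, and the 0.2-second `backend_delay` are hypothetical stand-ins for a handler that waits on a database or web service.

```python
import asyncio
import threading
import time


async def handle_request(request_id, backend_delay=0.2):
    # Stand-in for awaiting a database or HTTP call. While waiting,
    # the thread is released to work on other requests.
    await asyncio.sleep(backend_delay)
    return request_id, threading.get_ident()


async def serve_many(count):
    # Start all requests concurrently on the same event loop.
    return await asyncio.gather(*(handle_request(i) for i in range(count)))


def run_demo(count=50):
    started = time.perf_counter()
    results = asyncio.run(serve_many(count))
    elapsed = time.perf_counter() - started
    thread_ids = {thread_id for _, thread_id in results}
    return len(results), len(thread_ids), elapsed


if __name__ == "__main__":
    handled, threads_used, elapsed = run_demo()
    print(f"{handled} requests on {threads_used} thread(s) in {elapsed:.2f}s")
```

All requests complete on a single thread in roughly the time of one backend call, because no thread sits blocked while waiting. This only helps when the time is spent waiting on I/O; if the CPU is busy running your own code, async calls will not reduce the load.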