I am trying to stress test a IIS running a AspNet core App.
to do this i setup a Thread Group with 100 workers
In the thread group I use a Loop Controller
in the loop controller I use a Access Log sampler in order to replay real Get requests obtained from NCSA formatted logfile.
I am amazed to see that i obtain as total throughput only 100 request per sec.
how can i check if this is a limitation of jmeter or if this is a limitation of my web App ?
I would expect jmeter to blast my server and see target CPU shoot at 100%. or shall I increas again already high value of 100 threads ?
Total throughput is 86 requests per second
100 users might be not enough to "blast" your IIS instance, I would rather recommend going for stress test, i.e. start with 1 user and gradually increase the load until throughput starts decreasing or errors start occurring, whatever comes the first. Moreover it might be the case that the bottleneck is not CPU usage, it may be somewhere else, in case of incorrect configuration or inefficient algorithms the web application may not fully utilize underlying OS and hardware resources
Don't use GUI mode for testing, it's only for test development and debugging, when it comes to execution you should be running JMeter tests in command-line non-GUI mode
There are 464 errors, check jmeter.log file for any suspicious entries
I don't think you can really replay your access log, it can be used only for something simple like static HTML pages, if there is authentication or dynamic parameters it might be the case all your requests are hitting the same login page, you can try running your test using View Results Tree listener and inspect the responses to ensure that your test is doing what it's supposed to be doing
Related
I want to create a load test for a feature of my app. It’s using a Google App Engine and a VM. The user sends HTTP requests to the App Engine. It’s realistic that this Engine gets thousands of requests in a few seconds. So I want to create a load test, where I send 20.000 - 50.000 in a timeframe of 1-10 seconds.
How would you solve this problem?
I started to try using Google Cloud Task, because it seems perfect for this. You schedule HTTP requests for a specific timepoint. The docs say that there is a limit of 500 tasks per second per queue. If you need more tasks per second, you can split this tasks into multiple queues. I did this, but Google Cloud Tasks does not execute all the scheduled task at the given timepoint. One queue needs 2-5 minutes to execute 500 requests, which are all scheduled for the same second :thinking_face:
I also tried a TypeScript script running asynchronous node-fetch requests, but I need for 5.000 requests 77 seconds on my macbook.
I don't think you can get 50.000 HTTP requests "in a few seconds" from "your macbook", it's better to consider going for a special load testing tool (which can be deployed onto GCP virtual machine in order to minimize network latency and traffic costs)
The tool choice is up to you, either you need to have powerful enough machine type so it would be able to conduct 50k requests "in a few seconds" from a single virtual machine or the tool needs to have the feature of running in clustered mode so you could kick off several machines and they would send the requests together at the same moment of time.
Given you mention TypeScript you might want to try out k6 tool (it doesn't scale though) or check out Open Source Load Testing Tools: Which One Should You Use? to see what are other options, none of them provides JavaScript API however several don't require programming languages knowledge at all
A tool you could consider using is siege.
This is Linux based and to prevent any additional cost by testing from an outside system out of GCP.
You could deploy siege on a relatively large machine or a few machines inside GCP.
It is fairly simple to set up, but since you mention that you need 20-50k in a span of a few seconds, siege by default only allows 255 requests per second. You can make this larger, though, so it can fit your needs.
You would need to play around on how many connections a machine can establish, since each machine will have a certain limit based on CPU, Memory and number of network sockets. You could just increase the -c number, until the machine gives an "Error: system resources exhausted" error or something similar. Experiment with what your virtual machine on GCP can handle.
I am running a load test on an API using JMeter. When I host the API on the same pc as the test (the database is remote though) I get ok results.
However, when I tried running the load test through the same API but hosted on a different pc on the same network, I got this wavy pattern in my test results.
Each of the four grouped lines are response times for a particular API endpoint and the blue line is active thread count.
The question is: does this wavy pattern mean anything? This pattern isn't visible when the API is hosted on the same machine as the test.
The results are very different and I am thinking this pattern might be correlated to the problem.
I used 200 active threads and no specific configuration which would produce the requests in this pattern.
You need pay attention to the following points:
Connect Time and Latency metrics, Elapsed Time is a sum of Connect Time, Latency and the actual server response time so these "waves" might be caused by networking issues.
It might be indicating the application under tests is doing i.e. garbage collection or using swap file which is much slower than memory due to lack of resources Make sure that it has enough headroom to operate in terms of CPU, RAM, Network and Disk IO. These metrics can be checked using i.e. JMeter PerfMon Plugin. The same is applicable for JMeter, if JMeter will not be able to send requests fast enough - you will see throughput dropdowns.
The most efficient way to get to the bottom of the issue is running your application under profiling tool telemetry, this will allow you to
identify the heaviest functions, largest objects in heap, etc.
Consider checking your database as well and detect slow queries as the issue might be caused by database issues (including networking layer)
I'm executing a load test against an application hosted in Azure. It's a cloud service with 3 instances behind an internal load balancer (Hash based load balancing mode).
When I execute the load test, it queues request even though the req/sec and total current request to IIS is quite low. I'm not sure what could be the problem.
Any suggestions?
Adding few screenshot of performance counters which might help you take decision.
Click on image to view original image.
Edit-1: Per request from Rohit Rajan,
Cloud Service is having 2 instances (meaning 2 VMs), each of them having 14 GBs of RAM and 8 cores.
I'm executing a Step load pattern start with 100 and add 100,150 user every 5 minutes, till 4-5 hours until the load reaches to 10,000 VUs.
Any call to external system are written async. Database calls are synchronous.
There is no straight forward answer to your question. One possible way would be to explore additional investigation options.
Based on your explanation, there seems to be a bottleneck within the application which is causing the requests to queue-up.
In order to investigate this, collect a memory dump when you see the requests queuing up and then use DebugDiag to run a hang analysis on it.
There are several ways to gather the memory dump.
Task Manager
Procdump.exe
Debug Diagnostics
Process Explorer
Once you have the memory dump you can install debug diag and then run analysis on it. It will generate a report which can help you get started.
Debug Diagnostics download: https://www.microsoft.com/en-us/download/details.aspx?id=49924
We have a fairly popular site that has around 4 mil users a month. It is hosted on a Dedicated Box with 16 gb of Ram, 2 procc with 24 cores.
At any given time the CPU is always under 40% and the memory is under 12 GB but at the highest traffic we see a very poor performance. The site is very very slow. We have 2 app pools one for our main site and one for our forum. Only the site is being slow. We don't have any restrictions on cpu or memory per app pool.
I have looked at he Performance counters and I saw something very interesting. At our peek time for some reason Request are being queued. Overall context switching numbers are very high around 30 - 110 000 k.
As i understand high context switching is caused by locks. Can anyone give me an example code that would cause a high number of context switches.
I am not too concerned with the context switching, and i don't think the numbers are huge. You have a lot of threads running in IIS (since its a 24 core machine), and higher context switching numbers re expected. However, I am definitely concerned with the request queuing.
I would do several things and see how it affects your performance counters:
Your server CPU is evidently under-utilized, since you run below 40% all the time. You can try to set a higher value of "Threads per processor limit" in IIS until you get to a 50-60% utilization. An optimal value of threads per core by the books is 20, but it depends on the scenario, and you can experiment with higher or lower values. I would recommend trying setting a value >=30. Low CPU utilization can also be a sign of blocking IO operations.
Adjust the "Queue Length" settings in IIS properties. If you have configured the "Threads per processor limit" to be 20, then you should configure the Queue Length to be 20 x 24 cores = 480. Again, if the requests are getting Queued, that can be a sign that all your threads are blocked serving other requests or blocked waiting for an IO response.
Don't serve your static files from IIS. Move them to a CDN, amazon S3 or whatever else. This will significantly improve your server performance, because 1,000s of Server requests will go somewhere else! If you MUST serve the files from IIS, than configure IIS file compression. In addition use expire headers for your static content, so they get cached on the client, which will save a lot of bandwidth.
Use Async IO wherever possible (reading/writing from disk, db, network etc.) in your ASP.NET controllers, handlers etc. to make sure you are using your threads optimally. Blocking the available threads using blocking IO (which is done in 95% of the ASP.NET apps i have seen in my life) could easily cause the thread pool to be fully utilized under heavy load, and Queuing would occur.
Do a general optimization to prevent the number of requests that hit your server, and the processing time of single requests. This can include Minification and Bundling of your CSS/JS files, refactoring your Javascript to do less roundtrips to the server, refactoring your controller/handler methods to be faster etc. I have added links below to Google and Yahoo recommendations.
Disable ASP.NET debugging in IIS.
Google and Yahoo recommendations:
https://developers.google.com/speed/docs/insights/rules
https://developer.yahoo.com/performance/rules.html
If you follow all these advices, i am sure you will get some improvements!
I've got an asp.net web page that is making 7 async requests to a WCF service on another server. Both boxes are clean without anything else installed.
I've also increased maxconnections in web.config to 20.
I run a single call through the system and the page returns in 800ms. The long and short of it is I think that the threadpool is being being overwhelmed as, once placed underload I cannot get more that 8 requests per second, even though both quad core boxes are running at 20% CPU load and the sql server it's connected to is returning the querys in under 10ms per call.
I've changed the service behaviour to concurrency.multiple but that's not seeming to help.
Any ideas anyone.
There are many different factors that could be in play here. Taking a stab at the remark that changing your instancing model on the service had zero effect (big IF here) then its possible the 'bottleneck' is upstream from the service. Either at the web server, or the client load generator.
You've got several areas to review for tuning: client, web server, wcf service server - that's assuming there are no network devices in the middle. Pick an end and work towards the other end. Since I'm already making an assumption that its not the service, then I'd start at the client and work my way towards the wcf service.
Client
What machine is driving the load against the web server? A laptop? A desktop? A dedicated test agent, or a shared one? The client acting as the load generator for purposes of this test is also susceptible to maxConnections limitation as this is a client setting.
What is the CPU utilization of the client generating load? Could it be that the test driver is just unable to generate enough load to push these boxes? Can you add additional test clients to your test?
Web Server
What does the system.net/processModel element look like in machine.config on the ASP.NET web server? Try setting autoConfig = true. This will allow the configuration to auto size based on the 'size' of the machine its running on.
WCF Service
Review WCF service for any throttling defaults that might be in play and tweak appropriately. See ServiceThrottlingBehavior on MSDN.
Let us know any changes in behavior you might observe (if any) if you make any changes!
The real answer here that everyone missed is that you're using an ASP.NET web page. That means your client is some form of web browser. Modern web browsers have a limit of 2 concurrent async requests at any time. This means that 5 of your requests were queued up and waiting for the first two to finish. Once those first two, it served the next two, then the next two, then the last one.
All of these round-trips and handshakes simply take time. I'm guessing that your roundtrip time is around 200ms, unfortunately you have to do it 4 times.
I also really dislike the "max 2" browser limitation on making webservice calls.
Is this service hosted in IIS, WAS or a Windows Service?
You should try to set Windows to run services on a higher priority. Your WCF Service is probably creating the threads it needs but they should be running at a low priority.
Hope that helps.