I want to create a load test for a feature of my app. It uses Google App Engine and a VM. The user sends HTTP requests to App Engine, and it's realistic for App Engine to receive thousands of requests within a few seconds. So I want to create a load test that sends 20,000 to 50,000 requests within a timeframe of 1 to 10 seconds.
How would you solve this problem?
I started by trying Google Cloud Tasks, because it seems perfect for this: you schedule HTTP requests for a specific point in time. The docs say that there is a limit of 500 tasks per second per queue, and that if you need more tasks per second you can split the tasks across multiple queues. I did this, but Google Cloud Tasks does not execute all the scheduled tasks at the given time. One queue needs 2-5 minutes to execute 500 requests that are all scheduled for the same second.
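As a rough worked example of the queue-splitting arithmetic, here is a minimal sketch that takes the 500 tasks per second per queue figure quoted above at face value. The function name `queues_needed` is made up for illustration, and the result is only a lower bound, since (as described above) a queue may dispatch far more slowly in practice.

```python
import math


def queues_needed(total_requests: int, window_seconds: float,
                  per_queue_rate: int = 500) -> int:
    """Lower bound on the number of queues needed to dispatch
    total_requests within window_seconds at per_queue_rate tasks/s each."""
    required_rate = total_requests / window_seconds  # tasks per second overall
    return math.ceil(required_rate / per_queue_rate)


# The two extremes mentioned in the question.
for total, window in [(20_000, 10), (50_000, 1)]:
    print(f"{total} requests in {window}s -> at least "
          f"{queues_needed(total, window)} queues")
```

So the easiest case (20,000 requests over 10 seconds) needs at least 4 queues, and the hardest (50,000 requests in 1 second) at least 100.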
I also tried a TypeScript script running asynchronous node-fetch requests, but 5,000 requests take 77 seconds on my MacBook.
I don't think you can get 50,000 HTTP requests "in a few seconds" from "your MacBook". It's better to consider a dedicated load testing tool, which can be deployed onto a GCP virtual machine in order to minimize network latency and traffic costs.
The choice of tool is up to you. Either you need a machine type powerful enough to send 50k requests "in a few seconds" from a single virtual machine, or the tool needs to support a clustered mode, so that you can start several machines that send their requests together at the same moment.
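To illustrate the clustered option, here is a minimal sketch of how a total request count could be divided between machines that all wait for a shared start time. The helper names `split_requests` and `seconds_until` are hypothetical, and the sketch assumes the machines' clocks are synchronized (for example via NTP); real load testing tools handle this coordination themselves.

```python
import time
from typing import List, Optional


def split_requests(total: int, machines: int) -> List[int]:
    """Divide total requests as evenly as possible across machines."""
    base, extra = divmod(total, machines)
    # The first `extra` machines take one additional request each.
    return [base + 1 if i < extra else base for i in range(machines)]


def seconds_until(start_epoch: float, now: Optional[float] = None) -> float:
    """How long a machine should sleep before firing its share."""
    now = time.time() if now is None else now
    return max(0.0, start_epoch - now)


shares = split_requests(50_000, 8)
print(shares)  # one entry per machine
start_at = time.time() + 30  # agreed start time, 30 s from now
print(f"sleep {seconds_until(start_at):.1f}s, then send")
```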
Given that you mention TypeScript, you might want to try k6 (it doesn't scale, though), or check out "Open Source Load Testing Tools: Which One Should You Use?" to see the other options. None of them provides a JavaScript API, but several don't require any programming knowledge at all.
A tool you could consider using is siege.
It is Linux-based, and running it inside GCP avoids the additional cost of testing from a system outside GCP.
You could deploy siege on a relatively large machine or a few machines inside GCP.
It is fairly simple to set up. Since you mention that you need 20-50k requests in a span of a few seconds, note that siege by default caps the number of concurrent simulated users at 255. You can raise this limit, though, so it can fit your needs.
You would need to experiment with how many connections a machine can establish, since each machine has a certain limit based on CPU, memory and the number of network sockets. You could just increase the -c number until the machine gives an "Error: system resources exhausted" error or something similar. Experiment with what your virtual machine on GCP can handle.
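One of those per-machine limits can be read directly. On Linux every socket consumes a file descriptor, so the open-file limit of the process is one ceiling on concurrent connections. The following is a small sketch using Python's standard `resource` module (Unix only); the `reserved` headroom of 64 descriptors is an arbitrary assumption for illustration.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file-descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def usable_connections(soft_limit: int, reserved: int = 64) -> int:
    """Rough ceiling on sockets, leaving `reserved` descriptors for
    stdin/stdout, log files and so on (the value 64 is an assumption)."""
    return max(0, soft_limit - reserved)


soft, hard = open_file_limits()
if soft == resource.RLIM_INFINITY:
    print("no soft limit on open files")
else:
    print(f"soft={soft} hard={hard} "
          f"-> roughly {usable_connections(soft)} sockets")
```

If the soft limit is well below the concurrency you want, raising it (for example with `ulimit -n`) is usually needed before a higher -c value can work.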
I am trying to stress test an IIS server running an ASP.NET Core app.
To do this, I set up a Thread Group with 100 workers.
In the Thread Group I use a Loop Controller.
In the Loop Controller I use an Access Log Sampler to replay real GET requests obtained from an NCSA-formatted log file.
I am surprised to see that I get a total throughput of only 100 requests per second.
How can I check whether this is a limitation of JMeter or a limitation of my web app?
I would expect JMeter to blast my server and the target's CPU to shoot up to 100%. Or should I increase the already high value of 100 threads even further?
Total throughput is 86 requests per second
100 users might not be enough to "blast" your IIS instance. I would rather recommend a stress test: start with 1 user and gradually increase the load until throughput starts decreasing or errors start occurring, whichever comes first. Moreover, the bottleneck might not be CPU usage; it may be somewhere else. With an incorrect configuration or inefficient algorithms, the web application may not fully utilize the underlying OS and hardware resources.
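The stopping rule described above (throughput starts decreasing or errors start occurring, whichever comes first) can be sketched as a small function over the per-step results. The name `find_saturation` and the sample numbers are made up for illustration; they are not measurements from this test.

```python
from typing import List, Optional, Tuple

# One entry per load step: (users, throughput in requests/s, error count).
Sample = Tuple[int, float, int]


def find_saturation(samples: List[Sample]) -> Optional[int]:
    """Return the user count of the first step where errors appear
    or throughput drops below the previous step, else None."""
    previous_rps = None
    for users, rps, errors in samples:
        if errors > 0:
            return users
        if previous_rps is not None and rps < previous_rps:
            return users
        previous_rps = rps
    return None


# Illustrative numbers only.
steps = [(1, 10.0, 0), (50, 80.0, 0), (100, 86.0, 0), (200, 70.0, 0)]
print(find_saturation(steps))
```

With these illustrative numbers the saturation point is reported at 200 users, because throughput falls from 86 to 70 requests per second at that step.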
Don't use GUI mode for testing; it's only for test development and debugging. When it comes to execution, you should run JMeter tests in command-line non-GUI mode.
There are 464 errors; check the jmeter.log file for any suspicious entries.
I don't think you can really replay your access log. That works only for something simple like static HTML pages. If there is authentication or there are dynamic parameters, all your requests might be hitting the same login page. Try running your test with the View Results Tree listener and inspect the responses to ensure that your test is doing what it's supposed to be doing.
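Before replaying, it can also help to look at what the log actually contains. Below is a minimal sketch that counts GET requests per path in NCSA common log format lines; the regular expression, the function name `path_counts` and the sample lines are assumptions for illustration. If most entries require a session or carry dynamic query parameters, a plain replay is unlikely to exercise the application as real users did.

```python
import re
from collections import Counter

# Matches the quoted request part of a log line, e.g. "GET /index.html HTTP/1.1"
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*"')


def path_counts(lines):
    """Count GET requests per path, ignoring the query string."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group("method") == "GET":
            counts[match.group("path").split("?", 1)[0]] += 1
    return counts


# Hypothetical sample lines.
sample = [
    '10.0.0.1 - - [10/Oct/2020:13:55:36 +0000] "GET /login?next=/home HTTP/1.1" 200 512',
    '10.0.0.2 - - [10/Oct/2020:13:55:37 +0000] "GET /login HTTP/1.1" 200 512',
    '10.0.0.3 - - [10/Oct/2020:13:55:38 +0000] "GET /static/app.css HTTP/1.1" 200 2326',
    '10.0.0.4 - - [10/Oct/2020:13:55:39 +0000] "POST /login HTTP/1.1" 302 0',
]
for path, count in path_counts(sample).most_common():
    print(count, path)
```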
I am running a load test on an API using JMeter. When I host the API on the same pc as the test (the database is remote though) I get ok results.
However, when I tried running the load test through the same API but hosted on a different pc on the same network, I got this wavy pattern in my test results.
Each of the four grouped lines are response times for a particular API endpoint and the blue line is active thread count.
The question is: does this wavy pattern mean anything? This pattern isn't visible when the API is hosted on the same machine as the test.
The results are very different and I am thinking this pattern might be correlated to the problem.
I used 200 active threads and no specific configuration which would produce the requests in this pattern.
You need to pay attention to the following points:
Connect Time and Latency metrics. Elapsed Time includes both Connect Time and Latency (the time to the first byte of the response), so these "waves" might be caused by networking issues.
It might indicate that the application under test is doing, e.g., garbage collection, or is using a swap file, which is much slower than memory, due to a lack of resources. Make sure that it has enough headroom to operate in terms of CPU, RAM, network and disk I/O. These metrics can be checked using, e.g., the JMeter PerfMon Plugin. The same applies to JMeter itself: if JMeter is not able to send requests fast enough, you will see drops in throughput.
The most efficient way to get to the bottom of the issue is to run your application under a profiling tool. This will allow you to identify the heaviest functions, the largest objects in the heap, etc.
Consider checking your database as well and detect slow queries as the issue might be caused by database issues (including networking layer)
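To check the first point, the Connect Time and Latency columns of the results file can be averaged per time bucket and compared with the elapsed time. This is a minimal sketch, assuming the results were saved as CSV with the `timeStamp`, `elapsed`, `Latency` and `Connect` columns present; the function name and the sample rows are made up. If the waves show up in `Connect` as well, the network is the more likely suspect; if only `elapsed` and `Latency` rise, look at the server.

```python
import csv
import io
from collections import defaultdict


def average_by_bucket(jtl_text: str, bucket_ms: int = 10_000):
    """Average elapsed, Latency and Connect (all in ms) per time bucket."""
    totals = defaultdict(lambda: {"n": 0, "elapsed": 0, "latency": 0, "connect": 0})
    for row in csv.DictReader(io.StringIO(jtl_text)):
        # Round the epoch-millisecond timestamp down to the bucket start.
        key = int(row["timeStamp"]) // bucket_ms * bucket_ms
        bucket = totals[key]
        bucket["n"] += 1
        bucket["elapsed"] += int(row["elapsed"])
        bucket["latency"] += int(row["Latency"])
        bucket["connect"] += int(row["Connect"])
    return {
        key: {name: b[name] / b["n"] for name in ("elapsed", "latency", "connect")}
        for key, b in sorted(totals.items())
    }


# Made-up rows with only the columns this sketch needs.
sample_jtl = """timeStamp,elapsed,label,Latency,Connect
1000,100,GET /a,90,10
2000,300,GET /a,280,200
12000,120,GET /a,100,12
"""
for start, averages in average_by_bucket(sample_jtl).items():
    print(start, averages)
```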
This is a very simple question for those with the knowledge, but I'm a newbie.
In essence, I just need to know if it would be considered okay to run a small, approx. 700 visitors/day bitnami wordpress blog on just one t2.medium EC2 instance (without any auto-scaling, beanstalk).
Am I at risk of it crashing? What stats should I monitor to be aware of potential dangers? Sorry for the basic nature of these questions, but this is new to me.
tl;dr: It might be "okay", but it's not ideal.
If your question is because of:
Initial setup time - Load-balancing and auto-scaling will be less expensive (more time-efficient) over time.
Cost - Auto-scaling spins down instances that aren't being used to reduce cost.
Minimal setup for a great user experience - The goal of a great AWS setup is to ensure that capacity matches demand
Am I at risk of it crashing?
Possibly, yes. If you average 700 visitors a day, the risk is a traffic spike in which all visitors hit at the same time. It also depends on your maximum number of visitors, which could vary widely from the average (or not).
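To put numbers on the difference between average and peak, here is a rough sketch. The figure of 5 page requests per visit and the one-minute spike are assumptions for illustration only.

```python
def requests_per_second(visitors: int, requests_per_visit: int,
                        window_seconds: int) -> float:
    """Average request rate if `visitors` arrive evenly within the window."""
    return visitors * requests_per_visit / window_seconds


DAY = 24 * 60 * 60

# Same 700 visitors: spread over a day versus arriving within one minute.
print(f"evenly over a day:  {requests_per_second(700, 5, DAY):.3f} req/s")
print(f"all in one minute:  {requests_per_second(700, 5, 60):.1f} req/s")
```

Under these assumptions the same daily total ranges from well under one request per second to roughly 58 requests per second, which is why the maximum matters more than the average.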
What stats should I monitor to be aware of potential dangers?
Monitor the usage on high-traffic days (e.g. public holiday sales)
Set up billing alerts
Set up the right metrics:
See John Rotenstein's SO answer:
CPU Utilization is not always the right measure to use -- your
application might only be able to handle a limited number of
connections, it might be squeezed on RAM and the types of requests
might vary too.
You can use normal monitoring tools, or you can write something that
pushes metrics to Amazon CloudWatch, so that you go beyond the basic
CPU and Network metrics that CloudWatch normally provides. You could
even use the Load Balancer's Latency metric to trigger scaling when
the application slows down (custom code required).
I'd start with:
Two or more instances - to deal with instance redundancy (an instance going down)
Several t2.small rather than one t2.medium can work out to be more cost-efficient, and more cost efficient than EC in some use cases.
Add auto-scaling - automatically spin up or down instances based on minimum and maximum counts
Load balancing - to re-route users from unhealthy to healthy instances. And also to keep all of the spun up instances all working as evenly as possible (rather than a single instance handling 80% of the workload while the others bludge).
You can always reduce your instances after time with monitoring.
In my opinion, with 700 visitors a day, the safer option would be to run a load-balanced, auto-scaling environment on Elastic Beanstalk with at least 2 instances. The problem with running just one instance is that you are at great risk of crashing if traffic increases, and if the instance goes down you have no fallback. You can easily set up CloudWatch monitoring on NetworkIn and NetworkOut to get a sense of the number of requests your site is receiving and serving, and set up CPU usage monitoring as well. The trade-off of running a load-balanced environment instead of a single instance is that the cost might increase significantly as you introduce other things, such as a load balancer, into your environment. If you do introduce a load balancer, consider reducing the instance size, maybe to a t2.small, which could help reduce the cost.
It actually depends. The question is broad, and you have multiple options here.
You can use a single EC2 instance for that number of visitors, or even more if your application allows it. You can also consider caching if your app needs it.
You may add the instance to an auto-scaling group, so that if you ever need more resources you can scale out horizontally.
You can also add load balancers later on. You just need to add user data to the launch configuration attached to the auto-scaling group, so that when an instance comes up it automatically registers itself with your load balancer.
For monitoring, you can check the request metrics in CloudWatch for the ELB. You have to keep an eye on your CPU and trigger the scale-out policy once it reaches a particular threshold.
I'm executing a load test against an application hosted in Azure. It's a cloud service with 3 instances behind an internal load balancer (Hash based load balancing mode).
When I execute the load test, it queues requests even though the requests/sec and the total current requests to IIS are quite low. I'm not sure what the problem could be.
Any suggestions?
Adding a few screenshots of performance counters, which might help you decide.
Edit-1: Per request from Rohit Rajan,
The cloud service has 2 instances (meaning 2 VMs), each with 14 GB of RAM and 8 cores.
I'm executing a Step load pattern start with 100 and add 100,150 user every 5 minutes, till 4-5 hours until the load reaches to 10,000 VUs.
Any calls to external systems are written async. Database calls are synchronous.
There is no straight forward answer to your question. One possible way would be to explore additional investigation options.
Based on your explanation, there seems to be a bottleneck within the application that is causing the requests to queue up.
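As a back-of-the-envelope illustration of how requests can queue at a low request rate, consider worker threads that are blocked for the full duration of a synchronous call. This is a general capacity estimate, not a diagnosis of this application; the worker count, per-request time and arrival rate below are made-up numbers.

```python
def max_throughput(workers: int, seconds_per_request: float) -> float:
    """Requests per second a pool can complete if each request
    blocks one worker for seconds_per_request."""
    return workers / seconds_per_request


def queue_growth_per_second(arrival_rps: float, workers: int,
                            seconds_per_request: float) -> float:
    """How fast the queue grows once arrivals exceed capacity."""
    return max(0.0, arrival_rps - max_throughput(workers, seconds_per_request))


# Hypothetical: 100 workers, each blocked 2 s on a synchronous database call.
print(max_throughput(100, 2.0))               # capacity in req/s
print(queue_growth_per_second(80, 100, 2.0))  # queued requests added per second
```

In this made-up case the pool tops out at 50 requests per second, so 80 arriving requests per second leave 30 more queued every second even though neither number looks high.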
In order to investigate this, collect a memory dump when you see the requests queuing up and then use DebugDiag to run a hang analysis on it.
There are several ways to gather the memory dump.
Task Manager
Procdump.exe
Debug Diagnostics
Process Explorer
Once you have the memory dump, you can install DebugDiag and run an analysis on it. It will generate a report that can help you get started.
Debug Diagnostics download: https://www.microsoft.com/en-us/download/details.aspx?id=49924
Is there a way to simulate 10,000 concurrent HTTP requests?
I tried the siege tool,
but on my laptop it is limited to 2,000 requests.
How can I make 10,000 requests?
The simplest approach to generating a huge number of concurrent requests is probably Apache's ab tool.
For example, ab -n 100 -c 10 http://www.example.com/ would request the given website 100 times, with a concurrency of 10 requests.
It is true that the number of simultaneous requests is limited by nature. Keep in mind that TCP only has 65,536 port numbers, some of which are already occupied, and the first 1,024 are usually reserved. This leaves you with a theoretical maximum of around 64,500 ports per machine for outgoing requests.
Then there are the operating system limits. For example, in Linux there are the kernel parameters in the net.ipv4.* group.
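One of those kernel parameters, `net.ipv4.ip_local_port_range`, sets the range of source ports used for outgoing connections, which is often much narrower than the theoretical maximum. A minimal sketch for reading it on Linux follows; the helper names are made up, and the example string in the last lines is only a sample value, not necessarily what your system reports.

```python
from pathlib import Path


def parse_port_range(text: str):
    """Parse the two whitespace-separated numbers the kernel exposes."""
    low, high = (int(part) for part in text.split())
    return low, high


def usable_ports(low: int, high: int) -> int:
    """Number of ports in the inclusive range low..high."""
    return high - low + 1


def read_local_port_range(path: str = "/proc/sys/net/ipv4/ip_local_port_range"):
    """Read the current ephemeral port range (Linux only)."""
    return parse_port_range(Path(path).read_text())


print(usable_ports(1024, 65535))                        # theoretical maximum
print(usable_ports(*parse_port_range("32768\t60999")))  # sample value
```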
Finally, you should of course configure your HTTP server to handle that amount of simultaneous requests. In Apache, those are StartServers and its friends, in nginx it's worker_processes and worker_connections. Also, if you have some stand-alone dynamic processor attached to your webserver (such as php-fpm), you must raise the number of idle processes in the connection pool, too.
After all, the purpose of massive parallel requests should be to find your bottlenecks, and the above steps will give you a fair idea.
Btw. if you use ab, read its final report thoroughly. It may seem brief, but it carries a lot of useful information (e.g. "non-2xx responses" may indicate server-side errors due to overload.)
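If you run ab repeatedly, the key lines of that report can be pulled out with a few regular expressions. This is a minimal sketch; the function name `summarize_ab` is made up and the sample report is an abbreviated, invented excerpt. Note that ab only prints the "Non-2xx responses" line when there were any, so a missing line is treated as zero here.

```python
import re

FIELDS = {
    "complete": r"Complete requests:\s+(\d+)",
    "failed": r"Failed requests:\s+(\d+)",
    "non_2xx": r"Non-2xx responses:\s+(\d+)",
    "rps": r"Requests per second:\s+([\d.]+)",
}


def summarize_ab(report: str) -> dict:
    """Extract a few headline numbers from ab's text report."""
    summary = {}
    for name, pattern in FIELDS.items():
        match = re.search(pattern, report)
        summary[name] = float(match.group(1)) if match else 0.0
    return summary


# Invented, abbreviated excerpt of an ab report.
sample_report = """
Complete requests:      100
Failed requests:        0
Non-2xx responses:      12
Requests per second:    250.31 [#/sec] (mean)
"""
print(summarize_ab(sample_report))
```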
JMeter supports distributed testing, which means that you can set up a group of computers (one acting as a master and the rest as slaves) to run as many threads as you need. JMeter has a very good doc explaining this here . . .
http://jmeter.apache.org/usermanual/jmeter_distributed_testing_step_by_step.pdf
and some more info here . . .
http://digitalab.org/2013/06/distributed-testing-in-jmeter/
You can set this all up on the cloud as well if you do not have access to sufficient slave machines, there are a couple of services out there for this.
Have you tried using Apache JMeter? You can create a web test plan and there are several options which you can play with. You can wrap the requests in a ThreadGroup as outlined here. You can generate extensive reports and graphs as well. If the simple thread group is not enough you could potentially try using the UltimateThreadGroup plugin for JMeter.
When creating that many threads with JMeter on a single machine, you can run out of memory for allocating a new thread's stack. In that case you can consider reducing the stack size per thread. How to do that is explained in the SO answer here. The post has some other alternative approaches as well.
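The effect of the stack size is easy to estimate. The sketch below assumes a stack of 1 MiB per thread as a starting point; that figure is an assumption for illustration, since the actual default depends on the JVM and the platform, and the amount reserved is not necessarily the amount of physical memory used.

```python
def stack_memory_gib(threads: int, stack_kib: int) -> float:
    """Memory reserved for thread stacks, in GiB."""
    return threads * stack_kib / (1024 * 1024)


# 10,000 threads with an assumed 1 MiB stack versus a reduced 256 KiB stack.
for stack_kib in (1024, 256):
    print(f"{stack_kib:>5} KiB per thread -> "
          f"{stack_memory_gib(10_000, stack_kib):.2f} GiB")
```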
If there is an OS limit on the number of simultaneous TCP connections allowed, there is a registry setting that removes or raises that limit. Once you have made sure that the limit isn't in the way, you could write some JavaScript that makes AJAX requests and put it in a loop.
You would probably need node.js to execute the JavaScript.
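The same "requests in a loop" idea is sketched below in Python with asyncio rather than JavaScript, with a semaphore capping how many requests are in flight at once. The function `run_burst` and the stand-in `fake_send` are hypothetical; `fake_send` makes no network call, so a real HTTP client call would have to be substituted for it before this generates any load.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_burst(send: Callable[[int], Awaitable[bool]],
                    total: int, concurrency: int) -> List[bool]:
    """Call send(i) for i in range(total), at most `concurrency` at a time."""
    semaphore = asyncio.Semaphore(concurrency)

    async def one(i: int) -> bool:
        async with semaphore:  # wait for a free slot before sending
            return await send(i)

    return await asyncio.gather(*(one(i) for i in range(total)))


async def fake_send(i: int) -> bool:
    """Stand-in for a real HTTP request; always reports success."""
    await asyncio.sleep(0)
    return True


results = asyncio.run(run_burst(fake_send, total=1000, concurrency=100))
print(sum(results), "of", len(results), "succeeded")
```

Capping concurrency keeps the client from exhausting its own sockets and memory before the server under test is actually stressed.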