Difference between load and spike testing

How is a load test different from a spike test, considering the scenarios below?
Load test: using an automation tool (JMeter in my case), I create a load of 1000 virtual users with a ramp-up period of 1 second.
Spike test: using an automation tool (JMeter in my case), I create a continuous load of 400 virtual users loaded every second, plus a spike load of 600 virtual users loaded within 1 second at a certain point in time.
When the spike load is induced, is that not the same as the load test described above?
So my question is: what is the need for a spike test if load tests can be carried out continuously under varied load conditions?
Test scenario:
Application tested: website.
Automation tool: JMeter.
Internet speed used while testing: 3 Mbps.
Thanks in advance.

According to "Performance Testing Guidance for Web Applications", "spike testis a type of performance test focused on determining or validating performance characteristics of the product under test when subjected to workload models and load volumes that repeatedly increase beyond anticipated production operations for short periods of time.". So I think about analogy with Geometric or Algebraic progression, because volumes are repeatedly (and rapidly) increased. Also this and other definitions are paying attention to short period of time.
Load testing is more general term, without specified time (short or long) of testing or pattern to increase load volumes.

Load testing: it helps us understand how much load an application/system can bear at a point in time.
Ex: suppose a normal person can drink a maximum of 3 litres of water at a time.
Spike testing: it helps us understand how a system behaves when a suddenly high amount of load is applied.
Ex: in spike testing, we try to find out whether that person can drink 4 litres or more at a time.

A spike test is a kind of load test, used to simulate bursty traffic patterns.
For example, you might want to support 1 million client requests an hour. That's an average of 277 requests/sec. However, that doesn't account for varying usage patterns, like a sudden burst of traffic followed by a lull period. A spike test would simulate these bursts, where the short-term request rate can be much higher or lower than the expected average.
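As a rough illustration (not from the original answer), here is a minimal Python sketch of that difference: both profiles below deliver the same one million requests in an hour, but the spike profile concentrates traffic into short bursts followed by lulls. The burst rate, burst length, and period are arbitrary numbers chosen for the example.

# Minimal sketch: the same 1,000,000 requests/hour delivered two ways.
# A steady profile sends the hourly average every second; a spike profile
# alternates short bursts with lulls while keeping the same hourly total.

HOURLY_TOTAL = 1_000_000
SECONDS = 3600

average_rps = HOURLY_TOTAL / SECONDS          # ~277.8 requests/sec

def spike_profile(burst_rps=1000, burst_len=60, period=300):
    """Per-second request rates: a `burst_len`-second burst at `burst_rps`
    at the start of every `period` seconds, with the remaining traffic
    spread evenly over the lulls so the hourly total still matches."""
    bursts_per_hour = SECONDS // period
    burst_total = bursts_per_hour * burst_len * burst_rps
    lull_rps = (HOURLY_TOTAL - burst_total) / (SECONDS - bursts_per_hour * burst_len)
    return [burst_rps if (t % period) < burst_len else lull_rps
            for t in range(SECONDS)]

profile = spike_profile()
print(f"average load      : {average_rps:.1f} req/s")
print(f"spike peak        : {max(profile):.1f} req/s")
print(f"lull rate         : {min(profile):.1f} req/s")
print(f"hourly total check: {sum(profile):.0f}")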

Related

What are synthetic tests?

While reading the Spotify blog, I found a reference to something called "synthetic testing":
Having synthetic tests reduces time to recover
After this work involving timelines, we got some signals on time to recover. One such signal was that TTR was all over the place and genuinely hard to correlate with any single aspect of our systems.
However, we got a hit. One of the more exciting things we learned through our incident study was that synthetic testing works. We spent a fair amount of time grading whether or not a synthetic test would have plausibly detected outages, and then looked at the TTR for those that were in fact detected by synthetic tests, versus those that were not because they were not covered by a synthetic test.
The results were even more striking than we thought. We found that incidents involving coverable features that did have a synthetic test saw a recovery time that was generally 10 times faster. No really, read it again!
This may seem obvious, but we never want to discount the power of data to drive decisions. This isn’t just a curiosity. We’ve adjusted our priorities to put a greater emphasis on synthetic testing, as we think it’s pretty important to get things back up and running as quickly as possible.
What is a synthetic test, and how does it differ from the normal software testing (unit, integration, ...) that runs in CI?
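For illustration only: a synthetic test is usually an external, scripted probe that exercises a live (often production) endpoint on a schedule and alerts when it fails or is slow, independent of the CI suite. Below is a rough Python sketch of such a probe; the URL, latency budget, and alerting stub are hypothetical placeholders, not anything from the Spotify post.

# Rough sketch of a synthetic test: a scripted probe that hits a live
# endpoint on a schedule and alerts when it fails or is too slow.
import time
import urllib.request

ENDPOINT = "https://example.com/health"   # hypothetical health-check URL
LATENCY_BUDGET_S = 2.0

def probe_once():
    start = time.monotonic()
    try:
        with urllib.request.urlopen(ENDPOINT, timeout=10) as resp:
            ok = resp.status == 200
    except Exception:
        ok = False
    elapsed = time.monotonic() - start
    return ok and elapsed <= LATENCY_BUDGET_S, elapsed

def alert(message):
    print("ALERT:", message)               # stand-in for a real paging/alerting hook

if __name__ == "__main__":
    while True:
        healthy, elapsed = probe_once()
        if not healthy:
            alert(f"synthetic probe failed or took {elapsed:.2f}s")
        time.sleep(60)                      # probe once a minute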

Why wouldn't a small Firebase Functions app just use a single Function to handle logic?

...aside from the benefit of separate performance monitoring and logging.
For logging, I am confident I can get granularity by manually adding the name of the "routine" to each call. This is how it works now, with several discrete Functions for different parts of the system:
There are multiple automatic logs: start and finish of the routine, for example. It would be more challenging to find out how expensive certain routines are, but it would not be impossible.
The reason I want the entire logic of the application handled by a single handler function is to reduce cold starts: one function means only one container, which can be kept persistently alive even when there are very few users of the app.
If a month is ~2.6m seconds and we assume the system uses 1 GB RAM and 1 GHz CPU frequency at all times, that's:
2600000 * 0.0000025 + 2600000 * 0.000001042 = USD$9.21 a month
...for one minimum instance.
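The back-of-the-envelope figure above can be reproduced as follows, using the same assumed per-GB-second and per-GHz-second rates from the calculation (actual Cloud Functions rates vary by region and tier).

# Reproducing the figure above with the rates assumed in the question.
SECONDS_PER_MONTH = 2_600_000      # ~30 days
GB_SECOND_RATE    = 0.0000025      # memory rate assumed above
GHZ_SECOND_RATE   = 0.000001042    # CPU rate assumed above

memory_cost = SECONDS_PER_MONTH * 1.0 * GB_SECOND_RATE   # 1 GB resident
cpu_cost    = SECONDS_PER_MONTH * 1.0 * GHZ_SECOND_RATE  # 1 GHz allocated

print(f"~${memory_cost + cpu_cost:.2f}/month for one always-on instance")
# -> ~$9.21/month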
I should also state that all of my functions have the bare minimum amount of global scope code; it just sets up Firebase assets (RTDB and Firestore).
From a billing, performance (based on user wait time), and user/developer experience perspective, is there any reason why it would be smart to keep all my functions discrete?
I'd also accept an answer saying "one single function for all logic is reasonable" as long as there's a reason for it.
Thanks!
If you have a very small app with ~5 endpoints and very low traffic, sure, you could do something like this. But here is why not to do it:
Billing and performance
The important thing to realize is that each function instance handles only one request at a time, so concurrent requests spin up additional instances; there could be tens of them running at the same time.
If you would like to have just one instance handling all the traffic, you should explore GCP Cloud Run, where a single container handles multiple concurrent requests and scales out only when that is not sufficient.
Imagine you have several endpoints, each with different performance requirements:
one may need only 128 MB of RAM
another may need 1 GB of RAM
(FYI: the function's CPU clock is also tied to the RAM setting, which can speed up execution in some cases.)
If you had only one function with 1 GB of RAM, every request would be served by that function, and in some cases most of the memory would go to waste.
But if you split it into multiple functions, some requests will need far fewer resources, which can save you money at larger execution volumes per month (tens of thousands and up).
Let's imagine a function with a 3-second execution time and 10k executions/month:
128MB would cost you $0.0693
1024MB would cost you $0.495
As you can see, with a small app the difference may be negligible, but at scale it matters. (The cost can vary based on the datacenter.)
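For reference, the two figures above can be reproduced roughly as follows, assuming first-generation Cloud Functions "Tier 1" compute prices and the CPU clock GCP pairs with each memory tier (200 MHz for 128 MB, 1.4 GHz for 1024 MB); prices and pairings may differ by region and over time, and the free tier and per-invocation/network charges are ignored here.

# Sketch reproducing the two monthly compute-cost figures above.
GB_SECOND_PRICE  = 0.0000025    # $ per GB-second of memory (assumed Tier 1 price)
GHZ_SECOND_PRICE = 0.0000100    # $ per GHz-second of CPU   (assumed Tier 1 price)

def monthly_compute_cost(memory_mb, cpu_ghz, seconds_per_call, calls_per_month):
    gb_seconds  = (memory_mb / 1024) * seconds_per_call * calls_per_month
    ghz_seconds = cpu_ghz * seconds_per_call * calls_per_month
    return gb_seconds * GB_SECOND_PRICE + ghz_seconds * GHZ_SECOND_PRICE

# 3-second execution, 10k executions/month
print(monthly_compute_cost(128,  0.2, 3, 10_000))   # ~0.069 (the $0.0693 figure above)
print(monthly_compute_cost(1024, 1.4, 3, 10_000))   # 0.495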
As for logging, I don't think it matters much. In bigger systems, messages usually travel through several functions anyway, so you have to deal with that regardless.
As for the cold start: you just need a good UI to accommodate it. At first I was worried about it in our apps, but later on you get used to the fact that some actions can take ~2 s to execute (a cold start). And you should show a "loading" state in the UI regardless, because you don't know whether the function will take ~100 ms or 3 s due to a bad connection.

Predicting/calculating congestion in telecom network

I have an application installed on my phone which reports the following details every minute:
bandwidth
packet loss
signal strength
RTT to google.com
I am trying to predict congestion based on these four attributes, but somehow the result doesn't look accurate to me; previously I had only used bandwidth.
I want to predict congestion at any point in time more accurately, and I'd appreciate any recommendations.
I think you are saying that you are trying to measure network 'responsiveness', and from these measurements get a sense of how congested the network is. You also mention that you want to predict, which I take to mean you want to estimate future 'responsiveness' based on your measurements and observations.
The items you are measuring look sensible, although you may want to include jitter if you are interested in VoIP or other real time streamed media.
The issue you have is that there are many variables which can affect your measurements, for example:
congestion in the radio cell you are in at the time
congestion in the backhaul network
delays in the server you are using to measure the RTT
congestion or faults with the particular APN your mobile is using to access data services
network faults
As some of these occur irregularly but can have a large impact, it is quite hard to build up an accurate view of the overall network 'responsiveness' with a single handset. For example, your local cell may be busy or have a problem while users of Google.com in other cells get a perfectly good response, or Google.com may be busy or delayed while other users in your cell accessing a different server again get a perfectly good response.
It would likely be useful for you to look at some of the generally available web speedtest applications to see the type of information they provide - they have the advantage of being able to gather results from many thousands of users, and also generally have access to the servers to understand any issues on that side.
Depending on what you are trying to achieve, combining measurements from one of the general speedtest services with your own measurements may give you enough data to draw some meaningful conclusions.
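As one purely illustrative way of combining the four per-minute measurements (not something suggested in the answer above), you could compare each metric against a rolling baseline and only call it congestion when throughput, latency, or loss degrade while the radio signal is still near normal; the window size and thresholds below are arbitrary.

# Heuristic sketch: flag likely congestion when throughput/latency/loss degrade
# relative to a rolling baseline while signal strength stays normal (otherwise
# the problem may simply be poor radio coverage). Thresholds are arbitrary.
from collections import deque
from statistics import median

WINDOW = 60  # minutes of history used as the baseline

history = {k: deque(maxlen=WINDOW) for k in ("bandwidth", "packet_loss", "rtt", "signal")}

def looks_congested(sample):
    """sample: dict with bandwidth (Mbps), packet_loss (%), rtt (ms), signal (dBm)."""
    baseline = {k: (median(history[k]) if history[k] else sample[k]) for k in history}
    for k in history:
        history[k].append(sample[k])

    degraded = sum([
        sample["bandwidth"] < 0.5 * baseline["bandwidth"],    # throughput halved
        sample["packet_loss"] > baseline["packet_loss"] + 2,  # loss up >2 points
        sample["rtt"] > 1.5 * baseline["rtt"],                # latency up 50%
    ])
    signal_ok = sample["signal"] > baseline["signal"] - 10    # within 10 dB of normal
    return degraded >= 2 and signal_ok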

Calculating theoretical network bandwidth in topology

I'm in the process of building a discrete-event simulator, and need to be able to calculate the theoretical bandwidth available between two systems in a given network topology, so that I can "time" how long a transfer will take to occur and create an event at its expected completion time.
At the moment, for simplicity, I do not consider the switches' backplanes or the likelihood of collisions/congestion occurring within the network. I am simply interested in the maximum transfer rate between all communicating systems.
For instance, consider the following sample network topology (diagram not reproduced here):
We assume the following connections:
Source 1, Source 2 -> (sending to) Dest 1
Source 3, Source 4 -> (sending to) Dest 2
Given these connections, what is the maximum effective transfer rate of all sources?
If we visualize this as a graph, I can calculate this manually by starting from the sources and evaluating, at each switch level, the incoming traffic against the switch's uplink capacity.
For instance, Source #1 in this scenario has 50 Mbps of effective bandwidth to Dest 1
1 Gbps * S1(1/2) * S2(1) * S3(1/10) = 50 Mbps
However, I'm curious as to what other methods can be used to calculate this, or whether there is a more effective approach I could use to "predict" network traffic.
Any feedback is appreciated -- thanks.
This is essentially a max-min fairness problem.
https://en.wikipedia.org/wiki/Max-min_fairness
The progressive filling algorithm (described in the Wiki article) is a simple solution to this problem:
If resources are allocated in advance in the network nodes, max-min
fairness can be obtained by using an algorithm of progressive filling.
You start with all rates equal to 0 and grow all rates together at the
same pace, until one or several link capacity limits are hit. The
rates for the sources that use these links are not increased any more,
and you continue increasing the rates for other sources. All the
sources that are stopped have a bottleneck link. This is because they
use a saturated link, and all other sources using the saturated link
are stopped at the same time, or were stopped before, thus have a
smaller or equal rate. The algorithm continues until it is not
possible to increase. Lastly, when the algorithm terminates, all
sources have been stopped at some time and thus have a bottleneck
link. This allocation is max-min fair.
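As a minimal Python sketch of the progressive-filling algorithm quoted above: each flow is modelled as the set of links it traverses, and all unfrozen rates grow together until a link saturates. The flows and capacities below are a hypothetical toy example, not the topology from the question.

# Minimal sketch of progressive filling for max-min fair rate allocation.
def max_min_rates(flows, capacity, step=0.001):
    """flows: {flow_name: set_of_links}, capacity: {link: Mbps}.
    Grows all unfrozen flow rates together until some link saturates,
    freezes the flows crossing that link, and repeats."""
    rate = {f: 0.0 for f in flows}
    frozen = set()
    while len(frozen) < len(flows):
        for f in flows:                       # grow every unfrozen flow
            if f not in frozen:
                rate[f] += step
        for link, cap in capacity.items():    # freeze flows on saturated links
            load = sum(rate[f] for f in flows if link in flows[f])
            if load >= cap - 1e-9:
                frozen |= {f for f in flows if link in flows[f]}
    return rate

# Hypothetical example: two flows share link "A" (100 Mbps); one of them
# also crosses a roomier link "B" (1000 Mbps).
flows = {"f1": {"A"}, "f2": {"A", "B"}}
capacity = {"A": 100, "B": 1000}
print(max_min_rates(flows, capacity))   # both flows end up near 50 Mbps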

How can I find the average number of concurrent users for IIS to simulate during a load/performance test?

I'm using JMeter for load testing. I'm going through an exercise of finding the maximum number of concurrent threads (users) that our web server can handle, by simply increasing the number of threads in my distributed JMeter test case and firing off the test.
Then it struck me that while the MAX number may be useful, the REAL number of users that my website handles on average is the number I need to make the test fruitful.
Here are a few pieces of information about our setup:
This is a mixed .NET/Classic ASP site. Upon login, a session (with a timeout) is created in both for the user.
Each session times out after 60 minutes.
Is there a way, using this information, IIS logs, performance counters, and/or some calculation, to determine the average number of concurrent users we handle on our production site?
You might use logparser with the QUANTIZE function to determine the peak number of requests over a suitable interval.
For a 10 second window, it would be something like:
logparser "select quantize(to_localtime(to_timestamp(date,time)), 10) as Qnt,
count(*) as Hits from yourLogFile.log group by Qnt order by Hits desc"
The reported counts won't be exactly the same as threads or users, but they should help get you pointed in the right direction.
The best way to do exact counts is probably with performance counters, but I'm not sure any of the standard ones works like you would want -- you'd probably need to create a custom counter.
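If Log Parser isn't available, the same 10-second bucketing can be approximated with a short script. Note that the field positions depend on the #Fields header of your W3C log; date and time are assumed here to be the first two columns.

# Approximate the Log Parser query above: count hits per 10-second window
# in a W3C-format IIS log and report the busiest windows.
from collections import Counter

def peak_windows(log_path, window_seconds=10, top=10):
    buckets = Counter()
    with open(log_path) as log:
        for line in log:
            if line.startswith("#"):            # skip W3C header lines
                continue
            date, time_of_day = line.split()[:2]
            hh, mm, ss = time_of_day.split(":")
            slot = (int(hh) * 3600 + int(mm) * 60 + int(ss)) // window_seconds
            buckets[(date, slot)] += 1
    return buckets.most_common(top)             # busiest windows first

# for key, hits in peak_windows("yourLogFile.log"): print(key, hits)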
I can see a couple options here.
Use Performance Monitor to get the current numbers or have it log all day and get an average. ASP.NET has a Requests Current counter. According to this page Classic ASP also has a Requests current, but I've never used it myself.
Run the IIS logs through Log Parser to get the total number of requests and how long each took. I'm thinking that if you know how many requests come in each hour and how long each took, you can get an average of how many were running concurrently.
Also, keep in mind that concurrent users isn't quite the same as concurrent threads on the server. For one, multiple threads will be active per user while content like images is being downloaded. And after that the user will be on the page for a few minutes while the server is idle.
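The "requests per interval times how long each took" idea above is essentially Little's Law: average concurrency = arrival rate × average time in system. Here is a rough sketch against a W3C-format IIS log, assuming the time-taken column (in milliseconds) is the last field; adjust the index to match your #Fields line.

# Estimate average concurrent requests over an interval via Little's Law.
def average_concurrency(log_path, interval_seconds=3600):
    requests = 0
    busy_seconds = 0.0
    with open(log_path) as log:
        for line in log:
            if line.startswith("#"):            # skip W3C header lines
                continue
            time_taken_ms = float(line.split()[-1])
            requests += 1
            busy_seconds += time_taken_ms / 1000.0
    if requests == 0:
        return 0.0
    arrival_rate = requests / interval_seconds  # requests per second
    avg_duration = busy_seconds / requests      # seconds per request
    return arrival_rate * avg_duration          # = busy_seconds / interval

# Note this estimates concurrent *requests*; concurrent *users* also includes
# think time between page views, as pointed out above.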
My suggestion is that you define the stop conditions first, such as:
Maximum CPU utilization
Maximum memory usage
Maximum response time for requests
Other key parameters you like
Choosing the parameters is quite subjective, and I personally cannot offer much experience there.
Secondly, check whether performance counters or IIS logs can be mapped to those parameters, and set up the proper mappings.
Thirdly, start testing by simulating N users (threads) and see whether the stop conditions are hit. If not, go up to a higher number; if they are, use a smaller number. Iterating this way, you will converge on a rough number (see the sketch below).
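As an illustration of that search (not part of the original answer), it can be written as a binary search. run_load_test and stop_conditions_hit are hypothetical stand-ins for launching a JMeter run with N threads and then checking the chosen counters (CPU, memory, response time).

# Binary search for a rough capacity figure between a lower and upper bound.
def find_rough_capacity(run_load_test, stop_conditions_hit, low=1, high=2048):
    """Returns the largest user count tried that did not trip the stop conditions."""
    best = low
    while low <= high:
        users = (low + high) // 2
        results = run_load_test(users)         # e.g. kick off JMeter with `users` threads
        if stop_conditions_hit(results):
            high = users - 1                   # too many users: search lower
        else:
            best = users
            low = users + 1                    # headroom left: search higher
    return best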
However, that never means your website can handle that many users in the real world; no simulation can cover all the edge cases.
