I’ve been doing some really basic tests with k6 on different HTTP servers, and I’ve noticed that when I increase the load (in requests per second), the latency (the http_req_duration metric) decreases, which seems quite odd.
Two scenarios for example:
"s0": {
executor: 'constant-arrival-rate',
rate: 100,
duration: '10s',
preAllocatedVUs: 100
}
"s1": {
executor: 'constant-arrival-rate',
rate: 500,
duration: '10s',
preAllocatedVUs: 500
}
And when I look at the summary at the end of each run, the median and the minimum for the “http_req_duration” metric are ALWAYS lower for s1 than for s0.
And I have the intuition that it should rather be the opposite.
Does anyone have an explanation?
I notice the same thing for different web servers (Flask with Python, a PHP server, …), on the same machine or not.
(Screenshot: summary of scenario 0, at 100 req/s)
(Screenshot: summary of scenario 1, at 500 req/s)
I was reading the book Designing Data-Intensive Applications by Martin Kleppmann, where I came across a quote about SLAs.
For example, percentiles are often used in service level objectives
(SLOs) and service level agreements (SLAs), contracts that define the
expected performance and availability of a service. An SLA may state
that the service is considered to be up if it has a median response
time of less than 200 ms and a 99th percentile under 1 s (if the
response time is longer, it might as well be down), and the service
may be required to be up at least 99.9% of the time. These metrics set
expectations for clients of the service and allow customers to demand
a refund if the SLA is not met.
My question is: how can a service have a 99th-percentile response time of less than 1 s but a 50th-percentile (median) of 200 ms?
If I am understanding that sentence, it says that at least 50% of the users will experience a latency of 200 ms or less, but 99% of the users will experience at least 1 sec. Then shouldn't the median latency also be less than 1 sec?
Sorry if this sounds like a dumb question, but could someone explain what that sentence means?
This comes from the definition of median:
In statistics and probability theory, the median is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be thought of as "the middle" value.
So if the distribution of response times is skewed toward the "faster" ones, the median can be 200 ms while the 99th percentile is still under 1 second. For example, consider the following 100 requests sorted by response time:
#      response time
1      100ms
2      120ms
..     ..
49     199ms
50     200ms
51     200ms
..     ..
99     999ms
100    2sec
Here we have a median of 200 ms ((50th + 51st)/2 -> (200+200)/2) and a 99th percentile under 1 second.
but 99% of the users will experience at least 1 sec
This is not "at least", it is "at most", or more precisely "less than".
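The arithmetic above can be checked with a short script. The concrete values filling the ".." rows of the table are assumed for illustration; only the values shown in the table are taken from the answer:

```python
import math
import statistics

# Hypothetical 100-sample distribution consistent with the table above:
# mostly fast responses plus a couple of slow outliers (times in ms).
samples = sorted([100, 120] + [150] * 46 + [199, 200, 200] + [300] * 47 + [999, 2000])

median = statistics.median(samples)                # (50th + 51st) / 2 = 200.0 ms
# Nearest-rank 99th percentile: the 99th value out of 100 sorted samples.
p99 = samples[math.ceil(0.99 * len(samples)) - 1]  # 999 ms, under 1 second
print(median, p99)                                 # 200.0 999
```

So a skewed distribution easily satisfies "median under 200 ms and p99 under 1 s" even though the slowest request takes 2 seconds.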
I have a multi-threaded application, and when I run VTune Profiler on it, under the Caller/Callee tab I see that the callee function's "CPU Time: Total - Effective Time" is larger than the caller function's "CPU Time: Total - Effective Time".
E.g.:
caller function - A
callee function - B (no one calls B but A)

Function    CPU Time: Total - Effective Time
A           54%
B           57%
My understanding is that CPU Time: Total is the sum of CPU Time: Self plus the time of all of that function's callees. By that definition, shouldn't the CPU Time: Total of A be greater than B's?
What am I missing here?
It might be that function B is also being called by some other function besides A, which would explain this.
Intel VTune Profiler works by sampling, and the numbers are less accurate for short run times. If your application runs for a very short duration, you could consider using the "allow multiple runs" option in VTune or increasing the run time.
Also, Intel VTune Profiler sometimes rounds off the numbers, so it might not give an ideal result, but such rounding differences are very small, around 0.1%; your question shows a 3% difference, so this won't be the reason for it.
I am trying wrk, and got these results:
wrk -t8 -c200 -d60s --latency http://www.baidu.com
Running 1m test @ http://www.baidu.com
8 threads and 200 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 15.19ms 36.58ms 1.49s 97.76%
Req/Sec 1.46k 422.00 2.31k 81.41%
Latency Distribution
50% 9.05ms
75% 12.23ms
90% 17.17ms
99% 227.16ms
22621 requests in 1.00m, 331.43MB read
Socket errors: connect 0, read 1632838, write 0, timeout 0
Requests/sec: 376.75
Transfer/sec: 5.52MB
I'm confused by the Req/Sec and Requests/sec values. What's the difference between them?
According to the owner of the wrk repo:
In addition to Latency and Req/Sec being per-thread stats, they're also statistics periodically captured during a benchmarking run. So avg is the average req/sec over the testing interval whereas Requests/sec is simply total requests / total time.
source : https://github.com/wg/wrk/issues/259
I figured it out.
Req/Sec means how many requests were being processed during a given second;
Requests/sec means (number of processed requests) / (seconds used to process these requests).
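The distinction can be sketched in a few lines. All the numbers here are assumed for illustration, not taken from the wrk run above:

```python
# "Req/Sec" is a per-thread statistic: wrk periodically samples each thread's
# rate during the run and reports the mean of those samples.
one_thread_rate_samples = [1500.0, 1400.0, 1460.0, 1480.0]  # assumed samples
req_sec = sum(one_thread_rate_samples) / len(one_thread_rate_samples)  # 1460.0

# "Requests/sec" is simply total completed requests / total wall-clock time,
# so socket errors and idle periods drag it far below the per-thread rates.
completed_requests = 22621
duration_s = 60.0
requests_sec = completed_requests / duration_s  # ~377
```

This is why the run above can show Req/Sec around 1.46k per thread while Requests/sec is only 376.75: the latter only counts requests that actually completed over the whole minute.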
I get the following result when I run a load test. Can anyone help me read the report?
number of threads = 500
ramp-up period = 1
Samples = 500
Avg = 20917
Min = 820
Max = 48158
Std. Deviation = 10563.178194669255
Error % = 0.046
Throughput = 10.375381295262601
KB/sec = 247.05023046315702
Avg. Bytes = 24382.664
A short explanation:
Sample - the number of requests sent
Avg - the arithmetic mean of all response times (sum of all times / count)
Min - the minimal response time (ms)
Max - the maximum response time (ms)
Std. Deviation - see the Standard Deviation article
Error % - the percentage of failed requests
Throughput - how many requests per second your server handles. Larger is better.
KB/sec - self-explanatory
Avg. Bytes - the average response size
If you are having trouble interpreting the results, you could try the BM.Sense results analysis service.
The JMeter docs say the following:
The summary report creates a table row for each differently named request in your test. This is similar to the Aggregate Report, except that it uses less memory.
The throughput is calculated from the point of view of the sampler target (e.g. the remote server in the case of HTTP samples). JMeter takes into account the total time over which the requests have been generated. If other samplers and timers are in the same thread, these will increase the total time, and therefore reduce the throughput value. So two identical samplers with different names will have half the throughput of two samplers with the same name. It is important to choose the sampler labels correctly to get the best results from the Report.
Label - The label of the sample. If "Include group name in label?" is
selected, then the name of the thread group is added as a prefix.
This allows identical labels from different thread groups to be
collated separately if required.
# Samples - The number of samples with the same label
Average - The average elapsed time of a set of results
Min - The lowest elapsed time for the samples with the same label
Max - The longest elapsed time for the samples with the same label
Std. Dev. - the Standard Deviation of the sample elapsed time
Error % - Percent of requests with errors
Throughput - the Throughput is measured in requests per
second/minute/hour. The time unit is chosen so that the displayed
rate is at least 1.0. When the throughput is saved to a CSV file, it
is expressed in requests/second, i.e. 30.0 requests/minute is saved
as 0.5.
Kb/sec - The throughput measured in Kilobytes per second
Avg. Bytes - average size of the sample response in bytes. (in JMeter
2.2 it wrongly showed the value in kB)
Times are in milliseconds.
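The unit-selection rule for Throughput described above can be sketched like this (the helper name is hypothetical, not part of JMeter's API):

```python
def display_throughput(requests_per_second):
    # Pick the largest time unit for which the displayed rate is at least 1.0,
    # as the JMeter docs describe; CSV output always stays in requests/second.
    for factor, unit in [(1, "sec"), (60, "min"), (3600, "hour")]:
        rate = requests_per_second * factor
        if rate >= 1.0:
            return f"{rate:.1f}/{unit}"
    return f"{requests_per_second * 3600:.1f}/hour"

print(display_throughput(0.5))   # 30.0/min  (saved to CSV as 0.5)
print(display_throughput(10.0))  # 10.0/sec
```

So the docs' example of 30.0 requests/minute being saved as 0.5 is just the same rate expressed in the two different units.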
A JMeter Test Plan must have a listener to show the results of the performance test execution.
Listeners capture the responses coming back from the server while JMeter runs and present them in the form of trees, tables, graphs and log files.
They also allow you to save the results to a file for future reference. JMeter provides many types of listeners. Some of them are: Summary Report, Aggregate Report, Aggregate Graph, View Results Tree, View Results in Table etc.
Here is a detailed explanation of each parameter in the Summary report, referring to the figure (not reproduced here):
Label: The name/URL of the specific HTTP(S) request. If you have selected the “Include group name in label?” option, then the name of the Thread Group is applied as a prefix to each label.
Samples: The number of samples (requests) executed for the label.
Average: The average time taken by all the samples for a specific label. In our case, the average time for Label 1 is 942 milliseconds and the overall average time is 584 milliseconds.
Min: The shortest time taken by a sample for a specific label. If we look at the Min value for Label 1, then out of 20 samples the shortest response time any sample had was 584 milliseconds.
Max: The longest time taken by a sample for a specific label. If we look at the Max value for Label 1, then out of 20 samples the longest response time any sample had was 2867 milliseconds.
Std. Dev.: This shows how much the sample response times deviate from their average. The smaller this value, the more consistent the data. As a rule of thumb, the standard deviation should be less than or equal to half of the average time for a label.
Error%: The percentage of failed requests per label.
Throughput: The number of requests that are processed per time unit (seconds, minutes, hours) by the server. This time is calculated from the start of the first sample to the end of the last sample. Larger throughput is better.
KB/Sec: The amount of data downloaded from the server during the performance test execution; in short, the throughput measured in kilobytes per second.
For more information:
http://www.testingjournals.com/understand-summary-report-jmeter/
Sample: Number of requests sent.
The Throughput: is the number of requests per unit of time (seconds, minutes, hours) that are sent to your server during the test.
The Response time: is the elapsed time from the moment when a given request is sent to the server until the moment when the last bit of information has returned to the client.
The throughput is the real load processed by your server during a run, but it does not tell you anything about the performance of your server during that same run. This is why you need both measures to get a real idea of your server’s performance: the response time tells you how fast your server is handling a given load.
Average: This is the average (arithmetic mean, μ = (1/n) * Σ(i=1…n) xi) response time of your total samples.
Min and Max are the minimum and maximum response times.
An important thing to understand is that the mean value can be very misleading, as it does not show you how close (or far) your values are from the average. For this purpose, we need the deviation value, since the average value can be the same for very different sets of sample response times!
Deviation: The standard deviation (σ) measures the mean distance of the values to their average (μ). It gives you a good idea of the dispersion or variability of the measures around their mean value.
The following equation shows how the standard deviation (σ) is calculated:
σ = √( (1/n) * Σ(i=1…n) (xi − μ)² )
For details, see here.
So, if the deviation value is low compared to the mean value, it indicates that your measures are not dispersed (they are mostly close to the mean value) and that the mean value is significant.
Kb/sec: The throughput measured in kilobytes per second.
Error %: The percentage of requests with errors.
An example is always better for understanding! I think this article will help you.
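A short sketch of the two formulas above, with hypothetical response times chosen so that two runs share the same mean but differ wildly in spread:

```python
import math

# Two assumed sets of response times (ms) with the same average (600 ms):
run_a = [580, 590, 600, 610, 620]    # tightly clustered
run_b = [100, 200, 600, 1000, 1100]  # widely dispersed

def mean(xs):
    # Arithmetic mean: mu = (1/n) * sum(x_i)
    return sum(xs) / len(xs)

def std_dev(xs):
    # Population standard deviation: sigma = sqrt((1/n) * sum((x_i - mu)^2))
    mu = mean(xs)
    return math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))

print(mean(run_a), std_dev(run_a))  # 600.0, ~14.1
print(mean(run_b), std_dev(run_b))  # 600.0, ~405.0
```

Both runs report an average of 600 ms, but the deviation immediately shows that the second run's response times are far less consistent.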
There are lots of explanations of the JMeter summary; I have been using this tool for quite some time to generate performance testing reports with relevant data. The explanation below comes right from field experience:
JMeter: Understanding Summary Report
This is one of the most useful reports generated by JMeter for understanding the load test result.
# Label: Name of the HTTP sample request sent to the server.
# Samples: This captures the total number of samples pushed to the server. Suppose you put a Loop Controller to run a particular request 5 times, set 2 iterations (called Loop Count in the Thread Group), and run the load test for 100 users; then the count displayed here is 1 * 5 * 2 * 100 = 1000. Total = the total number of samples sent to the server during the entire run.
# Average: The average response time for a particular HTTP request, in milliseconds, averaged here over the 5 loops and 2 iterations for 100 users. Total = the overall average across all samples.
# Min: Minimum time spent by the sample requests sent for this label. The total equals the minimum time across all samples.
# Max: Maximum time spent by the sample requests sent for this label. The total equals the maximum time across all samples.
# Std. Dev.: Knowing the standard deviation of your data set tells you how densely the data points are clustered around the mean. The smaller the standard deviation, the more consistent the data. The standard deviation should be less than or equal to half of the average time for a label; if it is more than that, something is wrong and you need to figure out the problem and fix it.
https://en.wikipedia.org/wiki/Standard_deviation
The total equals the highest deviation across all samples.
# Error: The percentage of errors found for a particular sample request. 0.0% means that all requests completed successfully. The total equals the percentage of errored samples among all samples (Total Samples).
# Throughput: Hits/sec, or the total number of requests per unit of time (sec, min, hour) sent to the server during the test.
endTime    = lastSampleStartTime + lastSampleLoadTime
startTime  = firstSampleStartTime
conversion = unit time conversion value
Throughput = numRequests / ((endTime - startTime) * conversion)
# KB/sec: The throughput rate measured in kilobytes per second.
# Avg. Bytes: The average bytes of data downloaded from the server. The total is the average bytes across all samples.
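The throughput formula above can be sketched as follows (the function name and sample times are assumed; times are in milliseconds):

```python
def throughput_per_sec(num_requests, first_start_ms, last_start_ms, last_load_ms):
    # Throughput = numRequests / ((endTime - startTime) * conversion),
    # where conversion turns milliseconds into the chosen unit (here: seconds).
    end_ms = last_start_ms + last_load_ms
    elapsed_s = (end_ms - first_start_ms) / 1000.0
    return num_requests / elapsed_s

# E.g. 500 samples, first starting at t=0, last starting at 48,000 ms and
# taking 200 ms, gives ~10.37 req/s -- roughly in line with the 500-sample,
# ~10.4 req/s report shown earlier for this question.
rate = throughput_per_sec(500, 0, 48000, 200)
```

Note that the elapsed time runs from the start of the first sample to the end (not the start) of the last one.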
Well, I know: 'fast from SSMS, slow from app' sounds very familiar to some.
One might start thinking about parameter sniffing or connection settings, but I don't think that's the case for me.
So that's the query:
SELECT [ST].*, [STL].*
FROM [WP_CashCenter_StockTransaction] AS [ST]
LEFT JOIN [WP_CashCenter_StockTransactionLine] AS [STL]
    ON [STL].[StockTransaction_id] = [ST].[id]
WHERE [ST].[Type] IN (0, 1, 10, 9)
  AND ([STL].[Direction] IN (0, 1) OR [STL].[id] IS NULL)
  AND [ST].[Status] IN (0, 1)
  AND ([STL].[StockContainer_id] = 300000742600 OR [STL].[id] IS NULL)
  AND [ST].[StockContainerID] = 300000742600
I'll post links to images of the execution plans, if you don't mind (please say so in the comments), because there are going to be many of them.
Execution plan that I get from SSMS: http://i.imgur.com/DjTypV2.png (runs in a fraction of a second)
Execution plan used for the query when it's executed from the app: http://i.imgur.com/Ra45CAo.png (runs in ~3 sec)
So, for some reason, SQL Server makes wrong estimates and prefers a table scan in the second case.
The query is built dynamically, and a new plan is generated for each new value of StockContainerID (no parameters).
Well, okay, so I gave up trying to figure out the problem and just used the FORCESEEK hint:
SELECT [ST].*, [STL].*
FROM [WP_CashCenter_StockTransaction] AS [ST] WITH(FORCESEEK)
LEFT JOIN [WP_CashCenter_StockTransactionLine] AS [STL] WITH(FORCESEEK) ON ([STL].[StockTransaction_id] = [ST].[id])
Now the execution plans seem to be identical:
http://i.imgur.com/orq7Pmx.png (executed from the app). But it still takes ~3 sec.
Take a look at this:
http://i.imgur.com/ZFhyWYc.png (SSMS: 1 execution, 1 estimated row)
http://i.imgur.com/bIeTE13.png (app: 1 execution, 655k estimated rows)
http://i.imgur.com/FERBNQR.png (SSMS: 1 execution, 1 estimated row)
http://i.imgur.com/pm2k8CS.png (app: 655k executions, 1 estimated row)
You should have noticed that the second plan uses parallelism. I don't know if that can be the reason for the problem (I think not).