Siege test stops hitting the API after some time - nginx

I am load testing my API (nginx) with siege. The API forwards POST requests from nginx to a Kafka REST server running on port 8082.
I am running siege from four EC2 machines. Every time, I can see siege stop hitting for a while and then resume. So I forcibly break siege with Ctrl+C and I see the following:
HTTP/1.1 200 0.01 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.01 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.01 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.01 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.01 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.01 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.01 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.02 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.03 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.03 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.02 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
^C
Lifting the server siege.. done.
Transactions: 10699 hits
Availability: 100.00 %
Elapsed time: 26.71 secs
Data transferred: 1.23 MB
Response time: 0.05 secs
Transaction rate: 400.56 trans/sec
Throughput: 0.05 MB/sec
Concurrency: 20.11
Successful transactions: 10699
Failed transactions: 0
Longest transaction: 1.67
Shortest transaction: 0.00
If I don't stop it, it resumes hitting after some time and then stops again. When I forcefully stop it again:
HTTP/1.1 200 0.06 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.07 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.04 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.03 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
HTTP/1.1 200 0.04 secs: 121 bytes ==> POST http://my-ip/topics/jsontest
^C
Lifting the server siege.. done.
Transactions: 21399 hits
Availability: 100.00 %
Elapsed time: 133.88 secs
Data transferred: 2.47 MB
Response time: 2.97 secs
Transaction rate: 159.84 trans/sec
Throughput: 0.02 MB/sec
Concurrency: 474.16
Successful transactions: 21399
Failed transactions: 0
Longest transaction: 63.23
Shortest transaction: 0.00
Again it was able to make 21399 - 10699 = 10700 hits, roughly the same number. So I want to understand why it stops hitting for some time after 10699 hits. Is it a limitation of the EC2 machine? The waiting time after 10699 hits just reduces my transaction rate, which I don't want. This is happening on all four machines. My API is on an EC2 instance itself, but I am able to hit it 10699 times from each of the four machines; the transaction rate is just very low.
Any help appreciated!
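One thing that may be worth checking while siege is stalled (this is only a guess at the cause, not something shown above) is whether the siege clients, or the nginx box itself, are piling up sockets in TIME_WAIT or running out of ephemeral ports; a load generator in that state can pause until old ports are recycled. A quick look on both sides:
$ ss -s                                             # socket summary; note the timewait count
$ cat /proc/sys/net/ipv4/ip_local_port_range        # size of the ephemeral port range
$ netstat -ant | awk '{print $6}' | sort | uniq -c  # connection counts per TCP state
If the TIME_WAIT count approaches the size of the port range during a pause, the bottleneck is likely TCP connection churn rather than nginx or the Kafka REST server.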

Related

Performing load test with nginx degrades performance when increasing concurrency

These are performance test results with Apache Bench; performance degrades as concurrency increases.
The project is here:
https://github.com/ohs30359-nobuhara/nginx-php7-alpine
$ ab -n 50 -c 1 "127.0.0.1/sample.html"
Concurrency Level: 1
Time taken for tests: 0.111 seconds
Complete requests: 50
Failed requests: 0
Total transferred: 11700 bytes
HTML transferred: 550 bytes
Requests per second: 448.50 [#/sec] (mean)
Time per request: 2.230 [ms] (mean)
Time per request: 2.230 [ms] (mean, across all concurrent requests)
Transfer rate: 102.49 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.1 0 1
Processing: 1 2 0.9 2 6
Waiting: 1 2 0.8 2 5
Total: 1 2 1.0 2 6
Percentage of the requests served within a certain time (ms)
50% 2
66% 2
75% 2
80% 2
90% 3
95% 5
98% 6
99% 6
100% 6 (longest request)
$ ab -n 50 -c 50 "127.0.0.1/sample.html"
Concurrency Level: 50
Time taken for tests: 0.034 seconds
Complete requests: 50
Failed requests: 0
Total transferred: 11700 bytes
HTML transferred: 550 bytes
Requests per second: 1480.56 [#/sec] (mean)
Time per request: 33.771 [ms] (mean)
Time per request: 0.675 [ms] (mean, across all concurrent requests)
Transfer rate: 338.33 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 4 2.1 4 8
Processing: 9 18 5.2 20 24
Waiting: 2 18 5.5 20 24
Total: 9 23 5.6 25 30
Percentage of the requests served within a certain time (ms)
50% 25
66% 26
75% 26
80% 27
90% 29
95% 29
98% 30
99% 30
100% 30 (longest request)
The HTML returned here contains only plain text; it does not include any JS or CSS. I don't think performance should drop much under this load, so is there a problem with the nginx settings?
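Before blaming the nginx settings, it may be worth rerunning the benchmark with far more requests: with -n 50 -c 50 all fifty requests are fired as a single burst, so the per-request times are dominated by connection setup rather than steady-state behaviour. Something like the following (request counts chosen arbitrarily) makes the two concurrency levels more comparable:
$ ab -n 5000 -c 1 "127.0.0.1/sample.html"
$ ab -n 5000 -c 50 "127.0.0.1/sample.html"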

apache bench keep alive got 0 document length

I used the command below:
ab -k -n 1 -c 1 -v 5 $URL
and got:
LOG: header received:
HTTP/1.1 200 OK
content-length: 228
content-type: application/octet-stream
date: Fri, 26 Feb 2016 03:09:27 GMT
expires: Fri, 26 Feb 2016 03:10:27 GMT
cache-control: private, max-age=60
last-modified: Thu, 18 Feb 2016 07:02:46 GMT
connection: keep-alive
LOG: Response code = 200
..done
...
Document Length: 0 bytes
Concurrency Level: 1
Time taken for tests: 0.019 seconds
Complete requests: 1
Failed requests: 0
Write errors: 0
Keep-Alive requests: 1
Total transferred: 263 bytes
HTML transferred: 0 bytes
Requests per second: 52.44 [#/sec] (mean)
Time per request: 19.068 [ms] (mean)
Time per request: 19.068 [ms] (mean, across all concurrent requests)
Transfer rate: 13.47 [Kbytes/sec] received
ab received the header content-length: 228, but the Document Length is 0 bytes.
curl-ing the $URL works just fine and returns 228 bytes.
So what's wrong with it? Thanks!
It turned out to be an ApacheBench bug: it doesn't accept a lower-cased content-length header.
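A quick way to check whether an endpoint is affected is to look at how the server spells the header; the dump above already shows it lower-cased, and curl confirms the same (with $URL as before):
$ curl -s -D - -o /dev/null "$URL" | grep -i '^content-length'
content-length: 228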

Why doesn't Apache Bench work on google.com?

$ curl -s -D - https://www.google.com/ -o /dev/null
HTTP/1.1 200 OK
Date: Thu, 29 Oct 2015 05:33:13 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Set-Cookie: PREF=ID=1111111111111111:FF=0:TM=1446096793:LM=1446096793:V=1:S=LVeGIvKogvfq6VHi; expires=Thu, 31-Dec-2015 16:02:17 GMT; path=/; domain=.google.com
Set-Cookie: NID=72=sAIx-8ox3_AVxn6ymUjBsKzSmAXLwjNRTcV4Cj9ob1YmLkFc-lSJKvRK1kNdn1lIGruh-wH1_vctiRzKSFTG7IkJHSrVY_At_QbacsYgiI_8EOpMLe2cRIxXINj27DVpgnijGx7tKT1TCDirrunO3Bu0D4DVXz3lB0f42ZyJqOCtOJX2hprvbOOc8P8; expires=Fri, 29-Apr-2016 05:33:13 GMT; path=/; domain=.google.com; HttpOnly
Alternate-Protocol: 443:quic,p=1
Alt-Svc: quic="www.google.com:443"; p="1"; ma=600,quic=":443"; p="1"; ma=600
Accept-Ranges: none
Vary: Accept-Encoding
Transfer-Encoding: chunked
but Apache Bench has errors for all but one request:
$ ab -n 5 https://www.google.com/
This is ApacheBench, Version 2.3 <$Revision: 1528965 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking www.google.com (be patient).....done
Server Software: gws
Server Hostname: www.google.com
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES128-GCM-SHA256,2048,128
Document Path: /
Document Length: 18922 bytes
Concurrency Level: 1
Time taken for tests: 1.773 seconds
Complete requests: 5
Failed requests: 4
(Connect: 0, Receive: 0, Length: 4, Exceptions: 0)
Total transferred: 99378 bytes
HTML transferred: 94606 bytes
Requests per second: 2.82 [#/sec] (mean)
Time per request: 354.578 [ms] (mean)
Time per request: 354.578 [ms] (mean, across all concurrent requests)
Transfer rate: 54.74 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 158 179 40.8 162 252
Processing: 132 176 79.0 148 316
Waiting: 81 118 80.5 83 262
Total: 292 354 119.5 310 567
Percentage of the requests served within a certain time (ms)
50% 299
66% 321
75% 321
80% 567
90% 567
95% 567
98% 567
99% 567
100% 567 (longest request)
Why does ab have errors?
Add a -l to the command
It tells Apache Bench not to expect a constant length for every response.
This should work:
ab -l -n 5 https://www.google.com/
Output:
This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking www.google.com (be patient).....done
Server Software: gws
Server Hostname: www.google.com
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-ECDSA-CHACHA20-POLY1305,256,256
TLS Server Name: www.google.com
Document Path: /
Document Length: Variable
Concurrency Level: 1
Time taken for tests: 0.433 seconds
Complete requests: 5
Failed requests: 0
Total transferred: 67064 bytes
HTML transferred: 62879 bytes
Requests per second: 11.55 [#/sec] (mean)
Time per request: 86.588 [ms] (mean)
Time per request: 86.588 [ms] (mean, across all concurrent requests)
Transfer rate: 151.27 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 20 20 1.0 20 22
Processing: 63 66 2.7 67 69
Waiting: 62 65 2.8 66 68
Total: 83 86 3.3 87 91
Percentage of the requests served within a certain time (ms)
50% 85
66% 89
75% 89
80% 91
90% 91
95% 91
98% 91
99% 91
100% 91 (longest request)
ab has categorized these as length errors because it expects every response to be the same length. Google probably returns some kind of dynamic content on its homepage.
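If you want to confirm that the response length really does vary between requests (rather than requests actually failing), fetching the page a few times and counting the bytes is enough; this is just a sanity check, not part of the original answer:
$ for i in 1 2 3; do curl -s https://www.google.com/ | wc -c; done
If the byte counts differ from run to run, ab's Length failures are only flagging that variability.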
See also: Load Testing with AB ... fake failed requests (length)

Why is the round-trip time different between two test hosts?

I have written an HTTP PUT client (using the libcurl library) to put a file onto an Apache WebDAV server, used tcpdump to capture the packets on the server side, and then used tcptrace (www.tcptrace.org) to analyze the dump file. Below is the result.
Host a is the client side, host b is the server side:
a->b: b->a:
total packets: 152120 total packets: 151974
ack pkts sent: 152120 ack pkts sent: 151974
pure acks sent: 120 pure acks sent: 151854
sack pkts sent: 0 sack pkts sent: 0
dsack pkts sent: 0 dsack pkts sent: 0
max sack blks/ack: 0 max sack blks/ack: 0
unique bytes sent: 3532149672 unique bytes sent: 30420
actual data pkts: 152000 actual data pkts: 120
actual data bytes: 3532149672 actual data bytes: 30420
rexmt data pkts: 0 rexmt data pkts: 0
rexmt data bytes: 0 rexmt data bytes: 0
zwnd probe pkts: 0 zwnd probe pkts: 0
zwnd probe bytes: 0 zwnd probe bytes: 0
outoforder pkts: 0 outoforder pkts: 0
pushed data pkts: 3341 pushed data pkts: 120
SYN/FIN pkts sent: 0/0 SYN/FIN pkts sent: 0/0
req 1323 ws/ts: N/Y req 1323 ws/ts: N/Y
urgent data pkts: 0 pkts urgent data pkts: 0 pkts
urgent data bytes: 0 bytes urgent data bytes: 0 bytes
mss requested: 0 bytes mss requested: 0 bytes
max segm size: 31856 bytes max segm size: 482 bytes
min segm size: 216 bytes min segm size: 25 bytes
avg segm size: 23237 bytes avg segm size: 253 bytes
max win adv: 125 bytes max win adv: 5402 bytes
min win adv: 125 bytes min win adv: 5402 bytes
zero win adv: 0 times zero win adv: 0 times
avg win adv: 125 bytes avg win adv: 5402 bytes
initial window: 15928 bytes initial window: 0 bytes
initial window: 1 pkts initial window: 0 pkts
ttl stream length: NA ttl stream length: NA
missed data: NA missed data: NA
truncated data: 0 bytes truncated data: 0 bytes
truncated packets: 0 pkts truncated packets: 0 pkts
data xmit time: 151.297 secs data xmit time: 150.696 secs
idletime max: 44571.3 ms idletime max: 44571.3 ms
throughput: 23345867 Bps throughput: 201 Bps
RTT samples: 151915 RTT samples: 120
RTT min: 0.0 ms RTT min: 0.1 ms
RTT max: 0.3 ms RTT max: 40.1 ms
RTT avg: 0.0 ms RTT avg: 19.9 ms
RTT stdev: 0.0 ms RTT stdev: 19.8 ms
RTT from 3WHS: 0.0 ms RTT from 3WHS: 0.0 ms
RTT full_sz smpls: 74427 RTT full_sz smpls: 60
RTT full_sz min: 0.0 ms RTT full_sz min: 39.1 ms
RTT full_sz max: 0.3 ms RTT full_sz max: 40.1 ms
RTT full_sz avg: 0.0 ms RTT full_sz avg: 39.6 ms
RTT full_sz stdev: 0.0 ms RTT full_sz stdev: 0.3 ms
post-loss acks: 0 post-loss acks: 0
segs cum acked: 89 segs cum acked: 0
duplicate acks: 0 duplicate acks: 0
triple dupacks: 0 triple dupacks: 0
max # retrans: 0 max # retrans: 0
min retr time: 0.0 ms min retr time: 0.0 ms
max retr time: 0.0 ms max retr time: 0.0 ms
avg retr time: 0.0 ms avg retr time: 0.0 ms
sdv retr time: 0.0 ms sdv retr time: 0.0 ms
According to the results above, the RTT from the client to the server is small, but from the server side to the client side it is large. Can anyone help explain this to me?
Because of this:
unique bytes sent: 3532149672 unique bytes sent: 30420
actual data pkts: 152000 actual data pkts: 120
actual data bytes: 3532149672 actual data bytes: 30420
a->b is sending a steady flow of data, which ensures buffers get filled and things get pushed. b->a is only sending a few ACKs and the like, doing next to nothing at all, so as a result things sit in buffers for a while (a few ms).
In addition to that, RTT is round-trip time: the time from when the application queues a packet for sending until the corresponding response is received. Since host a is busy pushing data, and probably filling its own buffers, there is a small amount of additional overhead before something from b gets acknowledged.
Firstly, host b sent very little data (a very small sample size). Secondly, I suspect that host a has an asymmetric Internet connection (e.g. 10MB/1MB).
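For reference, two-column statistics like the ones in the question come from tcptrace's long output with RTT calculation enabled; a typical capture-and-analyze sequence (interface, port, and file names here are placeholders) looks roughly like this:
$ tcpdump -i eth0 -w webdav.pcap port 80   # on the server, capture the PUT traffic
$ tcptrace -l -r webdav.pcap               # long output (-l) with RTT statistics (-r)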

CouchDB / MochiWeb : negative effect of persistent connections

I have a pretty straightforward setup of CouchDB on my Mint/Debian box. My Java webapp was suffering rather long delays when querying CouchDB, so I started looking for the cause.
EDIT: The query pattern is lots of small queries on small JSON objects (around 300 bytes up / 1 KB down).
The Wireshark dumps look pretty good, showing mostly 3-5 ms request-response turnaround. JVM frame sampling showed me that the socket code (client-side queries to the Couch) is somewhat busy, but nothing remarkable. Then I tried to profile the same thing with ApacheBench and oops: I now see that keep-alive introduces a steady extra 39 ms delay over non-persistent setups.
Does anyone know how to explain this? Maybe persistent connections increase the congestion window on the TCP layer and then idle out due to TCP_WAIT and the small request/response sizes, or something like that?
Should this option (TCP_WAIT) ever be switched ON for loopback TCP connections?
w#mint ~ $ uname -a
Linux mint 2.6.39-2-486 #1 Tue Jul 5 02:52:23 UTC 2011 i686 GNU/Linux
w#mint ~ $ curl http://127.0.0.1:5984/
{"couchdb":"Welcome","version":"1.1.1"}
Running with keep-alive: average 40 ms per request.
w#mint ~ $ ab -n 1024 -c 1 -k http://127.0.0.1:5984/
>>>snip
Server Software: CouchDB/1.1.1
Server Hostname: 127.0.0.1
Server Port: 5984
Document Path: /
Document Length: 40 bytes
Concurrency Level: 1
Time taken for tests: 41.001 seconds
Complete requests: 1024
Failed requests: 0
Write errors: 0
Keep-Alive requests: 1024
Total transferred: 261120 bytes
HTML transferred: 40960 bytes
Requests per second: 24.98 [#/sec] (mean)
Time per request: 40.040 [ms] (mean)
Time per request: 40.040 [ms] (mean, across all concurrent requests)
Transfer rate: 6.22 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 1 40 1.4 40 48
Waiting: 0 1 0.7 1 8
Total: 1 40 1.3 40 48
Percentage of the requests served within a certain time (ms)
50% 40
>>>snip
95% 40
98% 41
99% 44
100% 48 (longest request)
No keepalive, and voila - 1 ms per request, mostly.
w#mint ~ $ ab -n 1024 -c 1 http://127.0.0.1:5984/
>>>snip
Time taken for tests: 1.080 seconds
Complete requests: 1024
Failed requests: 0
Write errors: 0
Total transferred: 236544 bytes
HTML transferred: 40960 bytes
Requests per second: 948.15 [#/sec] (mean)
Time per request: 1.055 [ms] (mean)
Time per request: 1.055 [ms] (mean, across all concurrent requests)
Transfer rate: 213.89 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 1 1 1.0 1 11
Waiting: 1 1 0.9 1 11
Total: 1 1 1.0 1 11
Percentage of the requests served within a certain time (ms)
50% 1
>>>snip
80% 1
90% 2
95% 3
98% 5
99% 6
100% 11 (longest request)
Okay, now with keep-alive on but also asking to close the connection via an HTTP header. Also about 1 ms per request.
w#mint ~ $ ab -n 1024 -c 1 -k -H 'Connection: close' http://127.0.0.1:5984/
>>>snip
Time taken for tests: 1.131 seconds
Complete requests: 1024
Failed requests: 0
Write errors: 0
Keep-Alive requests: 0
Total transferred: 236544 bytes
HTML transferred: 40960 bytes
Requests per second: 905.03 [#/sec] (mean)
Time per request: 1.105 [ms] (mean)
Time per request: 1.105 [ms] (mean, across all concurrent requests)
Transfer rate: 204.16 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 1 1 1.2 1 14
Waiting: 0 1 1.1 1 13
Total: 1 1 1.2 1 14
Percentage of the requests served within a certain time (ms)
50% 1
>>>snip
80% 1
90% 2
95% 3
98% 6
99% 7
100% 14 (longest request)
Yeah, this is related to the TCP socket setup options. The following configuration leveled all three cases off at about 1 ms per request:
[httpd]
socket_options = [{nodelay, true}]
See this for details:
http://wiki.apache.org/couchdb/Performance#Network
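If you would rather not edit the ini file by hand and restart, the same option can usually be set through CouchDB 1.x's _config HTTP API (a sketch; it assumes the default port and admin access):
$ curl -X PUT http://127.0.0.1:5984/_config/httpd/socket_options \
       -d '"[{nodelay, true}]"'
Either way, nodelay sets TCP_NODELAY and disables Nagle's algorithm, which is what removes the roughly 40 ms keep-alive penalty measured above.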
