Stress-testing ASP.NET/IIS with WCAT

I'm trying to set up a stress/load test using the WCAT toolkit included in the IIS Resources.
Using LogParser, I've put together a UBR file with the configuration. It looks something like this:
[Configuration]
NumClientMachines: 1 # number of distinct client machines to use
NumClientThreads: 100 # number of threads per machine
AsynchronousWait: TRUE # asynchronous wait for think and delay
Duration: 5m # length of experiment (m = minutes, s = seconds)
MaxRecvBuffer: 8192K # suggested maximum received buffer
ThinkTime: 0s # maximum think-time before next request
WarmupTime: 5s # time to warm up before taking statistics
CooldownTime: 6s # time to cool down at the end of the experiment
[Performance]
[Script]
SET RequestHeader = "Accept: */*\r\n"
APP RequestHeader = "Accept-Language: en-us\r\n"
APP RequestHeader = "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705)\r\n"
APP RequestHeader = "Host: %HOST%\r\n"
NEW TRANSACTION
    classId = 1
    NEW REQUEST HTTP
        ResponseStatusCode = 200
        Weight = 45117
        verb = "GET"
        URL = "http://Url1.com"
NEW TRANSACTION
    classId = 3
    NEW REQUEST HTTP
        ResponseStatusCode = 200
        Weight = 13662
        verb = "GET"
        URL = "http://Url1.com/test.aspx"
Does it look OK?
I execute the controller with this command: wcctl -z StressTest.ubr -a localhost
The client(s) are executed like this: wcclient localhost
When the client is executed, I get this error: main client thread Connect Attempt 0 Failed. Error = 10061
Has anyone in this world ever used WCAT?

I'd look at updating to WCAT 6.3 - available here for x86 and here for x64.
They've changed the settings/scenario file structures, which is a little painful, but it should suit your needs.
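To give a feel for the new format: a 6.3 scenario file uses C-style blocks instead of the old INI-style sections, roughly as sketched below. This is from memory rather than from the shipped samples, so treat the exact attribute names and the placement of the timing settings as assumptions and check the documentation installed with WCAT 6.3:
scenario
{
    warmup   = 30;
    duration = 120;
    cooldown = 10;

    default
    {
        version    = HTTP11;
        statuscode = 200;
        close      = ka;

        setheader
        {
            name  = "Host";
            value = server();
        }
    }

    transaction
    {
        id     = "homepage";
        weight = 45117;

        request
        {
            verb = GET;
            url  = "/";
        }
    }
}
The client and virtual-client counts, the target server and the path to this scenario file move into a separate settings file that you pass to the controller.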

I've just started evaluating WCAT 6.3, and I'm afraid my experience has been a bit disappointing in terms of online support/community.
There is also a major bug in the wcat.wsf script - see:
http://forums.iis.net/t/1153312.aspx
I'm now struggling with getting performance counter measurement working.

I've had good success with WCAT, though I'm struggling with simulating NTLM connections.
I'm using 6.3, so my config files look very different from yours. Some gotchas I noted along the way:
+ Make sure you've got your firewall turned off, or holes punched through for WMI (see the sketch after this answer).
+ Everything you set in the request header has a tremendous impact on throughput, so apples-to-apples comparisons must use the same request headers.
+ Remote calls with multiple clients work only after correcting the bug identified by sthorogood.
Once I crossed those hurdles, I got great results from WCAT. It tests quickly, repeatably, and aggressively.
Best of luck,
Kevin
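On the firewall/WMI point (the first gotcha above): rather than turning the firewall off completely, on the Windows Server 2003 / XP generation the command I remember for opening the remote-administration ports that WMI uses is the one below - verify the syntax with netsh firewall set service /? before relying on it:
netsh firewall set service remoteadmin enable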

I don't have an answer for you, but have you considered using other tools for your testing? The WCAT tool seems pretty limited and complicated to use.
OpenSTA and JMeter are good open source tools for load/stress/performance testing.

OpenSTA and JMeter look very Apache-like. I'm running IIS on Windows Server 2003.

Have you looked at the Microsoft Web Application Stress Tool?

For performance counters, you can pass a .prf file with -p in the same command you run for the controller:
wcctl -c config.txt -d distribution.txt -s script.txt -a localhost -p performance.prf
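I don't have the exact layout at hand, but as far as I remember the .prf file is essentially a list of the Windows performance counter paths you want sampled during the run (some versions also take a sampling interval at the top); treat the lines below as an illustrative assumption and compare against the samples that ship with WCAT:
\Processor(_Total)\% Processor Time
\Memory\Available Bytes
\ASP.NET Applications(__Total__)\Requests/Sec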

Related

airflow webserver suddenly stopped after long time of no issues, "No response from gunicorn"

I've had an airflow webserver -D daemon process (v1.10.7) running on a machine (CentOS 7) for a long time. Suddenly I saw that the webserver could no longer be accessed, and checking the airflow-webserver.log I saw...
[airflow@airflowetl airflow]$ cat airflow-webserver.log
2020-10-23 00:57:15,648 ERROR - No response from gunicorn master within 120 seconds
2020-10-23 00:57:15,649 ERROR - Shutting down webserver
(nothing of note in airflow-webserver.err)
[airflow@airflowetl airflow]$ cat airflow-webserver.err
/home/airflow/.local/lib/python3.6/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
""")
The airflow.cfg values for the webserver section look like...
[webserver]
# The base url of your website as airflow cannot guess what domain or
# cname you are using. This is used in automated emails that
# airflow sends to point links to the right web server
#base_url = http://localhost:8080
base_url = http://airflowetl.co.local:8080
# The ip specified when starting the web server
web_server_host = 0.0.0.0
# The port on which to run the web server
web_server_port = 8080
# Paths to the SSL certificate and key for the web server. When both are
# provided SSL will be enabled. This does not change the web server port.
web_server_ssl_cert =
web_server_ssl_key =
# Number of seconds the webserver waits before killing gunicorn master that doesn't respond
web_server_master_timeout = 120
# Number of seconds the gunicorn webserver waits before timing out on a worker
#web_server_worker_timeout = 120
web_server_worker_timeout = 300
# Number of workers to refresh at a time. When set to 0, worker refresh is
# disabled. When nonzero, airflow periodically refreshes webserver workers by
# bringing up new ones and killing old ones.
worker_refresh_batch_size = 1
# Number of seconds to wait before refreshing a batch of workers.
worker_refresh_interval = 30
# Secret key used to run your flask app
secret_key = my_key
# Number of workers to run the Gunicorn web server
workers = 4
# The worker class gunicorn should use. Choices include
# sync (default), eventlet, gevent
worker_class = sync
Ultimately, I just restarted the process as a daemon again (airflow webserver -D; should I have deleted the old airflow-webserver.log and .err files first?), but I'm not sure what would make this happen, since it had run without problems for months before this.
Could anyone with more experience explain what could have happened after all this time and how I could prevent it in the future? Are there any issues with running DAGs, or anything else I should check for, that this temporary unexpected shutdown of the webserver may have caused?
I am experiencing the same issue, and it only started (very infrequently) when I changed the following two config parameters in the webserver section.
worker_refresh_interval = 120
workers = 2
However, my parameters are also set quite differently from yours, so I'll share them here.
rbac = True
web_server_host = 0.0.0.0
web_server_port = 8080
web_server_master_timeout = 600
web_server_worker_timeout = 600
default_ui_timezone = Europe/Amsterdam
reload_on_plugin_change = True
After comparing the two: the parameters I changed were at their defaults in your config (the same as mine before I changed them), so it seems to be caused by a combination of more parameters.
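Not from this thread, but as a general pattern for the "how do I prevent/recover from this" part of the question: instead of airflow webserver -D, you can run the webserver in the foreground under a supervisor that restarts it when the gunicorn master gets killed. A rough systemd sketch, where the unit name, user and paths are assumptions based on the log paths above rather than your actual setup:
# /etc/systemd/system/airflow-webserver.service (hypothetical unit)
[Unit]
Description=Airflow webserver
After=network.target

[Service]
User=airflow
Environment=AIRFLOW_HOME=/home/airflow/airflow
# no -D here, so systemd supervises the process directly
ExecStart=/home/airflow/.local/bin/airflow webserver
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target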

Fail2ban for nginx post flood ignores time intervals

I'm trying to create a fail2ban filter that will ban a host when it sends more than 100 POST requests within a 30-second interval.
jail.local:
[nginx-postflood]
enabled = false
filter = nginx-postflood
action = myaction
logpath = /var/log/nginx/access.log
findtime = 30
bantime = 100
maxretry = 100
nginx-postflood.conf
[Definition]
failregex = ^<HOST>.*"POST.*
ignoreregex =
Using grep I was able to test the regular expressions, and indeed they match the host and the POST requests.
The problem is that it bans any host that performs at least one POST request. This likely means it's not taking the findtime or maxretry options into consideration. In my opinion it's a timestamp issue.
Sample line of nginx log:
5.5.5.5 - user [05/Aug/2014:00:00:09 +0200] "POST /auth HTTP/1.1" 200 6714 "http://referer.com" "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0"
Any help?
I guess it may be too late for an answer, but anyway...
The excerpt you have posted has the filter disabled:
enabled = false
You also haven't mentioned your Fail2Ban version, and the syslog/fail2ban logs for this jail are missing.
I tested your filter on fail2ban 0.9.3-1 and it works fine, although I had to enable it and drop the action = myaction line, since you haven't said what you expect fail2ban to do.
Therefore this filter should work fine, provided that it's enabled and the action is correct as well.
What is happening in the provided example is that your filter is disabled and fail2ban is using another filter which checks the same log file and matches your regex but has more restrictive rules, i.e. it bans after one request.
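For reference, a minimal version of the jail along the lines of what is described above (enabled flipped to true and the custom action dropped; put your own action back once the jail is confirmed to trigger) would be:
[nginx-postflood]
enabled  = true
filter   = nginx-postflood
logpath  = /var/log/nginx/access.log
findtime = 30
bantime  = 100
maxretry = 100
You can also dry-run the filter against the log without banning anything:
fail2ban-regex /var/log/nginx/access.log /etc/fail2ban/filter.d/nginx-postflood.conf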

How can I serve HTTP slowly?

I'm working on an HTTP client and I would like to test it on requests that take some time to finish. I could certainly come up with a Python script to suit my needs, something like:
import time

def slow_server(environ, start_response):
    start_response('200 OK', [('Content-Type', 'application/octet-stream')])
    with getSomeFile(environ) as file_to_serve:  # getSomeFile: placeholder for the file lookup
        block = file_to_serve.read(1024)
        while block:
            yield block
            time.sleep(1.0)
            block = file_to_serve.read(1024)
but this feels like a problem others have already encountered. Is there an easy way to serve static files with an absurdly low bandwidth cap, short of a full-scale server like Apache or nginx?
I'm working on Linux, and the way I've been testing so far is with python -m SimpleHTTPServer 8000 in a directory full of files to serve. I'm equally interested in another simple command-line server, or in a way to do bandwidth limiting with one or a few iptables commands on TCP port 8000 (or whatever would work).
The solution I'm going with for now uses a "real" web server, but a much easier one to configure: lighttpd. I've added the following file to my path (it's in ~/bin):
#! /usr/sbin/lighttpd -Df
server.document-root = "/dev/null"
server.modules = ("mod_proxy")
server.kbytes-per-second = env.LIGHTTPD_THROTTLE
server.port = env.LIGHTTPD_PORT
proxy.server = ( "" => (( "host" => "127.0.0.1", "port" => env.LIGHTTPD_PROXY )))
This is a lighttpd config file that acts as a reverse proxy to localhost; the source and destination ports, as well as a total maximum bandwidth for the server, are given as environment variables, so it can be invoked like:
$ cd /path/to/some/files
$ python -m SimpleHTTPServer 8000 &
$ LIGHTTPD_THROTTLE=60 LIGHTTPD_PORT=8001 LIGHTTPD_PROXY=8000 throttle.lighttpd
to proxy the Python file server on port 8000 at a low 60 KB per second on port 8001. Obviously, lighttpd could be used to serve the files itself, but this little script can be used to make any HTTP server slow.
On Windows you can use Fiddler, which is an HTTP proxy debugging tool, to simulate very slow speeds. Maybe a similar tool exists on whatever OS you are using.
I remember I once had the same question, and my search turned up an Apache2 module that goes by the name of mod_bw (mod_bandwidth, that is). It served me well for my testing.
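For the iptables/tc route mentioned in the question, a rough sketch (assumptions: Linux, the test server on loopback, and the cap applied to all of lo rather than just port 8000, which keeps it to a single command):
# cap loopback bandwidth with a token bucket filter
sudo tc qdisc add dev lo root tbf rate 64kbit burst 1600 latency 400ms
# ... run the tests against http://localhost:8000 ...
# remove the throttle again
sudo tc qdisc del dev lo root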

Configuring plone.recipe.varnish in Plone 4

I am using plone.recipe.varnish 1.2.2 in my Plone application.
Below is a section of my buildout:
parts =
    ...
    instance
    paster
    varnish-build
    varnish
    plonesite
    ...
[varnish-build]
recipe = zc.recipe.cmmi
url = http://downloads.sourceforge.net/project/varnish/varnish/2.1.3/varnish-2.1.3.tar.gz
[varnish]
recipe = plone.recipe.varnish
daemon = ${buildout:parts-directory}/varnish-build/sbin/varnishd
bind = 127.0.0.1:8000
backends = 127.0.0.1:9000
cache-size = 1G
I cannot conclusively determine whether it works. My Plone application serves on port 9000, so I want to test whether Varnish really works by going to http://localhost:8000, but I get nothing. The browser says "Firefox can't establish a connection to the server at 127.0.0.1:8000."
Am I doing this wrong? I have followed the instructions provided here but no headway.
How does one really configure plone.recipe.varnish in Plone, and how do you actually test that it works on a local development machine?
The recipe does not start your varnish server. It only configures it for you.
Use something like supervisord to manage the process, or start it by hand with bin/varnish.
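If you go the supervisord route, a minimal program section might look roughly like the sketch below; the buildout path, and the assumption that bin/varnish can be kept in the foreground (e.g. via varnishd's -F flag), need checking against what the recipe actually generates:
[program:varnish]
; -F keeps varnishd in the foreground so supervisord can manage it
command = /path/to/buildout/bin/varnish -F
directory = /path/to/buildout
autostart = true
autorestart = true
redirect_stderr = true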

Drupal site - Memcache Connection errors

We are trying to performance-tune our Drupal site.
We are using Siege to measure performance (as a Drupal visitor).
Env:
Nginx + FastCGI+ Memcache
Siege runs fine for a few seconds, and then we run into connection errors:
Example:
HTTP/1.1 200 29.18 secs: 5877 bytes ==> /
HTTP/1.1 200 29.39 secs: 5877 bytes ==> /
warning: socket: -1656235120 select timed out: Connection timed out
warning: socket: -1673020528 select timed out: Connection timed out
Using the same Siege test configuration, Nginx + FastCGI + Drupal cache seems to work fine.
Example:
HTTP/1.1 200 1.41 secs: 5868 bytes ==> /
HTTP/1.1 200 1.40 secs: 5868 bytes ==> /
As you can see, the response time is much higher with memcache, in addition to the connection errors.
Any idea what could be wrong here, and why Drupal is throwing errors with memcache under load?
Memcache runs on a separate instance; we have allocated 2 GB of memory for memcache.
I guess that you are running out of memcached connections. Please check your memcached installation with a simple script every second, then start Siege. I guess your memcached stops responding after a while.
Test memcache php script:
<?php
$memcache = new Memcache;
$memcache->connect('localhost', 11211) or die ('Unable to connect');
$version = $memcache->getVersion();
echo 'Server version: '.$version;
?>
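To run it every second, something like this will do (assuming the script above is saved as memcache_test.php and the PHP CLI is installed):
watch -n 1 php memcache_test.php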
What I guess is happening is that you have not disabled persistent connections in memcache, so they hang around in the PHP threads. Memcached can serve ~1023 of them at a time, and that might not be enough while sieging.
You might also try ab, the Apache benchmarking tool, with a close look at the -c switch. Play around with it and see how the results change for different values.
Finally, you should run a tcpdump on your memcached port (usually 11211) on the PHP machine to find out what is happening to the connections. Does Drupal start them? Does the other host respond with an RST, or does it time out?
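For example, assuming the default port, something along these lines on the PHP machine:
tcpdump -i any -nn tcp port 11211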
There was a bug in the memcached PHP API documentation that said the connections are non-persistent by default. They are persistent by default (well, they were at the time I had the problem).
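If you want to rule persistence out at the PHP extension level (the Drupal memcache module may have its own setting for this, which I haven't checked), the third argument of Memcache::addServer controls it, and connect() vs pconnect() does the same for a single connection:
<?php
$memcache = new Memcache;
// third argument = persistent; pass false to force a non-persistent connection
$memcache->addServer('localhost', 11211, false);
// for a single connection, connect() is non-persistent, pconnect() is persistent
// $memcache->connect('localhost', 11211);
?>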
Feel free to comment on this answer; I'll read the comments and assist further if necessary.
