Load-stress testing web applications deployed in OpenStack instances under an auto-scaling group - servlets

I am testing the auto-scaling feature of OpenStack. In my test set-up, Java servlet applications are deployed in Tomcat web servers behind an HAProxy load balancer. I aim to stress-test the application with JMeter, to see how it scales and what the response times look like. However, I observe that HAProxy (or something else) terminates the connection as soon as the onComplete signal is sent by one of the member instances. Consequently, the subsequent responses from the remaining servers are reported as failures (timeouts). I have configured the HAProxy server to use a round-robin algorithm with sticky sessions. See the attached JMeter results tree; I am not sure of the next step to undertake. The web applications are asynchronous, hence my expectation was that the client (HAProxy in this case) should wait until the last thread is submitted before sending the response.
Are there issues with my approach, or flaws in my set-up?
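For reference, a minimal sketch of the async servlet pattern described above (Servlet 3.0 API; the class name, path, and timings are illustrative, not the actual application):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import javax.servlet.AsyncContext;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    @WebServlet(urlPatterns = "/work", asyncSupported = true)
    public class AsyncWorkServlet extends HttpServlet {
        private final ExecutorService pool = Executors.newFixedThreadPool(4);

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
            AsyncContext ctx = req.startAsync();
            ctx.setTimeout(60_000); // container-side async timeout (ms)
            pool.execute(() -> {
                try {
                    Thread.sleep(5_000); // stand-in for the real long-running work
                    ctx.getResponse().getWriter().println("done");
                } catch (Exception e) {
                    // swallow for the sketch; always release the connection below
                } finally {
                    ctx.complete(); // fires onComplete; response goes back through HAProxy
                }
            });
        }
    }

Note that every hop on the path has to be willing to hold the connection open until complete() is called; in HAProxy that is governed by settings such as timeout server, whose typical values are much shorter than a long-running async job may need.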

Related

R-Shiny script timeout behind a load-balancer

I am testing a Shiny script on an instance in GCP. The instance resides behind a load-balancer that serves as a front end, with a static IP address and an SSL certificate to secure connections. I configured the GCP instance as part of a backend service to which the load-balancer forwards requests. The connection between the load-balancer and the instance is not secured!
The issue:
Accessing the Shiny script via the load-balancer works, but the browser screen grays out (times out) on the client side shortly after the connection is initiated. Once the screen grays out, I have to start over.
If I try to access the Shiny script on the GCP instance directly (not through the load-balancer), the script works fine. I suppose that the problem is in the load-balancer, not the script.
I appreciate any help with this issue.
Context: Shiny uses a WebSocket (RFC 6455) for its constant client-server communication. If, for whatever reason, this WebSocket connection gets disconnected, the user experience is the described "greying out". Fortunately, GCP supports WebSockets.
However, it seems that your load balancer has an unexpectedly low HTTP timeout value set.
Depending on what type of load balancer you are using (TCP, HTTPS), this can be configured differently. For their HTTPS offering:
The default value for the backend service timeout is 30 seconds. The full range of timeout values allowed is 1-2,147,483,647 seconds.
Consider increasing this timeout under any of these circumstances:
[...]
The connection is upgraded to a WebSocket.
Answer:
You should be able to increase the timeout for your backend service with the help of this support document.
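With the gcloud CLI, that boils down to something like the following (assuming a global backend service named web-backend-service, which is an illustrative name; the value is in seconds):

    gcloud compute backend-services update web-backend-service \
        --global --timeout=86400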
Mind you, depending on your configuration there could be more proxies involved which might complicate things.
Alternatively you can try to prevent any timeout by adding a heartbeat mechanism to the Shiny application. Some ways of doing this have been discussed in this issue on GitHub.

Understanding load balancing in ASP.NET

I'm writing a website that is going to start using a load balancer and I'm trying to wrap my head around it.
Does IIS just do all the balancing for you?
Do you have a separate web layer that sits on the distributed server that does some work before sending to the sub server, like auth or other work?
It seems like a lot of the articles I keep reading don't really give me a straight answer, or I'm just not understanding them correctly. I'd like to get my head around how true load balancing works from a technical side, and if anyone has any code to share that would also be nice.
I understand caching is going to be a problem, but that's a different topic; session as well.
IIS does not have a load balancer by default, but you can use at least two Microsoft technologies:
Application Request Routing, which integrates with IIS; there you should ideally have a separate web layer to do the routing work,
Network Load Balancing, which is integrated with Microsoft Windows Server; there you can join existing servers into an NLB cluster.
Neither of these technologies requires any code per se; it is a matter of infrastructure. But you must, of course, keep the load-balanced environment in mind during development. For example, to make web sites truly balanced, they should be stateless. Otherwise you will have to provide so-called stickiness between the client and the server, so that the same client always connects to the same server.
To make a service stateless, do not persist any state (Session, for example, in the case of an ASP.NET website) on the server itself, but on an external server shared between all servers in the farm. So it is common, for example, to use an external ASP.NET session server (StateServer or SQLServer modes) for all sites in the cluster.
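For instance, pointing all the farm's sites at one state server looks roughly like this in web.config (the host name here is illustrative; 42424 is the default StateServer port):

    <system.web>
      <sessionState mode="StateServer"
                    stateConnectionString="tcpip=state.example.local:42424"
                    timeout="20" />
    </system.web>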
EDIT:
Just to clarify a few things, a few words about both mentioned technologies:
NLB works at the network level (as a networking driver, in fact), so it has no knowledge of the applications being used. You create so-called clusters consisting of a few machines/servers and expose them as a single IP address. Another machine can then use this IP like any other IP, but connections will be routed automatically to one of the cluster's machines. A cluster is configured on each server; there is no external, additional routing machine. Depending on the cluster's settings, as already mentioned, stickiness can be enabled or disabled (called here Single or None affinity). There is also a Load weight parameter, so you can set a weighted load distribution, sending more connections to the fastest machine, for example. But this parameter is static; it cannot be based dynamically on network, CPU, or any other usage. In fact, NLB does not care whether the target application is even running; it just routes network traffic to the selected machine. It does notice when a server goes offline, though, so no traffic will be routed there. The advantage of NLB is that it is quite lightweight and requires no additional machines.
ARR is much more sophisticated. It is built as a module on top of IIS and is designed to make routing decisions at the application level; network load balancing is only one of its features, as it is a more complete routing solution. It has "rule-based routing, client and host name affinity, load balancing of HTTP server requests, and distributed disk caching", as Microsoft states. There you create Server Farms with many options, such as the load-balancing algorithm, load distribution, and client stickiness. You can define health tests and routing rules to forward requests to other servers. The disadvantage of all of this is that ARR should be installed on a dedicated machine, so it takes more resources (and costs more).
NLB & ARR - since a single ARR machine can be a single point of failure, Microsoft states that it is worth considering creating an NLB cluster of ARR machines.
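A rough sketch of that combination with the NetworkLoadBalancingClusters PowerShell module (the cluster name, IP, node name, and interface names are made up for illustration; check the cmdlet parameters against the module's documentation):

    # Create an NLB cluster on the first ARR machine, then join the second one.
    New-NlbCluster -InterfaceName "Ethernet" -ClusterName "arr-cluster" `
        -ClusterPrimaryIP 10.0.0.100
    Add-NlbClusterNode -InterfaceName "Ethernet" -NewNodeName "ARR2" `
        -NewNodeInterface "Ethernet"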
Does IIS just do all the balancing for you?
Yes, if you configure Application Request Routing.
Do you have a separate web layer that sits on the distributed server
Yes.
that does some work before sending to the sub server, like auth or other work?
No, ARR is pretty 'dumb':
IIS ARR doesn't provide any pre-authentication. If pre-auth is a requirement then you can look at Web Application Proxy (WAP) which is available in Windows Server 2012 R2.
It just acts as a transparent proxy that accepts and forwards requests, while adding some caching when configured.
For authentication you can look at Windows Server 2012's Web Application Proxy.
Some tips, and perhaps items to get yourself fully acquainted with:
ARR, as the answers above state, is a "proxy" that handles the traffic from your users to your servers.
You can handle state as Konrad points out, or you can have ARR do "sticky" sessions (ensuring that a client always goes to "this server" - presumably the server that maintains state for that specific client). See the discussion/comments on that answer - it's great.
I haven't worn an IT/server hat for a long time and frankly haven't touched clustering hands-on (it was always "handled for me automagically" by some provider), so I asked our host: "what/how is replication among our cluster/farm done?" The question covers things like:
I'm only working on/setting things on one server; does that get replicated across the X VMs in our cluster/farm? How long does that take?
What about dynamically generated code and/or user-generated files (the file system)? If a file is on VM1's file system, and I have 10 load-balanced VMs, and the client can hit any one of them at any time, then...?
What about encryption? E.g. if you use DPAPI to encrypt web.config sections (DB connection strings, for example), what is the impact of that? It is based on the machine key, and now you have multiple machines/VMs. An RSA re-write...?
SSL: ARR can handle this for you as well, and that's great! But as with all power comes a "con": if you check/validate in your code, e.g. HttpRequest.IsSecureConnection, it will always be false. Your servers/VMs don't have the certificate; ARR does. The encrypted connection is between the client and ARR; the connection from ARR to your servers/VMs is not encrypted. As the link explains, if you prefer it the other way around (no offloading), you can do that - but then all your servers/VMs need a certificate (and how that pertains to "replication" above starts popping into your head).
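A common workaround is to trust a forwarded-protocol header from the proxy instead of the raw connection flag. This assumes you configure the proxy to add such a header (ARR does not add X-Forwarded-Proto out of the box); the sketch below is a Java servlet filter, since the same problem exists behind any offloading proxy:

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletRequest;

    public class HttpsAwareFilter implements Filter {
        @Override public void init(FilterConfig cfg) {}
        @Override public void destroy() {}

        @Override
        public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
                throws IOException, ServletException {
            HttpServletRequest http = (HttpServletRequest) req;
            // Secure if TLS terminates here, OR the (assumed) proxy header says https.
            boolean secure = http.isSecure()
                    || "https".equalsIgnoreCase(http.getHeader("X-Forwarded-Proto"));
            req.setAttribute("effectivelySecure", secure); // downstream code checks this
            chain.doFilter(req, resp);
        }
    }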
Not meant to be comprehensive, just listing things out from memory... HTH.

Looking for a message bus for HTTP load sharing

I have an HTTP application with standalone workers that perform well. The issue is that sometimes they need to purge and rebuild their caches, during which they stop responding for up to 30 seconds.
I have looked into a number of load balancers, but none of them seem to address this issue. I have tried Perlbal and some Apache modules (like fcgid) and they happily send requests to workers that are busy rebuilding their cache.
So my take is this: isn't there some kind of message-bus solution where all HTTP requests are queued up, leaving it up to the workers to process messages when they are able to?
Or, alternatively, a load balancer that can take into account that the workers are sometimes unable to respond.
Added later: I am aware that one strategy could be for the workers to use a management protocol to inform the load balancer when they are busy, but that solution seems kludgy, and I worry that there will be edge cases that result in spurious errors.
If you use Amazon Web Services Load Balancer you can achieve your desired result. You can mark an EC2 Instance behind an Elastic Load Balancer (ELB) as unhealthy while it does this cache purge and rebuild.
What I would do is create an additional endpoint on each instance, called rebuild_cache for example. So if you have 5 instances behind your ELB, you can write a script that hits each individual instance (not through the load balancer) on that rebuild_cache endpoint. This endpoint would do 3 things (see the sketch after these steps):
Mark the instance as unhealthy. The load balancer will realize it is unhealthy after a failed health check (the timing and threshold of health checks are configurable from the AWS Web Console).
Run your cache purge and rebuild.
Mark the instance as healthy. The load balancer will run health checks on the instance and only start sending it traffic again once it has passed the required number of consecutive healthy checks (again, this threshold is defined through the ELB health configuration).
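A sketch of what that could look like, shown as a Java servlet (the endpoint names come from the answer above; everything else is illustrative, not an AWS API):

    import java.io.IOException;
    import java.util.concurrent.atomic.AtomicBoolean;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    @WebServlet(urlPatterns = {"/health", "/rebuild_cache"})
    public class HealthServlet extends HttpServlet {
        // While true, the ELB health check sees 503s and stops routing traffic here.
        private static final AtomicBoolean rebuilding = new AtomicBoolean(false);

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            if ("/rebuild_cache".equals(req.getServletPath())) {
                rebuilding.set(true);      // step 1: start failing health checks
                try {
                    rebuildCache();        // step 2: the actual purge + rebuild
                } finally {
                    rebuilding.set(false); // step 3: pass health checks again
                }
                resp.getWriter().println("rebuilt");
            } else {
                resp.setStatus(rebuilding.get() ? 503 : 200);
            }
        }

        private void rebuildCache() { /* placeholder for the real work */ }
    }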
I see two strategies here: put a worker offline for the period, so the balancer will abandon it; or invert control, so that workers pull tasks from the balancer instead of the balancer pushing tasks to workers. The second strategy is easy to implement with a message queue.
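A toy illustration of the pull model, with an in-process BlockingQueue standing in for a real message queue (RabbitMQ, for example):

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class PullWorkers {
        public static void main(String[] args) {
            BlockingQueue<Runnable> jobs = new LinkedBlockingQueue<>();

            // Workers take a job only when they are free; a worker busy rebuilding
            // its cache simply takes nothing until it is done, and no request is
            // ever pushed at it in the meantime.
            for (int i = 0; i < 4; i++) {
                new Thread(() -> {
                    try {
                        while (true) jobs.take().run(); // blocks until a job is available
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }, "worker-" + i).start();
            }

            for (int j = 0; j < 10; j++) {
                final int n = j;
                jobs.add(() -> System.out.println(
                        Thread.currentThread().getName() + " handled job " + n));
            }
        }
    }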

Round robin load balancing options for a single client

We have a BizTalk server that makes frequent calls to a web service that we also host.
The web service is hosted on 4 servers with a DNS load balancer sitting between them. The theory is that each subsequent call to the service will round robin the servers and balance the load.
However, this does not work, presumably because the result of the DNS lookup is cached for a small amount of time on the client. The result is that we get a flood of requests to each server before it moves on to the next.
Is that presumption correct and what are the alternative options here?
A bit more googling has suggested that I can disable client-side caching for DNS: http://support.microsoft.com/kb/318803
...however, this states the default cache time is 1 day, which is not consistent with my experience.
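For reference, the KB boils down to stopping the client-side DNS cache service; the registry values below are quoted from memory, so verify them against the article:

    rem Stop the DNS Client service (client-side caching stops immediately)
    net stop dnscache

    rem Or cap the cache TTLs via REG_DWORD values under
    rem HKLM\SYSTEM\CurrentControlSet\Services\Dnscache\Parameters
    rem   MaxCacheTtl = 1           (keep positive answers for at most 1 second)
    rem   MaxNegativeCacheTtl = 0   (do not cache negative answers)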
You need to load balance at a lower level, with NLB clustering on Windows or LVS on Linux (or another equivalent piece of software). If you are letting clients of the web service keep an HTTP connection open for longer than a single request/response, you still might not get the granularity of load balancing you are looking for, in which case you might have to reconfigure your application servers.
The solution we finally decided to go with was Application Request Routing, an IIS extension. In tests it has shown to do what we want, and it is far easier for us (as developers) to get up and running compared to a hardware load balancer.
http://www.iis.net/download/ApplicationRequestRouting

Can a load balancer recognize when an ASP.NET worker process is restarting and divert traffic?

Say I have a web farm of six IIS 7 web servers, each running an identical ASP.NET application.
They are behind a hardware load balancer (say F5).
Can the load balancer detect when the ASP.NET application worker process is restarting and divert traffic away from that server until after the worker process has restarted?
What happens during an IIS restart is actually a roll-over process. A new IIS worker process starts that accepts new connections, while the old worker process continues to process existing connections.
This means that if you configure your load balancer to use a balancing algorithm other than simple round-robin, it will tend to "naturally" divert some, but not all, connections away from the machine that is recycling. For example, with a "least connections" algorithm, connections will tend to hang around longer while a server is recycling; with a performance or latency algorithm, the recycling server will suddenly appear slower to the load balancer.
However, unless you write some code or scripts that explicitly detect or recognize a recycle, that is about the best you can do - and in most environments, it is also really all you need.
We use a Cisco CSS 11501 series load balancer. The keepalive "content" we are checking on each server is actually a PHP script.
    service ac11-192.168.1.11
      ip address 192.168.1.11
      keepalive type http
      keepalive method get
      keepalive uri "/LoadBalancer.php"
      keepalive hash "b026324c6904b2a9cb4b88d6d61c81d1"
      active
Because it is a dynamic script, we are able to have it check various aspects of the server and return "1" if all is well, or "0" if not. (The keepalive hash above appears to be the MD5 of that expected "1" response; the CSS compares the hash of the returned content against it.)
In your case, you may be able to implement a similar check script that will not work if the ASP.NET application worker process is restarting (or down).
It depends a lot on the polling interval of the load balancer; a request from the balancer has to fail before it can decide to divert traffic.
IIS 6 and 7 restart application pools every 1740 minutes by default. It does this in an overlapped manner so that services are not impacted.
http://technet.microsoft.com/en-us/library/cc770764%28v=ws.10%29.aspx
http://forums.iis.net/t/1153736.aspx/1
On the other hand, in the case of a fault, a good load balancer (and I'm sure the F5 is one) can detect a fault with one of the web servers and send requests to the remaining, healthy web servers. That is a critical part of a high-availability web infrastructure.