Round robin load balancing options for a single client - http

We have a biztalk server that makes frequent calls to a web service that we also host.
The web service is hosted on 4 servers with a DNS load balancer sitting between them. The theory is that each subsequent call to the service will round robin the servers and balance the load.
However this does not work presumably because the result of the DNS lookup is cached for a small amount of time on the client. The result is that we get a flood of requests to each server before it moves on to the next.
Is that presumption correct and what are the alternative options here?
a bit more googling has suggested that I can disable client side caching for DNS: http://support.microsoft.com/kb/318803
...however this states the default cache time is 1 day which is not consistent with my experiences

You need to load balance at a lower level with NLB Clustering on Windows or LVS on Linux (or other equivalent piece of software). If you are letting clients to the web service keep an HTTP connection open for longer than a single request/response you still might not get the granularity of load balancing you are looking for so you might have to reconfigure your application servers if that is the case.

The solution we finally decided to go with was Application Request Routing which is an IIS extension. In tests this has shown to do what we want and is far easier for us (as developers) to get up and running as compared to a hardware load balancer.
http://www.iis.net/download/ApplicationRequestRouting

Related

Load Stressing Web applications deployed in openstack instances under an autoscaling group

I am working testing the auto-scaling feature of OpenStack. In my test set-up, java servlet applications are deployed in tomcat web servers behind a HAproxy load balancer. I aim at stressing testing the application, to see how it scales and the response time using JMeter as the stress tester. However the I observe that HAProxy (or something else) terminates the connection immediately the onComplete signal is sent by one of the member instances. Consequently, the subsequent responses from the remaining servers are reported as failures (timeouts). I have configured the HAProxy server to use a round-robin algorithm with sticky sessions. See attached JMeter results tree , I am not sure of the next step to undertake. The web applications are asyncronous hence my expectation was that the client (HAProxy in this case) should wait until the last thread is submitted before sending the response.
Is there be some issues with my approach or some set up flaws ?

understanding load balancing in asp.net

I'm writing a website that is going to start using a load balancer and I'm trying to wrap my head around it.
Does IIS just do all the balancing for you?
Do you have a separate web layer that sits on the distributed server that does some work before sending to the sub server, like auth or other work?
It seems like a lot of the articles I keep reading don't really give me a straight answer, or I'm just not understanding them correctly, I'd like to get my head around how true load balancing works from a techincal side, and if anyone has any code to share that would also be nice.
I understand caching is gonna be a problem but that's a different topic, session as well.
IIS do not have a load balancer by default but you can use at least two Microsoft technologies:
Application Request Routing that integrates with IIS, there you should ideally have a separate web layer to do routing work,
Network Load Balancing that is integrated with Microsoft Windows Server, there you can join existing servers into NLB cluster.
Both of those technologies do not require any code per se, it a matter of the infrastructure. But you must of course remember about load balanced environment during development. For example, to make a web sites truly balanced, they should be stateless. Otherwise you will have to provide so called stickiness between client and the server, so the same client will be connecting always to the same server.
To make service stateless, do not persist any state (Session, for example, in case of ASP.NET website) on the server but on external server shared between all servers in the farm. So it is common for example to use external ASP.NET Session server (StateServer or SQLServer modes) for all sites in the cluster.
EDIT:
Just to clarify a few things, a few words about both mentioned technologies:
NLB works on network level (as a networking driver in fact), so without any knowledge about applications used. You create so called clusters consisting of a few machines/servers and expose them as a single IP address. Then another machine can use this IP as any other IP, but connections will be routed to the one of the cluster's machines automatically. A cluster is configured on each server, there is no external, additional routing machine. Depending on the clusters settings, as we have already mentioned, a stickiness can be enabled or disabled (called here a Single or None Affinity). There is also a Load weight parameter, so you can set weighed load distribution, sending more connections to the fastest machine for example. But this parameter is static, it can't be dynamically based on network, CPU or any other usage. In fact NLB does not care if target application is even running, it just route network traffic to the selected machine. But it notices servers went offline, so there will be no routing there. The advantages of NLB is that it is quite lightweight and requires no additional machines.
ARR is much more sophisticated, it is built as a module on top of IIS and is designed to make the routing decisions at application level. Network load balancing is only one of its features as it is a more complete, routing solution. It has "rule-based routing, client and host name affinity, load balancing of HTTP server requests, and distributed disk caching" as Microsoft states. You create there Server Farms with many options like load balance algorithm, load distribution and client stickiness. You can define health tests and routing rules to forward request to other servers. Disadvantage of all of it is that there should be a dedicated machine where ARR is installed, so it takes more resources (and costs).
NLB & ARR - as using a single ARR machine can be the single point of failure, Microsoft states that it is worth consideration to create a NLB cluster of ARR machines.
Does IIS just do all the balancing for you?
Yes,if you configure Application Request Routing:
Do you have a separate web layer that sits on the distributed server
Yes.
that does some work before sending to the sub server, like auth or other work?
No, ARR is pretty 'dumb':
IIS ARR doesn't provide any pre-authentication. If pre-auth is a requirement then you can look at Web Application Proxy (WAP) which is available in Windows Server 2012 R2.
It just acts as a transparent proxy that accepts and forwards requests, while adding some caching when configured.
For authentication you can look at Windows Server 2012's Web Application Proxy.
Some tips, and perhaps items to get yourself fully acquainted with:
ARR as all the above answers above state is a "proxy" that handles the traffic from your users to your servers.
You can handle State as Konrad points out, or you can have ARR do "sticky" sessions (ensure that a client always goes to "this server" - presumably the server that maintains state for that specific client). See the discussion/comments on that answer - it's great.
I haven't worn an IT/server hat for so long and frankly haven't touched clustering hands on (always "handled for me automagically" by some provider), so I did ask this question from our host, "what/how is replication among our cluster/farm" done?" - The question covers things like
I'm only working/setting things on 1 server, does that get replicated across X VMs in our cluster/farm? How long?
What about dynamically generated,code and/or user generated files (file system)? If it's on VM1's file system, and I have 10 load balanced VMs, and the client can hit any one of them at any time, then...?
What about encryption? e.g. if you use DPAPI to encrypt web.config stuff (e.g.db conn strings/sections), what is the impact of that (because it's based on machine key, and well, the obvious thing is now you have machine(s) or VM(s). RSA re-write....?
SSL: ARR can handle this for you as well, and that's great! But as with all power, comes a "con" - if you check/validate in your code, e.g. HttpRequest.IsSecureConnection, well, it'll always be false. Your servers/VMs don't have the cert, ARR does. The encrypted conn is between client and ARR. ARR to your servers/VMs isn't. As the link explains, if you prefer it the other way around (no offloading), you can...but that means all your servers/VMs should then have a cert (and how that pertains to "replication" above starts popping in your head).
Not meant to be comprehensive, just listing things out from memory...Hth

nginx behind load balancers

I've found at that Instagram share their technology implementation with other developers trough their blog. They've some great solutions for the problems they run into. One of those solutions they've is an Elastic Load Balancer on Amazon with 3 nginx instances behind it. What is the task of those nginx servers? And what is the task of the Elastic Load balancers, and what is the relation between them?
Disclaimer: I am no expert on this in any way and am in the process of learning about AWS ecosystem myself.
The ELB (Elastic load balancer) has no functionality on its own except receiving the requests and routing it to the right server. The servers can run Nginx, IIS, Apache, lighthttpd, you name it.
I will give you a real use case.
I had one Nginx server running one WordPress blog. This server was, like I said, powered by Nginx serving static content and "upstreaming" .php requests to phpfpm running on the same server. Everything was going fine until one day. This blog was featured on a tv show. I had a ton of users and the server could not keep up with that much traffic.
My first reaction would be to just use the AMI (Amazon machine image) to spin up a copy of my server on a more powerful instance like m1.heavy. The problem was I knew I would have traffic increasing over time over the next couple of days. Soon I would have to spin an even more powerful machine, which would mean more downtime and trouble.
Instead, I launched an ELB (elastic load balancer) and updated my DNS to point website traffic to the ELB instead of directly to the server. The user doesn’t know server IP or anything, he only sees the ELB, everything else goes on inside amazon’s cloud.
The ELB decides to which server the traffic goes. You can have ELB and only one server on at the time (if your traffic is low at the moment), or hundreds. Servers can be created and added to the server array (server group) at any time, or you can configure auto scaling to spawn new servers and add them to the ELB Server group using amazon command line, all automatically.
Amazon cloud watch (another product and important part of the AWS ecosystem) is always watching your server’s health and decides to which server it will route that user. It also knows when all the servers are becoming too loaded and is the agent that gives the order to spawn another server (using your AMI). When the servers are not under heavy load anymore they are automatically destroyed (or stopped, I don’t recall).
This way I was able to serve all users at all times, and when the load was light, I would have ELB and only one Nginx server. When the load was high I would let it decide how many servers I need (according to server load). Minimal downtime. Of course, you can set limits to how many servers you can afford at the same time and stuff like that so you don’t get billed over what you can pay.
You see, Instagram guys said the following - "we used to run 2 Nginx machines and DNS Round-Robin between them". This is inefficient IMO compared to ELB. DNS Round Robin is DNS routing each request to a different server. So first goes to server one, second goes to server two and on and on.
ELB actually watches the servers' HEALTH (CPU usage, network usage) and decides which server traffic goes based on that. Do you see the difference?
And they say: "The downside of this approach is the time it takes for DNS to update in case one of the machines needs to get decommissioned."
DNS Round robin is a form of a load balancer. But if one server goes kaput and you need to update DNS to remove this server from the server group, you will have downtime (DNS takes time to update to the whole world). Some users will get routed to this bad server. With ELB this is automatic - if the server is in bad health it does not receive any more traffic - unless of course the whole group of servers is in bad health and you do not have any kind of auto-scaling setup.
And now the guys at Instagram: "Recently, we moved to using Amazon’s Elastic Load Balancer, with 3 NGINX instances behind it that can be swapped in and out (and are automatically taken out of rotation if they fail a health check).".
The scenario I illustrated is fictional. It is actually more complex than that but nothing that cannot be solved. For instance, if users upload pictures to your application, how can you keep consistency between all the machines on the server group? You would need to store the images on an external service like Amazon s3. On another post on Instagram engineering – “The photos themselves go straight to Amazon S3, which currently stores several terabytes of photo data for us.”. If they have 3 Nginx servers on the load balancer and all servers serve HTML pages on which the links for images point to S3, you will have no problem. If the image is stored locally on the instance – no way to do it.
All servers on the ELB would also need an external database. For that amazon has RDS – All machines can point to the same database and data consistency would be guaranteed.
On the image above, you can see an RDS "Read replica" - that is RDS way of load balancing. I don't know much about that at this time, sorry.
Try and read this: http://awsadvent.tumblr.com/post/38043683444/using-elb-and-auto-scaling
Can you please point the blog entry out?
Load balancers balance load. They monitor the Web servers health (response time etc) and distribute the load between the Web servers. On more complex implementations it is possible to have new servers spawn automatically if there is a traffic spike. Of course you need to make sure there is a consistency between the servers. THEY CAN share the same databases for instance.
So I believe the load balancer gets hit and decides to which server it will route the traffic according to server health.
.
Nginx is a Web server that is extremely good at serving a lot of static content for simultaneous users.
Requests for dynamic pages can be offloaded to a different server using cgi. Or the same servers that run nginx can also run phpfpm.
.
A lot of possibilities. I am on my cell phone right now. tomorrow I can write a little more.
Best regards.
I am aware that I am late to the party, but I think the use of NGINX instances behind ELB in Istagram blogpost is to provide high available load balancer as described here.
NGINX instances do not seem to be used as web servers in the blogpost.
For that role they mention:
Next up comes the application servers that handle our requests. We run Djangoon Amazon High-CPU Extra-Large machines
So ELB is used just as a replacement for their older solution with DNS Round-Robin between NGINX instances that was not providing high availability.

Redirecting http traffic to another server temporarily

Assume you have one box (dedicated server) that's on 24 7 and several other boxes that are user machines that have unused bandwidth. Assume you want to host several web pages. How can the dedicated server redirect http traffic to the user machines. It is desirable that the address field in the web browser still displays the right address, and not an ip. Ie. I don't want to redirect to another web page, I want to tell the web browser that it should request the same web page from a different server. I have been browsing through the 3xx codes, and I don't think they are made for anything like this.
It should work some what along these lines:
1. Dedicated server is online all the time.
2. User machine starts and tells the dedicated server that it's online.
(several other user machines can do similarly)
3. Web browser looks up domain name and finds out that it points to dedicated server.
4. Web browser requests page.
5. Dedicated server tells web browser to repeat request to user machine
Is it possible to use some kind of redirect, and preferably tell the browser to keep sending further requests to user machine. The user machine can close down at almost any point of time, but it is assumed that the user machine will wait for ongoing transactions to finish, no closing the server program in the middle of a get or something.
What you want is called a Proxy server or load balancer that would sit in front of your web server.
The web browser would always talk to the load balancer, and the load balancer would forward the request to one of several back-end servers. No redirect is needed on the client side, as the client always thinks it is just talking to the load balancer.
ETA:
Looking at your various comments and re-reading the question, I think I misunderstood what you wanted to do. I was thinking that all the machines serving content would be on the same network, but now I see that you are looking for something more like a p2p web server setup.
If that's the case, using DNS and HTTP 30x redirects would probably be what you need. It would probably look something like this:
Your "master" server would serve as an entry point for the app, and would have a well known host name, e.g. "www.myapp.com".
Whenever a new "user" machine came online, it would register itself with the master server and a the master server would create or update a DNS entry for that user machine, e.g. "user123.myapp.com".
If a request came to the master server for a given page, e.g. "www.myapp.com/index.htm", it would do a 302 redirect to one of the user machines based on whatever DNS entry it had created for that machine - e.g. redirect them to "user123.myapp.com/index.htm".
Some problems I see with this approach:
First, Once a user gets redirected to a user machine, if the user machine went offline it would seem like the app was dead. You could avoid this by having all the links on every page specifically point to "www.myapp.com" instead of using relative links, but then every single request has to be routed through the "master server" which would be relatively inefficient.
You could potentially solve this by changing the DNS entry for a user machine when it goes offline to point back to the master server, but that wouldn't work without an extremely short TTL.
Another issue you'll have is tracking sessions. You probably wouldn't be able to use sessions very effectively with this setup without a shared session state server of some sort accessible by all the user machines. Although cookies should still work.
In networking, load balancing is a technique to distribute workload evenly across two or more computers, network links, CPUs, hard drives, or other resources, in order to get optimal resource utilization, maximize throughput, minimize response time, and avoid overload. Using multiple components with load balancing, instead of a single component, may increase reliability through redundancy. The load balancing service is usually provided by a dedicated program or hardware device (such as a multilayer switch or a DNS server).
and more interesting stuff in here
apart from load balancing you will need to set up more or less similar environment on the "users machines"
This sounds like 1 part proxy, 1 part load balancer, and about 100 parts disaster.
If I had to guess, I'd say you're trying to build some type of relatively anonymous torrent... But I may be wrong. If I'm right, HTTP is entirely the wrong protocol for something like this.
You could use dns, off the top of my head, you could setup a hostname for each machine that is going to serve users:
www in A xxx.xxx.xxx.xxx # ip address of machine 1
www in A xxx.xxx.xxx.xxx # ip address of machine 2
www in A xxx.xxx.xxx.xxx # ip address of machine 3
Then as others come online, you could add then to the dns entries:
www in A xxx.xxx.xxx.xxx # ip address of machine 4
Only problem is you'll have to lower the time to live (TTL) entry for each record down to make it smaller (I think the default is 86400 - 1 day)
If a machine does down, you'll have to remove the dns entry, though I do think this is the least intensive way of adding capacity to any website. Jeff Attwood has more info here: is round robin dns good enough?

Can a load balancer recognize when an ASP.NET worker process is restarting and divert traffic?

Say I have a web farm of six IIS 7 web servers, each running an identical ASP.NET application.
They are behind a hardware load balancer (say F5).
Can the load balancer detect when the ASP.NET application worker process is restarting and divert traffic away from that server until after the worker process has restarted?
What happens during an IIS restart is actually a roll-over process. A new IIS worker process starts that accepts new connections, while the old worker process continues to process existing connections.
This means that if you configure your load balancer to use a balancing algorithm other than simple round-robin, that it will tend to "naturally" divert some, but not all, connections away from the machine that's recycling. For example, with a "least connections" algorithm, connections will tend to hang around longer while a server is recycling. Or, with a performance or latency algorithm, the recycling server will suddenly appear slower to the load balancer.
However, unless you write some code or scripts that explicitly detect or recognize a recycle, that's about the best you can do--and in most environments, it's also really all you need.
We use a Cisco CSS 11501 series load balancer. The keepalive "content" we are checking on each server is actually a PHP script.
service ac11-192.168.1.11
ip address 192.168.1.11
keepalive type http
keepalive method get
keepalive uri "/LoadBalancer.php"
keepalive hash "b026324c6904b2a9cb4b88d6d61c81d1"
active
Because it is a dynamic script, we are able to tell it to check various aspects of the server, and return "1" if all is well, or "0" if not all is well.
In your case, you may be able to implement a similar check script that will not work if the ASP.NET application worker process is restarting (or down).
It depends a lot on the polling interval of the load balancer, a request from the balancer has to fail in order before it can decide to divert traffic
IIS 6 and 7 restart application pools every 1740 minutes by default. It does this in an overlapped manner so that services are not impacted.
http://technet.microsoft.com/en-us/library/cc770764%28v=ws.10%29.aspx
http://forums.iis.net/t/1153736.aspx/1
On the other hand, in case of a fault, a good load balancer (I'm sure F5 is) can detect a fault with one of the web servers and send requests to the remaining, healthy web servers. That's a critical part of a high-availability web infrastructure.

Resources