We are working on an application that serves requests 24/7, and no request should be missed. The application is hosted on IIS in a load-balanced environment. The primary server group is P1 and C1, while the backup servers for the primaries are P2 and C2. Requests are sent in a load-balanced manner to P1 and C1; if both P1 and C1 go down, P2 and C2 take over.
We are trying to work out a strategy so that none of our requests are missed. The plan is to bring the primary servers P1 and C1 down so that requests are forwarded to the backups, then bring the primaries back up and work on the backup servers.
My question is how to ensure that the worker process serves the last request it received before we bring the application down on the primary servers. As mentioned earlier, the intention is to ensure that no requests are missed.
If you want to ensure that no requests are missed, you may be waiting a good long while: any requests in progress at the time the recycle occurs are serviced by the original app pool. Depending on your site, some of those requests could be large files over very slow connections, which could literally take hours to finish. At some point you may just want to whack the connections and move on, knowing that the caller will refresh their page, reconnect, or do whatever they need to get back to where they were.
In any case, you don't need multiple servers for this. During application updates, IIS automatically (by default, at least) drains requests from the old application instance while starting up the new one and sending new requests to it. (Note, however, that requests to the new pool will likely stall for several seconds while the application loads and initializes everything for the first time; this is normal and very difficult to avoid.)
If you really do have multiple servers and want to switch request handling from one server to another, you must have some kind of load-balancing system in place: either a software load balancer such as Microsoft Network Load Balancing, or a hardware load balancer such as an F5 BIG-IP. In both cases, you have to somehow tell the load balancer to drain connections from the old application (or server) and route new requests to the new server. This may be done manually through their GUIs, or, for an automated deployment pipeline, will require integration with those systems.
My stack is currently uWSGI, Flask, and nginx. I need to store data between requests: basically, I receive push notifications from another service about events, and I want to store those events in the server's memory so that the client can simply poll the server every n milliseconds to receive the latest updates.
Normally this would not work, for many reasons. One is that a good production deployment requires several uWSGI processes (and maybe even several machines to scale out). But my case is very specific: I'm building a web app for a piece of hardware (think of your home router's configuration page as a good example). This means there is no need to scale. I also do not have a database (at least not a traditional one), and there are normally only 1-2 clients connected simultaneously.
If I specify --processes 1 --threads 4 in uWSGI, is this enough to ensure the data is kept in memory as a single instance? Or do I also need to use --threads 1?
I'm also aware that some web servers clear memory from time to time and restart the hosted app. Does nginx/uWSGI do that, and where can I read about the rules?
I'd also welcome advice on how to design all of this, if there are better ways to handle it. Please note that I am not considering any persistent storage for this; it is not worth the effort and may even be impossible due to hardware limitations.
Just to clarify: when I talk about a single instance of the data, I mean my app.py executing exactly once and keeping the objects defined there alive for as long as the server lives.
If you don't need the data to persist past a server restart, why not just build a cache object into your application that supports push and pop operations?
A simple array of objects should suffice: one Flask route pushes new data onto the array and another pops the data off.
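For example, here is a minimal sketch of that idea (the route paths, the payload shape, and the maxlen cap are assumptions for illustration, not anything from the question). With --processes 1 there is exactly one copy of the events container in memory, and the lock keeps it consistent across the 4 uWSGI threads, so --threads 1 should not be necessary:

```python
from collections import deque
from threading import Lock

from flask import Flask, jsonify, request

app = Flask(__name__)

# Shared in-memory store; with --processes 1 there is exactly one copy of this.
events = deque(maxlen=1000)   # cap it so memory cannot grow without bound
events_lock = Lock()          # guards the deque across uWSGI threads

@app.route("/events", methods=["POST"])
def push_event():
    # Called by the external service's push notification.
    with events_lock:
        events.append(request.get_json(force=True))
    return "", 204

@app.route("/events", methods=["GET"])
def pop_events():
    # Polled by the client every n milliseconds; returns and clears everything queued so far.
    with events_lock:
        pending = list(events)
        events.clear()
    return jsonify({"events": pending})
```

Run it with something like uwsgi --http :8080 --processes 1 --threads 4 --module app:app (adjust to your actual setup). Note that the data only survives as long as the worker does; options such as uWSGI's max-requests, which recycles workers after a given number of requests, would wipe it.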
I've been having this issue for quite a while now, where somehow one of my SignalR clients (or many of them?) gets into a state where it switches to longPolling and then completely floods my application pool. There are a lot of reconnects and start/stop polling requests. It looks something like one of these two screenshots:
This eventually causes my entire web server to lock up since the application pool becomes full.
What might cause these scenarios? Why does it fall back on to longPolling, and why does it end up flooding the server?
I am currently in the design phase: we have several servers that make up our system, and I need to work out the best way to store session information so that each of the servers can access it. The session information is basically a GUID for the session ID and a list of user group names.
The main client applications are web and mobile apps.
I have the following configuration:
A master server where all users log in and the session object is filled with the user group information corresponding to the user. There could be up to 10,000 users at peak login.
Slave servers, which contain the archived content; most users would then, via the UI, talk directly to the slave servers. The slave servers need the session information that is initially determined on the master server.
One option is to move the session data for each login to the slave servers and cache it on the slave; then each slave could work independently and would not need to reference the master server.
The other option is to have a central database containing the session information, but since our databases live on each server (we don't have a separate machine to act as a database server), each slave would need a remote connection string back to the master server's database. No doubt this will slow things down if I have to query the database remotely from the slave server.
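To make the first option concrete: it amounts to a per-slave read-through cache keyed by the session GUID. Here is a rough sketch in Python, purely to illustrate the shape of it (the fetch_from_master callback is a hypothetical stand-in for however a slave would ask the master for a session; it is not part of the system described above):

```python
from threading import Lock

class SlaveSessionCache:
    """Per-slave cache mapping session GUID -> list of user group names."""

    def __init__(self, fetch_from_master):
        self._fetch = fetch_from_master   # hypothetical call back to the master server
        self._sessions = {}
        self._lock = Lock()

    def groups_for(self, session_id):
        with self._lock:
            groups = self._sessions.get(session_id)
        if groups is None:
            # Cache miss: pay the round trip to the master once, then serve locally.
            groups = self._fetch(session_id)
            with self._lock:
                self._sessions[session_id] = groups
        return groups

    def invalidate(self, session_id):
        # Called when the master cleans up or expires a session.
        with self._lock:
            self._sessions.pop(session_id, None)
```

Pushing the session data to the slaves at login, as described above, is the same structure populated eagerly rather than lazily; either way, a data set this small fits comfortably in each slave's memory.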
I then have the situation where I need to clean up sessions, but in general I don't expect more than 25 MB of session data at peak login.
We could have up to 10 slave servers.
What is the best solution?
For IIS 7.0, here is a TechNet article that outlines two approaches (a session state server or SQL Server):
http://technet.microsoft.com/en-us/library/cc754032(v=ws.10).aspx
I question the need to have 10 web servers each running its own separate database. Granted, I don't know anything about the application you're writing, and there may very well be a good reason for it.
Here's how I see it (with my admittedly limited knowledge of your application).
10,000 possible concurrent users hitting 1 authentication server that will then redirect them to one (or more?) of 10 servers could potentially cause bottlenecks. What if a majority go to one of the servers? What if a large number all try to log in at the same time?
Here's a stab at a different architecture:
[LoadBalancer]
-------------------------------------------------------------------------
[WebServer] [WebServer] [WebServer] -------------> [SessionServer]
[LoadBalancer]
-------------------------------------------------------------------------
[AppServer] [AppServer] [AppServer] [AppServer] -------^
-------------------------------------------------------------------------
[DBServer]
[DBServer](backup)
I write that not knowing what class of machines these are; they may not be suitable to be a DB server.
Well, it's early here and I'm only on my second cup of coffee. That may or may not be helpful; I hope it is.
I'm having no luck trying to find out how changing the instance count for an ASP.NET web role affects requests that are currently being processed.
Here's the scenario:
An ASP.NET site is deployed with 6 instances.
Via the console, I reduce the instance count to 4.
Is Azure smart enough not to remove an instance from the pool while it is still processing requests, or does it just kill them mid-request?
I've been through the Azure documentation, Google, and a number of emails to MS tech support, none of which answered this seemingly simple question. I know about the events that get triggered by a shutdown, etc., but that doesn't really help in a web site scenario where a live person is waiting for a response to their request.
You cannot choose which instances to kill off. Primarily this is due to Windows Azure's instance allocation scheme, where your instances are split across different fault domains (meaning different areas of the data center: different racks, etc.). If you were to choose the instances to kill, this could leave your remaining instances in the same fault domain, which would void the SLA.
Having said that: you get an event when your role instance is shutting down (the OnStop() event). If you capture this event, you can do instance cleanup in preparation for VM shutdown. I can't recall whether you're taken out of the load balancer at this point, but you could always force yourself out with a simple PowerShell command (Set-RoleInstanceStatus -Busy). That way your ASP.NET instance stops taking new requests, and you can shut down more gracefully.
EDIT: Sorry, I didn't quite address all of your question. Since you get to capture OnStop(), you might have to implement a mechanism to make sure nothing is still being processed on that instance. Since you're out of the load balancer, and assuming your requests are processed fairly quickly (2-5 seconds), you shouldn't have to wait long to clear out the remaining requests. There's probably a performance counter you can check to see how many active requests are still being handled.
Just to add to David's answer: the OnStop event fires once you are off the load balancer. For web apps, the window between being disconnected from the LB and the instance shutting down is usually enough time to bleed out all in-flight requests. However, for long-running or stateful connections (perhaps to a worker role), there can be an abrupt disconnect in some cases. While the OnStop method removes you from the LB, it does not terminate open connections; it simply prevents you from receiving new ones. For web apps, this is usually enough time to complete the outstanding requests (and you can delay the shutdown in OnStop if you really want to).
How do most people handle updating ASP.NET applications running in a web farm? The problem I'm having is that because the app is in use and request affinity is not sticky, when we push an update users run into errors: as requests are processed, a request might be handled by the wrong version of the application. How do you do this? Take the entire application offline and let the push complete, or update live and let the chips fall where they may? Ideally we'd like to minimize downtime if at all possible.
Any thoughts/suggestions/pointers would be appreciated.
This is what we do:
Drain the active sessions off a particular server in the farm; no new traffic will be routed to that server during that time.
Apply the patch to the drained server
Drain the sessions off the remaining servers
Allow traffic back to the original server
As the other servers are drained, apply the patch and let them come back to life.
Make half the servers inaccessible, update them, flip the inaccessible servers to accessible and vice versa, update the other half, and then put that half back up.
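To make the drain, patch, and restore cycle described above concrete, here is a hypothetical orchestration sketch of a one-server-at-a-time variant. None of these helper functions come from any particular product; lb_disable, lb_enable, active_sessions, and apply_patch are placeholders for whatever your load balancer and deployment tooling actually expose:

```python
import time

# Hypothetical hooks: stand-ins for whatever your load balancer and
# deployment tooling actually expose (NLB, F5, deployment scripts, etc.).
def lb_disable(server):
    print(f"stop routing new traffic to {server}")

def lb_enable(server):
    print(f"allow traffic to {server} again")

def active_sessions(server):
    return 0  # placeholder: report how many sessions are still in flight

def apply_patch(server):
    print(f"deploy the new build to {server} and recycle the app")

def rolling_update(servers, poll_seconds=10):
    """Drain, patch, and re-enable one server at a time so the farm stays up."""
    for server in servers:
        lb_disable(server)
        # Drain: wait for the in-flight sessions on this server to finish.
        while active_sessions(server) > 0:
            time.sleep(poll_seconds)
        apply_patch(server)
        lb_enable(server)

rolling_update(["web01", "web02"])
```

The half-at-a-time approach is the same cycle applied to two groups of servers instead of individual servers.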