How do most people handle updating ASP.NET applications running in a web farm? My problem is that the app is in use and request affinity is not sticky, so when we push an update, users run into errors: a request made partway through a process might be handled by the wrong version of the application. How do you do this? Do you take the entire application offline and let the push complete, or do you update live and let the chips fall where they may? Ideally we'd like to minimize downtime if at all possible.
Any thoughts, suggestions, or pointers would be appreciated.
This is what we do:
Drain the active sessions off a particular server in the farm; no new traffic will be routed to that server during this time.
Apply the patch to the drained server
Drain the sessions off the remaining servers
Allow traffic back to the original server
As the other servers are drained, apply the patch and let them come back to life.
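The steps above can be sketched as a small simulation. This is only an illustration of the ordering, not a deployment tool: the `Server` class, the `rolling_update` function, and the version strings are hypothetical names, and the "drain remaining servers" and "allow traffic back" steps are assumed to happen as a single switch so that the farm is never empty.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    version: str
    in_rotation: bool = True  # True while the load balancer sends it traffic


def live_versions(servers):
    """Versions currently reachable by users through the load balancer."""
    return {s.version for s in servers if s.in_rotation}


def rolling_update(servers, new_version):
    """Patch one drained server first, then switch traffic over to it."""
    snapshots = []
    first, rest = servers[0], servers[1:]

    # 1. Drain one server; the others keep serving the old version.
    first.in_rotation = False
    snapshots.append(live_versions(servers))

    # 2. Patch the drained server.
    first.version = new_version
    snapshots.append(live_versions(servers))

    # 3. Drain the remaining servers and let traffic back to the patched one
    #    (assumed here to be one switch at the load balancer).
    for s in rest:
        s.in_rotation = False
    first.in_rotation = True
    snapshots.append(live_versions(servers))

    # 4. Patch each remaining server and return it to rotation.
    for s in rest:
        s.version = new_version
        s.in_rotation = True
        snapshots.append(live_versions(servers))

    return snapshots


if __name__ == "__main__":
    farm = [Server(f"web{i}", "v1") for i in range(1, 5)]
    for step in rolling_update(farm, "v2"):
        print(step)
```

Every snapshot contains exactly one version, which is the point of the ordering: users never see old and new code mixed within the same set of live servers.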
Make half the servers inaccessible, update those, flip all the inaccessible servers to accessible and vice versa, update the other half, and put the other half back up.
My stack is uWSGI, flask and nginx currently. I have a need to store data between requests (basically I receive push notifications from another service about events to the server and I want to store those events in the server memory, so client can just query server every n milliseconds, to receive latest update).
Normally this would not work, because of many reasons. One is a good deployment requires you to have several processes in uwsgi in production (and even maybe several machines to scale this out). But my case is very specific: I'm building a web app for a piece of hardware (You can think of your home router configuration page as a good example). This means no need to scale. I also do not have a database (at least not a traditional one) and probably normally 1-2 clients simultaneously.
If I specify --processes 1 --threads 4 in uwsgi, is this enough to ensure the data is kept in memory as a single instance? Or do I also need to use --threads 1?
I'm also aware that some web servers clear memory randomly from time to time and restart the hosted app. Does nginx/uwsgi do that and where can I read about the rules?
I'd also welcome advice on how to design all of this, if there are better ways to handle it. Please note that I am not considering any persistent storage for this: it is not worth the effort and may even be impossible due to hardware limitations.
Just to clarify: When I'm talking about one instance of data, I'm thinking of my app.py executing exactly one time and keeping the instances defined there for as long as the server lives.
If you don't need data to persist past a server restart, why not just build a cache object into your application that can do push and pop operations?
A simple array of objects should suffice, one flask route pushes new data to the array and another can pop the data off the array.
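A minimal sketch of such a cache object, using only the standard library. The `EventStore` class, its method names, and the `max_events` limit are hypothetical and chosen for illustration. The lock is there because with `--threads 4` several threads in the one process may push and pop at the same time.

```python
import threading
from collections import deque


class EventStore:
    """Process-local buffer shared by all threads of a single worker process."""

    def __init__(self, max_events=1000):
        # A deque with maxlen silently drops the oldest events once full,
        # so memory use stays bounded if no client is polling.
        self._events = deque(maxlen=max_events)
        self._lock = threading.Lock()

    def push(self, event):
        """Store one incoming event."""
        with self._lock:
            self._events.append(event)

    def pop_all(self):
        """Return all pending events in arrival order and clear the buffer."""
        with self._lock:
            events = list(self._events)
            self._events.clear()
        return events


# Module-level instance: created once per process when the module is imported.
store = EventStore()
```

One Flask route (the one receiving push notifications) would call `store.push(...)` and the polling route would return `store.pop_all()`. With a single process, all threads see the same `store`; with more than one process, each process would get its own separate copy.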
Is it recommended to launch a docker instance per request?
I have either lighttpd or Nginx running on my web server as a reverse proxy. I support a number of subdomains with very low usage. When a request for the subdomain arrives I want to start the docker instance. Preferable I'd like to launch them dynamically so that if more than one user arrives that I would launch one per user... and/or a shared instance (determined by configuration)
Originally I said this should work well for low traffic sites, but upon further thought, no, this is a bad idea.
Each time you launch a Docker container, it adds a read-write layer to the image. Even if there is very little data written, the layer exists, and each request will generate one. When a single user visits a website, rendering the page will generate 10's to 1000's of requests, for CSS, for javascript, for each image, for fonts, for AJAX, and each of these would create those read-write layers.
Right now there is no automatic cleanup of the read-write layers -- they persist even after the Docker container has exited. By default, nothing is lost.
So, even for a single low traffic site, you would find your disk use growing steadily over time. You could add your own automated cleanup.
Then there is the second problem: anything uploaded to the website would not be available to any other requests unless it was written to some out-of-container shared storage. That's pretty easy to do with S3 or a separate and persistent database service, but it does start showing the weakness in the "one new Docker container per request" approach. If you're going to have some persistent services, why not make the Docker containers more persistent and run them longer?
I'm having no luck trying to find out how changing the instance count for an ASP.NET web role affects requests currently being processed.
Here's the scenario:
An ASP.Net site is deployed with 6 instances
Via the console I reduce the instance count to 4
Is Azure smart enough not to remove instances from the pool while they are processing requests, or does it just kill them mid-request?
I've been through the Azure documentation, Google, and a number of emails to MS tech support, none of which were able to answer this seemingly simple question. I know about the events that get triggered by a shutdown etc., but that doesn't really help in a web site scenario with a live person waiting for a response to their request.
You cannot choose which instances to kill off. Primarily this is due to Windows Azure's instance allocation scheme, where your instances are split into different fault domains (meaning different areas of the data center - different rack, etc.). If you were to choose the instances to kill, this could leave you in a state where your remaining instances are in the same fault domain, which would void the SLA.
Having said that: You get an event when your role instance is shutting down (the OnStop() event). If you capture this event, you can do instance cleanup in preparation for VM shutdown. I can't recall if you're taken out of the load balancer at this point, but you could always force yourself out with a simple PowerShell command (Set-RoleInstanceStatus -Busy). This way your asp.net instance stops taking requests, and you can more easily shut down in a graceful manner.
EDIT: Sorry - didn't quite address all of your question. Since you get to capture OnStop(), you might have to implement a mechanism to make sure nothing's being processed in that instance. Since you're out of the load balancer, and assuming your requests are processed fairly quickly (2-5 seconds), you shouldn't have to wait long to clear out remaining requests. There's probably a performance counter to check, to see how many active requests are being handled.
Just to add to David's answer: the OnStop event happens when you are off the load balancer. For web apps, it is usually sufficient time to bleed out all requests after you are disconnected from the LB until the instance is shutdown. However, for long running or stateful connections (perhaps to a worker role), there would be an abrupt disconnect in some cases. While the OnStop method removes you from the LB, it does not terminate open connections. It simply prevents you from getting new connections. For web apps, this is usually enough time to complete the request (and you can delay the shutdown if necessary in the OnStop as well if you really want to).
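The "bleed out remaining requests before shutdown" idea can be sketched in a language-neutral way. The following is an illustration only and does not use any Azure API: `InFlightTracker` and `wait_until_idle` are hypothetical names. In a real role, the equivalent wait would go in the OnStop handler, after the instance has left the load balancer.

```python
import threading


class InFlightTracker:
    """Counts requests currently being processed in this instance."""

    def __init__(self):
        self._count = 0
        self._cond = threading.Condition()

    def __enter__(self):
        # Called when a request starts.
        with self._cond:
            self._count += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        # Called when a request finishes, whether or not it raised.
        with self._cond:
            self._count -= 1
            if self._count == 0:
                self._cond.notify_all()
        return False  # do not swallow exceptions

    def wait_until_idle(self, timeout):
        """Block until no requests are in flight, or until timeout seconds pass.

        Returns True if the instance became idle, False on timeout.
        """
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout=timeout)
```

Each request handler would wrap its work in `with tracker:`, and the shutdown hook would call `tracker.wait_until_idle(...)` with a timeout shorter than the time the platform allows for shutdown.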
Say a website on my localhost takes about 3 seconds to do each request. This is fine, and as expected (as it is doing some fancy networking behind the scenes).
However, if I open the same URL in several tabs (in Firefox) and then reload them all at the same time, the pages appear to load sequentially rather than all at once. What is this all about?
I have tried it on IIS on Windows Server 2008 and on Windows 7.
It really depends on the web browser you are using and how tab support in it has been programmed.
It is probably using a single thread to load each tab in turn, which would explain your observation.
Edit:
As others have mentioned, it is also a very real possibility that the web server running on your localhost is single-threaded.
If I remember correctly, the original HTTP/1.1 specification (RFC 2616) recommended that clients keep no more than 2 concurrent connections to the same host. This is the reason high-load websites use CDNs (content delivery networks).
network.http.max-connections 60
network.http.max-connections-per-server 30
The above two values determine how many connections Firefox makes to a server. If the threshold is exceeded, it will pipeline the requests.
Each browser implements it in its own way. The requests are made in such a way to maximize the performance. Moreover, it also depends on the server (localhost which is slower).
Your local web server configuration might have only one thread, so each request will wait for the previous one to finish.
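The effect of a single worker thread can be reproduced with a small simulation. This is a sketch under stated assumptions, not a model of IIS: `handle_request` and `time_requests` are hypothetical names, and `time.sleep` stands in for the slow work each request does.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(delay):
    """Stand-in for a slow request handler."""
    time.sleep(delay)
    return "done"


def time_requests(workers, n_requests=4, delay=0.05):
    """Run n_requests through a pool of `workers` threads; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(handle_request, [delay] * n_requests))
    assert all(r == "done" for r in results)
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"1 worker : {time_requests(1):.2f}s")  # requests run one after another
    print(f"4 workers: {time_requests(4):.2f}s")  # requests overlap
```

With one worker the total time is roughly the sum of the individual delays, which matches the sequential loading described in the question. With four workers the requests overlap and the total is close to a single delay.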
I've got an asp.net web page that is making 7 async requests to a WCF service on another server. Both boxes are clean without anything else installed.
I've also increased maxconnections in web.config to 20.
I run a single call through the system and the page returns in 800ms. The long and short of it is that I think the thread pool is being overwhelmed: once placed under load, I cannot get more than 8 requests per second, even though both quad-core boxes are running at 20% CPU load and the SQL Server it's connected to is returning the queries in under 10ms per call.
I've changed the service behaviour to ConcurrencyMode.Multiple but that doesn't seem to help.
Any ideas, anyone?
There are many different factors that could be in play here. Taking a stab at the remark that changing the concurrency mode on the service had zero effect (big IF here), it's possible the 'bottleneck' is upstream from the service: either at the web server or at the client load generator.
You've got several areas to review for tuning: client, web server, and WCF service server, assuming there are no network devices in the middle. Pick an end and work towards the other end. Since I'm already assuming that it's not the service, I'd start at the client and work my way towards the WCF service.
Client
What machine is driving the load against the web server? A laptop? A desktop? A dedicated test agent, or a shared one? The client acting as the load generator for purposes of this test is also susceptible to maxConnections limitation as this is a client setting.
What is the CPU utilization of the client generating load? Could it be that the test driver is just unable to generate enough load to push these boxes? Can you add additional test clients to your test?
Web Server
What does the system.web/processModel element look like in machine.config on the ASP.NET web server? Try setting autoConfig="true". This will allow the configuration to auto-size based on the 'size' of the machine it's running on.
WCF Service
Review WCF service for any throttling defaults that might be in play and tweak appropriately. See ServiceThrottlingBehavior on MSDN.
Let us know any changes in behavior you might observe (if any) if you make any changes!
The real answer here that everyone missed is that you're using an ASP.NET web page. That means your client is some form of web browser. Modern web browsers have a limit of 2 concurrent async requests at any time. This means that 5 of your requests were queued up and waiting for the first two to finish. Once those first two finished, it served the next two, then the next two, then the last one.
All of these round trips and handshakes simply take time. I'm guessing that your round-trip time is around 200ms; unfortunately, you have to do it 4 times.
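The arithmetic behind that estimate, taking this answer's premises at face value (a limit of 2 concurrent requests and a guessed 200ms round trip), can be written out as below. The function name `total_wait_ms` is hypothetical.

```python
import math


def total_wait_ms(n_requests, max_concurrent, round_trip_ms):
    """Time to finish n_requests when only max_concurrent can run at once."""
    rounds = math.ceil(n_requests / max_concurrent)  # batches of requests
    return rounds * round_trip_ms


# 7 requests, 2 at a time -> 4 rounds (2 + 2 + 2 + 1) of 200 ms each
print(total_wait_ms(7, 2, 200))  # 800
```

Under those assumptions the result matches the 800ms page time reported in the question.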
I also really dislike the "max 2" browser limitation on making webservice calls.
Is this service hosted in IIS, WAS or a Windows Service?
You should try setting Windows to run services at a higher priority. Your WCF service is probably creating the threads it needs, but they may be running at a low priority.
Hope that helps.