Any known issues regarding empty *.aspx pages with IIS7? - asp.net

I have an IIS7 web application which primarily serves web service requests. As part of our solution we have two web servers and a load balancer, and the load balancer periodically requests a page from each of its load-balanced boxes. The page the load balancer loads is named "Health.aspx", it does not have a code-behind, and the entire contents of the file Health.aspx are:
OK
However, we are observing occasional 400 errors from the load balancer when requesting this page throughout the day, which cause the load balancer to eject the machine from its rotation for a period of time.
While there are potentially a number of things which could be causing a problem here, I wanted to start at the web boxes themselves and determine whether a nearly empty *.aspx page with no code-behind could cause periodic problems.

You should create Health.html and just use that instead. That will also give you some idea of whether it's ASP.NET or IIS giving you the 400 errors.

"400 errors" covers a wide range of potential issues, from a simple not found 404 to bad request 401, and forbidden 403.
Some of these could be caused by bad network cables, nics, lousy load balancer, overloaded server(s), etc.
Incidentally, no, a web page with no code behind is not going to throw that. Also, I'd probably do a little something more in the health.aspx to verify that the server is actually functioning. Like run a calulation or make a simple database request. After all, IIS could simply be caching the file.
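For illustration only, here is a minimal sketch of what a slightly more meaningful health check could look like if a code-behind were added to Health.aspx; the class name, connection string name, and query are all hypothetical:

    // Health.aspx.cs - hypothetical code-behind for the health page
    // (the matching .aspx would need a @Page directive pointing at this class).
    // A cheap database round trip means a 200 "OK" proves the app pool,
    // the ASP.NET pipeline, and DB connectivity are all working.
    using System;
    using System.Configuration;
    using System.Data.SqlClient;

    public partial class Health : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            Response.Clear();
            Response.ContentType = "text/plain";
            try
            {
                // "Default" is a placeholder connection string name.
                var cs = ConfigurationManager.ConnectionStrings["Default"].ConnectionString;
                using (var conn = new SqlConnection(cs))
                using (var cmd = new SqlCommand("SELECT 1", conn))
                {
                    conn.Open();
                    cmd.ExecuteScalar();
                }
                Response.StatusCode = 200;
                Response.Write("OK");
            }
            catch (Exception)
            {
                // Anything non-200 is enough for most load balancers to pull the box.
                Response.StatusCode = 500;
                Response.Write("FAIL");
            }
        }
    }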

It turns out that the problem was just the sheer number of requests combined with slow response times (or really slow request times; our client connections are very slow). Requests for Health.aspx were getting caught up behind slow client connections, and the default MaxConcurrentRequestsPerCPU of 12 was artificially limiting our actual number of requests/second. Increasing this number to 100 - based on carefully testing our hardware against load for our particular application - solved the problem.
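For anyone hitting the same wall: on .NET 3.5 SP1 this limit is usually raised through a MaxConcurrentRequestsPerCPU registry value, while from .NET 4.0 onward it can be set in the framework-level aspnet.config file (not the site's web.config). The snippet below is only a hedged sketch of the .NET 4.0 approach, with 100 used purely because that is the value this answer settled on:

    <!-- aspnet.config (framework-wide file in the .NET Framework install folder);
         the value is illustrative, tune it against your own hardware and load -->
    <configuration>
      <system.web>
        <applicationPool maxConcurrentRequestsPerCPU="100" />
      </system.web>
    </configuration>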

Related

What does Failure of Web server exactly mean?

What really happens when a web server fails? I know the usual outcome: the website is down and the page I have requested is inaccessible.
The cause might be external, like too many incoming requests, but what really happens in the inner workings of the server that leads to such a condition?
It might be congestion due to lack of bandwidth, or a problem with the VM/server that hosts the info/web page.

Why is the initial connection time for an HTTP request so long?

My web app sits behind Nginx. Occasionally, loading my web page takes more than 10 seconds. I used Chrome DevTools to track the timing, and it looks like this:
The weird thing is, when the page loads slowly, the initial connection time is always 11 seconds long. And after this slow request, subsequent loads of the same page become very fast.
What possible problem could cause this?
P.S. If this is caused by a resource limitation on my server, can I see some errors/warnings in some system log?
The initial connection refers to the time taken to perform the initial TCP handshake and negotiate SSL/TLS (where applicable). The slowness could be caused by congestion, where the server has hit a limit and can't respond to new connections while existing ones are pending. You could look into some performance enhancements in your Nginx configuration.
Use the dig command to check your domain name resolution.
If it returns multiple records in the answer section, check that those IPs are valid.
You should eliminate references to files that don't exist. I had the same issue at a customer site where a 404 on an image caused the problem, as it delayed the loading of other files.

What HTTP status codes to use to represent success/failure of IsAlive for web requests?

I have a web application, for which I have an isalive page that gets hit on a regular basis (monitoring) to make sure my application is working (or is alive as the name suggests).
In general, this isalive functionality/page is very light in terms of processing. It checks things that the application depends on, like SQL connectivity, file-share connectivity, etc. (on top of obviously verifying that the app itself on the web server can serve the current isalive request).
Currently we use HTTP status code 200 for success and HTTP status code 500 for failure (we set the HTTP status code on the response). But somehow this feels inaccurate to me, since HTTP status code 500 is Internal Server Error and may not accurately represent the failure condition: the failure could be raised not only by a web-level problem but also by a dependency failure, like the SQL or file-share connectivity checks.
Since this seems like a fairly standard use case for many, what is the industry standard for this? Am I doing it right? Suggestions or how you have been handling this would be much appreciated.
200 and 500 are perfectly accurate for your use case.
If your isalive page is used by an automated monitoring service such as Pingdom or wdt.io (as opposed to a human), you could change it to return a 204 (No Content), because that would be more efficient.
From the client's perspective, the web server, database, and filesystem are all part of the server, so if one of them fails, a 500 response is correct. If you want to distinguish web server errors from failures in upstream services, you could return 502 when one of those upstream services fails. Personally, I wouldn't bother, and would try to keep the implementation of isalive as simple as possible.
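As a rough sketch of the advice above (the connection string, share path, and handler name are all made up), an isalive endpoint in ASP.NET might look something like this:

    // IsAlive.ashx.cs - hypothetical generic handler for a monitoring probe.
    // 204 = the app and its dependencies are reachable; 500 = something is not.
    using System;
    using System.Data.SqlClient;
    using System.IO;
    using System.Web;

    public class IsAliveHandler : IHttpHandler
    {
        public bool IsReusable { get { return true; } }

        public void ProcessRequest(HttpContext context)
        {
            try
            {
                // Dependency checks: a cheap SQL round trip and a file-share probe.
                // Server, database, and share names are placeholders.
                using (var conn = new SqlConnection("Server=db01;Database=AppDb;Integrated Security=true"))
                {
                    conn.Open();
                }
                if (!Directory.Exists(@"\\fileserver01\appshare"))
                {
                    throw new IOException("File share unreachable");
                }

                // 204 keeps the success response as small as possible for automated monitors.
                context.Response.StatusCode = 204;
            }
            catch (Exception)
            {
                // The monitor only needs to know "unhealthy"; a plain 500 conveys that.
                context.Response.StatusCode = 500;
            }
        }
    }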

ASP.Net MVC Delayed requests arriving long after client browser closed

I think I know what is happening here, but would appreciate confirmation and/or reading material that can turn that "think" into "know". The actual questions are at the end of the post in the TL;DR section:
Scenario:
I am in the middle of testing my MVC application for a case where one of the internal components is stalling (timeouts on connections to our database).
On one of my web pages there is a jQuery DataTable which queries for an update via AJAX every half a second; my current task is to display the correct error if that data request times out. So to test, I made a stored procedure that asks the DB server to wait 3 seconds before responding, which is longer than the configured timeout settings, so this guarantees a timeout exception for me to trap.
I am testing in the Chrome browser with one client. The application is being debugged in VS2013 with IIS Express.
Problem:
I did not expect the following symptoms to show up when my purposeful slowdown is activated:
1) After launching the page with the rigged datatable, the application slowed down in handling all requests from the client browser - there are 3 other components that send AJAX update requests in parallel to the one I purposefully broke, and this same slowdown also applied to any actions I made in the web application that would generate a request (like navigating to other pages). The browser's debugger showed the requests were being sent on time, but the corresponding breakpoints on the server side were getting hit much later (delays of over 10 seconds to even several minutes).
2) My server kept processing requests even after I closed the tab with the application. I closed the browser, I made sure that the chrome.exe process was terminated, but breakpoints on various controller actions were still getting hit for 20 minutes afterward - mostly on the actions that were "triggered" by automatically looping AJAX requests from several pages I was trying to visit during my tests. Breakpoints were also hit on the main pages I was trying to navigate to. On a second test I used RawCap to monitor the loopback interface to make sure that there was nothing still running in the background actually making requests.
Theory I would like confirmed or denied with an alternate explanation:
So the above scenario was making looped requests at a frequency the server couldn't handle - the client datatable loop was sending them every 0.5 seconds, and each one would take at least 3 seconds to generate the timeout. And obviously somewhere in IIS Express there has to be a limit on how many concurrent requests it is able to handle...
What surprised me was that I sort of assumed that if that limit (which I also assumed to exist) was reached, requests would be denied - instead it appears they were queued for an absolutely useless amount of time to be processed later - I mean, under what scenario would it be useful to process a queued web request half an hour later?
So my questions so far are these:
TL;DR questions:
Does IIS Express (that comes with Visual Studio 2013) have a concurrent connection limit?
If yes :
{
Is this limit configurable somewhere, and if yes, where?
How does IIS Express handle situations where that limit is reached - is that handling also configurable somewhere? (I mean queueing vs. an immediate "server is busy" error)
}
If no:
{
How does the server handle scenarios when requests are coming faster than they can be processed and can that handling be configured anywhere?
}
Here - http://www.iis.net/learn/install/installing-iis-7/iis-features-and-vista-editions
I found that IIS7, at least, allows an unlimited number of simultaneous connections, but how does that actually work if the server is just not fast enough to process all requests? Can a limit be configured anywhere, as well as the handling of that limit being reached?
Would appreciate any links to online reading material on the above.
First, here's a brief web server 101. Production-class web servers are multithreaded, and roughly one thread = one request. You'll typically see some sort of setting for your web server called its "max requests", and this, again, roughly corresponds to how many threads it can spawn. Each thread has overhead in terms of CPU and RAM, so there's a very real upward limit to how many a web server can spawn given the resources the machine it's running on has.
When a web server reaches this limit, it does not start denying requests, but rather queues requests to be handled once threads free up. For example, if a web server has a max requests of 1000 (typical) and it suddenly gets bombarded with 1500 requests, the first 1000 will be handled immediately and the remaining 500 will be queued until some of the initial requests have been responded to, freeing up threads and allowing some of the queued requests to be processed.
A related topic area here is async, which in the context of a web application, allows threads to be returned to the "pool" when they're in a wait-state. For example, if you were talking to an API, there's a period of waiting, usually due to network latency, between sending the request and getting a response from the API. If you handled this asynchronously, then during that period, the thread could be returned to the pool to handle other requests (like those 500 queued up requests from the previous example). When the API finally responded, a thread would be returned to finish processing the request. Async allows the server to handle resources more efficiently by using threads that otherwise would be idle to handle new requests.
Then, there's the concept of client-server. In protocols like HTTP, the client makes a request and the server responds to that request. However, there's no persistent connection between the two. (This is somewhat untrue as of HTTP 1.1: connections between the client and server are sometimes kept alive, but only to make future requests/responses faster by avoiding the cost of initiating a new connection. Even then, there's no real persistent communication about the status of the client or server.) The main point here is that if a client, like a web browser, sends a request to the server, and then the client is closed (such as closing the tab in the browser), that fact is not communicated to the server. All the server knows is that it received a request and must respond, and respond it will, even though there's technically nothing on the other end to receive it anymore. In other words, just because the browser tab has been closed doesn't mean that the server will just stop processing the request and move on.
Then there are timeouts. Both clients and servers will have some timeout value they'll abide by. The distributed nature of the Internet (enabled by protocols like TCP/IP and HTTP) means that nodes in the network are assumed to be transient. There's no persistent connection (aside from the same note above), and network interruptions could occur between the client making a request and the server responding to it. If the client/server did not plan for this, they could simply sit there forever waiting. However, these timeouts can vary widely. A server will usually time out a request within 30 seconds (though it could potentially be set indefinitely). Clients like web browsers tend to be a bit more forgiving, having timeouts of 2 minutes or longer in some cases. When the server hits its timeout, the request will be aborted; depending on why the timeout occurred, the client may receive various error responses. When the client times out, however, there's usually no notification to the server. That means that if the server's timeout is higher than the client's, the server will continue trying to respond, even though the client has already moved on. Closing a browser tab could be considered an immediate client timeout, but again, the server is none the wiser and keeps trying to do its job.
So, what all this boils down to is this. First, when doing long-polling (which is what you're doing by submitting an AJAX request repeatedly on some interval), you need to build in a cancellation scheme. For example, if the last 5 requests have timed out, you should stop polling, at least for some period of time. Even better would be to have the response of one AJAX request initiate the next: instead of using something like setInterval, you could use setTimeout and have the AJAX callback initiate it. That way, the requests only continue if the chain is unbroken; if one AJAX request fails, the polling stops immediately. However, in that scenario, you may need some fallback to re-initiate the request chain after some period of time. This prevents endlessly bombarding your already failing server with new requests. Also, there should always be some upper limit on how long polling should continue. If the user leaves the tab open for days without using it, should you really keep polling the server that whole time?
On the server-side, you can use async with cancellation tokens. This does two things: 1) it gives your server a little more breathing room to handle more requests and 2) it provides a way to unwind the request if some portion of it should time out. More information about that can be found at: http://www.asp.net/mvc/overview/performance/using-asynchronous-methods-in-aspnet-mvc-4#CancelToken
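A minimal sketch of that combination in ASP.NET MVC, assuming an AJAX-polled action backed by the kind of slow stored procedure described in the question (the controller, action, and procedure names are invented):

    // Hypothetical MVC 5 controller action using async + a CancellationToken.
    // AsyncTimeout caps how long the framework lets the action run; when it
    // trips, the token is signalled and the pending database call unwinds
    // instead of holding a request slot for the full stored-procedure delay.
    using System.Data;
    using System.Data.SqlClient;
    using System.Threading;
    using System.Threading.Tasks;
    using System.Web.Mvc;

    public class UpdatesController : Controller
    {
        [AsyncTimeout(2000)] // milliseconds; shorter than the rigged 3-second procedure
        public async Task<ActionResult> Latest(CancellationToken cancellationToken)
        {
            // Connection string and stored procedure are placeholders.
            using (var conn = new SqlConnection("Server=db01;Database=AppDb;Integrated Security=true"))
            using (var cmd = new SqlCommand("dbo.GetLatestUpdate", conn))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                await conn.OpenAsync(cancellationToken);
                var result = await cmd.ExecuteScalarAsync(cancellationToken);
                return Json(new { value = result }, JsonRequestBehavior.AllowGet);
            }
        }
    }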

Different server & browser HTTP status codes

I have a small Python web application running on nginx with unicorn. The web application refreshes its page automatically every minute.
Every day I see that, around the same hour, the browser reports a 504 Gateway Time-out error and the application obviously stops refreshing.
I checked it with both Chrome and Firefox on two different client machines and two different server machines, and found out it happens almost every day at the same time (a different time for each web server).
The weird thing is that looking at the web server access log I can identify these calls, and they are reported with a 200 OK status code.
Could it be that the browser reports a different error code than the server due to connection issues? Any ideas how I should keep investigating it?
We found out that indeed our server had a maintenance procedure which blocked access to it. Although the server finished the request after a while, the browser "gave up" and reported a timeout error. Once the maintenance procedure was canceled, the issue was resolved.
Yes - the server is able to serve the page OK, so it returns 200, but the client cannot finish the connection.
It could be that part of your infrastructure (a firewall?) is choosing to update or something, although the odds of that happening at the exact time of your request are slim unless it's a long-running request or a gateway outage.

Resources