I think I know what is happening here, but would appreciate a confirmation and/or reading material that can turn that "think" into just "know", actual questions at the end of post in Tl,DR section:
Scenario:
I am in the middle of testing my MVC application for a case where one of the internal components is stalling (timeouts on connections to our database).
On one of my web pages there is a Jquery datatable which queries for an update via ajax every half a second - my current task is to display correct error if that data requests times out. So to test, I made a stored procedure that asks DB server to wait 3 seconds before responding, which is longer than the configured timeout settings - so this guarantees a time out exception for me to trap.
I am testing in Chrome browser, one client. Application is being debugged in VS2013 IIS Express
Problem:
Did not expect the following symptoms to show up when my purposeful slow down is activated:
1) After launching the page with the rigged datatable, application slowed down in handling of all requests from the client browser - there are 3 other components that send ajax update requests parallel to the one I purposefully broke, and this same slow down also applied to any actions I made in the web application that would generate a request (like navigating to other pages). The browser's debugger showed the requests were being sent on time, but the corresponding break points on the server side were getting hit much later (delays of over 10 seconds to even a several minutes)
2) My server kept processing requests even after I close the tab with the application. I closed the browser, I made sure that the chrome.exe process is terminated, but breakpoints on various Controller actions were still getting hit for 20 minutes afterward - mostly on the actions that were "triggered" by automatically looping ajax requests from several pages I was trying to visit during my tests. Also breakpoints were hit on main pages I was trying to navigate to. On second test I used RawCap monitor the loopback interface to make sure that there was nothing actually making requests still running in the background.
Theory I would like confirmed or denied with an alternate explanation:
So the above scenario was making looped requests at a frequency that the server couldn't handle - the client datatable loop was sending them every .5 seconds, and each one would take at least 3 seconds to generate the timeout. And obviously somewhere in IIS express there has to be a limit of how many concurrent requests it is able to handle...
What was a surprise for me was that I sort of assumed that if that limit (which I also assumed to exist) was reached, then requests would be denied - instead it appears they were queued for an absolutely useless amount of time to be processed later - I mean, under what scenario would it be useful to process a queued web request half an hour later?
So my questions so far are these:
Tl,DR questions:
Does IIS Express (that comes with Visual Studio 2013) have a concurrent connection limit?
If yes :
{
Is this limit configurable somewhere, and if yes, where?
How does IIS express handle situations where that limit is reached - is that handling also configurable somewhere? ( i mean like queueing vs. immediate error like server is busy)
}
If no:
{
How does the server handle scenarios when requests are coming faster than they can be processed and can that handling be configured anywhere?
}
Here - http://www.iis.net/learn/install/installing-iis-7/iis-features-and-vista-editions
I found that IIS7 at least allowed unlimited number of silmulatneous connections, but how does that actually work if the server is just not fast enough to process all requests? Can a limit be configured anywhere, as well as handling of that limit being reached?
Would appreciate any links to online reading material on the above.
First, here's a brief web server 101. Production-class web servers are multithreaded, and roughly one thread = one request. You'll typically see some sort of setting for your web server called its "max requests", and this, again, roughly corresponds to how many threads it can spawn. Each thread has overhead in terms of CPU and RAM, so there's a very real upward limit to how many a web server can spawn given the resources the machine it's running on has.
When a web server reaches this limit, it does not start denying requests, but rather queues requests to handled once threads free up. For example, if a web server has a max requests of 1000 (typical) and it suddenly gets bombarded with 1500 requests. The first 1000 will be handled immediately and the further 500 will be queued until some of the initial requests have been responded to, freeing up threads and allowing some of the queued requests to be processed.
A related topic area here is async, which in the context of a web application, allows threads to be returned to the "pool" when they're in a wait-state. For example, if you were talking to an API, there's a period of waiting, usually due to network latency, between sending the request and getting a response from the API. If you handled this asynchronously, then during that period, the thread could be returned to the pool to handle other requests (like those 500 queued up requests from the previous example). When the API finally responded, a thread would be returned to finish processing the request. Async allows the server to handle resources more efficiently by using threads that otherwise would be idle to handle new requests.
Then, there's the concept of client-server. In protocols like HTTP, the client makes a request and the server responds to that request. However, there's no persistent connection between the two. (This is somewhat untrue as of HTTP 1.1. Connections between the client and server are sometimes persisted, but this is only to allow faster future requests/responses, as the time it takes to initiate the connection is not a factor. However, there's no real persistent communication about the status of the client/server still in this scenario). The main point here is that if a client, like a web browser, sends a request to the server, and then the client is closed (such as closing the tab in the browser), that fact is not communicated to the server. All the server knows is that it received a request and must respond, and respond it will, even though there's technically nothing on the other end to receive it, any more. In other words, just because the browser tab has been closed, doesn't mean that the server will just stop processing the request and move on.
Then there's timeouts. Both clients and servers will have some timeout value they'll abide by. The distributed nature of the Internet (enabled by protocols like TCP/IP and HTTP), means that nodes in the network are assumed to be transient. There's no persistent connection (aside from the same note above) and network interruptions could occur between the client making a request and the server responding to the request. If the client/server did not plan for this, they could simply sit there forever waiting. However, these timeouts are can vary widely. A server will usually timeout in responding to a request within 30 seconds (though it could potentially be set indefinitely). Clients like web browsers tend to be a bit more forgiving, having timeouts of 2 minutes or longer in some cases. When the server hits its timeout, the request will be aborted. Depending on why the timeout occurred the client may receive various error responses. When the client times out, however, there's usually no notification to the server. That means that if the server's timeout is higher than the client's, the server will continue trying to respond, even though the client has already moved on. Closing a browser tab could be considered an immediate client timeout, but again, the server is none the wiser and keeps trying to do its job.
So, what all this boils down is this. First, when doing long-polling (which is what you're doing by submitting an AJAX request repeatedly per some interval of time), you need to build in a cancellation scheme. For example, if the last 5 requests have timed out, you should stop polling at least for some period of time. Even better would be to have the response of one AJAX request initiate the next. So, instead of using something like setInterval, you could use setTimeout and have the AJAX callback initiate it. That way, the requests only continue if the chain is unbroken. If one AJAX request fails, the polling stops immediately. However, in that scenario, you may need some fallback to re-initiate the request chain after some period of time. This prevents bombarding your already failing server endlessly with new requests. Also, there should always be some upward limit of the time polling should continue. If the user leaves the tab open for days, not using it, should you really keep polling the server for all that time?
On the server-side, you can use async with cancellation tokens. This does two things: 1) it gives your server a little more breathing room to handle more requests and 2) it provides a way to unwind the request if some portion of it should time out. More information about that can be found at: http://www.asp.net/mvc/overview/performance/using-asynchronous-methods-in-aspnet-mvc-4#CancelToken
Related
In a typical view of the internet you have a client that makes a HTTP request. This hits a server which processes the request and returns back to the client a response. I'm confused to what happens though if multiple clients make the same HTTP request to the same server at the same time.
I think a typical server can handle multiple requests at the same time so if the amount is less than that then there is not a problem. If it is more though then what happens exactly? Searching online I can't find an answer but I would think it is one of the following but maybe I'm entirely wrong:
1) If the server is already at full capacity then making a request just does not get a response
2) There is a queue somewhere which keeps idle requests until the server is ready to process them. This queue would be on the server itself or somewhere else?
3) Is this something handled by the TCP protocol, if it does not make the connection straight away because the server is overworked it tries again after a time limit
I had the following issue on a system that I supported ~7 years ago. We never got to the bottom of it, and focus shifted onto other issues. I was recently reminded of it, and wondered if anyone would know what was going on. But alas I'll be a little short on details. Sorry.
The Setup
I had a farm of web servers sitting behind a load balancer. The servers were hosting a system that would receive HTTP requests (XML &/or SOAP) from clients, then for each one kick-off a bunch of further HTTP requests to 3rd-party-suppliers, wait for the suppliers' responses, process and combine the results and respond to the client's request.
Think insurance comparison, but as Business-To-Business XML service.
The whole processing would take 5s of seconds, from receiving the initial client request to them sending back a response to that original HTTP request, and the server would be processing 10s or 100s of requests in parallel (i.e. at any given point, a given webserver would have many client Requests that had come in, and been logged, but not yet been responded to.)
We had detailed logging which record the reciept of the requests, including origin IP and which server was processing the request, and record when a response was sent.
All client requests were sent to a single IP address (well, URL), which was the address of the loadbalancer, which would then forward requests to the webservers, which weren't individually accessible to the internet (they didn't have public IP addresses).
Our load balancer would allow us to take individual web-servers out of rotation, for maintenance.
When we did that we could watch the DB logs, and see new requests stop coming in, and the existing request gradually get completed, until there were now outstanding requests and the server was idle.
The problem
We found that sometimes, when we took a server out of rotation ... it wouldn't entirely stop receiving requests. You could see the large bulk of request suddenly stop coming in, but it would still receive a trickle of fresh requests (I don't know ... maybe 0.1% of normal load, maybe less?). I think the longest we left it going was maybe ... 10 minutes?
Notably we realised that all of those requests were coming from a single client/IP address (I don't remember which).
I forget whether other (still-in-rotation) webservers were still receiving requests from this client, but I think they were?
If we rebooted the webserver, no further requests would come in after restarting.
Web stack was Windows, IIS, ASP.NET; pretty old school even at the time. All servers individually owned and configured.
What was happening?
We vaguely waved our hands and asserted that the client's integration with us was "holding an HTTP tunnel open and sending multiple requests through it", rather than sending each request separately, and thus was maintaining that tunnel even after the LB stopped sending new requests to that server. But that was BS-waffle, and since we never needed to actually understand what was going on, we ignored it and moved on with our lives :)
But I'd still like to know what we were seeing, if anyone can diagnose it from that description.
We vaguely waved our hands and asserted that the client's integration with us was "holding an HTTP tunnel open and sending multiple requests through it", rather than sending each request separately, and thus was maintaining that tunnel even after the LB stopped sending new requests to that server.
That sounds like a good explanation.
Normally, a LB will refuse new connections to a removed server, but will allow open connections to live on until they naturally close. This is known as "connection draining" or "graceful shutdown".
If one of your clients had HTTP keepalive on, and was holding a TCP connection open and sending HTTP requests through it for a long time, it would give the symptoms you describe.
Most LBs will have a configuration knob for how long to wait for connections to close before force-closing them during this "connection draining" time. You can set a timeout here to avoid this scenario if it is a problem for you.
The HTTP connection handling behaviour of clients will vary at the client's discretion, to a large extent. Perhaps most of your clients were of one type (say, web browsers) and weren't holding open a single connection for 10 mins, but perhaps one client was different (say, a programmatic HTTP API client)?
Further reading about "connection draining" on AWS Load Balancers here (the exact details will vary by LB vendor): https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/config-conn-drain.html
Further reading about HTTP keep alive here: https://en.wikipedia.org/wiki/HTTP_persistent_connection
Our front-end MVC3 web application is using AsyncController, because each of our instances is servicing many hundreds of long-running, IO bound processes.
Since Azure will terminate "inactive" http sessions after some pre-determined interval (which seems to vary depending up on what website you read), how can we keep the connections alive?
Our clients MUST stay connected, and our processes will run from 30 seconds to 5 minutes or more. How can we keep the client connected/alive? I initially thought of having a timeout on the Async method, and just hitting the Response object with a few bytes of output, sort of like chunking the response, and then going back and waiting some more. However, I don't think this will work, since MVC3 is handling the hookup of an IIS thread back to the asynchronous response, which will have already rendered a view at that time.
How can we run a really long process on an AsyncController, but have the client not be disconnected by the Azure Load Balancer? Sending an immediate response to the caller, and asking that caller to poll or check another resource URL is not acceptable.
Azure load balancer idle time-out is 4 minutes. Can you try to configure TCP keep-alive on the client side for less than 4 minutes, that should keep the connection alive?
On the other hand, it's pretty expensive to keep a connection open per client for a long time. This will limit the number of clients you can handle per server. Also, I think IIS may still decide to close a connection regardless of keep-alives if it thinks it need the connection to serve other requests.
In order to overcome the (apparent) 4 minute idle connection timeout on the Azure load balancer, it seems necessary to send some data down the pipe to the client every now and again to keep the connection from being regarded as idle.
Our controller is set up as an AsyncController, and it fires several different asynchronous methods on other objects, all of which are set up to use IO Completion Ports. Thus, we return from our method immediately, and when the completion packet is processed, IIS hooks back up to the original request so that we can render our View.
Is there any way to periodically send a few bytes down the wire in this case? In a "classic" situation, we could have executed the method and then just spun while we waited, sending data every few seconds until the asynchronous method was complete. But, in this situation, the IIS thread is freed to go do other business, and we hook back up to it in our completion callback. What to do? Is this possible?
While your particular case concerns Windows Azure specific (the 4 minute timeout of LBs), the question is pure IIS / ASP.NET workwise. Anyway, I don't think it is possible to send "ping-backs" to the client while in AsyncController/AsyncPage. This is the whole idea of the AsyncPages/Controllers. The IIS leaves the socket aside having the thread serving other requests. And gets back only when you got the OutstandingOperations to zero with AsyncManager.OutstandingOperations.Decrement(); Only then the control is given back to send final response to the client. And once you are the point of sending response, there is no turning back.
I would rather argue for the architectural approach of why you thing someone would wait 4 minutes to get a response (even with a good animated "please wait")? A lot of things may happen during this time. From browser crash, through internet disruption to total power loss/disruption at client. If you are doing real Azure, why not just send tasks for a Worker Role via a Queue (Azure Storage Queues or Service Bus Queues). The other option that stays in front of you for so long running tasks is to use SingalR and fully AJAXed solution. Where you communicate via SignalR the status of the long running operation.
UPDATE 1 due to comments
In addition to the approach suggested by #knightpfhor this can be also achieved with a Queues. Requestor creates a task with some Unique ID and sends it to "Task submission queue". Then "listens" (or polls at regular/irregular intervals) a "Task completion" queue for a message with given Task ID.
In any way I don't see a reason for keeping client connected for the whole duration of the long running task. There are number of ways to decouple such communication.
I'm using SignalR with Redis as a message bus on a server that sits behind an Nginx proxy for load balancing. I used SignalR's PersistentConnection class to write a simple chat program that broadcasts messages to users belonging to the same certain group. Users are added to a group in OnConnectedAsync, removed in OnDisconnectAsync, and the user-to-group mapping is deterministic.
Currently, the client side falls back to long polling for whatever reason (I'm not entirely sure why), and whenever the client sets up a new connection after waiting for and receiving a response, seemingly at random, the server will sometimes respond to the new connection immediately with the previous response, despite there having only been one POST.
The message ID's tend to differ by exactly one, (the smaller ID coming first), with the rest of the response remaining the same. I logged some debug info and am quite positive that my override of OnReceivedAsync is sending one response per one request. I tried the same implementation without the Redis message bus, and got the same problem. Running locally (with long polling) however yielded good results so I suspect that the problem might be with the way the message bus might be buffering messages to refresh clients who might not be caught up, and some weird timing with the cutting/setting up of connections with the Nginx load balancer, but beyond that, I am very much at a loss.
Any help would be appreciated.
EDIT: Further investigation reveals that duplication occurs at somewhat regular intervals of approximately 20-30 seconds. I'm led to believe that the message expiration in the message bus might have something to do with the bug.
EDIT: Bug can be seen here: http://tinyurl.com/9q5t3va
The server is simply broadcasting a counter being sent by the client. You will notice some responses are duplicated every 20 or so.
Reducing the number of worker processes in the IIS (6.0) Server Manager from 2 to 1 solved the problem.