Asp.net guaranteed response time - asp.net

Does anybody have any hints as to how to approach writing an ASP.net app that needs to have a guaranteed response time?
When under high load that would normally cause us to exceed our desired response time, we want to throw out an appropriate number of requests, so that the rest of the requests can return before the max response time. Throwing out requests based on exceeding a fixed req/s is not viable, as there are other external factors that will control response time that cause the max rps we can safely support to fiarly drastically drift and fluctuate over time.
Its ok if a few requests take a little too long, but we'd like the great majority of them to meet the required response time window. We want to "throw out" the minimal or near minimal number of requests so that we can process the rest of the requests in the allotted response time.
It should account for ASP.Net queuing time, ideally the network request time but that is less important.
We'd also love to do adaptive work, like make a db call if we have plenty of time, but do some computations if we're shorter on time.
Thanks!

SLAs with a guaranteed response time require a bit of work.
First off you need to spend a lot of time profiling your application. You want to understand exactly how it behaves under various load scenarios: light, medium, heavy, crushing.. When doing this profiling step it is going to be critical that it's done on the exact same hardware / software configuration that production uses. Results from one set of hardware have no bearing on results from an even slightly different set of hardware. This isn't just about the servers either; I'm talking routers, switches, cable lengths, hard drives (make/model), everything. Even BIOS revisions on the machines, RAID controllers and any other device in the loop.
While profiling make sure the types of work loads represent an actual slice of what you are going to see. Obviously there are certain load mixes which will execute faster than others.
I'm not entirely sure what you mean by "throw out an appropriate number of requests". That sounds like you want to drop those requests... which sounds wrong on a number of levels. Doing this usually kills an SLA as being an "outage".
Next, you are going to have to actively monitor your servers for load. If load levels get within a certain percentage of your max then you need to add more hardware to increase capacity.
Another thing, monitoring result times internally is only part of it. You'll need to monitor them from various external locations as well depending on where your clients are.
And that's just about your application. There are other forces at work such as your connection to the Internet. You will need multiple providers with active failover in case one goes down... Or, if possible, go with a solid cloud provider.

Yes, in the last mvcConf one of the speakers compares the performance of various view engines for ASP.NET MVC. I think it was Steven Smith's presentation that did the comparison, but I'm not 100% sure.
You have to keep in mind, however, that ASP.NET will really only play a very minor role in the performance of your app; DB is likely to be your biggest bottle neck.
Hope the video helps.

Related

Why the time-out values are so small?

I'm not sure if this is a pure stackoverflow relevant question. It is related to general design practice. Since I cannot think of another relevant stack exchange site, posting it here.
In the general design practice of converting an async call to sync one, we use a time-out and wait for the results. While, this may not exactly a good practice from the point of view of responsiveness, it definitely makes the implementation easier.
I have seen many such implementations and often noticed that the developers tend to give a very small time-out value. I can understand that the people may have the need of a responsive system in mind when they did this. But many of these applications I have seen are very data critical ones where the loss of data is very bad. So, it is always better to wait more and try to get as much data instead of timing out early and giving an error message to the user. Now, the situations where the server failing to give data or the client unable to reach server etc are rare. In those situations, I expect the a large time-out for such waits. After all, these time-outs don't mean that the wait will definitely last until the given time-out value; the timeout value is only an upper limit. So, I have always arguing for higher values here. But I see the use of low values in more and more places and now I'm getting confused if really there is something else in this practice that I don't understand.
So, my question is : Are there any arguments, other than the need for responsiveness to implement a very small time-out for waiting?
As always, the right decision depends on the real-life data.
The timeout should be proportional to the time it usually takes to complete an operation successfully.
Sending a UDP message for example could take between 1 - 50 milliseconds so a timeout of 100 milliseconds is more than reasonable however copying a file over the wire could take minutes or more so a 100 millisecond timeout is laughable.
There are pros and cons to both short and long timeouts so it's a tradeoff. Longer timeouts use more resources (tasks, threads, memory, etc.) for the same amount of work while short timeouts, as you mentioned, may result in loss of data.
In conclusion, you need to set a configurable timeout that sounds reasonable and then figure out whether you timeout too many operations in production or the other way around and calibrate accordingly.

Optimization of parallel programming

I want to use MPI to make my program parallel, and I want to send something to other computers. I want to know which one is better: sending a huge buffer one time or sending two smaller messages 3 times atrent times during the execution instead of all at once?
It's almost always going to be faster to send the one big message than the smaller one. Each time you do a Send/Receive pair, the two processes have to go through the entire process of sending a message to each other, including at least 6 roundtrip messages. If you are just sending one larger message, there is a minimum of 2 roundtrip messages. Each of those messages can be very expensive (compared to doing things locally like packing all of your data into one buffer).
I'd encourage you to try it out both ways though to be sure that this applies to your application. It could be different if you're doing something unexpected.
Depending on your problem, sending all data may be more efficient, because the nodes have to be synced, every time. That may cause a delay.
I always try to send as much data as I can in a single MPI call. In my experience, sending many small bits of data greatly increases the overhead and network traffic, and I have even run into problems where I overwhelmed the computers' ability to keep up with the number of requests, because I was sending a large member of a complicated class, one integer at a time, to many workers. Therefore, when possible, send the entire data at once, unless you have some reason to believe it is too large.
Further, I strive to use 100% of all the CPU's my program claims. When you are working on shared resources, if you use a CPU, you need to actually use it. Otherwise, someone else who wants to use that core, or node, is blocked out while your program sits and does nothing. For example, on a Cray which I have used, even if you call for only two 'cores', the manager will reserve a full bank of 24 cores, essentially wasting 22. Or, perhaps one worker has nothing to do, while another chugs away -- again, wasting time. Hopefully, there is a way to balance the load, so to speak, to avoid unintentional waste of resources.
Back to the topic at hand. Demonstrate timing and efficiency of vector sending to yourself -- write a program which breaks up the vector into varying sizes of packets, and do the sends/receives. Test it with varying numbers of workers, and on several different configurations of computers, if you can. Before writing production code, do proof of concept, and what optimization you can. Test and time it!

How can I find the average number of concurrent users for IIS to simulate during a load/performance test?

I'm using JMeter for load testing. I'm going through and exercise of finding the max number of concurrent threads (users) that our webserver can handle by simply increasing the # of threads in my distributed JMeter test case, and firing off the test.
Then -- it struck me, that while the MAX number may be useful, the REAL number of users that my website actually handles on average is the number I need to make the test fruitful.
Here are a few pieces of information about our setup:
This is a mixed .NET/Classic ASP site. Upon login, a browser session (with timeout) is created in both for the users.
Each session times out after 60 minutes.
Is there a way using this information, IIS logs, performance counters, and/or some calculation that will help me determine the average # of concurrent users we handle on our production site?
You might use logparser with the QUANTIZE function to determine the peak number of requests over a suitable interval.
For a 10 second window, it would be something like:
logparser "select quantize(to_localtime(to_timestamp(date,time)), 10) as Qnt,
count(*) as Hits from yourLogFile.log group by Qnt order by Hits desc"
The reported counts won't be exactly the same as threads or users, but they should help get you pointed in the right direction.
The best way to do exact counts is probably with performance counters, but I'm not sure any of the standard ones works like you would want -- you'd probably need to create a custom counter.
I can see a couple options here.
Use Performance Monitor to get the current numbers or have it log all day and get an average. ASP.NET has a Requests Current counter. According to this page Classic ASP also has a Requests current, but I've never used it myself.
Run the IIS logs through Log Parser to get the total number of requests and how long each took. I'm thinking that if you know how many requests come in each hour and how long each took, you can get an average of how many were running concurrently.
Also, keep in mind that concurrent users isn't quite the same as concurrent threads on the server. For one, multiple threads will be active per user while content like images is being downloaded. And after that the user will be on the page for a few minutes while the server is idle.
My suggestion is that you define the stop conditions first, such as
Maximum CPU utilization
Maximum memory usage
Maximum response time for requests
Other key parameters you like
It is really subjective to choose the parameters and I personally cannot provide much experience on that.
Secondly you can see whether performance counters or IIS logs can map to the parameters. Then you set up proper mappings.
Thirdly you can start testing by simulating N users (threads) and see whether the stop conditions hit. If not hit, you can go to a higher number. If hit, you can use a smaller number. Recursively you will find a rough number.
However, that never means your web site in real world can take so many users. No simulation so far can cover all the edge cases.

Practical value for concurrent-request-timeout parameter or options for avoiding concurrent access to conversation exception

In the Seam Reference Guide, one can find this paragraph:
We can set a sensible default for the concurrent request timeout (in ms) in components.xml:
<core:manager concurrent-request-timeout="500" />
However, we found that 500 ms is not nearly enough time for most of the cases we had to deal with, especially with the severe restriction seam places on conversation access.
In our application we have a combination of page scoped ajax requests (triggered by various user actions), some global scoped polling notification logic (part of the header, so included in every page) and regular links that invoke actions and/or navigate to other pages.
Therefore, we get the dreaded concurrent access to conversation exception way too often, even without any significant load on the site.
After researching the options for quite a bit, we ended up bumping this value to several seconds (we're debating whether to bump it up to 10s), as none of the recommended solutions seemed able to solve our issue completely (even forcing a global queue for all the ajax requests would still leave us exposed to a user deciding to click a link right when one of our polling calls was in progress). And we'd much rather have the users wait for a second or two instead of getting an error page just because they clicked a link at the wrong moment.
And now to the question: is there something obvious we're missing (like a way to allow concurrent access to conversations and taking care of the needed locking ourselves, for instance :)? How do people solve this problem (ajax requests mixed with user driven interaction) in seam? Disabling all the links on the page while ajax requests are in progress (as suggested by one blog page) is really not a viable option.
Any other suggestions?
TIA,
Andrei
We use 60000 or 120000 (1-2 minutes). Concurrent-request-timeout is designed to avoid deadlocks. Historically we have far more problems with timeouts than deadlocks. A better approach is to use a client-side queue (<a4j:ajaxQueue> if using RichFaces) to serialize and remove duplicate requests as much as possible, then set the timeout high enough to avoid any remaining problems.
There are many serious issues resulting from Seam's concurrent request timeouts:
The issue is the last request gets the ConcurrentRequestTimeoutException. If the user double-clicks or reloads the page, only the last request matters -- why should he get an error?
Usually the ConcurrentRequestTimeoutException is suppressed, and only secondary NullPointerExceptions and #In injection failures are shown, making debugging difficult.
Seam 2.2.1 has a severe problem where transactions, ThreadLocals, and locks may leak after a timeout occurs, especially when used with <spring:spring-transaction/>. Look at SeamPhaseListener.afterRestoreView: there's no finally block to clean up after restoreConversation fails!
In my opinion there are many poor aspects to this design, so it's best to use a much higher timeout and try to avoid the issues.
This is what we have and it works fine for us:
<core:manager concurrent-request-timeout="5000"
conversation-timeout="120000" conversation-id-parameter="cid"
parent-conversation-id-parameter="pid" />
We also use a much higher value for the concurrent-request-timeout.
At least for duplicate events you can use settings in the a4j components to filter and delay them with eventsQueue, requestDelay and ignoreDupResponses=”true”.
(Last point http://docs.jboss.org/seam/2.0.1.GA/reference/en/html/conversations.html )
Can you analyse which types of request are taking a long time? Is there a particular type which you could reduce the request time by doing the "work" asynchronously and getting the update back in your poll?
In my opinion, ajax requests should always complete fairly quickly, then you can calculate a max concurrent request time by (request time * max number of requests likely to be initiated)

What is the most accurate method of estimating peak bandwidth requirement for a web application?

I am working on a client proposal and they will need to upgrade their network infrastructure to support hosting an ASP.NET application. Essentially, I need to estimate peak usage for a system with a known quantity of users (currently 250). A simple answer like "you'll need a dedicated T1 line" would probably suffice, but I'd like to have data to back it up.
Another question referenced NetLimiter, which looks pretty slick for getting a sense of what's being used.
My general thought is that I'll fire the web app up and use the system like I would anticipate it be used at the customer, really at a leisurely pace, over a certain time span, and then multiply the bandwidth usage by the number of users and divide by the time.
This doesn't seem very scientific. It may be good enough for a proposal, but I'd like to see if there's a better way.
I know there are load tools available for testing web application performance, but it seems like these would not accurately simulate peak user load for bandwidth testing purposes (too much at once).
The platform is Windows/ASP.NET and the application is hosted within SharePoint (MOSS 2007).
In lieu of a good reporting tool for bandwidth usage, you can always do a rough guesstimate.
N = Number of page views in busiest hour
P = Average Page size
(N * P) /3600) = Average traffic per second.
The server itself will have a lot more internal traffic for probably db server/NAS/etc. But outward facing that should give you a very rough idea on utilization. Obviously you will need to far surpass the above value as you never want to be 100% utilized, and to allow for other traffic.
I would also not suggest using an arbitrary number like 250 users. Use the heaviest production day/hour as a reference. Double and triple if you like, but that will give you the expected distribution of user behavior if you have good log files/user auditing. It will help make your guesstimate more accurate.
As another commenter pointed out, a data center is a good idea, when redundancy and bandwidth availability become are a concern. Your needs may vary, but do not dismiss the suggestion lightly.
There are several additional questions that need to be asked here.
Is it 250 total users, or 250 concurrent users? If concurrent, is that 250 peak, or 250 typically? If it's 250 total users, are they all expected to use it at the same time (eg, an intranet site, where people must use it as part of their job), or is it more of a community site where they may or may not use it? I assume the way you've worded this that it is 250 total users, but that still doesn't tell enough about the site to make an estimate.
If it's a community or "normal" internet site, it will also depend on the usage - eg, are people really going to be using this intensely, or is it something that some users will simply log into once, and then forget? This can be a tough question from your perspective, since you will want to assume the former, but if you spend a lot of money on network infrastructure and no one ends up using it, it can be a very bad thing.
What is the site doing? At the low end of the spectrum, there is a "typical" web application, where you have reasonable size (say, 1-2k) pages and a handful of images. A bit more intense is a site that has a lot of media - eg, flickr style image browsing. At the upper end is a site with a lot of downloads - streaming movies, or just large files or datasets being downloaded.
This is getting a bit outside the threshold of your question, but another thing to look at is the future of the site: is the usage going to possibly double in the next year, or month? Be wary of locking into a long term contract with something like a T1 or fiber connection, without having some way to upgrade.
Another question is reliability - do you need redundancy in connections? It can cost a lot up front, but there are ways to do multi-homed connections where you can balance access across a couple of links, and then just use one (albeit with reduced capacity) in the event of failure.
Another option to consider, which effectively lets you completely avoid this entire question, is to just host the application in a datacenter. You pay a relatively low monthly fee (low compared to the cost of a dedicated high-quality connection), and you get as much bandwidth as you need (eg, most hosting plans will give you something like 500GB transfer a month, to start with - and some will just give you unlimited). The datacenter is also going to be more reliable than anything you can build (short of your own 6+ figure datacenter) because they have redundant internet, power backup, redundant cooling, fire protection, physical security.. and they have people that manage all of this for you, so you never have to deal with it.

Resources