I'm load/stress testing using Visual Studio's load testing project with concurrent users set quite low at 10. The application makes multiple System.Net.Http.HttpClient requests to a WebAPI service layer/application (which in some cases calls another WebAPI layer/application), and it's only getting around 30 requests/second when testing a single MVC4 controller (not parsing dependent requests).
These figures are based on a sample project where no real code is executed at any of the layers other than the HttpClient calls and constructing a new collection of models to return (so some serialization is happening too as part of WebAPI). I've tried with and without async controllers, and the results seemed pretty similar (a slight improvement with async controllers).
Caching the calls obviously increases it dramatically (to around 400 RPS) although the nature of the data and the traffic means that a significant amount of the calls wouldn't be cached in real world scenarios.
Is HttpClient (or HTTP layer in general) simply too expensive or are there some ideas you have around getting higher throughput?
It might be worth noting that on my local development machine this load test also maxes out the CPU. Any ideas or suggestions would be very much appreciated.
HttpClient has a built-in default limit of 2 concurrent connections per host.
Have you changed the default?
See this answer
I believe if you chase the dependencies for HttpClient, you will find that it relies on the settings of ServicePointManager.
See this link
You can change the default connection limit by calling System.Net.ServicePointManager.DefaultConnectionLimit
You can increase the number of concurrent connections for HttpClient. The default connection limit is 2. To find the optimum limit, try the formula 12 × the number of CPUs on your machine.
In code:
ServicePointManager.DefaultConnectionLimit = 48;
Or in the App.config or web.config file:
<system.net>
<connectionManagement>
<add address="*" maxconnection="48"/>
</connectionManagement>
</system.net>
Check this link for more detail.
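As a sketch of the same idea (the endpoint URL here is a placeholder, and the 12 × CPU figure is only a starting heuristic to measure against), the limit can also be set programmatically, as long as it happens before the first request is issued:

```csharp
using System;
using System.Net;
using System.Net.Http;

class Program
{
    static void Main()
    {
        // Must run before the first request: ServicePoints that already
        // exist keep the limit they were created with.
        ServicePointManager.DefaultConnectionLimit = 12 * Environment.ProcessorCount;

        using (var client = new HttpClient())
        {
            // Concurrent calls to the same host can now use up to
            // DefaultConnectionLimit connections instead of the default 2.
            var response = client.GetAsync("http://localhost/api/values").Result;
            Console.WriteLine(response.StatusCode);
        }
    }
}
```

Measure throughput at a few different limits rather than trusting any one formula; the optimum depends on latency to the downstream service.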
Related
I have an AngularJS app that calls WebAPI. If I log the time I initiate a request (in my Angular controller) and the time OnActionExecuting runs (in an action filter in my WebAPI controller), I sometimes notice a ~2 second gap. I'm assuming nothing else runs before this filter, and that the gap is due to requests being blocked/queued. The reason I assume this is that if I remove all my other data calls, I do not see the gap.
What is the number of parallel requests that WebAPI can handle at once? I tried looking at the ASP.NET performance monitors but couldn't find where I could see this data. Can someone shed some insight into this?
There's no straight answer for this but the shortest one is ...
There is no limit to this in WebApi itself; the limits come from what your server can handle and how efficient the code you run on it is.
...
But since you asked, let's consider some basic things that we can assume about our server and our application ...
1. Concurrent connections
A typical server runs into issues like the "C10k problem" ... https://en.wikipedia.org/wiki/C10k_problem ... so that puts a practical limit on the number of concurrent connections.
Assuming each WebApi call is made from, say, some AJAX call on a web page, that gives us a limit of around 10k connections before things get ugly.
2. Dependency-related overheads
If we then consider the complexity of the code in question, you may have a bottleneck in doing things like SQL queries. I have often written WebApi controllers whose business logic runs 10+ DB queries; the overhead here may be your problem.
3. Feed-in overhead
What about network bandwidth to the server?
Let's assume we are streaming 1 MB of data for each call; it won't take long to choke a 1 Gb/s Ethernet line with messages that size: at roughly 125 MB/s of theoretical bandwidth, you top out at about 125 such requests per second.
4. Processing overhead
Assuming you wrote an API that does complex calculations (e.g. mesh generation for complex 3D data), you could easily choke your CPU for some time on each request.
5. Timeouts
Assuming the server can accept your request and the request is made asynchronously, the biggest issue then is: how long are you prepared to wait for your response? The shorter that wait, the less work the server has time to complete before each request needs a response.
...
So as you can see, this is by no means an exhaustive list, but it outlines the complexity of the question you asked. That said, I would argue that WebApi (the framework) has no limits; it's really the infrastructure around it whose limitations determine what is possible.
Background
I'm in the midst of comparing the performance of NancyFx and ServiceStack.NET running under IIS 7 (testing on a Windows 7 host). Both are insanely fast - testing locally, each framework processes over 10,000 req/sec, with ServiceStack being about 20% faster.
The problem I'm running into is that ASP.NET appears to be caching the responses for each unique URI request from the HttpHandler, quickly leading to massive memory pressure (3+ GB) and overworking the garbage collector (~25% of time consumed by the GC). So far I've been unable to disable the caching and buildup of objects, and am looking for suggestions on how to disable this behavior.
Details
The request loop is basically as follows:
for i = 1..100000:
    string uri = "http://localhost/users/{i}"
    Http.Get(uri)
The response is a simple JSON object, formatted as { UserID: n }.
I've cracked open WinDBG, and for each request there are:
One System.Web.FileChangeEventHandler
Two System.Web.Configuration.MapPathCacheInfos
Two System.Web.CachedPathDatas
Three System.Web.Caching.CacheDependencys
Five System.Web.Caching.CacheEntrys
Obviously, these cache items are what is leading me to believe it's a cache bloat issue (I'd love to get rid of 150,000 unusable objects!).
What I've tried so far
In IIS 'HTTP Response Headers', set 'Expire Web content' to 'immediately'.
In the web.config
<system.web>
<caching>
<outputCache enableOutputCache="false" enableFragmentCache="false"/>
</caching>
</system.web>
Also in the web.config (and many variations on the policies, including none).
<caching enabled="false" enableKernelCache="false">
<profiles>
<add policy="DontCache" kernelCachePolicy="DontCache" extension="*" />
</profiles>
</caching>
Looked through the source code of the frameworks to see if there might be any "features" built in that would use ASP.NET caching. While there are caching helpers, they are private to the framework itself and do not appear to leverage ASP.NET caching.
Update #1
Digging through reflector I've found that setting the value for UrlMetadataSlidingExpiration to zero eliminates a large portion of the excessive memory usage, at the expense of cutting throughput by 50% (the FileAuthorizationModule class caches the FileSecurityDescriptors, which must be somewhat expensive to generate, when UrlMetadataSlidingExpiration is non-zero).
This is done by updating the web.config and placing the following in <system.web>:
<hostingEnvironment urlMetadataSlidingExpiration="00:00:00"/>
I'm going to try to fully disable the FileAuthorizationModule from running, if possible, to see if that helps. However, ASP.NET is still generating 2*N MapPathCacheInfo and CacheEntry objects, so memory is still getting consumed, just at much slower rate.
Update #2
The other half of the problem is the same issue as described here: Prevent many different MVC URLs from filling ASP.NET Cache. Setting
<cache percentagePhysicalMemoryUsedLimit="1" privateBytesPollTime="00:00:01"/>
helps, but even with these very aggressive settings memory usage quickly rises to 2.5GB (compared to 4GB). Ideally these objects would never be created in the first place. Failing that, I may resort to a hacky solution of using reflection to clear out the Caches (all these entries are "private" and are not enumerated when iterating over the public Cache).
A late response for others who suffer the same problem:
This is a known issue:
KB 2504047
This issue occurs because the unique requests that try to access the same resources are cached as MapPathCacheInfo objects for 10 minutes.
While the objects are cached for 10 minutes, the memory consumption of
the W3wp.exe process increases significantly.
You can download the hotfix here
I think this is less of an issue of caching, and more of an issue of "high memory utilization".
Two things,
First, use an IDisposable-friendly object (one that can be used with the "using" keyword). This will allow you to dispose of the object sooner rather than later, creating less pressure on the garbage collector in the long run.
for (int i = 0; i < 10000; i++) {
    using (System.Net.WebClient c = new System.Net.WebClient())
    using (System.IO.Stream stream = c.OpenRead(String.Format("http://url.com/{0}", i))) {
        // consume the stream here; both it and the client are disposed promptly
    }
}
From your pseudo-code, I can only assume that you're using System.Net.HttpWebRequest, which isn't disposable and will probably hang around longer than it should if you are making successive calls.
Secondly, if you are making successive calls to an external server, I'd put delays between each call. This will give some breathing room, as your processor will work through the for-loop much faster than the network can respond (and will otherwise just keep queuing up requests, slowing down the ones actually being processed).
System.Threading.Thread.Sleep(250);
Obviously, the best solution would be to make a single call with a list of the users you want to retrieve and just deal with one web request/response.
Ensure the IsReusable property is set to false so IIS doesn't reuse the same request process to handle the request.
http://support.microsoft.com/kb/308001
Under Windows Server 2008 64-bit, IIS 7.0 and .NET 4.0, an ASP.NET application (using the ASP.NET thread pool, synchronous request processing) is long running (> 30 minutes). The web application has no pages; its main purpose is reading huge files (> 1 GB) in chunks (~5 MB) and transferring them to the clients. Code:
// sourceStream is the FileStream over the large file
int bytesRead;
while ((bytesRead = sourceStream.Read(buffer, 0, buffer.Length)) > 0)
{
    // Write only the bytes actually read, not the full buffer length
    Response.OutputStream.Write(buffer, 0, bytesRead);
    Response.Flush();
}
A single-producer/single-consumer pattern is implemented, so for each request there are two threads. I don't use the Task library here, but please let me know if it has an advantage over traditional thread creation in this scenario. An HTTP handler (.ashx) is used instead of an (.aspx) page. Under stress testing, CPU utilization is not a problem, but with a single worker process, after 210 concurrent clients, new connections encounter a time-out. This is solved by web gardening, since I don't use session state. I'm not sure if there's any big issue I've missed, but please let me know what other considerations should be taken into account in your opinion.
For example, maybe IIS closes long-running TCP connections due to a "connection timeout", since normal ASP.NET pages are processed in less than 5 minutes; should I increase that value?
I appreciate your ideas.
Personally, I would be looking at a different mechanism for this type of processing. HTTP Requests/Web Applications are NOT designed for this type of thing, and stability is going to be VERY hard, you have a number of risks that could cause you major issues as you are working with this type of model.
I would move that processing off to a backend process, so that you are OUTSIDE of the asp.net runtime, that way you have more control over start/shutdown, etc.
First, Never. NEVER. NEVER! do any processing that takes more than a few seconds in a thread pool thread. There are a limited number of them, and they're used by the system for many things. This is asking for trouble.
Second, while the handler is a good idea, you're a little vague on what you mean by "generate on the fly". Do you mean you are encrypting a file on the fly and this encryption can take 30 minutes? Or do you mean you're pulling data from a database and assembling a file? Or that the download takes 30 minutes?
Edit:
As I said, don't use a thread pool for anything long running. Create your own thread, or if you're using .NET 4 use a Task and specify it as long running.
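A minimal sketch of the .NET 4 option mentioned above (the work itself is a placeholder for your actual long-running job):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Worker
{
    static void Main()
    {
        // TaskCreationOptions.LongRunning hints to the scheduler that this
        // work deserves a dedicated thread rather than a pool thread.
        Task longJob = Task.Factory.StartNew(
            () => GenerateFile(),
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);

        longJob.Wait();
    }

    static void GenerateFile()
    {
        // Placeholder for the long-running work (encryption, file assembly, etc.).
        Thread.Sleep(1000);
    }
}
```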
Long running processes should not be implemented this way. Pass this off to a service that you set up.
If you do want to have a page hang for a client, consider interfacing from AJAX to something that does not block on IO threads - like node.js.
Push notifications to many clients is not something ASP.NET can handle due to thread usage, hence my node.js. If your load is low, you have other options.
Use web gardening for more stability of your application.
Turn off caching, since you don't have .aspx pages.
It's hard to advise more without performance analysis. Use the VS built-in tools and find the bottlenecks.
The Web 1.0 way of dealing with long running processes is to spawn them off on the server and return immediately. Have the spawned off service update a database with progress and pages on the site can query for progress.
The most common usage of this technique is getting a package delivery. You can't hold the HTTP connection open until my package shows up, so it just gives you a way to query for progress. The background process deals with orchestrating all of the steps it takes for getting the item, wrapping it up, getting it onto a UPS truck, etc. All along the way, each step is recorded in the database. Conceptually, it's the same.
Edit based on Question Edit: Just return a result page immediately, and generate the binary on the server in a spawned thread or process. Use Ajax to check to see if the file is ready and when it is, provide a link to it.
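The spawn-and-poll pattern described above can be sketched roughly like this (an in-memory dictionary stands in for the progress table in the database, and all names are illustrative):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class JobTracker
{
    // Stands in for the database's progress table.
    static readonly ConcurrentDictionary<Guid, int> Progress =
        new ConcurrentDictionary<Guid, int>();

    // Called when the page returns immediately: kicks off the work.
    public static Guid Start()
    {
        Guid jobId = Guid.NewGuid();
        Progress[jobId] = 0;
        Task.Factory.StartNew(() =>
        {
            for (int pct = 0; pct <= 100; pct += 10)
            {
                // ... do a slice of the real work here ...
                Progress[jobId] = pct;   // record progress for the poller
            }
        }, TaskCreationOptions.LongRunning);
        return jobId;
    }

    // The Ajax endpoint calls this to report progress to the client.
    public static int Check(Guid jobId)
    {
        int pct;
        return Progress.TryGetValue(jobId, out pct) ? pct : -1;
    }
}
```

In production you would persist progress to the database rather than memory, so it survives worker-process recycles and is visible to every node in a farm.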
I am building an ASP.NET web application that will be deployed to a 4-node web farm.
My web application's farm is located in California.
Instead of a database for back-end data, I plan to use a set of web services served from a data center in New York.
I have a page /show-web-service-result.aspx that works like this:
1) User requests page /show-web-service-result.aspx?s=foo
2) Page's codebehind queries a web service that is hosted by the third party in New York.
3) When web service returns, the returned data is formatted and displayed to user in page response.
Does this architecture have potential scalability problems? Suppose I am getting hundreds of unique hits per second, e.g.
/show-web-service-result.aspx?s=foo1
/show-web-service-result.aspx?s=foo2
/show-web-service-result.aspx?s=foo3
etc...
Is it typical for web servers in a farm to be using web services for data instead of database? Any personal experience?
What change should I make to the architecture to improve scalability?
You most definitely have a scalability problem: the third-party web service. Unless you have a service-level agreement with that service (agreeing on the number of requests you can submit per second), chances are real that you will overload it with your anticipated load. That you have four nodes yourself doesn't help you then.
So you should a) come up with an agreement with the third party, and b) test what the actual load is that they can take.
In addition, you need to make sure that your framework can use parallel connections for accessing the remote service. Suppose you have a round-trip time of 20 ms from California to New York (which would be fairly good); you cannot make more than 50 requests per second over a single TCP connection. Likewise, starting a new TCP connection for every request will also kill performance, so you want pooling on these parallel connections.
I don't see a problem with this approach, we use it quite a bit where I work. However, here are some things to consider:
Is your page rendering going to be blocked while waiting for the web service to respond?
What if the response never comes, i.e. the service is down?
For the first problem I would look into using AJAX to update the page after you get a response back from the web service. You'll also want to consider how to handle the no response or timeout condition.
Finally, you should really think about how you could cache the web service data locally. For example if you are calling a stock quoting service then unless you have a real-time feed, there is no reason to call the web service with every request you get. Store the data locally for a period of time and return that until it becomes stale.
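One possible sketch of that local cache, using the built-in ASP.NET cache (the key format, the five-minute window, and the fetchFromService delegate are all assumptions to adapt to your data's staleness tolerance):

```csharp
using System;
using System.Web;
using System.Web.Caching;

public static class QuoteCache
{
    // fetchFromService stands in for the actual web service call.
    public static string GetQuote(string symbol, Func<string, string> fetchFromService)
    {
        string key = "quote:" + symbol;                  // cache key format is an assumption
        string cached = HttpRuntime.Cache[key] as string;
        if (cached != null)
            return cached;                               // serve the local copy

        string fresh = fetchFromService(symbol);         // one round trip to New York
        HttpRuntime.Cache.Insert(key, fresh, null,
            DateTime.UtcNow.AddMinutes(5),               // absolute expiry
            Cache.NoSlidingExpiration);
        return fresh;
    }
}
```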
You may have scalability problems but most of these can be carefully engineered around.
I recommend you use ASP.NET's asynchronous tasks so that the web service is queued up, the thread is released while the request waits for the web service to respond, and then another thread picks up when the web service is done to finish off the request.
MSDN Magazine - Wicked Code - Asynchronous Pages in ASP.NET 2.0
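A rough sketch of that asynchronous-page pattern (the service URL is a placeholder; the page's @Page directive must set Async="true"):

```csharp
using System;
using System.Net;
using System.Web.UI;

public partial class ShowWebServiceResult : Page
{
    WebRequest _request;

    protected void Page_Load(object sender, EventArgs e)
    {
        _request = WebRequest.Create("http://example.com/service?s=foo"); // placeholder
        RegisterAsyncTask(new PageAsyncTask(BeginCall, EndCall, CallTimeout, null));
    }

    IAsyncResult BeginCall(object sender, EventArgs e, AsyncCallback cb, object state)
    {
        // The request thread goes back to the pool while the call is in flight.
        return _request.BeginGetResponse(cb, state);
    }

    void EndCall(IAsyncResult ar)
    {
        using (WebResponse response = _request.EndGetResponse(ar))
        {
            // format the returned data and bind it to the page here
        }
    }

    void CallTimeout(IAsyncResult ar)
    {
        // render a fallback (cached data, error message) if the service is too slow
    }
}
```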
Local caching is an absolute must. The fewer times you have to go from California to New York, the better. You might want to look into Microsoft's Velocity (although that's still in CTP) or NCache, or another distributed cache, so that your 4 web servers don't each have to fetch and cache the same data from the web service - once one server gets it, it should be available to all.
Microsoft Project Code Named "Velocity"
NCache
Other things that can go wrong that you should engineer around:
The web service is down (obviously) and data falls out of cache, and you can't get it back. Try to make it so that the data is not actually dropped from cache until you're sure you have an update available. Then the only risk is if the service is down and your application pool is reset, so don't reset it as a first-line troubleshooting maneuver!
There are two different timeouts on web requests, a connect and an overall timeout. Make sure both are set extremely low and you handle both of them timing out. If the service's DNS goes down, this can look like quite a different failure.
Watch perfmon for ASP.NET Queued Requests. This number will rise rapidly if the service goes down and you're not covering it properly.
Research and adjust ASP.NET performance registry settings so you have a highly optimized ASP.NET thread pool. I don't remember the specifics, but I seem to remember that there's a limit on IO Completion Ports and something else of that nature that are absurdly low for the powerful hardware I'm assuming you have on hand.
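On the two-timeouts point above, a sketch with HttpWebRequest (the URL is a placeholder and the values are deliberately aggressive examples, not recommendations):

```csharp
using System;
using System.Net;

class TimeoutExample
{
    static void Main()
    {
        var request = (HttpWebRequest)WebRequest.Create("http://example.com/api"); // placeholder
        request.Timeout = 2000;          // ms: covers connect + waiting for the response headers
        request.ReadWriteTimeout = 2000; // ms: covers reading the response body

        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                Console.WriteLine(response.StatusCode);
            }
        }
        catch (WebException ex)
        {
            if (ex.Status == WebExceptionStatus.Timeout)
            {
                // Handle the timeout: serve cached data, show a friendly error, etc.
            }
            else
            {
                throw; // DNS failures, connection resets, etc. look different
            }
        }
    }
}
```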
The trendy answer is REST. Any GET request can be HTTP-response cached (with lots of options on how that is configured), and it will be cached by the internet itself (your ISP, essentially).
Your project has an architecture that reflects the direction that Microsoft and many others in the SOA world want to take us. That said, many people try to avoid the type of real-time risk introduced by the web service.
Your system will have a huge dependency on the web service working in an efficient manner. If it doesn't work, or is slow, people will just see that your page isn't working properly.
At the very least, I would get a web stress tool and performance test your web service to at least the traffic levels you expect to get at peaks, and likely beyond this. When does it break (if ever?), when does it start to slow down? These are good metrics to know.
Other options to look at: perhaps you can get daily batches of data from the web service to a local database and hit the database for your web site. Then, if for some reason the web service is down or slow, you could use the most recently obtained data (if this is feasible for your data).
Overall, it should be doable, but you want to understand and measure the risks, and explore any potential options to minimize those risks.
It's fine. There are some scalability issues. Primarily, with the number of calls you are allowed to make to the external web service per second. Some web services (Yahoo shopping for example) limit how often you can call their service and will lock out your account if you call too often. If you have a large farm and lots of traffic, you might have to throttle your requests.
Also, it's typical in these situations to use an interstitial page that forks off a worker thread to go and do the web service call and redirects to the results page when the call returns. (Think a travel site when you do search, you get an interstitial page while they call out to an external source for the flight data and then you get redirected to a results page when the call completes). This may be unnecessary if your web service call returns quickly.
I recommend you be certain to use WCF, and not the legacy ASMX web services technology as the client. Use "Add Service Reference" instead of "Add Web Reference".
One other issue you need to consider, depending on the type of application and/or data you're pulling down: security.
Specifically, I'm referring to authentication and authorization, both of your end users, and the web application itself. Where are these things handled? All in the web app? by the WS? Or maybe the front-end app is authenticating the users, and flowing the user's identity to the back end WS, allowing that to verify that the user is allowed? How do you verify this? Since many other responders here mention a local data cache on the front end app (an EXCELLENT idea, BTW), this gets even MORE complicated: do you cache data that is allowed to userA, but not for userB? if so, how do you verify that userB cannot access data from the cache? What if the authorization is checked by the WS, how do you cache the permissions then?
On the other hand, how are you verifying that only your web app is allowed to access the WS (and an attacker doesn't directly access your WS data over the Internet, for instance)? For that matter, how do you ensure that your web app contacts the CORRECT WS server, and not a bogus one? And of course I assume that all the connection to the WS is only over TLS/SSL... (but of course also programmatically verify the cert applies to the accessed server...)
In short, its complicated, and many elements to consider here.... but it is NOT insurmountable.
(as far as input validation goes, that's actually NOT an issue, since this should be done by BOTH the front end app AND the back end WS...)
Another aspect here, as mentioned by #Martin, is the need for an SLA on whatever provider/hosting service you have for the NY WS, not just for performance, but also to cover availability. I.e. what happens if the server is inaccessible how quickly they commit to getting it back up, what happens if its down for extended periods of time, etc. That's the only way to legitimately transfer the risk of your availability being controlled by an externality.
I'm quickly falling in love with ASP.NET MVC beta, and one of the things I've decided I won't sacrifice in deploying to my IIS 6 hosting environment is the extensionless URL. Therefore, I'm weighing the consideration of adding a wildcard mapping, but everything I read suggests a potential performance hit when using this method. However, I can't find any actual benchmarks!
The first part of this question is, do you know where I might find such benchmarks, or is it just an untested assumption?
The second part of the question is in regards to the 2 load tests I ran using jMeter on our dev server over a 100Mbs connection.
Background Info
Our hosting provider has a 4Gbs burstable internet pipe with a 1Gbs backbone for our VLAN, so anything I can produce over the office lan should translate well to the hosting environment.
The test scenario was to load several images / css files, since the supposed performance hit comes when requesting files that are now being passed through the ASP.NET ISAPI filter that would not normally pass through it. Each test contained 50 threads (simulated users) running the request script for 1000 iterations each. The results for each test are posted below.
Test Results
Without wildcard mapping:
Samples: 50,000
Average response time: 428ms
Number of errors: 0
Requests per second: 110.1
Kilobytes per second: 11,543
With wildcard mapping:
Samples: 50,000
Average response time: 429ms
Number of errors: 0
Requests per second: 109.9
Kilobytes per second: 11,534
Both tests were run warm (everything was in memory, no initial load bias), and from my perspective, performance was about even. CPU usage was approximately 60% for the duration of both tests, memory was fine, and network utilization held steady around 90-95%.
Is this sufficient proof that wildcard mappings that pass through the ASP.NET filter for ALL content don't really affect performance, or am I missing something?
Edit: 11 hours and not a single comment? I was hoping for more.. lol
Chris, very handy post.
Many who suggest a performance disadvantage infer that the code processed in a web application is somehow different from, or inferior to, code processed in the standard workflow. The base code type may be different, and sure, you'll be needing the MSIL runtime, but MS has shown that in many cases you'll actually see a performance increase in a .NET runtime over a native one.
It's also wise to consider how IIS has to be a "jack of all trades" - allowing all sorts of configuration and overrides even on static files. Some of those are designed for performance increase (caching, compression) and - indeed - will be lost unless you reimplement them in your code, but many of them are for other purposes and may not ever be used. If you build for your needs (only) you can ignore those other pieces and should be realising some kind of performance advantage, even though there's a potential ASP.NET disadvantage.
In my (non-.NET) MVC testing I'm seeing considerable (10x or more) performance benefits over webforms. Even if there was a small hit on the static content - that wouldn't be a tough pill to swallow.
I'm not surprised the difference is almost negligible in your tests, but I'm happy to see it backed up.
NOTE: You can disable wildcard mapping for static directories (I keep all static files in /static/(pics|styles|...)) in IIS. Switch the folder to an application, remove the wildcard mapping, and switch it back from an application and - voilà - static files are handled by IIS without pestering your ASP.NET.
I think there are several additional things to check:
Since we're using the .NET ISAPI filter, we might be consuming threads meant to run the application in order to serve static assets. I would run the same load test while reviewing the thread performance counters - review this link.
I would run the same load test while running Microsoft Performance Analyzer and compare the reports.
I was looking for benchmark like this for a long time. Thanx!
In my company we did wildcard mapping on several web sites (standard Web Forms, .NET 1.1 and 2, IIS 6), and the sysadmins told me that they didn't notice any performance issues.
But it seems you stressed the network, not the server. So maybe the scores are so similar because of a network bottleneck? Just thinking...
That's quite an impressive post there, thanks very much for that.
We're also assessing the security and performance concerns with removing a piece of software that's always been in place to filter out unwanted traffic.
Will there be any further benchmarking on your part?
Cheers,
Karl.
Seems the bottleneck in your test is network utilization. If the performance degradation is expected to show up in CPU usage (I'm not sure it is, but it's reasonable), then you wouldn't notice it with the test you did.
Since this is a complex system, with many variables - it does not mean that there is no performance degradation. It means that in your scenario - the performance degradation is probably negligible.