Background
I'm in the midst of comparing the performance of NancyFx and ServiceStack.NET running under IIS 7 (testing on a Windows 7 host). Both are insanely fast - testing locally, each framework processes over 10,000 req/sec, with ServiceStack being about 20% faster.
The problem I'm running into is that ASP.NET appears to be caching the responses for each unique URI request from the HttpHandler, quickly leading to massive memory pressure (3+ GBs) and overworking the garbage collector (~25% time consumed by the GC). So far I've been unable to disable the caching and buildup of objects, and am looking for suggestions on how to disable this behavior.
Details
The request loop is basically as follows:
for i = 1..100000:
string uri = http://localhost/users/{i}
Http.Get(uri)
The response is a simple JSON object, formatted as { UserID: n }.
I've cracked open WinDBG, and for each request there are:
One System.Web.FileChangeEventHandler
Two System.Web.Configuration.MapPathCacheInfos
Two System.Web.CachedPathDatas
Three System.Web.Caching.CacheDependencys
Five System.Web.Caching.CacheEntrys
Obviously, these cache items are what is leading me to believe it's a cache bloat issue (I'd love to get rid of 150,000 unusable objects!).
What I've tried so far
In IIS 'HTTP Response Headers', set 'Expire Web content' to 'Immediately'.
In the web.config
<system.web>
<caching>
<outputCache enableOutputCache="false" enableFragmentCache="false"/>
</caching>
</system.web>
Also in the web.config (and many variations on the policies, including none).
<caching enabled="false" enableKernelCache="false">
<profiles>
<add policy="DontCache" kernelCachePolicy="DontCache" extension="*/>
</profiles>
</caching>
Looked through the source code of the frameworks to see if there might be any "features" built in that would use ASP.NET caching. While there are caching helpers, they are private to the framework itself and do not appear to leverage ASP.NET caching.
Update #1
Digging through Reflector I've found that setting UrlMetadataSlidingExpiration to zero eliminates a large portion of the excessive memory usage, at the expense of cutting throughput by 50% (when UrlMetadataSlidingExpiration is non-zero, the FileAuthorizationModule class caches the FileSecurityDescriptors, which must be somewhat expensive to generate).
This is done by updating the web.config and placing the following inside <system.web>:
<hostingEnvironment urlMetadataSlidingExpiration="00:00:00"/>
I'm going to try to fully disable the FileAuthorizationModule from running, if possible, to see if that helps. However, ASP.NET is still generating 2*N MapPathCacheInfo and CacheEntry objects, so memory is still being consumed, just at a much slower rate.
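If disabling the module works out, a minimal web.config sketch might look like the following; it assumes the IIS 7 integrated pipeline and the default managed module name "FileAuthorization", so treat it as untested:

<system.webServer>
  <modules>
    <remove name="FileAuthorization" />
  </modules>
</system.webServer>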
Update #2
The other half of the problem is the same issue as described here: Prevent many different MVC URLs from filling ASP.NET Cache. Setting
<cache percentagePhysicalMemoryUsedLimit="1" privateBytesPollTime="00:00:01"/>
helps, but even with these very aggressive settings memory usage quickly rises to 2.5GB (compared to 4GB). Ideally these objects would never be created in the first place. Failing that, I may resort to a hacky solution of using reflection to clear out the Caches (all these entries are "private" and are not enumerated when iterating over the public Cache).
A late response for others who suffer the same problem:
This is a known issue:
KB 2504047
This issue occurs because the unique requests that try to access the same resources are cached as MapPathCacheInfo objects for 10 minutes.
While the objects are cached for 10 minutes, the memory consumption of
the W3wp.exe process increases significantly.
You can download the hotfix here
I think this is less of an issue of caching, and more of an issue of "high memory utilization".
Two things:
First, use an IDisposable-friendly object (one that works with the "using" keyword). This will allow you to dispose of the object sooner rather than later, creating less pressure on the garbage collector in the long run.
for (int i = 0; i < 10000; i++)
{
    using (System.Net.WebClient c = new System.Net.WebClient())
    using (System.IO.Stream stream = c.OpenRead(String.Format("http://url.com/{0}", i)))
    {
        // consume the response here; both the stream and the client are then disposed
    }
}
From your pseudo-code, I can only assume that you're using System.Net.HttpWebRequest, which isn't disposable and will probably hang around longer than it should if you are making successive calls.
Secondly, if you are making successive calls to an external server, I'd put a delay between each call. This will give some breathing room, as your processor will run the for-loop much faster than the network can respond (otherwise it will just keep queuing up requests and slowing down the ones actually being processed).
System.Threading.Thread.Sleep(250);
Obviously, the best solution would be to make a single call with a list of the users you want to retrieve and just deal with one web request/response.
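As a rough illustration of that batching idea (the /users endpoint and its ids query parameter are hypothetical - the real service would need to expose something similar):

// Fetch users in chunks of 500 instead of one request per user.
const int batchSize = 500;
using (var client = new System.Net.WebClient())
{
    for (int start = 1; start <= 100000; start += batchSize)
    {
        string ids = String.Join(",", System.Linq.Enumerable.Range(start, batchSize));
        // One round-trip returns a JSON array of { UserID: n } objects.
        string json = client.DownloadString("http://localhost/users?ids=" + ids);
        // deserialize and process the batch here
    }
}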
Ensure the IsReusable property on your handler is set to false so ASP.NET doesn't reuse the same handler instance to process subsequent requests.
http://support.microsoft.com/kb/308001
Related
I'm load/stress testing using a Visual Studio load testing project, with current users set quite low at 10. The application makes multiple System.Net.Http.HttpClient requests to a WebAPI service layer/application (which in some cases calls another WebAPI layer/application), and it's only getting around 30 requests/second when testing a single MVC4 controller (not parsing dependent requests).
These figures are based on a sample project where no real code is executed at any of the layers other than the HttpClient calls and constructing a new collection of Models to return (so some serialization is happening too as part of WebAPI). I've tried both with and without async controllers and the results seemed pretty similar (a slight improvement with async controllers).
Caching the calls obviously increases throughput dramatically (to around 400 RPS), although the nature of the data and the traffic means that a significant proportion of the calls wouldn't be cached in real-world scenarios.
Is HttpClient (or the HTTP layer in general) simply too expensive, or are there some ideas you have around getting higher throughput?
It might be worth noting that on my local development machine this load test also maxes out the CPU. Any ideas or suggestions would be very much appreciated.
HttpClient has a built-in default of 2 concurrent connections.
Have you changed the default?
See this answer
I believe if you chase the dependencies for HttpClient, you will find that it relies on the settings for ServicePointManager.
See this link
You can change the default connection limit by calling System.Net.ServicePointManager.DefaultConnectionLimit
You can increase the number of concurrent connections for HttpClient. The default connection limit is 2. To find the optimum number of connections, try the formula 12 * number of CPUs on your machine.
In code:
ServicePointManager.DefaultConnectionLimit = 48;
Or in the app.config or web.config file:
<system.net>
<connectionManagement>
<add address="*" maxconnection="48"/>
</connectionManagement>
</system.net>
Check this link for more detail.
I need to put a customized logging system of sorts in place for an ASP.NET application. Among other things, it has to log some data per request. I've thought of two approaches:
Approach #1: Commit each entry per request. For example: A log entry is created and committed to the database on every request (using a transient DbContext). I'm concerned that this commit puts an overhead on the serving of the request that would not scale well.
Approach #2: Buffer entries, commit periodically. For example: A log entry is created and added to a concurrent buffer on every request (using a shared lock). When a limit in that buffer is exceeded, an exclusive lock is acquired, the buffered entries are committed to the database in one go (using another, also transient DbContext, created and destroyed only for each commit) and the buffer is emptied. I'm aware that this would make the "committing" request slow, but it's acceptable. I'm also aware that closing/restarting the application could result in loss of uncommitted log entries because the AppDomain will change in that case, but this is also acceptable.
I've implemented both approaches within my requirements, I've tested them and I've strained them as much as I could in a local environment. I haven't deployed yet and thus I cannot test them in real conditions. Both seem to work equally well, but I can't draw any conclusions like this.
Which of these two approaches is the best? I'm concerned about performance during peaks of a couple thousand users. Are there any pitfalls I'm not aware of?
To solve your concern with option 1 about slowing down each request, why not use the TPL to offload the logging to a different thread? Something like this:
public class Logger
{
    public static void Log(string message)
    {
        // Fire-and-forget: the request thread does not wait for the database write.
        Task.Factory.StartNew(() => SaveMessageToDB(message));
    }

    private static void SaveMessageToDB(string message)
    {
        // write the entry to the database here
    }
}
The HTTP request thread wouldn't have to wait while the entry is written. You could also adapt option 2 to do the same sort of thing to write the accumulated set of messages in a different thread.
I implemented a solution that is similar to option 2, but in addition to a count limit there was also a time limit: if no log entries had been added in a certain number of seconds, the queue would be dumped to the db.
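A minimal sketch of that count-plus-time approach (BufferedLogger, SaveMessagesToDB, and the specific limits are all illustrative, not a drop-in implementation):

public static class BufferedLogger
{
    private const int MaxBuffered = 100;
    private static readonly System.Collections.Concurrent.ConcurrentQueue<string> _buffer =
        new System.Collections.Concurrent.ConcurrentQueue<string>();
    // Flush on a timer as well as on the count limit, so a quiet period still
    // gets its entries written out.
    private static readonly System.Threading.Timer _flushTimer = new System.Threading.Timer(
        _ => Flush(), null, System.TimeSpan.FromSeconds(30), System.TimeSpan.FromSeconds(30));

    public static void Log(string message)
    {
        _buffer.Enqueue(message);
        if (_buffer.Count >= MaxBuffered)
            Flush();
    }

    private static void Flush()
    {
        var batch = new System.Collections.Generic.List<string>();
        string entry;
        while (_buffer.TryDequeue(out entry))
            batch.Add(entry);
        if (batch.Count > 0)
            SaveMessagesToDB(batch);   // one insert/transaction for the whole batch
    }

    private static void SaveMessagesToDB(System.Collections.Generic.List<string> batch)
    {
        // e.g. create a transient DbContext here, commit the batch, dispose it
    }
}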
Use log4net, and set its buffer size appropriately. Then you can go home and have a beer the rest of the day... I believe it's Apache licensed, which means you're free to modify/recompile it for your own needs (fitting whatever definition of "integrated in the application, not third party" you have in mind).
Seriously though - it seems way premature to optimize out a single DB insert per request at the cost of a lot of complexity. If you're doing 10+ log calls per request, it would probably make sense to buffer per-request - but that's vastly simpler and less error prone than writing high-performance multithreaded code.
Of course, as always, the real proof is in profiling - so fire up some tests, and get some numbers. At minimum, do a batch of straight inserts vs your buffered logger and determine what the difference is likely to be per-request so you can make a reasonable decision.
Intuitively, I don't think it'd be worth the complexity - but I have been wrong on performance before.
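For the log4net suggestion above, a minimal config sketch (the connection details and the buffer size are placeholders, not a recommended setup):

<log4net>
  <appender name="AdoNetAppender" type="log4net.Appender.AdoNetAppender">
    <!-- hold up to 100 entries in memory and write them to the database in one batch -->
    <bufferSize value="100" />
    <!-- connectionType, connectionString, commandText and parameters go here -->
  </appender>
  <root>
    <level value="INFO" />
    <appender-ref ref="AdoNetAppender" />
  </root>
</log4net>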
I'm using MemoryCache in ASP.NET and it is working well. I have an object that is cached for an hour to prevent fresh pulls of data from the repository.
I can see the caching working in debug, and also once deployed to the server: after the 1st call is made and the object is cached, subsequent calls take about 1/5 of the time.
However I'm noticing that each new client call (still inside that 1 hour window - in fact just a minute or 2 later) seems to have the 1st call to my service (that is doing the caching) taking almost as long as the original call before the data was cached.
This made me start to wonder - is MemoryCache session specific, so that each new client making the call stores its own cache, or is something else going on to cause the 1st call to take so long even after I know the data has been cached?
From MSDN:
The main differences between the Cache and MemoryCache classes are
that the MemoryCache class has been changed to make it usable by .NET
Framework applications that are not ASP.NET applications. For example,
the MemoryCache class has no dependencies on the System.Web assembly.
Another difference is that you can create multiple instances of the
MemoryCache class for use in the same application and in the same
AppDomain instance.
Reading that and doing some investigation in the reflected code, it is obvious that MemoryCache is just a simple class. You can use the MemoryCache.Default property to (re)use the same instance, or you can construct as many instances as you want (though the recommendation is as few as possible).
So basically the answer lies in your code.
If you use MemoryCache.Default then your cache lives as long as your application pool lives. (Just to remind you that default application pool idle time-out is 20 minutes which is less than 1 hour.)
If you create it using new MemoryCache(string, NameValueCollection), then the above-mentioned considerations apply, plus the context you create your instance in: that is, if you create your instance inside a controller (which I hope is not the case), then your cache lives for only one request.
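To make that second point concrete, a minimal sketch of sharing one cache across requests (the one-hour expiration is just a placeholder; System.Runtime.Caching must be referenced):

public static class SharedCache
{
    // Uses the process-wide MemoryCache.Default rather than a new MemoryCache
    // per request/controller, so entries survive across calls for the life of
    // the application pool.
    public static object GetOrAdd(string key, System.Func<object> load)
    {
        var cache = System.Runtime.Caching.MemoryCache.Default;
        object cached = cache.Get(key);
        if (cached != null)
            return cached;

        object value = load();
        cache.Add(key, value, new System.Runtime.Caching.CacheItemPolicy
        {
            AbsoluteExpiration = System.DateTimeOffset.Now.AddHours(1)
        });
        return value;
    }
}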
It's a pity I can't find any references, but ... MemoryCache does not guarantee to hold data according to the cache policy you specify. In particular, if the machine you're running your app on comes under memory pressure, your cached items might be discarded.
If you still have no luck figuring out the reason for the early cache item invalidation, you could take advantage of the RemovedCallback on the cache item policy and investigate the reason for the item's removal.
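For example, a minimal sketch of hooking that callback (the "user-list" key and LoadExpensiveData are placeholders for whatever you actually cache):

var policy = new System.Runtime.Caching.CacheItemPolicy
{
    AbsoluteExpiration = System.DateTimeOffset.Now.AddHours(1),
    // Fires when the entry leaves the cache; RemovedReason says whether it
    // expired, was evicted under memory pressure, or was removed explicitly.
    RemovedCallback = args => System.Diagnostics.Trace.WriteLine(
        String.Format("'{0}' removed: {1}", args.CacheItem.Key, args.RemovedReason))
};
object cachedValue = LoadExpensiveData();   // placeholder for the object being cached
System.Runtime.Caching.MemoryCache.Default.Add("user-list", cachedValue, policy);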
Reviewing this a year later I found out some more information on my original post about the cache 'dropping' randomly. The MSDN states the following for the configurable cache properties CacheMemoryLimitMegabytes and PhysicalMemoryLimitPercentage:
The default value is 0, which means that the MemoryCache class's
autosize heuristics are used by default.
Doing some decompiling and investigation, there are predetermined scenarios deep in the CacheMemoryMonitor.cs class that define the memory thresholds. Here is a sampling of the comments in that class on the AutoPrivateBytesLimit property:
// Auto-generate the private bytes limit:
// - On 64bit, the auto value is MIN(60% physical_ram, 1 TB)
// - On x86, for 2GB, the auto value is MIN(60% physical_ram, 800 MB)
// - On x86, for 3GB, the auto value is MIN(60% physical_ram, 1800 MB)
//
// - If it's not a hosted environment (e.g. console app), the 60% in the above
// formulas will become 100% because in un-hosted environment we don't launch
// other processes such as compiler, etc.
It's not necessarily that the specific values are important so much as realizing why a cache is often used: to store large objects that we don't want to fetch over and over. If these large objects are being stored in the cache and the hosting environment's memory threshold based on these internal calculations is exceeded, the item may be removed from the cache automatically. This could certainly explain my OP, because I was storing a very large collection in memory on a hosted server with probably 2GB of memory running multiple apps in IIS.
There is an explicit override for setting these values. You can set the CacheMemoryLimitMegabytes and PhysicalMemoryLimitPercentage values via configuration (or when setting up the MemoryCache instance). Here is a modified sample from the following MSDN link, where I set physicalMemoryLimitPercentage to 95 (%):
<configuration>
<system.runtime.caching>
<memoryCache>
<namedCaches>
<add name="default"
physicalMemoryLimitPercentage="95" />
</namedCaches>
</memoryCache>
</system.runtime.caching>
</configuration>
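The programmatic equivalent mentioned above would look roughly like this (the cache name is illustrative; the setting keys mirror the named-cache configuration attributes):

var settings = new System.Collections.Specialized.NameValueCollection
{
    { "physicalMemoryLimitPercentage", "95" },   // mirror the config sample above
    { "pollingInterval", "00:02:00" }            // how often the limits are checked
};
var cache = new System.Runtime.Caching.MemoryCache("searchResults", settings);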
I have a web application that uses ASP.NET with "InProc" session handling. Normally, everything works fine, but a few hundred requests each day take significantly longer to run than normal. In the IIS logs, I can see that these pages (which usually require 2-5 seconds to run) are running for 20+ seconds.
I enabled Failed Request Tracing in Verbose mode, and found that the delay is happening in the AspNetSessionData section. In one example, there was a 39-second gap between AspNetSessionDataBegin and AspNetSessionDataEnd.
I'm not sure what to do next. I can't find any reason for this delay, and I can't find any more logging features that could be enabled to tell me what's happening here. Does anyone know why this is happening, or have any suggestions for additional steps I can take to find the problem?
My app usually stores 1-5MB in session for each user, mostly cached data for searches. The server has plenty of available memory, and only runs about 50 users.
It could be caused by lock contention for the session state. Take a look at the last paragraph of MSDN's ASP.NET Session State Overview. See also K. Scott Allen's helpful post on this subject.
If a page is annotated with EnableSessionState="True" (or inherits the web.config default), then all requests for that page will acquire a write lock on the session state. All other requests that use session state -- even if they do not acquire a write lock -- are blocked until that request finishes.
If a page is annotated with EnableSessionState="ReadOnly", then the page will not acquire a write lock and so will not block other requests. (Though it may be blocked by another request holding the write lock.)
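For example, the read-only annotation is just a page directive (for an HttpHandler, the equivalent is implementing the IReadOnlySessionState marker interface):

<%@ Page Language="C#" EnableSessionState="ReadOnly" %>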
To eliminate this lock contention, you may want to implement your own [finer grained] locking around the HttpContext.Cache object or static WeakReferences. The latter is probably more efficient. (See pp. 118-122 of Ultra-Fast ASP.NET by Richard Kiessig.)
There is a chance you are running up against the maximum amount of memory the application pool is allowed to consume, which causes a restart of the application pool (and would account for the delay you are seeing in accessing the session). The amount of memory on the server doesn't dictate how much memory ASP.NET can use; this is controlled by the memoryLimit property in the machine.config and, in IIS 6.0 and later, in IIS itself using the "Maximum memory used" property.
Beyond that, have you considered alternatives to each user consuming 5 MB of session memory? This will not scale well at all and can cause a lot of issues under load. Might caching be a more effective solution? Do the searches take so long that you need to do this, or could the SQL/database setup be optimized to speed up your queries?
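For reference, a minimal sketch of the machine.config setting mentioned above (the value is a percentage of physical memory; on IIS 6.0 and later the equivalent limit is configured on the application pool instead):

<system.web>
  <processModel memoryLimit="60" />
</system.web>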
I want to cache custom data in an ASP.NET application. I am putting lots of data into it, such as List<objects>, and other objects.
Is there a best practice for this? If I use static data and the w3wp.exe process dies or gets recycled, the cache will need to be filled again.
The database is also getting updated by other applications, so a thread would be needed to make sure it is on the latest data.
Update 1:
Just found this, which probably helps me:
http://www.codeproject.com/KB/web-cache/cachemanagementinaspnet.aspx?fid=229034&df=90&mpp=25&noise=3&sort=Position&view=Quick&select=2818135#xx2818135xx
Update 2:
I am using DotNetNuke as the application ( :( ). I have enabled persistent caching and now the whole application feels sluggish.
For example, a MultiView takes about 3 seconds to swap views...
Update 3:
Strategies for Caching on the Web?
Linked to this, I am using the DotNetNuke caching method, which in turn uses the ASP.NET Cache object; it also has file-based caching.
I have a helper:
CachingProvider.Instance().Add( _
(label & "|") + key, _
newObject, _
Nothing, _
Cache.NoAbsoluteExpiration, _
Cache.NoSlidingExpiration, _
CacheItemPriority.NotRemovable, _
Nothing)
This is what runs to add the objects to the cache - is this correct? I want to keep them cached as long as possible. I have a thread which runs every x minutes to update the cache, but I have noticed the cache is getting emptied (I check for an object "CacheFilled" in the cache).
As a test I've told the worker process not to recycle, etc., but it still seems to clear out the cache. I have also changed the DotNetNuke settings from "heavy" to "light", but I think that is for module caching.
You are looking for either out of process caching or a distributed caching system of some sort, based upon your requirements. I recommend distributed caching, because it is very scalable and is dedicated to caching. Someone else had recommended Velocity, which we have been evaluating and thoroughly enjoying. We have written several caching providers that we can interchange while we are evaluating different distributed caching systems without having to rebuild. This will come in handy when we are load testing the various systems as part of the final evaluation.
In the past, our legacy application has been a random assortment of cached items. There have been DataTables, DataViews, Hashtables, Arrays, etc. and there was no logic to what was used at any given time. We have started to move to just caching our domain object (which are POCOs) collections. Using generic collections is nice, because we know that everything is stored the same way. It is very simple to run LINQ operations on them and if we need a specialized "view" to be stored, the system is efficient enough to where we can store a specific collection of objects.
We also have put an abstraction layer in place that pretty much brokers calls between either the DAL or the caching model. Calls through this layer will check for a cache miss or cache hit. If there is a hit, it will return from the cache. If there is a miss, and the call should be cached, it will attempt to cache the data after retrieving it. The immediate benefit of this system is that in the event of a hardware or software failure on the machines dedicated to caching, we are still able to retrieve data from the database without having a true outage. Of course, the site will perform slower in this case.
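A rough sketch of that broker behaviour (ICacheProvider and the method names are stand-ins for whatever provider is plugged in behind the abstraction layer):

// ICacheProvider represents the interchangeable caching provider.
public interface ICacheProvider
{
    object Get(string key);
    void Add(string key, object value);
}

public class CachedRepositoryBroker
{
    private readonly ICacheProvider _cache;

    public CachedRepositoryBroker(ICacheProvider cache) { _cache = cache; }

    // Cache-aside with graceful degradation: a failure in the caching tier
    // falls back to the database rather than taking the site down.
    public T GetOrLoad<T>(string key, System.Func<T> loadFromDal) where T : class
    {
        try
        {
            var hit = _cache.Get(key) as T;
            if (hit != null)
                return hit;
        }
        catch (System.Exception)
        {
            // Caching tier unavailable - degrade to direct database access.
        }

        T value = loadFromDal();
        try { _cache.Add(key, value); }
        catch (System.Exception) { /* best effort; the next call will retry */ }
        return value;
    }
}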
Another thing to consider, in regards to distributed caching systems, is that since they are out of process, you can have multiple applications use the same cache. There are some interesting possibilities there, involving sharing database between applications, real-time manipulation of data, etc.
Also have a look at the Microsoft Enterprise Library Caching Application Block, which allows you to write custom expiration policies, custom stores, etc.
http://msdn.microsoft.com/en-us/library/cc309502.aspx
You can also check "Velocity" which is available at
http://code.msdn.microsoft.com/velocity
This will be useful if you wish to scale your application across servers...
There are lots of articles about the Cache object in ASP.NET and how to make it use SqlDependencies and other types of cache expirations. No need to write your own. And using the Cache is recommended over session or any of the other collections people used to cram lots of data into.
Cache and Session can lead to sluggish behaviour, but sometimes they're the right solutions: the rule of right tool for right job applies.
Personally I've often created collections in pseudo-static singletons for the kind of role you describe (typically to avoid I/O overheads, like storing a compiled XSL transform), but it's very important to keep in mind that that kind of cache is fragile. Design it to A) file-watch or otherwise monitor whatever it's supposed to cache, where appropriate, and B) recreate/populate itself with use - it should expect to get flushed frequently.
Essentially I recommend it as a performance crutch, but don't rely on it for anything requiring real persistence.
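As an illustration of that pattern, a minimal sketch of a pseudo-static singleton that file-watches what it caches (the transform.xslt path is made up; error handling is omitted):

public static class TransformCache
{
    private static readonly string _path =
        System.IO.Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "transform.xslt");
    private static readonly object _sync = new object();
    private static System.Xml.Xsl.XslCompiledTransform _transform;
    private static readonly System.IO.FileSystemWatcher _watcher = CreateWatcher();

    public static System.Xml.Xsl.XslCompiledTransform Get()
    {
        lock (_sync)
        {
            if (_transform == null)               // recreate on demand after a flush
            {
                var t = new System.Xml.Xsl.XslCompiledTransform();
                t.Load(_path);                    // the expensive part: compile once, reuse
                _transform = t;
            }
            return _transform;
        }
    }

    private static System.IO.FileSystemWatcher CreateWatcher()
    {
        var w = new System.IO.FileSystemWatcher(
            System.IO.Path.GetDirectoryName(_path), System.IO.Path.GetFileName(_path));
        w.Changed += (s, e) => { lock (_sync) { _transform = null; } };   // flush on change
        w.EnableRaisingEvents = true;
        return w;
    }
}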