I want to cache custom data in an ASP.NET application. I am putting lots of data into it, such as List<objects>, and other objects.
Is there a best practice for this? Since if I use a static data, if the w3p.exe dies or gets recycled, the cache will need to be filled again.
The database is also getting updated by other applications, so a thread would be needed to make sure it is on the latest data.
Update 1:
Just found this, which problably helps me
http://www.codeproject.com/KB/web-cache/cachemanagementinaspnet.aspx?fid=229034&df=90&mpp=25&noise=3&sort=Position&view=Quick&select=2818135#xx2818135xx
Update 2:
I am using DotNetNuke as the application, ( :( ). I have enabled persistent caching and now the whole application feels slugish.
Such as a Multiview takes about 3 seconds to swap view....
Update 3:
Strategies for Caching on the Web?
Linked to this, I am using the DotNetNuke caching method, which in turn uses the ASP.NET Cache object, it also has file based caching.
I have a helper:
CachingProvider.Instance().Add( _
(label & "|") + key, _
newObject, _
Nothing, _
Cache.NoAbsoluteExpiration, _
Cache.NoSlidingExpiration, _
CacheItemPriority.NotRemovable, _
Nothing)
Which runs that to add the objects to the cache, is this correct? As I want to keep it cached as long as possible. I have a thread which runs every x Minutes, which will update the cache. But I have noticied, the cache is getting emptied, I check for an object "CacheFilled" in the cache.
As a test I've told the worker process not to recycle, etc., but still it seems to clear out the cache. I have also changed the DotNetNuke settings from "heavy" to "light" but think that is for module caching.
You are looking for either out of process caching or a distributed caching system of some sort, based upon your requirements. I recommend distributed caching, because it is very scalable and is dedicated to caching. Someone else had recommended Velocity, which we have been evaluating and thoroughly enjoying. We have written several caching providers that we can interchange while we are evaluating different distributed caching systems without having to rebuild. This will come in handy when we are load testing the various systems as part of the final evaluation.
In the past, our legacy application has been a random assortment of cached items. There have been DataTables, DataViews, Hashtables, Arrays, etc. and there was no logic to what was used at any given time. We have started to move to just caching our domain object (which are POCOs) collections. Using generic collections is nice, because we know that everything is stored the same way. It is very simple to run LINQ operations on them and if we need a specialized "view" to be stored, the system is efficient enough to where we can store a specific collection of objects.
We also have put an abstraction layer in place that pretty much brokers calls between either the DAL or the caching model. Calls through this layer will check for a cache miss or cache hit. If there is a hit, it will return from the cache. If there is a miss, and the call should be cached, it will attempt to cache the data after retrieving it. The immediate benefit of this system is that in the event of a hardware or software failure on the machines dedicated to caching, we are still able to retrieve data from the database without having a true outage. Of course, the site will perform slower in this case.
Another thing to consider, in regards to distributed caching systems, is that since they are out of process, you can have multiple applications use the same cache. There are some interesting possibilities there, involving sharing database between applications, real-time manipulation of data, etc.
Also have a look at the MS Enterprise Caching Application block which allows your to write custom expiration policy, custom store etc.
http://msdn.microsoft.com/en-us/library/cc309502.aspx
You can also check "Velocity" which is available at
http://code.msdn.microsoft.com/velocity
This will be useful if you wish to scale your application across servers...
There are lots of articles about the Cache object in ASP.NET and how to make it use SqlDependencies and other types of cache expirations. No need to write your own. And using the Cache is recommended over session or any of the other collections people used to cram lots of data into.
Cache and Session can lead to sluggish behaviour, but sometimes they're the right solutions: the rule of right tool for right job applies.
Personally I've often created collections in pseudo-static singletons for the kind of role you describe (typically to avoid I/O overheads like storing a compiled xslttransform), but it's very important to keep in mind that that kind of cache is fragile, and design for it to A). filewatch or otherwise monitor what it's supposed to cache where appropriate and B). recreate/populate itself with use - it should expect to get flushed frequently.
Essentially I recommend it as a performance crutch, but don't rely on it for anything requiring real persistence.
Related
I have an ASP.net application that I'm moving to Azure. In the application, there's a query that joins 9 tables to produce a user record. Each record is then serialized in json and sent back and forth with the client. To increase query performance, the first time the 9 queries run and the record is serialized in json, the resulting string is saved to a table called JsonUserCache. The table only has 2 columns: JsonUserRecordID (that's unique) and JsonRecord. Each time a user record is requested from the client, the JsonUserCache table is queried first to avoid having to do the query with the 9 joins. When the user logs off, the records he created in the JsonUserCache are deleted.
The table JsonUserCache is SQL Server. I could simply leave everything as is but I'm wondering if there's a better way. I'm thinking about creating a simple dictionary that'll store the key/values and put that dictionary in AppFabric. I'm also considering using a NoSQL provider and if there's an option for Azure or if I should just stick to a dictionary in AppFabric. Or, is there another alternative?
Thanks for your suggestions.
"There are only two hard problems in Computer Science: cache invalidation and naming things."
Phil Karlton
You are clearly talking about a cache and as a general principle, you should not persist any cached data (in SQL or anywhere else) as you have the problem of expiring the cache and having to do the deletes (as you currently are). If you insist on storing your result somewhere and don't mind the clearing up afterwards, then look at putting it in an Azure blob - this is easily accessible from the browser and doesn't require that the request be handled by your own application.
To implement it as a traditional cache, look at these options.
Use out of the box ASP.NET caching, where you cache in memory on the web role. This means that your join will be re-run on every instance that the user goes to, but depending on the number of instances and the duration of the average session may be the simplest to implement.
Use AppFabric Cache. This is an extra API to learn and has additional costs which may get quite high if you have lots of unique visitors.
Use a specialised distributed cache such as Memcached. This has the added cost/hassle of having to run it all yourself, but gives you lots of flexibility in the long run.
Edit: All are RAM based. Using ASP.NET caching is simpler to implement and is faster to retrieve the data from cache because it is on the same machine - BUT requires the cache to be populated for each instance of the web role (i.e. it is not distributed). AppFabric caching is distributed but is also a bit slower (network latency) and, depending what you mean by scalable, AppFabric caching currently behaves a bit erratically at scale - so make sure you run tests. If you want scalable, feature rich distributed caching, and it is a big part of your application, go and put in Memcached.
In order to improve speed of chat application, I am remembering last message id in static variable (actually, Dictionary).
Howeever, it seems that every thread has own copy, because users do not get updated on production (single server environment).
private static Dictionary<long, MemoryChatRoom> _chatRooms = new Dictionary<long, MemoryChatRoom>();
No treadstaticattribute used...
What is fast way to share few ints across all application processes?
update
I know that web must be stateless. However, for every rule there is an exception. Currently all data stroed in ms sql, and in this particular case some piece of shared memory wil increase performance dramatically and allow to avoid sql requests for nothing.
I did not used static for years, so I even missed moment when it started to be multiple instances in same application.
So, question is what is simplest way to share memory objects between processes? For now, my workaround is remoting, but there is a lot of extra code and I am not 100% sure in stability of this approach.
I'm assuming you're new to web programming. One of the key differences in a web application to a regular console or Windows forms application is that it is stateless. This means that every page request is basically initialised from scratch. You're using the database to maintain state, but as you're discovering this is fairly slow. Fortunately you have other options.
If you want to remember something frequently accessed on a per-user basis (say, their username) then you could use session. I recommend reading up on session state here. Be careful, however, not to abuse the session object -- since each user has his or her own copy of session, it can easily use a lot of RAM and cause you more performance problems than your database ever was.
If you want to cache information that's relevant across all users of your apps, ASP.NET provides a framework for data caching. The simplest way to use this is like a dictionary, eg:
Cache["item"] = "Some cached data";
I recommend reading in detail about the various options for caching in ASP.NET here.
Overall, though, I recommend you do NOT bother with caching until you are more comfortable with web programming. As with any type of globally shared data, it can cause unpredictable issues which are difficult to diagnosed if misused.
So far, there is no easy way to comminucate between processes. (And maybe this is good based on isolation, scaling). For example, this is mentioned explicitely here: ASP.Net static objects
When you really need web application/service to remember some state in memory, and NOT IN DATABASE you have following options:
You can Max Processes count = 1. Require to move this piece of code to seperate web application. In case you make it separate subdomain you will have Cross Site Scripting issues when accesing this from JS.
Remoting/WCF - You can host critical data in remoting applcation, and access it from web application.
Store data in every process and syncronize changes via memcached. Memcached doesn't have actual data, because it took long tim eto transfer it. Only last changed date per each collection.
With #3 I am able to achieve more than 100 pages per second from single server.
I'm using ASP.NET's in-process session state store. It locks access to a session exclusively, and which means concurrent requests to the same session are served sequentially.
I want to remove this implicit exclusive lock, so multiple requests per session can be handled concurrently. Of course, I'll be synchronizing access to the session state myself where it's applicable.
I'm using the MSDN documentation of Session State Providers to write my own session state provider, and this SO question pointed me to this example code of implementing this as an HTTP module, but the code looks suspiciously complex just for the purpose of removing the lock.
I should probably eventually implement the session state using ASP.NET's cache, and stop using the built-in session, like Vivek describes in this post, but for now how would I just like to remove the locking.
Any ideas or sample implementations?
Not the answer you're looking for, but I think that even if it were possible, changing the way SessionState works in this way is a terrible idea.
Think of the poor guys who will have to maintain your code down the line. The fact that Session serializes requests in this way means ASP.NET developers often don't need to worry too much about thread-safety.
Also if someone adds a third-party component that happens to use Session, it will expect the usual guarantees regarding locks - and you'll suddenly start getting Heisenbugs.
Instead, measure performance and identify specific areas where you need requests to process concurrently - I bet there'll be few of them - and carefully implement your own locking mechanism only for the specific items involved - possibly the solution that you're planning eventually using the ASP.NET cache.
If you are only reading from the session on a given page you could use the Page directive:
<%# Page EnableSessionState="ReadOnly" %>
to indicate the readonly nature and remove the exclusive write lock which will allow concurrent requests to this page from the same session.
As internally a ReaderWriter lock is used a reader lock will block a writer lock but a reader lock won't block reader lock. Writer lock will block all reader and writer lock.
If you need to both read and write from the same page to the session I think that what you've already found as information is more than sufficient. Just a remark about replacing the Session with Cache: while session is reliable, cache is not meaning that if you put something into the cache you are not guaranteed to retrieve it back. The ASP.NET could decide to evict the cache under some circumstances like low memory pressure so you always need to check if an item exists in the cache before accessing it.
but the code looks suspiciously complex just for the purpose of removing the lock.
That's because you are thinking of it as a simple lock someone added by chance / or laziness, instead of a factor that was considered in the whole session provider concept. That code looks simple enough for the task at hand / replacing the whole existing provider with your own / and while doing so lock in a different way than it is intended.
I suggest you read a bit more about how the session concept works in asp.net. Consider that the way it was designed involves reading the Whole Session once in the request, and then writing the changed session once.
Also note that while its not your scenario, code might depend on reading separate session values to process something and can write separate values as well / locking individually can get you into the same considerations that can cause a dead lock in databases.
The designed way contrasts with your intent of reading/writing+locking as each separate item is loaded/stored.
Also note that you might be updating the values of an object you retrieved from the session and not setting it back to it, yet the session was updated. As far as I know, the providers write everything in the session back at a certain point in the life cycle, so its not straightforward if you want to separate read vs. write locks at an item level.
Finally, if you are finding a high level of resistance from the framework session providers model and the way you intent to modify it will prevent you from using some features of it (as switching to a diff provider, since those would have the same issue); then I think you should just stay away from it and roll your own for your very specific needs. That's without even using asp.net Cache, since you want to it to be your way, you control the lifetime & locks of what's stored in there.
It just appears to me that you need a different mechanism altogether, so I don't see the benefit of trying to bend the framework to do so. You might reuse some very specific pieces, like generating a session id / and cookie vs. cookie-less, but other than that its a different beast.
You will have to write your own Session State Provider, it is not that hard as it seems. I wouldn't recommend you to use Cache because as Darin said Cache is not reliable enough, for example items there expire and/or removed from cache when there is not enough memory.
In your Session Store Provider you can lock items of session instead of the whole Session state (which suits us in our project) when writing, the essential code for this should be in ISessionStateItemCollection's implementation (in indexer), something like this:
public object this[string name]
{
get
{
//no lock, just returning item (maybe immutable, it depends on how you use it)
}
set
{
//lock here
}
}
For our project we have created implementation which stores items in AppFabric Cache and rely on GetAndLock, PutAndUnlock methods of the cache. If you have in-proc Session then you might just need to lock(smthg) {}
Hope this helps.
I suggest that you look into moving your sessions into SQL Server Mode. SQL Server sessions is to allow for any number of requests from multiple web servers to share a single session. As far as I know, the SQL Server manages the locking mechanism internally which is tremendously efficient. You also have the added bonus of having a lower in-memory footprint. And you don't need to write an HttpModule to run it.
We are developing an ASP.NET HR Application that will make thousands of calls per user session to relatively static database tables (e.g. tax rates). The user cannot change this information, and changes made at the corporate office will happen ~once per day at most (and do not need to be immediately refreshed in the application).
About 2/3 of all database calls are to these static tables, so I am considering just moving them into a set of static objects that are loaded during application initialization and then refreshed every 24 hours (if the app has not restarted during that time). Total in-memory size would be about 5MB.
Am I making a mistake? What are the pitfalls to this approach?
From the info you present, it looks like you definitely should cache this data -- rarely changing and so often accessed. "Static" objects may be inappropriate, though: why not just access the DB whenever the cached data is, say, more than N hours old?
You can vary N at will, even if you don't need special freshness -- even hitting the DB 4 times or so per day will be much better than "thousands [of times] per user session"!
Best may be to keep with the DB info a timestamp or datetime remembering when it was last updated. This way, the check for "is my cache still fresh" is typically very light weight, just get that "latest update" info and check it with the latest update on which you rebuilt the local cache. Kind of like an HTTP "if modified since" caching strategy, except you'd be implementing most of it DB-client-side;-).
If you decide to cache the data (vs. make a database call each time), use the ASP.NET Cache instead of statics. The ASP.NET Cache provides functionality for expiry, handles multiple concurrent requests, it can even invalidate the cache automatically using the query notification features of SQL 2005+.
If you use statics, you'll probably end up implementing those things anyway.
There are no drawbacks to using the ASP.NET Cache for this. In fact, it's designed for caching data too (see the SqlCacheDependency class http://msdn.microsoft.com/en-us/library/system.web.caching.sqlcachedependency.aspx).
With caching, a dbms is plenty efficient with static data anyway, especially only 5M of it.
True, but the point here is to avoid the database roundtrip at all.
ASP.NET Cache is the right tool for this job.
You didnt state how you will be able to find the matching data for a user. If it is as simple as finding a foreign key in the cached set then you dont have to worry.
If you implement some kind of filtering/sorting/paging or worst searching then you might at some point miss the quereing capabilities of SQL.
ORM often have their own quereing and linq makes things easy to, but it is still not SQL.
(try to group by 2 columns)
Sometimes it is a good way to have the db return the keys of a resultset only and use the Cache to fill the complete set.
Think: Premature Optimization. You'll still need to deal with the data as tables eventually anyway, and you'd be leaving an "unusual design pattern".
With event default caching, a dbms is plenty efficient with static data anyway, especially only 5M of it. And the dbms partitioning you're describing is often described as an antipattern. One example: multiple identical databases for multiple clients. There are other questions here on SO about this pattern. I understand there are security issues, but doing it this way creates other security issues. I've recently seen this same concept in a medical billing database (even more highly sensitive) that ultimately had to be refactored into a single database.
If you do this, then I suggest you at least wait until you know it's solving a real problem, and then test to measure how much difference it makes. There are lots of opportunities here for Unintended Consequences.
I recently came across a ASP 1.1 web application that put a whole heap of stuff in the session variable - including all the DB data objects and even the DB connection object. It ends up being huge. When the web session times out (four hours after the user has finished using the application) sometimes their database transactions get rolled back. I'm assuming this is because the DB connection is not being closed properly when IIS kills the session.
Anyway, my question is what should be in the session variable? Clearly some things need to be in there. The user selects which plan they want to edit on the main screen, so the plan id goes into the session variable. Is it better to try and reduce the load on the DB by storing all the details about the user (and their manager etc.) and the plan they are editing in the session variable or should I try to minimise the stuff in the session variable and query the DB for everything I need in the Page_Load event?
This is pretty hard to answer because it's so application-specific, but here are a few guidelines I use:
Put as little as possible in the session.
User-specific selections that should only last during a given visit are a good choice
often, variables that need to be accessible to multiple pages throughout the user's visit to your site (to avoid passing them from page to page) are also good to put in the session.
From what little you've said about your application, I'd probably select your data from the db and try to find ways to minimize the impact of those queries instead of loading down the session.
Do not put database connection information in the session.
As far as caching, I'd avoid using the session for caching if possible -- you'll run into issues where someone else changes the data a user is using, plus you can't share the cached data between users. Use the ASP.NET Cache, or some other caching utility (like Memcached or Velocity).
As far as what should go in the session, anything that applies to all browser windows a user has open to your site (login, security settings, etc.) should be in the session. Things like what object is being viewed/edited should really be GET/POST variables passed around between the screens so a user can use multiple browser windows to work with your application (unless you'd like to prevent that).
DO NOT put UI objects in session.
beyond that, i'd say it varies. too much in session can slow you down if you aren't using the in process session because you are going to be serializing a lot + the speed of the provider. Cache and Session should be used sparingly and carefully. Don't just put in session because you can or is convenient. Sit down and analyze if it makes sense.
Ideally, the session in ASP should store the least amount of data that you can get away with. Storing a reference to any object that is holding system resources open (particularly a database connection) is a definite scalability killer. Also, storing uncommitted data in a session variable is just a bad idea in most cases. Overall it sounds like the current implementation is abusively using session objects to try and simulate a stateful application in a supposedly stateless environment.
Although it is much maligned, the ASP.NET model of managing state automatically through hidden fields should really eliminate the majority of the need to keep anything in session variables.
My rule of thumb is that the more scalable (in terms of users/hits) that the app needs to be, the less you can get away with using session state. There is, however, a trade-off. For web applications where the user is repeatedly accessing the same data and typically has a fairly long session per use of the site, some caching (if necessary in session objects) can actually help scalability by reducing the load on the DB server. The idea here is that it is much cheaper and less complex to farm the presentation layer than the back-end DB. Of course, with all things, this advice should be taken in moderation and doesn't apply in all situations, but for a fairly simple in-house CRUD app, it should serve you well.
A very similar question was asked regarding PHP sessions earlier. Basically, Sessions are a great place to store user-specific data that you need to access across several page loads. Sessions are NOT a great place to store database connection references; you'd be better to use some sort of connection pooling software or open/close your connection on each page load. As far as caching data in the session, this depends on how session data is being stored, how much security you need, and whether or not the data is specific to the user. A better bet would be to use something else for caching data.
storing navigation cues in sessions is tricky. The same user can have multiple windows open and then changes get propagated in a confusing manner. DB connections should definitely not be stored. ASP.NET maintains the connection pool for you, no need to resort to your own sorcery. If you need to cache stuff for short periods and the data set size is relatively small, look into ViewState as a possible option (at the cost of loading more bulk onto the page size)
A: Data that is only relative to one user. IE: a username, a user ID. At most an object representing a user. Sometimes URL-relative data (like where to take somebody) or an error message stack are useful to push into the session.
If you want to share stuff potentially between different users, use the Application store or the Cache. They're far superior.
Stephen,
Do you work for a company that starts with "I", that has a website that starts with "BC"? That sounds exactly like what I did when I first started developing in .net (and was young and stupid) -- I crammed everything I could think of in session and application. Needless to say, that was double-plus ungood.
In general, eschew session as much as possible. Certainly, non-serializable objects shouldn't be stored there (database connections and such), but even big, serializable objects shouldn't be either. You just don't want the overhead.
I would always keep very little information in session. Sessions use server memory resources which is expensive. Saving too many values in session increases the load on server and eventualy the performance of the site will go down. When you use load balance servers, usage of session can run into problems. So what I do is use minimal or no sessions, use cookies if the information is not very critical, use hidden fields more and database sessions.