Evict cache for multiple keys rather than all in Spring Redis - spring-data-redis

Is there any way we can evict cache for only a selected number of keys rather than evicting all entries:
@CacheEvict(value = BOOKS_CACHE, allEntries = true)
public int deleteByIds(final Collection<Long> bookIds) {
    // Delete all DB entries for the given bookIds
}
I don't want to evict all entries, only the ones whose keys are in the collection.
Please suggest.
Thanks

A similar question was asked in the past.
Essentially, if your keys follow a pattern, then with Redis in particular you can use a key pattern (Redis matching is glob-style rather than full regex) to match the entries you want evicted from the Redis cache.
I never responded to that question directly since the answer was adequate and asked in the context of Redis. However, it did inspire me to generalize the solution a bit further in order to apply this technique beyond Redis.
If you are curious, you can have a look at the more generalized solution in this test class, which applies the given Redis solution along with a solution for Apache Geode and one final solution where a simple ConcurrentMap is used as the backing cache in Spring's Cache Abstraction.
Of course, the more generalized solution requires some extension(s) (beginning here) to the core Spring Framework Cache Abstraction.
Anyway, hopefully this gives you some ideas on how to tailor these solutions to your use case. I am sure there are other possibilities as well.
Obviously, mileage will vary based on the caching provider (e.g. Redis) used with Spring's Cache Abstraction along with the requirements for your application use case.

Related

Modifying Application Variables in ASP.Net (MVC)

I store a large structure holding my application's reference data in a variable I access through HttpContext.Application. Every once in a while this data needs to change. When I update it in place, is there a danger that incoming requests will see the data in an inconsistent state? Is there a need (and a way) to lock some or all of this structure? Finally, are there other approaches to this problem other than querying the database every time you need this (mostly static) data?
There are also other solutions available; there are many caching providers that you can use.
First of all, there's the HttpRuntime.Cache (which is the same as the HttpContext cache). There's also the System.Runtime.Caching.MemoryCache in .NET 4.
You can set data expiry and other rules for the data in the cache; a short sketch follows the links below.
http://wiki.asp.net/page.aspx/655/caching-in-aspnet/
http://msdn.microsoft.com/en-us/library/6hbbsfk6.aspx
http://msdn.microsoft.com/en-us/library/system.runtime.caching.memorycache.aspx
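To make the expiry point concrete, here is a minimal sketch using System.Runtime.Caching.MemoryCache; the class name, cache key, and loader delegate are illustrative assumptions, not from the original answer.

using System;
using System.Runtime.Caching;

public static class ReferenceDataCache
{
    private static readonly MemoryCache Cache = MemoryCache.Default;

    // Returns the cached value, loading and caching it with a 24-hour
    // absolute expiration if it is missing or has expired.
    public static object GetOrLoad(string key, Func<object> load)
    {
        var value = Cache.Get(key);
        if (value == null)
        {
            value = load();
            Cache.Set(key, value, new CacheItemPolicy
            {
                AbsoluteExpiration = DateTimeOffset.Now.AddHours(24)
            });
        }
        return value;
    }
}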
More advanced caching includes distributed caches.
Usually, they reside on another server but may also reside on a different process on the same server.
Such providers include AppFabric (from Microsoft), Memcached, and others.
appfabric: http://msdn.microsoft.com/en-us/magazine/ff714581.aspx
memcached: http://memcached.org/
You will not see the application variable in an inconsistent state.
The MSDN page for HttpApplicationState says (under the Thread Safety section):
This type is thread safe.
You may be looking for HttpContext.Items, which stores data in the request scope rather than the application scope. Check out this article to get a great overview of the different context scopes in ASP.NET.
Your solution to avoid querying the database for "mostly static data" is to leverage ASP.NET's caching.

ASP.net application session cache best practices and patterns

In ASP.NET the major data stores are application state and session state, and we also have the object cache.
I have used common-sense hints/tips (e.g. never put user-specific data in application state, never put unmanaged resources in session, etc.), but to be honest I have never come across recommendations and examples from MSDN, or from prominent figures like Haack and the Gu, that cover all three together (e.g. Google's first hit to MSDN talks about using application state as a global cache; if that's the case, what's the object cache for?).
Also, something I find seldom discussed is a comparison of scenarios. For example, I know it's easy to bloat memory usage by overusing session, but what happens if you use the object cache as an alternative to store the same data?
Edit: This is the best information I have found so far: http://msdn.microsoft.com/en-us/library/ff647787.aspx
Use Session to store user-specific information, since the framework automatically associates each session store with a specific user.
Use the Object Cache for information that can be cached once and reused across the entire application or across a set of users. If you store user-specific data in the Object Cache then you'll have to invent some mechanism to associate cache entries with users. Not only would this require extra work on your part, but you might do it in such a way that increases the likelihood of a nefarious user doing something akin to session spoofing.
I don't know when you'd ever need to use the Application object. If I'm not mistaken, the Application object is more of a relic from classic ASP than anything else.
Another form of caching that can be just as important is per-request caching via the HttpContext.Items collection. This allows you to cache data for the lifetime of a request and is useful if you keep requesting the same data during a single request (such as from different User Controls on the page). For more information on this approach, see HttpContext.Items - a Per-Request Cache Store.
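To illustrate the per-request idea, here is a minimal sketch over HttpContext.Items; the helper name and loader delegate are assumptions for illustration.

using System;
using System.Web;

public static class RequestCache
{
    // Computes the value once per request; later calls within the same
    // request (e.g. from other User Controls) reuse the stored result.
    public static object GetOrAdd(HttpContext context, string key, Func<object> load)
    {
        if (!context.Items.Contains(key))
            context.Items[key] = load();
        return context.Items[key];
    }
}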
I'd suggest creating a wrapper class, at least for the session, if it gets used throughout your code. That way, you can inject an instance of the class to do the real work and use a mocked version for unit tests. I did this for a large project where the session was widely used, and it worked out rather well.
You can combine this with the facade pattern: the wrapper provides the specific methods you need instead of exposing the general interface. As an example, the session takes objects and returns objects; it is not strongly typed. The wrapper can expose strongly typed add and get methods.
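As a sketch of that wrapper-plus-facade idea (the interface, class, and ShoppingCart type are hypothetical names, not from the original answer):

using System.Web;

public class ShoppingCart { /* hypothetical session payload */ }

// Code depends on this interface and receives a fake in unit tests.
public interface ISessionStore
{
    ShoppingCart GetCart();
    void SaveCart(ShoppingCart cart);
}

// The real implementation wraps the untyped session in strongly typed methods.
public class HttpSessionStore : ISessionStore
{
    private const string CartKey = "Cart";

    public ShoppingCart GetCart()
    {
        return (ShoppingCart)HttpContext.Current.Session[CartKey];
    }

    public void SaveCart(ShoppingCart cart)
    {
        HttpContext.Current.Session[CartKey] = cart;
    }
}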

Using static data in ASP.NET vs. database calls?

We are developing an ASP.NET HR Application that will make thousands of calls per user session to relatively static database tables (e.g. tax rates). The user cannot change this information, and changes made at the corporate office will happen ~once per day at most (and do not need to be immediately refreshed in the application).
About 2/3 of all database calls are to these static tables, so I am considering just moving them into a set of static objects that are loaded during application initialization and then refreshed every 24 hours (if the app has not restarted during that time). Total in-memory size would be about 5MB.
Am I making a mistake? What are the pitfalls to this approach?
From the info you present, it looks like you definitely should cache this data -- rarely changing and so often accessed. "Static" objects may be inappropriate, though: why not just access the DB whenever the cached data is, say, more than N hours old?
You can vary N at will, even if you don't need special freshness -- even hitting the DB 4 times or so per day will be much better than "thousands [of times] per user session"!
It may be best to keep, alongside the DB info, a timestamp or datetime recording when it was last updated. That way, the "is my cache still fresh" check is typically very lightweight: just fetch that "latest update" value and compare it with the latest update on which you rebuilt the local cache. It's kind of like an HTTP "if modified since" caching strategy, except you'd be implementing most of it on the DB client side.
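A hedged sketch of that check; the table and column names (ref_data_version, last_updated) are assumptions for illustration, and thread safety is omitted for brevity.

using System;
using System.Data;

public static class StaticDataCache
{
    private static object _cachedData;
    private static DateTime _lastLoaded;

    // Re-reads the full data set only when the DB's "latest update" stamp
    // is newer than the stamp the local cache was built from.
    public static object Get(IDbConnection conn, Func<object> loadAll)
    {
        using (var cmd = conn.CreateCommand())
        {
            cmd.CommandText = "SELECT MAX(last_updated) FROM ref_data_version";
            var latest = Convert.ToDateTime(cmd.ExecuteScalar());
            if (_cachedData == null || latest > _lastLoaded)
            {
                _cachedData = loadAll();
                _lastLoaded = latest;
            }
        }
        return _cachedData;
    }
}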
If you decide to cache the data (vs. making a database call each time), use the ASP.NET Cache instead of statics. The ASP.NET Cache provides functionality for expiry, handles multiple concurrent requests, and can even invalidate the cache automatically using the query-notification features of SQL Server 2005+.
If you use statics, you'll probably end up implementing those things anyway.
There are no drawbacks to using the ASP.NET Cache for this. In fact, it's designed for caching data too (see the SqlCacheDependency class http://msdn.microsoft.com/en-us/library/system.web.caching.sqlcachedependency.aspx).
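As a sketch of that approach (the database entry name, table, and loader are illustrative assumptions; SqlCacheDependency also requires the database and table to be enabled for notifications, e.g. via aspnet_regsql and web.config):

using System.Web;
using System.Web.Caching;

public static class TaxRateCache
{
    // Caches the rates until the underlying table changes, at which point
    // the dependency evicts the entry automatically.
    public static void CacheTaxRates(object rates)
    {
        HttpRuntime.Cache.Insert(
            "TaxRates",
            rates,
            new SqlCacheDependency("MyDatabase", "TaxRates"),
            Cache.NoAbsoluteExpiration,
            Cache.NoSlidingExpiration);
    }
}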
With caching, a DBMS is plenty efficient with static data anyway, especially only 5MB of it.
True, but the point here is to avoid the database roundtrip at all.
ASP.NET Cache is the right tool for this job.
You didn't state how you will find the matching data for a user. If it is as simple as finding a foreign key in the cached set, then you don't have to worry.
If you implement some kind of filtering/sorting/paging, or worse, searching, then you might at some point miss the querying capabilities of SQL.
ORMs often have their own querying, and LINQ makes things easy too, but it is still not SQL.
(try to group by 2 columns)
Sometimes it is a good approach to have the db return only the keys of a result set and use the Cache to fill in the complete set.
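A sketch of that key-based fill; the Book type, key format, and single-row loader are hypothetical names for illustration.

using System;
using System.Collections.Generic;
using System.Web;

public class Book { public long Id; public string Title; }

public static class BookCache
{
    // The SQL query did the filtering/sorting/paging and returned only IDs;
    // the row bodies come from the cache, falling back to the DB per miss.
    public static List<Book> GetBooks(IEnumerable<long> ids, Func<long, Book> loadOne)
    {
        var result = new List<Book>();
        foreach (var id in ids)
        {
            var key = "book:" + id;
            var book = (Book)HttpRuntime.Cache[key];
            if (book == null)
            {
                book = loadOne(id);
                HttpRuntime.Cache[key] = book;
            }
            result.Add(book);
        }
        return result;
    }
}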
Think: premature optimization. You'll still need to deal with the data as tables eventually anyway, and you'd be left with an unusual design pattern.
With even default caching, a DBMS is plenty efficient with static data anyway, especially only 5MB of it. And the DBMS partitioning you're describing is often described as an antipattern; one example is multiple identical databases for multiple clients. There are other questions here on SO about this pattern. I understand there are security issues, but doing it this way creates other security issues. I've recently seen this same concept in a medical billing database (even more highly sensitive) that ultimately had to be refactored into a single database.
If you do this, then I suggest you at least wait until you know it's solving a real problem, and then test to measure how much difference it makes. There are lots of opportunities here for Unintended Consequences.

How to implement locking across a server farm?

Are there well-known best practices for synchronizing tasks across a server farm? For example if I have a forum based website running on a server farm, and there are two moderators trying to do some action which requires writing to multiple tables in the database, and the requests of those moderators are being handled by different servers in the server farm, how can one implement some locking functionality to ensure that they can't take that action on the same item at the same time?
So far, I'm thinking about using a table in the database to sync, e.g. check for the id of the item in the table; if it doesn't exist, insert it and proceed, otherwise return. A shared cache could probably also be used for this, but I'm not using one at the moment.
Any other way?
By the way, I'm using MySQL as my database back-end.
Your question implies data level concurrency control -- in that case, use the RDBMS's concurrency control mechanisms.
That will not help you if later you wish to control application level actions which do not necessarily map one to one to a data entity (e.g. table record access). The general solution there is a reverse-proxy server that understands application level semantics and serializes accordingly if necessary. (That will negatively impact availability.)
It probably wouldn't hurt to read up on CAP theorem, as well!
You may want to investigate a distributed locking service such as Zookeeper. It's a reimplementation of a Google service that provides very high speed distributed resource locking coordination for applications. I don't know how easy it would be to incorporate into a web app, though.
If all the state is in the (central) database then the database transactions should take care of that for you.
See http://en.wikipedia.org/wiki/Transaction_(database)
It may be irrelevant for you because the question is old, but it may still be useful for others, so I'll post it anyway.
You can use a "SELECT ... FOR UPDATE" db query on a locking object, so you actually use the db to achieve the lock mechanism.
If you use an ORM, you can also do that. For example, in NHibernate you can do:
session.Lock(Member, LockMode.Upgrade);
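For the plain-SQL route, here is a hedged sketch using the MySQL connector (MySql.Data); the locks table, its columns, and the item name are assumptions. The row lock is held until the transaction commits, so a second server blocks on the same SELECT until the first finishes.

using System;
using MySql.Data.MySqlClient;

public static class ItemLock
{
    public static void WithItemLock(string connectionString, string itemName, Action work)
    {
        using (var conn = new MySqlConnection(connectionString))
        {
            conn.Open();
            using (var tx = conn.BeginTransaction())
            {
                var cmd = conn.CreateCommand();
                cmd.Transaction = tx;
                cmd.CommandText = "SELECT id FROM locks WHERE name = @name FOR UPDATE";
                cmd.Parameters.AddWithValue("@name", itemName);
                cmd.ExecuteScalar(); // blocks if another server holds the row lock

                work(); // perform the multi-table updates here

                tx.Commit(); // releases the lock
            }
        }
    }
}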
Having a table of locks is an OK way to do it; it is simple and works.
You could also have the code as a service on a single server, more of an SOA approach.
You could also use a TimeStamp field with transactions: if the timestamp has changed since you last read the data, you revert the transaction. That way, whoever gets in first has priority.
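A sketch of the timestamp idea as optimistic concurrency (same MySQL connector as above; the table and column names are illustrative assumptions): update only if the row is unchanged since it was read, and treat zero affected rows as "someone else got in first".

using System;
using MySql.Data.MySqlClient;

public static class OptimisticUpdate
{
    public static bool TryUpdateStatus(MySqlConnection conn, long id, string status, DateTime stampWhenRead)
    {
        var cmd = conn.CreateCommand();
        cmd.CommandText = "UPDATE forum_items SET status = @status, last_modified = NOW() " +
                          "WHERE id = @id AND last_modified = @stamp";
        cmd.Parameters.AddWithValue("@status", status);
        cmd.Parameters.AddWithValue("@id", id);
        cmd.Parameters.AddWithValue("@stamp", stampWhenRead);
        return cmd.ExecuteNonQuery() == 1; // 0 rows: the row changed since it was read
    }
}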

Custom caching in ASP.NET

I want to cache custom data in an ASP.NET application. I am putting lots of data into it, such as List<object> collections and other objects.
Is there a best practice for this? If I use static data and w3wp.exe dies or gets recycled, the cache will need to be filled again.
The database is also being updated by other applications, so a thread would be needed to make sure the cache has the latest data.
Update 1:
Just found this, which probably helps me:
http://www.codeproject.com/KB/web-cache/cachemanagementinaspnet.aspx?fid=229034&df=90&mpp=25&noise=3&sort=Position&view=Quick&select=2818135#xx2818135xx
Update 2:
I am using DotNetNuke as the application :(. I have enabled persistent caching and now the whole application feels sluggish.
For example, a MultiView takes about 3 seconds to swap views.
Update 3:
Strategies for Caching on the Web?
Linked to this, I am using the DotNetNuke caching method, which in turn uses the ASP.NET Cache object; it also has file-based caching.
I have a helper:
' Cache under "label|key" with no absolute or sliding expiration and
' NotRemovable priority, so the entry should only disappear when the
' application domain itself restarts.
CachingProvider.Instance().Add( _
    (label & "|") & key, _
    newObject, _
    Nothing, _
    Cache.NoAbsoluteExpiration, _
    Cache.NoSlidingExpiration, _
    CacheItemPriority.NotRemovable, _
    Nothing)
That call adds the objects to the cache; is this correct? I want them to stay cached as long as possible. I have a thread which runs every x minutes to update the cache, but I have noticed the cache is getting emptied (I check for an object "CacheFilled" in the cache).
As a test I've told the worker process not to recycle, etc., but it still seems to clear out the cache. I have also changed the DotNetNuke settings from "heavy" to "light", but I think that is for module caching.
You are looking for either out of process caching or a distributed caching system of some sort, based upon your requirements. I recommend distributed caching, because it is very scalable and is dedicated to caching. Someone else had recommended Velocity, which we have been evaluating and thoroughly enjoying. We have written several caching providers that we can interchange while we are evaluating different distributed caching systems without having to rebuild. This will come in handy when we are load testing the various systems as part of the final evaluation.
In the past, our legacy application was a random assortment of cached items. There were DataTables, DataViews, Hashtables, Arrays, etc., and there was no logic to what was used at any given time. We have started to move to just caching our domain object collections (which are POCOs). Using generic collections is nice, because we know that everything is stored the same way. It is very simple to run LINQ operations on them, and if we need a specialized "view" to be stored, the system is efficient enough that we can store a specific collection of objects.
We also have put an abstraction layer in place that pretty much brokers calls between either the DAL or the caching model. Calls through this layer will check for a cache miss or cache hit. If there is a hit, it will return from the cache. If there is a miss, and the call should be cached, it will attempt to cache the data after retrieving it. The immediate benefit of this system is that in the event of a hardware or software failure on the machines dedicated to caching, we are still able to retrieve data from the database without having a true outage. Of course, the site will perform slower in this case.
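A minimal sketch of that broker layer; the ICache interface is a hypothetical stand-in for whichever caching provider is plugged in.

using System;

public interface ICache
{
    object Get(string key);
    void Put(string key, object value);
}

public class CachingBroker
{
    private readonly ICache _cache;
    private readonly Func<string, object> _loadFromDal;

    public CachingBroker(ICache cache, Func<string, object> loadFromDal)
    {
        _cache = cache;
        _loadFromDal = loadFromDal;
    }

    public object Get(string key)
    {
        object value = null;
        try { value = _cache.Get(key); }
        catch { /* cache host down: degrade to the database, site just runs slower */ }

        if (value == null)
        {
            value = _loadFromDal(key); // cache miss: go to the DAL
            try { _cache.Put(key, value); } catch { /* best effort */ }
        }
        return value;
    }
}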
Another thing to consider with distributed caching systems is that, since they are out of process, you can have multiple applications use the same cache. There are some interesting possibilities there, involving sharing data between applications, real-time manipulation of data, etc.
Also have a look at the MS Enterprise Library Caching Application Block, which allows you to write custom expiration policies, custom stores, etc.
http://msdn.microsoft.com/en-us/library/cc309502.aspx
You can also check "Velocity" which is available at
http://code.msdn.microsoft.com/velocity
This will be useful if you wish to scale your application across servers...
There are lots of articles about the Cache object in ASP.NET and how to make it use SqlDependencies and other types of cache expirations. No need to write your own. And using the Cache is recommended over session or any of the other collections people used to cram lots of data into.
Cache and Session can lead to sluggish behaviour, but sometimes they're the right solutions: the rule of right tool for right job applies.
Personally I've often created collections in pseudo-static singletons for the kind of role you describe (typically to avoid I/O overheads, e.g. storing a compiled XSL transform), but it's very important to keep in mind that that kind of cache is fragile. Design it to (a) file-watch or otherwise monitor what it's supposed to cache, where appropriate, and (b) recreate/populate itself on use; it should expect to get flushed frequently.
Essentially I recommend it as a performance crutch, but don't rely on it for anything requiring real persistence.
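A sketch of that pseudo-static singleton pattern for a compiled transform; the path handling and flush trigger are illustrative. Note it repopulates itself on use and tolerates being flushed at any time.

using System.Xml.Xsl;

public static class TransformCache
{
    private static volatile XslCompiledTransform _transform;
    private static readonly object Sync = new object();

    public static XslCompiledTransform Get(string path)
    {
        var t = _transform;
        if (t == null)
        {
            lock (Sync)
            {
                if (_transform == null)
                {
                    var fresh = new XslCompiledTransform();
                    fresh.Load(path); // recompiling is the I/O cost being avoided
                    _transform = fresh;
                }
                t = _transform;
            }
        }
        return t;
    }

    // Call from, e.g., a FileSystemWatcher event when the stylesheet changes.
    public static void Flush()
    {
        _transform = null;
    }
}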
