I am in the planning process of moving a C# ASP.NET web application over to Azure (currently hosted on a single dedicated server) and am looking at caching options. Currently, because we only have one instance of the application running at a time, we use an 'in process' memory cache to relieve the SQL DB of repeated identical requests.
The process at the moment is to clear certain parts of the cache when the managers/services make a change to those parts of the database. For example, we have a users table and keys like "User.{0}" returning a single User record/object and "Users.ForeignKey.{0}" returning all users related to that foreign key. If we update a single user record then we remove the "User.1" key (if the user id = 1) and, for ease, all of the list collections, as they could have changed. We do this by removing keys by pattern, which means that only the affected keys are removed and all others persist.
We've been planning this move to Azure for a while now, and when we first started looking at everything the Azure Redis Cache service wasn't available (or at least not supported), so we looked at the Azure Cache service based on AppFabric. Using that, we decided we would use DataCache regions to separate the different object types and then just flush the region that was affected; not quite as exact as our current method, but OK. Now that Redis has come onto the scene we've been looking at it and would prefer to use it if possible. However, it seems that to achieve the same thing we would have to have a separate Redis cache for each 'region'/section, which as I understand it would mean paying for lots of small instances of the Azure Redis Cache service; that would cost quite a lot given that we need 10+ separately flushable sections of the cache.
Does anyone know how to achieve something similar to Azure DataCache regions with Redis, or can you suggest something glaringly obvious that I'm probably missing?
Sorry for such a long question/explanation but I found it difficult to explain what I'm trying to achieve without background/context.
Thanks,
Gareth
Update:
I've found a few bash commands that can do the job of deleting keys by pattern, including using the 'KEYS' command here and the Lua script EVAL command here.
I'm planning on using the StackExchange.Redis client to interact with the cache. Does anyone know how to use these commands, or alternatives to them (to delete keys by pattern), when using StackExchange.Redis?
Thanks for reading, Gareth
You can use this method, which leverages async/await and Redis pipelining, to delete keys by pattern using the StackExchange.Redis client:
private static Task DeleteKeysByPatternAsync(string pattern)
{
    // Connection is the shared ConnectionMultiplexer instance.
    IDatabase cache1 = Connection.GetDatabase();
    var redisServer1 = Connection.GetServer(Connection.GetEndPoints().First());

    var deleteTasks = new List<Task>();
    var counter = 0;

    // Iterate the keyspace in pages (SCAN on modern servers) and pipeline one async DEL per matching key.
    foreach (var key in redisServer1.Keys(pattern: pattern, database: 0, pageSize: 5000))
    {
        deleteTasks.Add(cache1.KeyDeleteAsync(key));
        counter++;

        if (counter % 1000 == 0)
            Console.WriteLine($"Delete key tasks created: {counter}");
    }

    return Task.WhenAll(deleteTasks);
}
Then you can use it like this:
DeleteKeysByPatternAsync("*user:*").Wait(); //If you are calling from main method for example where you cant use await.
or
await DeleteKeysByPatternAsync("*user:*"); //If you run from async method
You can tweak the pageSize or accept it as a method parameter.
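Regarding the EVAL approach mentioned in the question update: StackExchange.Redis exposes Lua scripting through IDatabase.ScriptEvaluate/ScriptEvaluateAsync, so the pattern delete can also be pushed server-side in a single call. A minimal sketch, assuming the same Connection multiplexer as above (note that KEYS inside the script still scans the whole keyspace and blocks the server while it runs, so treat this as an occasional maintenance operation rather than something for a hot path):

// Deletes all keys matching a pattern in one server-side script call.
// 'Connection' is the same ConnectionMultiplexer used in the method above.
private static async Task<long> DeleteKeysByPatternLuaAsync(string pattern)
{
    const string script = @"
        local keys = redis.call('KEYS', ARGV[1])
        for i = 1, #keys do
            redis.call('DEL', keys[i])
        end
        return #keys";

    IDatabase db = Connection.GetDatabase();
    RedisResult result = await db.ScriptEvaluateAsync(script, values: new RedisValue[] { pattern });

    return (long)result; // number of keys removed
}

It can be called the same way as the method above, e.g. await DeleteKeysByPatternLuaAsync("*user:*");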
From what I understand of your question, you need to group your data according to some criteria (user, in your case), so that whenever a record related to that criteria is changed, all data related to that record is also invalidated in the cache using a single cache API call.
You can achieve this in Azure using NCache for Azure, a distributed caching solution for Azure by Alachisoft that has a rich set of features along with multiple caching topologies.
NCache provides multiple ways to perform this type of operation. One suitable for your use case is the data grouping feature, which lets you assign items to groups/subgroups when they are added. Data can later be fetched or removed on the basis of those groups/subgroups.
NCache also allows you to add tags to items as they are added. These tags can then be used to remove or fetch all data carrying one or more of the specified tags. The querying feature (delete query) provided in NCache can also be used to remove data satisfying a particular criterion.
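For illustration only, here is a rough sketch of what the tag-based variant could look like. The exact namespaces and member names (CacheItem.Tags, RemoveByTag and so on) have changed between NCache releases, so treat every identifier below as an assumption to verify against the documentation for the version you install; the User type and cache name are also just placeholders.

// Hypothetical sketch of tag-based invalidation with NCache for Azure.
// Namespaces and method names are assumptions - check them against your NCache version.
using Alachisoft.NCache.Web.Caching;      // classic client API (assumed)
using Alachisoft.NCache.Runtime.Caching;  // Tag type (assumed)

public class UserCache
{
    private readonly Cache _cache = NCache.InitializeCache("myAzureCache");

    public void AddUser(int userId, User user)
    {
        var item = new CacheItem(user)
        {
            Tags = new[] { new Tag("Users") }   // behaves like a flushable "region"
        };
        _cache.Insert("User." + userId, item);
    }

    public void FlushUsers()
    {
        // Removes every cached item carrying the "Users" tag in one call.
        _cache.RemoveByTag(new Tag("Users"));
    }
}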
I know I can set up multiple namespaces for DoctrineCacheBundle in the config.yml file, but can I use one driver with multiple namespaces?
The case is that in my app I want to cache all queries for all of my entities. The problem is with flushing the cache when performing create/update actions; I want to flush only part of my cached queries. My app is used by multiple clients, so when a client updates something in his data, for instance in the Article entity, I want to clear the cache only for this client and only for Article. I could add proper IDs to each query and remove them manually, but the queries are used dynamically. In my API the mobile app sends a version number for which the DB should return data, so I don't know what kind of IDs will be used in the end.
Unfortunately I don't think what you want to do can be solved with some configuration magic. What you want is some sort of indexed cache, and for that you have to find a more powerful tool.
You can take a look at Doctrine's second-level cache. I don't know how good it is now (I tried it once when it was in beta and it did not make the cut for me).
Or you can build your own cache manager. If you do, I recommend using Redis. Its data structures will help you keep your indexes (this can be simulated with memcached, but it requires more work). Here is what I mean by indexes.
You will have a key like client_1_articles where 1 is the client id. In that key you will store all the ids of the articles of client 1. For every article id you will have a key like article_x, where x is the id of the article. In this example client_1_articles is a rudimentary index that will help you if you want, at some point, to invalidate all the cached articles coming from client 1 (a small code sketch of this follows the list below).
The abstract implementation for the above example ends up being a graph-like structure over your cache, with possibly:
- composed indexes, e.g. 'client_1:category_1' => {article_1, article_2}
- multiple indexes for one item, e.g. 'category_1' => {article_1, article_2, article_3}, 'client_1' => {article_1, article_3}
- etc.
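The index idea boils down to three Redis commands: SADD when an article is cached, then SMEMBERS and DEL when a client's cache is invalidated. The same calls exist in any Redis client (Predis, phpredis, etc.); the sketch below uses C# with StackExchange.Redis simply because that is what I can show concretely, and the key names follow the example above:

using System.Linq;
using StackExchange.Redis;

// Rudimentary "index" keys over a Redis cache: one set per client that records
// which article keys belong to that client. Key names are illustrative.
public class ArticleCache
{
    private readonly IDatabase _db;

    public ArticleCache(IConnectionMultiplexer mux)
    {
        _db = mux.GetDatabase();
    }

    public void CacheArticle(int clientId, int articleId, string serializedArticle)
    {
        string articleKey = "article_" + articleId;
        _db.StringSet(articleKey, serializedArticle);

        // Maintain the index: remember that this article belongs to this client.
        _db.SetAdd("client_" + clientId + "_articles", articleKey);
    }

    public void InvalidateClient(int clientId)
    {
        string indexKey = "client_" + clientId + "_articles";

        // Read the index, delete every article it points to, then drop the index itself.
        RedisValue[] members = _db.SetMembers(indexKey);
        if (members.Length > 0)
            _db.KeyDelete(members.Select(m => (RedisKey)(string)m).ToArray());

        _db.KeyDelete(indexKey);
    }
}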
Hope this helps you in some way. At least that was my solution for a similar problem.
Good luck with your project,
Alexandru Cosoi
I have a legacy website that needs a little optimization because of poor performance. It is an ASP.NET shopping website with LINQ to SQL as the data layer and the MVP pattern in the UI.
The most costly entities in the DB are the product and category tables, which have a one-to-many relationship. These two entities might not change regularly unless an admin-group user decides to add a product or category, etc. I was wondering how costly it would be to create and fetch everything from these two entities for each request - so if only I could have a way to keep my data alive…
First I thought, well, let's use AJAX for data retrieval so I will create only those entities that I need to query or bind to - but wait, how can I do that without creating a new DataContext instance?!
On the other side, caching the whole DataContext is considered a bad decision because of the memory cost. So what would be the best option here? How can I improve things?
UPDATE
1) Doing what #HatSoft suggested.
Cons: those approaches will not help your code, only the database. Besides this, there might be memory issues since we're putting data in memory instead of rendered HTML; however, this might be the best option as far as de-coupling goes.
2) Using output caching. We have this code in an HTTP handler registered with an *.aspx wildcard:
string pagePath = Context.Request.Url.AbsolutePath;
object cacheKey = Context.Application[pagePath];
if (cacheKey == null)
    return; // application restarted / first run, so cache the stuff
else
    HttpResponse.RemoveOutputCacheItem(pagePath); // RemoveOutputCacheItem is static on HttpResponse
Cons: now we have to link the pagePath to each database entity that the page uses, but if I do that then I'm coupling things instead of de-coupling them. This approach also involves a bit of hard-coding.
3) Another solution would be output caching in post-cache substitution mode instead of control-cache mode, using the Substitution element and setting the OutputCache Duration to 86400 so the page is re-created every 24 hours.
Cons: hard-coding user controls to produce the HTML output for the Substitution element dynamically.
So what do you suggest?
I would suggest you look into the SqlDependency class; please read this article: http://www.asp.net/web-forms/tutorials/data-access/caching-data/using-sql-cache-dependencies-cs
I would also suggest you look into loading data into the cache at application startup, if it suits your application. Please see a good example here: http://www.asp.net/web-forms/tutorials/data-access/caching-data/caching-data-at-application-startup-cs
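If you go the SQL cache dependency route, a minimal sketch of the cache insert might look like the following; the "ShopDb"/"Products" names, the Product type and LoadProductsFromDatabase are placeholders, and polling-based dependencies also need the <sqlCacheDependency> section in web.config plus the tables created by aspnet_regsql, as the linked article explains:

using System.Collections.Generic;
using System.Web;
using System.Web.Caching;

// Cache the product list until SQL Server signals that the Products table has changed.
public static List<Product> GetProducts()
{
    var cached = HttpRuntime.Cache["AllProducts"] as List<Product>;
    if (cached != null)
        return cached;

    List<Product> products = LoadProductsFromDatabase(); // your existing LINQ to SQL query

    HttpRuntime.Cache.Insert(
        "AllProducts",
        products,
        new SqlCacheDependency("ShopDb", "Products"),   // web.config database entry name, table name
        Cache.NoAbsoluteExpiration,
        Cache.NoSlidingExpiration);

    return products;
}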
With LINQ to SQL you can use LinqToCache, which offers a SqlDependency-powered cache for your LINQ queries. It transforms the IQueryable<Products> into an IEnumerable<Products> and enumerates from memory after first access (the first iteration of the underlying IQueryable). Based on SqlDependency data-change notifications it invalidates the list, and subsequent access will query the DB again and cache the result.
My recommendation would be to cache the Products and Categories lists in memory, since they seldom change and I expect them to be of a fairly constrained size.
I am developing a website for a client (ASP.NET, T-SQL). It is a data-entry website allowing many of their users to login and manipulate records in the same database.
There are instructions (basically a list of strings) throughout the form, telling the users what to do for each section; these instructions are themselves present in the database.
On each login, I store these instructions in the Session[] object per authenticated user. The instructions are identical for everyone.
I've looked at a solution which suggested storing a common session identifier in the database and then querying it to re-use that particular session but this seems very hacky. What is a best-practices solution to accomplish this? Is there a 'common' object available to all users?
Firstly, does it matter at this point? Yes, it's bad practice and inefficient, but if you're storing 20KB of strings in memory and have a maximum of 100 users, that's 2,000KB of data. Hardly a lot of memory "wasted". Even at 200KB of strings, that's 20,000KB of data. Again, not a lot. Is it worth your time, and the client waiting for you to solve it, right now?
If you decide it is then you could:
Store the strings in the Application object or a static class so that they're retrieved once and used many times (a small sketch follows this list).
Retrieve the strings on every page view. This may not be as performance damaging as it seems.
Use something like the Cache class in System.Web.Caching.
Make use of Output Caching.
Make use of Windows Server AppFabric "Velocity" memory cache.
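For the first option in the list above, here is a minimal sketch of a lazily loaded static holder for the instruction strings; InstructionRepository.GetAll is a placeholder for whatever query you currently run at login:

using System;
using System.Collections.Generic;

// Shared, read-only instruction strings: loaded once per application domain
// and visible to every user, unlike Session.
public static class FormInstructions
{
    private static readonly Lazy<IList<string>> _instructions =
        new Lazy<IList<string>>(LoadFromDatabase, isThreadSafe: true);

    public static IList<string> All
    {
        get { return _instructions.Value; }
    }

    private static IList<string> LoadFromDatabase()
    {
        // Placeholder: run the same query you currently run per login.
        return InstructionRepository.GetAll();
    }
}

Pages would then read FormInstructions.All instead of copying the list into Session for each user.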
Sounds to me like you're looking for the Application Cache. Like the Session, it is an in-memory cache of data. Unlike the session, it is shared among all users; each user doesn't get their own individual copy of the data. Also, when you add data elements to the cache, you can specify criteria which will automatically invalidate that data, and cause it to be reloaded/refreshed (useful when your seldom-changing data actually does change :).
Here's some articles which should give you everything you need to know about using the Application cache (and some other caching options within ASP.NET as well):
ASP.NET Caching Overview
Using the ASP.NET Application Cache to Make Your Applications Scream
Caching Data at Application Startup
.NET Data Caching
I would suggest using the application-level Cache object. It is available everywhere as part of HttpContext. You can populate it on App_Start.
You can put any kind of object into Cache, though obviously, the smaller the better.
Here are some examples of how to populate it using C#:
1) Add items to the cache as you would add items to a dictionary by specifying the item's key & value.
Example: add the current Value property of a text box to the cache.
Cache["txt1"] = txtName.value;
or
Cache["result"] = dataset;
2) The Insert method is overloaded, allowing you to define values for the parameters of the version you're using.
Example: add only an item key & value:
Cache.Insert("MyData1", connectionString);
3) The Add method takes the same full parameter list as the longest Insert overload, but unlike Insert it will not overwrite an existing entry; it returns the object already cached under that key (or null if there wasn't one).
Cache.Add("MyData1", connectionString, null, Cache.NoAbsoluteExpiration, Cache.NoSlidingExpiration, CacheItemPriority.Default, null);
To retrieve the item from the cache:
string stringName = (string)Cache["MyData1"];
If the cached data is not a string, you may need to cast it to the proper data type.
result = (DataSet)Cache["result"];
One of the benefits of using the Cache object as opposed to the Application object is that ASP.NET will evict items from the Cache if the system is in danger of running out of memory.
For the sake of argument assume that I have a webform that allows a user to edit order details. User can perform the following functions:
Change shipping/payment details (all simple text/dropdowns)
Add/Remove/Edit products in the order - this is done with a grid
Add/Remove attachments
Products and attachments are stored in separate DB tables with foreign key to the order.
Entity Framework (4.0) is used as ORM.
I want to allow the users to make whatever changes they want to the order and only when they hit 'Save' do I want to commit the changes to the database. This is not a problem with textboxes/checkboxes etc. as I can just rely on ViewState to get the required information. However the grid is presenting a much larger problem for me as I can't figure out a nice and easy way to persist the changes the user made without committing the changes to the database. Storing the Order object tree in Session/ViewState is not really an option I'd like to go with as the objects could get very large.
So the question is - how can I go about preserving the changes the user made until ready to 'Save'.
Quick note - I have searched SO to try to find a solution, however all I found were suggestions to use Session and/or ViewState - both of which I would rather not use due to potential size of my object trees
If you have control over the schema of the database and the other applications that utilize order data, you could add a flag or status column to the orders table that differentiates between temporary and finalized orders. Then, you can simply store your intermediate changes to the database. There are other benefits as well; for example, a user that had a browser crash could return to the application and be able to resume the order process.
I think sticking to the database for storing data is the only reliable way to persist data, even temporary data. Using session state, control state, cookies, temporary files, etc., can introduce a lot of things that can go wrong, especially if your application resides in a web farm.
If using the Session is not your preferred solution, which is probably wise, the best possible solution would be to create your own temporary database tables (or as others have mentioned, add a temporary flag to your existing database tables) and persist the data there, storing a single identifier in the Session (or in a cookie) for later retrieval.
First, you may want to segregate your specific state management implementation into its own class so that you don't have to replicate it throughout your systems.
Second, you may want to consider a hybrid approach - use session state (or cache) for a short time to avoid unnecessary trips to a DB or other external store. After some amount of inactivity, write the cached state out to disk or the DB. The simplest way to do this is to serialize your objects to text (using either built-in serialization or a library like protocol buffers). This helps you avoid creating redundant or duplicate data structures to capture the in-progress data relationally. If you don't need to query the content of this data, it's a reasonable approach.
As an aside, in the database world, the problem you describe is called a long-running transaction. You essentially want to avoid making changes to the data until you reach a user-defined commit point. There are techniques you can use in the database layer, like hypothetical views and instead-of triggers, to encapsulate the fact that you aren't actually committing the change. The data is in the DB (in the real tables), but is only visible to the user operating on it. This is probably a more complicated implementation than you may be willing to undertake, and requires intrusive changes to your persistence layer and data model - but it allows the application to be ignorant of the issue.
Have you considered storing the information in a JavaScript object and then sending that information to your server once the user hits save?
Use domain events to capture the user's actions and then replay those actions over a snapshot of the order model (effectively the current state of the order before the user started changing it).
Store each change as a series of events e.g. UserChangedShippingAddress, UserAlteredLineItem, UserDeletedLineItem, UserAddedLineItem.
These events can be saved after each postback and only need a link to the related order. Rebuilding the current state of the order is then as simple as replaying the events over the currently stored order objects.
When the user clicks save, you can replay the events and persist the updated order model to the database.
You are using the database - no session or viewstate is required - therefore you can significantly reduce page weight and server memory load, at the expense of some page performance (if you choose to rebuild the model on each postback).
Maintenance is incredibly simple: because domain events are so easy to implement, automated testing can readily be used to ensure the system behaves as you expect (while also documenting your intentions for other developers).
Because you are leveraging the database, the solution scales well across multiple web servers.
Using this approach does not require any alterations to your existing domain model, therefore the impact on existing code is minimal. Biggest downside is getting your head around the concept of domain events and how they are used and abused =)
This is effectively the same approach as described by Freddy Rios, with a little more detail about how, and some nice keywords for you to search with =)
http://jasondentler.com/blog/2009/11/simple-domain-events/ and http://www.udidahan.com/2009/06/14/domain-events-salvation/ are some good background reading about domain events. You may also want to read up on event sourcing, as this is essentially what you would be doing (snapshot object, record events, replay events, snapshot object again).
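A rough sketch of what the captured events and the replay step might look like; all class, property and method names here are invented for illustration (they are not from any particular framework), and Order.Apply stands in for whatever logic your order model uses to apply a change:

using System;
using System.Collections.Generic;

// Minimal domain-event shapes for the order-editing scenario described above.
[Serializable]
public abstract class OrderEvent
{
    public int OrderId { get; set; }
    public DateTime OccurredAt { get; set; }
}

[Serializable]
public class UserChangedShippingAddress : OrderEvent
{
    public string NewAddress { get; set; }
}

[Serializable]
public class UserAddedLineItem : OrderEvent
{
    public int ProductId { get; set; }
    public int Quantity { get; set; }
}

// Stand-in for your existing order entity.
public class Order
{
    // Placeholder: switch on the event type and mutate the in-memory order accordingly.
    public void Apply(OrderEvent e) { /* ... */ }
}

public static class OrderReplay
{
    // Rebuilds the in-progress state by replaying stored events over the persisted snapshot.
    public static Order ApplyEvents(Order snapshot, IEnumerable<OrderEvent> events)
    {
        foreach (var e in events)
            snapshot.Apply(e);
        return snapshot;
    }
}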
How about serializing your domain object (the contents of your grid/shopping cart) to JSON and storing it in a hidden field? ScottGu has a nice article on how to serialize objects to JSON. It is scalable across a server farm and I'd guess it would not add much payload to your page. Maybe you could write your own JSON serializer to do a "compact serialization" (you don't need the product name, SKU id, etc. - you could just serialize the product ID and quantity).
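A sketch of that "compact serialization" idea using the built-in JavaScriptSerializer; the DTO shape and control names (CartLine, hidCart) are made up for illustration:

using System.Collections.Generic;
using System.Web.Script.Serialization;
using System.Web.UI.WebControls;

// Tiny DTO: only what is needed to rebuild the grid rows, not the full entity graph.
public class CartLine
{
    public int ProductId { get; set; }
    public int Qty { get; set; }
}

public partial class EditOrder : System.Web.UI.Page
{
    private static readonly JavaScriptSerializer Json = new JavaScriptSerializer();

    protected HiddenField hidCart; // normally declared for you by the .aspx designer

    // Write the current grid contents into the hidden field before the response is sent.
    private void SaveCart(List<CartLine> lines)
    {
        hidCart.Value = Json.Serialize(lines);
    }

    // Read them back on the next postback.
    private List<CartLine> LoadCart()
    {
        return string.IsNullOrEmpty(hidCart.Value)
            ? new List<CartLine>()
            : Json.Deserialize<List<CartLine>>(hidCart.Value);
    }
}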
Have you considered using a User Profile? .Net comes with SqlProfileProvider right out of the box. This would allow you to, for each user, grab their profile and save the temporary data as a variable off in the profile. Unfortunately, I think this does require your "Order" to be serializable, but I believe all of the options except Session thus far would require the same.
The advantage of this is it would persist through crashes, sessions, server down time, etc and it's fairly easy to set up. Here's a site that runs through an example. Once you set it up, you may also find it useful for storing other user information such as preferences, favorites, watched items, etc.
You should be able to create a temp file and serialize the object to that, then save only the temp file name to the viewstate. Once they successfully save the record back to the database then you could remove the temp file.
Single server: serialize to the filesystem. This also allows you to let the user resume later.
Multiple server: serialize it but store the serialized value in the db.
This is something that's for that specific user, so when you persist it to the db you don't really need all the relational stuff for it.
Alternatively, if the set of data is very large and the amount of changes is usually small, you can store the history of changes made by the user instead. With this you can also show the change history and support undo.
Two approaches. The first: create a complex AJAX application that stores everything on the client and only submits the entire package of changes to the server. I did this once a few years ago with moderate success, but the application is not something I would want to maintain. You have a hard time keeping your client code in sync with your server code, and passing fields that are added/deleted/changed is nightmarish.
The second approach is to store changes in the database in a temp table or in a "pending" state. The advantage is that your code is more maintainable. The disadvantage is that you need a way to clean up abandoned changes caused by session timeouts, power failures and other crashes. I would take this approach for any new development. You can have separate tables for "pending" and "committed" changes, which opens up a whole new level of features you can add: what if? what changed? etc.
I would go for ViewState, regardless of what you've said before. If you only store the stuff you need, like { id: XX, numberOfProducts: 3 }, and ditch every item that is not selected by the user at that point, the ViewState size will hardly be an issue as long as you aren't storing the whole object tree.
When storing attachments, put them in a temporary storage location and reference the filename in your ViewState. You can have a scheduled task that cleans the temp folder of every file that was last saved more than a day ago, or similar.
This is basically the approach we use for storing information when users are adding floorplan information and attachments in our backend.
Are the end-users internal or external clients? If your clients are internal users, it may be worthwhile to look at an alternate set of technologies. Instead of webforms, consider using a platform like Silverlight and implementing a rich GUI there.
You could then store complex business objects within the applet, provide persistent "in progress" edit tracking across multiple sessions via offline storage, and easily integrate with back-end services that provide saving/processing of the finalised order. All whilst maintaining access via the web (albeit closing out most *nix clients).
Alternatives include Adobe Flex or AJAX, depending on resources and needs.
How large do you consider large? If you are talking about session state (so it doesn't travel back and forth to the user the way view-state does), then session state is often a pretty good option. Everything except the in-process state provider uses serialization, but you can influence how it is serialized. For example, I would tend to create a local model that represents just the state I care about (plus any id/rowversion information) for that operation, rather than the full domain entities, which may have extra overhead.
To reduce the serialization overhead further, I would consider using something like protobuf-net; this can be used as the implementation for ISerializable, allowing very light-weight serialized objects (generally much smaller than BinaryFormatter, XmlSerializer, etc), that are cheap to reconstruct at page requests.
When the page is finally saved, I would update my domain entities from the local model and submit the changes.
For info, to use a protobuf-net attributed object with the state serializers (typically BinaryFormatter), you can use:
// a simple, session-state friendly, light-weight UI model object
using System;
using System.Runtime.Serialization;
using ProtoBuf;

[Serializable, ProtoContract]
public class MyType : ISerializable
{
    [ProtoMember(1)]
    public int Id { get; set; }
    [ProtoMember(2)]
    public string Name { get; set; }
    [ProtoMember(3)]
    public double Value { get; set; }
    // etc

    public MyType() { } // default constructor

    // deserialization constructor used by BinaryFormatter
    protected MyType(SerializationInfo info, StreamingContext context)
    {
        Serializer.Merge(info, this);
    }

    // hand serialization off to protobuf-net
    void ISerializable.GetObjectData(
        SerializationInfo info, StreamingContext context)
    {
        Serializer.Serialize(info, this);
    }
}
Sorry, I have a lot of questions about locking and the cache. =_=
1. About the cache: I know the Cache in ASP.NET is thread-safe. The simple code I usually use is:
IList<User> user = HttpRuntime.Cache["myCacheItem"] as IList<User>;
if (user == null)
{
    // should I have a lock here?
    // lock(some_static_var) { ... }
    user = GetDataFromDatabase();
    HttpRuntime.Cache["myCacheItem"] = user;
}
return user;
Should I use a lock in this code?
1.1 If I do use a lock, should I declare many lock items? I have seen an implementation in Community Server that uses a static dictionary to store the lock items. Is that a good idea? I'm worried that there could end up being too many lock items in the dictionary and that it might slow the system down.
1.2 If I don't use a lock, what will happen? Just that two or more threads might call GetDataFromDatabase()? If that's all, I think I can probably give up the lock.
2. I have a generic dictionary stored in the cache which I have to modify (add/update/delete). I only use it to get values with dic.TryGetValue(key); I don't loop over it.
2.1 If I can guarantee that the modification happens in only one thread, the scenario is:
a.aspx -> reads the dictionary from the cache and displays it on the page (public, used by visitors)
b.ashx -> modifies the dictionary when called (on a 5-minute loop; private use)
Should I use a lock in a/b? Lock both the reader and the writer?
2.1.1 If I don't use any lock, what will happen? Will an exception be thrown when the reader and the writer access it at the same time?
2.1.2 If I only lock the writer in b.ashx, what will happen? Will the reader in a.aspx be blocked? And what is the best practice for dealing with this situation?
2.2 If the reads and writes both happen on multiple threads, and both are in public pages:
a.aspx -> just reads from the cache
b.aspx -> modifies the dictionary
What should I do? Lock everything?
2.3 If I implement a new dictionary 'add' function that simply copies the current dictionary to a new dictionary, then adds the new item (or applies the modification) and finally returns the new dictionary - will that solve the concurrency problem?
Would a Hashtable solve these problems?
How do I decide whether an item needs to be locked? I think I'm getting these things wrong. =_=
3. The last question: I have two web applications -
a -> serves the public web site
b -> manages some settings
They each have their own cache. How can I keep the two caches consistent?
Would the approach from 2.1 work, or is there some other way? (I tested memcached but it was much slower than the in-process cache, so I just use the two separate caches.)
Thank you for reading all of this - I hope it's clear what I'm asking. :)
Update:
Thanks for Aaron's answer. After reading it, I've tried to answer my own questions:
1. About the cache: if the data will not be modified, it can be read into the cache up front in Application_Start (global.asax).
1.1 If I lock, I should take the lock before reading the data, not only before writing. The lock item should be static, but I still feel uncertain about 1.1.
1.2 Yes - provided the code only reads data from the database and inserts it into the cache without modifying it (right? -_-), the worst case is that the data gets read from the database several times. I don't think that is a big problem.
2. Writes to a generic dictionary are not thread-safe, so it should be locked when it is modified. To solve this, I could use an immutable dictionary.
But if I use a ConcurrentDictionary (thread-safe for reads and writes) in .NET 4.0, or implement a new dictionary (using a reader-writer lock) myself, will that solve it? Do I still need to lock when modifying? The code would be something like:
var user = HttpRuntime.Cache["myCacheItem"] as ConcurrentDictionary<string, User>;
if (user == null)
{
    // is it safe when the cached item is a ConcurrentDictionary?
    user = GetDataFromDatabase();
    HttpRuntime.Cache["myCacheItem"] = user;
}
else
{
    // is it safe when the cached item is a ConcurrentDictionary?
    user["1"] = a_new_user;
}
3. The question is: I have a small application - a store - with two web applications: the store-front site (A) and the management site (B). I need to keep the two caches consistent; for example, if I modify a product price in B, how can I notify site A to update or remove its cached copy? (I know the cache in A could be given a shorter expiry, but that isn't quick enough, so I want to know whether there is built-in support for this in ASP.NET, or whether it's just like question 2.1 - expose an aspx/ashx page in A that B can call?)
Thread-safe means that you do not have to worry about multiple threads reading and writing the same individual cache item at the same time. You do not need any kind of lock for simple reads and writes.
However, if you are trying to perform what should be an atomic operation - for example, checking for the presence of an item and adding it if it doesn't exist - then you need to synchronize or serialize access. Because the lock needs to be global, it is usually best to declare a static readonly object in your global.asax and lock that before performing the atomic operation. Note that you should lock before the read and only release the lock after the write, so your hypothetical lock in your example above is actually happening too late.
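A minimal sketch of that pattern, reusing the names from the question (GetDataFromDatabase and User are the asker's placeholders; the lock object would live in Global.asax or any static class):

// Global lock for the check-then-add sequence; declare it once, e.g. in Global.asax.cs.
private static readonly object CacheLock = new object();

public static IList<User> GetUsers()
{
    var users = HttpRuntime.Cache["myCacheItem"] as IList<User>;
    if (users != null)
        return users;

    lock (CacheLock)   // taken BEFORE the re-read, released only after the write
    {
        // Re-check inside the lock: another thread may have loaded the data while we waited.
        users = HttpRuntime.Cache["myCacheItem"] as IList<User>;
        if (users == null)
        {
            users = GetDataFromDatabase();            // hits the database at most once
            HttpRuntime.Cache["myCacheItem"] = users;
        }
        return users;
    }
}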
This is why many web applications don't lazy-load the cache; instead, they perform the load in the Application_Start method. At the very least, putting an empty dictionary into the cache would save you the trouble of checking for null, and this way you wouldn't have to synchronize because you would only be accessing the cache item once (but read on).
The other aspect to the problem is, even though the cache itself is thread-safe, it does not make any items you add to it thread-safe. That means that if you store a dictionary in there, any threads that retrieve the cache item are guaranteed to get a dictionary that's fully-initialized, but you can still create race conditions and other thread-safety issues if you have multiple requests trying to access the dictionary concurrently with at least one of them modifying it.
Dictionaries in .NET can support multiple concurrent readers but not writers, so you definitely can run into issues if you have a single dictionary stored in the ASP.NET cache and have multiple threads/requests reading and writing. If you can guarantee that only one thread will ever write to the dictionary, then you can use the Dictionary as an immutable type - that is, copy the dictionary, modify the copy, and replace the original in the cache with the copy. If the dictionary is infrequently-modified, then this would indeed save you the trouble of synchronizing access, because no request will ever be trying to read from the same dictionary that is being modified. On the other hand, if the dictionary is very large and/or frequently modified, then you could run into performance issues making all those copies - the only way to be sure is profile, profile, profile!
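In code, that copy-modify-replace idea looks roughly like this (the key and type names follow the question; the lock is only needed if you cannot guarantee a single writer):

// Writer side (e.g. b.ashx): never mutate the cached dictionary in place.
// Copy it, change the copy, then swap the cache entry - the reference swap is atomic.
private static readonly object WriteLock = new object();

public static void UpdateUser(string key, User newUser)
{
    lock (WriteLock)
    {
        var current = HttpRuntime.Cache["myCacheItem"] as Dictionary<string, User>
                      ?? new Dictionary<string, User>();

        var copy = new Dictionary<string, User>(current);   // readers keep using 'current'
        copy[key] = newUser;

        HttpRuntime.Cache["myCacheItem"] = copy;
    }
}

// Reader side (e.g. a.aspx): just read whichever dictionary is currently published.
public static User FindUser(string key)
{
    var dic = HttpRuntime.Cache["myCacheItem"] as Dictionary<string, User>;
    User user = null;
    if (dic != null)
        dic.TryGetValue(key, out user);
    return user;
}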
If you find that performance constraints don't allow you to use that approach, then the only other option is to synchronize access. If you know that only one thread will ever modify the dictionary at a time, then a reader-writer lock (i.e. ReaderWriterLockSlim) should work well for you. If you can't guarantee this, then you need a Monitor (or just a simple lock(...) clause around each sequence of operations).
That should answer questions 1-2; I'm sorry but I don't quite understand what you're asking in #3. ASP.NET applications are all instantiated in their own AppDomains, so there aren't really any concurrency issues to speak of because nothing is shared (unless you are actually using some method of IPC, in which case everything is fair game).
Have I understood your questions correctly? Does this help?