I have search functionality on my site that is accessible from every page - the typical textbox-and-button deal at the top of the master page. I'm looking for a better way to cache the most common search strings and their results using System.Web.Caching.Cache.
I was thinking of concatenating the search string with some applicable user-group permission data and using that as the cache key, with the value being the List of results.
example cache key: Microsoft Visual Studio 2008 Service Pack 1--usergroup2,3,6,17,89
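For illustration, building such a key might look something like this (the method and parameter names are hypothetical; requires System.Linq):

// sketch: combine the search text with the user's group ids into one cache key
// the ids are sorted so the same set of groups always yields the same key
string BuildSearchCacheKey(string searchText, IEnumerable<int> userGroupIds)
{
    string groups = string.Join(",", userGroupIds.OrderBy(id => id)
                                                 .Select(id => id.ToString())
                                                 .ToArray());
    return searchText + "--usergroup" + groups;
}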
But that got me thinking: what's the max length of a cache key? Is there a limit on how long the key can be? Storing things this way can produce some pretty lengthy key names, and it doesn't do anything to keep the most common searches around, or the most recently used ones.
Is there already a commonly used method to accomplish what I'm trying to do? Does my question even make sense? Thanks for any help.
The key is a string, so its maximum length is simply the maximum length of a string itself.
According to the documentation here: http://msdn.microsoft.com/en-us/library/system.web.caching.cache.add.aspx, the key is defined as a string, with the value as an Object.
I would suggest tagging a custom object with a unique key, so that when you query the Cache you get back your custom object, with the more complex information carried along inside it.
EDIT 11072009_1154
After carefully reading your requirement again, I noticed that your objective is to cache the frequently used search strings.
In your example, the frequently used search string would be "Microsoft Visual Studio 2008 Service Pack 1". In my opinion this should be the key, while the value is a custom object with additional properties to hold your other necessary attributes.
In summary, an example might be:
Key : "Microsoft Visual Studio 2008 Service Pack 1"
Value : CustomObjectInstance, where CustomObjectInstance.UserLanguage = "English", CustomObjectInstance.UserLocalization = "USA", CustomObjectInstance.UserKeyboardLayout = "UK", etc.
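Roughly, that might look like this in C# (the class and its properties are just an illustration):

// hypothetical value object stored in the cache under the plain search string
public class CachedSearchResult
{
    public List<int> ResultIds { get; set; }        // e.g. primary keys of the matching rows
    public string UserLanguage { get; set; }        // "English"
    public string UserLocalization { get; set; }    // "USA"
    public string UserKeyboardLayout { get; set; }  // "UK"
}

Cache["Microsoft Visual Studio 2008 Service Pack 1"] = customObjectInstance;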
AFAIK, the Cache implements a dictionary-style data structure, so the keys must be unique. If your key is "Microsoft Visual Studio 2008 Service Pack 1--usergroup2,3,6,17,89", how can you uniquely reconstruct that key in your ASP.NET web app? A user typing in the search textbox will never enter usergroup2,3,6,17,89.
Think of the StackOverflow site search as well: users will enter a common search string, e.g. "learn jquery material", so in my opinion your cache should have an entry keyed by "learn jquery material".
EDIT 11072009_1250
Thanks for the additional information. I can also suggest a further solution using multiple layers. What I mean is, rather than cramming all the information into one layer of cache, why not add another layer?
That means your cache will have a key (string) and a value which points to another dictionary.
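A minimal sketch of that two-layer lookup (the key values are from the question; the types are my assumption):

// outer key: the search string; inner dictionary: results per usergroup set
var perGroup = Cache["Microsoft Visual Studio 2008 Service Pack 1"] as Dictionary<string, List<int>>;
if (perGroup != null && perGroup.ContainsKey("2,3,6,17,89"))
{
    List<int> resultIds = perGroup["2,3,6,17,89"]; // cached results for this usergroup set
}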
Another possible solution is to push this feature down to SQL Server Full-Text Search. I am not very familiar with it, but it would be good if we could leverage this functionality on existing infrastructure where possible.
Caching search results is a fairly common technique. ASP.NET Cache stores all the cached data in memory for faster access; it all depends on how much memory is available to you for caching. If you want to deviate from the ASP.NET Cache approach, there's another method: store the data retrieved from searches in a database table.
Searching a table with billions of records is really expensive, so you can store the data for the most-searched keywords in a separate table for faster access. You can also create a job to refresh that table at regular intervals, based on some fairly simple algorithms - Least Recently Used, for example: remove the search results which have not been used recently.
EDIT: And, as for your question about the length of the cache key: it is a string, and the length of a string is limited only by the memory available to store it.
I have a website that allows users to query for specific recipes using various search criteria. For example, you can say "Show me all recipes that I can make in under 30 minutes that will use chicken, garlic and pasta but not olive oil."
This query is sent to the web server over JSON, and deserialized into a SearchQuery object (which has various properties, arrays, etc).
The actual database query itself is fairly expensive, and there are a lot of default search templates that would be used quite frequently. For this reason, I'd like to start caching common queries. I've done a little investigation into various caching technologies and read plenty of other SO posts on the subject, but I'm still looking for advice on which way to go. Right now, I'm considering the following options:
Built in System.Web.Caching: This would provide a lot of control over how many items are in the cache, when they expire, and their priority. However, cached objects are keyed by a string, rather than a hashable object. Not only would I need to be able to convert a SearchQuery object into a string, but the hash would have to be perfect and not produce any collisions.
Develop my own InMemory cache: What I'd really like is a Dictionary<SearchQuery, Results> object that persists in memory across all sessions. Since search results can start to get fairly large, I'd want to be able to cap how many queries would be cached and provide a way for older queries to expire. Something like a FIFO queue would work well here. I'm worried about things like thread safety, and am wondering if writing my own cache is worth the effort here.
I've also looked into some other third party cache providers such as NCache and Velocity. These are both distributed cache providers and are probably completely overkill for what I need at the moment. Plus, it seems every cache system I've seen still requires objects to be keyed by a string. Ideally, I want something that holds a cache in process, allows me to key by an object's hash value, and allows me to control expiration times and priorities.
I'd appreciate any advice or references to free and preferably open source solutions that could help me out here. Thanks!
Based on what you are saying, I recommend you use System.Web.Caching and build it into your data access layer, shielding it from the rest of your system. When called, you can run your real-time query or pull from a cached object, based on your business/application needs. I do this today, but with Memcached.
An in-memory cache should be pretty easy to implement. I can't think of any reason why you should have particular concerns about validating the uniqueness of a SearchQuery object versus any other - that is, while the key must be a string, you can just store the original object along with the results in the cache, and validate equality directly after you've got a hit on the hash. I would use System.Web.Caching for the benefits you've noted (expiration, etc.). If there happened to be a collision, then the 2nd one would just not get cached. But this would be extremely rare.
Also, the amount of memory needed to store search results should be trivial. You don't need to keep the data of every single field, of every single row, in complete detail. You just need to keep a fast way to access each result, e.g. an int primary key.
Finally, if there are possibly thousands of results for a search that could be cached, you don't even need to keep an ID for each one - just keep the first 100 or something (as well as the total number of hits). I suspect if you analyzed how people use search results, it's a rare person that goes beyond a few pages. If someone did, then you can just run the query again.
So basically you're just storing a primary key for the first X records of each common search, and then if you get a hit on your cache, all you have to do is run a very inexpensive lookup of a handful of indexed keys.
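Put together, a rough sketch of that pattern (SearchCacheEntry is an invented type, and it assumes your SearchQuery overrides Equals and GetHashCode):

// the cache entry keeps the original query so a hash collision can be detected on lookup
public class SearchCacheEntry
{
    public SearchQuery Query { get; set; }    // the query that produced these results
    public List<int> ResultIds { get; set; }  // primary keys of the first X records
    public int TotalHits { get; set; }        // total count, for paging display
}

string key = "search:" + query.GetHashCode();           // Cache keys must be strings
var entry = HttpRuntime.Cache[key] as SearchCacheEntry;
if (entry != null && entry.Query.Equals(query))         // equality check rules out a collision
{
    // cache hit: run the cheap indexed lookup against entry.ResultIds
}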
Take a quick look at the Enterprise Library Caching Application Block. Assuming you want an application-wide cache, this might be the solution you're looking for.
I'm assuming that generating a database query from a SearchQuery object is not expensive, and you want to cache the result (i.e. rowset) obtained from executing the query.
You could generate the query text from your SearchQuery object and use that text as the key for a lookup using System.Web.Caching.
From a quick reading of the documentation for the Cache class, it appears that the keys have to be unique - which they would be if you used the query text itself, not a hash of it.
EDIT
If you are concerned about long cache keys then check the following links:
Cache key length in asp.net
Maximum length of cache keys in HttpRuntime.Cache object?
It seems that the Cache class stores the cached items in an internal dictionary, which uses the key's hash. Keys (query text) with the same hash would end up in the same bucket in the dictionary, where it's just a quick linear search to find the required one when doing a cache lookup. So I think you'd be okay with long key strings.
The asp.net caching is pretty well thought out, and I don't think this is a case where you need something else.
I am developing a website for a client (ASP.NET, T-SQL). It is a data-entry website allowing many of their users to login and manipulate records in the same database.
There are instructions (basically a list of strings) throughout the form, telling the users what to do for each section; these instructions are themselves stored in the database.
On each login, I store these instructions in the Session[] object per authenticated user. The instructions are identical for everyone.
I've looked at a solution which suggested storing a common session identifier in the database and then querying it to re-use that particular session but this seems very hacky. What is a best-practices solution to accomplish this? Is there a 'common' object available to all users?
Firstly, does it matter at this point? Yes, it's bad practice and inefficient, but if you're storing 20Kb of strings in memory and have a maximum of 100 users, that's 2,000Kb of data. Hardly a lot of memory "wasted". Even at 200Kb of strings, that's 20,000Kb of data. Again, not a lot. Is it worth your time, and the client waiting for you to solve it, right now?
If you decide it is then you could:
Store the strings in the Application object or a static class so that they're retrieved once and used many times (see the sketch after this list).
Retrieve the strings on every page view. This may not be as performance damaging as it seems.
Use something like the Cache class in System.Web.Caching.
Make use of Output Caching.
Make use of Windows Server AppFabric "Velocity" memory cache.
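For the first option, a minimal sketch might look like this (SharedInstructions and the loader method are invented names):

// loads the shared instruction strings once and serves every user from memory
public static class SharedInstructions
{
    private static readonly object Sync = new object();
    private static volatile IList<string> _items;

    public static IList<string> Items
    {
        get
        {
            if (_items == null)
            {
                lock (Sync) // double-checked locking so only one thread loads
                {
                    if (_items == null)
                        _items = LoadFromDatabase();
                }
            }
            return _items;
        }
    }

    private static IList<string> LoadFromDatabase()
    {
        // hypothetical data access call - replace with your real query
        return new List<string>();
    }
}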
Sounds to me like you're looking for the Application Cache. Like the Session, it is an in-memory cache of data. Unlike the session, it is shared among all users; each user doesn't get their own individual copy of the data. Also, when you add data elements to the cache, you can specify criteria which will automatically invalidate that data, and cause it to be reloaded/refreshed (useful when your seldom-changing data actually does change :).
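For example, an entry can be given a dependency and an expiration so it refreshes itself (a sketch only; the key, file path and timings are invented, shown from a page's code-behind):

// reload when the backing file changes, or after an hour at most
Cache.Insert(
    "Instructions",                  // one key shared by all users
    instructionList,                 // the shared data element
    new System.Web.Caching.CacheDependency(Server.MapPath("~/App_Data/instructions.xml")),
    DateTime.UtcNow.AddHours(1),     // absolute expiration
    System.Web.Caching.Cache.NoSlidingExpiration);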
Here's some articles which should give you everything you need to know about using the Application cache (and some other caching options within ASP.NET as well):
ASP.NET Caching Overview
Using the ASP.NET Application Cache to Make Your Applications Scream
Caching Data at Application Startup
.NET Data Caching
I would suggest using the application-level Cache object. It is available everywhere as part of HttpContext. You can populate it on App_Start.
You can put any kind of object into Cache, though obviously, the smaller the better.
Here are some examples of how to populate it using C#:
1) Add items to the cache as you would add items to a dictionary by specifying the item's key & value.
Example: add the current Value property of a text box to the cache.
Cache["txt1"] = txtName.value;
or
Cache["result"] = dataset;
2) The Insert method is overloaded, allowing you to define values for the parameters of the version you're using.
Example: add only an item key & value:
Cache.Insert("MyData1", connectionString);
3) The Add method takes the same full parameter list as the richest Insert overload, but it returns the object already stored under that key if one exists (otherwise null), and, unlike Insert, it will not overwrite an existing entry. Note that Add has no short overloads, so all parameters must be supplied:
Cache.Add("MyData1", connectionString, null,
    System.Web.Caching.Cache.NoAbsoluteExpiration,
    System.Web.Caching.Cache.NoSlidingExpiration,
    CacheItemPriority.Default, null);
To retrieve the item from the cache:
stringName = (string)Cache["MyData1"];
The indexer returns an object, so you need to cast the cached data to the proper type.
result = (DataSet)Cache["result"];
One of the benefits of using the Cache object as opposed to the Application object is that the CLR will dump contents of Cache if the system is in danger of running out of memory.
I need to store a few attributes of an authenticated user (I am using Membership API) and I need to make a choice between using Profiles or adding a new table with UserId as the PK. It appears that using Profiles is quick and needs less work upfront. However, I see the following downsides:
The profile values are squished into a single ntext column. At some point in the future, I will have SQL scripts that may update users' attributes. Querying an ntext column and trying to update a value in it sounds a little buggy to me.
If I choose to add a new user specific property and would like to assign a default for all the existing users, would it be possible?
My first impression has been that using profiles may cause maintenance headaches in the long run. Thoughts?
There was an article on MSDN (now on ASP.NET http://www.asp.net/downloads/sandbox/table-profile-provider-samples) that discusses how to make a Profile Table Provider. The idea is to store the Profile data in a table versus a row, making it easier to query with just SQL.
Further to that point, SQL Server 2005/2008 provides support for getting at data via services and CLR code. You could conceivably access the Profile data via the API instead of hitting the underlying tables directly.
As to point #2, you can set defaults for properties, and while this will not update existing profiles immediately, each profile will be updated the next time it is accessed.
Seems to me you have answered your own question. If your point 1 is likely to happen, then a SQL table is the only sensible option.
Check out this question...
ASP.NET built in user profile vs. old stile user class/tables
The first hint that the built-in profiles are badly designed is their use of delimited data in a relational database. There are a few cases that delimited data in a RDBMS makes sense, but this is definitely not one of them.
Unless you have a specific reason to use ASP.Net Profiles, I'd suggest you go with the separate tables instead.
There are a few string lists in my web application and I don't know where to store them: in the DB, or just in a class.
E.g. there are 7 major browsers with which users enter the site. I want to save these stats, so I need to create a browser column in the UserLogin table. I don't want to waste space and resources, so I can't just save the full browser name in each login row. So I either need to save a browserID field and hook it up to a Browsers table which stores the names (following DB normalization rules), or have some sort of DataHolder abstract class which holds the list of browsers and from which I can retrieve a browser name by its ID...
The question is: what should I do? These few data lists contain no more than 200 items each, so I think it makes sense to keep them in a class, but then again I don't know whether MS SQL will handle multiple joins so well. Think of the case where I have a user with country, IP, language, browser and a few more stats...
Thanks.
I have been on both sides of the fence about this.
My rule of thumb is:
If one of these lists changes, will I have to make changes to the code, too?
(e.g..: in your case, if someone writes "yet another browser" tomorrow, will I need to write code that caters for it?)
If the answer is "most probably yes" or "definitely" you can leave it inside code.
In all other cases (even just a "maybe", 50%-50%) you had better put it in the DB, or at the very least in a property file.
And please consider this, too: if you expect to have to provide statistics based on this data (e.g.: "how many users use Explorer") you better put it in the DB anyway: it becomes part of your domain data and therefore it must be there.
About the "domain data" part.
The information stored in your DB is the "domain data" of your application. It is, in a sense, a (hopefully consistent) representation of what your application is about - it represents the "known universe" for your application.
If you agree to this definition, then you must also accept that it does not make sense to have 99.9% of your "reality" in the DB, and 0.1% outside of it - if nothing else, it makes some operations cumbersome (if you only store the smallint you can't create meaningful reports without either post-processing them using the class to decode "1" into "Firefox" or providing some other key for the end-user).
It also makes it impossible for you to leverage some inherent DB techniques, like foreign keys (if you just use a smallint without correlating it to any other table, who guarantees that "10" is an acceptable value in your domain?).
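To make the foreign-key point concrete, the normalized layout would be something like this (names are illustrative):

Browsers
--------
BrowserID: smallint - identity
Name: nvarchar(50)

UserLogin
---------
... (date, IP and the other stats columns)
BrowserID: smallint (fkey Browsers.BrowserID)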
MS SQL handles multiple joins really well; it's up to you where you want to store the data. You could also consider XML as another option. I would go with the database or XML; it is easier to change the values there than when they are in code (you have to recompile/redeploy to change them in production).
HTH.
What is the best way to design the Domain objects which can have multi-lingual fields. An example can be a Product class with Description being multi-lingual.
I have found few links but could not decide which one is the best way.
http://fabiomaulo.blogspot.com/2009/06/localized-property-with-nhibernate.html
(This stores all localised language data in one field. Can be a problem if we query from SQL.)
http://ayende.com/Blog/archive/2006/12/26/LocalizingNHibernateContextualParameters.aspx
(This one has a warning at the beginning that it is a hack and no longer supported)
http://www.webdevbros.net/2009/06/24/create-a-multi-languaged-domain-model-with-nhibernate-and-c/
(This does not describe how multilingual data will be structured in the database.)
Does anyone have experience using NHibernate with multi-lingual data? Is there a better way?
The third option looks great. The hibernate mapping is given, but not the database schema - if that's what you are missing, then I'll sketch it out here:
dictionary
----------
ID: int - identity
name: nvarchar(255)
phrase
------
dictionary_id:int (fkey dictionary.ID)
culture_id:int (LCID)
phrase:nvarchar(255) - this is the default size - seems too small
According to this blog entry, 255 is the default string length for String values. To overcome the short string length on the phrase text, you can change the <element> tag to
<element column="phrase" type="String" length="4001"></element>
To use this in your domain model, you add a PhraseDictionary property to your entity wherever you want translatable text, e.g. the title property or description property.
I think the article describes a great approach, and it is the one that I would go for.
EDIT: In response to the comments, make the length less than 4001 if you know the absolute maximum size is less than that, as this will typically be faster. Also, NHibernate will lazily fetch the collection, but it may fetch all the items at once. You can profile to determine if this has any performance implications. (If you have only a handful of languages then I doubt you will see a difference.) If you have many languages (Say 50+) then it may be worthwhile creating custom properties to fetch the localized text. These will issue queries to fetch specifically the text required. More importantly, you may be able to fetch all the text for a given entity in one query, rather than each localized text property as a separate query.
Note that this extra effort is only needed if profiling gives you reason to be concerned about the performance. Chances are that the implementation in the article as is will function more than adequately.
I only have experience with Hibernate, but since NHibernate is so similar:
One option is to define a component type MultilingualString with members for each language (this assumes the set of languages is known at coding time). This type is also a convenient location to place a getter for the string by language id.
class MultiLingualString {
    String english;
    String chinese;
    String klingon;

    // assumes Language is an enum with matching constants
    String forLanguage(Language lang) {
        switch (lang) {
            case ENGLISH: return english;
            case CHINESE: return chinese;
            case KLINGON: return klingon;
            default: throw new IllegalArgumentException("Unknown language: " + lang);
        }
    }
}
This results in the strings for all languages being stored in separate columns in the database while the representation in the object world retains fine granularity.
The advantage is that no join is required to fetch the strings. On the other hand, the only way not to fetch a string with this approach is to use a projection, which is a severe limitation if the strings are large, numerous and rarely needed.
If you do this a lot, writing a UserType might be worth it.
From a strictly database-oriented standpoint with SQL Server, you should have one table with all of the base data (record key, dates, numbers, etc.) and one table with all of the translatable string data. Let's call the two tables Base and Base_Description.
Base ensures that there is a single key for each record, the key might be a string or auto-generated id depending on your particular use case.
The Base_Description table is related to the Base table, but also contains a value to select the language that the data is in. In my projects we use the langid column from sys.languages, because we can set the language of the connection with SET LANGUAGE and then grab it with @@LANGID for most operations.
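In sketch form, that layout is something like this (column names are illustrative):

Base
----
BaseID: int - identity (or a string key, per your use case)
... (dates, numbers and other non-translatable columns)

Base_Description
----------------
BaseID: int (fkey Base.BaseID)
LangID: smallint (matching sys.languages.langid)
Description: nvarchar(max)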
In our testing we found this to be significantly faster than having multiple fields for each language, it also allows you to add other languages more easily. We are also using SQL Server Full-Text indexing and it fully works with this method. You should index in the neutral language and then you can pick the language to search against at run time (also filtering against the LangID column in Base_Description).
Do your requirements include the domain objects actually having multiple-language properties in the same object? And, if so, is it unlimited translations stored in the object (in a collection, say - in which case I would say that it would need to be just like any master/detail or parent/child collection) or fixed translations, in which case the languages (and thus the mapping to results of a stored proc or whatever) have to be determined statically anyway?
In many internationalized applications I worked on, the data was in only one language - customer names, the product names (there was no point in mapping even identical products used in one country to products in another, they all had different distributors and different SKUs, and of course localized pricing). The interface was also only in one language (at a time). So all the domain objects only required one language at a time. Thus the language of the translation would be determined when the object was instantiated.
We had translation user interfaces which allowed users to update the translated texts, but these only required two languages at a time (local and the default). I can see this being closest to what you are talking about. I guess that you would have child collections for each translatable property with all the possible translations in the collection. This would probably be closest to the second solution in the third article you linked. Of course, at this point you would also need to see if you want eager/lazy loading etc.