Is it fine to set the hot cache to 0s (or something very minimal) if we rarely query a certain table? I am asking because I was once told that, as a rule of thumb, the hot cache has to be a minimum of 1 day regardless of access frequency, because the hot cache is used during the rebuild/merge process: if data to be merged is not found in the hot cache during the merge, it's accessed from cold storage (Azure blob) instead, causing throttling. Is that correct?
While it's technically possible to set 0s as the caching period, doing so may harm the efficiency of background processes such as merging data shards (e.g. due to having to read artifacts from blob storage instead of from the local cache).
It's recommended that you set the caching policy to at least several hours (e.g. 6h) in order to avoid the aforementioned potential impact.
Plus, if the table is frequently queried, consider setting the caching period to match the timespan that is queried the most, to improve overall performance.
I have a webpage which takes a while to load because it has to pull information from lots of local databases. For example, if a user searches for person 1, it will query 20 databases. It can sometimes take 5 minutes to pull all the information needed and apply the business logic. The best solution is to design a data warehouse, which is a long-term aim.
If I use data caching, it reduces the page load time (of the big records) from five minutes to four seconds. Is it bad practice to store information in the cache for a long period of time, i.e. 24 hours? The cache will be refreshed every 24 hours. Alternatively, I could store the cached information in a database table.
Every example I find online caches information for seconds, e.g. 20 seconds.
Pros:
Faster load times
Less bandwidth usage
Less stress on the server
Cons:
May require high technical expertise to configure it just right
Will not work for content that is constantly being updated
For many system administrators, especially those with the skills to implement a caching system, the pros greatly outweigh the cons. Caching can make your websites run more smoothly for visitors and lessen the burden on your dedicated server.
For more details, check this link.
Cache is used for global resources; if the data in your application is per-user, use Session, which is like a cache per user.
You can also cache results and connect them to database tables, so that if a table is updated, so is the cache; this is called a Cache Dependency.
The main concern you need to have is what happens if the cached information is not up to date; in that case, use a Cache Dependency.
Don't worry about memory issues on your server; the server is already optimized and knows to clean up the cache if memory runs low.
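To make the Cache vs. Session distinction concrete, here is a minimal C# sketch. The key names ("TaxRates", "UserPreferences") and the data being stored are hypothetical placeholders, not anything from the question.

```csharp
using System;
using System.Web;
using System.Web.Caching;

public static class CacheVsSessionExample
{
    // Global data shared by every visitor belongs in the Cache.
    public static void StoreGlobalData(HttpContext context, object taxRates)
    {
        context.Cache.Insert(
            "TaxRates",                    // hypothetical cache key
            taxRates,                      // e.g. a list loaded from the database
            null,                          // no cache dependency in this sketch
            DateTime.UtcNow.AddHours(24),  // absolute expiration: refresh once a day
            Cache.NoSlidingExpiration);
    }

    // Per-user data belongs in Session, which is scoped to a single visitor.
    public static void StorePerUserData(HttpContext context, object preferences)
    {
        context.Session["UserPreferences"] = preferences;  // hypothetical key
    }
}
```

With the 24-hour absolute expiration above, the cached copy is rebuilt at most once a day, which matches the refresh interval described in the question.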
I hope this helps you.
Both Page.Cache and Page.Application can store an application's "global" data, shared among requests and threads.
How should one storage area be chosen over the other, considering data synchronization in the multi-threaded ASP.NET environment?
I'm looking for best practices and recommendations based on experience.
If the data
is stable during the life of the application
must always be available and must not be purged
then it's better stored in HttpApplicationState.
If the data
is not necessarily needed for the life of the application
changes frequently
can be purged if needed (for example low system memory)
can be discarded if seldom used
should be invalidated/refreshed under some conditions (dependency rule: time span, date, file timestamp, ...)
then use Cache.
Other important points:
Large amounts of data may be better stored in Cache; the server can then purge them if it is low on memory.
Cache is safe for multithreaded operations. Page.Application needs locking.
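As a small hedged illustration of that locking point, here is a C# sketch; the "HitCount" and "LookupTable" keys are hypothetical examples, not anything from the question.

```csharp
using System;
using System.Web;

public static class ApplicationVsCacheExample
{
    // HttpApplicationState is not safe for read-modify-write from multiple
    // threads, so updates need an explicit Lock/UnLock pair.
    public static void IncrementHitCount(HttpApplicationState app)
    {
        app.Lock();
        try
        {
            object value = app["HitCount"];               // hypothetical key
            int current = value == null ? 0 : (int)value;
            app["HitCount"] = current + 1;
        }
        finally
        {
            app.UnLock();
        }
    }

    // Individual Cache operations are thread-safe, so a simple insert needs
    // no locking; the entry may still be purged if the server runs low on memory.
    public static void CacheLookupTable(HttpContext context, object lookupData)
    {
        context.Cache.Insert("LookupTable", lookupData);  // hypothetical key
    }
}
```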
See also this article on etutorials.org for more details.
You would typically store data in the Page.Application items collection when you need it within the same request. Page.Cache is typically used in data-caching scenarios when you want to use the data across multiple requests.
I read somewhere that for a high-traffic site (I guess that is a murky term as well), 30-60 seconds is a good value. Obviously I could do a load test and vary the values, but I couldn't find any kind of documentation on this. Most samples use a minute, or a couple of minutes. There's no recommended range. Is there something on MSDN or anywhere else that talks about this?
This all depends on whether or not the content changes frequently. For slowly changing or non-mutating content, a longer value works perfectly. However, you may need to shorten the value for constantly changing data or risk serving stale output.
It all depends on how often a user requests your resource, and how big the resource is.
First, it is important to understand that when you cache something, that resource will remain the same until the cache duration runs out. A short cache duration will tax the web server more than a longer one, but it will provide more up-to-date data should the requested resource change.
Obviously you want to cache database queries as much as possible, prioritizing those that are called often. But every cache entry takes memory on the server, and as resources run low, entries will be evicted. Take this into consideration when caching large things for long durations.
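For example, here is a minimal C# sketch of caching a query result with a short, tunable duration and an explicit eviction priority; the "ProductList" key and the loader delegate are hypothetical placeholders.

```csharp
using System;
using System.Web;
using System.Web.Caching;

public static class QueryCacheExample
{
    public static object GetProducts(HttpContext context, Func<object> loadProductsFromDatabase)
    {
        // Return the cached copy when it is still present.
        object products = context.Cache["ProductList"];   // hypothetical key
        if (products == null)
        {
            // Cache miss: run the expensive query, then cache the result.
            products = loadProductsFromDatabase();
            context.Cache.Insert(
                "ProductList",
                products,
                null,                              // no dependency
                DateTime.UtcNow.AddSeconds(60),    // short, tunable duration
                Cache.NoSlidingExpiration,
                CacheItemPriority.Low,             // evicted early if memory runs low
                null);                             // no removal callback
        }
        return products;
    }
}
```

Tuning the 60-second value up or down is exactly the trade-off described above: longer durations mean fewer database hits but staler data and a longer-lived claim on server memory.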
If you want data on how often users request a resource, you can use Google Analytics, which is extremely easy to set up.
For very exhaustive analytics you can use Kiwik. It requires a local server, though.
For rapidly changing resources, don't cache at all, unless generating them is really resource-heavy and it isn't vital for them to be updated in real time.
To give you an exact number or recommendation would be to do you a disservice; there are too many variables involved.
I am using the ASP.NET caching mechanism for a web app whose data changes very frequently.
The cache holds chat participants and their messages, and it needs to keep track of
participants' presence.
Data changes very frequently: participants come and go, and messages are sent and received.
The cache provides me with solutions for:
- performance
- reducing the number of DML (insert/update) operations in the database (SQL Server) - we had a problem with the transaction log getting full.
I want to continue working this way, but I cannot rely on the cache (I can lose all of the data when the cache is recycled, or some of it when memory gets full).
The option I see right now is to save the data to the database every time the cache changes; otherwise I will lose data.
But this means many SQL update/insert statements.
Someone advised me to persist to the database every N messages/changes, but it's still not a reliable solution; I would still lose data.
Does anyone have an idea?
Thanks
Yaron
Fix your database capacity issues. If you need to be able to reliably save n changes per second, then your database needs to be able to handle n operations per second.
Anything else (including saving every few operations) will lead to some possibility of data loss. If you can accept that data loss risk, then that would work.
A distributed cache (project Velocity or otherwise) could also help (data is at least saved to multiple machines). But that needs extra hardware, and you could spend that money on database capacity instead.
Finally, rather than trying to cache changes, look for other opportunities to cache database reads; taking that load off might allow the writes to go through, at least until you get more usage.
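A hedged C# sketch of what that could look like: writes always go to the database first (the durable copy), and the cache is only a read accelerator that can be rebuilt if it is ever evicted. ChatMessageStore, the key scheme, and the saveMessageToDatabase delegate are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Web;

public class ChatMessageStore
{
    private readonly Action<string, string> _saveMessageToDatabase;  // hypothetical DAL call

    public ChatMessageStore(Action<string, string> saveMessageToDatabase)
    {
        _saveMessageToDatabase = saveMessageToDatabase;
    }

    public void AddMessage(HttpContext context, string roomId, string message)
    {
        // 1. Persist first; if this fails, nothing is cached and nothing is lost.
        _saveMessageToDatabase(roomId, message);

        // 2. Then refresh the cached copy that serves fast reads.
        //    (Locking around the list is omitted for brevity.)
        string cacheKey = "ChatRoom:" + roomId;  // hypothetical key scheme
        var messages = context.Cache[cacheKey] as List<string> ?? new List<string>();
        messages.Add(message);
        context.Cache.Insert(cacheKey, messages);
    }
}
```

If an entry is evicted or the process recycles, the list can simply be reloaded from the database on the next read, so the cache never holds the only copy of the data.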
We are developing an ASP.NET HR Application that will make thousands of calls per user session to relatively static database tables (e.g. tax rates). The user cannot change this information, and changes made at the corporate office will happen ~once per day at most (and do not need to be immediately refreshed in the application).
About 2/3 of all database calls are to these static tables, so I am considering just moving them into a set of static objects that are loaded during application initialization and then refreshed every 24 hours (if the app has not restarted during that time). Total in-memory size would be about 5MB.
Am I making a mistake? What are the pitfalls to this approach?
From the info you present, it looks like you definitely should cache this data -- rarely changing and so often accessed. "Static" objects may be inappropriate, though: why not just access the DB whenever the cached data is, say, more than N hours old?
You can vary N at will, even if you don't need special freshness -- even hitting the DB 4 times or so per day will be much better than "thousands [of times] per user session"!
It may be best to keep, alongside the DB info, a timestamp or datetime recording when it was last updated. That way, the "is my cache still fresh?" check is typically very lightweight: just fetch that "last updated" value and compare it with the one you saw when you last rebuilt the local cache. It's rather like an HTTP "If-Modified-Since" caching strategy, except you'd be implementing most of it on the DB client side ;-).
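A minimal C# sketch of that "If-Modified-Since"-style check, assuming two hypothetical queries: a cheap one that returns only the last-updated timestamp, and an expensive one that loads the full static data set.

```csharp
using System;

public class TimestampedCache<T>
{
    private readonly Func<DateTime> _getLastUpdatedFromDb;  // cheap query: just the timestamp
    private readonly Func<T> _loadAllFromDb;                // expensive query: the full data set

    private T _cached;
    private DateTime _cachedAsOf = DateTime.MinValue;

    public TimestampedCache(Func<DateTime> getLastUpdatedFromDb, Func<T> loadAllFromDb)
    {
        _getLastUpdatedFromDb = getLastUpdatedFromDb;
        _loadAllFromDb = loadAllFromDb;
    }

    public T Get()
    {
        // Lightweight freshness check: has anything changed since the last load?
        DateTime lastUpdated = _getLastUpdatedFromDb();
        if (lastUpdated > _cachedAsOf)
        {
            // Only rebuild the local copy when the DB says it is stale.
            _cached = _loadAllFromDb();
            _cachedAsOf = lastUpdated;
        }
        return _cached;
    }
}
```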
If you decide to cache the data (vs. making a database call each time), use the ASP.NET Cache instead of statics. The ASP.NET Cache provides functionality for expiry, handles multiple concurrent requests, and can even invalidate the cache automatically using the query notification features of SQL 2005+.
If you use statics, you'll probably end up implementing those things anyway.
There are no drawbacks to using the ASP.NET Cache for this. In fact, it's designed for caching data too (see the SqlCacheDependency class http://msdn.microsoft.com/en-us/library/system.web.caching.sqlcachedependency.aspx).
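For illustration, a hedged C# sketch of using SqlCacheDependency for this kind of static table. "TaxRatesDb" (a <sqlCacheDependency> database entry that would have to exist in web.config), the "TaxRates" table/key names, and the loader delegate are all assumptions, not something from the question.

```csharp
using System;
using System.Web;
using System.Web.Caching;

public static class TaxRateCache
{
    public static object GetTaxRates(HttpContext context, Func<object> loadTaxRates)
    {
        object rates = context.Cache["TaxRates"];  // hypothetical cache key
        if (rates == null)
        {
            rates = loadTaxRates();
            context.Cache.Insert(
                "TaxRates",
                rates,
                // Invalidate the entry when the underlying table changes.
                // "TaxRatesDb" must match a <sqlCacheDependency> database entry
                // in web.config; "TaxRates" is the monitored table name.
                new SqlCacheDependency("TaxRatesDb", "TaxRates"));
        }
        return rates;
    }
}
```

(The SQL 2005+ query-notification flavour mentioned above uses the SqlCacheDependency(SqlCommand) constructor instead of the database-entry/table-name pair.)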
With caching, a DBMS is plenty efficient with static data anyway, especially with only 5 MB of it.
True, but the point here is to avoid the database round trip altogether.
ASP.NET Cache is the right tool for this job.
You didn't state how you will find the matching data for a user. If it is as simple as looking up a foreign key in the cached set, then you don't have to worry.
If you implement some kind of filtering/sorting/paging or, worse, searching, then you might at some point miss the querying capabilities of SQL.
ORMs often have their own querying, and LINQ makes things easy too, but it is still not SQL.
(Try to group by 2 columns.)
Sometimes it is a good approach to have the DB return only the keys of a result set and use the Cache to fill in the complete rows.
Think: premature optimization. You'll still need to deal with the data as tables eventually anyway, and you'd be leaving behind an "unusual design pattern".
With even default caching, a DBMS is plenty efficient with static data anyway, especially with only 5 MB of it. And the DBMS partitioning you're describing is often described as an antipattern. One example: multiple identical databases for multiple clients. There are other questions here on SO about this pattern. I understand there are security issues, but doing it this way creates other security issues. I've recently seen this same concept in a medical billing database (even more highly sensitive) that ultimately had to be refactored into a single database.
If you do this, then I suggest you at least wait until you know it's solving a real problem, and then test to measure how much difference it makes. There are lots of opportunities here for Unintended Consequences.