We're experimenting with appending timestamps to some URLs so that resources can be cached but refreshed when they actually change. We have code that boils down to this:
DateTime ts = File.GetLastWriteTime(absPath);
where absPath is the mapped path of a URL (e.g. via Server.MapPath). So the web server will be checking this file's last write time every time we serve up a link to the file. Kinda gives me the willies - should it?
You should performance-test it, but off-hand I doubt it's any more expensive than querying a file's attributes (e.g. whether it exists or is read-only), and certainly less expensive than actually opening the file.
If (after testing) you decide that it's a problem, you could also cache your calls to GetLastWriteTime (e.g. don't call it more than once every 5 seconds for any given file).
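That throttling could look something like this (a minimal sketch; the class name, the ConcurrentDictionary, and the 5-second TTL are my own choices, not anything from your code):

using System;
using System.Collections.Concurrent;
using System.IO;

public static class TimestampCache
{
    private static readonly TimeSpan Ttl = TimeSpan.FromSeconds(5);

    // Maps absolute path -> (when we last checked, the timestamp we got).
    private static readonly ConcurrentDictionary<string, Tuple<DateTime, DateTime>> Entries =
        new ConcurrentDictionary<string, Tuple<DateTime, DateTime>>();

    public static DateTime GetLastWriteTime(string absPath)
    {
        Tuple<DateTime, DateTime> entry;
        if (Entries.TryGetValue(absPath, out entry) &&
            DateTime.UtcNow - entry.Item1 < Ttl)
        {
            return entry.Item2; // still fresh, skip the disk hit
        }

        DateTime ts = File.GetLastWriteTime(absPath);
        Entries[absPath] = Tuple.Create(DateTime.UtcNow, ts);
        return ts;
    }
}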
Also, I've never used it myself, but if caching is a concern I hope you've considered delegating it to a specialist like Squid instead of rolling your own.
I have not tried this, but your question is relevant to a situation that I have been thinking about.
You did not indicate what data is changing (database, XML data, etc.).
ASP.NET caching does support updating the cache based on a variety of dependencies.
Check out the sections of this article on File-based Dependency, Time-based Dependency, and Key-based Dependency.
"Dependencies allow us to invalidate a particular item within the Cache based on changes to files, changes to other Cache keys, or at a fixed point in time. Let's look at each of these dependencies."
Here is the article:
http://msdn.microsoft.com/en-us/library/ms972379.aspx
Thanks
Joe
Essentially, there are three answers to your question of "how expensive?".
Too expensive - you've tested it and something has to change for the system to be usable.
Acceptable - you've tested it and it isn't great, but it is fast enough to use.
Rather cheap - you've tested it and there is no noticeable impact on performance.
We can't really answer the question for you, so you'll just have to try it out. If you decide that it was too expensive or that it's worth your time to move it from acceptable to rather cheap, change the question to ask how to speed things up.
It'll incur an additional small disk I/O each time a link is generated. If you create many URLs in a short period of time, this could become a bottleneck. No one can say for sure whether it will impact your scenario - you really need to measure and see whether it's going to be an issue.
Or if you're worried about it, why not cache it for a minute?
So I need to maintain this old legacy project. One part of it is amateurishly written in WordPress, with lots of crappy custom plugins and lots of duct-taped scripts that each solve one problem or another; the database is designed very, very poorly too. There is another part written with Zend, which is tightly coupled to the WP part, and there is also this other "masterpiece" project connected to the first project's data. The main table contains around 1.5M records and needs to be normalized too. Now, this big ball of nails "works", but it has lots of LOCs which are the result of bad foundations, so it is a huge pain to maintain. The thing is, the way I see it, by not rewriting we are losing in the long term, because we lose flexibility from both the technology and the business perspective, and it is starting not to scale; but rewriting is a risk, plus we would need to convert old data to the new data structures. The hacker part of me wants to break this, take the risk and do it right, but at the same time I have a feeling that my immaturity is trying to take too big a bite at once. So what do you think?
In situations like yours, 9 out of 10 times people suggest a rewrite and they are wrong.
Unless you have deep application-level knowledge of what the system is doing, you will not be able to rewrite it both successfully and quickly.
If the system is working today, but is crappy in many ways, and you have management buy-in (they own the software) to "fix stuff", then an incremental approach will often be better than a full-on rewrite.
I suspect that the database is giving you the most headaches, so that may be the best place to start. Start by understanding the problem that it is currently solving and write that down. If there is no layer between the software and the db (other than jdbc or the like) add a layer. Once there is a layer separating the db from the application, it will be easier to change the db (and the layer) while minimizing the impact on the application.
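One way to picture that layer (a minimal sketch, written in C# purely for illustration since your stack is PHP; every name here is hypothetical):

// The application talks only to this interface, so the tables behind it
// can be normalized later without touching the callers.
public class Client
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public interface IClientRepository
{
    Client FindById(int id);
    void Save(Client client);
}

// Step 1: implement this against the existing, un-normalized schema.
// Step 2: once all callers go through the interface, swap in an
// implementation backed by the normalized tables and delete the old one.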
At some point you will be happy with the first thing you changed. At that point, fix some other part. Repeat until the system is "more better".
Concerning risk: Taking risks is not bad, but being careless is terrible. Understand the risks and plan to mitigate them.
In situations like yours, 9 out of 10 times, I suggest a rewrite. Rationally, the situation a) won't get better on its own, and b) will certainly get worse. You should bite the bullet before it's too late.
And by too late I mean that something breaks completely, and not only will you have to rewrite the whole thing, but your service will also be offline (ergo you may also be losing users/customers).
A good strategy in situations like this is to "strangle" your application, as described by Martin Fowler:
http://martinfowler.com/bliki/StranglerApplication.html
The strategy is to gradually create a new system around the edges of the old, letting it grow slowly over several years until the old system is strangled.
I have already strangled a legacy application using this approach, with great results and practically no downtime.
I have what I consider to be a fairly simple application. A service returns some data based on another piece of data. A simple example, given a state name, the service returns the capital city.
All the data resides in a SQL Server 2008 database. The majority of this "static" data will rarely change. It will occasionally need to be updated and, when it does, I have no problem restarting the application to refresh the cache, if implemented.
Some data, which is more "dynamic", will be kept in the same database. This data includes contacts, statistics, etc. and will change more frequently (anywhere from hourly to daily to weekly). This data will be linked to the static data above via foreign keys (and queried with SQL JOINs).
My question is: what exactly am I trying to implement here, and how do I get started doing it? I know the static data should be cached, but I don't know where to start with that. I tried searching but came up with so much stuff that I'm not sure where to begin. Recommendations for tutorials would also be appreciated.
You don't need to cache anything until you have a performance problem. Only once you have a noticeable problem and have measured your application tiers to determine that your database is in fact the bottleneck (which it rarely is) should you start looking into caching data. Caching is always a trade-off: memory vs. CPU vs. real-time data availability. There is no reason to make your application more complicated than it needs to be.
An extremely simple 'win' here (I assume you're using WCF here) would be to use the declarative attribute-based caching mechanism built into the framework. It's easy to set up and manage, but you need to analyze your usage scenarios to make sure it's applied at the right locations to really benefit from it. This article is a good starting point.
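To make that concrete, the attribute-based approach might look roughly like this (a minimal sketch, assuming .NET 4+, a WCF service hosted in IIS with ASP.NET compatibility mode enabled, and a matching output-cache profile in web.config; the service and profile names are hypothetical):

using System.ServiceModel.Activation;

[AspNetCompatibilityRequirements(
    RequirementsMode = AspNetCompatibilityRequirementsMode.Allowed)]
public class CapitalService : ICapitalService
{
    // "StaticData" must match an <outputCacheProfiles> entry in web.config,
    // which is where the actual cache duration is configured.
    [AspNetCacheProfile("StaticData")]
    public string GetCapital(string stateName)
    {
        return LookupCapital(stateName); // hypothetical database lookup
    }
}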
Beyond that, I'd recommend looking into one of the many WCF books that deal with higher-level concepts like caching and try to figure out if their implementation patterns are applicable to your design.
Our application includes a default set of data. The default data includes coefficients and other factors that are unlikely to ever change but still need to be update-able by the user.
Currently, the original default data is stored as a populated class within the application. Data updates are stored to an external XML file. This design allows us to include a "reset" feature to restore the original default data. Our rationale for not storing the defaults externally (e.g. in an XML file) was to minimize the risk of their being altered. The overall volume of data doesn't warrant a database.
Is there a standard practice for storing "default" application data?
Suppose I were to answer: "Yes, there is a standard. 79% of systems worldwide externalise to a database." Would you now feel motivated to adopt a database? Surely not! Your particular requirements don't merit that overhead.
We're talking trade-offs here. Do the defaults need to change frequently? How much effort is it to change them using your current approach? Do you need to release different versions of the application with different defaults? Do the defaults change as you move from UAT to Production?
If you explore your requirements, an engineering solution should emerge. In all likelihood you will then make a better choice than the current common practice ("standard") that most folks have adopted, which all too often is to use whatever technique they used on their previous project.
For what it's worth, my personal "standard" is to externalise everything. Even when I don't expect things to change, sometime, somewhere, they do. Once I've decided to externalise, XML or properties files don't make much difference to me.
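As an illustration of the externalise-plus-reset idea (a minimal sketch in C#; the type and file names are hypothetical, and a properties-file variant would have the same shape):

using System.IO;
using System.Xml.Serialization;

public class Settings
{
    public double Coefficient { get; set; } // stand-in for your real fields
}

public class SettingsStore
{
    private const string DefaultsFile = "defaults.xml";       // shipped with the app, treated as read-only
    private const string OverridesFile = "user-settings.xml"; // holds the user's edits

    private static readonly XmlSerializer Serializer = new XmlSerializer(typeof(Settings));

    public Settings Load()
    {
        // Prefer the user's overrides; fall back to the shipped defaults.
        string path = File.Exists(OverridesFile) ? OverridesFile : DefaultsFile;
        using (var stream = File.OpenRead(path))
        {
            return (Settings)Serializer.Deserialize(stream);
        }
    }

    public void Reset()
    {
        // "Reset to defaults" is simply deleting the overrides.
        if (File.Exists(OverridesFile))
        {
            File.Delete(OverridesFile);
        }
    }
}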
Properties files sound OK to me. You can also include them inside the jar so that you don't have to carry them around separately.
Edit: the "reset" function goes into your application code, though.
Having these defaults in an external file could make updating them easier; you could always keep a copy in the download, on CD, etc.
The short of it is: is it costly to check an Application variable such as Application("WebAppName") 10-20 times each time a page loads?
Background: (feel free to critique)
Some includes in my site contain many links and images which cannot use relative URLs because the includes are used from different paths.
Hence these includes contain frequent instances of
<img src="<%=Application("Webroot")%>images/image.gif">
Is it expensive to keep calling an Application variable like this?
Should I just put the Application value in some local variable to use where needed?
IMPORTANT NOTE:
I need my webapp to run fine on a server whether it be in the root web ("/") or in a virtual subweb ("/app").
Thanks in advance for any wisdom shared.
It's cheap - very, very cheap - just a dictionary lookup. Compared with almost anything else you'll do in the app (loading something from disk or the network) this will be statistical noise.
In general though, the best thing to do if you're worried about things like this is to measure it. Arbitrarily put 10,000 calls into a page, and see how that affects performance. See how it affects concurrency as well - can you still get the throughput you need when processing multiple concurrent requests?
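A crude sketch of such a measurement (the iteration count and key name are arbitrary):

using System.Diagnostics;

var sw = Stopwatch.StartNew();
for (int i = 0; i < 10000; i++)
{
    object value = Application["Webroot"]; // the lookup being measured
}
sw.Stop();
// Compare sw.ElapsedMilliseconds with the total page render time;
// if it's lost in the noise, stop worrying.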
Just for info, another option is:
<img src="<%=VirtualPathUtility.ToAbsolute("~/images/image.gif")%>">
This works well especially in MVC, where you might write an extension method to do the job, i.e.
<%=Html.Image("~/images/image.gif")%>
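Such an extension method might look roughly like this (a sketch; Html.Image is not a built-in MVC helper, so the name and signature are illustrative):

using System.Web;
using System.Web.Mvc;

public static class HtmlImageExtensions
{
    public static MvcHtmlString Image(this HtmlHelper html, string virtualPath)
    {
        var tag = new TagBuilder("img");
        tag.MergeAttribute("src", VirtualPathUtility.ToAbsolute(virtualPath));
        return MvcHtmlString.Create(tag.ToString(TagRenderMode.SelfClosing));
    }
}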
The Application object is a synchronized collection which uses ReadWriteObjectLock (an internal class that just uses the lock keyword). If you are only reading from the collection, it will be as fast as a hash-table lookup, as Jon mentioned; but if someone is writing to the collection at the same time, readers will block until the write is complete. If you are worried that much about performance, call the indexer once, store the result in a local variable, and use that variable in your views.
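In a view, that might look like this (ASP.NET with C#; the variable name is illustrative):

<%
    // Read the Application value once per request, then reuse it.
    string webroot = (string)Application["Webroot"];
%>
<img src="<%=webroot%>images/one.gif">
<img src="<%=webroot%>images/two.gif">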
Use Request.ApplicationPath instead (only works if your app is set as a virtual directory in IIS)
Short answer - measure it and decide on your own environment. I would say it does not matter.
Longer answer - you should have the call wrapped in something anyway... Like WebConfiguration.Root.
That will give you the option to do whatever optimization to it anytime in the future.
I have been doing some reading on this subject, but I'm curious to see what the best ways are to optimize use of the ASP.NET cache, and what the tips are for determining what should and should not go in the cache. Also, are there any rules of thumb for determining how long something should stay in the cache?
Some rules of thumb
Think in terms of the cache-miss-to-request ratio each time you contemplate using the cache. If cache requests for the item will miss most of the time, the benefits may not outweigh the cost of maintaining that cache item.
Contemplate the query expense vs. the cache retrieval expense (e.g. for simple reads, SQL Server is often faster than a distributed cache due to serialization costs).
Some tricks
gzip strings before sticking them in the cache. This effectively expands the cache and reduces network traffic in a distributed-cache situation (see the sketch after this list).
If you're worried about how long to cache aggregates (e.g. counts), consider having non-expiring (or long-lived) cached aggregates and proactively updating them when the underlying data changes. This is a controversial technique and you should really consider your request/invalidation ratio before proceeding, but in some cases the benefits can be worth it (e.g. SO rep for each user might be a good candidate depending on implementation details; the number of unanswered SO questions would probably be a poor candidate).
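For the gzip trick above, a minimal sketch (the helper names are mine):

using System.IO;
using System.IO.Compression;
using System.Text;

public static class CacheCompression
{
    public static byte[] Compress(string value)
    {
        byte[] raw = Encoding.UTF8.GetBytes(value);
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray(); // these bytes are what you put in the cache
        }
    }

    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}

Note this pays off mainly for large strings; for small values the CPU cost of compression can outweigh the space savings.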
Don't implement caching yet.
Put it off until you've exhausted indexing, query tuning, page simplification, and the other more pedestrian means of boosting performance. If you flip caching on before it's the last resort, you're going to have a much harder time figuring out where the performance bottlenecks really are.
And, of course, if you have the backend tuned right when you finally do turn on caching, it will work a lot better for a lot longer than it would if you did it today.
The best quote I've heard about performance tuning and caching is that it's an art, not a science. Sorry, I can't remember who said it, but the point is that there are so many factors that can affect the performance of your app that you need to evaluate each situation case by case and make considered tweaks until you reach the desired outcome.
I realise I'm not giving any specifics here, but I don't really think you can.
I will give one example from experience, though. I worked on an app that made a lot of calls to web services to build up a client profile, e.g.:
GET client
GET client quotes
GET client quote
Each object returned by the web service contributed to a higher-level object that was then used to build the resulting page. At first we gathered all the objects into the master object and cached that. However, when things were not as quick as we would have liked, we realised it would make more sense to cache each fetched object individually, so it could be re-used on the next page the client sees (see the sketch after this list), e.g.:
[Cache] client
[Cache] client quotes
[Cache] client quote
GET client quote upgrades
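That per-object pattern boiled down to something like this (a sketch; the key format, Client type, and IClientService proxy are all hypothetical):

using System.Web;

public class ClientCache
{
    private readonly IClientService clientService; // hypothetical web service proxy

    public ClientCache(IClientService clientService)
    {
        this.clientService = clientService;
    }

    public Client GetClient(int clientId)
    {
        string key = "client:" + clientId;
        var client = (Client)HttpRuntime.Cache[key];
        if (client == null)
        {
            client = clientService.GetClient(clientId); // only hit the web service on a miss
            HttpRuntime.Cache.Insert(key, client);      // re-usable by the next page
        }
        return client;
    }
}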
Unfortunately there are no pre-established rules... but as a matter of common sense, I would say that you can easily cache:
Application Parameters (list of countries, phone codes, etc...)
Any other application non-volatile data (list of roles even if configurable)
Business data that is often read and does not change much (or where it is not a big deal if it is not 100% accurate)
What you should not cache:
Volatile data that change frequently (usually the business data)
As for the cache duration, I tend to use different durations depending on the type of data and its size. Application Parameters can be cached for several hours or even days.
For some business data, you may want a shorter cache duration (minutes to 1 hour).
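Concretely, that might look like this (the keys, loaders, and durations are illustrative):

using System;
using System.Web;
using System.Web.Caching;

var cache = HttpRuntime.Cache;

// Long-lived application parameters.
cache.Insert("Countries", LoadCountries(), null,
    DateTime.UtcNow.AddHours(12), Cache.NoSlidingExpiration);

// Shorter-lived business data.
cache.Insert("OpenOrders", LoadOpenOrders(), null,
    DateTime.UtcNow.AddMinutes(10), Cache.NoSlidingExpiration);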
One last thing is always to challenge the amount of data you manipulate. Remember that the end-user won't read thousands of records at the same time.
Hope this will give you some guidance.
It's very hard to generalize this sort of thing. The only hard-and-fast rule to follow is not to waste time optimizing something unless you know it needs to be done. Beyond that, the proper course of action is going to be very much dependent on the nitty-gritty details of your application.
That said... I'll almost always cache global application parameters in some easy-to-use object. This is certainly more of a programming convenience than an optimization.
The one time I've written specific data caching code was for an app that interfaced with a very slow accounting database, and then it was read-only for data that didn't change very often. All writes went to the DB. With SQL Server, I've never run into a situation where the built-in ASP.NET-to-SQL Server interface was the slow part of the equation.