Priming the asp.net output cache - asp.net

Is there a way to programmatically prime the asp.net output cache? I've investigated the caching API and can't seem to find an obvious way to do this. Has anyone tried something like this? If so, what method did you use?

I gave this some thought last year and concluded that it was not that important for my case. If it is important for your website, all you have to do is request the pages from somewhere like the Application_Start event (after all other startup code has run). But you shouldn't stop there!
The cache will eventually expire, so you should set up some way to cache the pages again before any client requests them.
Make the output cache dependent on some other object in the cache and set an expiration callback.
Thus, when that cache object expires, so do your pages, and you should then make HTTP requests to the pages you want to recache, and so on.
I'm answering this question, but the amount of effort involved and the question marks I still have in my mind lead me to advise against going through with this...
UPDATE
The only kind of dependency you can set in the OutputCache directive is a SQL dependency. Use it if you want, but if you need your output cache to depend on some other business object, this might get very difficult. I could tell you to set up a database object, make your output cache depend on it, and expire it yourself using some kind of timer.
Man, the longer I write, the more solutions and difficulties I find! I can't write a book about something that is not worth your precious time. Believe me, the usefulness of this will be nearly zero.

Priming the cache is, as others have suggested, as easy as requesting the pages you want cached. Of course, if you do this programmatically it will only request the HTML and not all the linked resources (CSS, JavaScript, images...), which is a good thing because it avoids wasted bandwidth.
For many websites the cached items that carry the biggest performance cost are common to many or all pages. For example, a navigation system on a large CMS or storefront may query the database and do a bunch of rendering work which can then be cached for all pages. Also, a big part of the initial load in ASP.NET comes when the website is first accessed and loaded into memory. Both of these issues can be addressed by calling even a single page on your site, but there is nothing stopping you from making a list of URLs and calling each one periodically.
If your cache policy is set for a 20 minutes timeout, maybe request each page once every 17-18 minutes.
Here are some resources with source code to help you get started:
Good Simple Primer on requesting web URL in C#
Website Monitoring Windows Service
Asyncronous Website Monitor
As I mentioned before, you can easily extend these to "foreach" over an array or list of URLs to be requested.
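As a rough illustration of the "list of URLs, requested periodically" idea, here is a minimal sketch. It is written in Python rather than C# purely for brevity, and the names prime_cache, prime_forever and fetch_status are made up for this example. The 17-minute default assumes the 20-minute cache timeout mentioned above.

```python
import time
import urllib.request
from typing import Callable, Dict, Iterable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Request one URL and return its HTTP status code (HTML only, no linked resources)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # read the whole body so the server renders the full page
        return response.status


def prime_cache(urls: Iterable[str],
                fetch: Callable[[str], int] = fetch_status) -> Dict[str, object]:
    """Request every URL once; record the status code or the exception for each."""
    results: Dict[str, object] = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except Exception as exc:  # one failing page should not stop the rest
            results[url] = exc
    return results


def prime_forever(urls, interval_seconds: float = 17 * 60,
                  fetch: Callable[[str], int] = fetch_status,
                  sleep: Callable[[float], None] = time.sleep) -> None:
    """Re-prime on a fixed interval that is shorter than the cache timeout."""
    url_list = list(urls)
    while True:
        prime_cache(url_list, fetch)
        sleep(interval_seconds)
```

The fetch function is a parameter so the loop can be exercised without a network; in a real deployment this would more likely live in a scheduled task or Windows service, as in the linked resources.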

Related

How to invalidate browser cache using just configuration in the webserver?

For a long time I've been updating ASP.NET pages on the server and have never found the correct way to make changes visible for files like CSS and images.
I know that if I append something to the URL the browser will think the file is a different one:
<img src="/images/myLogo.png?v=1"/>
or perhaps changing its name:
<img src="/images/myLogo.v1.png"/>
Unfortunately this does not look like the correct way. In a case where I'm using App_Themes, the files in this folder are automatically injected into the page in a way that does not let me easily change the URL.
So my question is:
When I'm publishing the ASP.NET application on the server, what is the correct way to signal to IIS (which would then notify the browser) that a file was changed? Is it not automatic? Should I change some configuration in IIS or perhaps add some "decoration" in the code?
I've already tried many questions here on SO like "ASP.NET - Invalidate browser cache", "How to refresh the browser cache of an image?", "Handle cached images? How to get the browser to show the new version?", and even "What is an elegant way to force browsers to reload cached CSS/JS files?", but none of them takes any approach other than handling it manually in the code instead of in IIS or ASP.NET configuration.
The closest I could find is "Asking browsers to cache our images (ASP.NET/IIS)", where they set an expiration, but not one based on whether the files were updated. Instead they cache the files for a number of days or hours, so they would be reloaded even when no changes were made.
I want to know if IIS or ASP.NET offers something related to this, automatically telling the browser that the files were changed. Is it possible/built in?
The options you have to update an item cached on the browser side are:
1. Change the file name
2. Add a URL parameter
3. Cache it for a limited time (e.g. for a couple of hours)
4. Compare the file's date-time (Last-Modified / If-Modified-Since)
5. Signal with an ETag
With the first three you avoid one server call for each item, but the third option loads the item again after some time.
With the others you have to make one call to the server to see whether the item needs to be loaded again.
So you cannot have everything here; there is no single correct way, and you need to choose what is best for you and what you are able to do. The fastest from the client's perspective are options (1) and (2).
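Options (1) and (2) amount to making the URL change whenever the file does. A minimal sketch of option (2), assuming the version is a short hash of the file content (the helper name versioned_url is hypothetical, and Python is used only for illustration):

```python
import hashlib
from pathlib import Path


def versioned_url(url_path: str, file_path: Path) -> str:
    """Append a short content hash as a query parameter (?v=<hash>)."""
    digest = hashlib.sha256(file_path.read_bytes()).hexdigest()[:8]
    # Keep any query string that is already present on the URL.
    separator = "&" if "?" in url_path else "?"
    return f"{url_path}{separator}v={digest}"
```

Because the value is derived from the content, the URL stays the same across deployments that do not touch the file, and changes as soon as the file does.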
The direct answer to your question is to use an ETag, or a date-time comparison on the file, but that way you still pay for a call to the server; you only save the size of what travels back.
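To make the ETag trade-off concrete, here is a simplified sketch of what the server does on each request. The function names are made up for this example, and a real server (IIS included) handles more cases than this:

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(content: bytes) -> str:
    """Build an ETag from the content hash; HTTP requires the surrounding quotes."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def _normalise(tag: str) -> str:
    """Strip whitespace and the weak-validator prefix before comparing."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def respond(content: bytes,
            if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honouring If-None-Match."""
    etag = make_etag(content)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [_normalise(tag) for tag in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # client copy is current: no body is sent
    return 200, headers, content
```

The request still reaches the server every time; what the 304 saves is only the response body.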
Some more links:
http eTag
How do I support ETags in ASP.NET MVC?
Configuring ETags with Http module in asp.net
How to control web page caching, across all browsers?
Jquery getScript caching
and you can find even more.

Plone yields 404 on new content save

The issue: a content editor saves a new content item and gets a 404 on the proper-looking url for the new object. If they then refresh, the item is there, perfectly normal.
This happens for multiple Archetypes-based content types, and we've seen it on at least two different sites. We've seen it on Plone 3.x and 4.0.3. Here's what these sites have in common:
HAProxy load balancing (with and without session affinity)
Multiple ZEO clients
Using either ZODB 3.9.7 or 3.8.4
The issue happens only some of the time, maybe for 1 out of 4 content items
Has anyone seen anything like this?
I do not have an answer for you; this should not really happen. I certainly have not seen this.
You'll need to gather more information to troubleshoot this, and that perhaps requires interactive access to experts, and SO is not the place for such troubleshooting.
All I can do is advise that you gather as much information as possible, including a full trail of the user interaction from the various logs, including HAProxy and the ZEO server.
It may require additional instrumentation at the server level (when the NotFound error occurs, dump additional information about what is present, etc).
Some recommendations/questions:
Verify that the content object was, indeed, created.
Check that the content views are correct (the ones that are declared in profiles/default/types/yourtype.xml).
Does it happen when adding content directly to a Plone instance (without caching and load balancing)?
Does it happen when adding content to a Plone instance with load balancing, but without caching? And so on...
Maybe not an elegant approach, but you might try inserting print statements or pdb breakpoints in the code so you can track whether a content object was, indeed, created. Do this only as a desperate method of "instrumentation".
Yes. we have recently started seeing the same issue. We have almost the same setup. Haproxy (no session affinity).
I'm wondering, since the pattern seems to be HAProxy... perhaps it's an issue with a request being redistributed after a timeout?
Updated:
We had this issue. It happens when you have a redirect back to the changed object after a save. This is because the second request hits another ZEO client which doesn't realise it's out of date.
The only solution we found was to add temporary session affinity in HAProxy (20s) during any POST. Not ideal, but it does work. I was just searching for a better solution, which is why I found this old post.
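To show the idea behind "temporary affinity after a POST", here is a toy model of the routing rule. This is not HAProxy configuration; the class name and the backend names in the usage note are made up, and Python is used only to illustrate the logic:

```python
import itertools
from typing import Dict, List, Tuple


class StickyAfterPostRouter:
    """Round-robin routing, except a client is pinned to one backend after a POST."""

    def __init__(self, backends: List[str], stick_seconds: float = 20.0):
        self._cycle = itertools.cycle(backends)
        self._stick_seconds = stick_seconds
        # client id -> (backend, time at which the pin expires)
        self._pinned: Dict[str, Tuple[str, float]] = {}

    def route(self, client_id: str, method: str, now: float) -> str:
        pinned = self._pinned.get(client_id)
        if pinned and now < pinned[1]:
            backend = pinned[0]  # still inside the affinity window
        else:
            self._pinned.pop(client_id, None)
            backend = next(self._cycle)
        if method.upper() == "POST":
            # The redirect that follows a save lands on the same backend.
            self._pinned[client_id] = (backend, now + self._stick_seconds)
        return backend
```

With two backends, a client that POSTs at time 0 keeps getting the same backend for any request before time 20, then falls back to round-robin.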

ASP.NET: Legitimate architecture/HttpModule concern?

An architect at my work recently read Yahoo!'s Exceptional Performance Best Practices guide, where it says to use a far-future Expires header for resources used by a page such as JavaScript, CSS, and images. The idea is that you set an Expires header for these resources years into the future so they're always cached by the browser, and whenever you change a file and therefore need the browser to request the resource again instead of using its cache, you change the filename by adding a version number.
Instead of incorporating this into our build process though, he has another idea. Instead of changing file names in source and on the server disk for each build (granted, that would be tedious), we're going to fake it. His plan is to set far-future expires on said resources, then implement two HttpModules.
One module will intercept all the Response streams of our ASPX and HTML pages before they go out, look for resource links and tack on a version parameter that is the file's last modified date. The other HttpModule will handle all requests for resources and simply ignore the version portion of the address. That way, the browser always requests a new resource file each time it has changed on disk, without ever actually having to change the name of the file on disk.
Make sense?
My concern relates to the module that rewrites the ASPX/HTML page Response stream. He's simply going to apply a bunch of Regex.Replace() on "src" attributes of <script> and <img> tags, and "href" attribute of <link> tags. This is going to happen for every single request on the server whose content type is "text/html." Potentially hundreds or thousands a minute.
I understand that HttpModules are hooked into the IIS pipeline, but this has got to add a prohibitive delay in the time it takes IIS to send out HTTP responses. No? What do you think?
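For reference, the two steps described above could look roughly like this. It is a deliberately naive sketch in Python rather than an HttpModule; the function names are invented, and the version_for callback stands in for "look up the file's last modified date":

```python
import re
from typing import Callable

# Matches src="..." or href="..." on <script>, <img> and <link> tags.
# Only double-quoted values without an existing query string or fragment.
_RESOURCE_ATTR = re.compile(
    r'(<(?:script|img|link)\b[^>]*?\b(?:src|href)=")([^"?#]+)(")',
    re.IGNORECASE,
)


def add_versions(html: str, version_for: Callable[[str], str]) -> str:
    """First module: append ?v=<version> to each matched resource URL."""
    def replace(match):
        url = match.group(2)
        return f"{match.group(1)}{url}?v={version_for(url)}{match.group(3)}"
    return _RESOURCE_ATTR.sub(replace, html)


def strip_version(url: str) -> str:
    """Second module: ignore the version portion of the requested address."""
    return re.sub(r"\?v=[^&#]*$", "", url)
```

Even this small pattern has corner cases: it skips single-quoted attributes, and it would rewrite a data-src attribute instead of the real src on the same tag. It also has to scan the whole response body on every text/html request.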
A few things to be aware of:
If the idea is to add a query string to the static file names to indicate their version, unfortunately that will also prevent caching by the kernel-mode HTTP driver (http.sys).
Scanning each entire response based on a bunch of regular expressions will be slow, slow, slow. It's also likely to be unreliable, with hard-to-predict corner cases.
A few alternatives:
Use control adapters to explicitly replace certain URLs or paths with the current version. That allows you to focus specifically on images, CSS, etc.
Change folder names instead of file names when you version static files
Consider using ASP.NET skins to help centralize file names. That will help simplify maintenance.
In case it's helpful, I cover this subject in my book (Ultra-Fast ASP.NET), including code examples.
He's worried about stuff not being cached on the client. Obviously this depends somewhat on how the user population has their browsers configured; if it's the default config then I doubt you'd need to worry about trying to second-guess the client caching. It's too hard and the results aren't guaranteed, and it's not going to help new users either.
As far as the HTTP modules go, in principle I would say they are fine, but you'll want them to be blindingly fast and efficient if you take that track; it's probably worth trying out. I can't speak to the appropriateness of using Regex to do what you want done inside, though.
If you're looking for high performance, I suggest you (or your architect) do some reading (and I don't mean that in a nasty way). I learnt something recently which I think will help. Let me explain (and maybe you guys know this already).
Browsers only hold a limited number of simultaneous connections open to a specific hostname at any one time, e.g. IE8 will only open 6 connections to, say, www.foo.net (IE6 and IE7 allow just 2).
If you call your images from, say, images.foo.net, you get 6 new connections straight away.
The idea is to separate different content types onto different hostnames (css.foo.net, scripts.foo.net, ajaxcalls.foo.net); that way you make sure the browser is really working on your behalf.
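A small sketch of mapping content types to hostnames, using the example hostnames from this answer. The extension table and the function name resource_url are assumptions made for illustration:

```python
import posixpath

# Hypothetical mapping from file extension to the hostname that serves it.
HOST_BY_EXTENSION = {
    ".css": "css.foo.net",
    ".js": "scripts.foo.net",
    ".png": "images.foo.net",
    ".jpg": "images.foo.net",
    ".gif": "images.foo.net",
}
DEFAULT_HOST = "www.foo.net"


def resource_url(path: str) -> str:
    """Build an absolute URL whose hostname depends on the resource type."""
    extension = posixpath.splitext(path)[1].lower()
    host = HOST_BY_EXTENSION.get(extension, DEFAULT_HOST)
    return f"http://{host}{path}"
```

Keeping the mapping deterministic matters: if the same file were served from different hostnames on different pages, the browser would cache it once per hostname.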
http://code.google.com/p/talifun-web
StaticFileHandler - Serve Static Files in a cachable, resumable way.
CrusherModule - Serve compressed versioned JS and CSS in a cachable way.
You don't quite get kernel mode caching speed but serving from HttpRuntime.Cache has its advantages. Kernel Mode cache can't cache partial responses and you don't have fine grained control of the cache. The most important thing to implement is a consistent etag header and expires header. This will improve your site performance more than anything else.
Reducing the number of files served is probably one of the best ways to improve the speed of your website. The CrusherModule combines all the css on your site into one file and all the js into another file.
Memory is cheap, hard drives are slow, so use it!

Clear ASP.NET OutputCache across web applications

Is it possible to clear the output cache of one asp.net web application from inside another asp.net web application?
Reason being... We have several web applications structured like...
http://www.website.com/intranet/cms/
http://www.website.com/area1/
http://www.website.com/area2/
Pages in /area1/ and /area2/ are cached and are managed through /intranet/cms/. When a page is edited using /intranet/cms/ I want to clear it out of the cache in the appropriate /area#/ application.
I already tried using a VaryByCustom that looks up a GUID stored in the HttpContext.Cache, but that cache seems to be per web application, so it doesn't work.
Really if there were any way of passing data between web applications on a single server, that would solve my problem, since I can use that + VaryByCustom.
Thanks!
-Mike Thomas
The way I've done this in the past is to have a "hidden" page (in each of the /areaX sites) that does the flushing, reloading, etc. The page validates a shared secret query parameter before doing anything (to avoid DoS attacks). If the secret is valid, the page outputs an "OK" message once the operation is complete; if it is invalid, the page generates a 404 error.
If you want the flush to be on a per-item or per-group basis then add a second parameter that identifies that item/group.
This method is also server technology independent, and can be triggered by other management tools if required.
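The logic of such a hidden page is small. Here is a sketch of it as a plain function, independent of any web framework; the parameter names secret and item and the placeholder secret value are assumptions made for this example:

```python
import hmac
from typing import Callable, Dict, Optional, Tuple

SHARED_SECRET = "change-me"  # hypothetical; load the real value from configuration


def handle_flush(query: Dict[str, str],
                 flush: Callable[[Optional[str]], None],
                 secret: str = SHARED_SECRET) -> Tuple[int, str]:
    """Return (status, body): 404 for a bad secret, otherwise flush and answer OK."""
    supplied = query.get("secret", "")
    # compare_digest avoids leaking the secret through timing differences.
    if not hmac.compare_digest(supplied.encode(), secret.encode()):
        return 404, "Not Found"
    flush(query.get("item"))  # None means "flush everything"
    return 200, "OK"
```

The CMS application would then issue an HTTP request to this page on each /areaX site after an edit, passing the secret and, optionally, the item to flush.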
One way I know of doing this is by using a shared resource as a dependency, usually a file. When the file is changed, the cache is cleared. I think you can use HttpResponse.AddFileDependency for this.
However, in these cases it's usually better to use an out-of-process cache such as memcached. I haven't tested it myself, but this link deals with using memcached with OutputCache.
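The shared-file idea can be modelled outside ASP.NET as well. The sketch below is not HttpResponse.AddFileDependency itself, just an illustration of the mechanism: each application keeps its own in-memory cache and drops it when a sentinel file that all applications can see is touched. The class name is made up.

```python
import os
from typing import Any, Callable, Dict, Optional, Tuple


class FileDependentCache:
    """In-memory cache that empties itself when a shared sentinel file changes."""

    def __init__(self, sentinel_path: str):
        self._sentinel_path = sentinel_path
        self._seen_stamp = self._stamp()
        self._items: Dict[str, Any] = {}

    def _stamp(self) -> Optional[Tuple[int, int]]:
        """Return (mtime, size) of the sentinel, or None if it does not exist."""
        try:
            stat = os.stat(self._sentinel_path)
            return (stat.st_mtime_ns, stat.st_size)
        except FileNotFoundError:
            return None

    def get_or_add(self, key: str, factory: Callable[[], Any]) -> Any:
        stamp = self._stamp()
        if stamp != self._seen_stamp:  # sentinel was touched: drop everything
            self._items.clear()
            self._seen_stamp = stamp
        if key not in self._items:
            self._items[key] = factory()
        return self._items[key]
```

In this model the CMS application only has to rewrite or touch the sentinel file after an edit; each /areaX application notices on its next request.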

Incremental or on-demand sitemap.xml

After reading Jeff's article about the importance of sitemaps, I decided to generate one for my dynamic website.
I saw some articles about how to implement it with ASP.NET but every solution I saw showed how to generate it on the fly with an HTTP Handler.
But doesn't that solution mean that every time someone asks for the file, my code has to iterate through all my entries to regenerate it?
Wouldn't it be less resource consuming to generate it incrementally? For example on stackoverflow, every time a user adds a question, appending the new URL node?
You might want to cache the resulting XML and invalidate the cache whenever your site structure changes. This might lead to having a publish/subscribe mechanism for components of your web site, but in case of properly structured application this won't be a problem.
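A minimal sketch of "cache the XML and invalidate on change", in Python for illustration. The class name SitemapCache and the load_urls callback are hypothetical; in an ASP.NET site the handler would call the equivalent of get(), and the code that saves new content would call invalidate().

```python
from typing import Callable, Iterable, Optional
from xml.sax.saxutils import escape


def build_sitemap(urls: Iterable[str]) -> str:
    """Render a sitemap document with one <url><loc> entry per URL."""
    entries = "".join(f"<url><loc>{escape(u)}</loc></url>" for u in urls)
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
            f"{entries}</urlset>")


class SitemapCache:
    """Keeps the generated XML until the site structure changes."""

    def __init__(self, load_urls: Callable[[], Iterable[str]]):
        self._load_urls = load_urls
        self._xml: Optional[str] = None

    def get(self) -> str:
        if self._xml is None:  # regenerate only after an invalidation
            self._xml = build_sitemap(self._load_urls())
        return self._xml

    def invalidate(self) -> None:
        """Call this whenever content is added, removed or renamed."""
        self._xml = None
```

Requests for the sitemap then cost a single lookup, and the full iteration over all entries happens only once per change.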
You mean cache the result? Yes there's no reason you couldn't do that. Depending on the amount of traffic your site is getting it might be unnecessary but if you're doing it simply to improve your technique there's a number of ways to approach it.
