When a background CDN cache is updated with fresh data from the origin, is only the cache for that specific POP updated, or does the update propagate through the whole CDN so that other POPs also serve the fresh data?
It does not propagate: only the POP that revalidated gets the fresh data. You can see why pushing every refresh to the whole CDN would be a scaling issue:
Let's say the CDN you are using has N locations.
If you revalidate the cache in one of them and the CDN were to distribute that fresh content to all edges, it would have to perform (N-1) requests, because every other edge has to be told to update.
As N grows (more regions), the CDN will take more and more time to update all locations.
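For a rough sense of the fan-out, here is a conceptual sketch only; the edge hostnames and the /purge endpoint are invented for illustration, not any real CDN API:

    // One request per remaining edge: (N - 1) requests for N locations.
    async function propagateRevalidation(edges, updatedEdge, path) {
      const others = edges.filter((edge) => edge !== updatedEdge);
      await Promise.all(
        others.map((edge) =>
          fetch(`https://${edge}/purge${path}`, { method: "POST" })
        )
      );
    }

    // With 100 POPs, a single revalidation fans out into 99 purge requests.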
To reduce the number of API calls we're making on our website, I built in a cache using ASP.Net's caching library.
The cache is currently set for 30 minutes.
The problem we're running into is that the download links seem to expire before then (or hit some kind of limit), and users get a page telling them they don't have access to the download.
Now, if I reset the cache, it works.
So I'm looking for help / advice on how to better handle this.
1) I could just not cache anything and make each page load an individual API call, but that seems like unnecessary overhead.
2) I could reduce the cache lifetime.
3) I could build a separate service on our end that goes around the cache when the user clicks to download.
I'm a fan of #2 and #3, but I wondered if someone else could help offer some suggestions.
Thanks!
-Eric
With caching headers I can either make the client not check online for updates for a certain period of time, and/or check for etags every time. What I do not know is whether I can do both: use the offline version first, but meanwhile in the background, check for an update. If there is a new version, it would be used next time the page is opened.
For a page that is completely static except for when the user changes it by themselves, this would be much more efficient than having to block checking the etag every time.
One workaround I thought of is using Javascript: set headers to cache the page indefinitely and have some Javascript make a request with an If-Modified-Since or something, which could then dynamically change the page. The big issue with this is that it cannot invalidate the existing cache, so it would have to keep dynamically updating the page theoretically forever. I'd also prefer to keep it pure HTTP (or HTML, if there is some tag that can do this), but I cannot find any relevant hits online.
A related question mentions "the two rules of caching": never cache HTML and cache everything else forever. Just to be clear, I mean to cache the HTML. The whole purpose of the thing I am building is for it to be very fast on very slow connections (high latency, low throughput, like EDGE). Every roundtrip saved is a second or two shaved off of loading time.
Update: reading more caching resources, it seems the Vary: Cookie header might do the trick in my case. I would still like to know if there is a more general solution, though; I haven't really dug into the Vary header yet, so I don't know whether it works.
Solution 1 (HTTP)
There is a Cache-Control extension, stale-while-revalidate, which describes exactly what you want.
When present in an HTTP response, the stale-while-revalidate Cache-Control extension indicates that caches MAY serve the response in which it appears after it becomes stale, up to the indicated number of seconds.

If a cached response is served stale due to the presence of this extension, the cache SHOULD attempt to revalidate it while still serving stale responses (i.e., without blocking).
cache-control: max-age=60,stale-while-revalidate=86400
When the browser first requests the page, it caches the result for 60s. During that 60s period, requests are answered from the cache without contacting the origin server. During the next 86400s, content is still served from the cache while a fresh copy is fetched from the origin server in the background. Only once both periods (60s + 86400s) have expired will the cache stop serving the cached content and wait for fresh data from the origin server.
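The origin just needs to send that header on the page response. Here is a minimal sketch assuming a plain Node.js origin (any server that can set response headers works the same way):

    // Minimal sketch: serve the page with a short max-age plus a long
    // stale-while-revalidate window, as described above.
    const http = require("http");

    http.createServer((req, res) => {
      res.setHeader("Cache-Control", "max-age=60, stale-while-revalidate=86400");
      res.setHeader("Content-Type", "text/html");
      res.end("<html><body>mostly static page</body></html>");
    }).listen(8080);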
This solution has only one drawback: I was not able to find any browser or intermediate cache that currently supports this Cache-Control extension.
Solution 2 (Javascript)
Another solution is to use Service Workers, which can construct custom responses to requests. Combined with the Cache API, that is enough to provide the requested behavior.
The problem is that this solution only works in browsers (not in intermediate caches or other HTTP services), and not all browsers support Service Workers and the Cache API.
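As a rough sketch of that approach (assuming the page registers a worker from sw.js; the cache name "pages" is arbitrary and there is no error handling), the worker answers from the Cache API immediately and refreshes the cached copy in the background, so a new version shows up on the next visit:

    // sw.js - hand-rolled stale-while-revalidate in a Service Worker:
    // respond from the cache first, then refresh the cache in the
    // background so the next page load gets the new version.
    self.addEventListener("fetch", (event) => {
      event.respondWith(
        caches.open("pages").then(async (cache) => {
          const cached = await cache.match(event.request);
          const refresh = fetch(event.request).then((response) => {
            cache.put(event.request, response.clone());
            return response;
          });
          // Serve the stale copy right away if we have one,
          // otherwise wait for the network.
          return cached || refresh;
        })
      );
    });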
I have this in my log:
2016-01-07 12:22:38,720 WARN [alfresco.cache.immutableEntityTransactionalCache] [http-apr-8080-exec-5] Transactional update cache 'org.alfresco.cache.immutableEntityTransactionalCache' is full (10000).
and I do not want to just increase this parameter without knowing what is really going on and without better insight into Alfresco cache best practices!
FYI:
The warning appears when I list the elements of the document library root folder in a site. Note that the site has ~300 docs/folders at that level, several of which are involved in current workflows, and I am fetching all of them in one single call (client-side paging).
I am using an Alfresco CE 4.2.c instance with around 8k nodes.
I've seen this in my logs whenever I do a "big" transaction, by which I mean making a change to 100+ files in a batch.
Quoting Axel Faust:
The performance degradation is the reason that log message is a warning. When the transactional cache size is reached, the cache handling can no longer handle the transaction commit properly, and before any stale / incorrect data is put into the shared cache, it will actually empty out the entire shared cache. The next transaction(s) will suffer bad performance due to cache misses...
Cache influence on Xmx depends on what the cache does, unfortunately. The property value cache should have little impact since it stores granular values, but the node property cache would have quite a different impact as it stores the entire property map. I only have hard experience data from node cache changes, and for that we calculated an additional need of 3 GiB for an increase to four times the standard cache size.
It is very common to get these warnings.
I do not think that it is a good idea to change the default settings.
You can probably change your code instead, if possible.
As described in this link to the Alfresco forum by one of the Alfresco engineers, the values suggested by Alfresco are "sane". They are designed to work well in standard cases.
You can decide to change them, but you have to be careful, because you can end up with worse performance than you would get by doing nothing.
I would suggest investigating why your use of this webscript is causing the cache overflow and checking whether you can do something about it. The fact that you are retrieving 300 documents/folders at the same time is likely to be the cause.
In the following article you can find out how to troubleshoot and solve issues with the cache.
Alfresco cache tuning
As described in that article, I would suggest increasing the log level for ehcache:
org.alfresco.repo.cache.EhCacheTracerJob=DEBUG
Or selectively add the name of the cache that you want to monitor.
I have an ASP.NET web page generated by the server (IIS 8.5); it displays some graphs based on data stored in the back end. I manually updated the database (bulk-inserted some data) and refreshed the browser, but the page does not show the new data.
I think it is a cache problem, since the new data appears when I press Ctrl+F5. So how should I solve this problem? Do I need to do something on the web server?
You can control browser caching via the Expires, Cache-Control, Last-Modified and ETag headers.
Take a look at these two Google Developers pages.
If you want to avoid stale caching at any cost, include a unique token in your image URLs that changes every time the image content changes, for example:
http://example.test/path/to/image/graph1.png?version=2014-3-19
With version changing every time you update the image, each change produces a new URL, so it is guaranteed not to be served from the cache. Be careful with this technique, though: using it when it is not really needed can lead to long loading times (since, well, you have effectively disabled caching of the images).
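A minimal sketch of that idea (the helper name and the use of a last-updated timestamp are just assumptions for illustration): derive the version token from whenever the underlying data last changed, so the URL only changes when the image actually does.

    // Hypothetical helper: build an image URL whose version token changes
    // only when the underlying data changes, so browsers re-fetch it then
    // and keep caching it the rest of the time.
    function versionedImageUrl(baseUrl, lastUpdated) {
      const version = encodeURIComponent(lastUpdated.toISOString());
      return baseUrl + "?version=" + version;
    }

    // e.g. lastUpdated taken from the newest row in the database
    versionedImageUrl("http://example.test/path/to/image/graph1.png",
                      new Date("2014-03-19"));
    // -> .../graph1.png?version=2014-03-19T00%3A00%3A00.000Z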
My objective is to achieve zero downtime during deployment. My site uses Akamai as a CDN. Let's say I have a primary and a secondary cluster of IIS servers. During deployment, the updates are made to the secondary cluster. Before switching over from primary to secondary, can I ask Akamai to cache the content and do a cutover at a specified time?
The problem you are going to have is guaranteeing that your content is cached on ALL Akamai servers. Is the issue that you want to force content to be refreshed as soon as you cut over?
There are a few options here.
1 - Use a version in the requests, "?v=1". This version would ALWAYS be requested from the origin and would be appended to every request. As soon as you update your site, update the version on the origin, so that the next request appends "?v=2", thus "busting" the cache and forcing an origin hit for all requests.
2 - Change your Akamai config to "honor webserver TTLs". You can then set very low or almost 0 TTLs right before you cut over and increase them gradually afterwards.
3 - Configure Akamai to use If-Modified-Since. This will force Akamai to "validate" whether any content has changed.
4 - Use ECCU, which can purge a whole directory. This can take up to 40 minutes, but it should be manageable during a maintenance window.
I don't think this would be possible, based on my experience with Akamai (but things change faster than I can keep up with). You can flush the content manually (at a cost), so you could flush /*. We used to do this for particular files during deployments (never /*, because we had over 1.2M URLs), but I can't see how Akamai could cache a non-visible version of your site for an instant cutover without having some secondary domain and origin.
However, I have also found that Akamai are pretty good to deal with, and it would definitely be worth contacting them about a solution.