Akamai refresh cache before deployment and do cutover at specified time - cdn

My objective is to achieve zero downtime during deployment. My site uses Akamai as its CDN. Let's say I have a primary and a secondary cluster of IIS servers. During deployment, the updates are made to the secondary cluster. Before switching over from primary to secondary, can I ask Akamai to cache the new content and do the cutover at a specified time?

The problem you are going to have is guaranteeing that your content is cached on ALL Akamai servers. Is the issue that you want to force content to be refreshed as soon as you cut over?
There are a few options here.
1 - Use a version in the requests, e.g. "?v=1". The current version would ALWAYS be fetched from origin and appended to every request. As soon as you update your site, update the version on origin so that the next request appends "?v=2", thus "busting" the cache and forcing an origin hit for all requests.
2 - Change your Akamai config to "honor webserver TTLs". You can then set very low or almost 0 TTLs right before you cut over, and increase them gradually afterwards.
3 - Configure Akamai to use If-Modified-Since. This will force Akamai to "validate" whether any cached objects have changed.
4 - Use ECCU, which can purge a whole directory. This can take up to 40 minutes, but that should be manageable during a maintenance window.
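To make option 1 concrete, here is a minimal Python sketch of how asset URLs could be stamped with a version. The function name `with_version` and the parameter name `v` are made up for illustration, and the approach assumes the CDN is configured to include the query string in its cache key.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_version(url: str, version: str, param: str = "v") -> str:
    """Return `url` with `param=version` set, replacing any existing value.

    `with_version` and the default parameter name "v" are hypothetical;
    they only illustrate the "?v=1" -> "?v=2" idea.
    """
    parts = urlsplit(url)
    # Keep every existing query parameter except an older version marker.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, version))
    # urlunsplit puts the query before any #fragment, so anchors survive.
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Bumping the version at cutover changes every URL at once, so the old cached objects are simply never requested again and can age out on their own.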

I don't think this would be possible, based on my experience with Akamai (but things change faster than I can keep up with). You can flush the content manually (at a cost), so you could flush /*. We used to do this for particular files during deployments (never /*, because we had over 1.2M URLs). But I can't see how Akamai could cache a non-visible version of your site for instant cutover without some secondary domain and origin.
However I have also found that Akamai are pretty good to deal with and it would definitely be worth contacting them in relation to a solution.

Related

Cloudflare optimization techniques (free plan)

OK, so I'm trying to benefit from CF's free plan and squeeze as much as I can out of it. The main goal is to get the site served from the CF cache so it loads faster in the browser, if only for the first visit and for search engines. It is a WordPress site, so it can be a little slower than other sites.
So, to have CF cache properly I have set the following rules. You probably know that under the free plan 3 is the maximum:
https://example.com/wp-content/*
Browser Cache TTL: a year, Cache Level: Cache Everything, Edge Cache TTL: a month
https://example.com/wp-admin/*
Security Level: High, Cache Level: Bypass, Disable Apps, Disable Performance
https://example.com/*
Auto Minify: HTML, CSS & JS, Browser Cache TTL: 30 minutes, Cache Level: No Query String, Edge Cache TTL: 2 hours, Email Obfuscation: On, Automatic HTTPS Rewrites: On
Exactly in this order. These should allow CF to cache the files stored in the wp-content (uploads etc) for the maximum amount of time, then ignore and bypass the wp-admin and finally serve all the others (products in my case, blog articles, pages and so on) from its cache, although these should have a shorter time. I've also set the caching level in the Cloudflare dashboard to 'No query string'.
So far CF caches all the above and first time visitors or search engines should get a super fast page.
Next, I've added the following in the site's footer:
<script>jQuery(document).ready(function(){var e="?"+(new Date).getTime();jQuery("a").each(function(){jQuery(this).attr("href",jQuery(this).attr("href")+e)})})</script>
This script appends the current timestamp to all links on the page. By doing this I want the visitor to get the latest version of the page (i.e. from my server), not the one stored by CF, because CF should not cache URLs such as https://example.com/samplepage?234523445345, as instructed previously in both the cache settings and the page rules.
Now, what I'm worried about is CF caching pages belonging to logged-in members, such as account details. While the query-string JavaScript does work, and members would click a link such as /account?23456456 so the page should not get cached, I have to wonder 'what if?'.
So, is there any better way to achieve what I am trying to (fast loading without caching members pages and sensitive details, such as shopping cart)? Or is this the maximum I can get out of the free plan?
If your site is entirely WordPress, it is much simpler to optimise than other platforms. Cloudflare has a newer service called Automatic Platform Optimization (APO). Enable it in your Cloudflare dashboard, install the Cloudflare plugin in WordPress, and connect the plugin to Cloudflare through APO. Then try to cache everything from your origin server. This will reduce the TTFB and RTT, and those two will definitely improve your site's performance and speed.

TTFB from very high to very low Cloudflare

We are using a WordPress setup hosted on Google Cloud, behind Cloudflare.
In Cloudflare we are using the page cache feature, which should decrease the TTFB substantially. What it basically does is cache every static page and serve it to the client directly. What puzzles me is that when I make a request in the morning, the TTFB is over 1 second. For all requests after that, the TTFB drops to 70 ms. That is a big difference. It almost feels like a browser cache when I visit a website for the second time. But after some time the TTFB spikes again to over 1 second, almost as if Cloudflare drops the cache. That's why we additionally set an Edge Cache TTL of 1 month, but I still have those daily spikes, and I think every user gets a TTFB over 1 second when visiting our site for the first time.
Any guesses why this is so random?
This is the guide directly from cloudflare about the page cache:
https://support.cloudflare.com/hc/en-us/articles/236166048-Caching-Static-HTML-with-WordPress-WooCommerce
Appreciate your help
I believe Cloudflare doesn't cache universally: one retrieval of a cacheable static resource does not put a copy on all Cloudflare servers. In fact, I believe Cloudflare caches per data center (the location shown in the ray ID), so the "1 second" TTFB is probably Cloudflare retrieving the page from your origin server because that data center hasn't cached it yet.
To confirm this, look at the response headers: there will be a CF-Cache-Status header that indicates HIT or MISS. You will probably see that it is always MISS for the 1 second+ requests. You should also see another header called CF-Ray that looks something like 5abb86fb2d6c9bc1-SJC, where SJC is the data center code. Verify that this data center is geographically close to you, to make sure your DNS is set up correctly to reach a nearby Cloudflare server, per the site list here: https://www.cloudflarestatus.com/
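The header check described above could be scripted roughly like this. It is only a sketch: the helper names `cache_status` and `ray_datacenter` are invented here, and the code assumes the response headers have already been collected into a plain mapping by whatever HTTP client you use.

```python
from typing import Mapping, Optional


def cache_status(headers: Mapping[str, str]) -> Optional[str]:
    """Return the CF-Cache-Status value (e.g. "HIT" or "MISS"), if present."""
    for name, value in headers.items():
        if name.lower() == "cf-cache-status":
            return value.strip().upper()
    return None


def ray_datacenter(headers: Mapping[str, str]) -> Optional[str]:
    """Return the data center code at the end of CF-Ray, if present.

    For "5abb86fb2d6c9bc1-SJC" this is the part after the last dash.
    """
    for name, value in headers.items():
        if name.lower() == "cf-ray":
            _ray_id, sep, colo = value.strip().rpartition("-")
            return colo if sep else None
    return None
```

Logging both values next to the measured TTFB for a day or so should show whether the slow requests line up with MISS responses, and whether they all come from the same data center.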

How to display the cached version first and check the etag/modified-since later?

With caching headers I can either make the client not check online for updates for a certain period of time, and/or check for etags every time. What I do not know is whether I can do both: use the offline version first, but meanwhile in the background, check for an update. If there is a new version, it would be used next time the page is opened.
For a page that is completely static except for when the user changes it by themselves, this would be much more efficient than having to block checking the etag every time.
One workaround I thought of is using JavaScript: set headers to cache the page indefinitely and have some JavaScript make a request with an If-Modified-Since header or similar, which could then dynamically change the page. The big issue with this is that it cannot invalidate the existing cache, so it would have to keep dynamically updating the page theoretically forever. I'd also prefer to keep it pure HTTP (or HTML, if there is some tag that can do this), but I cannot find any relevant hits online.
A related question mentions "the two rules of caching": never cache HTML and cache everything else forever. Just to be clear, I mean to cache the HTML. The whole purpose of the thing I am building is for it to be very fast on very slow connections (high latency, low throughput, like EDGE). Every roundtrip saved is a second or two shaved off of loading time.
Update: reading more caching resources, it seems the Vary: Cookie header might do the trick in my case. I would like to know if there is a more general solution though, and I didn't really dig into the vary-header yet so I don't know yet if that works.
Solution 1 (HTTP)
There is a cache control extension stale-while-revalidate which describes exactly what you want.
When present in an HTTP response, the stale-while-revalidate Cache-
Control extension indicates that caches MAY serve the response in
which it appears after it becomes stale, up to the indicated number
of seconds.
If a cached response is served stale due to the presence of this
extension, the cache SHOULD attempt to revalidate it while still
serving stale responses (i.e., without blocking).
cache-control: max-age=60,stale-while-revalidate=86400
When the browser first requests the page, it caches the result for 60s. During that 60s period, requests are answered from the cache without contacting the origin server. During the next 86400s, content is served from the cache while a fresh copy is fetched from the origin server in the background. Only once both periods (60s + 86400s) have expired will the cache stop serving the cached content and wait for fresh data from the origin server.
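The three phases can be written down as a small decision function. This is a sketch of the logic a cache would apply, not code from any real cache; the names `parse_cache_control` and `cache_decision` are made up, and the parser is deliberately simplified.

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control value into {directive: argument-or-None}.

    Simplified on purpose: commas inside quoted arguments are not handled.
    """
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') or None
    return directives


def cache_decision(header: str, age: int) -> str:
    """Say what a cache may do with a stored response that is `age` seconds old."""
    directives = parse_cache_control(header)
    max_age = int(directives.get("max-age") or 0)
    swr = int(directives.get("stale-while-revalidate") or 0)
    if age < max_age:
        return "serve from cache"
    if age < max_age + swr:
        return "serve stale, revalidate in background"
    return "wait for origin"
```

With the header from the example, an age of 30 gives "serve from cache", an age of 61 gives "serve stale, revalidate in background", and an age of 90000 gives "wait for origin".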
This solution has only one drawback. I was not able to find any browser or intermediate cache which currently supports this cache control extension.
Solution 2 (Javascript)
Another solution is to use Service Workers, which can produce custom responses to requests. Combined with the Cache API, that is enough to provide the requested feature.
The problem is that this solution works only in browsers (not in intermediate caches or other HTTP services), and not all browsers support Service Workers and the Cache API.

Long waiting (TTFB) time for scripts / styles on Azure Website

I have this intriguing problem on an Azure Website. My website uses 4 script files and 3 style files, each minified. They are not very big; the biggest is nearly 200 KB. The website has already started. Azure's Always On option is turned on. When I call the Web API for data, it returns in <50 ms.
But when the app is reloaded, it needs 250 ms just to get the first byte of the tiniest script, and the others need much more. The initial HTML loads in 60 ms. Scripts/styles are cached, so they are not downloaded, but the TTFB is killing performance. This repeats on every single reload. The app does not contain any sophisticated configuration, so it should run much faster than this.
What can cause such problems?
Although your static files are cached, the browser still issues requests with an If-Modified-Since header (which results in a 304).
While it doesn't need to download the actual content, it still needs to wait for the RTT plus server think time before it can continue.
I would suggest two things:
Adding Cache-Control and Expires headers - will help avoid 304 in some cases (pretty much unless you hit F5)
Using a proper CDN - such as Incapsula or others, that will minimize the RTT + think time. It can also be used to easily control cache settings for various resources.
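For the first suggestion, this is roughly what the two headers would look like for a static file. The sketch is in Python only to show the header values; on IIS/Azure you would set them through the server configuration rather than in code, and the function name `static_cache_headers` is made up.

```python
import time
from email.utils import formatdate
from typing import Optional


def static_cache_headers(max_age_seconds: int, now: Optional[float] = None) -> dict:
    """Build Cache-Control and Expires headers for a static asset.

    `now` is a Unix timestamp; it defaults to the current time and is
    only a parameter so the result can be reproduced.
    """
    if now is None:
        now = time.time()
    return {
        # Lets browsers reuse the file without asking the server at all.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # Older equivalent, formatted as an HTTP date in GMT.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }
```

While the response is still fresh according to these headers, a normal navigation should not send the If-Modified-Since request at all, which is where the saved round trip comes from.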
More good stuff here.
Good Luck!
From here:
As you saw earlier, IIS 7 caches the compressed versions of static
files. So, if a request arrives for a static file whose compressed
version is already in the cache, it doesn’t need to be compressed
again.
But what if there is no compressed version in the cache? Will IIS 7
then compress the file right away and put it in the cache? The answer
is yes, but only if the file is being requested frequently. By not
compressing files that are only requested infrequently, IIS 7 saves
CPU usage and cache space.
By default, a file is considered to be requested frequently if it is
requested two or more times per 10 seconds.
So, the reason your users are being served an uncompressed version of the javascript file is because it didn't meet the default threshold for being compressed; in other words, the javascript file was not requested 2 times within 10 seconds.
To control this, there is one attribute we must change on the <serverRuntime> element, which controls compression: frequentHitThreshold. In order for your file to be compressed when it is requested once, change your <serverRuntime> element to look like this:
<serverRuntime enabled="true" frequentHitThreshold="1" />
This will slightly impact CPU performance if you serve many JavaScript files to frequent visitors, but if visitors arrive often enough for compression to affect the CPU, the files are most likely already compressed and cached!
My guess would be Azure's Always On.
If it works anything like the one Cloudflare provides, it essentially proxies the request and tries to cache it.
Depending on the exact implementation of this cache on Azure's side, it might wait for the script's output to complete in order to cache it or validate the cache, and only then pass it on to the browser.
You might have some luck checking the caching configuration and disabling Always On for your scripts, if possible.
The scripts and styles are static files and by default are compressed (you can check this with the HTTP header "content-encoding": gzip) before being sent to the client. So the TTFB consists of network latency, browser HTTP channel scheduling, and the server's static file compression time.
On the other hand, your Web API data is dynamic and by default is not compressed, so its TTFB may well be lower than that of the static files.
However, you don't need to switch off static compression: doing so would minimize TTFB but extend the content transfer time. Actually, you don't need to worry about TTFB, see more info: https://blog.cloudflare.com/ttfb-time-to-first-byte-considered-meaningles/
I ended up storing the files in Azure Storage and serving them through Azure CDN. It provides fast responses and costs nothing. I upload them to blob storage on every publish, in a pre-build event run by Gulp.
Well... there are 2 main problems with your site:
You are using Azure, a high-priced service with poor performance... don't ask me why people think that this is a good service.
You are storing client files side by side with the server files. While server files should be stored on a specific server, client files can practically be served from... anywhere.
So please use a CDN (or any other server) for your client-side files (mainly CSS and JS; you may consider moving fonts and images as well).

I'm confused about HTTP caching

I've been thinking about batch reads and writes in a RESTful environment, and I think I've come to the realization that I have broader questions about HTTP caching. (Below I use commas (",") to delimit multiple record IDs, but that detail is not particular to the discussion.)
I started with this problem:
1. Single GET invalidated by batch update
GET /farms/123 # get info about Old MacDonald's Farm
PUT /farms/123,234,345 # update info on Old MacDonald's Farm and some others
GET /farms/123
How does a caching server in between the client and the Farms server know to invalidate its cache of /farms/123 when it sees the PUT?
Then I realized this was also a problem:
2. Batch GET invalidated by single (or batch) update
GET /farms/123,234,345 # get info about a few farms
PUT /farms/123 # update Old MacDonald's Farm
GET /farms/123,234,345
How does the cache know to invalidate the multiple-farm GET when it sees the PUT go by?
So I figured that the problem was really just with batch operations. Then I realized that any relationship could cause a similar problem. Let's say a farm has zero or one owners, and an owner can have zero or one farms.
3. Single GET invalidated by update to a related record
GET /farms/123 # get info about Old MacDonald's Farm
PUT /farmers/987 # Old MacDonald sells his farm and buys another one
GET /farms/123
How does the cache know to invalidate the single GET when it sees the PUT go by?
Even if you change the models to be more RESTful, using relationship models, you get the same problem:
GET /farms/123 # get info about Old MacDonald's Farm
DELETE /farm_ownerships/456 # Old MacDonald sells his farm...
POST /farm_ownerships # and buys another one
GET /farms/123
In both versions of #3, the first GET should return something like (in JSON):
farm: {
id: 123,
name: "Shady Acres",
size: "60 acres",
farmer_id: 987
}
And the second GET should return something like:
farm: {
id: 123,
name: "Shady Acres",
size: "60 acres",
farmer_id: null
}
But it can't! Not even if you use ETags appropriately. You can't expect the caching server to inspect the contents for ETags -- the contents could be encrypted. And you can't expect the server to notify the caches that records should be invalidated -- caches don't register themselves with servers.
So are there headers I'm missing? Things that indicate a cache should do a HEAD before any GETs for certain resources? I suppose I could live with double-requests for every resource if I can tell the caches which resources are likely to be updated frequently.
And what about the problem of one cache receiving the PUT and knowing to invalidate its cache and another not seeing it?
Cache servers are supposed to invalidate the entity referred to by the URI on receipt of a PUT (but as you've noticed, this doesn't cover all cases).
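To illustrate why exact-URI invalidation doesn't cover the batch case, here is a toy in-memory cache in Python. Everything in it (`ToyHttpCache`, `Origin`, the version counter) is invented for the illustration and is far simpler than a real HTTP cache.

```python
class Origin:
    """Stand-in origin server: any non-GET request bumps a global version."""

    def __init__(self):
        self.version = 1

    def __call__(self, method, uri):
        if method != "GET":
            self.version += 1
            return "updated"
        return f"{uri} @ v{self.version}"


class ToyHttpCache:
    """Caches GET responses by exact URI; a PUT/POST/DELETE that passes
    through invalidates only the URI it was sent to."""

    def __init__(self):
        self._store = {}

    def handle(self, method, uri, origin):
        if method == "GET":
            if uri not in self._store:
                self._store[uri] = origin(method, uri)
            return self._store[uri]
        response = origin(method, uri)
        self._store.pop(uri, None)  # exact match only
        return response


origin, cache = Origin(), ToyHttpCache()
cache.handle("GET", "/farms/123", origin)
cache.handle("GET", "/farms/123,234,345", origin)
cache.handle("PUT", "/farms/123", origin)
single = cache.handle("GET", "/farms/123", origin)         # refetched, sees v2
batch = cache.handle("GET", "/farms/123,234,345", origin)  # still the v1 copy
```

After the PUT, the single-farm GET is fetched again and sees the new version, while the batch GET keeps returning the stale copy, because nothing tells the cache that the two URIs are related.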
Aside from this you could use cache control headers on your responses to limit or prevent caching, and try to process request headers that ask if the URI has been modified since last fetched.
This is still a really complicated issue and in fact is still being worked on (e.g. see http://www.ietf.org/internet-drafts/draft-ietf-httpbis-p6-cache-05.txt)
Caching within proxies doesn't really apply if the content is encrypted (at least with SSL), so that shouldn't be an issue (still may be an issue on the client though).
The HTTP protocol supports a request header called "If-Modified-Since", which basically allows the caching server to ask the web server whether the item has changed. HTTP also supports "Cache-Control" headers in server responses, which tell cache servers what to do with the content (such as never cache this, or assume it expires in 1 day, etc).
Also you mentioned encrypted responses. HTTP cache servers cannot cache SSL because to do so would require them to decrypt the pages as a "man in the middle." Doing so would be technically challenging (decrypt the page, store it, and re-encrypt it for the client) and would also violate the page security causing "invalid certificate" warnings on the client side. It is technically possible to have a cache server do it, but it causes more problems than it solves, and is a bad idea. I doubt any cache servers actually do this type of thing.
Unfortunately HTTP caching is based on exact URIs, and you can't achieve sensible behaviour in your case without forcing clients to do cache revalidation.
If you had:
GET /farm/123
POST /farm_update/123
You could use the Content-Location header to indicate that the second request modified the first resource. AFAIK you can't do that with multiple URIs, and I haven't checked whether this works at all in popular clients.
The solution is to make pages expire quickly and to handle If-Modified-Since or ETag with a 304 Not Modified status.
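A minimal sketch of the ETag half of that, in Python. The names `make_etag` and `respond`, the hash-based tag, and the 10-second `max-age` are all assumptions picked for the example; a real server would plug this into its own request handling.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body (hypothetical scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, dict, bytes]:
    """Return (status, headers, body), answering 304 when the client's
    If-None-Match value already names the current representation."""
    etag = make_etag(body)
    # Short lifetime so caches come back and revalidate often.
    headers = {"ETag": etag, "Cache-Control": "max-age=10"}
    if if_none_match is not None:
        candidates = []
        for tag in if_none_match.split(","):
            tag = tag.strip()
            # If-None-Match uses weak comparison, so ignore a W/ prefix.
            candidates.append(tag[2:] if tag.startswith("W/") else tag)
        if "*" in candidates or etag in candidates:
            return 304, headers, b""
    return 200, headers, body
```

The first request gets a 200 with the ETag; when the same tag comes back in If-None-Match and the farm hasn't changed, the answer is an empty 304, so the extra round trip stays cheap even though the content itself is small.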
You can't cache dynamic content (without drawbacks), because... it's dynamic.
In re: SoapBox's answer:
I think If-Modified-Since is the two-stage GET I suggested at the end of my question. It seems like an OK solution where the content is large (i.e. where the cost of doubling the number of requests, and thus the overhead, is overcome by the gains of not re-sending content). That isn't true in my example of Farms, since each Farm's information is short.
It is perfectly reasonable to build a system that sends encrypted content over an unencrypted (HTTP) channel. Imagine the scenario of a Service Oriented Architecture where updates are infrequent and GETs are (a) frequent, (b) need to be extremely fast, and (c) must be encrypted. You would build a server that requires a FROM header (or, equivalently, an API key in the request parameters), and sends back an asymmetrically-encrypted version of the content for the requester. Asymmetric encryption is slow, but if properly cached, beats the combined SSL handshake (asymmetric encryption) and symmetric content encryption. Adding a cache in front of this server would dramatically speed up GETs.
A caching server could reasonably cache HTTPS GETs for a short period of time. My bank might put a cache-control of about 5 minutes on my account home page and recent transactions. I'm not terribly likely to spend a long time on the site, so sessions won't be very long, and I'll probably end up hitting my account's main page several times while I'm looking for that check I recently sent off to SnorgTees.

Resources