How to purge Firebase Hosting CDN cache for a path with wildcards or query strings (GET params)?

This question was migrated from Stack Overflow because it can be answered on Webmasters Stack Exchange.
Migrated 2 days ago.
Context
I have a Cloud Run service that generates images, served via Firebase Hosting. It benefits from the included Firebase Hosting CDN because, with Firebase Hosting:
Any requested static content is automatically cached on the CDN.
🎉
Creating CDN cache works well
The first time an image is requested, it is generated on the fly; subsequent requests get the CDN version.
This works amazingly well.
Purging one explicit URL works too
Sometimes I need to remove all live versions of an image. To do that, I can send a PURGE request, which removes the specific path from the CDN.
But, how to work with wildcards or query strings?
It seems that the Firebase CDN cache has a different key for each variation of a URL. For example, if I issue a request to a specific URL with the PURGE method:
curl -X PURGE https://my.firebasehosting.site.com/path/to/content
This will purge the CDN cache for that URL, but not for https://my.firebasehosting.site.com/path/to/content?someAdditionalStuffHere
What would be the way to PURGE URLs at locations like:
https://my.firebasehosting.site.com/path/to/content
https://my.firebasehosting.site.com/path/to/content?someAdditionalStuffHere
https://my.firebasehosting.site.com/path/to/content?width=200&height=100
https://my.firebasehosting.site.com/path/to/**
etc?
...In one go, without knowing all variants that have been generated? Is this even possible?
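The text above does not confirm any wildcard purge. One possible workaround, sketched here under a strong assumption, is for the image service to record every query string it has served for a path, so that each variant can be purged individually with the same per-URL PURGE request shown in the question. The helper names `buildPurgeUrls` and `purgeCommands` and the recorded query list are hypothetical.

```typescript
// Hypothetical helper: expand one path plus its recorded query strings
// into the full list of URLs that each need their own PURGE request.
function buildPurgeUrls(
  origin: string,
  path: string,
  recordedQueries: string[]
): string[] {
  const base = origin.replace(/\/+$/, "") + path;
  const urls: string[] = [base]; // the bare path, without a query string
  for (const query of recordedQueries) {
    const url = base + "?" + query.replace(/^\?/, ""); // accept "a=b" or "?a=b"
    if (urls.indexOf(url) === -1) {
      urls.push(url); // skip duplicates
    }
  }
  return urls;
}

// Turn each URL into the curl command from the question.
// URLs are single-quoted so that "&" is not interpreted by the shell.
function purgeCommands(urls: string[]): string[] {
  return urls.map(
    (url) => "curl -X PURGE '" + url.replace(/'/g, "'\\''") + "'"
  );
}

const variants = buildPurgeUrls(
  "https://my.firebasehosting.site.com",
  "/path/to/content",
  ["someAdditionalStuffHere", "width=200&height=100"]
);
console.log(purgeCommands(variants).join("\n"));
```

This still issues one request per variant; it only removes the need to remember the variants by hand.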

Related

Function to output HTML and store the result in firebase hosting

I want to respond to Firebase events to generate HTML pages (and keep them updated) and put them on Firebase Hosting so that they are immediately available for use. I have it working except for the part about uploading the resulting HTML to Firebase Hosting. It seems like I cannot do it this way, but I want to, so that all the pages are pre-rendered and ready to load fast.
I have cloud functions connected to hosting but that is the same old way of fetching from the database during a request cycle which I wanted to avoid.
On this page it says "Prerender your single-page apps to improve SEO." and that's what I want. Is it possible? How do I store the pre-rendered pages from an HTTP function?
The "Prerender your single-page apps to improve SEO." mentioned on that page means prerendering in the cloud before serving the content to the requesting party. It does not mean generating static files when data updates, before a request is even made. Generally, prerendering with appropriate caching headers is enough for most use cases.
If you really want to pregenerate all the pages whenever data changes, you could do that, but it will be more complicated. There are some good articles and guides about deploying to Firebase Hosting after continuous integration finishes. The general idea holds for what it sounds like you want, except that the build/deploy is triggered by a data change rather than a code change.
The way to pre-render HTML so that metadata such as JSON-LD is available to search engines and opengraph is available to social media platforms for rich cards in shared links is to use Cloud Functions. You basically run Express/Pug (previously Jade) in your cloud function(s) to respond with HTML after whatever database/datastore lookups have completed. I've implemented this and it works great.
The documentation page "Call functions via HTTP requests" provides some direction. You basically add some forwarding info to customize your hosting. This will direct HTTP calls over to your Express server running in Cloud Functions. Check the Firebase Functions GitHub repo for sample code.
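As a rough illustration of the pre-rendering idea, the sketch below builds the HTML string that such a function could send after its database lookup completes. It uses a plain template instead of Express/Pug to stay self-contained; the `Article` shape and the function names are hypothetical.

```typescript
// Hypothetical shape of the data loaded from the database.
interface Article {
  title: string;
  description: string;
  url: string;
}

// Escape text for use in HTML element content and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a page whose head carries Open Graph tags and JSON-LD metadata.
function renderArticlePage(article: Article): string {
  const jsonLd = JSON.stringify({
    "@context": "https://schema.org",
    "@type": "Article",
    headline: article.title,
    description: article.description,
    url: article.url,
  }).replace(/</g, "\\u003c"); // keep "</script>" inside data from closing the tag

  return [
    "<!doctype html>",
    "<html>",
    "<head>",
    "<title>" + escapeHtml(article.title) + "</title>",
    '<meta property="og:title" content="' + escapeHtml(article.title) + '">',
    '<meta property="og:description" content="' + escapeHtml(article.description) + '">',
    '<meta property="og:url" content="' + escapeHtml(article.url) + '">',
    '<script type="application/ld+json">' + jsonLd + "</script>",
    "</head>",
    "<body><h1>" + escapeHtml(article.title) + "</h1></body>",
    "</html>",
  ].join("\n");
}
```

In an Express-style handler, the returned string would be passed to the response's send method once the data has been fetched.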

Is it possible to host my Node.js server on Firebase cloud delivering by CDN?

I have seen that Google Firebase offers a static file hosting solution (for the front end) that is served over SSL and via a CDN. That means I can serve customers all around the world from a server that is probably close to them, with good speeds.
Now I want to do the same with my Node.js backend code.
That means, instead of hosting my backend code on my own VPS, which will probably be fast only for people who live close to my server, I want to deploy the same server to Firebase's CDN and, of course, over HTTPS.
What I have found so far is Firebase Functions, which is probably a Node.js server. However, I am not sure whether it runs on a CDN, so that it will be as fast as the static file serving, or whether it is just a server located somewhere in the US that has to serve the whole world.
In addition, if there is such a service where I can host my backend code with SSL, can I keep the "standard" Express configuration I have now on my VPS?
And what about clusters/workers? How many workers can I have when using the Firebase solution (if there is one like that)?
Thanks.
SSL and firebase functions & hosting?
You get HTTPS by default for Hosting and Functions. If you need functions to be served from your custom domain rather than https://us-central1-[projectname].cloudfunctions.net, you will need to configure your firebase.json file to rewrite your routes to your Firebase functions. The main thing to flag here is that with both options you get HTTPS, with certificates issued directly by Google/Firebase.
When you bring a custom domain over, it can take up to 1-2 hours for Firebase to issue the certificate, but this happens automatically without you having to do anything.
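A minimal sketch of such a rewrite is shown below, written as a TypeScript object so that it can be printed as JSON; the printed output is what would go into firebase.json. The function name `app`, the `public` directory, and the catch-all `**` source are assumptions for illustration.

```typescript
// Assumed firebase.json content: route every request on the Hosting
// domain to a Cloud Function exported under the hypothetical name "app".
const firebaseJson = {
  hosting: {
    public: "public",
    rewrites: [
      {
        source: "**",
        function: "app",
      },
    ],
  },
};

// Print the JSON that would be saved as firebase.json.
console.log(JSON.stringify(firebaseJson, null, 2));
```

A narrower source pattern, such as a single path prefix, would leave the rest of the site served as static files.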
Does firebase functions integrate with a CDN?
Yes, but you need to set the correct s-maxage value in the Cache-Control header of your response to ensure the Firebase CDN will store it. The Firebase Hosting documentation on cache behavior has more info on this.
Cache invalidation is still hard with firebase so I would keep this in mind before you set anything.
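The header in question can be sketched as follows. The helper name `cacheControlHeader` is hypothetical, and the durations are example values only; given the invalidation caveat above, shorter CDN lifetimes are the safer starting point.

```typescript
// Build a Cache-Control value that lets both browsers and the CDN cache
// a function response. max-age applies to browsers, s-maxage to shared
// caches such as the CDN.
function cacheControlHeader(browserSeconds: number, cdnSeconds: number): string {
  if (!(browserSeconds >= 0) || !(cdnSeconds >= 0)) {
    throw new RangeError("cache durations must be non-negative numbers");
  }
  return (
    "public, max-age=" + Math.floor(browserSeconds) +
    ", s-maxage=" + Math.floor(cdnSeconds)
  );
}

// In an Express-style handler this would be used as:
//   res.set("Cache-Control", cacheControlHeader(300, 600));
console.log(cacheControlHeader(300, 600));
```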
How many workers I can have when using the Firebase solution (if there is one like that).
One benefit of using Firebase Functions is that you don't need to give much thought to the resources behind the backend. For heavier workloads, you can increase the RAM/CPU for the selected function in the Google console. The endpoint scales up and down depending on how many requests it gets. On the flip side, if it gets no requests (common in non-production environments), it goes idle. Be aware of the resulting cold start problem before you fully commit to this as a replacement for your current Node.js VPS hosting.
I personally use the Cache-Control headers to ensure the function responses are pushed to the CDN edge, which takes the edge off the cold start issue (for me and my use case).

Do Cloudfront images expire?

We are using Amazon cloudfront to serve static files from our website. We are copying these image references to another database on a mobile app, so that the images in the app are served from Cloudfront as well, so we need the URL to be permanent.
A URL for an image on our site looks something like this: http://d3q35tken14acg.cloudfront.net/cdn/farfuture/M17vstJzweaXVBR4penpg6CEv_v8DwxSKZIqZKlR6rY/mtime:1493753067/sites/unfestival/files/Screen%20Shot%202017-05-02%20at%2020.18.53.png
The mtime:1493753067 in the URL makes me wonder if the URL will expire.
My question: do URLs like this expire, or are they permanent?
The way CloudFront works can be a bit confusing, but the cache applies over the file content, not the URL. It's intended to be used for CDNs and such, where URLs need to be static, and it just ensures you can retrieve the files from a nearby region, reducing latency.
So basically Cloudfront URLs shouldn't expire for most of the cases.
It looks like you are using signed URLs. In that case, it depends entirely on the expiry time you specified when you created the URL.
If you specify a lifetime of one day when creating the URL, then one day later CloudFront will stop serving that URL and reject requests for it as unauthorised.
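One way to check whether a given URL carries such an expiry is sketched below. It relies on CloudFront signed URLs created with a canned policy carrying an `Expires` query parameter holding a Unix timestamp in seconds; URLs signed with a custom policy carry a `Policy` parameter instead and are not handled here. The function name and the second example URL are hypothetical.

```typescript
// Return the expiry encoded in a canned-policy signed URL,
// or null when the URL has no Expires query parameter.
function signedUrlExpiry(url: string): Date | null {
  const match = /[?&]Expires=(\d+)(?:&|#|$)/.exec(url);
  return match ? new Date(Number(match[1]) * 1000) : null;
}

// The URL from the question has "mtime:..." in its path but no query
// string, so no expiry is found.
const questionUrl =
  "http://d3q35tken14acg.cloudfront.net/cdn/farfuture/M17vstJzweaXVBR4penpg6CEv_v8DwxSKZIqZKlR6rY/mtime:1493753067/sites/unfestival/files/Screen%20Shot%202017-05-02%20at%2020.18.53.png";
console.log(signedUrlExpiry(questionUrl));

// A made-up signed URL for comparison.
const signedExample =
  "https://example.cloudfront.net/image.png?Expires=1493753067&Signature=abc&Key-Pair-Id=K123";
console.log(signedUrlExpiry(signedExample));
```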

Can sites on Firebase hosting include non-https resources?

I have been trying to migrate my site from divshot to firebase, since firebase has taken over divshot and shut it down.
Mine is a simple read only site that does not need https. It also contains links to external sites which do not support https. The site worked perfectly on divshot but it looks like firebase forces all sites to use https. Unfortunately, this causes the external sites that my site references to fail loading. The error being:
Mixed Content: The page at 'https://mysite.firebaseapp.com/' was loaded over HTTPS, but requested an insecure resource 'http://www.externalsite.com/'. This request has been blocked; the content must be served over HTTPS.
I tried to remove the http: so the external site is just //www.externalsite.com/, but this causes certificate errors. I can't change it to https since this external site doesn't support it.
Is there any way around this problem?
The short answer is no. This is completely by design. It's a security flaw to allow http on a https site. Therefore it's blocked.
However,
Solution 1: Find an HTTPS version of the resource. This might not be possible in your case.
Solution 2: Convert the resource to HTTPS. It might be possible to host the file or resource yourself over HTTPS. This may require you to copy a file or similar, so, said carefully: don't pirate anything you shouldn't.
Solution 3: Redirect. This is probably the most involved solution, but if you are trying to access some service, you could build your own service to relay the request. You are on Firebase, which means you could probably put together a cloud function that makes the HTTP request (How to make an HTTP request in Cloud Functions for Firebase?)
Solution 4: Don't use Firebase. If you don't want to do any of the above and can't live without the HTTP call, you might just drop Firebase and move to another hosting service.
I hope you find this helpful. It might not be the answer you're looking for, but it might point you in the right direction.
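For Solution 3, the page would link to a same-origin HTTPS path instead of the insecure URL, and a cloud function behind that path would fetch the HTTP resource. The sketch below covers only the link rewriting; the `/proxy` endpoint, the `url` parameter, and the allowlist are hypothetical, and the function that performs the actual HTTP request is not shown.

```typescript
// Rewrite an insecure http:// URL to a same-origin proxy path, but only
// for hosts on an explicit allowlist, so the proxy cannot be used to
// fetch arbitrary sites. Returns null for anything else.
function proxiedUrl(insecureUrl: string, allowedHosts: string[]): string | null {
  const match = /^http:\/\/([^\/:?#]+)/i.exec(insecureUrl);
  if (!match) {
    return null; // not an http:// URL
  }
  const host = match[1].toLowerCase();
  if (allowedHosts.indexOf(host) === -1) {
    return null; // host not allowed
  }
  return "/proxy?url=" + encodeURIComponent(insecureUrl);
}

console.log(proxiedUrl("http://www.externalsite.com/", ["www.externalsite.com"]));
```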

Wordpress (using varnish + apc + w3tc): Do the statistics get updated when the data is being retrieved from the cache?

If the data is served to the client from the varnish cache, does it still get registered as a hit in the statistics (could be derived from nginx logs or might be google analytics)?
I believe that the apc doesn't affect the statistics as it caches only the PHP and the rest of the content is still derived from the nginx.
Similarly what about browser cache?
Varnish affects analytics based on access logs, since cache hits do not result in a request to the nginx backend and are therefore not logged by the backend. To capture them, you need to set up Varnish logging.
Google Analytics, however, is not affected by Varnish, since the tracking is based on Javascript executed within the browser. Even if the browser caches the page HTML and the Javascript, the script would still be executed and statistics logged by Google Analytics (or other Javascript based analytics services, such as Piwik).
Well, yes and no. It depends on the page, the Varnish rules, and what exactly is being counted. Let's group them:
There are backend counters, like access logs, and frontend counters, like Google Analytics.
Backend counters: a request has to reach the backend for a hit to be recorded. On a full Varnish hit, where the response is served entirely from the Varnish cache, the backend never learns that a visit happened. The numbers get confusing because requests that go through vcl_fetch are counted while those handled by vcl_hit are not.
But if the counter lives on a page that Varnish passes to the backend (vcl_pass), such as pages that include cookies, the hit is recorded and you don't need to do anything special.
Frontend counters (Google Analytics and other JavaScript analytics libraries): these should not be affected by caching, because the analytics code is still served from the cache and the counting happens on their servers. Even if your nginx server is completely down and Varnish is serving from its cache, visits are still counted without interruption.
PS about the WordPress total cache plugin: honestly, I haven't used it and I don't know exactly how it works, but I assume it compiles HTML pages to serve directly instead of querying the database on each hit. Assuming there is no Varnish, the hits would still show up in the access log. If you have some sort of database counter that runs in article.php, for example, you might have a problem, because that file may not run on every new hit. Double-check that if you use PHP to count hits.
