How does HSTS preload work on the backend

I know that HSTS (HTTP to HTTPS) will work from the very first visit if my site is registered in the preload list.
On the other hand, I am also declaring preload in the HSTS header on my web server.
If I access my site for the very first time over HTTP, which one happens first?
I mean, will the browser consult the preload list first, or contact the web server first?

You need to submit your site to the browser's preload list. The vendor will then check that you are issuing the preload header (to prevent bad actors submitting sites to the preload list when the owners don't want it), and include it in the built-in list in a future release.
Some browsers also regularly scan or crawl websites looking for sites with preload headers to include. Though I believe this is done less often, and it's better to explicitly submit your site.
After the site is included in the browser's preload list, any request for the http:// version will automatically be converted to https://. This happens before the request is sent, so before you ever receive the HSTS header in a response.
That’s the point of preloading - to protect you before you even make a single request.
Personally I'm not a fan of preload. Hard-coding a list of sites into a browser has obvious scaling issues but, more importantly, you're taking a risk with something you can't change back without waiting months or possibly years for browser vendors to pick up the reverted setting and remove your site from the list. I personally believe preload is overkill for most sites.
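If you do go down the preload route, the header itself must carry the preload directive (and a sufficiently long max-age) before the submission is accepted. A minimal sketch, assuming a Flask app (any server that can set response headers works the same way):

```python
from flask import Flask

app = Flask(__name__)

@app.after_request
def add_hsts(response):
    # hstspreload.org expects a max-age of at least one year, plus the
    # includeSubDomains and preload directives, served over HTTPS.
    response.headers["Strict-Transport-Security"] = (
        "max-age=31536000; includeSubDomains; preload"
    )
    return response
```

This only matters on responses served over HTTPS; the preload list is what handles the very first plain-HTTP visit.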

Related

Preventing Safari prefetching web pages (server side)

When typing a web link into the Safari URL field, the browser attempts to prefetch all links it has previously seen, both GET and POST.
This causes every link the server supports that is listed in the dropdown as a possible completion to be activated. This is problematic. For example, if a web site has authentication with an /auth/logout link for logging out, this can cause the link to be activated if it appears in the dropdown, logging the user out unintentionally.
Many browsers send a specific header (e.g. 'Purpose: prefetch' in Chrome) that allows the server side to filter prefetch/preload requests (e.g. return a 503), but Safari doesn't seem to send any distinguishing header field. It also seems to try to prefetch POST requests, which seems very broken to me. GET requests are notionally at least idempotent, but POST requests are supposed to be understood to be data-changing.
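For browsers that do send such a header, filtering is straightforward; roughly this kind of thing (a sketch, assuming Flask, using the header names Chrome is known or reported to use, which Safari does not send):

```python
from flask import Flask, abort, request

app = Flask(__name__)

@app.before_request
def reject_prefetch():
    # Chrome marks speculative requests with "Purpose: prefetch" (newer
    # versions use "Sec-Purpose"). Safari sends no such marker, which is
    # exactly the problem described above.
    purpose = request.headers.get("Purpose") or request.headers.get("Sec-Purpose") or ""
    if "prefetch" in purpose.lower():
        abort(503)  # drop the speculative request; a real navigation is re-sent normally
```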
Has anyone got a solution to this? Please don't suggest that the browser preload feature can be turned off by the end user - that ISN'T a solution from a service delivery perspective.
Has anyone got an explanation as to why browsers would do this and NOT signal the purpose in a header field? (I get why prefetching is a useful UX capability, but not why it's useful while typing URLs, especially for URLs already previously downloaded and thus capable of returning prefetching metadata that would allow a server to selectively disable the capability where appropriate.) From what I can tell, this kind of functionality started to appear with header fields included, but some browsers have since removed this signal. Why? It seems dreadfully broken to me.
thanks.

Last-Modified cached page does not update when user has logged in

I am using the Last-Modified HTTP header to help browsers with caching and I have noticed an annoying problem.
If a user visits a page BEFORE he has logged in, the browser shows the cached page even after the user logs in. This means he is unable to see his login information (profile pic, notifications, etc.) in the header until he visits a page on the site he has not visited before.
Because the content of the actual article itself has not changed since his first visit, he is served the same page even after he logs in.
I have tried checking whether the user has just logged in (using a SESSION.LoggedIn variable), and then using the current date/time for Last-Modified, Expires and Cache-Control to tell the browser to serve up a fresh copy of the page, but it doesn't work in the Android browser. It just serves the cached version again. What this means is that the user cannot tell that they have logged in because their name and other credentials don't appear at the top of the page.
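Roughly the approach I tried, shown here as a simplified Flask-style sketch rather than my actual ColdFusion code (the logged_in flag stands in for my SESSION.LoggedIn variable):

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from flask import Flask, make_response, session

app = Flask(__name__)
app.secret_key = "placeholder"

@app.route("/article")
def article():
    response = make_response("<html>... site header + article body ...</html>")
    if session.get("logged_in"):
        # Try to force a fresh copy once the user has logged in.
        now = format_datetime(datetime.now(timezone.utc), usegmt=True)
        response.headers["Last-Modified"] = now
        response.headers["Expires"] = now
        response.headers["Cache-Control"] = "no-cache"
    return response
```

Even with this, the Android browser keeps showing the copy it cached while I was anonymous.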
How do I use HTTP header caching effectively and also take care of people visiting the same page both logged in and anonymously? The logged-in information sits in the header of the site (just like on SO), so is there a way not to cache the site header but still cache the rest of the page?
Use a tool like Firebug to see the network traffic for a URL. You'll notice that it consists of 'file' objects: HTML files, JavaScript files, CSS files, images, etc.
I don't think that you can cache a div (or other page layout construct) very easily.
It has been a while since I attempted to use the Last-Modified HTTP header for caching. I ran into similar problems to yours; browser implementation/compatibility wasn't 100%. I've also used Last-Modified in an attempt to inform search engine spiders that files have changed. That didn't work very well either. Eventually I removed all of my attempts at Last-Modified caching/hinting and just let the web server and browsers deal with it.
In the end I spent a lot of time optimizing database queries and indexes, and in a few cases implemented the cachedwithin attribute of cfquery tags. That attempt at improving site performance worked better for me.

Can we use ETags to get the latest version of an image from a CDN

We have a use case where we store our images in a CDN. Say we store a.jpg in the cache; if the user uploads a newer version of the file, the CDN cache is flushed and a.jpg is overwritten. The challenge is that the browser might also have cached the file. Since we cannot flush the cached image in the browser, we are thinking of using one of the two approaches below:
Append a version to the filename, e.g. a_v1.jpg, a_v2.jpg (where the version id is the checksum). This eliminates the need to flush the browser and CDN caches. I found a lot of documentation about this on the internet and many people are using it (a rough sketch of what we mean appears below).
Use the ETag of the file to eliminate the stale copy in the browser. I found that CDNs support ETags, but I did not find much literature on using ETags for images.
Can you please share your thoughts about using the ETag header for cache busting? Is it a good practice?
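For clarity, this is the kind of versioning we mean in option 1 (just a sketch; the naming scheme is made up):

```python
import hashlib
from pathlib import Path

def versioned_name(path: str) -> str:
    """Derive a cache-busting filename such as 'a_3fa2c1b9.jpg' from the file's
    content checksum, so a new upload always gets a new URL (option 1 above)."""
    p = Path(path)
    digest = hashlib.md5(p.read_bytes()).hexdigest()[:8]
    return f"{p.stem}_{digest}{p.suffix}"

# versioned_name("a.jpg") might return "a_9e107d9d.jpg"; the digest changes
# whenever the file content changes, so the old URL is never reused.
```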
Well, I wouldn't suggest ETags. They have their advantages but also their setbacks. Say you are running two servers: the ETag for the same content may differ depending on which server serves it.
The best thing I would suggest is to control what the browser is caching and for how long.
What I mean is: send expiry headers in the response from the CDN to the client browser, say with a 5 minute TTL. The browser will respect the expiry header, and once it has expired the browser will send a fresh request to the CDN when the page is refreshed.
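Something along these lines, as a sketch (assuming a Flask origin behind the CDN; the CDN will typically honour and forward the Cache-Control header):

```python
from flask import Flask, make_response, send_from_directory

app = Flask(__name__)

@app.route("/images/<path:name>")
def image(name):
    response = make_response(send_from_directory("images", name))
    # 5 minute TTL, as suggested above; both the CDN and the browser honour it,
    # so a stale image lives for at most 5 minutes after a new upload.
    response.headers["Cache-Control"] = "public, max-age=300"
    return response
```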

Serving images from one domain for multiple websites

We have nearly 13 domains within our company and we would like to serve images from one application in order to leverage caching.
For example, we will have c1.example.com and we will put all of our product images under this application. But here I have some doubts:
1- How can I force client browsers to cache the image and not request it again?
2- When I reference those images in my application, I will use the following HTML markup:
<img src="http://c1.example.com/core/img1.png" />
But this causes a problem when I run the website under HTTPS: the browser warns about the page. It should have been https://c1.example.com/core/img1.png when I run my apps under HTTPS. What should I do here? Should I always use HTTPS, or is there a way to switch automatically?
I will run my apps under IIS 7.
Yes, you need to serve all resources over HTTPS when the HTML page is served over HTTPS. That's the whole point of using HTTPS.
If the hrefs are hard-coded in the HTML, one solution could be to use a Response Filter that parses all content sent to the client and replaces http with https where necessary. A simple regular expression should do the trick. There are plenty of articles out there about how these filters work.
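The substitution itself is trivial; stripped down to the string handling (the real thing would live in an ASP.NET response filter, this is just a sketch of the idea):

```python
import re

def force_https_image_urls(html: str) -> str:
    # Rewrite hard-coded http:// references to the image host so they do not
    # trigger mixed-content warnings on pages served over https.
    return re.sub(r"http://c1\.example\.com/", "https://c1.example.com/", html)

# force_https_image_urls('<img src="http://c1.example.com/core/img1.png" />')
# -> '<img src="https://c1.example.com/core/img1.png" />'
```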
Regarding caching, you need to send the correct cache headers and an ETag. There are several questions and answers on this on SO, like this one: IIS7 Cache-Control
You need to use HTTP headers to tell the browser how to cache. It should work by default (assuming you have no query string in your URLs) but if not, here's a knowledge base article about the cache-control header:
http://support.microsoft.com/kb/247404
I really don't know much about IIS, so I'm not sure if there are any other potential pitfalls. Note that browsers may still send HEAD requests sometimes.
I'd recommend you set up the image server so that HTTP and HTTPS are interchangeable, then just serve HTTPS URLs from HTTPS requests.

Far future expire header and HTTP 304

I'm trying to optimize the loading time of a website. One of the things I've done is set a far-future Expires header for static content so that it is cached (as described by Yahoo). However, even though the resources are cached, the browser still sends a request and gets back a 304 (Not Modified) response for each one.
I realize the 304 response is very small and probably has minimal performance effect, but is there a way to make it such that the browser will no longer send the request at all and just always use the cache for that resource?
You may want to try turning off ETags if you are sending both ETags and Expires; this is often suggested, especially if you are behind a load balancer.
Also note that when you press reload on your page, Firefox WILL recheck all the resources, and these will come back as 304s. If you press shift-reload, it will re-request all the resources without conditional headers at all. So don't use the refresh/reload button to test your Last-Modified/ETag settings.
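For the far-future part itself, something like this is typical (a sketch, assuming Flask; the point is a long max-age and no ETag, so a normal navigation uses the cached copy without contacting the server at all):

```python
from flask import Flask, make_response, send_from_directory

app = Flask(__name__)

@app.route("/assets/<path:name>")
def asset(name):
    response = make_response(send_from_directory("static", name))
    # One year max-age: on normal navigation the browser serves this straight
    # from cache. Dropping the ETag avoids the validator issues mentioned
    # above (e.g. differing ETags behind a load balancer).
    response.headers["Cache-Control"] = "public, max-age=31536000"
    if "ETag" in response.headers:
        del response.headers["ETag"]
    return response
```

A reload will still trigger conditional requests and 304s, as described above; that behaviour is the browser's, not the server's.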
