Using Cache-Control in HTTP per Element - http

I am using only HTTP and set "meta http-equiv="Cache-control" content="public""
in my head. How can I set the max-age of an element? Do I need to wrap it in a certain tag or would I use an attribute? I would like to set the max-age of my css and js sources to a certain max age.

Every resource being loaded is a separate request. If you want to control the caching of your JS and CSS files, you'll have to set the appropriate headers for those requests. Your web server should have a way to do that - if not, you'll just have to write your own handlers to add the headers as needed.
There is nothing you can do in HTML to achieve this, short of inlining the files.

Related

Is Content-Security-Policy header applicable only for text/html Content-Type?

From the OWASP's website
https://cheatsheetseries.owasp.org/cheatsheets/Content_Security_Policy_Cheat_Sheet.html:
Send a Content-Security-Policy HTTP response header from your web server.
Content-Security-Policy: ...
Using a header is the preferred way and supports the full CSP feature set. Send it in all HTTP responses, not just the index page.
I don't understand how that could be true as it is possible to set the Content-Security-Policy by using a meta tag in the HTML.
I also don't see how the policy can apply to anything else but HTML pages.
Does anyone have idea why that statement above was made and if it is safe to only send HTTP header Content-Security-Policy for text/html responses?
By the way, the policy is too big and I would like to sent as fewer bytes as possible.
This is still something that’s not formally specified and there ai still some debate on this: https://github.com/w3c/webappsec/issues/520
In general there’s two arguments here:
On the one hand some other file types (XML, PDF, perhaps even SVGs) could benefit from CSP and any resource could become the page by right clicking and opening in a separate tab.
On the other hand CSPs can get quite big and are usually written for HTML pages. So a bit wasteful to send on other resources and most of it won’t be relevant.
The right answer (as suggested by above) is probably to have a reduced, and very strict, CSP for all non-HTML responses.
But I think for most people having it on the HTML only will be good enough and bring most of the benefits of CSP. Then again CSP is an advanced technique so if going as far as that, then why not do it properly?
Using a header is the preferred way and supports the full CSP feature set.
I don't understand how that could be true as it is possible to set the Content-Security-Policy by using a meta tag in the HTML.
Inside the meta tag are not supported the directives:
report-to and report-uri
frame-ansectors
sandox
Also meta tag does not support Content-Security-Policy-Report-Only feature, only the Content-Security-Policy.
All resources that start loading before meta tag in the HTML code are not affected by CSP. Malicious scripts can be injected as first item of the <head> section just before meta tags
The nonce-value is exposed in meta tag therefore can be easely stealing by script and reuse.
Using meta tag you can only set the CSP for HTML pages, but CSP is applied for XSLT in the XML pages, and for some other kinds of content (see below).
Therefore indeed an HTTP header is the preferred way to delivery CSP and using CSP via meta tag does not allow you to use full CSP feature set.
Send it in all HTTP responses, not just the index page.
I also don't see how the policy can apply to anything else but HTML pages.
The specification had in mind a little different - you should send CSP with any response page with HTML content, not only for 200 OK, but even for 404 Not found
403 Access Forbidden, etc.
Because these pages has access to cookie that can be steal in the page not covered by CSP.
CSP is applied not only to HTML pages, but to XSLT in XML-pages, to external javascripts files for workers (in Firefox). Also frame-ancestors directive of CSP HTTP header applies to any content (JPEG/GIF/PNG/PDF/MP4/etc) intended to be embedded into iframe, see the nitty-gritty here.

Defining a dynamic Content Security Policy in a Single Page Application

I want to define a content security policy that allows loading images from any origin by default but restricts this to allow only a specific set of origins in some sections of the website.
In a traditional website that makes a new HTTP request for every navigation, this could be easily done by sending a different Content-Security-Policy HTTP header for the pages that require the stricter policy. But in a single page application, this is of course not possible because navigating to a more restrictive section of the app does not cause a new HTTP request (also I would like to define policies on more dynamic conditions than a URL navigation).
I know that—besides in an HTTP header—CSP policies can also be defined in a meta tag and when multiple CSP policies are defined a request must pass all of them to be permitted. So my first approach to solving the problem was setting a default CSP in a Content-Security-Policy header for the entire page and then dynamically set more restrictive policies by adding a <meta http-equiv="Content-Security-Policy" content="…"> tag to the document's head when required.
And this works just fine for dynamically adding more restrictive policies. The big problem is that removing that meta tag or modifying it does not remove or modify the associated content security policy (tested in Chrome in Firefox). This behavior is defined in the W3C Content Security Policy spec:
Note: Modifications to the content attribute of a meta element after the element has been parsed will be ignored.
So is there any way to dynamically add (and more importantly also remove) a content security policy that does not rely on a HTTP navigation? I would like to avoid setting a restrictive image policy by default and then excepting individual images through hashes or nonces as this would be quite elaborate to implement.
In SPA you can each time to create a new fullscreen iframe and fill it by script. <iframe>, as nested browsing context, can have own CSP meta tag regardless of the parent page.
The parent page will contains only script to manage iframe's content, may be it's possible to use Worker() for this purpose.

Is Content-Type HTTP header always required?

This question is about browser behavior as well as protocol specification for linking, importing, including or ajaxing css, js, image and other resources from within html, js or css files.
While testing static files and compressed content delivery in different browsers, I found that some browsers start behaving differently if you move away from conventions. For example, IE6 creates problem if you do not send Content-Disposition: inline; header for all inline css and js etc files, and a recent version of safari does not properly handle pre-compressed gzip CSS files if you use file extension .gz like in main-styles.css.gz.
My question is about the behavior of browsers about Content-Type response header. Since <link>, <script> and <img> tags already reasonably specify the content type of the resource, can this header be safely skipped, or do some browsers require it for some historical reason?
In short, no, it's not required. But it's recommended.
Most browser that I know of will treat <link>, <script>, and <img> properly if they are not sent with headers, but there's no real good reason not to send the headers. Basically, without Content-Type headers, the browser is left to try and guess based on the content.
From RFC2616:
Content-Type specifies the media type of the underlying data.
Content-Encoding may be used to indicate any additional content
codings applied to the data, usually for the purpose of data
compression, that are a property of the requested resource. There is
no default encoding.
Any HTTP/1.1 message containing an entity-body SHOULD include a
Content-Type header field defining the media type of that body. If
and only if the media type is not given by a Content-Type field, the
recipient MAY attempt to guess the media type via inspection of its
content and/or the name extension(s) of the URI used to identify the
resource. If the media type remains unknown, the recipient SHOULD
treat it as type "application/octet-stream".
Regarding the keyword SHOULD, specified in RFC2119:
SHOULD: This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.
It is required for backward compatibility.
For example: Internet Explorer 10 needs Content-Type:image/svg+xml in order to render any svg file
IE10, IE9 and probably other browsers always need the Content-Type header.
I ran into a problem in java where I tried to post some data via the library chrriis.dj.nativeswing.swtimpl.components.JWebBrowser, which basically displays an internet explorer inside a java program. But the simple php script on the back-end would not parse my post-data. (Used WebBrowserNavigationParameters to set post data while navigating to a certain page) I finally found out that the Content-Type header had to be set for php to properly paste the post-data. (This was not set by default.) Setting it to Content-Type: application/x-www-form-urlencoded and everything worked fine. So, I guess setting Content-Type should allways be done when POSTing data to php.

Why adding version number to CSS file path?

I noticed some websites put the version numbers (especially) in the CSS file path. For example:
<link rel="stylesheet" type="text/css" href="style.css?v=12345678" />
What is the main purpose to put the version number? If the purpose is to remember when the CSS file was updated last time, shouldn't the version number added as a comment inside the CSS file?
From HTML5 ★ Boilerplate Docs:
What is ?v=1" '?v=1' is the JavaScript/CSS Version Control with
Cachebusting
Why do you need to cache JavaScript CSS? Web page designs are getting
richer and richer, which means more scripts and stylesheets in the
page. A first-time visitor to your page may have to make several HTTP
requests, but by using the Expires header you make those components
cacheable. This avoids unnecessary HTTP requests on subsequent page
views. Expires headers are most often used with images, but they
should be used on all components including scripts, stylesheets etc.
How does HTML5 Boilerplate handle JavaScript CSS cache? HTML5
Boilerplate comes with server configuration files: .htacess,
web.config and nginx.conf. These files tell the server to add
JavaScript CSS cache control.
When do you need to use version control with cachebusting?
Traditionally, if you use a far future Expires header you have to
change the component's filename whenever the component changes.
How to use cachebusting? If you update your JavaScript or CSS, just
update the "?v=1" to "?v=2", "?v=3" ... This will trick the browser
think you are trying to load a new file, therefore, solve the cache
problem.
It's there to make sure that you have the current version. If you change your website and leave the name as before, browser may not notice the change and use old CSS from its cache. If you add version, the browser will download the new stylesheet.
If you set caches to expire far in the future adding ?v=2 will let the server know this is a new file but you won't need to give it a unique name (saving you a global search and replace)
HTM5 boilerplate also includes it in their project.
Check this video also: HTML5 Boilerplate Walkthrough.
One of the reason could be to bypass file caching. Same name CSS files can be cached by the servers and may result in bad display if new version has has layout changes.
This is to optimise browser-caching. You can set the header for CSS files to never expire so the browser will always get it from its cache.
But if you do this, you'll get problems when changing the CSS file because some browsers might not notice the change. By adding/changing the version-parameter it's "another" request and so it won't be taken from the cache (but after the new version is cached, it's taken from there in the future to save bandwidth/number of requests until the version changes again).
A detailed explanation can be found at html5boilerplate.com.
My knowledge is pretty much out of date regarding websites, but the variable stored in the 'href' argument is received by the browser through HTTP. Using the usual tricks in URL-rewriting you could actually have an arbitrary script that produces CSS-output when called. That output can differ, depending on the argument.

How much time would be good to set in expire header for a website?

"Web pages are becoming increasingly
complex with more scripts, style
sheets, images, and Flash on them. A
first-time visit to a page may require
several HTTP requests to load all the
components. By using Expires headers
these components become cacheable,
which avoids unnecessary HTTP requests
on subsequent page views. Expires
headers are most often associated with
images, but they can and should be
used on all page components including
scripts, style sheets, and Flash."
As written in Yslow.
my question is how much time would be good to set in expire header for a website which has multiple stylesheets, Flash headers, Javascripts, images, PDF, MS Excel files, PPT etc.?
If I want to set same expire time on all things.
What I do is set the expiry time of CSS and JavaScript files to a high value, like 1 or 5 years. When I change the stylesheets or JS files, I change the version number in their URLs, to prevent stale files from being served from cache.
Looks like this is what SO does, too:
<link rel="stylesheet" href="http://sstatic.net/stackoverflow/all.css?v=e27c9b7474df">
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js"></script>
<script src="http://sstatic.net/js/question.js?v=b05e8725a843" type="text/javascript"></script>
So when they change the stylesheet, they change all.css?v=e27c9b7474df to all.css?v=some new version. The question.js javascript file follows the same convention. But filenames would work too, you could call your CSS/JS files all-1.css, then change it to all-2.css, etc. The actual format of the version number is up to you as long as the URL changes.
If your page resources (images/css/js) typically don't change and are static you can set the expires header to something far out like 1 year.
For the pages themselves it really depends on the content. If you content changes very frequently you should make sure your expires header are not set that large otherwise your visitors will receive stale content.
If you think about a site like SO itself, the content changes so frequently that the expires header on the page is very small. From the headers, looks like they use a 60 second max age and have a expires header 1 minute out from the current.

Resources