On my website, files can be shared via URLs like "/file/file_id", and the server sends back exactly the file contents, with the filename specified as well.
I guess I should do something with the Content-Type header. If I say
Content-Type: "image"
Firefox will happily execute HTML files served that way. The problem seems to be solved by
Content-Type: "image/jpeg"
For one thing, I think just saying "I'm an image!" should be sufficient according to the standards. As it stands, a single typo (leaving off "jpeg") could expose my whole site to exploitation. Plus, I now have to keep track of all the common image types and send the right header for each of them.
Secondly, it would be great if there were a header for this (DO NOT EXECUTE). Is there one?
I looked at the "X-XSS-Protection" header, but it seems to be something else, and only IE understands it anyway. Sorry if this has been answered somewhere; I haven't found it.
X-Content-Type-Options: nosniff
This makes browsers respect the Content-Type you send, so as long as you're careful to only send known-safe types (e.g. not SVG!), you'll be fine.
There's also CSP, which can act as a second line of defence:
Content-Security-Policy: default-src 'none'
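To illustrate how the two headers fit together, here is a minimal sketch of a download endpoint that sends both, written with Flask; the FILE_INDEX layout and the stored paths are made-up placeholders, not anything from the question:

from flask import Flask, Response, abort

app = Flask(__name__)

# Hypothetical index: file id -> (path on disk, known-safe content type).
FILE_INDEX = {
    "abc123": ("/srv/uploads/abc123.jpg", "image/jpeg"),
}

@app.route("/file/<file_id>")
def serve_file(file_id):
    entry = FILE_INDEX.get(file_id)
    if entry is None:
        abort(404)
    path, content_type = entry
    with open(path, "rb") as f:
        data = f.read()
    resp = Response(data, mimetype=content_type)
    # Stop browsers from second-guessing the declared type.
    resp.headers["X-Content-Type-Options"] = "nosniff"
    # Second line of defence: even if something does get treated as HTML,
    # it cannot load scripts, frames or other active content.
    resp.headers["Content-Security-Policy"] = "default-src 'none'"
    return resp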
Sites that are very careful about security host 3rd party content on a completely different top-level domain (to get same-origin policy protection and avoid cookie injection through compromised subdomains).
Traditionally there have been many ways to circumvent the different protections. As such, a full defense relies on multiple mechanisms (defense-in-depth).
Most larger companies solve this by hosting such files on a separate domain (e.g. googleusercontent.com). If an attacker manages to execute script on such a domain, at least that does not give them XSS access to the main website.
X-Content-Type-Options is a non-standard header and was, until very recently, not supported in Firefox, but it is still part of the defense. It's possible to construct files that are valid in several formats at once (I have a file that is a "valid" GIF, HTML, JavaScript and PDF).
Images can normally be served directly (with X-Content-Type-Options: nosniff).
Other files can be served with Content-Type: text/plain, or with "Content-Disposition: attachment" to force a download instead of showing them in the browser.
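As a rough sketch of that policy in Python (the whitelists of image and text types are assumptions; adjust them, and the filename handling, to your own storage model):

# Known-safe image types that may be rendered inline. Note: no image/svg+xml,
# since SVG can carry script.
SAFE_IMAGE_TYPES = {"image/jpeg", "image/png", "image/gif"}
# Types that are harmless to show as plain text.
PLAIN_TEXT_TYPES = {"text/plain", "text/csv"}

def headers_for_upload(stored_content_type, filename):
    """Pick response headers for a user-uploaded file."""
    headers = {"X-Content-Type-Options": "nosniff"}
    if stored_content_type in SAFE_IMAGE_TYPES:
        headers["Content-Type"] = stored_content_type
    elif stored_content_type in PLAIN_TEXT_TYPES:
        headers["Content-Type"] = "text/plain"
    else:
        # Anything else: never let the browser try to render it.
        headers["Content-Type"] = "application/octet-stream"
        # filename should be sanitised before being echoed into a header.
        headers["Content-Disposition"] = 'attachment; filename="%s"' % filename
    return headers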
Basically, HTTP/2 push via http2_push_preload doesn't work if you set the header Vary: Accept on your response, because you are doing content negotiation on the Accept request header. I'm using content negotiation to send (and HTTP/2 push) WebP images instead of JPEGs to clients that support them.
HTTP/2 push works for .js, .css and other files in the same call and shows "Push / Other" in Chrome DevTools, but it fails for this one case (JPEG content-negotiated to WebP) and shows just "Other" (not pushed) in Chrome DevTools.
Content negotiation for Brotli and gzip compression works fine and gets pushed properly using Vary: Accept-Encoding, and the same goes for languages using Vary: Accept-Language.
Only Vary: Accept fails.
Please help; I'm at the point of giving up.
P.S.: I was going through the nginx source at https://github.com/nginx/nginx/blob/master/src/http/v2/ngx_http_v2.c. Do a Ctrl+F and you will find cases only for "Accept-Encoding" and "Accept-Language", nothing for "Accept". So I think the "Accept" case is not yet supported by nginx?
P.P.S.: I'm not over-pushing; I'm only using HTTP/2 push for the hero image.
Edit: Here are the bug tickets on the nginx site for those who want to track them:
https://trac.nginx.org/nginx/ticket/1851
https://trac.nginx.org/nginx/ticket/1817
Edit 2: The nginx team has responded, saying they are not going to support it for security reasons (you can find the response in the duplicate bug ticket), which I believe relates to pushing from different origins such as CDNs. Anyway, I need this feature, so the remaining options are to:
Create a custom patch or package.
Use some other server software that supports it.
Manually implement, in the website code, a feature that rewrites .jpg paths to .jpg.webp when requests come from clients that support WebP (a sketch of this approach follows below).
(I don't give up :P)
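For what it's worth, a minimal sketch of that third option in Python (the .jpg -> .jpg.webp naming convention is taken from the question; the helper name and example path are made up):

def pick_image_url(jpg_url, accept_header):
    """Return the WebP variant of a .jpg URL when the client advertises WebP support."""
    if "image/webp" in accept_header:
        return jpg_url + ".webp"   # e.g. /img/hero.jpg -> /img/hero.jpg.webp
    return jpg_url

Because the two variants now live at different URLs, the image responses themselves no longer need Vary: Accept; only the HTML that embeds them does, if that HTML is cached.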
I'm not entirely surprised by this, and Apache does the same. If you want this to change, I'd suggest raising a bug with nginx, but I wouldn't be surprised if they didn't prioritise it.
It also seems that browsers don't handle this situation very well.
HTTP/2 push is fraught with opportunities to over-push, and this is one example. You should not push WebP if the client does not support it, and you often won't know that with the information you have at this point. Chrome, for example, seems to send image/webp in the Accept header when requesting the HTML, but Firefox does not.
Preload is a much better, safer option that respects Vary headers and also cache status.
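As a rough illustration of the preload alternative (Flask-style Python; the image path is a placeholder, and the Accept-based choice of hero_url is assumed to happen as in the sketch further up):

from flask import Flask, make_response

app = Flask(__name__)

@app.route("/")
def index():
    # hero_url would be chosen from the Accept header as sketched earlier;
    # hard-coded here to keep the example short.
    hero_url = "/img/hero.jpg.webp"
    resp = make_response("<html><body><img src='%s'></body></html>" % hero_url)
    # A preload hint asks the browser to fetch the image early, but the browser
    # stays in control: it checks its own cache and sends its own Accept header,
    # unlike server push.
    resp.headers["Link"] = "<%s>; rel=preload; as=image" % hero_url
    resp.headers["Vary"] = "Accept"
    return resp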
I'm going to create a website which, in addition to its own content, would embed (in iframes) the world's biggest newspaper websites, like The New York Times, the Financial Times and some others.
But I've run into a problem with framing permissions. For example, the NY Times shows me the error "Load denied by X-Frame-Options: http://www.nytimes.com/ does not permit framing". I have read many forums and haven't found a workable solution. I tried adding Header always append X-Frame-Options SAMEORIGIN to my .htaccess file, but it didn't help. Is there any way to solve this problem?
Some websites have a server setting that will not allow other websites to "frame" their content. This is mainly to protect their copyrights and direct traffic to their websites only.
This is typically done by adding the following to Apache's configuration (httpd.conf):
Header always append X-Frame-Options SAMEORIGIN
Unfortunately, there is really nothing you can do about it if you want to frame the website.
If your goal isn't to build a website (intended for others to visit) that embeds other websites inside your own, and this is truly for personal use, then a solution is to search for and install any add-on that lets you modify response headers, or, more to the point, get the "Ignore X-Frame-Options" add-on.
These add-ons intercept the response from the remote server and let you replace the X-Frame-Options header value with ALLOWALL, which in turn causes your browser to allow the response to be embedded in a frame.
As it turns out, another SO question even discusses the code required to write your own add-on that does this: Disable X-Frame-Option on client side
Just add the "Ignore X-Frame-Options Header" add-on by ThomazPom for Mozilla Firefox and it will work fine. There is no other solution. Below is the link:
https://addons.mozilla.org/en-US/firefox/addon/ignore-x-frame-options-header/
I'm trying to implement content negotiation based on client Accept headers, so that clients that accept image/webp get WebP images while clients that don't get plain old JPEGs. WebP and JPEG images are served from the same URL, i.e. /images/foo-image/, and the content returned varies on the Accept header presented by the client. This now works great on my site.
The next challenge is to get this working with AWS CloudFront sitting in front of my site. I'm setting the Vary header to Vary: Accept to let CloudFront know that it has to cache and serve different content based on the client's Accept header.
This doesn't seem to work, unfortunately: CloudFront just serves up whatever it first gets its hands on, Vary and Accept notwithstanding. Interestingly, CloudFront does seem to be able to vary content based on Accept-Encoding (i.e. gzip).
Does anyone know what gives?
It turns out this is documented as not supposed to work:
The only acceptable value for the Vary header is Accept-Encoding. CloudFront ignores other values.
UPDATE: AWS now has support for more sophisticated content negotiation. I wrote a blog post on how to take advantage of this.
Just to update this question: CloudFront now supports caching by different headers, so you can now do this.
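For reference, here is roughly what that looks like in the legacy forwarded-values style, expressed as the Python dict fragment you would embed in a boto3 distribution config (the surrounding DistributionConfig and the exact field set are omitted and hedged; newer setups achieve the same thing with a cache policy that includes the Accept header):

# Fragment of a CloudFront DefaultCacheBehavior (legacy ForwardedValues style).
default_cache_behavior_fragment = {
    "ForwardedValues": {
        "QueryString": False,
        "Cookies": {"Forward": "none"},
        # Whitelist the Accept header so CloudFront keeps separate cache
        # entries for WebP-capable and non-WebP clients.
        "Headers": {"Quantity": 1, "Items": ["Accept"]},
    },
}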
I've created a static website that is hosted in an S3 bucket. My asset files (CSS and JS files) are minified and compressed with gzip. The filename itself is either file_gz.js or file_gz.css, and the file is delivered with a Content-Encoding: gzip header.
So far, I've tested out the website on various browsers and it works fine. The assets are delivered with their compressed versions and the page doesn't look any different.
The only issue I see is that, since this is an S3 bucket, there is no failsafe for when the client (the browser) doesn't support gzip encoding. Instead, the HTTP request will fail and no styling or JavaScript enhancements will be applied to the page.
Does anyone know of any problems by setting Content-Encoding: gzip? Do all browsers support this properly? Are there any other headers that I need to append to make this work properly?
Modern browsers support encoded content pretty much across the board. However, it's not safe to assume that all user agents will. The problem with your implementation is that it completely ignores HTTP's built-in method for avoiding this very problem: content negotiation. You have a couple of options:
You can continue to close your eyes to the problem and hope that every user agent that accesses your content will be able to decode your gzip resources. Unfortunately, this will almost certainly not be the case; browsers are not the only user-agents out there and the "head-in-the-sand" approach to problem solving is rarely a good idea.
Implement a solution to negotiate whether or not you serve a gzipped response using the Accept-Encoding header. If the client does not specify this header at all or specifies it but doesn't mention gzip, you can be fairly sure the user won't be able to decode a gzipped response. In those cases you need to send the uncompressed version.
The ins and outs of content negotiation are beyond the scope of this answer. You'll need to do some research on how to parse the Accept-Encoding header and negotiate the encoding of your responses. Usually, content encoding is accomplished through the use of third-party modules like Apache's mod_deflate. Though I'm not familiar with S3's options in this area, I suspect you'll need to implement the negotiation yourself.
In summary: sending encoded content without first clearing it with the client is not a very good idea.
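As a minimal sketch of the negotiation described in the second option (plain Python; it assumes you keep a precompressed copy next to each asset, e.g. style.css and style.css.gz, which is an assumption rather than anything S3 gives you for free):

import re

def accepts_gzip(accept_encoding):
    """Return True only if the client explicitly lists gzip with a non-zero q value."""
    for part in accept_encoding.split(","):
        token = part.split(";")[0].strip().lower()
        q = 1.0
        m = re.search(r"q\s*=\s*([0-9.]+)", part)
        if m:
            q = float(m.group(1))
        if token in ("gzip", "x-gzip") and q > 0:
            return True
    return False

def choose_variant(base_path, accept_encoding):
    """Pick the precompressed or plain file and the extra headers to send with it."""
    headers = {"Vary": "Accept-Encoding"}
    if accepts_gzip(accept_encoding):
        headers["Content-Encoding"] = "gzip"
        return base_path + ".gz", headers
    return base_path, headers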
Take your CSS / minified CSS file (example.css [247 KB]).
Run gzip -9 example.css; the converted file will be example.css.gz [44 KB].
Rename the file example.css.gz to example.css.
Upload the file to the S3 bucket and, in its properties, open the metadata section.
Add a new metadata entry: select Content-Encoding with the value gzip.
Now your CSS is minified and also gzipped. (A scripted equivalent is sketched below the source link.)
source:
http://www.rightbrainnetworks.com/blog/serving-compressed-gzipped-static-files-from-amazon-s3-or-cloudfront/
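A scripted equivalent of those console steps, using gzip from the standard library and boto3 (the bucket name and key are placeholders):

import gzip
import shutil

import boto3

# Compress locally (the equivalent of `gzip -9 example.css`).
with open("example.css", "rb") as src, gzip.open("example.css.gz", "wb", compresslevel=9) as dst:
    shutil.copyfileobj(src, dst)

# Upload under the original name, with the same metadata the console steps set by hand.
s3 = boto3.client("s3")
s3.upload_file(
    "example.css.gz",
    "my-bucket",           # placeholder bucket name
    "assets/example.css",  # served under the original .css name
    ExtraArgs={
        "ContentType": "text/css",
        "ContentEncoding": "gzip",
    },
)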
Recently, some of my new web pages (XHTML 1.1) have been set up to run a regex against the Accept request header and send the right HTTP response headers if the user agent accepts XML (Firefox and Safari do).
IE (or any other browser that doesn't accept it) will just get the plain text/html content type.
Will Googlebot (or any other search bot) have any problems with this? Are there any negatives to my approach that I have overlooked? Do you think this header sniffing would have much effect on performance?
One problem with content negotiation (and with serving different content/headers to different user agents) is proxy servers. Consider the following; I ran into this back in the Netscape 4 days and have been shy of server-side sniffing ever since.
User A downloads your page with Firefox and gets an XHTML/XML Content-Type. The user's ISP has a proxy server between the user and your site, so this page is now cached.
User B, on the same ISP, requests your page using Internet Explorer. The request hits the proxy first; the proxy says "hey, I have that page, here it is, as application/xhtml+xml". User B is prompted to download the file (as IE will download anything sent as application/xhtml+xml).
You can get around this particular issue by using the Vary Header, as described in this 456 Berea Street article. I also assume that proxy servers have gotten a bit smarter about auto detecting these things.
Here's where the mess that is HTML/XHTML starts to creep in. When you use content negotiation to serve application/xhtml+xml to one set of user agents and text/html to another, you're relying on all the proxies between your server and your users being well behaved.
Even if all the proxy servers in the world were smart enough to recognize the Vary header (they aren't), you still have to contend with the computer janitors of the world. There are a lot of smart, talented, and dedicated IT professionals in the world. There are more not-so-smart people who spend their days double-clicking installer applications and thinking "The Internet" is that blue E in their menu. A misconfigured proxy could still improperly cache pages and headers, leaving you out of luck.
The only real problem is that browsers will display XML parse errors if your page contains invalid markup, while with text/html they will at least display something viewable.
There is not really any benefit to sending XML unless you want to embed SVG or are doing XML processing of the page.
I use content negotiation to switch between application/xhtml+xml and text/html just as you describe, without noticing any problems with search bots. Strictly, though, you should take into account the q values in the Accept header, which indicate the user agent's preference for each content type. If a user agent prefers text/html but will accept application/xhtml+xml as an alternative, then for greatest safety you should serve the page as text/html.
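A rough sketch of that q-value comparison in Python (deliberately ignoring wildcards such as */*, which also sidesteps the browsers that over-claim support via wildcards; the exact policy is an assumption you may want to tune):

def prefers_xhtml(accept_header):
    """True only if the UA explicitly lists application/xhtml+xml with a q value
    at least as high as text/html; wildcards like */* are ignored on purpose."""
    prefs = {}
    for part in accept_header.split(","):
        pieces = part.strip().split(";")
        mime = pieces[0].strip().lower()
        q = 1.0
        for param in pieces[1:]:
            param = param.strip()
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        prefs[mime] = q
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", 0.0)
    return xhtml_q > 0 and xhtml_q >= html_q

# Firefox's classic header, "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
# yields True; a wildcard-only Accept header (as IE sends) yields False.

Whichever way you decide, the negotiated response should also carry Vary: Accept so shared caches keep the two variants apart (see the proxy example above).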
The problem is that you need to limit your markup to the common subset of HTML and XHTML.
You can't use XHTML features (namespaces, self-closing syntax on all elements), because they will break in HTML (e.g. <script/> is unclosed to a text/html parser and will kill the document up to the next </script>).
You can't use an XML serializer, because it could break text/html mode (it may use the XML-only features mentioned in the previous point, it may add tag-name prefixes (PHP DOM sometimes emits <default:h1>), and <script> is CDATA in HTML, but an XML serializer may output <script>if (a &amp;&amp; b)</script>).
You can't use HTML's compact syntax (implied tags, optional quotes), because it won't parse as XML.
It's risky to use HTML tools (including most template engines), because they don't care about well-formedness (a single unescaped & in an href, or a bare <br>, will completely break XML and make your site appear to work only in IE!).
I've tested indexing of my XML-only website. It was indexed even though I used the application/xml MIME type, but it appeared to be parsed as HTML anyway (Google did not index text that was in <![CDATA[ ]]> sections).
Since IE doesn't support XHTML as application/xhtml+xml, the only way to get cross-browser support is to use content negotiation. According to Web Devout, content negotiation is hard due to the misuse of wildcards, where web browsers claim to support every type of content in existence! Safari and Konqueror support XHTML, but only imply this support via a wildcard, while IE doesn't support it yet implies support too.
The W3C recommends only sending XHTML to browsers that specifically declare support in the HTTP Accept header, and ignoring browsers that don't specifically declare support. Note, though, that headers aren't always reliable, and this has been known to cause issues with caching. Even if you could get this working, having to maintain two similar but different versions would be a pain.
Given all these issues, I'm in favor of giving xhtml a miss, when your tools and libraries let you, of course.