Securing HTTP referer

I develop software which stores files in directories with random names to prevent unauthorized users from downloading them.
The first precaution is to serve these files from a separate top-level domain (to prevent cookie theft).
The second danger is the HTTP referer, which may reveal the name of the secret directory.
My experiments with the Chrome browser show that the referer is sent only when I click a link inside my (secret) file, so the trouble is limited to file types that can contain links (in Chrome, HTML and PDF). Can I rely on this behaviour (no referer being sent if the next page is opened not from a link on the current secret page but by some other method, such as entering the URL directly) for all browsers?
So the problem seems limited to HTML and PDF files, but that is not a complete security solution.
I suspect we can fully solve this by adding Content-Disposition: attachment when serving all our secret files. Will that prevent the HTTP referer from being sent?
Also note that I am going to use HTTPS so that a man-in-the-middle cannot download our secret files.

You can use the Referrer-Policy header to try to control referer behaviour. Note, however, that this relies on clients implementing it.
Instead of trying to conceal the file location, may I suggest you implement proper authentication and authorization handling?
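For illustration, here is a minimal sketch of sending that header from the server; the Node/Express setup and the /files path are my assumptions, not something from the question:
import express from "express";
const app = express();
app.use((req, res, next) => {
  // Ask supporting browsers not to send a Referer from anything served here.
  res.setHeader("Referrer-Policy", "no-referrer");
  next();
});
app.use("/files", express.static("secret-files")); // hypothetical directory of secret files
app.listen(8080);
Browsers that don't implement Referrer-Policy will simply ignore the header, which is why this can only be a first step.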

I agree that Referrer-Policy is your best first step, but as DaSourcerer notes, it is not universally implemented on browsers you may support.
A fully server-side solution is as follows:
User connects to .../<secret>
Server generates a one-time token and redirects to .../<token>
Server serves the document and invalidates the token
Now the referer will point to .../<token>, which is no longer valid. This has usability trade-offs, however:
Reloading the page will not work (though you may be able to address this with a cookie or session)
Users cannot share the URL from the URL bar, since it's technically no longer valid (in some cases that could be a minor benefit)
You may be able to get the same basic benefits without the usability trade-offs by doing the same thing with an IFRAME rather than redirecting. I'm not certain how IFRAME influences Referer.
This entire solution is basically just Referer masking done proactively. If you can rewrite the links in the document, then you could instead use Referer masking on the way out. (i.e. rewrite all the links so that they point to https://yoursite.com/redirect/....) Since you mention PDF, I'm assuming that this would be challenging (or that you otherwise do not want to rewrite the document).
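For what it's worth, here is a rough sketch of the redirect-and-invalidate flow described above, assuming a Node/Express server and an in-memory token store (both are my assumptions, not part of the original setup):
import express from "express";
import { randomBytes } from "crypto";
import path from "path";
const app = express();
const tokens = new Map<string, string>(); // token -> file path; real code would add expiry
// Step 1: the user requests the secret name and is redirected to a one-time token URL.
app.get("/d/:secret", (req, res) => {
  // (a real implementation would validate req.params.secret)
  const token = randomBytes(16).toString("hex");
  tokens.set(token, path.join("secret-files", req.params.secret)); // hypothetical layout
  res.redirect(`/t/${token}`);
});
// Step 2: the token URL serves the document once and is then invalidated,
// so a leaked Referer of .../t/<token> no longer points to anything useful.
app.get("/t/:token", (req, res) => {
  const file = tokens.get(req.params.token);
  tokens.delete(req.params.token);
  if (!file) {
    res.status(404).end();
    return;
  }
  res.sendFile(path.resolve(file));
});
app.listen(8080);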

Related

Is there another way to set cookies than through HTTP headers?

I'm writing some http client code to interact with a website, and I need to set some cookies. Simply visiting the website sets 4 cookies (as seen in Chrome Settings).
However, when I look at the HTTP response headers for when those cookies were set (using Live HTTP Headers extension), there is no Set-Cookie header anywhere. How were those cookies set? Is there another way than through Set-Cookie?
Edit: Some of the cookies are HttpOnly.
If you load a site in your browser, it might also load other assets, and the responses for those assets can also set cookies (provided they are on the same domain).
But there is a second way to set cookies: with JavaScript, via document.cookie.
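For example, a script running on the page can create a cookie with no Set-Cookie header ever appearing on the network (the cookie name below is made up):
// Runs in the browser; no HTTP response header is involved.
document.cookie = "theme=dark; path=/; max-age=31536000"; // "theme" is a hypothetical name
console.log(document.cookie); // lists all cookies visible to script for this document
Note that cookies flagged HttpOnly cannot be created or read this way, so those must have been set by a Set-Cookie header on some response, possibly one for another asset on the same domain.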
As far as I know, if your (server-side) JavaScript or Python code sets a cookie for that domain, then the response will include a Set-Cookie header, which you can see in the browser's developer tools.
I see that you're using the Live HTTP Headers extension, but it doesn't look like it shows that field in the response.
I tried looking for other extensions that could show it, but I wasn't able to find one. We can always fall back to the Chrome DevTools: in the Network tab you should see the full request and response headers.

Disable cache sharing among websites

Is there a way to tell the browser not to share a cached resource among websites?
I want to give websites a link to some JavaScript on my server, and I want the response to be different for each domain, using the Referer header as the check.
The cached response should only be available to the domain that requested it; when end users visit another site that uses the same script link, another request should be made.
I'm not sure I understand your question correctly.
Is your scenario something like this: stackoverflow.com and yourwebsite.com both use the same script, "https://ajax.googleapis.com/ajax/libs/jquery/1.12.4/jquery.min.js", but you don't want the cached copy to be shared with stackoverflow.com?
That is under the control of googleapis.com's web server.
So if the cached resource's origin server (googleapis.com) wants to implement the feature you describe, it can use the Vary response header, which defines the secondary key of the cache entry:
Maybe "Vary: Origin", but that only works for CORS requests
Maybe "Vary: Referer", but the referer contains the URL path
It still doesn't solve your problem but I hope it helps.
See the MDN HTTP caching documentation and RFC 7234 Section 4.1.

How to exploit HTTP header XSS vulnerability?

Let's say that a page is just printing the value of the HTTP 'referer' header with no escaping. So the page is vulnerable to an XSS attack, i.e. an attacker can craft a GET request with a referer header containing something like <script>alert('xss');</script>.
But how can you actually use this to attack a target? How can the attacker make the target issue that specific request with that specific header?
This sounds like a standard reflected XSS attack.
In a reflected XSS attack, the attacker needs the victim to visit a site that is in some way under the attacker's control, even if this is just a forum where the attacker can post a link in the hope that somebody will follow it.
In the case of a reflected XSS attack with the referer header, then the attacker could redirect the user from the forum to a page on the attacker's domain.
e.g.
http://evil.example.com/?<script>alert(123)</script>
This page in turn redirects to the following target page in a way that preserves referer.
http://victim.example.org/vulnerable_xss_page.php
Because the referer header is shown on this page without proper escaping, http://evil.example.com/?<script>alert(123)</script> gets output within the HTML source, executing the alert. Note that this works in Internet Explorer only.
Other browsers will automatically URL-encode the payload, rendering %3cscript%3ealert%28123%29%3c/script%3e instead, which is safe.
I can think of a few different attacks; maybe there are more, which others will hopefully add. :)
If your XSS is just some header value reflected unencoded in the response, I would say that's less of a risk than stored XSS, though there may be factors to consider. For example, if it's a header that the browser adds and that can be changed in the browser (like the user agent), an attacker may get access to a client computer, change the user agent, and then let a normal user use the website, now with the attacker's JavaScript injected. Another example that comes to mind: a website may display the URL that redirected you there (the referer); in this case the attacker only has to link to the vulnerable application from his carefully crafted URL. These are edge cases, though.
If it's stored, that's more straightforward. Consider an application that logs user access with all request headers, and let's suppose there is an internal application for admins that they use to inspect logs. If this log viewer application is web based and vulnerable, any javascript from any request header could be run in the admin context. Obviously this is just one example, it doesn't need to be blind of course.
Cache poisoning may also help with exploiting a header XSS.
Another thing I can think of is browser plugins. Flash is less prevalent now (thankfully), but with different versions of Flash you could set different request headers on your requests. What exactly you can and cannot set is a mess and very confusing across Flash plugin versions.
So there are several attacks, and it is necessary to treat all headers as user input and encode them accordingly.
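As a defensive sketch, HTML-encode any header value before reflecting it; the escaping helper below is hand-rolled purely for illustration:
// Minimal HTML escaping before reflecting a request header into a page.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
// e.g. inside a request handler (Express-style, shown as comments):
//   const referer = req.headers.referer ?? "";
//   res.send(`<p>You came from: ${escapeHtml(referer)}</p>`);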
Exploiting XSS in the referer header is almost like a traditional reflected XSS; the additional point is that the attacker's website redirects to the victim website, so a referer header containing the required JavaScript is attached to the request to the victim website.
One essential point that needs discussing is: why can this be exploited only with IE, and not with other browsers?
The traditional answer is that Chrome and Firefox automatically encode URL parameters, so XSS is not possible. But the interesting thing is that we have any number of bypasses for traditional XSS filters, so why shouldn't there be bypasses for this scenario too?
Yes, we can bypass it with the following payload, in the same way that HTML validation is bypassed in traditional payloads:
http://evil.example.com/?alert(1)//cctixn1f .
Here the response could be something like this:
The link on the referring page seems to be wrong or outdated.
(where "referring" is rendered as a link whose href is taken from the referer header)
If the victim clicks that "referring" link, the alert will fire.
Bottom line: not only in IE; XSS can be possible even in Chrome and Firefox when the referer is used as part of an href attribute.

Was there ever a proposal to include the URL fragment into the HTTP request?

In the current HTTP spec, the URL fragment (the part of the URL including and following the #) is not sent to the server in any way. However with the increased spread of AJAX, which uses the fragment to maintain some form of state, there are a lot of situations where it would be useful for the server to have knowledge of the URL fragment at request time.
For example, if you go to http://facebook.com, then click a user name in your stream, the URL becomes http://facebook.com/#!/username, which allows FB to update your page without reloading all of its bootstrap JS and HTML. However, if you were to reload this URL, the server would have no way of seeing the "#!/username" part, and therefore could not pre-render the content for you. This forces your browser to make an extra request once the client-side JavaScript has loaded and parsed the fragment.
I am wondering if there have been any efforts or proposals towards creating a standard mechanism to achieve this.
For example, there could be a standard HTTP header, which would be sent with the value of the URL fragment - any server which cared about such things could then have access to it.
It seems like this would be a very useful thing for the web-application community as a whole, so I am surprised to not have heard anything proposed. Perhaps I missed it though.
IMHO, the fragment identifier really is not a good place to store state; it was designed for something else.
That being said, http://www.jenitennison.com/blog/node/154 has a good discussion of the whole subject.
I found this proposal by Google to make Ajax pages crawlable, but it addresses a more constrained set of use cases. Specifically, it creates a way to replace the URL fragment with a URL parameter to obtain the same HTML output from the server as would be generated by a client visiting the equivalent URL with the fragment. However, such URLs are useless for actually running the Ajax apps, since they would necessitate a page reload every time.
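For reference, that scheme mapped "#!" fragments onto an _escaped_fragment_ query parameter, roughly like this (my own illustration of the mapping, not Google's code):
// Illustration of the "#!" -> _escaped_fragment_ mapping described in that proposal.
function toCrawlableUrl(url: string): string {
  const hashBang = url.indexOf("#!");
  if (hashBang === -1) return url;
  const base = url.slice(0, hashBang);
  const fragment = url.slice(hashBang + 2);
  const sep = base.includes("?") ? "&" : "?";
  return `${base}${sep}_escaped_fragment_=${encodeURIComponent(fragment)}`;
}
// toCrawlableUrl("http://facebook.com/#!/username")
//   -> "http://facebook.com/?_escaped_fragment_=%2Fusername"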
WebKit Bug 24175 ("URL Redirect Loses Fragment") refers to "Handling of fragment identifiers in redirected URLs", which may be of interest.
A suggestion for a future version of HTTP may be to add an (optional) Fragment header to the request, which holds the fragment identifier. Even simpler may be to allow an HTTP request to contain a fragment identifier.
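Until something like that exists, the server can only see the fragment if client-side code forwards it explicitly, for example like this (the X-Fragment header name and the /render endpoint are made up for illustration):
// Client-side workaround: the browser never sends the fragment, so send it ourselves.
const fragment = window.location.hash; // e.g. "#!/username"
fetch("/render", { headers: { "X-Fragment": fragment } })
  .then((res) => res.text())
  .then((html) => {
    document.querySelector("#content")!.innerHTML = html; // "#content" is hypothetical
  });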

Is there any downside for using a leading double slash to inherit the protocol in a URL? i.e. src="//domain.example"

I have a stylesheet that loads images from an external domain and I need it to load from https:// from secure order pages and http:// from other pages, based on the current URL. I found that starting the URL with a double slash inherits the current protocol. Do all browsers support this technique?
HTML ex:
<img src="//cdn.domain.example/logo.png" />
CSS ex:
.class { background: url(//cdn.domain.example/logo.png); }
If the browser supports RFC 1808 Section 4, RFC 2396 Section 5.2, or RFC 3986 Section 5.2, then it will indeed use the page URL's scheme for references that begin with "//".
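You can observe that resolution behaviour with the WHATWG URL API, which implements essentially the same rules (the base URLs below are made-up examples):
// Scheme-relative references take their scheme from the base URL.
const fromHttps = new URL("//cdn.domain.example/logo.png", "https://shop.example/checkout");
const fromHttp = new URL("//cdn.domain.example/logo.png", "http://shop.example/products");
console.log(fromHttps.href); // "https://cdn.domain.example/logo.png"
console.log(fromHttp.href);  // "http://cdn.domain.example/logo.png"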
When used on a <link> or @import, IE7/IE8 will download the file twice, per http://paulirish.com/2010/the-protocol-relative-url/
Update from 2014:
Now that SSL is encouraged for everyone and doesn’t have performance concerns, this technique is now an anti-pattern. If the asset you need is available on SSL, then always use the https:// asset.
One downside occurs if your URLs are viewed outside the context of a web page. For example, an email message sitting in an email client (say, Outlook) effectively has no URL, and when you're viewing a message containing a protocol-relative URL, there is no obvious protocol context at all (the message itself is independent of the protocol used to fetch it, whether it's POP3, IMAP, Exchange, uucp or whatever) so the URL has no protocol to be relative to. I've not investigated compatibility with email clients to see what they do when presented with a missing protocol handler - I'm guessing that most will take a guess at http. Apple Mail refuses to let you enter a URL without a protocol. It's analogous to the way that relative URLs do not work in email because of a similarly missing context.
Similar problems could occur in other non-HTTP contexts such as in tweets, SMS messages, Word documents etc.
The more general explanation is that anonymous protocol URLs cannot work in isolation; there must be a relevant context. In a typical web page it's thus fine to pull in a script library that way, but any external links should always specify a protocol. I did try one simple test: //stackoverflow.com maps to file:///stackoverflow.com in all browsers I tried it in, so they really don't work by themselves.
The reason for the technique could be to provide portable web pages. If the outer page is not transported encrypted (http), why should the linked scripts be encrypted? That seems an unnecessary performance loss. If, on the other hand, the outer page is transported encrypted (https), then the linked content should be encrypted too; if the page is encrypted but the linked content is not, IE issues a Mixed Content warning. The reason is that an attacker could manipulate the scripts on the way. See http://ie.microsoft.com/testdrive/Browser/MixedContent/Default.html?o=1 for a longer discussion.
The HTTPS Everywhere campaign from the EFF suggests using https whenever possible. We have the server capacity these days to serve web pages always encrypted.
Just for completeness. This was mentioned in another thread:
The "two forward slashes" are a common shorthand for "whatever protocol is being used right now"
if (plain http environment) {
use 'http://example.com/my-resource.js'
} else {
use 'https://example.com/my-resource.js'
}
Please check the full thread.
It seems to be a pretty common technique now. There is no downside; it only helps to unify the protocol for all assets on the page, so it should be used wherever possible.
