A website was audited for vulnerabilities and the scan flagged XSS on many pages which, from my point of view, do not appear to be vulnerable, since I don't display any data captured from the form on the page or from the URL (such as the query string).
Acunetix flagged the following URL as XSS by appending some JavaScript code:
http://www.example.com/page-one//?'onmouseover='pU0e(9527)
Report:
GET /page-one//?'onmouseover='pU0e(9527)'bad=' HTTP/1.1
Referer: https://www.example.com/
Connection: keep-alive
Authorization: Basic FXvxdAfafmFub25cfGb=
Accept: */*
Accept-Encoding: gzip,deflate
Host: example.com
So, how could this be vulnerable? Is it even possible that it is?
Above all, if an onmouseover attribute can be injected like this, how would it actually affect the page?
Since you asked for more information, I'll post my response as an answer.
The main question as I see it:
Can there still be an XSS vulnerability from the query string if I don't use any of the parameters in my code?
Well, if they actually aren't used at all, then it should not be possible. However, there are subtle ways that you could be using them that you may have overlooked. (Posting the actual source code would be useful here).
One example would be something like this:
Response.Write("<a href='" +
HttpContext.Current.Request.Url.AbsoluteUri) + "'>share this link!</a>
This would put the entire URL into the body of the web page. The attacker can make use of the query string even though it is never mapped to a variable, because the full URL is written into the response. With the scanner's payload, the rendered output becomes something like <a href='http://www.example.com/page-one//?'onmouseover='pU0e(9527)'bad=''>share this link!</a>: the first ' in the query string closes the href attribute, and the rest is parsed as an onmouseover event handler, so attacker-controlled JavaScript runs when the link is hovered. Keep in mind the URL could also end up in a hidden field.
Be careful writing out values like HttpContext.Current.Request.Url.AbsoluteUri or HttpContext.Current.Request.Url.PathAndQuery.
Some tips:
Confirm that the scanner is not reporting a false positive by opening the link in a modern browser like Chrome. Check the console for an error about "XSS Auditor" or similar.
Use an anti-XSS encoding library to encode untrusted output before writing it to the response (see the sketch after these tips).
Read the OWASP XSS Prevention Cheat Sheet: https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet
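For instance, with the Response.Write example above, encoding the URL before it is written defuses the payload. A minimal sketch, assuming ASP.NET/C# as in that snippet (Microsoft's AntiXssEncoder would be a stricter alternative):
using System.Web;

string url = HttpContext.Current.Request.Url.AbsoluteUri;
// Double-quote the attribute and encode the value; the quotes in the
// scanner's payload are escaped and can no longer break out of href.
// System.Web.Security.AntiXss.AntiXssEncoder.HtmlAttributeEncode(url)
// is a whitelist-based alternative (.NET 4.5+).
Response.Write("<a href=\"" + HttpUtility.HtmlAttributeEncode(url) + "\">share this link!</a>");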
We've noticed that some users of our website have a problem: if they follow links to the website from an external source (specifically Outlook and MS Word), they arrive at the website in such a way that User.IsAuthenticated is false, even though they are still logged in in other tabs.
After hours of diagnosis, it appears to be because the FormsAuthentication cookie is sometimes not sent when the external link is clicked. If we examine the traffic in Fiddler, we see different headers for links clicked within the website versus links clicked in a Word document or email. There doesn't appear to be anything wrong with the cookie itself (it has "/" as the path, no domain, and a future expiration date).
Here is the cookie being set:
Set-Cookie: DRYXADMINAUTH2014=<hexdata>; expires=Wed, 01-Jul-2015 23:30:37 GMT; path=/
Here is a request sent from an internal link:
GET http://domain.com/searchresults/media/?sk=creative HTTP/1.1
Host: domain.com
Cookie: Diary_SessionID=r4krwqqhaoqvt1q0vcdzj5md; DRYXADMINAUTH2014=<hexdata>;
Here is a request sent from an external (Word) link:
GET http://domain.com/searchresults/media/?sk=creative HTTP/1.1
Host: domain.com
Cookie: Diary_SessionID=cpnriieepi4rzdbjtenfpvdb
Note that the .NET FormsAuthentication token is missing from the second request. The problem doesn't seem to be affected by which browser is set as default and happens in both Chrome and Firefox.
Is this normal/expected behaviour, or is there a way we can fix this?
Turns out this is a known issue with Microsoft Word, Outlook and other MS Office products: <sigh>
See: Why are cookies unrecognized when a link is clicked from an external source (i.e. Excel, Word, etc...)
Summary: Word tries to open the URL itself (in case it's an Office document) but gets redirected because it doesn't have the authentication cookie. Due to a bug in Word, it then incorrectly opens the redirected URL, rather than the original URL, in the OS's default browser. If you monitor the "Process" column in Fiddler, it's easy to see the exact behaviour from the linked article occurring.
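A mitigation often suggested for this Office behaviour (not taken from the linked answer, so treat it as a sketch) is to detect Office user agents and give them a plain 200 HTML page instead of the authentication redirect; since Word then sees no redirect, it hands the original URL to the real browser, which re-requests it with its own cookies. Assuming ASP.NET, and with illustrative user-agent substrings:
// e.g. in Global.asax.cs
protected void Application_BeginRequest(object sender, EventArgs e)
{
    string ua = Request.UserAgent ?? string.Empty;
    bool isOfficeClient =
        ua.IndexOf("ms-office", StringComparison.OrdinalIgnoreCase) >= 0 ||
        ua.IndexOf("Microsoft Office", StringComparison.OrdinalIgnoreCase) >= 0;

    if (isOfficeClient)
    {
        // 200 OK with a tiny page instead of the FormsAuthentication 302,
        // so Word passes the original URL (not a redirect target) to the browser.
        Response.ContentType = "text/html";
        Response.Write("<html><body>Opening in your browser...</body></html>");
        CompleteRequest();
    }
}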
For Filepicker.io we built "grab from URL", but certain sites aren't happy when no User-Agent header is passed. I could just use a stock browser user agent as suggested in some other answers, but as a good web citizen I wanted to know if there is a more appropriate user agent to set for a server requesting another server's data.
It depends on the language you wrote your server in. For example, Python's urllib sets a default of User-Agent: Python-urllib/2.1, but you can just as easily set it to something like User-Agent: filepicker.io/<your-version-here>, or something more language-specific if you'd like.
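For instance, if the fetching service happened to be written in C#, a minimal sketch with HttpClient (the exact string is just a suggestion; many services also append a contact URL in a comment, e.g. "(+https://www.filepicker.io)"):
using System;
using System.Net.Http;
using System.Threading.Tasks;

class UserAgentExample
{
    static async Task Main()
    {
        using var client = new HttpClient();
        // Identify the service doing the fetching rather than impersonating a browser.
        client.DefaultRequestHeaders.UserAgent.ParseAdd("filepicker.io/1.0");

        string body = await client.GetStringAsync("https://example.com/");
        Console.WriteLine(body.Length);
    }
}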
I was wondering how companies like Double-Click include a cookie in their image responses to track users. Similarly, how do the images (e.g. smart pixels) send information back to their servers?
Please provide a scripting example if possible (any language is okay). Note: if this is resolved by doing something server-side, please describe how this would be accomplished using Apache.
Cheers,
Rob
How do they include a cookie? They configure the server, probably via a script, to send cookies with the responses. Images are fetched with ordinary HTTP requests that follow the HTTP protocol; there is nothing magical about them.
"Smart pixels" convey their information simply via the request the browser must send to the server in order to load the image. Information about the user/browser, can be gathered via javascript and embedded in the url.
To do this in PHP, you'd use the setcookie() function:
<?php
$value = 'something from somewhere';
setcookie("TestCookie", $value);
setcookie("TestCookie", $value, time()+3600); /* expire in 1 hour */
setcookie("TestCookie", $value, time()+3600, "/~rasmus/", ".example.com", 1);
?>
That code was taken from the PHP setcookie documentation. Basically this adds a Set-Cookie header to the HTTP response, like: Set-Cookie: UserID=JohnDoe; Max-Age=3600; Version=1
See http://en.wikipedia.org/wiki/List_of_HTTP_header_fields and search for Set-Cookie
Also, in scripting languages like PHP, make sure you set cookies before you render any content. This is because the HTTP headers are the first thing sent in the response; once you start writing body content, the headers have already been sent.
Another quote from the PHP:setcookie doc:
Like other headers, cookies must be sent before any output from your
script (this is a protocol restriction). This requires that you place
calls to this function prior to any output, including <html> and <head>
tags as well as any whitespace.
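The question says any language is okay, so here is a sketch of the serving side in C# (an ASP.NET IHttpHandler; the handler and parameter names are illustrative). A page would embed the pixel as something like <img src="http://tracker.example.com/pixel.gif?page=/article" width="1" height="1">, and the handler records what arrives in the query string, sets a tracking cookie, and returns a 1x1 transparent GIF. If you only want Apache involved, its access log already records each pixel request's query string, and mod_usertrack can set the tracking cookie for you.
using System;
using System.Web;

public class TrackingPixelHandler : IHttpHandler
{
    // 1x1 transparent GIF.
    private static readonly byte[] PixelBytes = Convert.FromBase64String(
        "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7");

    public void ProcessRequest(HttpContext context)
    {
        // Give the visitor an ID cookie if they don't already have one.
        HttpCookie cookie = context.Request.Cookies["TrackerID"];
        if (cookie == null)
        {
            cookie = new HttpCookie("TrackerID", Guid.NewGuid().ToString());
            cookie.Expires = DateTime.UtcNow.AddYears(1);
            context.Response.Cookies.Add(cookie);
        }

        // Whatever the page embedded in the pixel URL shows up here;
        // log it together with the cookie value, user agent, referer, etc.
        string page = context.Request.QueryString["page"];
        // ... write (cookie.Value, page, context.Request.UserAgent) to your log/store ...

        context.Response.ContentType = "image/gif";
        context.Response.Cache.SetCacheability(HttpCacheability.NoCache);
        context.Response.BinaryWrite(PixelBytes);
    }

    public bool IsReusable { get { return true; } }
}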
I'm writing a simple crawler, and ideally to save bandwidth, I'd only like to download the text and links on the page. Can I do that using HTTP Headers? I'm confused about how they work.
You're on the right track to solving the problem.
I'm not sure how much you already know about HTTP headers, but basically an HTTP request is just a formatted string sent to a web server; it follows a protocol and is pretty straightforward in that respect. You write a request and receive a response. The requests look like the things you see in the Firefox plugin LiveHTTPHeaders at https://addons.mozilla.org/en-US/firefox/addon/3829/.
I wrote a small post at my site http://blog.gnucom.cc/2010/write-http-request-to-web-server-with-php/ that shows you how you can write a request to a web server and then later read the response. If you only accept text/html you'll only accept a subset of what is available on the web (so yes, it will "optimize" your script to an extent). Note this example is really low level, and if you're going to write a spider you may want to use an existing library like cURL or whatever other tools your implementation language offers.
Yes, by sending Accept: text/html you should only get HTML as a valid response. That's at least how it ought to be.
But in practice there is a huge difference between the standards and the actual implementations. And proper content negotiation (that’s what Accept is for) is one of the things that are barely supported.
An HTML page contains just the text plus some tag markup.
Images, scripts and stylesheets are (usually) external files that are referenced from the HTML markup. This means that if you request a page, you will already receive just the text (without the images and other stuff).
Since you are writing the crawler, you should make sure it doesn't follow URLs from images, scripts or stylesheets.
I'm not 100% sure, but I believe that GET /foobar.png will return the image even if you send Accept: text/html. For this reason I believe you should just filter what kind of URLs you crawl.
In addition, you may try to read the response headers in the crawler and close the connection before you read the body if the Content-Type is not text/html. That might be worthwhile to avoid downloading undesired large files.
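A sketch of that header-check idea, assuming C# and HttpClient (and keeping in mind, per the answers above, that many servers ignore the Accept header anyway):
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class CrawlerFetch
{
    static readonly HttpClient Client = new HttpClient();

    // Fetch a URL, but give up before downloading the body if it isn't HTML.
    static async Task<string> FetchHtmlAsync(string url)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("text/html"));

        // ResponseHeadersRead returns as soon as the headers arrive,
        // before the body has been downloaded.
        using var response = await Client.SendAsync(
            request, HttpCompletionOption.ResponseHeadersRead);

        string mediaType = response.Content.Headers.ContentType?.MediaType;
        if (mediaType != "text/html")
        {
            // Disposing the response releases the connection without
            // reading the (possibly large) body.
            return null;
        }

        return await response.Content.ReadAsStringAsync();
    }
}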
Let's say that I'm uploading a large file via a POST HTTP request.
Let's also say that I have another parameter (other than the file) that names the resource which the file is updating.
The resource name cannot be part of the URL path the way you would do it with REST (e.g. foo.com/bar/123). Let's say this is due to a combination of technical and political reasons.
The server needs to ignore the file if the resource name is invalid or, say, the IP address and/or the logged in user are not authorized to update the resource. This can easily be done if the resource parameter came first in the POST request.
It looks like, if this POST comes from an HTML form that contains the resource name field first and the file field second, most (all?) browsers preserve this order in the POST request. But it would be naive to fully rely on that, no?
In other words the order of HTTP parameters is insignificant and a client is free to construct the POST in any order. Isn't that true?
Which means that, at least in theory, the server may end up storing the whole large file before it can deny the request.
It seems to me that this is a clear case where RESTful urls have an advantage, since you don't have to look at the POST content to perform certain authorization/error checking on the request.
Do you agree? What are your thoughts, experiences?
More comments please! In my mind, everyone who's doing large file uploads (or any file uploads for that matter) should have thought about this.
You can't rely on the order of POST variables, that's for sure. In particular, you can't trust form arrays to be in the correct order when submitting/POSTing the form. If you want to save the bandwidth, you might want to check the credentials etc. somewhere else, before getting to the point of posting the actual data.
I'd just stick whatever variables you need first in the request's querystring.
On the client,
<form action="/yourhandler?user=0&resource=name" method="post">
<input type="file" name="upload" /></form>
Gets you
POST /yourhandler?user=0&resource=name HTTP/1.1
Content-Type: multipart/form-data; boundary=-----
...
-----
Content-Disposition: form-data; name="upload"; filename="somebigfile.txt"
Content-Type: text/plain
...
On the server, you'd then be able to check the querystring before the upload completes and shut it down if necessary. (This is more or less the same as REST but may be easier to implement based on what you have to work with, technically and politically speaking.)
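A sketch of the server side of that query-string approach, assuming ASP.NET/C# (handler name, parameters and the authorization check are illustrative). Whether the transfer is actually cut short depends on the host, but GetBufferlessInputStream at least keeps ASP.NET from buffering the whole body before your check runs:
using System;
using System.Web;

// Illustrative handler mapped to /yourhandler from the form above.
public class UploadHandler : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        // The query string comes from the request line, so it is available
        // before any of the multipart body has been consumed.
        string user = context.Request.QueryString["user"];
        string resource = context.Request.QueryString["resource"];

        if (!IsAuthorized(user, resource))
        {
            context.Response.StatusCode = 403;
            // Close the connection instead of finishing the response politely,
            // so the client stops sending the (large) body.
            context.Response.Close();
            return;
        }

        // Only now touch the body; GetBufferlessInputStream reads it as a
        // stream instead of buffering the whole upload first.
        using (var body = context.Request.GetBufferlessInputStream())
        {
            // ... parse the multipart data and save the file ...
        }
    }

    private static bool IsAuthorized(string user, string resource)
    {
        // Hypothetical check; replace with real validation of user/resource.
        return !string.IsNullOrEmpty(resource);
    }

    public bool IsReusable { get { return true; } }
}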
Your other option might be to use a cookie to store that data, but this obviously fails when a user has cookies disabled. You might also be able to use the Authorization header.
You should provide more background on your use case. So far, I see absolutely no reason why you should not simply PUT the large entity to the resource you intend to create or update:
PUT /documents/some-doc-name
Content-Type: text/plain
[many bytes of text data]
Why do you think that is not a possible solution in your case?
Jan