We have a long-running website where XSS lurks. The problem is that some developers retrieve Request["sth"] directly, without calling HtmlEncode()/HtmlDecode(), process it, and put the result on the web.
I wonder whether there is a mechanism, such as an HttpModule, that could HtmlEncode() every item in an HTTP request and so mitigate XSS to some extent.
Any suggestions are appreciated.
Rgds,
Ricky
The problem is not retrieving Request data without HTML-encoding. In fact that's perfectly correct. You should not encode any text until the final output stage when you spit it into an HTML page.
Trying to blanket-encode incoming parameters, whether that's HTML-encoding or SQL-escaping, is entirely the wrong approach. It may hide XSS holes in your app, but it does not fix them: you will still have a hole wherever you output content that didn't come in through those parameters, or that has been processed since then. Meanwhile, the automatic encoding will fill your database with multiply-escaped crud.
You need to fix the output stage; that's where the problem lies.
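For instance, here is a minimal sketch of "encode only at output". The page and the commentLiteral control are hypothetical names, not code from the site in question:

```csharp
using System;
using System.Web;
using System.Web.UI;

// Hypothetical Web Forms page: the raw value is read and processed as-is,
// and HtmlEncode() is applied only at the point where it is written into HTML.
public partial class CommentPage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Read the raw value; do NOT encode here. Store/process it unmodified.
        string comment = Request["comment"];

        // Final output stage: encode for an HTML element context.
        // commentLiteral is assumed to be a Literal declared in the .aspx markup.
        commentLiteral.Text = HttpUtility.HtmlEncode(comment);
    }
}
```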
As bobince said, this is an output problem, not an input problem. If you can isolate where this data is being output on the page, you could create a filter and add it to the Response object; that filter would pick out the common output areas and HtmlEncode them.
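A very rough sketch of that response-filter idea, assuming the pages mark their "common output" regions with hypothetical <!--ENC--> ... <!--/ENC--> comments (a real filter would need to handle character encodings and streaming more carefully):

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;

// Installs the filter for HTML responses.
public class EncodeRegionsModule : IHttpModule
{
    public void Init(HttpApplication app)
    {
        app.ReleaseRequestState += delegate(object sender, EventArgs e)
        {
            HttpResponse response = ((HttpApplication)sender).Response;
            if (response.ContentType == "text/html")
                response.Filter = new EncodeRegionsFilter(response.Filter);
        };
    }

    public void Dispose() { }
}

// Buffers the rendered page, then HTML-encodes the marked regions on Close.
public class EncodeRegionsFilter : MemoryStream
{
    private readonly Stream _inner;

    public EncodeRegionsFilter(Stream inner) { _inner = inner; }

    public override void Close()
    {
        string html = Encoding.UTF8.GetString(ToArray());

        // Encode everything between the hypothetical markers.
        html = Regex.Replace(
            html,
            "<!--ENC-->(.*?)<!--/ENC-->",
            delegate(Match m) { return HttpUtility.HtmlEncode(m.Groups[1].Value); },
            RegexOptions.Singleline);

        byte[] output = Encoding.UTF8.GetBytes(html);
        _inner.Write(output, 0, output.Length);
        _inner.Close();
        base.Close();
    }
}
```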
I have a project where I need to know whether a visitor legitimately arrived from a QR code. The document.referrer value for a visit from a QR code is blank. I have looked at answers suggesting a parameter in the query string (e.g. ?source=qr), but anyone could easily add that parameter to the URL and my code would believe the visit came from a QR code (e.g. www.project.com/check.page?source=qr). I have thought of adding code to check that the request comes from a mobile phone or tablet as a secondary check, but many browsers have add-ons that can fool websites.
Any suggestions would be greatly appreciated.
Thanks in advance.
I think the best solution for you is to create your regional QR codes pointing to:
Region 1) http://example.com/?qr=f61060194c9c6763bb63385782aa216f
Region 2) http://example.com/?qr=731417b947aa548528344fab8e0f29b6
Region 3) http://example.com/?qr=df189e7f7c8b89edd05ccc6aec36c36d
If the value of the qr parameter is anything other than f61060194c9c6763bb63385782aa216f, 731417b947aa548528344fab8e0f29b6, or df189e7f7c8b89edd05ccc6aec36c36d, you can ignore it and assume the user didn't come from any QR code.
Of course, any user can remove the source parameter. But at least they can't add a valid one unless they really had access to the QR code.
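A minimal sketch of that token check (the page class name is hypothetical; the tokens are the ones from the example URLs above):

```csharp
using System;
using System.Collections.Generic;
using System.Web.UI;

public partial class QrLandingPage : Page
{
    // The per-region tokens baked into the QR codes above.
    private static readonly Dictionary<string, string> QrTokens =
        new Dictionary<string, string>
        {
            { "f61060194c9c6763bb63385782aa216f", "Region 1" },
            { "731417b947aa548528344fab8e0f29b6", "Region 2" },
            { "df189e7f7c8b89edd05ccc6aec36c36d", "Region 3" }
        };

    protected void Page_Load(object sender, EventArgs e)
    {
        string token = Request.QueryString["qr"];
        string region;

        if (token != null && QrTokens.TryGetValue(token, out region))
        {
            // Record the visit as coming from this region's QR code.
        }
        else
        {
            // Missing or unknown token: treat it as an ordinary visit.
        }
    }
}
```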
...but anyone could easily add the parameter into the URL and my code would believe it is from a QR code
Well, anyone could also scan the QR code, view the link, and remove the source=qr from it.
Data collection is never 100% reliable. Users can change their browser's user agent, inject cookies with some strange values, open your page through a proxy server, and so on.
You could create your own device or app for scanning the QR code. If you read the post I've linked, you will see that this is a waste of time and resources.
So what is left is to build a solution that works for most users. Appending a source=qr parameter to your URL seems to be the simplest option. You could also link to an entirely different domain and redirect the request from there, which would be more fraud-resistant. But it will never be 100% accurate.
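And a small sketch of that separate-domain idea: the QR codes point at a tiny handler on its own host, which records the scan before forwarding. The handler name, the query key, and the target URL are all hypothetical:

```csharp
using System.Web;

// Registered as e.g. qrscan.ashx on a domain used only inside the QR codes.
public class QrRedirectHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        string region = context.Request.QueryString["r"];

        // Record the scan here (database row, log entry, analytics call, ...).

        context.Response.Redirect(
            "http://example.com/?qr=" + HttpUtility.UrlEncode(region ?? ""));
    }
}
```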
I want to prevent or hamper the parsing of the classifieds website that I'm improving.
The website uses an API with JSON responses. As a solution, I want to add useless data in between my real data, since scrapers will probably parse by ID, and give no clue about it in either the JSON response body or the headers, so they won't be able to distinguish it without close inspection.
To keep users from seeing it, I won't serve that "useless data" to my users unless they request it explicitly by ID. From an SEO perspective, I know Google won't crawl the useless data if there is no internal or external link to it.
How reliable would that technique be? And what problems/disadvantages/drawbacks do you think could occur in terms of user experience or SEO? Any ideas or suggestions would be very much appreciated.
P.S. I'm also limiting large numbers of requests made in a short time, but it doesn't help, which is why I'm considering this technique.
I think outright banning scrapers won't work better, because they can change IPs and so on.
Maybe a better solution is to require a login to access more than, say, 50 item details (and that might even work against Selenium?). Requiring registration would make it harder, and even if they do register, I can identify those users and slow their response times, etc.
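For what it's worth, a rough sketch of the decoy idea described above (the Listing class and the ID range are hypothetical, and the real API would strip the IsDecoy flag before serializing to JSON):

```csharp
using System;
using System.Collections.Generic;

public class Listing
{
    public long Id { get; set; }
    public string Title { get; set; }
    public decimal Price { get; set; }
    public bool IsDecoy { get; set; }   // internal flag; the real serializer would skip it
}

public static class DecoyMixer
{
    private static readonly Random Rng = new Random();

    // Interleave roughly one decoy per 'everyN' real items before serializing.
    public static List<Listing> Mix(IEnumerable<Listing> realItems, int everyN)
    {
        var result = new List<Listing>();
        int i = 0;
        foreach (var item in realItems)
        {
            result.Add(item);
            if (++i % everyN == 0)
                result.Add(MakeDecoy());
        }
        return result;
    }

    private static Listing MakeDecoy()
    {
        // IDs in a range the UI never links to, so ordinary users never request them;
        // a scraper walking the list by ID will.
        return new Listing
        {
            Id = 9000000000L + Rng.Next(1, 1000000),
            Title = "Placeholder listing",
            Price = Rng.Next(10, 1000),
            IsDecoy = true
        };
    }
}
```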
I have solved a problem with a solution I found here on SO, but I am curious whether another idea I had is as bad as I think it might be.
I am debugging a custom security attribute we have on several of our controllers. The attribute currently redirects unauthorized users using a RedirectResult. This works fine except when the methods are called with Ajax: in those cases, the error returned to our JS consists of a text string of all the HTML of our error page (the one we redirect to), along with the HTTP code and text 200/OK. I have solved this issue using the "IsAjaxRequest" method described in the answer to this question, so now I am perfectly able to respond differently to Ajax calls.
Out of curiosity, however, I would like to know what pitfalls might exist if I were to instead have solved the issue by doing the following. To me it seems like a bad idea, but I can't quite figure out why...
The ActionExecutingContext ("filterContext") has an HttpContext, which has a Request, which in turn has an AcceptTypes string collection. I notice that on my Ajax calls, which expect JSON, the value of filterContext.HttpContext.Request.AcceptTypes[0] is "application/json." I am wondering what might go wrong if I were to check this string against one or more expected content types and respond to them accordingly. Would this work, or is it asking for disaster?
I would say it works perfectly, and I have been using that approach for years.
The whole point of request headers is to tell the server what the client accepts and expects.
I suggest you read more here about Web API and how it uses exactly that technique.
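A minimal sketch of that check inside a filter (the attribute name, the authorization stand-in, and the /Error/Denied path are hypothetical, not the poster's actual attribute):

```csharp
using System;
using System.Linq;
using System.Net;
using System.Web.Mvc;

public class AcceptAwareSecurityAttribute : ActionFilterAttribute
{
    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        bool authorized = false;   // stand-in for the real authorization check
        if (authorized)
            return;

        string[] accepts = filterContext.HttpContext.Request.AcceptTypes ?? new string[0];
        bool wantsJson = accepts.Any(t => t != null &&
            t.StartsWith("application/json", StringComparison.OrdinalIgnoreCase));

        if (wantsJson || filterContext.HttpContext.Request.IsAjaxRequest())
        {
            // Give the JS caller a real status code instead of a redirected HTML page.
            filterContext.Result = new HttpStatusCodeResult((int)HttpStatusCode.Unauthorized);
        }
        else
        {
            filterContext.Result = new RedirectResult("/Error/Denied");
        }
    }
}
```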
Is there a way to determine whether a request coming to a handler (let's assume the handler responds to GET and POST) is being made by a real browser versus a programmatic client?
I already know that it is easy to spoof things like the User-Agent and the Referer, but are there other headers that are more difficult to spoof? Maybe headers that are not commonly exposed in classes like .NET's HttpWebRequest?
The other path I looked at is using the encrypted ViewState to send a value to the browser that gets validated on the server side, though couldn't that value simply be scraped from the previous response and added as a POST parameter to the next request?
Any help would be much appreciated,
Cheers,
There is no easy way to differentiate, because in the end a programmatic POST looks the same to the server as a POST made by a user from a browser.
As mentioned, CAPTCHAs can be used to control posting, but they are not perfect (it is very hard, but not impossible, for a computer to solve them), and they can annoy users.
Another route is to allow only authenticated users to post, but this can still be done programmatically.
If you want to get a good feel for how people will try to abuse your site, you may want to look at http://seleniumhq.org/
This is very similar to the famous Halting Problem in computer science. See some more on the proof, and Alan Turing here: http://webcache.googleusercontent.com/search?q=cache:HZ7CMq6XAGwJ:www-inst.eecs.berkeley.edu/~cs70/fa06/lectures/computability/lec30.ps+alan+turing+infinite+loop+compiler&cd=1&hl=en&ct=clnk&gl=us
The most common way is to use CAPTCHAs. Of course, CAPTCHAs have their own issues (users don't really care for them), but they do make it much more difficult to post data programmatically. They don't really help with GETs, though you can force users to solve a CAPTCHA before delivering content.
There are many ways to do this, such as dynamically generated XHR requests that can only be made through human interaction.
Here's a great article on NP-Hard problems. I can see a huge possibility here:
http://www.i-programmer.info/news/112-theory/3896-classic-nintendo-games-are-np-hard.html
One way: you could use some tricky JS to handle tokens on click. Your server issues token IDs to elements on the page during the back-end render phase and logs them in a database or data file. Then, when users click around and submit, you compare the IDs sent back via the onclick() handler. There are plenty of ways around this, but you can apply heuristics to determine whether posts come too fast to be human; that is, even if someone scripts the hijacking of the token IDs and auto-submits, you can check whether the time between click events looks automated. Signed up for a Twitter account lately? They use passive human detection that, while not 100% foolproof, is slower and more difficult to break. Many, if not all, of the spam accounts there had to be opened by a human.
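A rough sketch of that token-plus-timing idea (all names are hypothetical, and a real implementation would persist the tokens in a database rather than in memory):

```csharp
using System;
using System.Collections.Generic;

public class ClickTokenTracker
{
    private readonly Dictionary<Guid, DateTime> _issued = new Dictionary<Guid, DateTime>();
    private readonly Dictionary<string, List<DateTime>> _clicksByUser =
        new Dictionary<string, List<DateTime>>();

    // Render phase: issue one token per clickable element and log it.
    public Guid IssueToken()
    {
        Guid token = Guid.NewGuid();
        _issued[token] = DateTime.UtcNow;
        return token;
    }

    // Submit phase: the onclick() handler sends the token back with the post.
    // Returns false when the token is unknown/replayed or the timing looks automated.
    public bool RecordClick(string userKey, Guid token)
    {
        if (!_issued.Remove(token))
            return false;

        List<DateTime> clicks;
        if (!_clicksByUser.TryGetValue(userKey, out clicks))
        {
            clicks = new List<DateTime>();
            _clicksByUser[userKey] = clicks;
        }
        clicks.Add(DateTime.UtcNow);

        // Heuristic: three clicks inside half a second is faster than a human clicks.
        if (clicks.Count >= 3)
        {
            TimeSpan span = clicks[clicks.Count - 1] - clicks[clicks.Count - 3];
            if (span < TimeSpan.FromMilliseconds(500))
                return false;
        }
        return true;
    }
}
```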
Another way: http://areyouahuman.com/
As long as you are using encrypted methods, verifying humanity without a crappy CAPTCHA is possible. That said, don't ignore your headers either; these are complementary approaches.
The key is to have enough complexity to make for an NP-complete problem: the number of ways the total set of problems can be solved is extraordinary. http://en.wikipedia.org/wiki/NP-complete
When the day comes that AI can solve multiple complex human problems on its own, we will have bigger things to worry about than request tampering.
http://louisville.academia.edu/RomanYampolskiy/Papers/1467394/AI-Complete_AI-Hard_or_AI-Easy_Classification_of_Problems_in_Artificial
Another company doing interesting research is http://www.vouchsafe.com/play-games. They actually use games designed to train the RTT (reverse Turing test) to be solvable only by humans!
It's C# + .NET 2.0, but this could be pretty much any environment.
I have a list page on my website that you can reach from several different locations (modules). It is a normal list screen, with add/edit (and other) links in it.
You sometimes get there from one module, sometimes from another. The only thing I would like to fix is the Back link: it should always take you back to the module you came from.
What is the best way to do this?
1. Session variable (so far my best choice, but I am not really happy using it)
2. URL. I could pass a parameter everywhere and "carry it with me" (I don't like this much because of the number of pages related to this list; I would have to carry that variable through the add, edit, and other screens, too many places to change)
3. Cookies? (There's a limited number of cookies, and things like that; I don't really like this one either.)
4, 5, 6...?
How are all you guys handling this issue?
Save the referrer in Session and redirect to it from the Back link's event handler.
This assumes you are doing nothing funky with the querystring.
I have concerns myself about using Session to store information but for this scenario, it is perfectly acceptable.
Cookies are another option, but overkill unless you factor in the possibility of losing session information on a website/app-pool recycle.
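A minimal Web Forms sketch of the Session approach (the page class, the "BackUrl" key, and the fallback URL are hypothetical):

```csharp
using System;
using System.Web.UI;

public partial class ItemListPage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // On the initial load, remember where the user came from.
        // (A fuller version would also ignore referrers that are the
        //  list/add/edit pages themselves.)
        if (!IsPostBack && Request.UrlReferrer != null)
            Session["BackUrl"] = Request.UrlReferrer.ToString();
    }

    protected void BackLink_Click(object sender, EventArgs e)
    {
        string backUrl = Session["BackUrl"] as string;
        Response.Redirect(string.IsNullOrEmpty(backUrl) ? "~/Default.aspx" : backUrl);
    }
}
```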