How to prevent XSS in ASP.NET

What techniques can one use to prevent cross-site scripting in ASP.NET? Are there any ready-made implementations that one can use to protect a website against XSS?

We did in-house development for this purpose for a long time, but Microsoft eventually provided a library for it, and we have now replaced our own library with it completely. It can be used as follows:
string sanitizedString = Microsoft.Security.Application.Sanitizer.GetSafeHtmlFragment(myStringToBeChecked);
The only problem with this method is that it trims runs of whitespace that are separated by line-ending characters. If you do not want that to happen, consider splitting the string on line endings (\r\n) first, counting the whitespace characters before and after each resulting piece, applying the sanitizer to each piece, putting the whitespace back, and concatenating the results.
Other than that, the Microsoft library works fine.
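The whitespace workaround described above can be sketched as follows. This is a minimal illustration in Python, not the author's in-house code. The name sanitize_preserving_whitespace is hypothetical, and stand_in_sanitizer is only a placeholder that escapes and trims its input to imitate the trimming behaviour; it is not equivalent to Sanitizer.GetSafeHtmlFragment.

```python
import html
import re
from typing import Callable


def stand_in_sanitizer(fragment: str) -> str:
    # Placeholder only (assumption): escapes HTML and trims, to imitate a
    # sanitizer that drops surrounding whitespace.
    return html.escape(fragment).strip()


def sanitize_preserving_whitespace(text: str, sanitize: Callable[[str], str]) -> str:
    """Sanitize line by line, restoring each line's leading/trailing whitespace."""
    # The capture group keeps the line endings so they can be re-joined unchanged.
    parts = re.split(r"(\r\n|\r|\n)", text)
    result = []
    for part in parts:
        if part in ("\r\n", "\r", "\n"):
            result.append(part)
            continue
        core = part.strip(" \t")
        # Whitespace before the content of this line.
        lead = part[: len(part) - len(part.lstrip(" \t"))]
        # Whitespace after the content (empty when the line is all whitespace,
        # because that whitespace is already counted in `lead`).
        trail = part[len(part.rstrip(" \t")):] if core else ""
        result.append(lead + (sanitize(core) if core else "") + trail)
    return "".join(result)


sample = "  first <b>\r\n\tsecond  \r\n"
cleaned = sanitize_preserving_whitespace(sample, stand_in_sanitizer)
```

In a real project the callable passed as sanitize would be whatever sanitizer the application actually uses.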

The Microsoft Anti-Cross Site Scripting Library (AntiXSS) is a good start. It has some useful helper methods for preventing XSS, and it is now part of the larger Microsoft Web Protection Library.

You can protect against cross-site scripting attacks in the following ways:
1. Use HTML encoding when saving input received from web application controls such as text boxes.
2. Use InnerText instead of InnerHtml when displaying data on the page.
3. Sanitize input before saving the data to the database.

Related

Encoder.HtmlEncode encodes Farsi characters

I want to use the Microsoft AntiXss library for my project. When I use the Microsoft.Security.Application.Encoder.HtmlEncode(str) function to safely show some value in my web page, it encodes Farsi characters which I consider to be safe. For instance, it converts لیست to &#1604;&#1740;&#1587;&#1578;. Am I using the wrong function? How should I be able to print the user input in my page safely?
I'm currently using it like this:
<h2>@Encoder.HtmlEncode(ViewBag.UserInput)</h2>
I think I messed up! Razor encodes values unless you use @Html.Raw, right? Well, I encoded the string and then Razor encoded it again. So in the end it got encoded twice, hence the weird-looking characters (Unicode values)!
If your encoding (let's assume that it's Unicode by default) supports Farsi, it is almost always safe to use Farsi in ASP.NET MVC without any additional effort.
First of all, escape-on-input is just wrong: you take some input and apply a transformation that is irrelevant to the data itself. It is generally wrong to encode data immediately after you receive it from the user. You should store the data in its raw form in your database and encode it only when you display it to the user, according to the vulnerabilities of the system that displays it. For example, the 'dangerous' HTML characters are not dangerous for SQL, Android, etc., and that is one of the main reasons why you shouldn't encode the data when you store it on the server. One more reason: when you HTML-encode a string you can end up with 6-7 times as many characters, which can be a problem with server constraints on string length. When you store the data in SQL Server, you should escape, validate, and sanitize it only for that target and guard only against its vulnerabilities (such as SQL injection).
Now, for ASP.NET MVC and Razor, you don't need to HTML-encode your strings because that is done by default unless you use Html.Raw(), which you should generally avoid (or HTML-encode the value yourself when you do use it). Also, if you double-encode your data you will get corrupted output :)
I hope this helps clear things up.
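To make the double-encoding problem concrete, here is a small Python sketch. It is an illustration only: html.escape stands in for the automatic encoding a view engine applies, and the xmlcharrefreplace step stands in for an encoder that emits numeric entities for non-ASCII characters. Neither is the actual Razor or AntiXSS implementation, and render_heading is a hypothetical helper.

```python
import html


def render_heading(user_input: str) -> str:
    """Encode exactly once, at output time, for the HTML context."""
    return "<h2>" + html.escape(user_input) + "</h2>"


farsi = "لیست"

# Farsi letters are not HTML-special, so a single encoding pass leaves them as-is.
single = render_heading(farsi)

# An encoder that turns non-ASCII characters into numeric entities.
entity_encoded = farsi.encode("ascii", "xmlcharrefreplace").decode("ascii")

# Encoding that result a second time escapes the ampersands, so a browser
# would display the entity text literally instead of the Farsi letters.
double_encoded = html.escape(entity_encoded)
```

Decoding double_encoded once with html.unescape gives back the entity text rather than the original letters, which is the symptom described in the question.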

User input sanitisation in asp.net

I need to sanitise user input (or output) for a web app I'm developing. The user input is just plain text, and I want to prevent HTML or other "harmful" strings. However characters such as less than, greater than, apostrophes, ampersands, quotes, etc., should be allowed.
I guess the first step is to disable request validation to prevent the generic "a potentially dangerous value was detected" message, but what else do I need to do? I can't simply HTML-encode the output, otherwise I'll end up with &lt; being displayed in place of a less-than character, for example.
Are there any tools that can help? I had a quick look at the AntiXSS library but from what I've seen it's just a glorified htmlencoder, or am I missing something? What about MVC - does this have anything built in?
I've never found a decent article on this kind of thing. Some say to sanitise input, while others say to sanitise output, and examples are typically over-simplistic, using techniques like htmlencoding, which will reformat perfectly valid characters such as a less than.
The Anti-XSS library is the standard library in ASP.NET WebForms for now, though it is suboptimal, and the latest version (4.2) has several breaking bugs that haven't been fixed in a while.
Also see the MSDN article Information Security - Anti-Cross Site Scripting.
See Should I use the Anti-XSS Security Runtime Engine in ASP.NET MVC? for your answer regarding MVC. From that answer:
Phil Haack has an interesting blog post here
http://haacked.com/archive/2009/02/07/take-charge-of-your-security.aspx.
He suggests using Anti-XSS combined with CAT.NET.

How to test localized UI in ASP.NET WebForms

I am trying to come up with a way to automate testing of localized UI in ASP.NET WebForms. Basically I have a button that toggles the current locale and code that populates the right text from a resource file. The problem is how to test it.
One approach is to use BDD in the form of
As a Spanish speaking user
I want to switch to Spanish
So that I can use the site more comfortably
Scenario Outline:
... a bunch of steps to get each possible string (labels, buttons, messages, etc)
Another approach is to use TDD in the form of row-based tests and check each property (which in WebForms is not trivial).
The first approach forces repeating existing scenarios, the second is very difficult and not clear.
How do people test localization?
Well, I am in the same boat at the moment.
What I am trying to do is extract the user-visible strings used by the automated tests into a swappable block (in my case a .NET resource file). The idea is to have different machines (or VMs, or a change at runtime) and run the same suite across different localized versions of the app.
That leaves the switch-language feature (which we don't support at the moment): you could test that by exercising the switch behavior and doing a cursory check in a test.
Finally, you really need a set of human eyes to ensure everything has been localized and is accessible (e.g. not clipped). There are other aspects too that can't be automated, e.g. the use of colors to signal alarms.
To ensure there are no hard-coded user-visible strings, create a junk resource file with junk characters, wire it to the app, and manually breeze through all the screens periodically (every 2 weeks). If you still see English strings, something still lives outside the resource file. Once everything is in the resource file, you still need someone who speaks the language to ensure that the localized strings appear correctly and match the context in which they are shown.
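The junk-resource idea above can also be partly mechanised. The following Python sketch is only an illustration under assumptions: the resource file is modelled as a plain dictionary, the marker characters are arbitrary, and the names make_junk_resources and find_unlocalized are hypothetical. How the visible strings are collected from the running app is left out.

```python
def make_junk_resources(resources: dict) -> dict:
    """Build a junk locale: every value is wrapped in obvious marker characters."""
    return {key: "[##" + value + "##]" for key, value in resources.items()}


def find_unlocalized(visible_texts: list, junk_resources: dict) -> list:
    """Return the visible strings that did not come from the junk resource set."""
    junk_values = set(junk_resources.values())
    return [text for text in visible_texts if text not in junk_values]


base = {"btn_save": "Save", "lbl_name": "Name"}
junk = make_junk_resources(base)

# Pretend these strings were read from a rendered page; "Cancel" is hard-coded.
visible = [junk["btn_save"], "Cancel", junk["lbl_name"]]
leftovers = find_unlocalized(visible, junk)
```

Anything that ends up in leftovers is a candidate hard-coded string. A check like this does not replace the human review described above.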

Semicolon as URL query separator

Although it is strongly recommended (W3C source, via Wikipedia) for web servers to support semicolon as a separator of URL query items (in addition to ampersand), it does not seem to be generally followed.
For example, compare
        http://www.google.com/search?q=nemo&oe=utf-8
        http://www.google.com/search?q=nemo;oe=utf-8
results. (In the latter case, semicolon is, or was at the time of writing this text, treated as ordinary string character, as if the url was: http://www.google.com/search?q=nemo%3Boe=utf-8)
However, the first URL parsing library I tried behaves well:
>>> from urlparse import urlparse, parse_qs
>>> url = 'http://www.google.com/search?q=nemo;oe=utf-8'
>>> parse_qs(urlparse(url).query)
{'q': ['nemo'], 'oe': ['utf-8']}
What is the current status of accepting semicolon as a separator, and what are potential issues or some interesting notes? (from both server and client point of view)
The W3C Recommendation from 1999 is obsolete. The current status, according to the 2014 W3C Recommendation, is that semicolon is now illegal as a parameter separator:
To decode application/x-www-form-urlencoded payloads, the following algorithm should be used. [...] The output of this algorithm is a sorted list of name-value pairs. [...]
Let strings be the result of strictly splitting the string payload on U+0026 AMPERSAND characters (&).
In other words, ?foo=bar;baz means the parameter foo will have the value bar;baz; whereas ?foo=bar;baz=sna should result in foo being bar;baz=sna (although technically illegal since the second = should be escaped to %3D).
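The ampersand-only rule quoted above can be sketched in a few lines of Python. This is a simplified reading, not a full implementation of the specification: it ignores character-set handling and keeps pairs in the order encountered. The function name parse_urlencoded is hypothetical.

```python
from urllib.parse import unquote_plus


def parse_urlencoded(payload: str) -> list:
    """Split strictly on '&'; a ';' is treated as ordinary data."""
    pairs = []
    for chunk in payload.split("&"):
        if not chunk:
            continue  # skip empty pieces such as the one in "a=1&&b=2"
        # Only the first '=' separates the name from the value.
        name, _, value = chunk.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


one = parse_urlencoded("foo=bar;baz")
two = parse_urlencoded("foo=bar;baz=sna")
three = parse_urlencoded("q=nemo&oe=utf-8")
```

Here one and two each contain a single pair whose name is foo, while three contains two pairs, matching the behaviour described in the paragraph above.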
As long as your HTTP server, and your server-side application, accept semicolons as separators, you should be good to go. I cannot see any drawbacks. As you said, the W3C spec is on your side:
We recommend that HTTP server implementors, and in particular, CGI implementors support the use of ";" in place of "&" to save authors the trouble of escaping "&" characters in this manner.
I agree with Bob Aman. The W3C spec is designed to make it easier to use anchor hyperlinks with URLs that look like form GET requests (e.g., http://www.host.com/?x=1&y=2). In this context, the ampersand conflicts with the system for character entity references, which all start with an ampersand (e.g., &quot;). So W3C recommends that web servers allow a semicolon to be used as a field separator instead of an ampersand, to make it easier to write these URLs. But this solution requires that writers remember that the ampersand must be replaced by something, and that a ; is an equally valid field delimiter, even though web browsers universally use ampersands in the URL when submitting forms. That is arguably more difficult than remembering to replace the ampersand with an &amp; in these links, just as would be done elsewhere in the document.
To make matters worse, until all web servers allow semicolons as field delimiters, URL writers can only use this shortcut for some hosts, and must use &amp; for others. They will also have to change their code later if a given host stops allowing semicolon delimiters. This is certainly harder than simply using &amp;, which will work for every server forever. This in turn removes any incentive for web servers to allow semicolons as field separators. Why bother, when everyone is already changing the ampersand to &amp; instead of ;?
In short, HTML is a big mess (due to its leniency), and using semicolons helps to simplify this a LOT. I estimate that, once I factor in the complications I've found, using ampersands as a separator makes the whole process about three times as complicated as using semicolons instead!
I'm a .NET programmer and, to my knowledge, .NET does not inherently allow ';' separators, so I wrote my own parsing and handling methods because I saw tremendous value in using semicolons rather than the already problematic system of ampersand separators. Unfortunately, very respectable people (like @Bob Aman in another answer) do not see why semicolon usage is far superior to, and so much simpler than, using ampersands. So I will share a few points that may persuade other respectable developers who don't yet recognize the value of using semicolons:
Using a querystring like '?a=1&b=2' in an HTML page is improper (without HTML encoding it first), but most of the time it works. This, however, is only because most browsers are tolerant, and that tolerance can lead to hard-to-find bugs when, for instance, the value of a key-value pair gets posted in an HTML page URL without proper encoding (directly as '?a=1&b=2' in the HTML source). A QueryString like '?who=me+&+you' is problematic too.
People have biases and can disagree about them all day long, so recognizing our biases is very important. For instance, I think separating with ';' looks 'cleaner', and I agree that this opinion is purely a bias. Another developer can have an opposite and equally valid bias, so my bias on this one point is no more correct than theirs.
But the case that the semicolon makes everyone's life easier in the long run does not rest on bias, and it cannot be fairly disputed when the whole picture is taken into account. In short, using semicolons does make life simpler for everyone, with one exception: the small hurdle of getting used to something new. That's all. It's always difficult to make anything change. But the difficulty of making the change pales in comparison to the continued difficulty of using &.
Using ; as a QueryString separator makes it MUCH simpler. Ampersand separators are more than twice as difficult to code properly as semicolons would be. (I think) most implementations are not coded properly, so most implementations aren't twice as complicated. But then tracking down and fixing the bugs leads to lost productivity. Here, I point out the 2 separate encoding steps needed to properly encode a QueryString when & is the separator:
Step 1: URL encode both the keys and values of the querystring.
Step 2: Concatenate the keys and values like 'a=1&b=2' after they are URL encoded from step 1.
Step 3: Then HTML encode the whole QueryString in the HTML source of the page.
So special encoding must be done twice for proper (bug-free) URL encoding, and not just that, but the two encodings are of distinct types. The first is a URL encoding and the second is an HTML encoding (for HTML source code). If either of these is incorrect, then I can find you a bug. But step 3 is different for XML. For XML, XML character entity encoding is needed instead (which is almost identical). My point is that the last encoding depends upon the context of the URL, whether that be an HTML web page or XML documentation.
Now with the much simpler semicolon separators, the process is as one would expect:
1: URL encode the keys and values,
2: concatenate the values together. (With no encoding for step 3.)
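The two procedures above can be compared side by side in a short Python sketch. It is an illustration only: quote_plus and html.escape stand in for whatever URL-encoding and HTML-encoding routines a given platform provides, and build_query is a hypothetical helper name.

```python
import html
from urllib.parse import quote_plus


def build_query(params: dict, separator: str) -> str:
    """Steps 1 and 2: URL-encode keys and values, then join the pairs."""
    return separator.join(
        quote_plus(key) + "=" + quote_plus(value) for key, value in params.items()
    )


params = {"who": "me & you", "b": "2"}

# Ampersand separator: step 3 (HTML encoding) changes the string.
amp_query = build_query(params, "&")
amp_in_html = html.escape(amp_query)

# Semicolon separator: HTML encoding has nothing left to change.
semi_query = build_query(params, ";")
semi_in_html = html.escape(semi_query)
```

After URL encoding, the only HTML-special character left in the query string is the separator itself, which is why semi_in_html equals semi_query while amp_in_html differs from amp_query. Applying html.escape to the semicolon form is harmless, so robust code can still run it unconditionally.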
I think most web developers skip step 3 because browsers are so lenient. But this leads to bugs, and to more complications: hunting down those bugs, users being unable to do things they could do if the bugs were not present, writing bug reports, etc.
Another complication in real use arises when writing XML documentation markup in my source code in both C# and VB.NET. Since & must be encoded, it's a real drag, literally, on my productivity. That extra step 3 makes the source code harder to read too. So this harder-to-read deficit applies not only to HTML and XML, but also to other applications like C# and VB.NET code, because their documentation uses XML. So the step 3 encoding complication proliferates to other applications too.
So in summary, using ; as a separator is simple because the (correct) process when using the semicolon is what one would normally expect: only one step of encoding needs to take place.
Perhaps this wasn't too confusing. But all the confusion or difficulty is due to using a separator character that should be HTML encoded. Thus '&' is the culprit, and the semicolon relieves all that complication.
(I will point out that my 3-step vs 2-step process above reflects how many steps most applications would take. For completely robust code, however, all 3 steps are needed no matter which separator is used. But in my experience, most implementations are sloppy and not robust. So using the semicolon as the querystring separator would make life easier for more people, with fewer website and interop bugs, if everyone adopted the semicolon as the default instead of the ampersand.)

Multiple Language Support for Classic-Asp

I want to translate my web page into 7 different languages, and I'm curious what the best way to handle this is.
I know this subject has been raised multiple times, but I haven't found a reasonable answer.
Actually, all the topics are about PHP and gettext, but I use classic ASP (VBScript).
The method I'm using now is this:
I have en.asp and tr.asp, which contain
lang_home="Home Page" and lang_home="Ana Sayfa"
and in my pages I display them like <%=lang_home%>. I don't want to use lots of brackets because I believe they slow down my site.
I even thought about <%=GetTranslatedText(lang_home)%>.
What I need to know is the best approach for multi-language web sites in ASP, and whether there is any solution like gettext for ASP.
Thanks in advance.
There are only two ways to send dynamic text to the browser in ASP:
Write the entire HTML page with Response.Write calls
Embedded calls to Response.Write in otherwise-static HTML.
I think you're on the right path, balancing the need for easily editable HTML code with a fast lookup and replacement of language-specific strings. It is at least faster than, say, a bunch of SELECT CASE statements or a lookup against a Collection.
(If performance is really an issue, why not move over to ASP.NET?)
One other option is to pre-compile your ASP pages: keep a template of, say, "default.asp.template" that contains variables, separate language files (like you have now), and some code to generate "default-en.asp", "default-tr.asp", etc. each time you change your template. Then set "default.asp" to simply and silently transfer execution to the correct page based on the user's language.
An excellent (but commercial) app I've used for pre-compilation of ASP pages is WebGecko APGen (http://www.webgecko.com/).
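The pre-compilation idea above can be sketched as a small build-time script. This Python version is only an illustration under assumptions: the ${name} placeholder syntax, the generate_pages function, and the in-memory dictionaries are all hypothetical, and it is unrelated to how APGen works.

```python
from string import Template


def generate_pages(template_text: str, languages: dict, base_name: str = "default") -> dict:
    """Return {file name: page text}, one generated page per language."""
    template = Template(template_text)
    return {
        # substitute() raises KeyError if a language is missing a string,
        # which surfaces incomplete translations at build time.
        f"{base_name}-{code}.asp": template.substitute(strings)
        for code, strings in languages.items()
    }


template_text = "<h1>${lang_home}</h1>"
languages = {
    "en": {"lang_home": "Home Page"},
    "tr": {"lang_home": "Ana Sayfa"},
}
pages = generate_pages(template_text, languages)
```

Writing each entry of pages to disk would give one static page per language. A template that contains literal dollar signs would need them doubled for string.Template.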
