I'm making an ad manager plugin for WordPress, so the advertisement code can be almost anything, from good code to dirty or even malicious code.
I'm using simple sanitization like:
$get_content = '<script>/*code to destroy the site*/</script>';
//insert into db
$sanitized_code = addslashes( $get_content );
When viewing:
$fetched_data = /*slashed code*/;
//show as it's inserted
echo stripslashes( $fetched_data );
I'm avoiding base64_encode() and base64_decode() as I learned their performance is a bit slow.
Is that enough?
If not, what else should I do to protect the site and/or the database from attacks delivered through bad ad code?
I'd love an explanation of why you are suggesting something; it will help me make the right decision in future too. Any help would be greatly appreciated.
addslashes followed by stripslashes is a round trip. You are echoing the original string exactly as it was submitted to you, so you are not protected from anything at all. '<script>/*code to destroy the site*/</script>' will be output exactly as-is to your web page, allowing your advertisers to do whatever they like in your web page's security context.
Normally when including submitted content in a web page, you should be using htmlspecialchars so that everything comes out as plain text and < just means a less-than sign.
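As a rough illustration of what that escaping does, here is a minimal sketch in Python rather than PHP. html.escape plays the role here that htmlspecialchars plays in PHP; render_as_text is a hypothetical helper name, not part of either language.

```python
import html


def render_as_text(untrusted: str) -> str:
    """Escape untrusted input so the browser shows it as text instead of parsing it as markup."""
    # quote=True also escapes " and ', which matters inside quoted attribute values.
    return html.escape(untrusted, quote=True)


ad_code = "<script>/*code to destroy the site*/</script>"
print(render_as_text(ad_code))
# &lt;script&gt;/*code to destroy the site*/&lt;/script&gt;
```

The escaped string is safe to place in element content or inside a quoted attribute value. It is not the right escaping for other contexts, such as inside a script block or a URL.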
If you want an advertiser to be able to include markup, but not dangerous constructs like <script> then you need to parse the HTML, only allowing tags and attributes you know to be safe. This is complicated and difficult. Use an existing library such as HTMLPurifier to do it.
If you want an advertiser to be able to include markup with scripts, then you should put them in an iframe served from a different domain name, so they can't touch what's in your own page. Ads are usually done this way.
I don't know what you're hoping to do with addslashes. It is not the correct form of escaping for any particular injection context and it doesn't even remove difficult characters. There is almost never any reason to use it.
If you are using it on string content to build a SQL query containing that content, then STOP: this isn't the proper way to do that, and you will also be mangling your strings. Use parameterised queries to put data in the database. (And if you really can't, the correct string literal escape function would be mysql_real_escape_string, which is mysqli_real_escape_string in PHP 7 and later, where the old mysql extension no longer exists, or other similarly-named functions for different databases.)
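To show what a parameterised query looks like in practice, here is a small sketch in Python with the standard sqlite3 module standing in for PHP and MySQL. The table name ads and the helpers open_store, store_ad and load_ads are made up for the example.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical 'ads' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ads (id INTEGER PRIMARY KEY, code TEXT NOT NULL)")
    return conn


def store_ad(conn: sqlite3.Connection, code: str) -> None:
    """Insert ad code; the ? placeholder keeps the data separate from the SQL text."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO ads (code) VALUES (?)", (code,))


def load_ads(conn: sqlite3.Connection) -> list:
    """Return the stored ad code exactly as it was submitted."""
    return [row[0] for row in conn.execute("SELECT code FROM ads ORDER BY id")]


conn = open_store()
store_ad(conn, "x'); DROP TABLE ads; --")
print(load_ads(conn))  # the hostile string comes back verbatim, and the table still exists
```

Because the driver passes the value separately from the statement, no slashes need to be added on the way in or stripped on the way out. What comes back is still untrusted and still needs output escaping when it is written into a page.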
I have a website that allows users to enter HTML through a TinyMCE rich editor control. Its purpose is to allow users to format text using HTML.
This user-entered content is then output to other users of the system.
However, this means someone could insert JavaScript into the HTML in order to perform an XSS attack on other users of the system.
What is the best way to filter out JavaScript code from a HTML string?
If I perform a Regular Expression check for <SCRIPT> tags it's a good start, but an evil doer could still attach JavaScript to the onclick attribute of a tag.
Is there a fool-proof way to strip out all JavaScript code, whilst leaving the rest of the HTML untouched?
For my particular implementation, I'm using C#.
Microsoft have produced their own anti-XSS library, Microsoft Anti-Cross Site Scripting Library V4.0:
The Microsoft Anti-Cross Site Scripting Library V4.0 (AntiXSS V4.0) is an encoding library designed to help developers protect their ASP.NET web-based applications from XSS attacks. It differs from most encoding libraries in that it uses the white-listing technique -- sometimes referred to as the principle of inclusions -- to provide protection against XSS attacks. This approach works by first defining a valid or allowable set of characters, and encodes anything outside this set (invalid characters or potential attacks). The white-listing approach provides several advantages over other encoding schemes. New features in this version of the Microsoft Anti-Cross Site Scripting Library include:
- A customizable safe list for HTML and XML encoding
- Performance improvements
- Support for Medium Trust ASP.NET applications
- HTML Named Entity Support
- Invalid Unicode detection
- Improved Surrogate Character Support for HTML and XML encoding
- LDAP Encoding Improvements
- application/x-www-form-urlencoded encoding support
It uses a whitelist approach to strip out potential XSS content.
Here are some relevant links related to AntiXSS:
Anti-Cross Site Scripting Library
Microsoft Anti-Cross Site Scripting Library V4.2 (AntiXSS V4.2)
Microsoft Web Protection Library
Peter, I'd like to introduce you to two concepts in security:
Blacklisting - Disallow things you know are bad.
Whitelisting - Allow things you know are good.
While both have their uses, blacklisting is insecure by design.
What you are asking for is, in fact, blacklisting. If there is any alternative to <script> that you did not anticipate (such as <img src="bad" onerror="hack()"/>), your filter will not catch it.
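A small sketch of how a blacklist filter fails, written in Python for brevity. The strip_script_tags function and its regular expression are invented for the example and are not taken from any library.

```python
import re

# A naive blacklist: remove <script>...</script> blocks and nothing else.
SCRIPT_BLOCK = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def strip_script_tags(markup: str) -> str:
    """Blacklist filter: deletes script blocks in a single pass."""
    return SCRIPT_BLOCK.sub("", markup)


# The case the filter was written for works as expected.
print(strip_script_tags("<b>hi</b><script>hack()</script>"))
# <b>hi</b>

# An event-handler attribute passes through untouched.
print(strip_script_tags('<img src="bad" onerror="hack()"/>'))
# <img src="bad" onerror="hack()"/>

# Removing the inner blocks reassembles a working script tag.
print(strip_script_tags("<scr<script></script>ipt>hack()</scr<script></script>ipt>"))
# <script>hack()</script>
```

Each bypass can be patched individually, but the filter only ever covers the attacks its author already thought of.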
Whitelisting, on the other hand, allows you to specify the exact conditions you are allowing.
For example, you would have the following rules:
allow only these tags: b, i, u, img
allow only these attributes: src, href, style
That is just the theory. In practice, you must parse the HTML accordingly, hence the need of a proper HTML parser.
If you want to allow some HTML but not all, you should use something like OWASP AntiSamy, which allows you to build a whitelisted policy over which tags and attributes you allow.
HTMLPurifier might also be an alternative.
It's of key importance that it is a whitelist approach, as new attributes and events are added to HTML5 all the time, so any blacklist would fail within a short time, and knowing all the "bad" attributes is also difficult.
Edit: Oh, and regex is a bit hard to do here. HTML can have lots of different formats. Tags can be unclosed, attributes can be written with or without quotes (single or double), and you can have line breaks and all kinds of spaces within the tags, to name a few issues. I would rely on a well-tested library like the ones I mentioned above.
Regular expressions are the wrong tool for the job, you need a real HTML parser or things will turn bad. You need to parse the HTML string and then remove all elements and attributes but the allowed ones (whitelist approach, blacklists are inherently insecure). You can take the lists used by Mozilla as a starting point. There you also have a list of attributes that take URL values - you need to verify that these are either relative URLs or use an allowed protocol (typically only http:/https:/ftp:, in particular no javascript: or data:). Once you've removed everything that isn't allowed you serialize your data back to HTML - now you have something that is safe to insert on your web page.
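The parse, filter and serialize steps described above can be sketched roughly as follows, in Python with the standard html.parser module rather than C#. The tag, attribute and scheme lists are illustrative assumptions (they are not Mozilla's lists), and WhitelistSanitizer, is_safe_url and sanitize are hypothetical names. Treat it as a demonstration of the idea, not a replacement for a vetted library.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Illustrative whitelists (assumptions for this sketch only).
ALLOWED_TAGS = {"b", "i", "u", "p", "a", "img"}
ALLOWED_ATTRS = {"a": {"href"}, "img": {"src", "alt"}}
URL_ATTRS = {"href", "src"}
ALLOWED_SCHEMES = {"", "http", "https", "ftp"}  # "" means a relative URL
VOID_TAGS = {"img"}                 # elements that never get a closing tag
DROP_WITH_CONTENT = {"script", "style"}  # drop the element and its text


def is_safe_url(value: str) -> bool:
    """Accept relative URLs and whitelisted schemes only."""
    # Drop whitespace and control characters first, so "java\tscript:" is still caught.
    cleaned = "".join(ch for ch in value if ch > " ")
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        return False
    return scheme in ALLOWED_SCHEMES


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag itself, keep its text
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            if name in URL_ATTRS and not is_safe_url(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))  # re-escape text on output


def sanitize(markup: str) -> str:
    """Parse the markup, keep only whitelisted constructs, and serialize back to HTML."""
    parser = WhitelistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


print(sanitize('<b onclick="hack()">hi</b><script>hack()</script>'))
# <b>hi</b>
```

Everything in the output is rebuilt from parsed pieces, so nothing the parser did not understand is copied through verbatim. The sketch does not balance unclosed tags or handle the style attribute, which is part of why a maintained library is the safer choice.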
I try to break the tag format like this:
using System.Text.RegularExpressions;

public class Utility
{
    // Puts exactly one space after every '<' so the browser no longer treats it as the start of a tag.
    public static string PreventXSS(string sInput)
    {
        if (sInput == null)
            return string.Empty;
        // '<' plus any whitespace that follows it becomes "< ".
        return Regex.Replace(sInput, @"<\s*", "< ");
    }
}
Usage before saving to the db:
string sResultNoXSS = Utility.PreventXSS(varName);
I tested it with input data like:
<script>alert('hello XSS')</script>
Unfiltered, it runs in the browser. After passing through the anti-XSS code above, it becomes:
< script>alert('hello XSS')< /script>
(There is a space after <)
As a result, the script won't run in the browser.
In our application, developed in HTML5 and JavaScript, whenever a user submits the form (which contains a comments text field) with text containing <, > or #, we get the following error:
{"Message":"A potentially dangerous Request.Form value was detected from the client ......
The dev team has now fixed this issue, saying that they handled it on the server side.
Now I want to test different scenarios to make sure this issue won't recur, including for other special characters.
Can anyone suggest scenarios I can test here, apart from entering special characters in the comments text box and submitting the form?
The problem here is the middle link of the chain (the HTML page).
The chain:
I have a WinForms program that uses Awesomium (an alternative to the native WebBrowser control) to view an HTML page that hosts part of an ASP.NET page in its iframe.
The problem:
I need to pass a value to the ASP.NET page. Without the middle of the chain (the HTML iframe) this is easily achieved by sending a hashed and encrypted query string.
How it works:
The WinForms program does some work, then uses a multi-step encryption to encode all the needed values into one string.
Then it should send this string to the ASP.NET page through the iframe (and that's the problem: it is easy to receive a query string in the ASP.NET page, but first I need to receive it in the HTML page and forward it to ASP.NET).
Acceptable answers:
1) Probably the easiest one: using JavaScript. I have heard it can be done that way.
How I imagine this: I send the query string from WinForms to the HTML page as http://HtmlPage.html?AspNet.aspx?CryptedString
Then the HTML page reads it with JavaScript and puts the query string "AspNet.aspx?CryptedString" into the iframe's "src=http://", resulting in "src=http://AspNet.aspx?CryptedString"
And then I easily get it in the ASP.NET page.
2) Somehow create a >>>VIRTUAL<<< (NOTE: virtual; I don't want the query string to be saved on the HDD, so don't even suggest that) ASP.NET or HTML page with the iframe source taken directly from the WinForms string.
Probably that is possible with Awesomium, but I'm new to it and don't know how (if it is possible at all).
3) Some web service through which ASP.NET and WinForms can communicate via the existing HTML iframe.
4) Another way that replaces one of the three above, that neither saves the values (query string or otherwise) on the HDD nor shows them to the user, and doesn't use the ASP.NET page's server to create the iframe page. The HTML page's server allows only HTML; PHP isn't available.
5) If you don't know any of the four above, suggest free PHP hosting without ads (if such a thing exists, which I highly doubt).
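To make the string handling in option 1 concrete, here is a rough sketch of the transformation, written in Python only to show the logic (the real thing would run as JavaScript inside the HTML page). The function name iframe_src_from_page_url is made up, and the example URL is the one from option 1.

```python
from urllib.parse import urlsplit


def iframe_src_from_page_url(page_url: str) -> str:
    """Take everything after the first '?' of the page URL and use it as the iframe target."""
    query = urlsplit(page_url).query  # "AspNet.aspx?CryptedString" for the example below
    if not query:
        raise ValueError("page URL carries no query string to forward")
    return "http://" + query


print(iframe_src_from_page_url("http://HtmlPage.html?AspNet.aspx?CryptedString"))
# http://AspNet.aspx?CryptedString
```

Two things this sketch assumes: the crypted string must not contain '#', since anything after it is treated as a fragment and dropped, and the page would probably need to check that the target is the expected ASP.NET host, since otherwise the query string decides what the iframe loads.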
Priority:
The best one would be #3, then #2, then #1, then #5 (#4 is excluded as it is unknown).
And in the end:
Thanks in advance for your help.
P.S. I'm currently at work, so I'll try all the answers later and report tomorrow whether any of them suits my needs. Thanks again.
Answering my own question. I have found two ways to do what I wanted.
The first one:
Creating an in-RAM file with System.IO.MemoryStream or another method (google "c# create a file in ram").
The second one:
Creating a hidden, encrypted, system-attribute file in a custom format readable only by the program, somewhere in an out-of-the-way folder, via the File.SetAttributes method and System.IO.StreamWriter/Reader, System.IO.FileStream, System.IO.TextWriter, etc., depending on what it should be.
Once the file has served its purpose, delete it (and also delete it on exit and on start) using
if (File.Exists(path))
{
    File.Delete(path);
}
(Need more reputation to post few links -_-, and I don't want to post only part of them, either all or no at all, so use google if you'll need anything from here).
If you need to store a small temp file for a short time, use the first one; if it is "heavy", use the second one, unless you badly need to use RAM for it.
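The two approaches can be sketched side by side like this, in Python rather than C#: io.StringIO stands in for System.IO.MemoryStream, and tempfile.mkstemp stands in for the hand-rolled hidden file. Both function names are made up, and the encryption and file-attribute steps are left out.

```python
import io
import os
import tempfile


def roundtrip_in_memory(payload: str) -> str:
    """First approach: keep the data in a RAM-backed buffer that never touches the disk."""
    buffer = io.StringIO()
    buffer.write(payload)
    buffer.seek(0)
    return buffer.read()


def roundtrip_via_temp_file(payload: str):
    """Second approach: write a temp file, read it back, and always delete it afterwards.

    Returns (content, path) so the caller can confirm the file is gone.
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "w+", encoding="utf-8") as handle:
            handle.write(payload)
            handle.seek(0)
            content = handle.read()
    finally:
        # Same check-then-delete pattern as the File.Exists / File.Delete snippet.
        if os.path.exists(path):
            os.remove(path)
    return content, path


print(roundtrip_in_memory("CryptedString"))
content, path = roundtrip_via_temp_file("CryptedString")
print(content, os.path.exists(path))
```

The try/finally block means the file is removed even if reading or writing fails, which covers most of what the separate delete-on-exit and delete-on-start passes are for.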
document.aspnetForm.action = "https://www.paypal.com/cgi-bin/webscr";
I use a master page and a PayPal payment page, but I get the error "document.aspnetform is not defined"
I can't tell from your question whether you are doing this on the client using JavaScript or on the server in C#. I guess the former, as you are using document in all lower case. Either way, check your capitalisation: JavaScript is case-sensitive, so you may need document.AspNetForm or something similar as your identifier. Just make sure it matches the name of the form exactly as it appears in the rendered page source.