Using FreeTextBox, I'm capturing HTML-formatted text. The purpose is to allow a website owner to update their web page content on a few pages. I have the system completed except for knowing what to do with the resultant HTML markup.
After the page editor completes their work, I can get the output from FreeTextBox, in html format, like so: <font color="#000080"><b>This is some text.</b></font>
I tried storing it as escaped markup in web.config, but that didn't work: the tags kept getting mangled even after I changed them to escaped characters, like so: &lt;font color=&quot;#000080&quot;&gt;
The reason I wanted to store this kind of string as a key in web.config is that I could successfully store a static string, set a label's value to it, and successfully render the text. But when I try to escape it, it gets reformatted in web.config by .NET somehow.
So I escaped all the characters, encoded them as Base64 and stored that. Then on page_load, I tried to decode it, but it just shows up as text, with all the html tags showing as well - it doesn't get rendered. I know a million people use this control, but I'm damned if I can figure out how to do it right.
So here's my question: how can I inject the saved HTML into an edited page so it shows up in browsers like the editor wants it to look?
Try Server.HtmlDecode to output the HTML to the screen.
As a side note, I prefer to use CKEditor for HTML-formatted input. I found it to be the best of the options I tried (FreeTextBox, TinyMCE, anything else?), and it was completely rewritten and made faster in version 3.0!
In case anyone comes here for the answer, here's one way to do it.
I had initial problems with web.config changing some of the HTML tags upon storage, so we use B64 encoding (may not be necessary). Store the saved html markup to an AppSettings key in web.config as Base64 encoding, using this for your setting update function. Add error checking and whatever else you need it to do:
'requires Imports System.Configuration and Imports System.Web.Configuration
'create configuration object
Dim cfg As Configuration
cfg = WebConfigurationManager.OpenWebConfiguration("~")
'get reference to appsettings("HTMLstring")
Dim HTMLString As KeyValueConfigurationElement = _
    CType(cfg.AppSettings.Settings("HTMLstring"), KeyValueConfigurationElement)
'get text entered by user and marked up with HTML tags from FTB1, then
'encode as Base64 so we can store it as XML-safe string in web.config
Dim b64String As String = Convert.ToBase64String(System.Text.Encoding.UTF8.GetBytes(FTB1.Text))
'save new value into web.config
If HTMLString IsNot Nothing Then
    HTMLString.Value = b64String
    cfg.Save()
End If
Next, add a Literal control to the aspx markup:
<asp:Literal id="charHTML" runat="server"/>
To add the saved HTML to the post-edited page, do the following in Page_Load:
'this string of HTML code is stored in web.config as Base64 to preserve XML-unsafe characters that come from FreeTextBox.
Dim injectedHTML As String = System.Text.Encoding.UTF8.GetString(Convert.FromBase64String(ConfigurationManager.AppSettings("HTMLstring")))
'the literal control will directly inject this HTML instead of encoding it
charHTML.Mode = LiteralMode.PassThrough
'set the value
charHTML.Text = injectedHTML
Hope this helps. sF
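The reason Base64 sidesteps the web.config problem can be shown in a few lines. This is a language-neutral sketch in Python, not ASP.NET code; the function names encode_for_config and decode_from_config are hypothetical and only mirror the two steps described above.

```python
import base64

def encode_for_config(html_markup: str) -> str:
    """Turn HTML markup into a Base64 string (hypothetical helper)."""
    return base64.b64encode(html_markup.encode("utf-8")).decode("ascii")

def decode_from_config(stored: str) -> str:
    """Reverse encode_for_config and recover the original markup."""
    return base64.b64decode(stored).decode("utf-8")

markup = '<font color="#000080"><b>This is some text.</b></font>'
stored = encode_for_config(markup)     # what would be written to the settings key
restored = decode_from_config(stored)  # what Page_Load would inject into the page
```

The Base64 alphabet contains none of the characters <, >, & or ", so an XML writer has nothing to escape in the stored value, and decoding gives back the markup unchanged.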
I have a form where the user enters multiple lines of text in a text area.
Some of the lines can contain HTML markup as well. Say one line is bold.
How should I save the text in my database?
Should I store it like this?
This is a greap post
<br/>
I love this type of findings.
<br/>
<br/>
Thanks for sharing
OR like this?
This is a greap post
&lt;br/&gt;
I love this type of findings.
&lt;br/&gt;
&lt;br/&gt;
Thanks for sharing
During editing:
I must show the text as it was entered, so each line break will be replaced by a newline.
That way the user sees there is a line break. A textarea won't understand br markup.
During displaying:
I must render the text so that it appears like this on the page:
This is a greap post
I love this type of findings.
Thanks for sharing
I want to know the cleanest way to store text that can have markup in it.
Thanks for help
Since you want to output HTML, you will have to store the input in its raw format in the database. There is one catch, though. You should never trust input, since all input is evil, especially in this case: outputting HTML exactly as it was entered opens the possibility of a cross-site scripting (XSS) attack.
You have basically got two options:
Use an HTML sanitizer that lets you remove all tags that are not known to be safe. A good sanitizer is the one that comes with the Microsoft AntiXss toolkit.
Encode the input and decode the parts of the result that are known to be safe, for instance:
// requires: using Microsoft.Security.Application; (AntiXss toolkit)
// static, so the static method below can read it
private static readonly string[] safeList = { "<br/>", "<b>", "</b>", "<i>", "</i>" };

public static string EncodeInputWithSafeList(string unsafeInput)
{
    // First: encode the complete input.
    string safeInput = Encoder.HtmlEncode(unsafeInput);
    // Next: decode each tag that is known to be safe.
    foreach (string safeTag in safeList)
    {
        string encodedTag = Encoder.HtmlEncode(safeTag, false);
        safeInput = safeInput.Replace(encodedTag, safeTag);
    }
    return safeInput;
}
Note: The example uses the Encoder class from the Microsoft AntiXss toolkit.
Now the question becomes, at what point should we clean it up. Normally you should encode the output just before you send it to the client and not store it encoded in the database, since it depends on the output type (HTML, PDF, JSON) how data should be encoded. This is amplified by the fact that in case there is a bug in the encoder, there is no way to fix it, since the data is already encoded.
In this case it is a bit more tricky though, since the input is HTML and not just text. I would say that sanitizing is something you would still want to do beforehand, because this way you prevent bad input from entering your database. The EncodeInputWithSafeList method is a bit tricky, because it is both a sanitizer and an encoder. When we run it before the data goes into the database, the stored output does not change when we later change the safe list. This can be both a good thing and a bad thing, but I would say that when you add new tags to the safe list, you wouldn't want old data to suddenly change. So in this case I would go with input encoding instead of output encoding.
When you go with input encoding, name the database column in such a way that it is clear that we're dealing with sanitized, encoded data.
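The encode-then-restore idea can be tried out independently of the AntiXss toolkit. The following is a minimal Python sketch, assuming the standard library's html.escape as a stand-in for Encoder.HtmlEncode; the name encode_with_safe_list and the sample input are made up for illustration.

```python
import html

# Tags considered safe, matching the safe list in the answer above.
SAFE_TAGS = ("<br/>", "<b>", "</b>", "<i>", "</i>")

def encode_with_safe_list(unsafe_input: str) -> str:
    """Escape everything, then restore only the exact safe tags."""
    encoded = html.escape(unsafe_input, quote=True)
    for tag in SAFE_TAGS:
        # The escaped form of a safe tag is swapped back to the raw tag.
        encoded = encoded.replace(html.escape(tag, quote=True), tag)
    return encoded

result = encode_with_safe_list("<b>Hi</b><script>alert(1)</script><br/>")
```

Only exact matches are restored, so a tag with attributes, such as a b element carrying an onclick handler, stays escaped and is displayed as text rather than executed.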
Try htmlentities($str, ENT_QUOTES); before you save the data, and html_entity_decode($str) after you fetch it from your db, before you render it to the browser.
Saving it to your database like this:
<p>This is a greap post
<br/>
I love this type of findings.
<br/>
<br/>
Thanks for sharing</p>
would work.
I am getting a block of XML back from a web service. The client wants to see this raw XML in a label on the page. When I try this:
lblXmlReturned.Text = returnedXml;
only the text gets displayed, without any of the XML tags. I need to include everything that gets returned from the web service.
This is a trimmed down sample of the XML being returned:
<Result Matches="1">
<VehicleData>
<Make>Volkswagen</Make>
<UK_History>false</UK_History>
</VehicleData>
<ABI>
<ABI_Code></ABI_Code>
<Advisory_Insurance_Group></Advisory_Insurance_Group>
</ABI>
<Risk_Indicators>
<Change_In_Colour>false</Change_In_Colour>
</Risk_Indicators>
<Valuation>
<Value xsi:nil="true" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"></Value>
</Valuation>
<NCAP>
<Pre_2009></Pre_2009>
</NCAP>
</Result>
What can I do to make this appear on the screen? I noticed that Stack Overflow does a pretty good job of putting the XML on the screen. I checked the source and it's using <pre> tags. Is this something that I have to use?
It would be easier to use an <asp:Literal /> with its Mode set to Encode than to manually encode the Label's Text:
<asp:Literal runat="server" ID="Literal1" Mode="Encode" />
You need to HtmlEncode the XML first (which escapes special characters like < and > ):
string encodedXml = HttpUtility.HtmlEncode(xml);
Label1.Text = encodedXml;
Surrounding it in PRE tags would help preserve formatting, so you could do:
string encodedXml = String.Format("<pre>{0}</pre>", HttpUtility.HtmlEncode(xml));
Label1.Text = encodedXml;
As Bala R mentions, you could just use a Literal control with Mode="Encode" as this automatically HtmlEncodes any string. However, this would also encode any PRE tags you added into the string, which you wouldn't want. You could also use white-space:pre in CSS which should do the same thing as the PRE tag, I think.
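To see why the order matters (encode the XML first, then add the PRE wrapper), here is a small Python sketch of the same two steps. It uses html.escape in place of HttpUtility.HtmlEncode; the helper name xml_as_visible_html is hypothetical.

```python
import html

def xml_as_visible_html(xml_text: str) -> str:
    """Escape the XML so the browser shows the tags, then wrap it in <pre>."""
    return "<pre>{}</pre>".format(html.escape(xml_text))

snippet = xml_as_visible_html("<Make>Volkswagen</Make>")
```

Here snippet keeps its pre tags as real markup while the XML inside is escaped. Encoding the already-wrapped string instead would escape the pre tags too, which is the problem described above with Mode="Encode".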
I'm using a little JavaScript in my website for my navigation bar, which is made up of a few ImageButton controls. In the code-behind I have this:
Dim homeImage As String = GetLocalResourceObject("HomeImage")
imgBtnHome.Attributes.Add("OnMouseOver", HomeImage)
And in the resx file, I've tried these, but they don't work (note the single and double quotes):
key: HomeImage value: "this.src='images/HomeImage.gif'"
key: HomeImage value: "this.src='images/HomeImage.gif'"
Can anyone tell me what I'm doing wrong? Is it even possible to read "quoted" text from a local resource file?
Yes, you can store quotes in a resx string value. If you look at the XML that's generated for a resource file, you'll see that the quotes are in the value entries.
However, your script isn't going to work with quotes around it. Think of it this way - say you wanted to pop up an alert box. You would do:
imgBtnHome.Attributes.Add("OnMouseOver", "alert('hi')")
NOT
imgBtnHome.Attributes.Add("OnMouseOver", """alert('hi')""")
You're passing in a string value for the script, not a quoted string value. Try removing the double quotes altogether, and leave the single quotes.
I need to store an escaped html string in a key in web.config using the
KeyValueConfigurationElement.Save method built into framework 3.5. But when I try to do so,
it keeps escaping my ampersands.
Code looks like this:
strHTML = DecodeFTBInput(FTB1.Text)
FTB1.Text is a string of HTML, like this: <b><font color="#000000">Testing</font></b>
DecodeFTBInput uses the String.Replace() method to change < and > to &lt; and &gt;, and " to &quot;.
Given the above string and function, let's say strHTML now contains the following:
&lt;b&gt;&lt;font color=&quot;#000000&quot;&gt;Testing&lt;/font&gt;&lt;/b&gt;
Of course I can manually edit web.config to store the correct value, but I need the authenticated admin user to be able to change the html themselves.
The problem is that when I try to save this string into its key in web.config, it escapes all ampersands
as &amp; which ruins the string.
How can I get around this?
web.config is an XML file, so when it writes values there the .NET Framework stores strings using HTML encoding, replacing the <, > and & characters with &lt;, &gt; and &amp;, and much more besides.
You'll need to stop your DecodeFTBInput method from HTML encoding the string if you want the HTML in the web.config file to be editable. Otherwise you'll be HTML encoding twice, which isn't the result you want!
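The double-encoding effect is easy to reproduce outside of .NET. The sketch below uses Python's xml.sax.saxutils as a stand-in for the escaping that an XML writer applies on save and reverses on load; it is an illustration of the principle, not of the configuration API itself.

```python
from xml.sax.saxutils import escape, unescape

original = "<b>Testing</b>"

# Step 1: the application escapes the markup itself before saving.
pre_escaped = escape(original)

# Step 2: the XML writer escapes again on save, so each & becomes &amp;.
written = escape(pre_escaped)

# On load, the XML layer undoes only its own single level of escaping.
read_back = unescape(written)

# Saving the raw markup instead round-trips cleanly.
clean_round_trip = unescape(escape(original))
```

After this runs, read_back is still the escaped text from step 1 rather than the original markup, whereas clean_round_trip equals the original. Leaving the escaping to the XML layer is what avoids the stray &amp; sequences.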
I have an HTML file with a ® (registered trademark) and a ™ (trademark) symbol in the text. These are just two among many other symbols. When I read the HTML file into a literal control it converts the symbols to something else.
The ® symbol converts to � (open box in ff)
The trademark symbol converts to ™ (as expected)
If (System.IO.File.Exists(FullName)) Then
    Dim StreamReader1 As New System.IO.StreamReader(FullName)
    Contents.Text = StreamReader1.ReadToEnd()
    StreamReader1.Close()
End If
Contents is a <asp:Literal runat="server" ID="Contents"></asp:Literal> and it's the only control in the aspx page.
From some research I think this is related to the encoding, but I don't know why it would change or how to fix it.
The html file does not contain any Content-Type settings in the head section.
If it's at all possible to shift this processing to the Render method, you could use HttpResponse.WriteFile to see if it handles these characters better than the Literal control does. If you're doing nothing with the content of this file other than assigning it to the control and then letting it render, then you should be able to do this OK.
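On the encoding suspicion raised in the question: one plausible cause, and only an assumption here since the file's actual encoding isn't stated, is that the file was saved in Windows-1252 and then read as UTF-8. The Python sketch below reproduces that mismatch for the ® character; the sample text is made up.

```python
# Assumption: the HTML file was saved as Windows-1252, where the
# registered-trademark sign is the single byte 0xAE.
raw_bytes = "Acme\u00ae".encode("cp1252")

# Reading those bytes as UTF-8 fails on 0xAE, which is not valid on its
# own in UTF-8, so the decoder substitutes U+FFFD (the "open box").
wrong = raw_bytes.decode("utf-8", errors="replace")

# Reading with the encoding the file was actually saved in works.
right = raw_bytes.decode("cp1252")
```

If that is what is happening, passing the matching encoding when opening the file, or re-saving the file as UTF-8, would be the things to try. A ™ written as an HTML entity such as &trade; would be unaffected either way, which would fit the behaviour described.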