Special character in HTML output, likely due to an encoding issue - asp.net

I am seeing a special character in the ASP.NET page I am rendering.
This page reads its content from the XML response of a REST service.
If I load the XML in a browser, it displays "-" fine. (It's longer than the usual dash :))
But when I print it on the ASPX page in a repeater using Eval, it displays a special character.
The page has a meta tag.
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
However, the browser detects the page encoding as UTF-8.
I am looking for a solution so that I can get rid of the special character.

The character is probably code 150 or 151 in the Windows-1252 code page (ASCII itself stops at 127). Some programs (notably MS Word) use these for the en dash and the em dash. The problem is that ISO-8859-1 assigns only non-printing control codes to the range 128-159, so you cannot be sure how the browser will display the character.
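The following function (just typed in, not checked) re-encodes your source string to ISO-8859-1 bytes and then decodes those bytes as UTF-8, which repairs text that was really UTF-8 but was read as ISO-8859-1:
// Requires: using System.Text;
static string MakeUTF8String(string sourceStr)
{
    // Recover the raw bytes, then decode them as UTF-8.
    byte[] b = Encoding.GetEncoding("iso-8859-1").GetBytes(sourceStr);
    return Encoding.UTF8.GetString(b);
}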
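To check the byte values mentioned in this answer, here is a minimal Python sketch. Python is used only as an illustration and is not part of the ASP.NET code; the helper names cp1252_code and repair_mojibake are made up for this example. It shows the Windows-1252 codes of the two dashes and the same "encode as ISO-8859-1, decode as UTF-8" repair, which only helps when the text was really UTF-8 that got read as ISO-8859-1.

```python
# Illustration only: Python stands in for the .NET Encoding calls.
EN_DASH = "\u2013"
EM_DASH = "\u2014"


def cp1252_code(ch: str) -> int:
    """Return the single-byte Windows-1252 code of one character."""
    return ch.encode("cp1252")[0]


def repair_mojibake(text: str) -> str:
    """Undo a wrong ISO-8859-1 decode of bytes that were really UTF-8."""
    return text.encode("iso-8859-1").decode("utf-8")


if __name__ == "__main__":
    print(cp1252_code(EN_DASH), cp1252_code(EM_DASH))
    # Simulate the damage: UTF-8 bytes read as ISO-8859-1.
    garbled = EN_DASH.encode("utf-8").decode("iso-8859-1")
    print(repr(garbled))
    print(repair_mojibake(garbled) == EN_DASH)
```

If the service actually sends Windows-1252 bytes rather than UTF-8, this repair will fail with a decode error, so it is worth checking the raw response bytes first.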

Related

What is causing my browser to render an asp's &nbsp incorrectly?

I have an ASP page rendering some text from a table into HTML. Some of the text has the non-breaking-space character in it (Unicode U+00A0). The browser auto-detects the character encoding to be Unicode, which is good, but it isn't rendering the non-breaking spaces correctly. It is rendering them as � (the replacement character). When I change the page encoding to be "Western" instead of "Unicode", the � characters disappear.
Shouldn't the non-breaking-space be a normal character for a Unicode encoded web page to render? What is happening to cause this?
I have verified that the character stored in the database is the non-breaking-space by using SQL Server's ASCII and UNICODE functions, both return 160.
Also, when I run this code snippet String.fromCharCode(160) it returns " ", so the browser does seem to understand that character is supposed to be a space. Could the ASP be messing those characters up between querying them and writing them as html?
The ASP file was saved with ANSI encoding. Switching the file's encoding to UTF-8 solved the problem. I'm guessing that even though the page said its charset was UTF-8, it really wasn't. This explains why "Western" encoding worked while "Unicode" did not.
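A small Python sketch of why that happens, assuming "ANSI" here means Windows-1252 (the usual meaning on Western Windows installs, but an assumption). In that encoding the non-breaking space is the single byte 0xA0, which is not a valid UTF-8 sequence on its own, so a UTF-8 decoder substitutes the replacement character. The show helper is a made-up name for this example.

```python
NBSP = "\u00a0"

# The same character as stored by an "ANSI" (assumed Windows-1252) file
# and as stored by a UTF-8 file.
ansi_bytes = NBSP.encode("cp1252")
utf8_bytes = NBSP.encode("utf-8")


def show(raw: bytes, charset: str) -> str:
    """Decode bytes the way a browser would for the given charset."""
    return raw.decode(charset, errors="replace")


if __name__ == "__main__":
    print(ansi_bytes, utf8_bytes)
    print(repr(show(ansi_bytes, "utf-8")))    # viewed as "Unicode"
    print(repr(show(ansi_bytes, "cp1252")))   # viewed as "Western"
    print(repr(show(utf8_bytes, "utf-8")))    # file saved as UTF-8
```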

Getting non-English character from query string

I have this in my querystring - sug_zehut=ז
(ז is a Hebrew letter)
Although I'm well aware that this is bad practice, I have to receive it like so in my query string (not my code..)
When I write it to a hidden field I get sug_zehut=%EF%BF%BD as a part of the querystring, and when I try to put it in a string and put that in a hidden field, I get � (I found here that those two are the same).
Anyhow, the question is - How do I get the value ז to my variable?
(I'm using .net version 4)
Thanks.
%EF%BF%BD is the percent-encoded UTF-8 form of the � symbol (U+FFFD, the replacement character). That means your server could not decode the character, because it was sent in one encoding and read as UTF-8.
Try to set UTF-8 encoding everywhere (at least while testing the problem):
In web.config
http://msdn.microsoft.com/en-us/library/ydkak5b9(v=vs.71).aspx
In the page
<meta charset="utf-8"> vs <meta http-equiv="Content-Type">
In the actual file encoding, if the character appears literally in a source file (.cs, .cshtml, .aspx, .js, ...)
If that doesn't work, please tell me how you get this URL and how you navigate to it.
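To make the %EF%BF%BD symptom concrete, here is a short Python sketch. It assumes the sender percent-encodes the letter as Windows-1255 (a common Hebrew code page; the question does not say which encoding is actually used), and the round_trip helper is a made-up name for this example.

```python
from urllib.parse import quote, unquote

LETTER = "\u05d6"  # the Hebrew letter from the question


def round_trip(ch: str, sent_as: str, read_as: str = "utf-8"):
    """Percent-encode with one charset, decode with another, re-encode."""
    wire = quote(ch, encoding=sent_as)
    decoded = unquote(wire, encoding=read_as, errors="replace")
    return wire, decoded, quote(decoded)


if __name__ == "__main__":
    # Mismatch: sent as Windows-1255 (assumed), read as UTF-8.
    print(round_trip(LETTER, "windows-1255"))
    # Match: sent and read as UTF-8.
    print(round_trip(LETTER, "utf-8"))
```

In the mismatched case the decoded value is the replacement character, and writing it back out as UTF-8 gives exactly %EF%BF%BD.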
Here's a small function to convert your character to an HTML-encoded numeric character reference (&#1494; in this case). HtmlEncode won't do it for you in most versions of .NET, unfortunately.
private static string UnicodeConvertChar(char input)
{
    if (input > 159)
    {
        return "&#" + ((int)input).ToString() + ";";
    }
    else
    {
        return input.ToString();
    }
}
You can convert it before setting it in your hidden field, and then when you get it back, you can simply read it by using HttpUtility.HtmlDecode.
Here's a dotnetfiddle for you to play around with it:
https://dotnetfiddle.net/jbDY0V
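For readers who want to try the same round trip outside .NET, here is a rough Python equivalent. The function name to_numeric_refs is made up for this example; it keeps the same cutoff of 159, and html.unescape plays the role of HttpUtility.HtmlDecode.

```python
import html


def to_numeric_refs(text: str, threshold: int = 159) -> str:
    """Replace every character above the threshold with &#NNNN;."""
    return "".join(
        f"&#{ord(ch)};" if ord(ch) > threshold else ch for ch in text
    )


if __name__ == "__main__":
    encoded = to_numeric_refs("sug_zehut=\u05d6")
    print(encoded)
    # Decoding the references gives back the original text.
    print(html.unescape(encoded) == "sug_zehut=\u05d6")
```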

What is the expected encoding for QWebView::setHtml?

I found a strange effect that I do not understand: I have a HTML file encoded in UTF-8. It also has a meta element with content="text/html; charset=UTF-8"/>.
If I load the HTML file in QWebView, it is displayed correctly.
If I load the HTML file in a QByteArray (still looks like valid UTF-8), convert it into a QString (still looks like valid UTF-8), and set this via setHtml on the QWebView, it is displayed incorrectly (as if interpreted as ASCII).
If I take the same QByteArray, and set it via setContent on the QWebView, passing "text/html; charset=UTF-8" as mime type, it is displayed correctly again.
What is the expected encoding for QWebView::setHtml? The documentation only mentions that external CSS and script files are interpreted as UTF-8. This is using Qt 4.8.2.
There is no expected encoding, because the text should already have been decoded to 16-bit Unicode when you created the QString. It's up to you to do that correctly; if you used the QString(const QByteArray&) constructor, Qt will by default treat the contents as ASCII (in Qt 4 that effectively means Latin-1 unless a codec has been set with QTextCodec::setCodecForCStrings).
If you want to treat the content as UTF-8 then you can use QString::fromUtf8. If you need to do something more sophisticated you can use QTextCodec to read many different encodings.
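The effect of picking the wrong decoder can be sketched without Qt. The Python below is only an analogy: decoding with "latin-1" stands in for the default byte-array-to-string conversion, and decoding with "utf-8" stands in for QString::fromUtf8. The load_html helper and the sample markup are made up for this example.

```python
def load_html(raw: bytes, charset: str) -> str:
    """Turn raw file bytes into text using an explicit charset."""
    return raw.decode(charset)


# Bytes as they would sit in a UTF-8 encoded HTML file.
raw = "<p>\u00e9</p>".encode("utf-8")

if __name__ == "__main__":
    print(load_html(raw, "latin-1"))  # each UTF-8 byte becomes its own character
    print(load_html(raw, "utf-8"))    # the accented letter survives
```

Once the bytes have been decoded with the wrong charset, the resulting string no longer carries any encoding information, which is why a meta element inside the document cannot fix it afterwards.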
I tried many approaches to solve this problem; the one that worked was:
QTextCodec::setCodecForCStrings(QTextCodec::codecForName("UTF8"));
because QtWebKit converts to std::string internally.
I used setContent(bytearray, "text/html; charset=utf-8") and it worked. The "utf-8" should be in lowercase.

Converting French Characters to ASCII on form submit

Hi there, I was wondering if anyone can help with this. I am submitting values to an Access database using classic ASP and need to convert French characters to ASCII. I have done it before with a form-to-email script. Here is the code that I used, with the first line being the code that writes a value to the database field. Any help would be great.
[code]
'--------------------------------------------------------------------------
' Checks form fields and headings for French Characters and replaces them
' with the ASCII equivalent.
'--------------------------------------------------------------------------
rsAddComments.Fields("Customer") = Request.Form("Customer")
body = Replace(body,"À",chr(192))
body = Replace(body,"Á",chr(193))
body = Replace(body,"Â",chr(194))
body = Replace(body,"Î",chr(206))
[/code]
Try to just put the ISO Latin 1 character set on the page:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
This works for me.
Actually, that will not work. It is the way that Access reads the text, same as Outlook. I had to convert to ASCII for Outlook. What I did for that was
body = Replace(body,"é",chr(999))
Now I need to replace all of the values from the form fields with the ASCII equivalents using ASP before it writes to the database. What I don't know is what to put in place of BODY in the above code.

Arabic QueryString problem (???? in the value)

I am sending an Arabic value in a querystring. When retrieving it on the server, the value is wrong: it is replaced by question marks (????).
for example:
http://server/mypage.aspx?qs=مرحبا
the value of Request.QueryString("qs") is ?????
Note that Response.Write('مرحبا') executes correctly.
Any idea about this querystring problem?
Thanks.
Just URL Encode the Arabic string and it should work fine.
Edit: You must URL Encode the string before putting it in the querystring.
For instance, if you were to url encode the space character, it will appear as %20 in your querystring, like this:
http://foo.com/dosomething?param1=hello%20world
Then when you read param1 you URL Decode it, and you get the string "hello world"
You could also URL Encode every single character but for regular characters it's pointless.
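As a quick illustration of that encode/decode round trip, here is a Python sketch. Python's urllib.parse stands in for Server.UrlEncode and Server.UrlDecode, and it assumes both sides use UTF-8. The URL is the example one from the question, and build_url is a made-up helper name.

```python
from urllib.parse import quote, unquote

GREETING = "\u0645\u0631\u062d\u0628\u0627"  # the Arabic word from the question


def build_url(value: str) -> str:
    """Percent-encode the value (as UTF-8) before putting it in the URL."""
    return "http://server/mypage.aspx?qs=" + quote(value)


if __name__ == "__main__":
    url = build_url(GREETING)
    print(url)
    # The receiving side decodes the parameter back to the original text.
    print(unquote(url.split("qs=", 1)[1]) == GREETING)
    print(quote("hello world"))
```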
I sent Arabic text in my query string,
and when I received the string it was still URL-encoded.
After calling Server.UrlDecode:
departmentName = Server.UrlDecode(departmentName);
it was back in Arabic again.
So just use Server.UrlDecode(encodedString);
Hope this helps.
I had a similar problem and solved it by putting the following line in my web.config file:
<globalization fileEncoding="windows-1256"
requestEncoding="windows-1256" responseEncoding="windows-1256"/>
And this in the head section of my HTML page:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Non-English characters can't be passed without being encoded,
so you need to encode the value before you redirect to the target page, as follows:
string text="مرحبا";
text=Server.UrlEncode(text);
string url="http://server/mypage.aspx?qs="+text;
Response.Redirect(url);

Resources