Inconsistent display of unicode characters on web pages - asp.net

I have a C program that outputs some data including non-English Unicode characters. It works fine in a Windows 7 command window and in a Linux telnet session, but when its output is used to fill a Label field on an ASP/HTML page it does not work in all situations: the platform that the web server is running on seems to affect the output.
On a machine running Windows XP SP3 the program works fine in a command window, but in the web page the characters are wrong, e.g. Ø is displayed as Ï.
The web page works fine where the web server is on Windows 7 and Server 2003 SP2. Web browser choice makes no difference.

The problem is probably one of character encoding.
The character encoding can be specified in each page or by setting a default value in the web server.
IIS on Windows XP probably has ISO-8859-1 as the default character set.
You can change it either by configuring IIS or by specifying the character set in each HTML page.

When Ø (U+00D8) is displayed as Ï (U+00CF), the probable explanation is that the HTML page is ISO-8859-1 or Windows-1252 encoded but the browser is interpreting it as CP 850 encoded. Check this using View → Encoding in your browser: it will show you the encoding currently being applied to interpret the page, and you can change that to Windows-1252, which exists there under some name (like “Western European (Windows)”). CP 850 is not the only candidate; there are some other encodings in which the byte 0xD8 is interpreted as Ï.
Whether or not this turns out to be the right explanation, check the actual and declared character encoding of the page and make sure that they match. See the W3C page Character encodings.
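The mismatch described in this answer can be reproduced in a few lines. This is only a sketch, assuming the page bytes are ISO-8859-1 and the reader applies CP 850; the variable names are made up for illustration.

```python
# Assumption: the page is written as ISO-8859-1 but read back as CP 850.
page_bytes = "Ø".encode("iso-8859-1")      # the single byte 0xD8
misread = page_bytes.decode("cp850")       # what a CP 850 reader shows
correct = page_bytes.decode("cp1252")      # what a Windows-1252 reader shows

print(page_bytes.hex(), misread, correct)
```

The same byte yields Ï under CP 850 and Ø under Windows-1252, which is why making the declared and actual encodings match resolves the symptom.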

Related

How to include special characters in RDLC report without t2embed error

I am having issues getting special characters to appear in RDLC, more specifically Spanish accented characters such as:
Ñ
Á
Í
Ó
Ú
The error is : System.DllNotFoundException: Unable to load DLL 'T2Embed': Access is denied
It seems that GoDaddy does not have this extension and it is not being moved over during deployment. I have tried to transfer it manually, and different error messages appear. In case permissions were the issue, I used the Plesk admin portal to give the file T2Embed.dll full control permissions.
I could simply add restrictions to prevent these characters from being entered in the first place, but since the audience of these reports is from a Spanish-speaking country, I would rather allow entry and fix RDLC to handle it.
That is a GoDaddy issue with font embedding. The solution is to set "Device info" to "Not embedding fonts".
But then your PDF does not show the proper characters on some devices; Android phones do not show Unicode characters at all. I think we should change hosting.
Just copy the "C:\Windows\SysWOW64\t2embed.dll" file to your bin folder.

# in URL required but doesn't work on PC in Qt app

I have a Qt app that I have inherited and have to support. There is a piece of code that generates a URL that looks like this:
http://foo.bar.com:8000/#/workitem/71327434512586907410/report
The page is then loaded with setUrl.
On a Mac this works fine, but on Windows the page is not loaded, and I do not even see the request reach the server. I found this:
https://bugreports.qt.io/browse/QTWEBKIT-56
On the Mac where it works I do not see the # in the request that the server gets. But if I remove the # in the code, I get a 404.
So my questions are these:
What does the # mean in this context?
Why is it required for this URL to be recognized?
Why does it work on Mac and not on Windows? Is it that bug in the link?
The webserver is nginx, and the framework is falcon.
I have a bit more info on this.
When the URL contains the # I see in the nginx log this:
POST /workitem/67876029556368716590/report
And the request is successfully served.
But when the URL does not have the # I see this in the log:
GET /workitem/67876029556368716590
And that returns the 404.
Another update:
I have figured out that the # is an AngularJS routing thing:
AngularJS routing without the hash '#'
So now my only question is: is there a Qt bug that is preventing this from working on Windows?
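Why the server never sees the # can be illustrated by splitting the URL from the question. This is a minimal sketch using Python's standard urllib.parse; it shows only generic URL structure, not anything specific to Qt or AngularJS.

```python
from urllib.parse import urlsplit

url = "http://foo.bar.com:8000/#/workitem/71327434512586907410/report"
parts = urlsplit(url)

# The path is what an HTTP client puts in the request line.
request_path = parts.path or "/"
# The fragment stays in the client; script running in the page can read it.
client_side_route = parts.fragment

print(request_path)
print(client_side_route)
```

The request path is just "/", while "/workitem/71327434512586907410/report" lives entirely in the fragment, which is consistent with the # being a client-side routing device.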
'#' is an unsafe character and should be encoded.
All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
systems that do not normally deal with fragment or anchor
identifiers, so that if the URL is copied into another system that
does use them, it will not be necessary to change the URL encoding.
http://www.ietf.org/rfc/rfc1738.txt
This turned out to be a big red herring. I was able to modify the system so that I did not need the # in the URL, and it still didn't work. It turned out that the code was trying to download jQuery over SSL, which did not work on systems that did not have the SSL libraries installed. I changed the code to download jQuery from my own server instead of the internet, and it all worked.

Unicode character not showing properly

I am working on a classic ASP application hosted in IIS 6. In one ASP page the user enters some data, and this data is e-mailed using the Jmail utility.
When the user enters Swedish characters like äöü, the mail does not display these characters properly. I found that setting charset in the tag will help, but it is causing me more confusion. The website is hosted on two machines, and the application behaves quite differently on each of them.
Machine 1:
If I set Charset to UTF-8 unicode characters are displayed as two characters. Browser is sending data in UTF encoding but server is decoding in ASCII.
If I set the charset to ISO-8859-1 unicode characters are displayed properly.
Machine 2:
If I set Charset to UTF-8 unicode characters are displayed properly.
If I set Charset to ISO-8859-1 unicode characters are not displayed at all.
Question:
How can I make the same code work in both places?
Try setting the encoding on both machines to UTF-8.
How to set encoding in IIS.
Copy all the files from machine 2 (the one that works only with UTF-8) to machine 1, and then try again.
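The "displayed as two characters" symptom reported for machine 1 is what appears when UTF-8 bytes are decoded with a single-byte character set. The sketch below assumes ISO-8859-1 as that single-byte set (the question says "ASCII", which cannot represent these bytes at all); the variable names are illustrative.

```python
typed = "äöü"

# What a browser sends when the form is submitted as UTF-8: two bytes per letter.
wire_bytes = typed.encode("utf-8")

# Assumption: the server decodes those bytes as ISO-8859-1 instead of UTF-8.
misdecoded = wire_bytes.decode("iso-8859-1")

# Undoing the wrong decode recovers the original text.
repaired = misdecoded.encode("iso-8859-1").decode("utf-8")

print(len(wire_bytes), misdecoded, repaired)
```

Each of the three letters turns into two characters, so the fix has to make the decoding side agree with the encoding the browser actually used.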

Different behaviours of treating \ (backslash) in the url by FireFox and Chrome

BACKGROUND
When my Ubuntu workstation was joined to a domain with Active Directory, the user name created for me followed this pattern:
domain_name\user_name
Using the userdir extension of Apache on Linux requires the user name in the URL in order to access public_html in the home directory:
http://localhost/~domain_name\user_name
PROBLEM A:
Chrome converts all backslash '\' characters in the URL to forward slashes '/', so the resulting URL, shown here, is totally different and always results in Not Found:
http://localhost/~domain_name/user_name
Firefox, on the other hand, does not convert the backslash to a forward slash, so the HTTP request for the intended target is served by the web server.
The common solution is to encode the backslash as %5C.
PROBLEM B:
If we use a similar path (containing \) in a CSS @import construct, the HTTP GET request for the CSS file fails with a 404 error, and the URL reported in the 404 error is missing the \ altogether. This means the \ is removed from the URL before the GET request is issued.
This behavior is common to Firefox and Chrome, but they need different solutions.
Firefox needs an escaped backslash to work in the CSS import process:
@import url("http://localhost/~domain_name\\user_name/path/to/css");
Chrome, as usual, needs the encoded backslash:
@import url("http://localhost/~domain_name%5Cuser_name/path/to/css");
What is the unified solution for dealing with \ in a URL?
Is there a way to avoid a \ appearing in the user name?
The unified solution to deal with backslash in a URL is to use %5C. RFC 2396 did not allow that character in URLs at all (so any behavior regarding that character was just error-recovery behavior). RFC 3986 does allow it, but is not widely implemented, not least because it's not exactly compatible with existing URL processors.
Chrome, in particular, does the same thing as IE: assumes you meant a forward slash any time you type a backslash, as you discovered, because that's what Windows file paths do.
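The %5C form can be produced programmatically rather than typed by hand. A minimal sketch, assuming the user name from the question and Python's standard urllib.parse.quote:

```python
from urllib.parse import quote

user = "domain_name\\user_name"          # one literal backslash
encoded_user = quote(user, safe="")      # percent-encode everything unsafe
url = "http://localhost/~" + encoded_user

print(url)
```

Only the user-name segment is passed through quote, so the scheme, host and the slashes that separate path segments are left untouched.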
Try using the Slashy add-on in Firefox to help you with it. Here's a link to it:
Slashy
This backslash auto-conversion issue has been fixed in Chrome version >= 53.0.2785.116.
Backslashes are now treated properly as %5C.

UTF-8 server encoding results in � characters on an ASP.NET site

I am running an ASP.NET WebForms blog engine web site at maxpavlov.com
I write mostly in Russian on my blog. Sometimes, even though I type perfectly normal Russian characters, when I view the rendered blog post page I get some symbols replaced with �� characters.
I started digging. First, I checked whether UTF-8 is set as the response encoding in the globalization section of web.config. It always was. Then I noticed that the pages my site generates did not have a <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> element in the page header, so I added it to both master pages (the display one and the admin one; this is BlogEngine.NET-specific stuff).
Now all pages that the web server generates have the charset value set to UTF-8, but the problem remains.
When I create a blog post, the site saves it to an XML file, which also has its encoding set to UTF-8 by the following line at the top of the file:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
Still, problem characters appear in the browsers, when I go to my site.
Where else should I fix this encoding problem?
More info: Fiddler tells me that the response header Content-Type: text/html; charset=utf-8
What is interesting, is that in different browsers, different characters in the HTTP Response get substituted with a �.
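For reference, one general way a pair of � characters can arise is a multi-byte UTF-8 sequence being split before it is decoded. The sketch below only demonstrates that mechanism in isolation; it is an assumption for illustration and does not claim to reproduce what happens on this site.

```python
letter_bytes = "я".encode("utf-8")       # a Cyrillic letter is two bytes in UTF-8

# Assumption: something cuts the sequence in half and each half is decoded alone.
first_half, second_half = letter_bytes[:1], letter_bytes[1:]
damaged = (first_half.decode("utf-8", errors="replace")
           + second_half.decode("utf-8", errors="replace"))

print(len(letter_bytes), damaged)
```

Each orphaned byte decodes to U+FFFD, so a single damaged letter shows up as two replacement characters.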
By the way, if anyone still wonders what the cause was: it is the IIS native RewriteModule. It is buggy even in version 2; if you disable it for the site, the problem goes away. I tried to report it on IIS.net, but they didn't believe me. I just learned to live without it on web sites that need to display Cyrillic characters.
Try using Windows-1251 (Cyrillic) encoding for the Russian alphabet.

Resources