I have a page in ASP.net (VB) that I'm serving via IIS.
The page is basically a translation of the UK site.
I have:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
at the top of the code, and all the characters show OK in the code.
However, in all browsers many of the special Polish characters, such as 'Ł', are missing and replaced directly with 'L'.
Is this an IIS thing, or could it be something else?
ETA: I just noticed that the Polish text drawn from the SQL database is displayed correctly within the same page. Odd!
Further edit:
I have found the basic source of the issue, I think, but not a solution:
The areas that are not showing properly are headers and footers, which are imported into the page via Server Side Include.
It seems some sort of encoding is being lost in this import / injection.
Should the imported file have some sort of encoding header?
This sounds like a problem with encoding in your static content files. The content-type <meta> has no bearing on the actual physical encoding of the file. I have a suspicion the file is saved in Codepage 1252 instead of UTF-8.
I suggest you open your *.aspx files (where I assume you're storing the problematic Polish text) in a text editor that supports different encodings (such as VS or Notepad2, not WordPad or Windows Notepad). Force-save the file with UTF-8 encoding (in VS, go File > Advanced Save Options and ensure "Unicode (UTF-8 with signature)" is selected). Then access your site again.
Also ensure that the Content-Type HTTP header is correctly set to UTF-8.
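To illustrate why the save encoding matters, here is a minimal Python sketch. The sample string is my own assumption, not text from the site. It shows that 'Ł' has a UTF-8 byte sequence but no slot at all in Codepage 1252, which is one plausible reason an editor saving in 1252 would fall back to a plain 'L'.

```python
# Illustrative only: the sample string is an assumption, not text from the site.
text = "Łódź"

# UTF-8 can represent every Unicode character; "Ł" becomes two bytes.
utf8_bytes = text.encode("utf-8")

# Windows-1252 has no code point for "Ł", so a strict encode fails.
try:
    text.encode("cp1252")
    fits_cp1252 = True
except UnicodeEncodeError:
    fits_cp1252 = False

print(utf8_bytes)
print(fits_cp1252)
```

If the strict encode fails for your text, the file cannot be stored faithfully in 1252 and needs to be saved as UTF-8.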
Related
I built a static site initially and am now in the process of converting it to a WordPress site. You can find it here. The last image in the right column, when clicked, should open up a fancybox and play a video. It worked very well in the static site, but for some reason in WordPress the box appears at the bottom of the page instead of the center. I'm pretty sure it is seeing the CSS because I can click on the link and find it.
This is the result of the validation of your page
http://validator.w3.org/check?uri=http://training.mercury.stellarbluewebdesign.com/LittlestTumorFoundation/
Notice the comment :
Byte-Order Mark found in UTF-8 File.
The Unicode Byte-Order Mark (BOM) in UTF-8 encoded files
is known to cause problems for some text editors and older
browsers.
Also notice
Line 1, Column 1: Non-space characters found without seeing a
doctype first. Expected <!DOCTYPE html>
Passing your code through an editor in ansi mode (and showing all symbols), this is what I get :
Those hidden characters preceding the DOCTYPE in your document make your browser run in quirks mode, hence the unexpected behavior of fancybox (which needs the document in standards mode to run properly).
What you have to do is save your WP (PHP) files in an editor using UTF-8 without BOM encoding and upload them again (and, if necessary, force your FTP software to upload in binary mode).
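If re-saving every file by hand is impractical, the BOM can also be detected and removed programmatically. This is a minimal Python sketch under my own assumptions: the helper name strip_utf8_bom and the sample bytes are hypothetical, and you would still need to read and rewrite your actual files in binary mode.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte-order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical file content: a BOM (EF BB BF) followed by the doctype.
sample = codecs.BOM_UTF8 + b"<!DOCTYPE html>"
cleaned = strip_utf8_bom(sample)
print(cleaned)
```

After stripping, the doctype is the first thing in the output again, which is what the validator complained about.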
I put <meta http-equiv="content-type" content="text/html; charset=utf-8" /> in the head, and DOCTYPE at the top of my _Layout.cshtml, but when I view the website the special Turkish characters are still displayed as something like '1; in the page source, not in the rendered page. The webpage displays them correctly; only the source has the problem.
Do you know what else I should do to correct this?
Try to save your source file as UTF-8 with BOM.
I was having a similar issue and it seems that Razor expects files to have the BOM signature.
In Visual Studio: FILE > Advanced Save Options:
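For files written outside Visual Studio, the same "UTF-8 with signature" save can be approximated in Python with the utf-8-sig codec. This is only a sketch; the Razor fragment below is a made-up example.

```python
import codecs

# Hypothetical Razor fragment containing Turkish characters.
razor_source = "<p>ışık</p>"

with_bom = razor_source.encode("utf-8-sig")   # UTF-8 with signature (BOM)
without_bom = razor_source.encode("utf-8")    # plain UTF-8, no BOM

# The only difference between the two is the three-byte signature up front.
has_signature = with_bom.startswith(codecs.BOM_UTF8)
print(has_signature)
```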
I had exactly the same problem and tried all the solutions, but none of them worked. I checked the .cshtml file encoding with Notepad++ and it was correct (UTF-8 with BOM), and I re-saved the file with that encoding (as shown above), but the problem persisted.
Finally I found out that there was nothing wrong with my Razor files. It seems that ASP.NET Core MVC does not set UTF-8 encoding as the default encoding, and the developer must set it if needed.
I added the code below at the end of the ConfigureServices method of Program.cs (it allows all Unicode ranges through the default web encoders without escaping) and the problem was solved.
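services.AddWebEncoders(o => {
    // Allow every Unicode range, so non-ASCII characters are emitted as-is
    // instead of being escaped as numeric character references.
    o.TextEncoderSettings = new System.Text.Encodings.Web.TextEncoderSettings(System.Text.Unicode.UnicodeRanges.All);
});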
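To see why the page can look right while the source looks wrong, here is a rough Python analogy. It is not the ASP.NET Core encoder itself, and the sample word is my own assumption. It escapes every non-ASCII character as a numeric character reference, the way a conservative HTML encoder would, and then resolves the references the way a browser does when rendering.

```python
import html

# Hypothetical page text with Turkish characters.
text = "ışık"

# Escape everything outside ASCII as numeric character references,
# similar in spirit to a conservative HTML encoder.
escaped = text.encode("ascii", "xmlcharrefreplace").decode("ascii")

# A browser resolves the references when rendering the page.
rendered = html.unescape(escaped)

print(escaped)
print(rendered)
```

The escaped form is what "view source" shows; the rendered form is what the visitor sees. Widening the allowed ranges only changes the first of the two.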
When I put this word "Bibliothèque" in a .aspx page, I see it correctly "Bibliothèque".
If I put the same word in a .html file, the "è" comes out garbled.
How can this be possible? Must be an IIS issue but I can't find the setting.
How can a .aspx file show the right word but not a .html file?
Open the file named web.config in the ASP.NET project. The value of the requestEncoding attribute in the globalization element is "utf-8". It means that request text is treated as UTF-8 encoded.
Check which character encoding your browser is using; you can change it in the browser's character encoding setting. Your HTML file is being rendered according to the browser's character encoding.
To ensure it will always work, for this specific example, you can replace the non-ASCII characters using HTML entities, like this: Biblioth&egrave;que. But this is not always practical in general.
Otherwise, there are other various ways to make it work:
save the file as UTF-8 with a byte order mark (sometimes called 'signature', or BOM, by editors)
add a META character encoding to your HTML file.
define what HTTP headers will be sent to the client using the globalization element in the application web.config (responseEncoding, etc.)
define what HTTP headers will be sent to the client using the ASP.NET @ Page directive
The best is to make sure all this is consistent in your application. UTF-8 support is now widespread, so it's a good choice as the encoding.
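To make the mismatch concrete, here is a small Python sketch. It assumes the .html file was saved as UTF-8 and then read as Windows-1252, which is one common way this symptom arises but not necessarily what happened on your server. It also shows that the entity form round-trips regardless of the file encoding.

```python
import html

word = "Bibliothèque"

# Bytes written by an editor that saves as UTF-8.
utf8_bytes = word.encode("utf-8")

# What a reader sees if those bytes are interpreted as Windows-1252.
misread = utf8_bytes.decode("cp1252")

# The entity form is pure ASCII, so it survives any of these encodings.
entity_form = "Biblioth&egrave;que"
decoded_entity = html.unescape(entity_form)

print(misread)
print(decoded_entity)
```

The two bytes of the UTF-8 "è" each turn into a separate Windows-1252 character, which is why a single accented letter shows up as two odd ones.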
An interesting article on the encoding subject: The Definitive Guide to Web Character Encoding
I'm still having problems with loading the style sheet for these pages. Works fine in Safari, but in Firefox and IE, no joy:
http://www.mainstayprojects.com/teardrop.html and
www.mainstayprojects.com
Although I am clearer as to what's causing this problem, thanks to the answers to my previous posting (stackoverflow.com/questions/3273655/css-file-not-loading), I am at a loss as to how to fix the issue. I have re-saved many times with different doctypes and content type meta tags, as well as saving the file as a charset=utf-8 file, but have not been able to make any headway!
Really need some help.
Your server is still claiming the HTML document is ISO-8859-1 (although the document itself looks like UTF-8).
Meanwhile the stylesheet appears to be UTF-8, the server fails to state what encoding it is, and the first line of the stylesheet claims that it is UTF-16.
Pick an encoding
Configure your editor to use it
Configure your server to specify it
If you put any information about the encoding at the document level — get it right!
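A quick way to check the last two points is to compare what the server says with what the document says. This is a minimal Python sketch with hypothetical helper names (header_charset, meta_charset, declarations_agree) and made-up sample values; a real check would fetch the actual response header and body.

```python
import re
from email.message import Message


def header_charset(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def meta_charset(html_text):
    """Return the first charset declared in the markup, lowercased, or None."""
    match = re.search(r'charset\s*=\s*["\']?([\w-]+)', html_text, re.IGNORECASE)
    return match.group(1).lower() if match else None


def declarations_agree(content_type, html_text):
    """True when the header and the document do not contradict each other."""
    declared = {header_charset(content_type), meta_charset(html_text)}
    declared.discard(None)
    return len(declared) <= 1


# Made-up values mirroring the situation described above.
page = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
print(declarations_agree("text/html; charset=ISO-8859-1", page))
```

Note that this only compares declarations. It does not verify that the bytes in the file actually match the declared encoding, which is the editor's job.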
We have a website that uses Classic ASP.
Part of our release process substitutes values in a file and we found a bug in it where it will write the file out as UTF-8.
This then causes our application to start spitting out garbage. Apostrophes get returned as some encoded characters.
If we then go and remove the BOM that says this file is UTF-8, the text that was previously rendered as garbage is displayed correctly.
Is there something that IIS does differently when it encounters a UTF-8 file?
I was searching on the same exact issue yesterday and came across:
http://blog.inspired.no/utf-8-with-asp-71/
Important part from that page, in case it goes away...
ASP CODE:
Response.ContentType = "text/html"
Response.AddHeader "Content-Type", "text/html;charset=UTF-8"
Response.CodePage = 65001
Response.CharSet = "UTF-8"
and the following HTML META tag:
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8" />
We were using the meta tag and asp CharSet property, yet the page still didn't render correctly. After adding the other three lines to the asp file everything just worked.
Hope this helps!
UTF-8 does not need a BOM; it is an annoying misfeature in some Microsoft software that puts one there. You need to find which step of your release process is putting a UTF-8-encoded BOM in your files and fix it. You should stop that even if you are using UTF-8, which these days really is the best choice.
But I doubt it's IIS causing the display problem. More likely the browser is guessing the charset of the final displayed page, and when it sees bytes that look like they're UTF-8 encoded it guesses the whole page is UTF-8. You should be able to stop it doing that by stating a definitive charset by using an HTTP header:
Content-Type: text/html;charset=iso-8859-1
and/or a meta element in the HTML
<meta http-equiv="Content-Type" content="text/html;charset=iso-8859-1" />
Now (assuming ISO-8859-1 is actually the character set your data are in) it should display OK. However if your file really does have a UTF-8-encoded BOM at the start, you'll now see that as ‘ï»¿’ in your page, which is what those bytes look like in ISO-8859-1. So you still need to get rid of that misBOM.
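The byte-level effects described here can be checked with a short Python sketch. The curly apostrophe is my own assumption about which character was being garbled; the BOM bytes are fixed by the UTF-8 encoding.

```python
import codecs

# The three BOM bytes (EF BB BF) read as ISO-8859-1 become three visible characters.
bom_as_latin1 = codecs.BOM_UTF8.decode("iso-8859-1")

# A curly apostrophe stored as UTF-8 but read as Windows-1252
# turns into three unrelated characters.
apostrophe_misread = "\u2019".encode("utf-8").decode("cp1252")

print(bom_as_latin1)
print(apostrophe_misread)
```

Both results come from the same cause: multi-byte UTF-8 sequences being read one byte at a time under a single-byte charset.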
If you are using an Access database, you should set the session code page before running the query:
Session.CodePage=65001
Set tabtable= Conn.Execute("SELECT * FROM table")