I was doing a bunch of search-replace operations in Notepad++ to effectively minify my CSS (mostly removing whitespace, tabs, etc.). This ended up breaking much of my CSS.
Apparently a strange character (â€‹) was inserted all over the place. Using Notepad++ in UTF-8 without BOM, I cannot see these, but they appeared in a view-source.
I was able to remove these by doing a search replace in ANSI encoding, but my question is, what is this character, and why might it have appeared?
The string “â€‹” is the UTF-8 encoded form of ZWSP (zero-width space, U+200B) when misinterpreted as windows-1252 encoded data. (Checked this using a nice UTF-8 decoder.) This explains why you don’t see it in Notepad++ in UTF-8 mode; ZWSP is an invisible character with no width.
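The misinterpretation described above can be reproduced in a few lines. This is only a sketch in Python (not something the original poster used); "cp1252" is Python's name for the windows-1252 codec.

```python
# Sketch: how a zero-width space turns into three visible characters.
ZWSP = "\u200b"                          # ZERO WIDTH SPACE

utf8_bytes = ZWSP.encode("utf-8")        # the bytes actually stored in the file
print(utf8_bytes.hex())                  # e2808b

# A consumer that wrongly assumes windows-1252 maps each byte to one character.
mojibake = utf8_bytes.decode("cp1252")
print(mojibake)                          # â€‹

# The damage is reversible as long as the bytes themselves were not altered.
restored = mojibake.encode("cp1252").decode("utf-8")
print(restored == ZWSP)                  # True
```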
Apparently browsers are interpreting the style sheet as windows-1252 encoded. Saving the file with a BOM might help, since then browsers would probably guess the encoding better. The real fix is to make sure (in a server-dependent manner) that the server sends an appropriate Content-Type header for the CSS file.
But if this is the only non-ASCII character in your CSS file, it does not matter in practice once you have removed the offending data.
I don’t know of any simple way to make Notepad++ insert ZWSP (you could of course use general character insertion utilities in the system), so it’s a bit of a mystery where it came from. Perhaps via copy and paste from somewhere.
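Since the character is invisible in the editor, one way to hunt for it is to scan the text for Unicode "format" characters (category Cf, which includes the zero-width space and the BOM). The helper names below are made up for illustration, and the sample string is an assumption standing in for a real stylesheet.

```python
import unicodedata

def find_format_chars(text):
    """Return (line, column, code point, name) for each invisible format character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.category(ch) == "Cf":
                hits.append((line_no, col, f"U+{ord(ch):04X}",
                             unicodedata.name(ch, "UNKNOWN")))
    return hits

def strip_format_chars(text):
    """Return the text with all category-Cf characters removed."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

# Hypothetical CSS fragment with a zero-width space hiding after the brace.
sample = "}\u200b\n.t{color:red}"
print(find_format_chars(sample))   # [(1, 2, 'U+200B', 'ZERO WIDTH SPACE')]
cleaned = strip_format_chars(sample)
```

Stripping every Cf character is a blunt choice; it is fine for a stylesheet, but some scripts rely on characters such as the zero-width joiner in ordinary text.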
Using the Web Developer extension in Firefox, you can see the problem character in the CSS document.
In Visual Studio all I could see was:
}
.t
Web developer showed an unwanted hidden character, an "a" with a caret on top:
}
â.t
The utf encoder link above revealed this
} (the encoded character for ampersand)
.t
I simply fixed the problem by deleting the affected text and retyping it.
Related
Okay, something just went crazy. Unless China is taking over starting with my test style.css file on my iepage - well I guess they are starting off on the right foot hating on IE, but anyways. It loads with no stylesheet - sad :( I go into the Web inspector and see that all my linked files are filled with [possibly] Chinese characters (瑨汭笠ऊ楷瑤...). I have tried deleting the files on the server and re-uploading them. The local files look fine, and when loading the files directly they look fine. I didn't do anything that should have changed the rendering either.
So I think I figured it out. This is weird. But anyway.
I copied and pasted your HTML to a local file to experiment with, and it loaded just fine. It was saved as UTF-8. Then I changed it to UTF-16, and I got exactly what you're seeing! As far as I can tell, the browser (Firefox for Linux for me) is assuming the linked files are all in the same encoding as the HTML...
So - I assume the file on the server is in UTF-16, and if you change it to UTF-8 you should be good. Hope that fixes it!
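The "Chinese characters" are what plain ASCII bytes look like when every pair of bytes is read as one UTF-16 code unit. The sketch below shows that effect and a BOM-based conversion to UTF-8; the CSS fragment and the `to_utf8` helper are assumptions for illustration, and UTF-32 is deliberately ignored.

```python
import codecs

# ASCII/UTF-8 bytes of a hypothetical stylesheet start, read as UTF-16 (little-endian).
css_bytes = b"html {\n\twidt"
garbled = css_bytes.decode("utf-16-le")
print(garbled)   # 瑨汭笠ऊ楷瑤

def to_utf8(data: bytes) -> bytes:
    """Re-encode UTF-16 data (detected by its BOM) as UTF-8 without a BOM."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]          # already UTF-8, drop the BOM
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16").encode("utf-8")  # codec consumes the BOM
    return data                                       # no BOM: leave untouched

utf16_file = "body { color: red; }".encode("utf-16")  # Python prepends a BOM here
print(to_utf8(utf16_file))   # b'body { color: red; }'
```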
PS: According to Firebug, your HTML is compressed by your server, even if you never explicitly told it to. But that doesn't seem to be causing any problems, thankfully.
I encountered this same problem with XML files exported from PowerShell that were embedded in iFrames.
There was no issue in IE10/11 or Edge, but Firefox and Chrome wouldn't load the stylesheet.
The original page loading the iFrames was UTF8 encoded, same with the stylesheet. However, the XML file was exported to UTF16LE ("Unicode" in PowerShell). When the XML file was loaded from the iFrame, it loaded the stylesheet as Chinese characters.
I converted the encoding in PowerShell...
# Parentheses make PowerShell read the whole file before Set-Content overwrites it.
(Get-Content C:\foldername\file.html -Encoding Unicode) | Set-Content -Encoding UTF8 C:\foldername\file.html
...and it worked! My guess is that IE must treat the encoding of all files the same as the parent, which meant that the UTF16LE encoded file was rendered as UTF8. Chrome and Firefox apparently don't do that.
Thanks Xavier Holt for setting me on the right path!
Another quick solution is to change the file encoding using Notepad.
Open the file in Notepad and use Save As with the UTF-8 option selected from the drop-down.
It may be the .html file itself. I solved my similar problem by copying the contents of the original .html file and pasting them into a new file with the same name in the same directory (rename the original first, and delete it afterwards, of course).
I'm building a website with WordPress, but sometimes (I don't know why) the HTML entities declared in my PHP files are displayed as a � (a black diamond with a white question mark). My charset attribute is already set to UTF-8 (as shown in the picture below). It could be a problem with the text editor (I'm currently using the built-in editor of Aruba)... How can I make sure that the encoding is right?
Try copying your code files into notepad++ and make sure that the encoding is actually set to UTF-8 without BOM.
When I put this word "Bibliothèque" in a .aspx page, I see it correctly "Bibliothèque".
If I put the same word in a .html file, the "è" is displayed incorrectly.
How can this be possible? Must be an IIS issue but I can't find the setting.
How can a .aspx file show the right word but not a .html file?
Open the file named web.config in the ASP.NET project. The value of the requestEncoding attribute in the globalization element is "utf-8", which means request text is treated as UTF-8 encoded.
Check which character encoding your browser is using for the page; you can change it in the browser's character encoding setting. Your HTML is rendered according to whatever encoding the browser has selected.
To ensure it will always work, for this specific example, you can replace the non-ASCII characters with HTML entities, like this: Biblioth&egrave;que. But this is not always practical in general.
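As a rough illustration of the entity approach, the sketch below converts non-ASCII characters to named HTML entities where one exists and to numeric references otherwise. It uses Python rather than ASP.NET, and `to_entities` is a made-up helper name.

```python
from html.entities import codepoint2name

def to_entities(text):
    """Replace each non-ASCII character with an HTML entity or numeric reference."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                          # plain ASCII passes through
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")   # named entity, e.g. &egrave;
        else:
            out.append(f"&#{cp};")                  # numeric fallback
    return "".join(out)

print(to_entities("Bibliothèque"))   # Biblioth&egrave;que

# The standard library can also emit numeric references directly.
print("Bibliothèque".encode("ascii", "xmlcharrefreplace"))   # b'Biblioth&#232;que'
```

The result is pure ASCII, so it displays the same whichever ASCII-compatible encoding the browser assumes.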
Otherwise, there are various other ways to make it work:
Save the file as UTF-8 with a byte order mark (BOM, sometimes called a 'signature' by editors).
Add a META character encoding declaration to your HTML file.
Define what HTTP headers will be sent to the client using the globalization element in the application web.config (responseEncoding, etc.).
Define what HTTP headers will be sent to the client using the ASP.NET @ Page directive.
The best is to make sure all this is consistent in your application. UTF-8 support is now widespread, so it's a good choice as the encoding.
An interesting article on the encoding subject: The Definitive Guide to Web Character Encoding
I'm still having problems with loading the style sheet for these pages. Works fine in Safari but ff and IE, no joy:
http://www.mainstayprojects.com/teardrop.html and
www.mainstayprojects.com
Although I am clearer about what's causing this problem, thanks to the answers to my previous posting (stackoverflow.com/questions/3273655/css-file-not-loading), I am at a loss as to how to fix the issue. I have re-saved many times with different doctypes and content type meta tags, as well as saving the file as a charset=utf-8 file, but have not been able to make any headway!
Really need some help.
Your server is still claiming the HTML document is ISO-8859-1 (although the document itself looks like UTF-8).
Meanwhile the stylesheet appears to be UTF-8, the server fails to state what encoding it is, and the first line of the stylesheet claims that it is UTF-16.
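A contradiction like the one described (a stylesheet whose first line claims UTF-16 while its bytes are clearly not) can be spotted mechanically. This is a minimal sketch with invented helper names; it only looks at the @charset rule and does not consult any HTTP header.

```python
import re

def css_declared_charset(data: bytes):
    """Return the encoding named by a leading @charset rule, or None."""
    m = re.match(rb'@charset "([^"]+)";', data)
    return m.group(1).decode("ascii") if m else None

def check_stylesheet(data: bytes) -> str:
    declared = css_declared_charset(data)
    if declared is None:
        return "no @charset rule"
    if declared.lower().startswith(("utf-16", "utf-32")):
        # The rule was readable as single-byte ASCII, so the file
        # cannot really be UTF-16 or UTF-32.
        return f"contradiction: declares {declared} but bytes are ASCII-compatible"
    try:
        data.decode(declared)
    except (UnicodeDecodeError, LookupError):
        return f"mismatch: bytes do not decode as {declared}"
    return f"ok: {declared}"

print(check_stylesheet(b'@charset "utf-16";\nbody{}'))
print(check_stylesheet('@charset "utf-8";\n.a{content:"\u00e9"}'.encode("utf-8")))
```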
Pick an encoding
Configure your editor to use it
Configure your server to specify it
If you put any information about the encoding at the document level — get it right!
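How a server is configured to state the encoding depends entirely on the server in use. Purely as an illustration, here is how it might look with Python's built-in development server; the handler name, port and address are assumptions, and a production site would set the equivalent option in its real web server.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

class Utf8Handler(SimpleHTTPRequestHandler):
    # The handler looks up the Content-Type by file extension in this map,
    # so adding "; charset=utf-8" here puts the encoding into the header.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".css": "text/css; charset=utf-8",
        ".html": "text/html; charset=utf-8",
    }

def serve(port=8000):
    """Serve the current directory on localhost (blocks until interrupted)."""
    ThreadingHTTPServer(("127.0.0.1", port), Utf8Handler).serve_forever()
```

Calling `serve()` from the directory that holds the files starts the server; the files themselves must of course actually be saved as UTF-8 for the header to be truthful.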
I'm writing all sorts of multi lingual text to .txt files using AIR's
fileStream.writeUTFBytes()
For English characters everything works perfectly. But as soon as there are Chinese, Arabic or any other non-English characters, the sentences are totally messed up.
For example:
对着大叔摄影师的确没爱....
becomes garbled text (the characters shown when the file is opened no longer match the original).
How can this be fixed?
writeUTFBytes doesn't mess up anything since it doesn't process the content.
Whatever goes in the pipe comes out.
The text you are sending is most likely encoded in Unicode/UTF-8.
Make sure that you are opening the file with an editor that supports Unicode (even Windows Notepad supports it, but it defaults to ANSI).
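The point that the bytes are fine and only the reader is wrong can be checked outside AIR. The sketch below uses Python as a stand-in; it assumes "ANSI" means windows-1252, which depends on the Windows locale.

```python
text = "对着大叔摄影师的确没爱...."

# What a UTF-8 writer puts on disk.
data = text.encode("utf-8")

# Reading the same bytes back with the right and the wrong assumption.
as_utf8 = data.decode("utf-8")
as_ansi = data.decode("cp1252", errors="replace")   # undefined bytes become U+FFFD

print(as_utf8 == text)   # True: nothing was lost when writing
print(as_ansi == text)   # False: only the interpretation is wrong
```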