Why does writeUTFBytes mess up non-english characters? - apache-flex

I'm writing all sorts of multilingual text to .txt files using AIR's
fileStream.writeUTFBytes()
For English characters everything works perfectly. But as soon as there are Chinese, Arabic or any other non-English characters, the sentences are totally messed up.
For example:
对着大叔摄影师的确没爱....
becomes
对着大叔摄影师的确没爱....
How can this be fixed?

writeUTFBytes doesn't mess up anything since it doesn't process the content.
Whatever goes in the pipe comes out.
The text you are sending is most likely encoded as Unicode/UTF-8.
Make sure that you are opening the file with an editor that supports Unicode (even Windows Notepad supports it, but it defaults to ANSI).
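To illustrate the point of this answer, here is a minimal Python sketch (not AIR code) of what happens when correct UTF-8 bytes are viewed through a single-byte code page. It assumes that "ANSI" means windows-1252, which depends on the Windows locale, and that the file holds exactly the UTF-8 bytes of the string with no byte order mark.

```python
# The sample sentence from the question.
text = "对着大叔摄影师的确没爱...."

# Roughly what ends up on disk: the UTF-8 bytes of the string, no BOM.
raw = text.encode("utf-8")

# What an editor shows if it assumes a single-byte "ANSI" code page.
# cp1252 is an assumption; bytes it does not define become U+FFFD.
garbled = raw.decode("cp1252", errors="replace")

# Reading the very same bytes as UTF-8 recovers the original text.
restored = raw.decode("utf-8")

# A UTF-8 byte order mark (EF BB BF) at the start of the file is a hint
# some editors use to detect the encoding.
raw_with_bom = text.encode("utf-8-sig")

print(garbled == text)   # False
print(restored == text)  # True
```

The bytes are never damaged; only the interpretation differs, which is why choosing the right encoding in the editor is enough.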

Related

Wkhtmltoxsharp no cyrillic

I am using wkhtmltopdf with the wkhtmltoxsharp wrapper. When I try to convert HTML to PDF, it works, but only when the text is Latin. I cannot convert Cyrillic text; I get some strange characters.
Can you please help me, if you know any solution?
I have tested wkhtmltopdf with Cyrillic, Chinese and Korean and they all work for me. Do you do the conversion on your desktop or on a server? It could be that the server does not have the proper fonts installed, which commonly causes this problem.
Also, it would help to see a little example of the content you are converting and the command (or piece of code) you use to convert.

Special char added to css (â€‹) where did this come from?

I was doing a bunch of search-replace operations in Notepad++ to effectively minify my CSS (mostly removing whitespace, tabs, etc.). This ended up breaking much of my CSS.
Apparently a strange character (â€‹) was inserted all over the place. Using Notepad++ in UTF-8 without BOM, I cannot see these, but they appeared in a view-source.
I was able to remove these by doing a search replace in ANSI encoding, but my question is, what is this character, and why might it have appeared?
The string “â€‹” is the UTF-8 encoded form of ZWSP when misinterpreted as windows-1252 encoded data. (Checked this using a nice UTF-8 decoder.) This explains why you don’t see it in Notepad++ in UTF-8 mode; ZWSP (zero-width space, U+200B) is an invisible character with no width.
Apparently browsers are interpreting the style sheet as windows-1252 encoded. Saving the file with a BOM might help, since then browsers would probably guess the encoding better. The real fix is to make sure (in a server-dependent manner) that the server sends an appropriate Content-Type header for the CSS file.
But if this is the only non-ASCII character in your CSS file, it does not matter in practice, after you have removed the offending data.
I don’t know of any simple way to make Notepad++ insert ZWSP (you could of course use general character insertion utilities in the system), so it’s a bit of a mystery where it came from. Perhaps via copy and paste from somewhere.
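The decoding described in this answer can be reproduced with a short Python sketch. The helper names `as_windows_1252` and `strip_zwsp` and the sample rule body are hypothetical, made up for illustration; only the ZWSP code point and the two encodings come from the answer.

```python
ZWSP = "\u200b"  # zero-width space

def as_windows_1252(text: str) -> str:
    """Show how UTF-8 encoded text looks when misread as windows-1252."""
    return text.encode("utf-8").decode("cp1252", errors="replace")

def strip_zwsp(css: str) -> str:
    """Remove every zero-width space from a style sheet string."""
    return css.replace(ZWSP, "")

# Hypothetical CSS fragment with a stray ZWSP before the ".t" selector.
css = "}" + ZWSP + ".t { color: red; }"

print(as_windows_1252(ZWSP) == "â€‹")                # True: bytes E2 80 8B
print(strip_zwsp(css) == "}.t { color: red; }")      # True
```

Removing the character sidesteps the encoding question entirely, as long as nothing else in the file is non-ASCII.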
Using the Web Developer plug-in or extension in Firefox, you can see the problem character in the CSS document.
In Visual Studio all I could see was:
}
.t
Web developer showed an unwanted hidden character, an "a" with a caret on top:
}
â.t
The UTF-8 decoder mentioned above revealed this
} (the encoded character for ampersand)
.t
and this
but I simply fixed the problem by deleting the affected text and retyping it.

HtmlEncode Local resources

I have a web site that uses local resources. The main text (so not the labels, etc.) on the default page is stored in a file. This file is added to my local resources file default.aspx.fi-FI.resx and is named text-defaultPage. It's a regular text file with tags etc.
The problem, however, is that the text is Finnish; in other words, it uses a lot of characters with umlauts (ä) and other special characters.
The person for whom the web site is wants to edit this text himself but he doesn't know anything about programming, html entities etc.
Is there a way to make it so that those characters are encoded with, say, HtmlEncode?
In my Global.asax I check for the selected language, and the page gets reloaded with that language.
Edit
Never mind, I made the files Unicode text files.
The solution to the problem is to make the files Unicode.
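As a rough sketch of what "making the files Unicode" can mean in practice, the Python snippet below rewrites a text file as UTF-8. It assumes UTF-8 is the target Unicode encoding and that the original file was saved as windows-1252; both are guesses, since the post does not say. The function name `convert_to_utf8` and the sample Finnish text are hypothetical.

```python
import tempfile
from pathlib import Path

def convert_to_utf8(path: Path, source_encoding: str) -> None:
    """Rewrite a text file as UTF-8, decoding it from source_encoding first."""
    # Strict decoding: bytes that source_encoding does not define raise an error.
    text = path.read_bytes().decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))

# Demo on a throwaway file that stands in for the real text file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "text-defaultPage.txt"
    sample.write_bytes("Hyvää päivää".encode("cp1252"))  # assumed legacy encoding
    convert_to_utf8(sample, "cp1252")
    converted = sample.read_bytes().decode("utf-8")

print(converted == "Hyvää päivää")  # True
```

Once the file is stored in a Unicode encoding, the editor of the text can type ä directly without knowing anything about HTML entities.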

Text in resource file is not displaying correctly in Chinese

I have a website that is written in asp.net where the text is resourced in resource files. I am able to successfully view the correct characters for all (Spanish, German, Arabic, Korean, etc) languages except Chinese. When I change my browser to Chinese (any version) I get weird characters displayed. I have Chinese fonts installed and have tried changing encoding but nothing seems to work. This is happening in both IE9 and FF9.
Ex:
English - Progress
Chinese (in resource file) - 進度
Display in browser - 进程
Any help would be appreciated,
Matt
If you see weird characters in the .resx file when you edit it in Visual Studio, that means the file itself is corrupt. Fix the errors and the text should show up properly in the browser.

Arabic Locale Support in Flex

Today I learned how to localize my Flex application and support multiple languages. The tutorials online are great. However, none of them mention the Arabic locale.
So basically, I created the Arabic (Jordan) locale files in the SDK folder by using:
copylocale en_US ar_JO
I navigated to the locale folder and was able to see the ar_JO folder in there, so I assume everything went smoothly.
Next, I followed the tutorials (www.babelfx.org) and was able to localize my test application in English, French, and Arabic. Clicking on any of those languages switches the labels of my simple form into the desired language. However:
When switching to the Arabic language, the labels turn into empty square symbols. If you are wondering, yes, I can open Notepad, type Arabic text, and save it successfully.
When I type Arabic text into the text boxes, I can see the Arabic words that I typed correctly (the labels are still square symbols).
Any ideas what I might be missing here?
I tried changing the font of my application (right on the application tag I set the fontFamily) into Simplified Arabic which comes by default on Windows.
Thanks
Have you embedded a font into your swf which can render Arabic? Are you using that font? If the answer is no to either, then I suggest reading up on the subject.
One thing to remember about Flash and fonts is that it has incredible power which comes from the fact that one is able to embed actual fonts into the swf itself. One also needs to remember that Flash is incredibly finicky and is prone to throwing fits if you fail to do so.
The solution is to change the content-type (the file encoding) to UTF-8. Three ways to accomplish this from within Flex Builder:
(Option 1) Right click the file from the File Navigator and select Properties
(Option 2) With the file open, navigate to the File menu and choose Properties
(Option 3) With the file open, press Alt + Enter to bring up the file Properties
Once the properties window is displayed, you will see the option to change the file encoding from Default to Other (UTF-8).
Note: At least for me, once I changed the content-type to UTF-8, I had to close my unsaved file, open it back up, and paste my contents back into the file in order to clear the error message. Then clean the project (Project -> Clean...) and let it rebuild.
I found the solution. Actually, I didn't have to embed any fonts or anything in order to get it working.
My problem was the encoding of the resources.properties file. I opened it in Notepad++ and noticed the Encoding menu. At that point, I remembered reading that the encoding of the resource files should be UTF-8. So I converted the encoding to UTF-8 from the menu and compiled, but it didn't work! After a couple of retries and cleaning the project, it worked successfully!
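Before recompiling, one way to confirm that a resource file really is UTF-8 is to try decoding its bytes strictly. This is a minimal Python sketch, not part of Flex or Notepad++; the helper name `is_valid_utf8` is hypothetical, and windows-1256 is only an assumed example of a legacy Arabic code page.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

label = "مرحبا"  # sample Arabic label text

utf8_bytes = label.encode("utf-8")      # what the compiler expects, per the post
legacy_bytes = label.encode("cp1256")   # assumed legacy Arabic code page

print(is_valid_utf8(utf8_bytes))    # True
print(is_valid_utf8(legacy_bytes))  # False
```

To check a real file, pass in the result of reading it in binary mode. A pure ASCII file also passes this check, so it only helps once the file contains non-English text.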
Just a reminder for everybody (as I have fallen into this while working this problem out):
For mx components, embedded fonts must have the embedAsCFF set to false.
