How to read a Hebrew text from a file - NSString

I want to read a Hebrew text from a file into an NSString object. I know the code line:
NSString *text = [NSString stringWithContentsOfFile:path encoding:encoding error:NULL];
but I don't know what kind of file it has to be (RTF, XML, or something else), or what encoding I have to use for Hebrew.

UTF-8 is the proper encoding you want. Save the file as plain text (.txt) rather than RTF or XML; stringWithContentsOfFile:encoding: reads the bytes as raw text and will not parse any markup.
http://www.alanwood.net/unicode/hebrew.html
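A minimal sketch of the read, assuming the file is saved as plain UTF-8 text (the hebrew.txt resource name is only an illustration):

NSError *error = nil;
// hypothetical bundled file; any path to a UTF-8 text file will do
NSString *path = [[NSBundle mainBundle] pathForResource:@"hebrew" ofType:@"txt"];
NSString *text = [NSString stringWithContentsOfFile:path
                                           encoding:NSUTF8StringEncoding
                                              error:&error];
if (text == nil) {
    NSLog(@"Could not read file: %@", error);
}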

Related

Qt i18n encoding

Is there any difference in the way strings are translated in Qt between literal strings in code and strings defined in a .ui file? Besides this, I should be able to translate from one UTF-8 string to another UTF-8 string, not just from ASCII, right?
This doubt stems from a bug I found when trying to include UTF-8 characters (formatting characters like '»') in a "source" English string in my .ui file. The result is that the translations for those strings are not picked up by Qt.
Note: I didn't forget to update the tags in the .ts file.

NSData to NSString losing data

I'm attempting to convert a binary file into text. The problem is that a large portion of the file was not encoded as ASCII and ends up as special characters. I'm using
[[NSString alloc] initWithData:data encoding:NSASCIIStringEncoding];
but am only getting a few characters back from a 20000-byte data block. What I would like to see is all of the text (even if most of it is nonsense), which is what I get when I open the file in a binary editor.
It's a binary file. To read it, you find the documentation for the file format, then you parse it. Trying to throw it all into an NSString* seems absolutely pointless.
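That said, if all you want is a printable dump like the text pane of a hex editor, one possible workaround (a byte-for-byte view, not a real decoding) is ISO Latin-1, which assigns a character to every byte value from 0x00 to 0xFF, so nothing is dropped:

// 'data' is the same NSData as in the question
// Latin-1 maps each byte to exactly one character, so no bytes are lost
NSString *dump = [[NSString alloc] initWithData:data encoding:NSISOLatin1StringEncoding];
NSLog(@"%@", dump);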

How to convert mime text to utf-8 in flex

Hi, I'm wondering how I can convert MIME-encoded text like =?ISO-8859-1?Q? into UTF-8 so it's readable for the users.
Thanks
You can do it from the file properties (right-click the file in Eclipse) under Resource.
Hope that helps.
Get your text into a ByteArray, then use readMultiByte() and specify the encoding you need (they use iso-8859-1 right in the example, he-he). Bear in mind that =?ISO-8859-1?Q?...?= is an RFC 2047 encoded-word, so its quoted-printable payload has to be decoded separately; readMultiByte() only handles the character-set conversion once you have the raw bytes, as sketched below.
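A minimal ActionScript sketch of the ByteArray route (the Latin-1 bytes here are fabricated for illustration; in practice they would come from the decoded quoted-printable payload):

import flash.utils.ByteArray;

var bytes:ByteArray = new ByteArray();
// write some raw iso-8859-1 bytes to stand in for the decoded payload
bytes.writeMultiByte("voilà", "iso-8859-1");
bytes.position = 0;
// read them back, interpreting the bytes as iso-8859-1;
// the result is an ordinary String you can display to users
var readable:String = bytes.readMultiByte(bytes.length, "iso-8859-1");
trace(readable);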

Fix Special Characters in String

I've got a program that, in a nutshell, reads values from a SQL database and writes them to a tab-delimited text file.
The issue is that some of the values in the database contain special characters (TM, dash, ellipsis, etc.). When written to the text file, the formatting is lost and they come across as junk: "â„¢" or "â€“", etc.
When a value is viewed in the immediate window, before it is written to the text file, everything looks fine. My guess is that this is an encoding issue, but I'm not really sure how to proceed, where to look, or what to look for.
Is this ASCII or UTF-8? If it's one of those, how do I correct it before it's written to the text file?
Here's how I build the text file (where feedStr is a StringBuilder)
objReader = New StreamWriter(filePath)
objReader.Write(feedStr)
objReader.Close()
The default encoding for StreamWriter is UTF-8 (with no byte order mark). Your output file is fine; the question is what you open it with afterwards. If you open it in a UTF-8 capable text editor, the characters should look the way you want.
You can also write the text file in another encoding, for example iso-8859-1 (Latin-1):
objReader = New StreamWriter(filePath, false, Encoding.GetEncoding("iso-8859-1"))
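Or, if the consuming program needs a byte order mark to recognize the file as UTF-8 (Notepad and Excel often do), you can ask StreamWriter for one explicitly. A minimal sketch, assuming the same filePath and feedStr as above and that System.IO and System.Text are imported:

' passing True to UTF8Encoding makes StreamWriter emit a UTF-8 BOM
Dim objWriter As New StreamWriter(filePath, False, New UTF8Encoding(True))
objWriter.Write(feedStr)
objWriter.Close()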

How to add encoding information to the response stream in ASP.NET?

I have following piece of code:
public void ProcessRequest (HttpContext context)
{
context.Response.ContentType = "text/rtf; charset=UTF-8";
context.Response.Charset = "UTF-8";
context.Response.ContentEncoding = System.Text.Encoding.UTF8;
context.Response.AddHeader("Content-disposition", "attachment;filename=lista_obecnosci.csv");
context.Response.Write("ąęćżźńółĄŚŻŹĆŃŁÓĘ");
}
When I try to open the generated csv file, I get the following behavior:
In Notepad2 - everything is fine.
In Word - the conversion wizard opens and asks to convert the text. It suggests UTF-8, which is more or less OK.
In Excel - I get a real mess. None of those Polish characters can be displayed.
I wanted to write those special encoding-information characters in front of my string, i.e.
context.Response.Write((char)0xef);
context.Response.Write((char)0xbb);
context.Response.Write((char)0xbf);
but that won't do any good. The response stream treats those as normal data and converts them to something different.
I'd appreciate help on this one.
I ran into the same problem, and this was my solution:
context.Response.BinaryWrite(System.Text.Encoding.UTF8.GetPreamble());
context.Response.Write("ąęćżźńółĄŚŻŹĆŃŁÓĘ");
What you call "encoding-information" is actually a BOM. I suspect each of those "characters" is getting encoded separately. To write the BOM manually, you have to write it as three bytes, not three characters. I'm not familiar with the .NET I/O classes, but there should be a method available to you that takes a byte or byte[] parameter and writes them directly to the file.
By the way, the UTF-8 BOM is optional; in fact, its use is discouraged by the Unicode Consortium. If you don't have a specific reason for using it, save yourself some hassle and leave it out.
EDIT: I just remembered you can also write the actual BOM character, '\uFEFF', and let the encoder handle it:
context.Response.Write('\uFEFF');
I think the problem is with Excel, based on Microsoft Excel mangles Diacritics in .csv files. To prove this, copy your sample output string ąęćżźńółĄŚŻŹĆŃŁÓĘ, paste it into a test file using your favorite editor, and save it as a UTF-8 encoded .csv file. Open it in Excel and you will see the same issues.
The answer from Alan Moore, translated to VB:
Context.Response.Write(ChrW(&HFEFF)) ' ChrW(&HFEFF) is the BOM character U+FEFF
