I have a QString that contains some Unicode characters, and I want to convert it to a QByteArray (which is later converted back to a QString).
If I use toLatin1(), toUtf8() or toLocal8Bit(), the Unicode characters are lost in the conversion.
How do I convert a QString containing such Unicode characters to a QByteArray?
I am trying to read percent-encoded URLs with umlauts, such as ä, ü, ö, ..., with Qt:
QString str = "Nu%CC%88rnberg";
qDebug() << QUrl::fromPercentEncoding(str.toUtf8());
But the output is Nu¨rnberg instead of Nürnberg. How can I correctly decode URLs with umlauts in this form?
Regards,
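I tried this, and the result is a little confusing. First, if you want the precomposed letter ü (U+00FC), use %C3%BC rather than u%CC%88 (see https://www.w3schools.com/tags/ref_urlencode.asp); %CC%88 is the UTF-8 encoding of the combining diaeresis U+0308, which is a separate character following the u. So you need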
QString str = "N%C3%BCrnberg";
QString encoded = QUrl::fromPercentEncoding(str.toUtf8());
If you print it to the qDebug() stream you may see a different symbol (I guess because of your default system encoding). If you show it in a GUI element, though, you get the ü symbol:
QMessageBox::information(this, "", encoded);
Here, this refers to the main window.
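To see why the two spellings behave differently, here is a minimal sketch of the decoding step at the byte level. It is written in standard C++ rather than Qt so it compiles on its own; percentDecode and hexValue are hypothetical helpers, not Qt functions. Both inputs decode to valid UTF-8, but "Nu%CC%88rnberg" yields u followed by the combining mark U+0308, while "N%C3%BCrnberg" yields the single code point U+00FC.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if it is not a hex digit.
int hexValue(char ch) {
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

// Hypothetical helper: turns every well-formed %XX escape into the raw byte.
// Malformed escapes are copied through unchanged.
std::string percentDecode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            const int hi = hexValue(in[i + 1]);
            const int lo = hexValue(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out.push_back(static_cast<char>(hi * 16 + lo));
                i += 2;  // skip the two hex digits just consumed
                continue;
            }
        }
        out.push_back(in[i]);
    }
    return out;
}
```

The decomposed form comes out one byte longer (10 bytes instead of 9), which is why a renderer that does not combine the mark shows u¨. In Qt, calling QString::normalized(QString::NormalizationForm_C) on the decoded string should compose the pair into ü; I have not checked that against the qDebug() output described above.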
I have two buffers (example sizes):
char c[512];
QChar q[256];
Assuming 'c' contains multibyte character string (UTF-8). I need to convert it to QChar sequence and place it in 'q'.
I guess a good example of what I need could be MultiByteToWideChar function.
IMPORTANT: this operation shall not involve any explicit or implicit memory allocations, except for additional allocations on the stack, maybe.
Please, do not answer if you are not sure what the above means.
QChar contains an ushort as only member, so its size is sizeof(ushort).
In a QString it represents a UTF-16 code unit (not necessarily a complete code point).
So it's all about encoding here.
If you know your char const * is UTF-16 data in the same endianness / byte order as your system, simply copy it:
memcpy(q, c, 512);
If you want to initialize a QString with your const char * data, you could just interpret it as UTF-16 using QString::fromRawData():
QString strFromData = QString::fromRawData(reinterpret_cast<QChar*>(c), 256);
// where 256 is sizeof(c) * sizeof(char) / sizeof(QChar)
Then you don't even need the QChar q[256] array.
If you know your data is UTF-8, you should use QString::fromUtf8() and then simply access its inner memory with QString::constData().
Using QString with UTF-8 I don't know of any method to completely prevent heap allocations. But the mentioned way should only allocate twice: Once for the PIMPL of QString, once for the UTF-16 string data.
If your input data is encoded as UTF-8, the answer is no: you cannot convert it using Qt without heap allocations.
Proof: Looking at the source code of qtbase/src/corelib/codecs/qutfcodec.cpp we see that all functions for encoding / decoding create new QString / QByteArray instances. No function operates on two arrays as in your question.
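If a hand-written decoder is acceptable, the conversion itself needs no heap at all. The following is only a sketch under stated assumptions, not something Qt provides: utf8ToUtf16 is a hypothetical function, it writes char16_t code units (copying them into a QChar array assumes QChar is a plain 16-bit value, as described above), it replaces invalid input with U+FFFD, and it stops when the destination is full.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: decodes UTF-8 from src[0..srcLen) into UTF-16 code
// units in dst[0..dstCap). Returns the number of code units written.
// Uses only the caller's buffers and a few locals on the stack.
std::size_t utf8ToUtf16(const char* src, std::size_t srcLen,
                        char16_t* dst, std::size_t dstCap) {
    std::size_t in = 0, out = 0;
    while (in < srcLen) {
        const unsigned char b0 = static_cast<unsigned char>(src[in]);
        std::uint32_t cp = 0xFFFD;  // replacement character by default
        std::size_t extra = 0;      // continuation bytes expected
        std::uint32_t minCp = 0;    // smallest legal value (rejects overlong forms)
        if (b0 < 0x80)                { cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { extra = 1; cp = b0 & 0x1F; minCp = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { extra = 2; cp = b0 & 0x0F; minCp = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { extra = 3; cp = b0 & 0x07; minCp = 0x10000; }

        bool ok = (extra == 0) || (in + extra < srcLen);
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(src[in + k]);
            if ((b & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (b & 0x3F);
        }
        if (extra > 0 && (!ok || cp < minCp || cp > 0x10FFFF ||
                          (cp >= 0xD800 && cp <= 0xDFFF))) {
            cp = 0xFFFD;  // invalid sequence: emit U+FFFD, skip one byte
            extra = 0;
        }

        const std::size_t units = (cp >= 0x10000) ? 2 : 1;
        if (out + units > dstCap) break;  // destination full
        if (units == 1) {
            dst[out++] = static_cast<char16_t>(cp);
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            dst[out++] = static_cast<char16_t>(0xD800 + (cp >> 10));
            dst[out++] = static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
        in += 1 + extra;
    }
    return out;
}
```

With the buffers from the question this would be called roughly as utf8ToUtf16(c, strlen(c), buf, 256), where buf is a char16_t[256] on the stack. Whether that buffer can be reinterpreted as QChar directly or has to be copied element by element depends on the Qt version, so treat that part as an open assumption.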
When I try to read non-Latin characters (e.g. Russian) from QSettings in Qt, I get wrong values, something like "Ð\224адад". How do I read them correctly? Please help.
I use Ubuntu.
Try reading the value as a byte array first and then converting it to a string from UTF-8, e.g.:
QSettings settings("filename.ini", QSettings::IniFormat);
QByteArray ba = settings.value("group/key").toByteArray();
QString str = QString::fromUtf8(ba.data(), ba.length());
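The "Ð\224" in the question is what you would expect when UTF-8 bytes are read one byte per character. A small standard C++ sketch of that effect (decodeAsLatin1 is a hypothetical helper, not a Qt function): the Cyrillic letter Д (U+0414) is stored in UTF-8 as the bytes 0xD0 0x94, and reading them as Latin-1 gives Ð (U+00D0) followed by the control character U+0094, which prints as the octal escape \224.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: interprets every byte as one Latin-1 character.
// This mimics reading UTF-8 data with the wrong 8-bit codec.
std::u16string decodeAsLatin1(const std::string& bytes) {
    std::u16string out;
    for (unsigned char b : bytes) {
        out.push_back(static_cast<char16_t>(b));  // one byte becomes one code unit
    }
    return out;
}
```

Decoding the same two bytes as UTF-8 instead gives the single character Д, which is what the fromUtf8() call above does.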
I don't understand what happens when I create a text stream and then call setCodec("some_encoding"). Does it start assuming that the file I'm reading from is in some_encoding, so that QTextStream::readAll() returns a QString in some_encoding? Or does QTextStream::readAll() return a QString in Unicode?
Here's what I do:
QString read(const char* encoding)
{
    QTextStream stream(&file);
    stream.setCodec(encoding);
    return stream.readAll();
}
But I don't get a Unicode string back. Bottom line: given a file in some encoding, how do I read all of its contents into a QString as Unicode? If readAll() returns a string in the specified encoding, how do I convert that QString from that encoding to Unicode?
It turns out this had nothing to do with encodings. I called stream.seek(0) before reading and it read everything correctly. I suspected encodings because when they are off you usually get question marks or empty strings everywhere; in this case I got an empty string.
I am obtaining the content from a QTextEdit object by using the following code:
QString text=my_QTextEdit.toPlainText();
What encoding does QTextEdit use, and what encoding is used in the QString I get back from the toPlainText() call?
Thanks.
QTextEdit::toPlainText() returns a QString object, which is always a Unicode character string (see documentation).
The QString class provides the functions toLatin1(), toAscii() and toUtf8(), which allow you to convert the string from unicode to an 8-bit string that you can process further. So Qt handles the encoding & decoding of the string for you.
If you want to create a QString instance from a given byte-string, you can use the functions fromAscii(), fromLatin1() or fromUtf8().
All controls in Qt are enabled for 16-bit characters. That means the content of a QTextEdit is Unicode, stored as UTF-16 (see also http://developer.nokia.com/Community/Discussion/showthread.php/215203-how-to-correctly-display-Unicodes-in-QPlainTextEdit).
When getting the content of a QTextEdit control (via toPlainText()), you get back a QString which contains Unicode.
From there on, you can convert to other formats as you like: toUtf8(), toUcs4(), ...
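For illustration, this is roughly what a UTF-16 to UTF-8 conversion has to do. It is a standard C++ sketch under my own assumptions, not Qt's implementation: utf16ToUtf8 is a hypothetical function, std::u16string stands in for the 16-bit units a QString holds, and unpaired surrogates are replaced with U+FFFD.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encodes UTF-16 code units as UTF-8 bytes.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    auto put = [&out](std::uint32_t byte) {
        out.push_back(static_cast<char>(static_cast<unsigned char>(byte)));
    };
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Valid surrogate pair: combine into one code point.
            const std::uint32_t low = in[i + 1];
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate
        }
        if (cp < 0x80) {
            put(cp);
        } else if (cp < 0x800) {
            put(0xC0 | (cp >> 6));
            put(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            put(0xE0 | (cp >> 12));
            put(0x80 | ((cp >> 6) & 0x3F));
            put(0x80 | (cp & 0x3F));
        } else {
            put(0xF0 | (cp >> 18));
            put(0x80 | ((cp >> 12) & 0x3F));
            put(0x80 | ((cp >> 6) & 0x3F));
            put(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Note that a character outside the Basic Multilingual Plane takes two 16-bit units in the input and four bytes in the output, so neither representation is one unit per character.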