QString to unicode std::string

I know there is plenty of information about converting QString to char*, but I still need some clarification on this question.
Qt provides QTextCodecs to convert QString (which internally stores characters in unicode) to QByteArray, allowing me to retrieve char* which represents the string in some non-unicode encoding. But what should I do when I want to get a unicode QByteArray?
QTextCodec* codec = QTextCodec::codecForName("UTF-8");
QString qstr = codec->toUnicode("Юникод");
std::string stdstr(reinterpret_cast<const char*>(qstr.constData()), qstr.size() * 2 ); // * 2 because each QChar (one UTF-16 code unit) occupies two bytes
qDebug() << QString(reinterpret_cast<const QChar*>(stdstr.c_str()), stdstr.size() / 2); // same
The above code prints "Юникод" as I expected. But I'd like to know whether that is the right way to get at the unicode char* of the QString. In particular, the reinterpret_casts and size arithmetic in this technique look pretty ugly.

The below applies to Qt 5. Qt 4's behavior was different and, in practice, broken.
You need to choose:
Whether you want the 8-bit wide std::string or 16-bit wide std::wstring, or some other type.
What encoding is desired in your target string?
Internally, QString stores UTF-16 encoded data, so any Unicode code point may be represented in one or two QChars.
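To make the "one or two QChars" point concrete, here is a minimal sketch in plain standard C++ (no Qt involved). The helper name encodeUtf16 is hypothetical and only for illustration. It shows how a code point above U+FFFF is split into a surrogate pair, which is why QString::size() counts 16-bit code units rather than characters.

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-16 code units.
// Hypothetical helper for illustration; not part of Qt.
std::u16string encodeUtf16(char32_t cp)
{
    // Only valid scalar values: at most U+10FFFF and not a surrogate.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    std::u16string out;
    if (cp < 0x10000) {
        // Basic Multilingual Plane: a single 16-bit unit.
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Supplementary planes: high surrogate, then low surrogate.
        const char32_t v = cp - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
    }
    return out;
}
```

With this sketch, U+042E (the Cyrillic letter from the question) yields one unit, while U+1F600 yields the two units 0xD83D and 0xDE00.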
Common cases:
Locally encoded 8-bit std::string (as in: system locale):
std::string(str.toLocal8Bit().constData())
UTF-8 encoded 8-bit std::string:
str.toStdString()
This is equivalent to:
std::string(str.toUtf8().constData())
UTF-16 or UCS-4 encoded std::wstring, 16- or 32 bits wide, respectively. The selection of 16- vs. 32-bit encoding is done by Qt to match the platform's width of wchar_t.
str.toStdWString()
U16 or U32 strings of C++11 - from Qt 5.5 onwards:
str.toStdU16String()
str.toStdU32String()
UTF-16 encoded 16-bit std::u16string - this hack is only needed up to Qt 5.4:
std::u16string(reinterpret_cast<const char16_t*>(str.constData()))
This encoding does not include byte order marks (BOMs).
It's easy to prepend BOMs to the QString itself before converting it:
QString src = ...;
src.prepend(QChar::ByteOrderMark);
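#if QT_VERSION < QT_VERSION_CHECK(5,5,0)
// Parentheses rather than braces: src.size() is an int, and brace
// initialization would reject the narrowing conversion to size_t.
auto dst = std::u16string(reinterpret_cast<const char16_t*>(src.constData()),
src.size());
#else
auto dst = src.toStdU16String();
#endif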
If you expect the strings to be large, you can skip one copy:
const QString src = ...;
std::u16string dst;
dst.reserve(src.size() + 1); // BOM + characters; std::u16string adds its own terminator
dst.push_back(char16_t(QChar::ByteOrderMark)); // append() has no single-character overload
dst.append(reinterpret_cast<const char16_t*>(src.constData()),
src.size());
In both cases, dst is now portable to systems with either endianness.
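As an illustration of why the BOM makes the data portable, here is a rough sketch of what a receiver could do with the raw bytes. It is plain standard C++, the function name decodeUtf16WithBom is hypothetical, and it assumes the payload arrives as a byte vector whose first two bytes are the BOM.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Rebuild UTF-16 code units from raw bytes, using a leading BOM (U+FEFF)
// to detect the sender's byte order. Hypothetical helper, not part of Qt.
std::u16string decodeUtf16WithBom(const std::vector<unsigned char>& bytes)
{
    // Sketch: expects whole 16-bit units, i.e. an even byte count.
    assert(bytes.size() % 2 == 0);
    if (bytes.size() < 2)
        return {};

    bool littleEndian;
    if (bytes[0] == 0xFF && bytes[1] == 0xFE)
        littleEndian = true;   // U+FEFF written low byte first
    else if (bytes[0] == 0xFE && bytes[1] == 0xFF)
        littleEndian = false;  // U+FEFF written high byte first
    else
        return {};             // no BOM: byte order unknown in this sketch

    std::u16string out;
    for (std::size_t i = 2; i + 1 < bytes.size(); i += 2) {
        const unsigned lo = littleEndian ? bytes[i] : bytes[i + 1];
        const unsigned hi = littleEndian ? bytes[i + 1] : bytes[i];
        out.push_back(static_cast<char16_t>((hi << 8) | lo));
    }
    return out;
}
```

The same text sent from a little-endian or a big-endian machine decodes to identical code units, because the first two bytes tell the receiver which order to use.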

Use this:
QString Widen(const std::string &stdStr)
{
return QString::fromUtf8(stdStr.data(), stdStr.size());
}
std::string Narrow(const QString &qtStr)
{
QByteArray utf8 = qtStr.toUtf8();
return std::string(utf8.data(), utf8.size());
}
In all cases, the std::string should hold UTF-8.
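For a feel of what ends up inside the std::string after Narrow(), here is a small sketch in plain standard C++ that encodes a single code point as UTF-8. The helper appendUtf8 is hypothetical and only mirrors the kind of work a UTF-8 conversion has to perform; it is not how Qt implements toUtf8().

```cpp
#include <cassert>
#include <string>

// Append the UTF-8 encoding of one Unicode code point to 'out'.
// Hypothetical helper for illustration; not part of Qt.
void appendUtf8(std::string& out, char32_t cp)
{
    // Only valid scalar values: at most U+10FFFF and not a surrogate.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    if (cp < 0x80) {                       // 1 byte: plain ASCII
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {               // 2 bytes
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {             // 3 bytes
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {                               // 4 bytes
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}
```

Note that the byte count of the resulting std::string generally differs from QString::size(): the Cyrillic letter U+042E is one QChar but two UTF-8 bytes (0xD0 0xAE).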

You can get the QByteArray from a UTF-16 encoded QString using this:
QTextCodec *codec = QTextCodec::codecForName("UTF-16");
QTextEncoder *encoderWithoutBom = codec->makeEncoder( QTextCodec::IgnoreHeader );
QByteArray array = encoderWithoutBom->fromUnicode( str );
This way you ignore the unicode byte order mark (BOM) at the beginning.
You can convert it to char * like:
int dataSize=array.size();
char * data= new char[dataSize];
for(int i=0;i<dataSize;i++)
{
data[i]=array[i];
}
Or simply:
char *data = array.data();

Related

How can i convert a QByteArray into a hex string?

I have the below QByteArray.
QByteArray ba;
ba[0] = 0x01;
ba[1] = 0x10;
ba[2] = 0x00;
ba[3] = 0x07;
I really have no idea how to convert this QByteArray into the resulting string "01100007", on which I would then use QRegExp for pattern matching.

First of all, the QByteArray does not contain "hex values"; it contains bytes (as its name implies). A number is "hex" only when it is printed as text.
Your code should be:
QByteArray ba(4, 0); // array length 4, filled with 0
ba[0] = 0x01;
ba[1] = 0x10;
ba[2] = 0x00;
ba[3] = 0x07;
Anyway, to convert a QByteArray to a hex string, you got lucky: just use the QByteArray::toHex() method!
QByteArray ba_as_hex_string = ba.toHex();
Note that it returns 8-bit text, but you can just assign it to a QString without worrying much about encodings, since it is pure ASCII. If you want upper case A-F in your hexadecimal numbers instead of the default a-f, you can use QByteArray::toUpper() to convert the case.
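For readers wondering what the conversion amounts to, here is a rough equivalent in plain standard C++. The function toHex below is a hypothetical stand-in that works on std::vector<unsigned char>, not the Qt method itself; it emits two lowercase hex digits per byte, matching the default a-f output described above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Render each byte as two lowercase hexadecimal digits.
// Hypothetical stand-in for illustration; not QByteArray::toHex() itself.
std::string toHex(const std::vector<unsigned char>& bytes)
{
    static const char digits[] = "0123456789abcdef";

    std::string out;
    out.reserve(bytes.size() * 2);
    for (unsigned char b : bytes) {
        out.push_back(digits[b >> 4]);    // high nibble
        out.push_back(digits[b & 0x0F]);  // low nibble
    }
    assert(out.size() == bytes.size() * 2);  // always two digits per byte
    return out;
}
```

For the four bytes from the question (0x01, 0x10, 0x00, 0x07) this produces the text 01100007, which is pure ASCII and can be pattern-matched like any other string.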

QString has the following constructor:
QString(const QByteArray &ba)
But note that in C++ an integer literal preceded by 0 is octal, while the 0x prefix marks hexadecimal. Values written without the 0x prefix (01, 10, 00, 07) would be a mix of decimal and octal; none of them would be hex.

How to convert double* data to const char* or QByteArray efficiently

I am trying to use the network programming APIs in Qt in my project. One part of my code requires me to convert double* data to QByteArray or a const char*.
I searched through the Stack Overflow questions and found many people suggesting this code:
QByteArray array(reinterpret_cast<const char*>(data), sizeof(double));
or, for an array of double :
QByteArray::fromRawData(reinterpret_cast<const char*>(data),s*sizeof(double));
When I use them in my function, it does not give me the desired result. The output seems to be random characters.
Please suggest an efficient way to implement this in Qt. Thank you very much for your time.
Regards
Alok

If you just need to encode and decode a double into a byte array, this works:
double value = 3.14159275;
// Encode the value into the byte array
QByteArray byteArray(reinterpret_cast<const char*>(&value), sizeof(double));
// Decode the value
double outValue;
// Copy the data from the byte array into the double
memcpy(&outValue, byteArray.data(), sizeof(double));
printf("%f", outValue);
However, that is not the best way to send data over the network, as it will depend on the platform specifics of how the machines encode the double type. I would recommend you look at the QDataStream class, which allows you to do this:
double value = 3.14159275;
// Encode the value into the byte array
QByteArray byteArray;
QDataStream stream(&byteArray, QIODevice::WriteOnly);
stream << value;
// Decode the value
double outValue;
QDataStream readStream(&byteArray, QIODevice::ReadOnly);
readStream >> outValue;
printf("%f", outValue);
This is now platform independent, and the stream operators make it very convenient and easy to read.
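To illustrate what "platform independent" means here, the following sketch fixes the byte order by hand in plain standard C++. It is not QDataStream's actual implementation; the helper names encodeDoubleBE and decodeDoubleBE are hypothetical, and the code assumes an IEEE 754 64-bit double, which is the case on mainstream platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <vector>

static_assert(sizeof(double) == sizeof(std::uint64_t) &&
              std::numeric_limits<double>::is_iec559,
              "this sketch assumes a 64-bit IEEE 754 double");

// Serialize a double as 8 bytes, most significant byte first (big-endian).
std::vector<unsigned char> encodeDoubleBE(double value)
{
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined bit copy
    std::vector<unsigned char> out(8);
    for (std::size_t i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>(bits >> (56 - 8 * i));
    return out;
}

// Rebuild the double from 8 big-endian bytes, whatever the host byte order.
double decodeDoubleBE(const std::vector<unsigned char>& in)
{
    assert(in.size() == 8);
    std::uint64_t bits = 0;
    for (unsigned char b : in)
        bits = (bits << 8) | b;
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Because the shifts operate on the integer value rather than on memory layout, both ends agree on the wire format regardless of their native endianness. The raw reinterpret_cast approach, by contrast, sends whatever layout the sending machine happens to use.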

Assuming that you want to create a human readable string:
double d = 3.141459;
QString s = QString::number(d); // method has options for format and precision, see docs
or if you need localization where locale is a QLocale object:
s = locale.toString(d); // method has options for format and precision, see docs
You can easily convert the string into a QByteArray using s.toUtf8() or s.toLatin1() if really necessary. If speed is important there also is:
QByteArray ba = QByteArray::number(d); // method has options for format and precision, see docs

what's the difference between QString and QLatin1String?

As the title says:
1. What's the difference between QString and QLatin1String?
2. When and where do I need to use each of them?
3. Given the following:
QString str;
str = "";
str = QLatin1String("");
Is "" == QLatin1String("")?

QString holds unicode. A string literal "foo" is a byte sequence that could contain text in any encoding. When assigning a string literal to a QString, QString str = "foo", you implicitly convert from a byte sequence in an undefined encoding to a QString holding unicode. The QString(const char*) constructor assumes ASCII and will convert as if you had typed QString str = QString::fromAscii("foo"). That would break if you use non-ASCII literals in your source files (e.g., Japanese string literals in UTF-8) or pass character data from a char* or QByteArray you read from elsewhere (a file, socket, etc.). Thus it's good practice to keep the unicode QString world and the byte array QByteArray/char* world separated and to convert between the two only explicitly, clearly stating which encoding you want to use. One can define QT_NO_CAST_FROM_ASCII and QT_NO_CAST_TO_ASCII to enforce explicit conversions (I would always enable them when writing a parser of any sort).
Now, to assign a latin1 string literal to a QString variable using explicit conversion, one can use
QString foo = QString::fromLatin1("föö");
or
QString foo = QLatin1String("föö");
Both state that the literal is encoded in latin1 and allow "encoding-safe" conversions to unicode.
I find QLatin1String nicer to read and the QLatin1String docs explain why it will be also faster in some situations.
Wrapping string literals (or in some cases QByteArray or char* variables) that hold latin1 data for conversion is the main use for QLatin1String. One wouldn't use QLatin1String for method arguments, member variables or temporaries (those are all QString).
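A short sketch of why stating "this is latin1" makes the conversion safe: every Latin-1 byte maps to the Unicode code point with the same numeric value, so the conversion is a plain widening of each byte. The code below is plain standard C++ with a hypothetical helper name, latin1ToUtf16; it is not Qt's implementation.

```cpp
#include <cassert>
#include <string>

// Widen Latin-1 bytes to UTF-16 code units. Each Latin-1 byte value equals
// its Unicode code point, so no lookup table is needed.
// Hypothetical helper for illustration; not part of Qt.
std::u16string latin1ToUtf16(const std::string& latin1)
{
    std::u16string out;
    out.reserve(latin1.size());
    for (char c : latin1) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(c)));
    }
    assert(out.size() == latin1.size());  // exactly one unit per byte
    return out;
}
```

The byte 0xF6 (the letter ö in Latin-1) becomes the code unit 0x00F6. The same byte interpreted as UTF-8 would be an invalid sequence, which is the kind of mismatch the explicit conversions are meant to prevent.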

QString is Unicode based while QLatin1String is US-ASCII/Latin-1 based
Unicode is a superset of US-ASCII/Latin-1. If you only deal with US-ASCII/Latin-1 characters, the two are the same for you.
http://doc.qt.io/qt-4.8/qstring.html
http://doc.qt.io/qt-4.8/qlatin1string.html

How to convert QList<QByteArray> to QString in Qt?

I have a QList<QByteArray> that I want to print out in a QTextBrowser. QTextBrowser->append() takes a QString.
Despite a ton of searching online, I have not found a way to convert the data I have into a QString.

There are several functions to convert QByteArray to QString: QString::fromAscii(), QString::fromLatin1(), QString::fromUtf8() etc. for the most common ones, and QTextCodec for other encodings. Which one is the correct one depends on the encoding of the text data in the byte array.

Try:
for(int i=0; i<list.size(); ++i){
QString str(list[i].constData());
// use your string as needed
}

To go from QByteArray to QString, use:
const char * QByteArray::constData () const
Returns a pointer to the data stored in the byte array. The pointer
can be used to access the bytes that compose the array. The data is
'\0'-terminated. The pointer remains valid as long as the byte array
isn't reallocated or destroyed.
This function is mostly useful to pass a byte array to a function that
accepts a const char *.
You can then pass that pointer to this QString constructor:
QString ( const char * str )

Qt - Converting QString to Unicode QByteArray

I have a client-server application where the client will be in Qt (Ubuntu) and the server will be in C#. The Qt client will send the strings in UTF-16 encoded format.
I have used the QTextCodec class to convert to UTF-16. But whenever the conversion happens it will be padded with some more characters. For example
"<bind endpoint='2_3'/>"
will be changed to
"\ff\fe<\0b\0i\0n\0d\0 \0e\0n\0d\0p\0o\0i\0n\0t\0=\0'\02\0_\03\0'\0/\0>\0\0\0"
I have following code which converts the QString to QByteArray
//'socketMessage' is the QString, listener is the QTcpSocket
QTextCodec *codec = QTextCodec::codecForName("UTF-16");
QByteArray data = codec->fromUnicode(socketMessage);
listener->write(data);
I have even tried QTextStream, QDataStream etc. for encoding, but every time I end up with the same result. Am I doing something wrong here?

Though the question was asked long ago, I had the same problem. The solution is to create a QTextEncoder with the option QTextCodec::IgnoreHeader.
QTextCodec *codec = QTextCodec::codecForName("UTF-16");
QTextEncoder *encoderWithoutBom = codec->makeEncoder( QTextCodec::IgnoreHeader );
QString str("Foobar");
QByteArray bytes = encoderWithoutBom->fromUnicode(str);
This will result in a QByteArray without BOM.

The \ff\fe at the beginning is the Unicode byte order mark (BOM) for UTF-16, little-endian. I'm not sure how to get the QTextCodec to omit that, but if you want to get a QByteArray from a string in UTF-16 without the BOM you could try this:
QString s("12345");
QByteArray b((const char*) (s.utf16()), s.size() * 2);
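The cast above copies the code units in the machine's native byte order. If the receiver expects little-endian specifically (as the \ff\fe in the question's output suggests), a sketch that fixes the byte order explicitly could look like this. It is plain standard C++, and toUtf16LeBytes is a hypothetical helper name, not a Qt function.

```cpp
#include <cassert>
#include <string>

// Serialize UTF-16 code units as little-endian bytes, with no BOM.
// Hypothetical helper for illustration; not part of Qt.
std::string toUtf16LeBytes(const std::u16string& units)
{
    std::string out;
    out.reserve(units.size() * 2);
    for (char16_t u : units) {
        out.push_back(static_cast<char>(u & 0xFF));  // low byte first
        out.push_back(static_cast<char>(u >> 8));    // then high byte
    }
    assert(out.size() == units.size() * 2);  // two bytes per code unit
    return out;
}
```

For ASCII text such as "<b" this yields the bytes '<', 0, 'b', 0, which is the pattern shown in the question minus the leading BOM and the trailing zero bytes.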
