What's the difference between QString and QLatin1String? - qt

As the title says:
1. What's the difference between QString and QLatin1String?
2. When and where do I need to use each of them?
3. Given the following:
QString str;
str = "";
str = QLatin1String("");
Is "" == QLatin1String("")?

QString holds Unicode. A string literal "foo" is a byte sequence that could contain text in any encoding. When assigning a string literal to a QString, QString str = "foo", you implicitly convert from a byte sequence in an undefined encoding to a QString holding Unicode. The QString(const char*) constructor assumes ASCII and will convert as if you typed QString str = QString::fromAscii("foo") (this describes Qt 4; Qt 5 and later interpret the literal as UTF-8 instead). That breaks if you use non-ASCII literals in your source files (e.g., Japanese string literals in UTF-8) or pass character data from a char* or QByteArray you read from elsewhere (a file, socket, etc.). Thus it's good practice to keep the Unicode QString world and the byte array QByteArray/char* world separated, and to convert between the two only explicitly, clearly stating which encoding the conversion uses. One can define QT_NO_CAST_FROM_ASCII and QT_NO_CAST_TO_ASCII to enforce explicit conversions (I would always enable them when writing a parser of any sort).
Now, to assign a latin1 string literal to a QString variable using explicit conversion, one can use
QString foo = QString::fromLatin1("föö");
or
QString foo = QLatin1String("föö");
Both state that the literal is encoded in Latin-1 and allow "encoding-safe" conversions to Unicode.
I find QLatin1String nicer to read, and the QLatin1String docs explain why it can also be faster in some situations.
The main use for QLatin1String is wrapping string literals (or, in some cases, QByteArray or char* variables) that hold Latin-1 data so they can be converted. One wouldn't use QLatin1String for method arguments, member variables or temporaries; those should all be QString.
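To make the "encoding-safe" conversion above concrete, here is a minimal sketch in plain C++ (no Qt) of what a Latin-1 to UTF-16 conversion amounts to. The helper name latin1ToUtf16 is hypothetical and is not Qt API; it only illustrates why stating the encoding matters: Latin-1 bytes map one-to-one onto the first 256 Unicode code points, which is not true of UTF-8 or other encodings.

```cpp
#include <string>

// Hypothetical helper (not part of Qt): widen Latin-1 bytes to UTF-16.
// Latin-1 byte values coincide with Unicode code points U+0000..U+00FF,
// so every input byte becomes exactly one 16-bit code unit.
std::u16string latin1ToUtf16(const std::string &latin1)
{
    std::u16string out;
    out.reserve(latin1.size());
    for (unsigned char byte : latin1) {
        // Going through unsigned char avoids sign extension of bytes >= 0x80.
        out.push_back(static_cast<char16_t>(byte));
    }
    return out;
}
```

Feeding the same function UTF-8 bytes would produce two wrong characters for every "ö", which is the kind of silent corruption the explicit conversions are meant to prevent.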

QString is Unicode-based, while QLatin1String is US-ASCII/Latin-1-based.
Unicode is a superset of US-ASCII/Latin-1. If you only deal with US-ASCII/Latin-1 characters, the two behave the same for you.
http://doc.qt.io/qt-4.8/qstring.html
http://doc.qt.io/qt-4.8/qlatin1string.html

Related

QRegExp and Null Character in Qt

I want to search a binary file with a regular expression.
My search succeeds on text files but finds no match in a binary file, because QRegExp's indexIn function stops searching when it meets a null character (chr(0)).
What can I do to solve this problem?
QString can contain null characters; it's just that its constructors are inconsistent.
QString::fromUtf8(const char *str, int size = -1) uses the given size, while QString::fromUtf8(const QByteArray &str) forces a strlen instead of using the byte array's size. See the Qt source code for yourself.
QRegExp also supports null characters:
QString s(QChar(0));
QRegExp re(s);
qDebug() << re.indexIn(s); // will print 0, not -1
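The size-versus-strlen difference described above is not specific to Qt. As a rough illustration in plain C++ (the two function names are hypothetical, chosen only for this sketch), constructing a string from a bare pointer stops at the first null character, while passing the size explicitly keeps the embedded nulls:

```cpp
#include <cstddef>
#include <string>

// Length seen by an API that relies on the terminator (strlen-style).
std::size_t lengthUpToFirstNul(const char *data)
{
    return std::string(data).size();
}

// Length preserved when the caller passes the size explicitly.
std::size_t lengthWithExplicitSize(const char *data, std::size_t size)
{
    return std::string(data, size).size();
}
```

For binary data, the practical consequence is the same as in the answer: prefer the overloads that take an explicit size.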

How can i split a wchar_t / TCHAR / WCHAR / LPTSTR into a QStringList?

While working with the Win32 API, the function I must use returns its results by writing them to a buffer of type LPTSTR, and also returns the number of characters that were written.
As this buffer is a string, and the function can return multiple values, the actual result data look something like this:
Value1\0Value2\0Value3\0\0
What is the best way to get this into a QStringList?
LPTSTR = Long Pointer to TCHAR. In builds with UNICODE defined (the norm on modern Windows), TCHAR is WCHAR, so the buffer is a WCHAR array.
Since each character in your output buffer is two bytes wide, the buffer is compatible with UTF-16.
QString has a fromUtf16 static method, which only requires a simple cast to satisfy the compiler.
In this case, we MUST also specify the total length of the entire string. Failing to do so results in QString reading the input data only up to the first null character and ignoring the remaining result data.
Once we have a QString to work with, splitting it is simple: call QString's split() method with a null character wrapped in a QChar as the separator.
Optionally (and required in my case), specifying SkipEmptyParts as the SplitBehavior ensures that no empty strings (produced by the trailing null characters) end up in the desired result, the QStringList of values.
Example:
// The data returned by the API call.
const WCHAR *rawResultData = L"Value1\0Value2\0Value3\0";
// The number of individual characters returned, null characters included.
quint64 numberOfWrittenCharacters = 22;
// Create a QString from the returned data specifying
// the size.
QString rString =
    QString::fromUtf16(reinterpret_cast<const ushort *>(rawResultData), numberOfWrittenCharacters);
// Finally, split the string into a QStringList
// ignoring empty results.
QStringList results =
    rString.split(QChar(L'\0'), QString::SkipEmptyParts);
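For readers who want to see the splitting step without Qt, the following is a small sketch in standard C++ of the same idea: walk the buffer using the reported length (not the first null character) and collect the non-empty pieces. The function name splitOnNul and the use of char16_t instead of WCHAR are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a buffer of UTF-16 code units on NUL,
// skipping empty parts (the same effect as SkipEmptyParts above).
std::vector<std::u16string> splitOnNul(const char16_t *buffer, std::size_t length)
{
    std::vector<std::u16string> parts;
    std::u16string current;
    for (std::size_t i = 0; i < length; ++i) {
        if (buffer[i] == u'\0') {
            // A NUL ends the current value; drop it if it is empty.
            if (!current.empty()) {
                parts.push_back(current);
            }
            current.clear();
        } else {
            current.push_back(buffer[i]);
        }
    }
    // Keep a trailing value that was not NUL-terminated.
    if (!current.empty()) {
        parts.push_back(current);
    }
    return parts;
}
```

Called with the 22-character buffer from the example, this yields the three values Value1, Value2 and Value3.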

QString to unicode std::string

I know there is plenty of information about converting QString to char*, but I still need some clarification in this question.
Qt provides QTextCodecs to convert QString (which internally stores characters in unicode) to QByteArray, allowing me to retrieve char* which represents the string in some non-unicode encoding. But what should I do when I want to get a unicode QByteArray?
QTextCodec* codec = QTextCodec::codecForName("UTF-8");
QString qstr = codec->toUnicode("Юникод");
std::string stdstr(reinterpret_cast<const char*>(qstr.constData()), qstr.size() * 2 ); // * 2 since unicode character is twice longer than char
qDebug() << QString(reinterpret_cast<const QChar*>(stdstr.c_str()), stdstr.size() / 2); // same
The above code prints "Юникод" as I expected. But I'd like to know whether that is the right way to get at the Unicode char* of the QString. In particular, the reinterpret_casts and size arithmetic in this technique look pretty ugly.
The below applies to Qt 5. Qt 4's behavior was different and, in practice, broken.
You need to choose:
Whether you want the 8-bit wide std::string or 16-bit wide std::wstring, or some other type.
What encoding is desired in your target string?
Internally, QString stores UTF-16 encoded data, so any Unicode code point may be represented in one or two QChars.
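The "one or two QChars" point can be illustrated with a short sketch of UTF-16 encoding in plain C++. The function encodeUtf16 is a hypothetical name for this example, and it assumes the input is a valid Unicode scalar value (at most U+10FFFF and not itself a surrogate); it performs no validation.

```cpp
#include <string>

// Hypothetical helper: encode one code point as UTF-16 code units.
// Assumes a valid scalar value; no error checking is done.
std::u16string encodeUtf16(char32_t codePoint)
{
    std::u16string units;
    if (codePoint < 0x10000) {
        // Basic Multilingual Plane: a single 16-bit unit.
        units.push_back(static_cast<char16_t>(codePoint));
    } else {
        // Supplementary planes: a high and a low surrogate.
        const char32_t offset = codePoint - 0x10000;
        units.push_back(static_cast<char16_t>(0xD800 + (offset >> 10)));
        units.push_back(static_cast<char16_t>(0xDC00 + (offset & 0x3FF)));
    }
    return units;
}
```

For example, U+0041 ("A") takes one unit, while U+1F600 takes the two units 0xD83D and 0xDE00, which is why a string's length in code units can exceed its number of characters.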
Common cases:
Locally encoded 8-bit std::string (as in: system locale):
std::string(str.toLocal8Bit().constData())
UTF-8 encoded 8-bit std::string:
str.toStdString()
This is equivalent to:
std::string(str.toUtf8().constData())
UTF-16 or UCS-4 encoded std::wstring, 16- or 32 bits wide, respectively. The selection of 16- vs. 32-bit encoding is done by Qt to match the platform's width of wchar_t.
str.toStdWString()
U16 or U32 strings of C++11 - from Qt 5.5 onwards:
str.toStdU16String()
str.toStdU32String()
UTF-16 encoded 16-bit std::u16string - this hack is only needed up to Qt 5.4:
std::u16string(reinterpret_cast<const char16_t*>(str.constData()))
This encoding does not include byte order marks (BOMs).
It's easy to prepend BOMs to the QString itself before converting it:
QString src = ...;
src.prepend(QChar::ByteOrderMark);
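#if QT_VERSION < QT_VERSION_CHECK(5,5,0)
// Parentheses plus an explicit cast avoid a narrowing error on the int size.
auto dst = std::u16string(reinterpret_cast<const char16_t*>(src.constData()),
                          static_cast<std::size_t>(src.size()));
#else
auto dst = src.toStdU16String();
#endif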
If you expect the strings to be large, you can skip one copy:
const QString src = ...;
std::u16string dst;
dst.reserve(src.size() + 1); // BOM + text; std::u16string adds its own terminator
dst.push_back(char16_t(QChar::ByteOrderMark)); // append(char16_t) does not exist
dst.append(reinterpret_cast<const char16_t*>(src.constData()),
           src.size());
In both cases, dst is now portable to systems with either endianness.
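Why the BOM makes the data portable can be sketched without Qt: a reader inspects the first two bytes to learn the writer's byte order and decodes the rest accordingly. The two function names below are hypothetical, and the decoder is deliberately simplistic (it treats anything other than the bytes 0xFE 0xFF as little-endian and does not handle a missing BOM).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Serialize UTF-16 code units to bytes in the requested byte order.
std::vector<std::uint8_t> toBytes(const std::u16string &text, bool bigEndian)
{
    std::vector<std::uint8_t> bytes;
    bytes.reserve(text.size() * 2);
    for (char16_t unit : text) {
        const std::uint8_t high = static_cast<std::uint8_t>(unit >> 8);
        const std::uint8_t low = static_cast<std::uint8_t>(unit & 0xFF);
        if (bigEndian) {
            bytes.push_back(high);
            bytes.push_back(low);
        } else {
            bytes.push_back(low);
            bytes.push_back(high);
        }
    }
    return bytes;
}

// Decode bytes that begin with a BOM (U+FEFF); returns the text without it.
std::u16string fromBytesWithBom(const std::vector<std::uint8_t> &bytes)
{
    std::u16string text;
    if (bytes.size() < 2) {
        return text;
    }
    // 0xFE 0xFF: big-endian writer. Otherwise assume little-endian (sketch only).
    const bool bigEndian = bytes[0] == 0xFE && bytes[1] == 0xFF;
    for (std::size_t i = 2; i + 1 < bytes.size(); i += 2) {
        const unsigned first = bytes[i];
        const unsigned second = bytes[i + 1];
        const unsigned unit = bigEndian ? (first << 8 | second)
                                        : (second << 8 | first);
        text.push_back(static_cast<char16_t>(unit));
    }
    return text;
}
```

Whichever byte order the writer used, the reader recovers the same text, provided the BOM was prepended before serialization.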
Use this:
QString Widen(const std::string &stdStr)
{
    return QString::fromUtf8(stdStr.data(), stdStr.size());
}
std::string Narrow(const QString &qtStr)
{
    QByteArray utf8 = qtStr.toUtf8();
    return std::string(utf8.data(), utf8.size());
}
In all cases you should have utf8 in std::string.
You can get the QByteArray from a UTF-16 encoded QString using this:
QTextCodec *codec = QTextCodec::codecForName("UTF-16");
QTextEncoder *encoderWithoutBom = codec->makeEncoder( QTextCodec::IgnoreHeader );
QByteArray array = encoderWithoutBom->fromUnicode( str );
This way you ignore the unicode byte order mark (BOM) at the beginning.
You can convert it to char * like:
int dataSize = array.size();
char *data = new char[dataSize];
for (int i = 0; i < dataSize; i++)
{
    data[i] = array[i];
}
Or simply:
char *data = array.data();

Qt can't get characters from Unicode string

How can I get a Unicode character (QChar type) from a Unicode string (QString type)? I have tried operator[] on the QString object and its member function at(), but neither is helping me (I'm using Qt Creator 2.0.1). I'm a beginner in Qt, so this may be a simple question.
Did you try something like:
QString s("text");
QChar unicodeChar(s.at(0).unicode());
If you load your string into a QString, you can still use the .at(index) function. It returns a QChar, which holds a single 16-bit UTF-16 code unit. Call unicode() on that QChar to get its numeric value; on platforms where wchar_t is 16 bits wide (such as Windows) you can also cast it to a wchar_t.

How to convert QList<QByteArray> to QString in Qt?

I have a QList<QByteArray> that I want to print out in a QTextBrowser. QTextBrowser->append() takes a QString.
Despite a ton of searching online, I have not found a way to convert the data I have into a QString.
There are several functions to convert QByteArray to QString: QString::fromAscii(), QString::fromLatin1(), QString::fromUtf8() etc. for the most common ones, and QTextCodec for other encodings. Which one is the correct one depends on the encoding of the text data in the byte array.
Try:
for (int i = 0; i < list.size(); ++i) {
    QString str(list[i].constData());
    // use your string as needed
}
To go from QByteArray to QString, use
const char * QByteArray::constData () const
The documentation describes it as follows: Returns a pointer to the data stored in the byte array. The pointer can be used to access the bytes that compose the array. The data is '\0'-terminated. The pointer remains valid as long as the byte array isn't reallocated or destroyed. This function is mostly useful to pass a byte array to a function that accepts a const char *.
You can then pass that pointer to this QString constructor:
QString ( const char * str )
