How to insert a Unicode string into SQLite in C++? - sqlite

I used
sqlite3_open16(v_st, &m_db); // returns SQLITE_OK
I want to insert a Unicode string.
Example:
sqlite3_exec(m_db, "UPDATE t1 SET a1='text'", 0, 0, 0); // works fine
but
sqlite3_exec(m_db, "UPDATE t1 SET a1='текст'", 0, 0, 0); // does not work

When you use the sqlite3_open16 function, you create/open the database with UTF-16 encoding.
But when you call sqlite3_exec(m_db,"UPDATE t1 SET a1='текст'",0,0,0), the statement bytes are in whatever encoding your compiler/source file uses (UTF-8 by default, or cp1251).
You should use the same encoding in both cases. You may do one of these things:
1) use the sqlite3_open function to open the database with UTF-8 encoding, and ensure your compiler uses UTF-8 as well (see the sketch below);
or
2) change your compiler encoding to UTF-16;
or
3) use the SQL statement PRAGMA encoding = "UTF-8" before any tables are created
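A minimal sketch of option 1 (the file name test.db is an assumption; table t1 and column a1 come from the question; u8-literals are plain char arrays in C++11/14/17):
#include <sqlite3.h>

sqlite3* m_db = nullptr;
if(sqlite3_open("test.db", &m_db) == SQLITE_OK)   // database is created/opened with UTF-8 encoding
{
    // u8"..." guarantees UTF-8 bytes regardless of the source file's code page
    sqlite3_exec(m_db, u8"UPDATE t1 SET a1='текст'", 0, 0, 0);
}
sqlite3_close(m_db);
If the statement text itself comes to you in cp1251 (the Windows "ANSI" code page for Cyrillic), you can convert it to UTF-8 first with a helper like this one: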

#include <cstring>
#include <string>

// Note: despite its name, this helper converts a cp1251 ("ANSI" Cyrillic)
// string to UTF-8, which is what a UTF-8 database expects from sqlite3_exec().
std::string ANSItoUTF16le(const char* v_str)
{
    std::string t_res;
    const unsigned char* t_buf = (const unsigned char*)v_str;
    size_t t_size = std::strlen(v_str);
    // cp1251 byte values: 'А'=0xC0, 'п'=0xEF, 'Ё'=0xA8, 'ё'=0xB8, 'р'=0xF0, 'я'=0xFF
    const unsigned char t_s = 0xC0, t_f = 0xEF, t_E = 0xA8, t_e = 0xB8, t2_s = 0xF0, t2_f = 0xFF;
    unsigned char t_c;
    for(size_t i = 0; i < t_size; i++)
    {
        if(t_buf[i] >= t_s && t_buf[i] <= t_f)        // А..п -> U+0410..U+043F
        {
            t_res += '\xD0';
            t_c = 0x90 + (t_buf[i] - t_s);
            t_res += t_c;
            continue;
        }
        if(t_buf[i] >= t2_s && t_buf[i] <= t2_f)      // р..я -> U+0440..U+044F
        {
            t_res += '\xD1';
            t_c = 0x80 + (t_buf[i] - t2_s);
            t_res += t_c;
            continue;
        }
        if(t_buf[i] == t_E)                           // Ё -> U+0401
        {
            t_res += "\xD0\x81";
            continue;
        }
        if(t_buf[i] == t_e)                           // ё -> U+0451
        {
            t_res += "\xD1\x91";
            continue;
        }
        t_res += v_str[i];                            // everything else is passed through
    }
    return t_res;
}
and
sqlite3_exec(m_db,ANSItoUTF16le("UPDATE t1 SET a1='текст'").c_str(),0,0,0);
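Another option that avoids re-encoding the SQL text altogether is a prepared statement: the statement stays plain ASCII and the value is bound as UTF-16. A sketch (assuming Windows, where wide literals are UTF-16; m_db, t1 and a1 are from the question):
sqlite3_stmt* stmt = nullptr;
if(sqlite3_prepare_v2(m_db, "UPDATE t1 SET a1=?", -1, &stmt, nullptr) == SQLITE_OK)
{
    // sqlite3_bind_text16 expects native-endian UTF-16; -1 means "zero-terminated"
    sqlite3_bind_text16(stmt, 1, L"текст", -1, SQLITE_STATIC);
    sqlite3_step(stmt);
}
sqlite3_finalize(stmt);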

You can also insert a Unicode string into SQLite in C++ by using a wrapper class on top of sqlite3.
CppSqlite3U is a Unicode wrapper around the SQLite database. It lets you work with UNICODE types such as wchar_t (instead of char). It can be downloaded from here and is also described in this article. For example, a query executed by execQuery takes an LPCTSTR, meaning that it can be either an array of wide chars or of chars, depending on whether UNICODE is defined in your project.
CppSQLite3Query execQuery(LPCTSTR szSQL);
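A hypothetical usage sketch follows; the class name CppSQLite3DB and the open()/execDML() methods are assumed here to mirror the original CppSQLite wrapper, so check the downloaded headers for the exact interface:
CppSQLite3DB db;                                 // assumed class name from the original CppSQLite
db.open(_T("test.db"));                          // LPCTSTR: wide string when UNICODE is defined
db.execDML(_T("UPDATE t1 SET a1='текст'"));      // assumed DML helper; execQuery is for SELECTs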

Related

QString to unicode std::string

I know there is plenty of information about converting QString to char*, but I still need some clarification in this question.
Qt provides QTextCodecs to convert QString (which internally stores characters in unicode) to QByteArray, allowing me to retrieve char* which represents the string in some non-unicode encoding. But what should I do when I want to get a unicode QByteArray?
QTextCodec* codec = QTextCodec::codecForName("UTF-8");
QString qstr = codec->toUnicode("Юникод");
std::string stdstr(reinterpret_cast<const char*>(qstr.constData()), qstr.size() * 2 ); // * 2 since each QChar is two bytes
qDebug() << QString(reinterpret_cast<const QChar*>(stdstr.c_str()), stdstr.size() / 2); // same reasoning in reverse
The above code prints "Юникод" as I expected. But I'd like to know if that is the right way to get to the Unicode char* of the QString. In particular, the reinterpret_casts and size arithmetic in this technique look pretty ugly.
The below applies to Qt 5. Qt 4's behavior was different and, in practice, broken.
You need to choose:
Whether you want the 8-bit wide std::string or 16-bit wide std::wstring, or some other type.
What encoding is desired in your target string?
Internally, QString stores UTF-16 encoded data, so any Unicode code point may be represented in one or two QChars.
Common cases:
Locally encoded 8-bit std::string (as in: system locale):
std::string(str.toLocal8Bit().constData())
UTF-8 encoded 8-bit std::string:
str.toStdString()
This is equivalent to:
std::string(str.toUtf8().constData())
UTF-16 or UCS-4 encoded std::wstring, 16 or 32 bits wide, respectively. The selection of 16- vs. 32-bit encoding is made by Qt to match the platform's width of wchar_t.
str.toStdWString()
U16 or U32 strings of C++11 - from Qt 5.5 onwards:
str.toStdU16String()
str.toStdU32String()
UTF-16 encoded 16-bit std::u16string - this hack is only needed up to Qt 5.4:
std::u16string(reinterpret_cast<const char16_t*>(str.constData()))
This encoding does not include byte order marks (BOMs).
It's easy to prepend BOMs to the QString itself before converting it:
QString src = ...;
src.prepend(QChar::ByteOrderMark);
#if QT_VERSION < QT_VERSION_CHECK(5,5,0)
auto dst = std::u16string{reinterpret_cast<const char16_t*>(src.constData()),
                          size_t(src.size())};
#else
auto dst = src.toStdU16String();
#endif
If you expect the strings to be large, you can skip one copy:
const QString src = ...;
std::u16string dst;
dst.reserve(src.size() + 2); // BOM + termination
dst.append(char16_t(QChar::ByteOrderMark));
dst.append(reinterpret_cast<const char16_t*>(src.constData()),
           src.size() + 1);
In both cases, dst is now portable to systems with either endianness.
Use this:
QString Widen(const std::string &stdStr)
{
    // interpret the bytes of the std::string as UTF-8
    return QString::fromUtf8(stdStr.data(), stdStr.size());
}
std::string Narrow(const QString &qtStr)
{
    // encode the QString as UTF-8 bytes
    QByteArray utf8 = qtStr.toUtf8();
    return std::string(utf8.data(), utf8.size());
}
This way the std::string side always holds UTF-8.
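A quick round-trip check of the two helpers (the escaped bytes spell "Юникод" in UTF-8):
std::string utf8In = "\xD0\xAE\xD0\xBD\xD0\xB8\xD0\xBA\xD0\xBE\xD0\xB4"; // "Юникод"
QString qs = Widen(utf8In);          // UTF-8 bytes -> Unicode QString
std::string utf8Out = Narrow(qs);    // Unicode QString -> UTF-8 bytes
Q_ASSERT(utf8In == utf8Out);         // the round trip is lossless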
You can get the QByteArray from a UTF-16 encoded QString using this:
QTextCodec *codec = QTextCodec::codecForName("UTF-16");
QTextEncoder *encoderWithoutBom = codec->makeEncoder( QTextCodec::IgnoreHeader );
QByteArray array = encoderWithoutBom->fromUnicode( str );
This way you ignore the unicode byte order mark (BOM) at the beginning.
You can copy it into a char * buffer like this:
int dataSize = array.size();
char *data = new char[dataSize];   // remember to delete[] data when you are done
for(int i = 0; i < dataSize; i++)
{
    data[i] = array[i];
}
Or simply:
char *data = array.data();         // the pointer is only valid while array is alive

Qt Check QString to see if it is a valid hex value

I'm working with Qt on an existing project. I'm trying to send a string over a serial cable to a thermostat to send it a command. I need to make sure the string only contains 0-9, a-f, and is no more or less than 6 characters long. I was trying to use QString::contains, but I am currently stuck. Any help would be appreciated.
You have two options:
Use QRegExp
Use the QRegExp class to create a regular expression that finds what you're looking for. In your case, something like the following might do the trick:
QRegExp hexMatcher("^[0-9A-F]{6}$", Qt::CaseInsensitive);
if (hexMatcher.exactMatch(someString))
{
    // Found hex string of length 6.
}
Update
Qt 5 users should consider using QRegularExpression instead of QRegExp:
QRegularExpression hexMatcher("^[0-9A-F]{6}$",
                              QRegularExpression::CaseInsensitiveOption);
QRegularExpressionMatch match = hexMatcher.match(someString);
if (match.hasMatch())
{
    // Found hex string of length 6.
}
Use QString Only
Check the length of the string and then check to see that you can convert it to an integer successfully (using a base 16 conversion):
bool conversionOk = false;
int value = myString.toInt(&conversionOk, 16);
if (conversionOk && myString.length() == 6)
{
    // Found hex string of length 6.
}
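Either check can also be wrapped in a small helper, for example (the function name is ours):
// Returns true if the string is exactly six hexadecimal digits (case-insensitive).
bool isSixDigitHex(const QString &s)
{
    static const QRegularExpression re("^[0-9A-Fa-f]{6}$");
    return re.match(s).hasMatch();
}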

what's the difference between QString and QLatin1String?

As in the title:
1. What's the difference between QString and QLatin1String?
2. When and where do I need to use one of them?
3. Given the following:
QString str;
str = "";
str = QLatin1String("");
Is "" == QLatin1String("")?
QString holds Unicode. A string literal "foo" is a byte sequence that could contain text in any encoding. When assigning a string literal to a QString, QString str = "foo", you implicitly convert from a byte sequence in an undefined encoding to a QString holding Unicode. The QString(const char*) constructor assumes ASCII and will convert as if you typed QString str = QString::fromAscii("foo"). That breaks if you use non-ASCII literals in your source files (e.g., Japanese string literals in UTF-8) or pass character data from a char* or QByteArray you read from elsewhere (a file, a socket, etc.).

Thus it's good practice to keep the Unicode QString world and the byte-array QByteArray/char* world separated, and to convert between the two only explicitly, clearly stating which encoding you want to use for the conversion. You can define QT_NO_CAST_FROM_ASCII and QT_NO_CAST_TO_ASCII to enforce explicit conversions (I would always enable them when writing a parser of any sort).
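For example, this is what those macros change (they are normally set project-wide, e.g. DEFINES += QT_NO_CAST_FROM_ASCII QT_NO_CAST_TO_ASCII in the .pro file):
#define QT_NO_CAST_FROM_ASCII   // usually defined project-wide rather than per file
#define QT_NO_CAST_TO_ASCII
#include <QString>

// QString str = "foo";         // no longer compiles: the implicit char* conversion is disabled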
Now, to assign a latin1 string literal to a QString variable using explicit conversion, one can use
QString foo = QString::fromLatin1("föö");
or
QString foo = QLatin1String("föö");
Both state that the literal is encoded in Latin-1 and allow an "encoding-safe" conversion to Unicode.
I find QLatin1String nicer to read, and the QLatin1String docs explain why it will also be faster in some situations.
Wrapping string literals (or, in some cases, QByteArray or char* variables holding Latin-1 data) for conversion is the main use of QLatin1String; one wouldn't use QLatin1String for method arguments, member variables, or temporaries (those should all be QString).
QString is Unicode-based, while QLatin1String is US-ASCII/Latin-1-based.
Unicode is a superset of US-ASCII/Latin-1. If you only deal with US-ASCII/Latin-1 characters, the two are equivalent for you.
http://doc.qt.io/qt-4.8/qstring.html
http://doc.qt.io/qt-4.8/qlatin1string.html

How to convert TBuf8 to QString

I've tried to convert using the following code:
template< unsigned int size >
static QString
TBuf82QString( const TBuf8< size > &buf )
{
    return QString::fromUtf16(
        reinterpret_cast<unsigned short*>(
            const_cast<TUint8*>(
                buf.Ptr() ) ), buf.Length() );
}
But it always returns something like ?????b.
EDIT: Changed code example
Using a template probably isn't a good solution, since it will result in a new instantiation of this block of code within your application binary, for every size of input string which is converted. Since the output type (QString) contains no compile-time constant, this means you end up with code bloat, for no gain.
A better approach would be to leverage the fact that TBuf8<N> inherits from TDesC8:
QString TBuf2QString(const TDesC8 &buf)
{
    return QString::fromLocal8Bit(reinterpret_cast<const char *>(buf.Ptr()),
                                  buf.Length());
}
TBuf<16> foo(_L("sometext"));
QString bar = TBuf2QString(foo);
TBuf8 is used for binary data or non-Unicode strings. TBuf16 is used for Unicode strings. TBuf is conditionally compiled and will always be TBuf16 as Symbian OS is natively Unicode.
Try using QString::fromLocal8Bit() with TBuf8::Ptr()
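For a 16-bit descriptor the matching conversion is QString::fromUtf16(). A sketch (the helper name is ours; TDesC16 is the base class of TBuf16, just as TDesC8 is of TBuf8):
QString TBuf16ToQString(const TDesC16 &buf)
{
    // Symbian 16-bit descriptors already hold UTF-16 code units
    return QString::fromUtf16(buf.Ptr(), buf.Length());
}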

SQLite3: Insert BLOB with NULL characters in C++

I'm working on the development of a C++ API which uses custom-designed plugins
to interface with different database engines using their APIs and specific SQL
syntax.
Currently, I'm attempting to find a way of inserting BLOBs, but since NULL is
the terminating character in C/C++, the BLOB becomes truncated when constructing
the INSERT INTO query string. So far, I've worked with
//...
char* sql;
void* blob;
int len;
//...
blob = some_blob_already_in_memory;
len = length_of_blob_already_known;
sql = sqlite3_malloc(2*len+1);
sql = sqlite3_mprintf("INSERT INTO table VALUES (%Q)", (char*)blob);
//...
I expect that, if it is at all possible to do it in the SQLite3 interactive console, it should be possible to construct the query string with properly escaped NULL characters. Maybe there's a way to do this with standard SQL which is also supported by SQLite SQL syntax?
Surely someone must have faced the same situation before. I've googled and found some answers, but they were in other programming languages (Python).
Thank you in advance for your feedback.
Thank you all again for your feedback. Here is how I solved the problem with the help of the suggestions provided here. Hopefully this will help others in the future.
As suggested by the first three posters, I did use prepared statements, additionally because I was also interested in getting the columns' data types, and a simple sqlite3_get_table() wouldn't do.
After preparing the SQL statement in the form of the following constant string:
INSERT INTO table VALUES(?,?,?,?);
what remains is binding the corresponding values. This is done by issuing as many sqlite3_bind_blob() calls as there are columns. (I also resorted to sqlite3_bind_text() for the other "simple" data types, because the API I'm working on can translate integers/doubles/etc. into strings.) So:
#include <stdio.h>
#include <string.h>
#include <sqlite3.h>
/* ... */
void* blobvalue[4] = { NULL, NULL, NULL, NULL };
int blobsize[4] = { 0, 0, 0, 0 };
const char* tail = NULL;
const char* sql = "INSERT INTO tabl VALUES(?,?,?,?)";
sqlite3_stmt* stmt = NULL;
sqlite3* db = NULL;
/* ... */
sqlite3_open("sqlite.db", &db);
sqlite3_prepare_v2(db,
                   sql, strlen(sql) + 1,
                   &stmt, &tail);
for(unsigned int i = 0; i < 4; i++) {
    sqlite3_bind_blob(stmt,
                      i + 1, blobvalue[i], blobsize[i],
                      SQLITE_TRANSIENT);
}
if(sqlite3_step(stmt) != SQLITE_DONE) {
    printf("Error message: %s\n", sqlite3_errmsg(db));
}
sqlite3_finalize(stmt);
sqlite3_close(db);
Note also that some functions (sqlite3_open_v2(), sqlite3_prepare_v2()) only appear in later SQLite versions (3.5.x and later, I believe).
The SQLite table tabl in file sqlite.db can be created with (for example)
CREATE TABLE tabl(a TEXT PRIMARY KEY, b TEXT, c TEXT, d TEXT);
You'll want to use this function with a prepared statement.
int sqlite3_bind_blob(sqlite3_stmt*, int, const void*, int n, void(*)(void*));
In C/C++, the standard way of dealing with NULLs in strings is to either store the beginning of the string and a length, or store a pointer to the beginning of a string and one to the end of the string.
You want to precompile the statement with sqlite3_prepare_v2(), and then bind the blob in using sqlite3_bind_blob(). Note that the statement you bind into will be INSERT INTO table VALUES (?).
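A compact sketch of that flow for a single-column insert (db, blob and len are assumed to already exist, as in the question; mytable stands for the real table name):
sqlite3_stmt* stmt = NULL;
sqlite3_prepare_v2(db, "INSERT INTO mytable VALUES (?)", -1, &stmt, NULL);
// The explicit length makes embedded zero bytes harmless, so no escaping is needed
sqlite3_bind_blob(stmt, 1, blob, len, SQLITE_TRANSIENT);
sqlite3_step(stmt);
sqlite3_finalize(stmt);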
