SQLite strings with NUL - sqlite

1. Can strings in SQLite 3 include NUL characters?
2. If the answer to 1 is "yes", how can they be written in SQL queries? SQLite doesn't seem to have chr or char functions.

In general, no - SQLite internally is not 8-bit clean, probably due to its Tcl heritage. While NULs do not cause corruption problems, SQLite typically stops processing strings at the first embedded NUL character.
This is true even for operators such as GLOB. For instance, you cannot match a BLOB column with GLOB when you have embedded NUL characters, e.g. this
select * from table where blobcol glob x'00022a';
will only match empty blob values. While you can use the literal BLOB syntax (i.e. x'hexdigits') and use the resulting values where strings are expected, SQLite typically only uses the part before the first NUL.
At least, this is the state of affairs up to and including SQLite 3.26.0.
Note that SQLite also has a BLOB type which can store embedded NULs without any issues, but there is very little functionality in SQLite to use them, and they are often harder to use from SQL interface libraries.
They also silently convert to strings in many contexts, at which point the embedded NULs start causing issues again.

Not sure from which version onwards it is supported, but you can do it:
create table foo (bar data);
insert into foo(bar) values (x'001122334400ff');
select length(bar),hex(bar),bar from foo;
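To make the difference between BLOB and text handling concrete, here is a minimal sketch that runs the statements above through Python's standard sqlite3 module. The in-memory database and the second query (casting a three-byte blob to text) are illustrative assumptions, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table foo (bar data)")
conn.execute("insert into foo(bar) values (x'001122334400ff')")

# As a BLOB, the value keeps every byte, including both embedded NULs.
blob_row = conn.execute("select length(bar), hex(bar), bar from foo").fetchone()

# Once a value is treated as text, length() only counts up to the first NUL.
# x'610062' is the bytes 'a', NUL, 'b'.
text_len = conn.execute("select length(cast(x'610062' as text))").fetchone()[0]

print(blob_row)
print(text_len)
```

The BLOB round-trips intact, while the text view of a value is cut short at the first NUL, which matches the behaviour described in the first answer.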

Related

How to extract a Teradata .TPT file with UTF-8 encoding

We are currently extracting several Teradata .TPT files that we will upload to AWS S3; however, the files are coming out with ANSI encoding.
I need them to come out UTF-8 encoded.
You must specify the character set in your TPT script. At the top add:
USING CHARACTER SET UTF8
The tricky part is that UTF8 here needs up to 3 bytes per character, so in your DEFINE SCHEMA you must triple the size of each field.
For example if your schema looks like:
DEFINE SCHEMA s_some_export
(
status VARCHAR(20),
userid VARCHAR(20),
firstname VARCHAR(64)
);
You'll have to triple the values to accommodate your UTF8 characters:
DEFINE SCHEMA s_some_export
(
status VARCHAR(60),
userid VARCHAR(60),
firstname VARCHAR(192)
);
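If the schema has many fields, the multiplication can be scripted. The following is only a sketch: the helper name scale_varchar_sizes is hypothetical, and it assumes every VARCHAR(n) in the schema text should be scaled by the same factor (3 for UTF8, 2 for UTF16).

```python
import re

def scale_varchar_sizes(schema: str, factor: int) -> str:
    """Multiply the length of every VARCHAR(n) in a schema text by `factor`."""
    def repl(match: re.Match) -> str:
        return f"VARCHAR({int(match.group(1)) * factor})"
    return re.sub(r"\bVARCHAR\s*\(\s*(\d+)\s*\)", repl, schema)

original = """DEFINE SCHEMA s_some_export
(
status VARCHAR(20),
userid VARCHAR(20),
firstname VARCHAR(64)
);"""

print(scale_varchar_sizes(original, 3))
```

Fields of other types are left untouched, so the CHAR and date caveats still need to be handled by hand.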
Sometimes, because I'm lazy, I define my TPT with USING CHARACTER SET UTF16 so that I only need to double each field size (the math is easier). BUT it means I have to convert the output to UTF8 after extraction. In Linux this would just be:
iconv -f UTF-16LE -t UTF-8 myoutputfile.csv > myoutputfile.utf8.csv
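Where iconv is not available, the same conversion can be sketched in Python. The function name utf16le_to_utf8 is hypothetical; it assumes the export really is UTF-16LE and, like the iconv call, it does not strip a byte order mark if one is present.

```python
def utf16le_to_utf8(src_path: str, dst_path: str) -> None:
    """Re-encode a UTF-16LE text file as UTF-8, line by line."""
    # newline="" keeps the original line endings untouched on both sides.
    with open(src_path, "r", encoding="utf-16-le", newline="") as fin, \
         open(dst_path, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)

# Example call, using the file names from the iconv command:
# utf16le_to_utf8("myoutputfile.csv", "myoutputfile.utf8.csv")
```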
Some caveats:
If your table's field is defined as CHAR and CHARACTER SET LATIN then you may run into column size issues with your schema. see here
Dates and Timestamps can get weird because they don't need to be doubled, so defining them as VARCHAR in your schema can get you into trouble. You may have to fuss around a bit here. My suggestion would be to change the view from which you are selecting the data for your TPT to use CAST(yourdate AS VARCHAR(10)) AS yourdate, and then use VARCHAR(30) in your schema, so you don't have to think about the field types while defining it. This means extra CPU overhead in your extraction, but unless you are running tight on resources I think it's worth it. I'm also very lazy that way and always happy to just get the damned TPT to extract data without much debugging.

Escape chars for SQLite3 command (without prepare)

I am creating a C++ program which will output a series of SQL statements (create, insert, etc) and write them to a file. This file will be used to create and populate a SQLite3 database.
I need to ensure that any values inserted are properly escaped so they can fit within the double quoted string (in the insert statement). Since there is no SQLite database available (this program just writes to a text file), I cannot use prepare. Can someone tell me which characters need to be escaped and how?
So far I've only found that the ' character needs to be escaped with another '.
Inside a string literal, the only character that needs to be escaped is the quote ' itself, which is written as ''. Note that SQL string literals are delimited by single quotes; double quotes delimit identifiers.
As for table/column names, you need to quote them (with double quotes) if they conflict with SQL keywords.
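The two quoting rules can be sketched as small helpers. The question is about C++, but Python is used here for brevity since the logic is a plain string replacement; the helper names sql_string, sql_identifier and insert_statement are hypothetical, and the sketch assumes all values are text.

```python
def sql_string(value: str) -> str:
    """Return `value` as a single-quoted SQL string literal."""
    return "'" + value.replace("'", "''") + "'"

def sql_identifier(name: str) -> str:
    """Return `name` as a double-quoted SQL identifier."""
    return '"' + name.replace('"', '""') + '"'

def insert_statement(table: str, row: dict) -> str:
    """Build one INSERT statement for a dict of column name -> text value."""
    cols = ", ".join(sql_identifier(c) for c in row)
    vals = ", ".join(sql_string(v) for v in row.values())
    return f"INSERT INTO {sql_identifier(table)} ({cols}) VALUES ({vals});"

# A tiny script whose table and column names collide with SQL keywords.
script = "\n".join([
    'CREATE TABLE "order" ("select" TEXT);',
    insert_statement("order", {"select": "it's \"quoted\""}),
])
print(script)
```

Quoting every identifier unconditionally avoids having to maintain a keyword list in the generator.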

How to escape string for SQLite FTS query

I'm trying to perform a SQLite FTS query with untrusted user input. I do not want to give the user access to the query syntax; that is, they should not be able to perform a match query like foo OR bar AND cats. If they tried to query with that string, I would want to interpret it as something more like foo \OR bar \AND cats.
There doesn't seem to be anything built in to SQLite for this, so I'll probably end up building my own escaping function, but this seems dangerous and error-prone. Is there a preferred way to do this?
The FTS MATCH syntax is its own little language. For FTS5, verbatim string literals are well defined:
Within an FTS expression a string may be specified in one of two ways:
By enclosing it in double quotes ("). Within a string, any embedded double quote characters may be escaped SQL-style - by adding a second double-quote character.
(redacted special case)
It turns out that correctly escaping a string for an FTS query is simple enough to implement completely and reliably: Replace " with "" and enclose the result in " on both ends.
In my case it then works perfectly when I put it into a prepared statement such as SELECT stuff FROM fts_table WHERE fts_table MATCH ?. I would then .bind(fts_escape(user_input)) where fts_escape is the function I described above.
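A minimal sketch of that fts_escape function in Python follows. The search helper is hypothetical and reuses the table and column names from the statement above; it assumes an FTS5 table called fts_table already exists in the connected database.

```python
import sqlite3

def fts_escape(user_input: str) -> str:
    """Turn arbitrary user input into a single FTS5 string literal."""
    return '"' + user_input.replace('"', '""') + '"'

def search(conn: sqlite3.Connection, user_input: str) -> list:
    """Run a MATCH query with the escaped input bound as a parameter."""
    sql = "SELECT stuff FROM fts_table WHERE fts_table MATCH ?"
    return conn.execute(sql, (fts_escape(user_input),)).fetchall()

print(fts_escape("foo OR bar AND cats"))
print(fts_escape('say "hi"'))
```

Because the whole input becomes one quoted string, operators such as OR and AND are treated as ordinary words of a phrase rather than as query syntax.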
OK, I've investigated further, and with some heavy magic you can access the actual tokenizer used by SQLite's FTS. The "simple" tokenizer takes your string, separates it on any ASCII character that is not in [A-Za-z0-9], and lowercases what remains. If you perform this same operation you will get a nicely "escaped" string suitable for FTS.
You can write your own, but you can access SQLite's internal one as well. See this question for details on that: Automatic OR queries using SQLite FTS4

SQLite nvarchar(100) field can accept 200 char field. Why?

I have a table with a field defined as nvarchar(100).
I just noticed that if I insert a new record (a 200-character string value, for example), the query works and does not throw any exception.
Is this a SQLite 'feature'?
Using SQLite 1.0.94 with Visual Studio 2010 / C# and a SQLite v3 database.
SQLite doesn't recognize the limit you specified in the statement, so it's not enforced.
In order to enforce it, you might need a statement like this:
CREATE TABLE t (f TEXT CHECK(LENGTH(f)<101));
With this constraint, text with more than 100 characters cannot be inserted.
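The difference between a declared length and a CHECK constraint can be seen in a short sketch using Python's sqlite3 module. The table names t_declared and t_checked and the in-memory database are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_declared (f NVARCHAR(100))")
conn.execute("CREATE TABLE t_checked (f TEXT CHECK(LENGTH(f) < 101))")

# The declared length is ignored: a 200-character value is stored as-is.
conn.execute("INSERT INTO t_declared VALUES (?)", ("x" * 200,))
stored_len = conn.execute("SELECT LENGTH(f) FROM t_declared").fetchone()[0]

# The CHECK constraint accepts 100 characters and rejects 101.
conn.execute("INSERT INTO t_checked VALUES (?)", ("x" * 100,))
try:
    conn.execute("INSERT INTO t_checked VALUES (?)", ("x" * 101,))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(stored_len, rejected)
```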
SQLite has a single unlimited TEXT datatype. See the documentation:
http://www.sqlite.org/datatype3.html#affname
Note that numeric arguments in parentheses that following the type
name (ex: "VARCHAR(255)") are ignored by SQLite - SQLite does not
impose any length restrictions on the length of strings, BLOBs or numeric
values.

SQLite database supporting Unicode data

I'm using a Java Swing application that needs Unicode strings to be dragged into a JTable. Is it possible to store Unicode data in a SQLite database? If so, which SQLite version supports Unicode? I need the free SQLite, not a premium one.
SQLite always stores text data as Unicode, using the Unicode encoding specified when the database was created. The database driver itself takes care to return the data as the Unicode string in the encoding used by your language/platform.
If you have conversion problems, either your application tried to store an ASCII string without converting it to Unicode, or you tried to read one value and force a conversion on it.
SQLite uses a kind of dynamic typing, where each value is stored using a specific storage class. A column's type specifies the affinity or how the value is treated. For example:
A column with NUMERIC affinity may contain values using all five storage classes. When text data is inserted into a NUMERIC column, the storage class of the text is converted to INTEGER or REAL
There are five storage classes, NULL, INTEGER, REAL, TEXT, BLOB. TEXT stores string data using the Unicode encoding specified for the database (UTF-8, UTF-16BE or UTF-16LE).
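Both points, affinity conversion and Unicode storage, can be checked with a short sketch in Python's sqlite3 module (used here instead of Java only to keep the example short). The table t and the sample string are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A freshly created database defaults to UTF-8 as its text encoding.
encoding = conn.execute("PRAGMA encoding").fetchone()[0]

conn.execute("CREATE TABLE t (n NUMERIC, s TEXT)")
sample = "Grüße, мир, 日本語"
# The text '123' goes into a NUMERIC column, the Unicode text into a TEXT column.
conn.execute("INSERT INTO t VALUES (?, ?)", ("123", sample))

n_type, s_type, s_value = conn.execute(
    "SELECT typeof(n), typeof(s), s FROM t"
).fetchone()

print(encoding, n_type, s_type, s_value == sample)
```

The text inserted into the NUMERIC column is converted to the INTEGER storage class, and the Unicode string comes back unchanged from the TEXT column.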
What specific problem are you facing, or is this a general question?
SQLite always uses Unicode strings.
sqlite3 doesn't fully support Unicode. There is a wrapper class called CppSQLite3 which fully supports Unicode.
