Using SQLite how do I index columns in a CREATE TABLE statement? - sqlite

How do I index a column in my CREATE TABLE statement? The table looks like
command.CommandText =
"CREATE TABLE if not exists file_hash_list( " +
"id INTEGER PRIMARY KEY, " +
"hash BLOB NOT NULL, " +
"filesize INTEGER NOT NULL);";
command.ExecuteNonQuery();
I want filesize to be index and would like it to be 4 bytes

You can't do precisely what you're asking, but unlike some RDBMSs, SQLite is able to perform DDL inside of a transaction, with the appropriate result. This means that if you're really concerned about nothing ever seeing file_hash_list without an index, you can do
BEGIN;
CREATE TABLE file_hash_list (
id INTEGER PRIMARY KEY,
hash BLOB NOT NULL,
filesize INTEGER NOT NULL
);
CREATE INDEX file_hash_list_filesize_idx ON file_hash_list (filesize);
COMMIT;
or the equivalent using the transaction primitives of whatever database library you've got there.
I'm not sure how necessary that really is though, compared to just doing the two commands outside of a transaction.
As others have pointed out, SQLite's indexes are all B-trees; you don't get to choose what the type is or what portion of the column is indexed; the entire indexed column(s) are in the index. It will still be efficient for range queries, it just might take up a little more disk space than you'd really like.

I don't think you can. Why can't you use a second query? And specifying the size of the index seems to be impossible in SQLite. But then again, it is SQ_Lite_ ;)
create index
myIndex
on
myTable (myColumn)

Related

Indexes with custom collations in sqlite

Assuming I have a schema like this:
CREATE TABLE abc(
id INTEGER PRIMARY KEY AUTOINCREMENT,
txt TEXT
);
CREATE INDEX "txtCS" ON "abc"("txt" COLLATE MY_CUSTOM_SORT);
when will sqlite use my index on txt ?
because I ran:
EXPLAIN QUERY PLAN SELECT * FROM abc ORDER BY txt COLLATE MY_CUSTOM_SORT DESC ...
and it tells me that it scans the table, twice, using the txtCS index (It doesn't search like I expected.)
MY_CUSTOM_SORT is my own sorting function that I hooked with sqliteCreateCollation. I just need that index for some queries that involve special ordering and I want them to be fast
In the EXPLAIN QUERY PLAN output, SEARCH means that the database tries to look up some particular record(s) with specific values, while SCAN means that the database goes through the entire table.
This query returns all records, so the most efficient operation is a SCAN.
Either operation can be sped up with an index.
(In a SCAN, the database just goes through all index entries in order.)

SQLite data retrieve with select taking too long

I have created a table with sqlite for my corona/lua app. It's a hashtable with ~=700 000 values.The table has two columns, which are the hashcode (a string), and the value (another string). During the program I need to get data several times by providing the hashcode.
I'm using something like this code to get the data:
for p in db:nrows([[SELECT * FROM test WHERE id=']].."hashcode"..[[';]]) do
print(p)
-- p = returned value --
end
This statement is though taking insanely too much time to perform
thanks,
Edit:
Success!
the mistake was with the primare key thing.I set the hashcode as the primary key like below and the retrieve time whent to normal:
CREATE TABLE IF NOT EXISTS test (id STRING PRIMARY KEY , array);
I also prepared the statements in advance as you said:
stmt = db:prepare("SELECT * FROM test WHERE id = ?;")
[...]
stmt:bind(1,s)
for p in stmt:nrows() do
The only problem was that the db file size,that was around 18 MB, went to 29,5 MB
You should create the table with id as a unique primary key; this will automatically make an index.
create table if not exists test
(
id text primary key,
val text
);
You should not construct statements using string concatenation; this is a security issue so avoid getting in this habit. Also, you should prepare statements in advance, at program initialization, and run the prepared statements.
Something like this... initially:
hashcode_query_stmt = db:prepare("SELECT * FROM test WHERE id = ?;")
then for each use:
hashcode_query_stmt:bind_values(hashcode)
for p in hashcode_query_stmt:urows() do ... end
Ensure that there is an index on the id/hashcode column? Without one such queries will be slow, slow, slow. This index should probably be unique.
If only selecting the value/hashcode (SELECT value FROM ..), it may be beneficial to have a covering index over (id, value) as that can avoid additional seeking to the row data (see SQLite Query Planning). Try it with and without such a covering index.
Also, it may be worthwhile to employ caching if the same hashcodes are queried multiple times.
As already stated, get sure you have an index on ID.
If you can't change table schema now, you can add a index ad hoc:
CREATE INDEX test_id ON test (id);
About hashes: if you are computing hashes in your software to speed up searches, don't!
SQLite will use your supplied hashes as any regular string/blob. Also, RDBMS are optimized for efficient searching, which may be greatly improved with indexes.
Unless your hashing to save space, you are wasting processor time computing hashes in your application.

SQLite: Ordering my select results

I have a table with unique usernames and a bunch of string data I am keeping track of. Each user will have 1000 rows and when I select them I want to return them in the order they were added. Is the following code a necessary and correct way of doing this:
CREATE TABLE foo (
username TEXT PRIMARY KEY,
col1 TEXT,
col2 TEXT,
...
order_id INTEGER NOT NULL
);
CREATE INDEX foo_order_index ON foo(order_id);
SELECT * FROM foo where username = 'bar' ORDER BY order_id;
Add a DateAdded field and default it to the date/time the row was added and sort on that.
If you absolutely must use the order_ID, which I don't suggest. Then at least make it an identity column. The reason I advise against this is because you are relying on side affects to do your sorting and it will make your code harder to read.
If each user will have 1000 rows, then username should not be the primary key. One option is to use the int identity column which all tables have (which optimizes I/O reads since it's typically stored in that order).
Read under "RowIds and the Integer Primary Key" # http://www.sqlite.org/lang_createtable.html
The data for each table in SQLite is stored as a B-Tree structure
containing an entry for each table row, using the rowid value as the
key. This means that retrieving or sorting records by rowid is fast.
Because it's stored in that order in the B-tree structure, it should be fast to order by the int primary key. Make sure it's an alias for rowid though - more in that article.
Also, if you're going to be doing queries where username = 'bob', you should consider an index on the username column - especially there's going to be many users which makes the index effective because of high selectivity. In contrast, adding an index on a column with values like 1 and 0 only leads to low selectivity and renders the index very ineffective. So, if you have 3 users :) it's not worth it.
You can remove the order_id column & index entirely (unless you need them for something other than this sorting).
SQLite tables always have a integer primary key - in this case, your username column has silently been made a unique key, so the table only has the one integer primary key. The key column is called rowid. For your sorting purpose, you'll want to explicitly make it AUTOINCREMENT so that every row always has a higher rowid than older rows.
You probably want to read http://www.sqlite.org/autoinc.html
CREATE TABLE foo (
rowid INTEGER PRIMARY KEY AUTOINCREMENT,
username TEXT UNIQUE KEY,
...
Then your select becomes
select * from foo order by rowed;
One advantage of this approach is that you're re-using the index SQLite will already be placing on your table. A date or order_id column is going to mean an extra index, which is just overhead here.

How to merge N SQLite database files into one if db has the primary field?

I have a bunch of SQLite db files, and I need to merge them into one big db files.
How can I do that?
Added
Based on this, I guess those three commands should merge two db into one.
attach './abc2.db' as toMerge;
insert into test select * from toMerge.test
detach database toMerge
The problem is the db has PRIMARY KEY field, and I got this message - "Error: PRIMARY KEY must be unique".
This is the test table for the db.
CREATE TABLE test (id integer PRIMARY KEY AUTOINCREMENT,value text,goody text)
I'm just thinking off my head here... (and probably after everybody else has moved on, too).
Mapping the primary key to "NULL" should yield the wanted result (no good if you use it as foreign key somewhere else, since the key probably exists, but has different contents)
attach './abc2.db' as toMerge;
insert into test select NULL, value, goody from toMerge.test;
detach database toMerge;
actual test:
sqlite> insert into test select * from toMerge.test;
Error: PRIMARY KEY must be unique
sqlite> insert into test select NULL, value, goody from toMerge.test;
sqlite> detach database toMerge;
I'm not 100% sure, but it seems that I should read all the elements and insert the element (except the PRIMARY KEY) one by one into the new data base.

SQLITE: Unable to remove an unnamed primary key

I have a sqlite table that was originally created with:
PRIMARY KEY (`column`);
I now need to remove that primary key and create a new one. Creating a new one is easy, but removing the original seems to be the hard part. If I do
.indices tablename
I don't get the primary key. Some programs show the primary key as
Indexes: 1
[] PRIMARY
The index name is typically in the [].
Any ideas?
You can't.
PRAGMA INDEX_LIST('MyTable');
will give you a list of indices. This will include the automatically generated index for the primary key which will be called something like 'sqlite_autoindex_MyTable_1'.
But unfortunately you cannot drop this index...
sqlite> drop index sqlite_autoindex_MyTable_1;
SQL error: index associated with UNIQUE or PRIMARY KEY constraint cannot be dropped
All you can do is re-create the table without the primary key.
I the database glossary; a primary-key is a type of index where the index order is typically results in the physical ordering of the raw database records. That said any database engine that allows the primary key to be changed is likely reordering the database... so most do not and the operation is up to the programmer to create a script to rename the table and create a new one. So if you want to change the PK there is no magic SQL.
select * from sqlite_master;
table|x|x|2|CREATE TABLE x (a text, b text, primary key (`a`))
index|sqlite_autoindex_x_1|x|3|
You'll see that the second row returned from my quick hack has the index name in the second column, and the table name in the third. Try seeing if that name is anything useful.

Resources