Convert SQLite ROWID to primary key - sqlite

I have a large table without a primary key. Instead, I've been using the built-in ROWID as one.
Problem: now I can't VACUUM my database, because:
The VACUUM command may change the ROWIDs of entries in any tables that do not have an explicit INTEGER PRIMARY KEY.
How can I convert the ROWID to an INTEGER PRIMARY KEY, preserving existing ROWIDs and their behavior (or otherwise get VACUUM to not mess up ROWIDs)?

Related

Do I need to create index for PK in SQLite tables?

If I have a single column integer PK on a SQLite table - do I need to create explicit index or is it take care of by the engine?
Same question for string/text single field PK
From UNIQUE constraints:
In most cases, UNIQUE and PRIMARY KEY constraints are implemented by
creating a unique index in the database. (The exceptions are INTEGER
PRIMARY KEY and PRIMARY KEYs on WITHOUT ROWID tables.)
If a column is defined as INTEGER PRIMARY KEY it is actually an alias of the rowid of the column:
The data for rowid tables is stored as a B-Tree structure containing
one entry for each table row, using the rowid value as the key. This
means that retrieving or sorting records by rowid is fast. Searching
for a record with a specific rowid, or for all records with rowids
within a specified range is around twice as fast as a similar search
made by specifying any other PRIMARY KEY or indexed value.
The exception mentioned above is that if the declaration of a column with
declared type "INTEGER" includes an "PRIMARY KEY DESC" clause, it does
not become an alias for the rowid and is not classified as an integer
primary key.

sqlite3 vacuum command and referenced rowid

suppose I have a database like:
CREATE TABLE top_table (
test_id PRIMARY KEY,
cmd TEXT);
CREATE TABLE job_table (
id PRIMARY KEY,
rid integer references top_table(rowid));
If I do a vacuum, would it preserve the relationship "rid integer references top_table(rowid)"? Meaning, would it either leave the top_table rowids unchanged, or would it change them and make corresponding changes to rid in job_table? I want to do the vacuum because I'm merging databases, so I read in a second table, insert its data into an existing table, and drop the second table. If the vacuum won't work properly, my next thought was to create the combined database, drop the table, do a sump and read the sql into a new database. Or is there an easier/cleaner method that I'm unaware of?
Using :-
CREATE TABLE top_table ( test_id PRIMARY KEY, cmd TEXT);
CREATE TABLE job_table ( id PRIMARY KEY, rid integer references top_table(rowid));
Could be an issue as you are referencing the rowid directly rather than referencing an alias of the rowid. Typically an alias of the rowid would be used as per :-
CREATE TABLE top_table ( test_id INTEGER PRIMARY KEY, cmd TEXT);
CREATE TABLE job_table ( id INTEGER PRIMARY KEY, rid INTEGER REFERENCES top_table(test_id));
That is specifying INTEGER PRIMARY KEY, instead of PRIMARY KEY (column affinity NUMERIC) is a special case which makes the column being defined an alias of the rowid, the rowid will then not be subject to change by the VACUUM command as per :-
The VACUUM command may change the ROWIDs of entries in any tables that do not have an explicit INTEGER PRIMARY KEY.
SQLITE -CREATE TABLE - ROWID's and the INTEGER PRIMARY KEY
SQLite - VACUUM
I'm somewhat new at this, but strictly speaking, am I using foreign keys? I'm not declaring it to be a foreign key, and "pragma foreign_keys" returns 0
No BUT only because Foreign KEY enforcement is off (hasn't been turned on), thus the REFERENCES top_table(test_id) has no effect, other than being parsed.
Assuming that you programmatically maintain the referential integrity this should not be an issue.
It can be advantageous to have FOREIGN KEY support on as not only will referential integrity be enforced, the ON UPDATE and ON DELETE actions can then be utilised which can simplify handling updates and deletions by using the CASCADE option (e.g. deleting a parent will a) work without a conflict and b) cascade the deletion so that the children rows will be deleted automatically (if they can be))
SQLite - Enabling Foreign Key Support
SQLite - ON DELETE and ON UPDATE Actions
regarding the comment
test_id is a string, which can be relatively long (60-80 characters or so). the original version of the schema had test_id as a member of both tables. the version that I've shown above (with a bunch of fields removed from both tables for this question) was a check on how much smaller the database got if I switched matching long text strings to matching the rowid which seemed to make more sense than adding a field that serves the same purpose)
In the case where a string is used for referencing will not only save space (in the longer term) BUT the overheads will also be greater than using an alias of the rowid.
In the longer term, because SQLite save chunks (pages). A table will (by default) takes up at least 4k and then 8k ....
First, space wise, not only will the extra 52-80 bytes waste space the parent's column should also be unique. If UNIQUE were specified then there is the additional overhead of an index.
However, with an alias of the rowid, then 8 bytes max for the integer, which is used anyway unless the table is defined using WITHOUT ROWID. If I recall there is an extra byte for the flag that the column is an alias of the rowid. The rowid and therefore alias has to be unique and the index (as such) exists. Furthermore due to it being central to SQLite accessing by rowid can be twice as fast.
In short an alias of the rowid is probably the best option in many (probably most) cases for relationships.

Autoincrement in SQLite tables

I create a table, lets name it CUSTOMERS in SQLite:
CREATE TABLE "CUSTOMERS" (
"tel" INTEGER NOT NULL,
"customer" VARCHAR ,
);
When I see the table from a GUI (I use SQLite Manager from Firefox), I noticed that there is an extra column rowid which is working like auto-increment. My question is, in tables where I don't use a primary key should I specify a column like:
ROWID INTEGER PRIMARY KEY AUTOINCREMENT
If I execute this query PRAGMA table_info(CUSTOMERS); I get only the two columns tel,customer.
Sqlite usually adds a rowid automatically as #laato linked in the comments SqLite : ROWID
That can be removed, but does not need to be specified. So there is no need to add it to the Create Table.
The hidden rowid allows delete's to be targetted at a single row, bu
t if you are using the ROWID as a specific foreign key, it would be better to name a column explicitly. That will then become a synonym with the rowid.
You can see here, as it was commented in your question.
However, if you are in a dilemma what to choose between these options, SQLite recommends that you should not use AUTOINCREMENT attribute because:
The AUTOINCREMENT keyword imposes extra CPU, memory, disk space, and
disk I/O overhead and should be avoided if not strictly needed. It is
usually not needed.
More info you can read here.

Can someone give me a PK insert sample?

So I'm making things complicated ...I think. A primary key basically is to make the row unique. Is that correct? Anyone want to show me an insert statement with the values for PK?
The SQLite documentation says:
On an INSERT, if the ROWID or INTEGER PRIMARY KEY column is not
explicitly given a value, then it will be filled automatically with an
unused integer, usually one more than the largest ROWID currently in
use. This is true regardless of whether or not the AUTOINCREMENT
keyword is used.
So, on a table like
CREATE TABLE test(id INTEGER PRIMARY KEY, descr TEXT);
an insert with a valid id could be
INSERT INTO test(descr) VALUES('this is a test');
A primary key, also called a primary keyword, is a key in a relational database that is unique for each record. It is a unique identifier, such as a driver license number, telephone number (including area code), or vehicle identification number (VIN). A relational database must always have one and only one primary key.
if you are using CREATE TABLE, if you are creating the primary key on a single field, you can use:
CREATE TABLE mytable (
field1 TEXT,
field2 INTEGER PRIMARY KEY,
field3 BLOB,
);
Reference more at: https://www.sqlite.org/lang_createtable.html & http://sqlite.org/faq.html#q11

Create SQL table allow NULL(unique key?)

This is my code:
CREATE TABLE table(
id int(1) UNIQUE,
name varchar(128) NOT NULL)
ALTER TABLE table ADD UNIQUE KEY id (id);
I'm not sure if I need the second statement or not, but either way it doesn't work when inserting INSERT INTO table( name ) VALUES( "test" ). It says table.id may not be NULL. What do I need to do to make it work with NULL(and Auto Increment?)
Replacing the first column with id int(1) NOT NULL AUTO_INCREMENT, makes it say syntax error.
The column must be an INTEGER PRIMARY KEY for autoincrement behavior.
The AUTOINCREMENT keyword imposes extra CPU, memory, disk space, and disk I/O overhead and should be avoided if not strictly needed. It is usually not needed.
In SQLite, a column with type INTEGER PRIMARY KEY is an alias for the ROWID (except in WITHOUT ROWID tables) which is always a 64-bit signed integer
On an INSERT, if the ROWID or INTEGER PRIMARY KEY column is not explicitly given a value, then it will be filled automatically with an unused integer ..
If the AUTOINCREMENT keyword appears after INTEGER PRIMARY KEY, that changes the automatic ROWID assignment algorithm to prevent the reuse of ROWIDs over the lifetime of the database. In other words, the purpose of AUTOINCREMENT is to prevent the reuse of ROWIDs from previously deleted rows.
Unique is insufficient to trigger this behavior; and using AUTOINCREMENT is required to ensure a monotonic increment scheme.
Thus, to expose the surrogate ROWID without a guarantee that the ID will never be reused, use:
id INTEGER PRIMARY KEY
To ensure the ID is never reused over the table/database lifetime (which is the MySQL autoincrement and SQL Server identity column behavior), use:
id INTEGER PRIMARY KEY AUTOINCREMENT
No other types or can be used for this behavior - this includes INT, even though it ends up with the same affinity.

Resources