Is it possible to (emulate?) AUTOINCREMENT on a compound-PK in Sqlite? - sqlite

According to the SQLite docs, the only way to get an auto-increment column is on the primary key.
I need a compound primary key, but I also need auto-incrementing. Is there a way to achieve both of these in SQLite?
Relevant portion of my table as I would write it in PostgreSQL:
CREATE TABLE tstage (
id SERIAL NOT NULL,
node INT REFERENCES nodes(id) NOT NULL,
PRIMARY KEY (id,node),
-- ... other columns
);
The reason for this requirement is that all nodes eventually dump their data to a single centralized node where, with a single-column PK, there would be collisions.

The documentation is correct.
However, it is possible to reimplement the autoincrement logic in a trigger:
CREATE TABLE tstage (
id INT, -- allow NULL to be handled by the trigger
node INT REFERENCES nodes(id) NOT NULL,
PRIMARY KEY (id, node)
);
CREATE TABLE tstage_sequence (
seq INTEGER NOT NULL
);
INSERT INTO tstage_sequence VALUES(0);
CREATE TRIGGER tstage_id_autoinc
AFTER INSERT ON tstage
FOR EACH ROW
WHEN NEW.id IS NULL
BEGIN
UPDATE tstage_sequence
SET seq = seq + 1;
UPDATE tstage
SET id = (SELECT seq
FROM tstage_sequence)
WHERE rowid = NEW.rowid;
END;
(Or use a common my_sequence table with the table name if there are multiple tables.)
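To check the trigger approach above, here is a minimal sketch using Python's built-in sqlite3 module. It assumes an in-memory database and leaves out the REFERENCES clause so that no nodes table is needed; the node numbers 7 and 8 are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tstage (
    id INT,              -- NULL on insert, filled in by the trigger
    node INT NOT NULL,   -- REFERENCES nodes(id) omitted in this sketch
    PRIMARY KEY (id, node)
);
CREATE TABLE tstage_sequence (seq INTEGER NOT NULL);
INSERT INTO tstage_sequence VALUES (0);
CREATE TRIGGER tstage_id_autoinc
AFTER INSERT ON tstage
FOR EACH ROW
WHEN NEW.id IS NULL
BEGIN
    UPDATE tstage_sequence SET seq = seq + 1;
    UPDATE tstage
       SET id = (SELECT seq FROM tstage_sequence)
     WHERE rowid = NEW.rowid;
END;
""")

# Insert without an id; the trigger assigns the next sequence value.
conn.executemany("INSERT INTO tstage (node) VALUES (?)", [(7,), (7,), (8,)])
trigger_rows = conn.execute("SELECT id, node FROM tstage ORDER BY id").fetchall()
print(trigger_rows)
```

Rows inserted with an explicit id bypass the trigger, so the sequence table only advances for rows that arrive with a NULL id.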

A trigger works, but is complex. More simply, you could avoid serial ids. One approach is to use a GUID. Unfortunately I couldn't find a way to have SQLite generate the GUID for you by default, so you'd have to generate it in your application. There also isn't a GUID type, but you could store it as a string or a binary blob.
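A rough sketch of the application-generated GUID idea, assuming Python's uuid module and a GUID stored as TEXT. The payload column and the insert_row helper are hypothetical names used only for this example.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tstage (id TEXT PRIMARY KEY, node INT NOT NULL, payload TEXT)"
)

def insert_row(node, payload):
    """Insert one row with an application-generated GUID and return the GUID."""
    guid = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO tstage (id, node, payload) VALUES (?, ?, ?)",
        (guid, node, payload),
    )
    return guid

guid_a = insert_row(1, "from node 1")
guid_b = insert_row(2, "from node 2")
stored_ids = [row[0] for row in conn.execute("SELECT id FROM tstage")]
print(stored_ids)
```

If a true GUID format is not required, a column default such as DEFAULT (lower(hex(randomblob(16)))) may also be worth trying; it produces 32 random hex characters rather than a formatted GUID.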
Or, perhaps there is something in your other columns that would serve as a suitable key. If you know that inserts won't happen more frequently than the resolution of your timestamp format of choice (SQLite offers several, see section 1.2), then maybe (node, timestamp_column) is a good primary key.
Or, you could use SQLite's AUTOINCREMENT, but set the starting number on each node via the sqlite_sequence table such that the generated serials won't collide. Since the rowid in SQLite is a signed 64-bit number, you could do this by generating a unique number for each node and shifting it left 32 bits, or equivalently, multiplying it by 4294967296. IP addresses are a convenient, probably unique 32-bit number, but note that the node number has to stay below 2^31 for the shifted value to fit in the signed range. Thus, the 64-bit rowid becomes effectively two concatenated numbers, NODE_ID and RECORD_ID, which will not collide unless one node generates over four billion records.
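The sqlite_sequence idea could look roughly like this. NODE_ID is a hypothetical per-node number and the payload column is invented for the example; the sketch clears any existing sqlite_sequence entry for the table before seeding it, so it works whether or not a row is already there.

```python
import sqlite3

NODE_ID = 5  # hypothetical node number; must be below 2**31

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tstage (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)"
)

# Seed the AUTOINCREMENT counter so that ids start at (NODE_ID << 32) + 1.
conn.execute("DELETE FROM sqlite_sequence WHERE name = 'tstage'")
conn.execute(
    "INSERT INTO sqlite_sequence (name, seq) VALUES ('tstage', ?)",
    (NODE_ID << 32,),
)

conn.executemany("INSERT INTO tstage (payload) VALUES (?)", [("a",), ("b",)])
offset_ids = [row[0] for row in conn.execute("SELECT id FROM tstage ORDER BY id")]

# Split each id back into (node, per-node record number).
decoded = [(i >> 32, i & 0xFFFFFFFF) for i in offset_ids]
print(offset_ids, decoded)
```

The central node can then keep a plain single-column INTEGER PRIMARY KEY and still tell which node a row came from by shifting the id right by 32 bits.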

How about...
ASSUMPTIONS
Only need uniqueness in PK, not sequential-ness
Source table has a PK
Create the central table with one extra column, the node number...
CREATE TABLE tstage (
node INTEGER NOT NULL,
id INTEGER NOT NULL, -- or whatever the source table PK is
-- ... other columns
PRIMARY KEY (node, id)
);
When you roll up the data into the centralized node, insert the number of the source node into 'node' and set 'id' to the source table's PRIMARY KEY column value...
INSERT INTO tstage (node, id, ...) VALUES (nodenumber, sourcetable_id, ...);
There's no need to maintain another autoincrementing column on the central table because the combination of node number and source table id will always be unique.
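As a small illustration of the roll-up, the following sketch loads rows from two hypothetical nodes whose local ids overlap into one central table. The node_dumps dictionary and the payload column are invented for the example.

```python
import sqlite3

central = sqlite3.connect(":memory:")
central.execute("""
    CREATE TABLE tstage (
        node INTEGER NOT NULL,
        id INTEGER NOT NULL,      -- the source table's own primary key value
        payload TEXT,
        PRIMARY KEY (node, id)
    )
""")

# Hypothetical dumps: both nodes used local ids 1 and 2.
node_dumps = {
    1: [(1, "a"), (2, "b")],
    2: [(1, "c"), (2, "d")],
}
for node, rows in node_dumps.items():
    central.executemany(
        "INSERT INTO tstage (node, id, payload) VALUES (?, ?, ?)",
        [(node, source_id, payload) for source_id, payload in rows],
    )

central_rows = central.execute(
    "SELECT node, id, payload FROM tstage ORDER BY node, id"
).fetchall()
print(central_rows)
```

The overlapping local ids do not collide because uniqueness is checked on the (node, id) pair rather than on id alone.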

Related

sqlite3 vacuum command and referenced rowid

suppose I have a database like:
CREATE TABLE top_table (
test_id PRIMARY KEY,
cmd TEXT);
CREATE TABLE job_table (
id PRIMARY KEY,
rid integer references top_table(rowid));
If I do a vacuum, would it preserve the relationship "rid integer references top_table(rowid)"? Meaning, would it either leave the top_table rowids unchanged, or would it change them and make corresponding changes to rid in job_table? I want to do the vacuum because I'm merging databases, so I read in a second table, insert its data into an existing table, and drop the second table. If the vacuum won't work properly, my next thought was to create the combined database, drop the table, do a dump and read the SQL into a new database. Or is there an easier/cleaner method that I'm unaware of?
Using :-
CREATE TABLE top_table ( test_id PRIMARY KEY, cmd TEXT);
CREATE TABLE job_table ( id PRIMARY KEY, rid integer references top_table(rowid));
Could be an issue as you are referencing the rowid directly rather than referencing an alias of the rowid. Typically an alias of the rowid would be used as per :-
CREATE TABLE top_table ( test_id INTEGER PRIMARY KEY, cmd TEXT);
CREATE TABLE job_table ( id INTEGER PRIMARY KEY, rid INTEGER REFERENCES top_table(test_id));
That is, specifying INTEGER PRIMARY KEY instead of just PRIMARY KEY (which, with no declared type, gives the column BLOB affinity) is a special case that makes the column an alias of the rowid. The rowid will then not be subject to change by the VACUUM command, as per :-
The VACUUM command may change the ROWIDs of entries in any tables that do not have an explicit INTEGER PRIMARY KEY.
SQLITE -CREATE TABLE - ROWID's and the INTEGER PRIMARY KEY
SQLite - VACUUM
I'm somewhat new at this, but strictly speaking, am I using foreign keys? I'm not declaring it to be a foreign key, and "pragma foreign_keys" returns 0
No, but only because foreign key enforcement is off (it hasn't been turned on), so the REFERENCES top_table(test_id) clause has no effect other than being parsed.
Assuming that you programmatically maintain the referential integrity, this should not be an issue.
It can be advantageous to turn FOREIGN KEY support on: not only will referential integrity be enforced, but the ON UPDATE and ON DELETE actions can then be used, which can simplify handling updates and deletions. For example, with the CASCADE option, deleting a parent will a) work without a conflict and b) cascade the deletion so that the child rows are deleted automatically (if they can be).
SQLite - Enabling Foreign Key Support
SQLite - ON DELETE and ON UPDATE Actions
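A short sketch of what turning enforcement on changes, using Python's sqlite3 module. The ON DELETE CASCADE clause and the sample values are additions for illustration and are not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
conn.executescript("""
CREATE TABLE top_table (test_id INTEGER PRIMARY KEY, cmd TEXT);
CREATE TABLE job_table (
    id INTEGER PRIMARY KEY,
    rid INTEGER REFERENCES top_table(test_id) ON DELETE CASCADE
);
INSERT INTO top_table (test_id, cmd) VALUES (1, 'build');
INSERT INTO job_table (rid) VALUES (1);
""")

# A child row pointing at a missing parent is rejected.
try:
    conn.execute("INSERT INTO job_table (rid) VALUES (99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the parent removes its children through the CASCADE action.
conn.execute("DELETE FROM top_table WHERE test_id = 1")
remaining_jobs = conn.execute("SELECT COUNT(*) FROM job_table").fetchone()[0]
print(orphan_rejected, remaining_jobs)
```

The PRAGMA has to be issued on every new connection, since the setting is not stored in the database file.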
regarding the comment
test_id is a string, which can be relatively long (60-80 characters or so). the original version of the schema had test_id as a member of both tables. the version that I've shown above (with a bunch of fields removed from both tables for this question) was a check on how much smaller the database got if I switched matching long text strings to matching the rowid which seemed to make more sense than adding a field that serves the same purpose)
Using a string for referencing will not only cost space (in the longer term) but will also carry greater overheads than using an alias of the rowid.
In the longer term, because SQLite saves data in chunks (pages). A table will (by default) take up at least 4k, and then 8k ....
First, space-wise, not only will the extra 52-80 bytes waste space, but the parent's column should also be unique. If UNIQUE were specified then there is the additional overhead of an index.
However, an alias of the rowid takes 8 bytes at most for the integer, which is used anyway unless the table is defined using WITHOUT ROWID. If I recall, there is an extra byte for the flag that the column is an alias of the rowid. The rowid, and therefore the alias, has to be unique and the index (as such) already exists. Furthermore, because it is central to SQLite, accessing by rowid can be twice as fast.
In short an alias of the rowid is probably the best option in many (probably most) cases for relationships.

How do FK:PK relations work in Sqlite?

I am using the DB Browser for SQLite to try and figure this out. I've opened Northwind.sqlite and in it it shows me the following for a table:
CREATE TABLE `Order Details` (
`OrderID` int,
`ProductID` int,
`UnitPrice` float ( 26 ),
`Quantity` int,
`Discount` float ( 13 ),
PRIMARY KEY(`OrderID`,`ProductID`)
);
However, in the Sql Server Northwind OrderID and ProductID are foreign keys, not primary keys. Does this work differently in SQLite? And if so, how do the relationships work?
thanks - dave
The above will create a table that has no FOREIGN keys but 2 indexes.
One a largely hidden index according to rowid.
The other, PRIMARY KEY(OrderID,ProductID) will be an index according to the combination of OrderId and ProductID.
some things about rowid (aka id)
rowid is an automatically created column called rowid (it can also be referenced using oid or _rowid_ (case independent)) and, if present, is really the primary key.
rowid will be a unique signed integer using up to 64 bits. The first automatically assigned value will be 1 (lower values, including negative ones, only occur if you supply them explicitly), the highest value being 9223372036854775807.
From SQLite 3.8.2 onwards, the WITHOUT ROWID keyword is available to suppress the rowid column/index (your Order Details table may benefit from being a WITHOUT ROWID table).
If a column is defined with the type INTEGER PRIMARY KEY or INTEGER PRIMARY KEY AUTOINCREMENT then that column (there can only be 1 such column per table) is an alias for the rowid column.
AUTOINCREMENT introduces a rule that when inserting a row the rowid must be greater than any that exist or existed.
It DOES NOT guarantee that the rowid will increase by exactly 1 each time (gaps are possible), although generally the id will (even without AUTOINCREMENT (perhaps the most misused/misunderstood keyword in SQLite)).
Without AUTOINCREMENT, SQLite normally uses one more than the largest rowid currently in the table (so the rowid of a deleted last row can be reused), and it only looks for an unused lower rowid once a rowid of 9223372036854775807 has been reached.
With AUTOINCREMENT, if a rowid of 9223372036854775807 has been reached, the insert fails with an SQLITE_FULL error.
AUTOINCREMENT results in overheads (e.g. a table named sqlite_sequence is then maintained recording the highest given sequence number). The documentation recommends that it not be used unless required, which is rarely the case.
Some limited testing I did resulted in an 8-12% greater processing time for AUTOINCREMENT. See: What are the overheads of using AUTOINCREMENT for SQLite on Android?
For more about rowid see SQLite Autoincrement and also Clustered Indexes and the WITHOUT ROWID Optimization
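The difference between a plain INTEGER PRIMARY KEY and AUTOINCREMENT can be seen with a small experiment. This is only a sketch; the table names plain and autoinc and the helper function are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plain   (id INTEGER PRIMARY KEY, v TEXT);
CREATE TABLE autoinc (id INTEGER PRIMARY KEY AUTOINCREMENT, v TEXT);
""")

def next_id_after_deleting_last(table):
    """Insert three rows, delete the newest, and return the id of the next insert."""
    for value in ("a", "b", "c"):
        conn.execute(f"INSERT INTO {table} (v) VALUES (?)", (value,))
    conn.execute(f"DELETE FROM {table} WHERE id = 3")
    return conn.execute(f"INSERT INTO {table} (v) VALUES ('d')").lastrowid

plain_next = next_id_after_deleting_last("plain")      # reuses the freed id
autoinc_next = next_id_after_deleting_last("autoinc")  # never reuses an id
print(plain_next, autoinc_next)
```

Both tables hand out ids automatically; AUTOINCREMENT only matters when a previously used id must never come back.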
Coding PRIMARY KEY (if not on an INTEGER column, i.e. not an alias of rowid) implies a UNIQUE constraint. It does not say or check that the value, or any of the values in the key, exists in any other table.
Note that NULL is not considered equal to another NULL, so in your Order Details table it is possible to have rows where either or both of the key columns are NULL, even repeatedly.
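The NULL behaviour can be checked with a cut-down version of the table. This sketch renames the table to order_details and keeps only three columns for brevity; the inserted values are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_details (
        OrderID int,
        ProductID int,
        Quantity int,
        PRIMARY KEY (OrderID, ProductID)
    )
""")

# Repeated NULL keys are all accepted in an ordinary (rowid) table.
conn.executemany(
    "INSERT INTO order_details VALUES (?, ?, ?)",
    [(None, None, 1), (None, None, 2), (1, None, 3), (1, None, 4)],
)
null_key_rows = conn.execute("SELECT COUNT(*) FROM order_details").fetchone()[0]

# A repeated non-NULL key pair is rejected by the implied UNIQUE constraint.
try:
    conn.execute("INSERT INTO order_details VALUES (1, 2, 5)")
    conn.execute("INSERT INTO order_details VALUES (1, 2, 6)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(null_key_rows, duplicate_rejected)
```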
Coding a FOREIGN KEY introduces a constraint that the referenced value(s) must exist in the respective table/column. Additionally :-
Usually, the parent key of a foreign key constraint is the primary key
of the parent table. If they are not the primary key, then the parent
key columns must be collectively subject to a UNIQUE constraint or
have a UNIQUE index. If the parent key columns have a UNIQUE index,
then that index must use the collation sequences that are specified in
the CREATE TABLE statement for the parent table.
SQLite Foreign Key Support
Considering all of this you may want to make some changes to the Order Details table :-
You could make it a WITHOUT ROWID table.
You could make both the OrderID and the ProductID columns NOT NULL.
You could add FOREIGN KEYs to both the OrderID and the ProductID columns.
So perhaps you could have :-
CREATE TABLE `Order Details` (
`OrderID` int NOT NULL REFERENCES `Orders` (`OrderId`), -- ADDED NOT NULL and FKEY
`ProductID` int NOT NULL REFERENCES `Products`(`ProductId`) , -- ADDED NOT NULL and FKEY
`UnitPrice` float ( 26 ),
`Quantity` int,
`Discount` float ( 13 ),
PRIMARY KEY(`OrderID`,`ProductID`)
)
WITHOUT ROWID -- ADDED WITHOUT ROWID
;
The above uses column constraints
Alternately, utilising TABLE constraints, you could do :-
CREATE TABLE `Order Details` (
`OrderID` int NOT NULL, -- ADDED NOT NULL
`ProductID` int NOT NULL, -- ADDED NOT NULL
`UnitPrice` float ( 26 ),
`Quantity` int,
`Discount` float ( 13 ),
PRIMARY KEY(`OrderID`,`ProductID`),
FOREIGN KEY (`OrderId`) REFERENCES `Orders`(`OrderId`), -- ADDED FKEY AS TABLE CONSTRAINT
FOREIGN KEY (`ProductID`) REFERENCES `Products`(`ProductID`) -- ADDED FKEY AS TABLE CONSTRAINT
)
WITHOUT ROWID -- ADDED WITHOUT ROWID
;
Both have the same outcome, the only difference being where the FOREIGN KEY constraints are defined.
Both of the above assume that the referenced tables are Orders and Products.
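To see the revised table in action, here is a sketch with stand-in Orders and Products tables reduced to their key columns (an assumption, since the real Northwind tables have many more columns). Foreign key enforcement is switched on explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY);      -- simplified stand-in
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY);  -- simplified stand-in
CREATE TABLE `Order Details` (
    `OrderID` int NOT NULL REFERENCES Orders (OrderID),
    `ProductID` int NOT NULL REFERENCES Products (ProductID),
    `Quantity` int,
    PRIMARY KEY (`OrderID`, `ProductID`)
) WITHOUT ROWID;
INSERT INTO Orders VALUES (1);
INSERT INTO Products VALUES (10);
INSERT INTO `Order Details` VALUES (1, 10, 3);
""")

def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

missing_parent = rejected("INSERT INTO `Order Details` VALUES (2, 10, 1)")
null_key = rejected("INSERT INTO `Order Details` VALUES (NULL, 10, 1)")
print(missing_parent, null_key)
```

With these changes the table refuses both an order that does not exist and a NULL key column, which the original definition would have accepted.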

Can someone give me a PK insert sample?

So I'm making things complicated ...I think. A primary key basically is to make the row unique. Is that correct? Anyone want to show me an insert statement with the values for PK?
The SQLite documentation says:
On an INSERT, if the ROWID or INTEGER PRIMARY KEY column is not
explicitly given a value, then it will be filled automatically with an
unused integer, usually one more than the largest ROWID currently in
use. This is true regardless of whether or not the AUTOINCREMENT
keyword is used.
So, on a table like
CREATE TABLE test(id INTEGER PRIMARY KEY, descr TEXT);
an insert with a valid id could be
INSERT INTO test(descr) VALUES('this is a test');
A primary key, also called a primary keyword, is a key in a relational database that is unique for each record. It is a unique identifier, such as a driver license number, telephone number (including area code), or vehicle identification number (VIN). A table can have at most one primary key.
If you are using CREATE TABLE and creating the primary key on a single field, you can use:
CREATE TABLE mytable (
field1 TEXT,
field2 INTEGER PRIMARY KEY,
field3 BLOB
);
Reference more at: https://www.sqlite.org/lang_createtable.html & http://sqlite.org/faq.html#q11
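Putting the two answers together, a sketch of inserts with and without an explicit primary key value might look like this (Python's sqlite3 module; the descr strings are arbitrary).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, descr TEXT)")

# Let SQLite pick the key.
auto_id = conn.execute(
    "INSERT INTO test (descr) VALUES ('this is a test')"
).lastrowid

# Supply the key explicitly.
explicit_id = conn.execute(
    "INSERT INTO test (id, descr) VALUES (10, 'explicit key')"
).lastrowid

# The next automatic key continues after the largest one in use.
next_auto_id = conn.execute(
    "INSERT INTO test (descr) VALUES ('after explicit')"
).lastrowid

# Re-using an existing key violates the primary key constraint.
try:
    conn.execute("INSERT INTO test (id, descr) VALUES (10, 'duplicate key')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(auto_id, explicit_id, next_auto_id, duplicate_rejected)
```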

SQLite: Ordering my select results

I have a table with unique usernames and a bunch of string data I am keeping track of. Each user will have 1000 rows and when I select them I want to return them in the order they were added. Is the following code a necessary and correct way of doing this:
CREATE TABLE foo (
username TEXT PRIMARY KEY,
col1 TEXT,
col2 TEXT,
...
order_id INTEGER NOT NULL
);
CREATE INDEX foo_order_index ON foo(order_id);
SELECT * FROM foo where username = 'bar' ORDER BY order_id;
Add a DateAdded field and default it to the date/time the row was added and sort on that.
If you absolutely must use the order_ID, which I don't suggest, then at least make it an identity column. The reason I advise against this is that you are relying on side effects to do your sorting, and it will make your code harder to read.
If each user will have 1000 rows, then username should not be the primary key. One option is to use the int identity column which all tables have (which optimizes I/O reads since it's typically stored in that order).
Read under "RowIds and the Integer Primary Key" at http://www.sqlite.org/lang_createtable.html
The data for each table in SQLite is stored as a B-Tree structure
containing an entry for each table row, using the rowid value as the
key. This means that retrieving or sorting records by rowid is fast.
Because it's stored in that order in the B-tree structure, it should be fast to order by the int primary key. Make sure it's an alias for rowid though - more in that article.
Also, if you're going to be doing queries where username = 'bob', you should consider an index on the username column, especially if there are going to be many users, which makes the index effective because of high selectivity. In contrast, adding an index on a column with values like 1 and 0 only leads to low selectivity and renders the index very ineffective. So, if you have 3 users :) it's not worth it.
You can remove the order_id column & index entirely (unless you need them for something other than this sorting).
SQLite tables always have an integer primary key - in this case, your username column has silently been made a unique key, so the table only has the one integer primary key. The key column is called rowid. For your sorting purpose, you'll want to explicitly make it AUTOINCREMENT so that every row always has a higher rowid than older rows.
You probably want to read http://www.sqlite.org/autoinc.html
CREATE TABLE foo (
rowid INTEGER PRIMARY KEY AUTOINCREMENT,
username TEXT UNIQUE,
...
Then your select becomes
select * from foo order by rowid;
One advantage of this approach is that you're re-using the index SQLite will already be placing on your table. A date or order_id column is going to mean an extra index, which is just overhead here.
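A sketch of ordering by insertion using the integer primary key. It assumes username is not unique (since each user has many rows), names the key column id, and adds an index on username for the lookup; all of these are choices made for the example rather than part of the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE foo (
    id INTEGER PRIMARY KEY AUTOINCREMENT,  -- alias of rowid, grows with each insert
    username TEXT NOT NULL,
    col1 TEXT
);
CREATE INDEX foo_username_index ON foo(username);
""")

conn.executemany(
    "INSERT INTO foo (username, col1) VALUES (?, ?)",
    [("bar", "first"), ("baz", "other"), ("bar", "second"), ("bar", "third")],
)

# Rows for one user, in the order they were added.
bar_rows = [
    row[0]
    for row in conn.execute(
        "SELECT col1 FROM foo WHERE username = ? ORDER BY id", ("bar",)
    )
]
print(bar_rows)
```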

Can I use anything other than BIGINT as Primary Key data type in SQLite?

I was psyched about the possibility of using SQLite as a database solution during development so that I could focus on writing the code first and dynamically generating the db at runtime using NHibernate's SchemaExport functionality. However, I'm running into a few issues, not the least of which is that it seems that SQLite requires me to use Int64 for my primary keys (vs, say, Int32 or Guid). Is there any way around this?
Note: I should specify that this is in the context of an app using NHibernate. It is not strictly speaking the case that one can't create a table in SQLite with an INT datatype, but the behavior when you save and retrieve the data seems to indicate that it's being stored and/or retrieved as Int64.
SQLite will let you use any field in your table as a PRIMARY KEY. Doing so will implicitly create a UNIQUE index on the field. This is then the field that you, as a developer, can consider to be the primary unique identifier for the row. It can be any supported SQLite data type (below).
SQLite will always create an implicit internal numeric identifier for every table. It will have several aliases including RowID, OID, and _ROWID_. If you create your primary key as INTEGER PRIMARY KEY then it will use the same field as your primary key and SQLite's internal numeric identifier.
SQLite doesn't have a concept of Int32 or Int64 or Guid data types. It stores values using five storage classes: NULL, INTEGER, REAL, TEXT, and BLOB. When you run DDL against SQLite, if you use anything other than these type names, SQLite will use a set of rules to determine which type affinity to use. Basically, any declared type containing "INT", such as Int32 or Int64, gets INTEGER affinity, so they all end up doing the exact same thing.
Even once you've created the tables with the data types you mentioned for each field, all you set is the type affinity for that field. SQLite does not enforce data types. Any data can be put into any field regardless of the declared type. SQLite will use the type affinity to convert data if possible, so if you insert '123' as a text string into an INT field, it will store it as the number 123.
The only exception to the type affinity is INTEGER PRIMARY KEY fields. Those must be integers.
Integers in SQLite are always stored with a variable length field. So depending on the size of the integer, you may actually get an Int32 back for some rows and an Int64 for others, all within the same field. This depends on the wrapper you're using, in this case NHibernate (I guess with System.Data.SQLite).
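The affinity rules described above can be observed with typeof(). This is a small sketch with made-up column names; the declared types INT32 and GUID are used only to show how SQLite maps them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pk INTEGER PRIMARY KEY, n INT32, g GUID)")

# Text that looks like a number is converted; other text is stored as-is.
conn.execute("INSERT INTO t (n, g) VALUES ('123', 'abc')")
conn.execute("INSERT INTO t (n, g) VALUES ('not a number', 42)")
stored_types = conn.execute(
    "SELECT typeof(n), typeof(g) FROM t ORDER BY pk"
).fetchall()

# The INTEGER PRIMARY KEY column is the exception: non-integers are refused.
try:
    conn.execute("INSERT INTO t (pk) VALUES ('abc')")
    pk_rejected = False
except sqlite3.Error:
    pk_rejected = True
print(stored_types, pk_rejected)
```

The declared type only nudges how values are stored, which is why the width a wrapper reports for an integer column can differ from what the DDL suggests.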
It does not require you to use Int64; however, it is possible that it only allows that when you specify a numeric primary key. Because SQLite doesn't enforce referential integrity by default (foreign key support was added in version 3.6.19 and has to be switched on with PRAGMA foreign_keys), all primary key means is "make this column unique and create an index on it". There isn't much special about it. You can certainly use varchar or text for a primary key. For example, this works:
create table t_test (
theID varchar(36) primary key,
nm varchar(50)
)
in the above you could use theID to store a guid in text form.
More info can be found here: http://www.sqlite.org/lang_createtable.html#rowid
@weenet ... per your comments, the following code works just fine.
i think you need to post your code if you're still having troubles.
create table t_test2 (
theID int32 primary key,
nm varchar(50)
);
insert into t_test2 (theID, nm) values (1, 'don');
insert into t_test2 (theID, nm) values (2, 'weenet');
select * from t_test2;
additionally, this code works fine (varchar as a primary key):
create table t_test (
theID varchar(36) primary key,
nm varchar(50)
);
insert into t_test (theID, nm) values ('abcdefg', 'don');
insert into t_test (theID, nm) values ('hijklmnop', 'weenet');
select * from t_test;
