How is rowid determined on sqlite insertion? - sqlite

I'd like to predict (reverse engineer, really) the rowid of any to-be-inserted row in a SQLite table, in order to reconstruct a stream of SQLite insertions in which the rowids of some tables are used as foreign keys in other tables. The insertion may happen after an arbitrary sequence of insertions and deletions. How does SQLite determine the rowid on insertion?
Is it an ever incrementing counter?
int64_t next_rowid() {
    static int64_t r = 0;
    return ++r;
}
Maybe the smallest row not in use?
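// Algorithm sketch: reuse the smallest deleted rowid, otherwise grow the counter
#include <cstdint>
#include <set>
static std::set<int64_t> deleted;  // rowids freed by deletions, kept sorted
static int64_t top = 0;            // highest rowid handed out so far
int64_t next_rowid() {
    if (deleted.empty()) return ++top;
    int64_t r = *deleted.begin();  // smallest free rowid
    deleted.erase(deleted.begin());
    return r;
}
void delete_rowid(int64_t r) {
    deleted.insert(r);
}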
Some other scheme?
Unspecified?

See https://sqlite.org/autoinc.html
SQLite handles one write at a time, so in most cases it effectively performs SELECT max(id) + 1 FROM the_table. From that perspective it is really hard to tell what the original sequence of operations was. You can, however, construct a valid sequence by treating deleted rows as if they had never been present. Or maybe I missed something.
Edit
As CL spotted, AUTOINCREMENT behaves in a more stable way: you can't get the same id twice, so a gap in the ids shows that something was deleted in the meantime.
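A minimal sketch of the max(rowid) + 1 behaviour for a table without AUTOINCREMENT, using Python's built-in sqlite3 module and an in-memory database. The table `t` and the helper `insert_row` are hypothetical names chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: INTEGER PRIMARY KEY is an alias for rowid (no AUTOINCREMENT).
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")

def insert_row(value):
    """Insert one row and return the rowid SQLite assigned to it."""
    return conn.execute("INSERT INTO t (v) VALUES (?)", (value,)).lastrowid

first = [insert_row(x) for x in ("a", "b", "c")]  # rowids 1, 2, 3

# Deleting a row in the middle leaves a gap that is not refilled.
conn.execute("DELETE FROM t WHERE id = 2")
after_gap = insert_row("d")  # 4, i.e. max(rowid) + 1

# Deleting the row with the highest rowid makes that rowid available again.
conn.execute("DELETE FROM t WHERE id = ?", (after_gap,))
reused = insert_row("e")  # 4 again

print(first, after_gap, reused)
```

So for such a table, replaying the inserts and deletes in order is enough to predict every rowid, as long as no row is inserted with an explicit rowid and the maximum rowid is never reached.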

First, there are two rowid determination algorithms, depending on whether or not AUTOINCREMENT has been specified.
With AUTOINCREMENT, the rowid is guaranteed to increase, up to the largest possible value (9223372036854775807). Once that value has been reached, any subsequent insert attempt fails with an SQLITE_FULL error.
Without AUTOINCREMENT, in the same situation the algorithm will try to find an unused rowid, so the resulting rowid may be lower than other existing rowids.
Neither algorithm guarantees an increment of exactly 1; they will usually increment by 1, but not always.
AUTOINCREMENT results in a table named sqlite_sequence being created; the last used rowid is held in its seq column. Note that it can be manipulated: add one record, change seq to 100, and the next insert will likely get rowid 101.
The name column holds the name of the table that the row is for.
As a test, I changed the name column to a non-existent table name (the last sequence value was 101). Inserting a record still resulted in rowid 102, so it would appear that in the absence of the respective row in sqlite_sequence the algorithm still finds a higher rowid.
I then lowered the sequence to 2; the next rowid was 103.
So the guarantee of a higher rowid seems to be thorough.
Next I added a second row to sqlite_sequence for the same table, with a sequence value of 600. The insert came up with a rowid of 104.
As SQLite possibly selects the first row according to id, I then changed the id of the original row from 2 to 20 (1 is the row that was changed to a non-existent table name, and 3 is the rowid of the rogue second entry). The inserted rowid was 601.
In an attempt to fool SQLite, I deleted the newly added row from the table, as well as the sqlite_sequence row with a rowid of 3 and a sequence value of 601. SQLite was fooled: the rowid of the next inserted row was 105.
As such, the algorithms appear to be along these lines:
a) Where AUTOINCREMENT isn't specified: 1 greater than the highest rowid in the table into which the row is being inserted, unless that highest rowid is already 9223372036854775807, in which case an unused rowid is sought.
b) Where AUTOINCREMENT is specified: 1 greater than the greater of the highest rowid in the table into which the row is being inserted and the sequence stored in the first row for that table in sqlite_sequence. Note that sqlite_sequence may be updated even though the insert does not take place, e.g. if the insert fails due to a constraint.
Much of the above is based upon this
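Part of the experiment above can be reproduced with a short script. This is a sketch using Python's built-in sqlite3 module; the table `t` and helper `insert_row` are hypothetical names, and only the simple cases are covered (one sqlite_sequence row per table, no rogue entries).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with AUTOINCREMENT, so SQLite tracks it in sqlite_sequence.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, v TEXT)")

def insert_row(value):
    """Insert one row and return the rowid SQLite assigned to it."""
    return conn.execute("INSERT INTO t (v) VALUES (?)", (value,)).lastrowid

ids = [insert_row(x) for x in ("a", "b", "c")]  # 1, 2, 3

# Unlike a plain rowid table, deleting the highest row does not free its rowid.
conn.execute("DELETE FROM t WHERE id = 3")
after_delete = insert_row("d")  # 4

seq = conn.execute("SELECT seq FROM sqlite_sequence WHERE name = 't'").fetchone()[0]

# Raising seq by hand moves the next rowid forward.
conn.execute("UPDATE sqlite_sequence SET seq = 100 WHERE name = 't'")
after_bump = insert_row("e")  # 101

# Lowering seq does not help: the highest rowid in the table still wins.
conn.execute("UPDATE sqlite_sequence SET seq = 2 WHERE name = 't'")
after_lower = insert_row("f")  # 102

print(ids, after_delete, seq, after_bump, after_lower)
```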

Related

Teradata: MULTISET vs SET Table performance

So SET tables involve a performance impact, as for every new row inserted or updated in the table, Teradata checks if a duplicate row already exists in the table, which can be a serious issue if there is a large number of records. However, we can improve its performance by defining a unique constraint on any column in the SET table, such as a Unique Primary Index (UPI). This helps us avoid the additional overhead of performing the duplicate row check, as the UPI guarantees that there will be no duplicate rows.
Does this mean that a SET table with a UPI/USI will have the same performance as a MULTISET table with a UPI/USI? Please explain.
And if your table has a Unique Primary Index, should you create it as a SET or a MULTISET table?
There will be no performance difference between a SET and a MULTISET table with a UPI. There is only a difference regarding Insert/Select: a SET table silently ignores duplicate rows (the number of rows inserted is less than the number of rows selected), while a MULTISET table throws an error (duplicate row checks are done before checking for uniqueness).
But adding a USI will not prevent duplicate row checks: a new row has to be inserted into the base table first to create its ROWID before it is inserted into the USI subtable.

Limiting the number of rows a table can contain based on the value of a column - SQLite

Since SQLite doesn't support TRUE and FALSE, I have a boolean column that stores 0 and 1. For the boolean column in question, I want to check the number of 1's the column contains and limit that total for the table.
For example, the table can have columns: name, isAdult. If there are already 5 adults in the table, the system would not allow a user to add a 6th entry with isAdult = 1. There is no restriction on how many rows the table can contain, since there is no limit on the number of entries where isAdult = 0.
You can use a trigger to prevent inserting the sixth entry:
CREATE TRIGGER five_adults
BEFORE INSERT ON MyTable
WHEN NEW.isAdult
 AND (SELECT COUNT(*)
      FROM MyTable
      WHERE isAdult
     ) >= 5
BEGIN
    SELECT RAISE(FAIL, 'only five adults allowed');
END;
(You might need a similar trigger for UPDATEs.)
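A quick way to check the trigger is to run it against an in-memory database. The sketch below uses Python's built-in sqlite3 module; the column definitions of MyTable are an assumption, since the question only names the columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed schema: the question only mentions the columns name and isAdult.
CREATE TABLE MyTable (name TEXT, isAdult INTEGER NOT NULL DEFAULT 0);

CREATE TRIGGER five_adults
BEFORE INSERT ON MyTable
WHEN NEW.isAdult
 AND (SELECT COUNT(*) FROM MyTable WHERE isAdult) >= 5
BEGIN
    SELECT RAISE(FAIL, 'only five adults allowed');
END;
""")

for i in range(5):
    conn.execute("INSERT INTO MyTable (name, isAdult) VALUES (?, 1)", (f"adult{i}",))

# The sixth adult should be rejected by the trigger.
message = ""
try:
    conn.execute("INSERT INTO MyTable (name, isAdult) VALUES ('adult5', 1)")
    sixth_rejected = False
except sqlite3.DatabaseError as exc:
    sixth_rejected = True
    message = str(exc)

# Rows with isAdult = 0 are not limited.
conn.execute("INSERT INTO MyTable (name, isAdult) VALUES ('child', 0)")

adults = conn.execute("SELECT COUNT(*) FROM MyTable WHERE isAdult").fetchone()[0]
total = conn.execute("SELECT COUNT(*) FROM MyTable").fetchone()[0]
print(sixth_rejected, message, adults, total)
```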
The SQL-99 standard would solve this with an ASSERTION, a type of constraint that can validate data changes with respect to an arbitrary SELECT statement. Unfortunately, I don't know any SQL database currently on the market that implements ASSERTION constraints. It's an optional feature of the SQL standard, and SQL implementors are not required to provide it.
A workaround is to create a foreign key constraint so isAdult can be an integer value referencing a lookup table that contains only values 1 through 5. Then also put a UNIQUE constraint on isAdult. Use NULL for "false" when the row is for a user who is not an adult (NULL is ignored by UNIQUE).
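One possible reading of this workaround, sketched with Python's built-in sqlite3 module. The lookup table name `AdultSlot`, its column `slot`, and the helper `try_insert` are hypothetical; note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # foreign keys are off by default in SQLite
conn.executescript("""
-- Hypothetical lookup table holding the five allowed "adult slots".
CREATE TABLE AdultSlot (slot INTEGER PRIMARY KEY);
INSERT INTO AdultSlot (slot) VALUES (1), (2), (3), (4), (5);

CREATE TABLE MyTable (
    name    TEXT,
    isAdult INTEGER UNIQUE REFERENCES AdultSlot (slot)  -- NULL means "not an adult"
);
""")

def try_insert(name, slot):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO MyTable (name, isAdult) VALUES (?, ?)", (name, slot))
        return True
    except sqlite3.IntegrityError:
        return False

adults_ok = [try_insert(f"adult{slot}", slot) for slot in range(1, 6)]
children_ok = [try_insert("child1", None), try_insert("child2", None)]
slot_taken = try_insert("adult_again", 3)  # violates UNIQUE
no_such_slot = try_insert("adult6", 6)     # violates the foreign key

print(adults_ok, children_ok, slot_taken, no_such_slot)
```

The cost of this design is that the application has to pick a free slot number for each adult, which is less convenient than a plain 0/1 flag.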
Another workaround is to do this in application code. SELECT from the database before changing it, to make sure your change won't break your app's business rules. Normally in a multi-user RDBMS this is impossible due to race conditions, but since you're using SQLite you might be the sole user.

Insert or ignore every column

I have a problem with an SQLite command.
I have a table with three columns: Id, user, number.
The Id is a running (auto-incremented) number. Now, when I add a user and a number to my list, my app should check whether a user with this number already exists. The problem is that if I use a standard "insert or ignore" command, the Id column differs every time, so I get a new entry every time.
So is it possible to compare just two of the three columns for equality?
Or do I have to use a temporary list in which only those two columns exist?
The INSERT OR IGNORE statement ignores the new record if it would violate a UNIQUE constraint.
Such a constraint is created implicitly for the PRIMARY KEY, but you can also create one explicitly for any other columns:
CREATE TABLE MyTable (
ID integer PRIMARY KEY,
User text,
Number number,
UNIQUE (User, Number)
);
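A small sketch of how INSERT OR IGNORE behaves with this table definition, using Python's built-in sqlite3 module. The sample users and numbers are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE MyTable (
    ID integer PRIMARY KEY,
    User text,
    Number number,
    UNIQUE (User, Number)
)
""")

# Made-up sample data; the second row duplicates the first (User, Number) pair.
rows = [("alice", 1), ("alice", 1), ("alice", 2), ("bob", 1)]
for user, number in rows:
    # ID is omitted, so SQLite assigns it; the UNIQUE constraint decides
    # whether the row is inserted or silently skipped.
    conn.execute(
        "INSERT OR IGNORE INTO MyTable (User, Number) VALUES (?, ?)",
        (user, number),
    )

stored = conn.execute("SELECT User, Number FROM MyTable ORDER BY ID").fetchall()
print(stored)
```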
You shouldn't use insert or ignore unless you are specifying the key, which you aren't, and in my opinion never should if your key is an identity (auto number) column.
If User and Number together make a record in your table unique, you don't need the Id column, and your primary key should be (user, number).
If for some reason you don't want to do that (bearing in mind that in that case you are saying that User, Number is not your uniqueness constraint), then something like
if not exists(Select 1 From MyTable Where user = 10 and Number = 15)
    Insert MyTable(user,number) Values(10,15)
would do the job. I'm not an SQLite person, so you might have to fiddle with the syntax and escape your column names.
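The IF NOT EXISTS form above is not SQLite syntax. A rough SQLite equivalent is an INSERT ... SELECT guarded by WHERE NOT EXISTS, sketched here with Python's built-in sqlite3 module. The column types and the helper `add_pair` are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column types; the question only names the three columns.
conn.execute("CREATE TABLE MyTable (Id INTEGER PRIMARY KEY, user INTEGER, number INTEGER)")

INSERT_IF_ABSENT = """
INSERT INTO MyTable (user, number)
SELECT :user, :number
WHERE NOT EXISTS (
    SELECT 1 FROM MyTable WHERE user = :user AND number = :number
)
"""

def add_pair(user, number):
    """Insert the pair unless it is already stored; return True if a row was added."""
    cur = conn.execute(INSERT_IF_ABSENT, {"user": user, "number": number})
    return cur.rowcount == 1

first_attempt = add_pair(10, 15)   # inserted
second_attempt = add_pair(10, 15)  # skipped, the pair already exists
count = conn.execute("SELECT COUNT(*) FROM MyTable").fetchone()[0]
print(first_attempt, second_attempt, count)
```

Unlike a UNIQUE constraint, this check lives in each statement rather than in the schema, so every code path that inserts has to use it.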

will the maximum number of rows decrease after records are deleted from a table as the row id keeps incrementing

This might be a beginner's question, but when testing my SQLite database, I found that when I delete a row, the row id keeps incrementing when I insert a new row and doesn't reuse, for instance, the row id of a deleted row. So, what will happen if the row id reaches its maximum value while there are fewer rows in the table?
This is documented:
If the table has previously held a row with the largest possible ROWID, then new INSERTs are not allowed and any attempt to insert a new row will fail with an SQLITE_FULL error.
If you omit the AUTOINCREMENT keyword, IDs will still autoincrement, but can be reused if you delete the last row or if the values overflow:
If the largest ROWID is equal to the largest possible integer (9223372036854775807) then the database engine starts picking positive candidate ROWIDs at random until it finds one that is not previously used.
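Both documented cases can be observed by inserting a row with the largest possible rowid explicitly. This is a sketch using Python's built-in sqlite3 module; the table names `with_ai` and `without_ai` are hypothetical.

```python
import sqlite3

MAX_ROWID = 9223372036854775807  # 2**63 - 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE with_ai (id INTEGER PRIMARY KEY AUTOINCREMENT, v TEXT)")
conn.execute("CREATE TABLE without_ai (id INTEGER PRIMARY KEY, v TEXT)")

# AUTOINCREMENT: once the table has held the largest rowid, even deleting
# that row does not make further automatic inserts possible.
conn.execute("INSERT INTO with_ai (id, v) VALUES (?, 'last')", (MAX_ROWID,))
conn.execute("DELETE FROM with_ai WHERE id = ?", (MAX_ROWID,))
try:
    conn.execute("INSERT INTO with_ai (v) VALUES ('one more')")
    autoincrement_full = False
except sqlite3.DatabaseError:  # SQLITE_FULL surfaces as a DatabaseError subclass
    autoincrement_full = True

# No AUTOINCREMENT: with the largest rowid in use, SQLite falls back to
# trying random positive rowids until it finds a free one.
conn.execute("INSERT INTO without_ai (id, v) VALUES (?, 'last')", (MAX_ROWID,))
random_id = conn.execute("INSERT INTO without_ai (v) VALUES ('one more')").lastrowid

print(autoincrement_full, 1 <= random_id < MAX_ROWID)
```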
When you use an auto-incrementing row number, you have to keep an eye on the largest value. If the number of rows approaches that limit, you have to use a bigger data type. Usually, though, an integer key doesn't reach the limit, because a database designer should keep an eye on normalization.
If a table does grow that big, you will be stuck with the queries anyway: they will take a huge amount of time. SQLite is mainly useful on low-end devices, which are not capable of handling that much data.

Set start value for AUTOINCREMENT in SQLite

How can I set the start value for an AUTOINCREMENT field in SQLite?
From the SQLite web site:
SQLite keeps track of the largest ROWID that a table has ever held using the special SQLITE_SEQUENCE table. The SQLITE_SEQUENCE table is created and initialized automatically whenever a normal table that contains an AUTOINCREMENT column is created. The content of the SQLITE_SEQUENCE table can be modified using ordinary UPDATE, INSERT, and DELETE statements. But making modifications to this table will likely perturb the AUTOINCREMENT key generation algorithm. Make sure you know what you are doing before you undertake such changes.
I tried this, and it works:
UPDATE SQLITE_SEQUENCE SET seq = <n> WHERE name = '<table>'
Where n+1 is the next ROWID you want and table is the table name.
Explicitly insert the value-1 into the table, then delete the row.
Edit: the next comment down, which discusses editing the SQLITE_SEQUENCE table directly is probably preferable: https://stackoverflow.com/a/692871/10093
I am using the below query which solves the problem when the sqlite_sequence does not have a record for the table (i.e. first record was not added yet to the table), otherwise it updates the sequence.
BEGIN TRANSACTION;
UPDATE sqlite_sequence SET seq = <n> WHERE name = '<table>';
INSERT INTO sqlite_sequence (name,seq) SELECT '<table>', <n> WHERE NOT EXISTS
(SELECT changes() AS change FROM sqlite_sequence WHERE change <> 0);
COMMIT;
One way to do it is to insert the first row specifying explicitly the row id you want to start with. SQLite will then insert row ids that are higher than the previous highest.
With the SQLITE_SEQUENCE solution, the entry for a table seems to be added to SQLITE_SEQUENCE only after the first insert into the table with the autoincrement column. In some cases this might cause trouble (i.e. autoincrement still starts from 1, not from the wanted value).
Just wanted to add a few notes to the very much appreciated answer from iTech:
The name column in sqlite_sequence is case sensitive. (Perhaps it's only me, but coming from other databases I always assume that string comparison is case insensitive.)
SQLite seems to be robust: if the number in sqlite_sequence is wrong and would lead to a duplicated rowid value, SQLite will use the next available number for the rowid (checked with SQLite 3.28).
The same is true if the row in sqlite_sequence gets deleted.
As suggested in a comment, I used "WHERE NOT EXISTS (SELECT name FROM sqlite_sequence WHERE name = 'table')" instead of checking "changes()".
I tried this and it works well:
FOR INSERT
INSERT INTO sqlite_sequence (name, seq) VALUES ('<table name>', <value>)
TO UPDATE
UPDATE sqlite_sequence SET seq = <value> WHERE name= '<table name>'
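The update-or-insert approach from the answers above can be wrapped in a small helper. This is a sketch using Python's built-in sqlite3 module; the function `set_autoincrement_start` and the table `items` are hypothetical, and it assumes the requested start value is higher than any rowid already in the table.

```python
import sqlite3

def set_autoincrement_start(conn, table, start):
    """Arrange for the next automatically assigned rowid of `table` to be `start`.

    Assumes `table` was created with AUTOINCREMENT and that `start` is larger
    than every rowid the table currently holds.
    """
    seq = start - 1  # sqlite_sequence stores the last used value
    cur = conn.execute(
        "UPDATE sqlite_sequence SET seq = ? WHERE name = ?", (seq, table)
    )
    if cur.rowcount == 0:
        # No entry yet, e.g. because nothing has been inserted into the table.
        conn.execute(
            "INSERT INTO sqlite_sequence (name, seq) VALUES (?, ?)", (table, seq)
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY AUTOINCREMENT, label TEXT)")

set_autoincrement_start(conn, "items", 1000)  # before the first insert
first_id = conn.execute("INSERT INTO items (label) VALUES ('first')").lastrowid

set_autoincrement_start(conn, "items", 5000)  # entry exists now, so it is updated
next_id = conn.execute("INSERT INTO items (label) VALUES ('next')").lastrowid

print(first_id, next_id)
```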

Resources