SQLite: how does REPLACE INTO determine if a row exists? - sqlite

I found this post explaining the difference between UPDATE and "INSERT OR REPLACE INTO". It explains that
INSERT OR REPLACE INTO names (id, name) VALUES (1, "John")
will insert a new row if no record with id =1 exists, and will replace the row with id = 1 if it does exist. My question is: how does SQLite know or decide that 'id' is the field whose values determine if records already exist or not?
In other words, why wouldn't sqlite search for a record with name = "John" and replace the id value? Does this depend on an index that's not being talked about in the above example, or does SQLite give special treatment to fields named 'id' or fields named first in a row of field names?

See the CONFLICT clause documentation for how this is dealt with. Essentially, it is based on UNIQUE and NOT NULL constraints, with the primary key being the most common constraint used to decide whether a row is replaced or inserted.
When a UNIQUE constraint violation occurs, the REPLACE algorithm deletes pre-existing rows that are causing the constraint violation prior to inserting or updating the current row and the command continues executing normally. If a NOT NULL constraint violation occurs, the REPLACE conflict resolution replaces the NULL value with the default value for that column, or if the column has no default value, then the ABORT algorithm is used. If a CHECK constraint violation occurs, the REPLACE conflict resolution algorithm always works like ABORT.
When the REPLACE conflict resolution strategy deletes rows in order to satisfy a constraint, delete triggers fire if and only if recursive triggers are enabled.
The update hook is not invoked for rows that are deleted by the REPLACE conflict resolution strategy. Nor does REPLACE increment the change counter. The exceptional behaviors defined in this paragraph might change in a future release.
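To make the point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The names table and the unique_name switch are assumptions for illustration only; the question never shows its schema. It suggests that SQLite does not look at column names or column order at all, only at which PRIMARY KEY or UNIQUE constraints the new row would violate.

```python
import sqlite3

def replace_demo(unique_name):
    """Run the same INSERT OR REPLACE statements against two hypothetical schemas."""
    con = sqlite3.connect(":memory:")
    # Assumed schema: id is the primary key; name is UNIQUE only when requested.
    name_decl = "name TEXT UNIQUE" if unique_name else "name TEXT"
    con.execute(f"CREATE TABLE names (id INTEGER PRIMARY KEY, {name_decl})")
    con.execute("INSERT INTO names (id, name) VALUES (1, 'John')")
    # Same id: violates the primary key, so the existing row is deleted and replaced.
    con.execute("INSERT OR REPLACE INTO names (id, name) VALUES (1, 'Jane')")
    # New id, same name: only a conflict if name carries a UNIQUE constraint.
    con.execute("INSERT OR REPLACE INTO names (id, name) VALUES (2, 'Jane')")
    rows = con.execute("SELECT id, name FROM names ORDER BY id").fetchall()
    con.close()
    return rows

print(replace_demo(unique_name=False))  # [(1, 'Jane'), (2, 'Jane')]
print(replace_demo(unique_name=True))   # [(2, 'Jane')]
```

With the UNIQUE constraint on name, the last statement removes row 1 because that is the row causing the violation, which is the "search by name" behaviour the question asks about.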

Related

Correct usage of the SQLite ON CONFLICT clause

I have a SQLite database which, amongst other things, has the following table.
CREATE TABLE IF NOT EXISTS biases
(
data INTEGER NOT NULL,
link INTEGER DEFAULT 0,
bias_type INTEGER,
ignores INTEGER DEFAULT 0,
desists INTEGER DEFAULT 0,
encashes INTEGER DEFAULT 0,
accesses INTEGER DEFAULT 0,
scraps INTEGER DEFAULT 0,
CONSTRAINT pk_bias_mix PRIMARY KEY(data,link,bias_type)
);
The constraint pk_bias_mix is being used to ensure that no two rows can have the same values for all three of the columns data, link and bias_type. So suppose I do
INSERT INTO biases (data,link,bias_type,ignores) VALUES(1,1,1,1);
things work as expected - a new row is inserted in the table. If I issue the same INSERT again I get the error
UNIQUE CONSTRAINT FAILED: biases.data,biases.link,biases.bias_type
just as expected. I tried to use the SQLite ON CONFLICT clause thus
INSERT INTO biases (data,link,bias_type,ignores) VALUES(1,1,1,1)
ON CONFLICT(data,link,bias_type) DO UPDATE SET ignores = ignores + 1;
and it worked as I had hoped - instead of adding a new row or throwing up an error SQLite incremented the value of the ignores column in the row with the matching data, link and bias_type values.
However, this is just the result of an experiment. It is not immediately clear to me from the SQLite docs that this is indeed how ON CONFLICT is supposed to behave, i.e. that it can be given two or more conflict constraints to be checked. What I mean by two or more constraints is specifying multiple, comma-separated columns inside CONFLICT(...) as I have done in the example above.
I suspect that this is the right usage since I am merely specifying a CONFLICT condition that replicates my indicated CONSTRAINT. However, I cannot see this explained explicitly anywhere in the docs. I'd be much obliged to anyone who might be able to confirm this.
From UPSERT:
UPSERT is a special syntax addition to INSERT that causes the INSERT
to behave as an UPDATE or a no-op if the INSERT would violate a
uniqueness constraint.
and:
The special UPSERT processing happens only for uniqueness constraint
on the table that is receiving the INSERT.
So the DO UPDATE part is not triggered by just any constraint conflict, but only by a uniqueness constraint violation.
Also:
The syntax that occurs in between the "ON CONFLICT" and "DO" keywords
is called the "conflict target". The conflict target specifies a
specific uniqueness constraint that will trigger the upsert.
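So a single conflict target cannot name two or more separate uniqueness constraints. The comma-separated columns in your example are fine, because together they identify one constraint: the composite primary key.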
However you can use separate UPSERT statements to check for 2 different unique constraint violations.
See a simplified demo where I added 1 more UNIQUE constraint to your table:
CONSTRAINT con_scraps UNIQUE(scraps)
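As a rough check of the behaviour described in the question, the following sketch runs the UPSERT through Python's sqlite3 module. It assumes the bundled SQLite is version 3.24 or newer (where UPSERT is available) and uses a trimmed copy of the biases table; the helper name upsert_demo is made up for this example.

```python
import sqlite3

# The statement from the question, with the key columns as parameters.
UPSERT = """
    INSERT INTO biases (data, link, bias_type, ignores) VALUES (?, ?, ?, 1)
    ON CONFLICT (data, link, bias_type) DO UPDATE SET ignores = ignores + 1
"""

def upsert_demo():
    con = sqlite3.connect(":memory:")
    # Trimmed version of the table from the question (assumption: fewer columns).
    con.execute("""
        CREATE TABLE biases (
            data INTEGER NOT NULL,
            link INTEGER DEFAULT 0,
            bias_type INTEGER,
            ignores INTEGER DEFAULT 0,
            CONSTRAINT pk_bias_mix PRIMARY KEY (data, link, bias_type)
        )
    """)
    # The second (1, 1, 1) hits the composite key and increments ignores instead.
    for key in [(1, 1, 1), (1, 1, 1), (1, 1, 2)]:
        con.execute(UPSERT, key)
    rows = con.execute(
        "SELECT data, link, bias_type, ignores FROM biases ORDER BY bias_type"
    ).fetchall()
    # A conflict target that matches no PRIMARY KEY or UNIQUE constraint is rejected.
    try:
        con.execute(
            "INSERT INTO biases (data, link, bias_type) VALUES (1, 1, 1) "
            "ON CONFLICT (data, link) DO NOTHING"
        )
        mismatch_rejected = False
    except sqlite3.OperationalError:
        mismatch_rejected = True
    con.close()
    return rows, mismatch_rejected

print(upsert_demo())
```

The column list inside ON CONFLICT(...) therefore has to line up with one existing uniqueness constraint; it is not a free-form list of independent checks.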

Creating index with unique constraint for new blank field

A new column was added to an existing DB table (PA0023).
DB: HANA
The column should be unique, therefore I tried to create a unique index constraint via SE11.
Activation succeeded. However, while creating the index via Utilities... Database Utility, an error showed up:
Request: Create Index PA0023-Z01
...
sql:
CREATE UNIQUE INDEX 'PA0023~Z01' ON 'PA0023'
('MANDT',
'RECORD_KEY')
cannot CREATE UNIQUE INDEX; duplicate key found [5] Several documents with the same ID exist in the index;SAPABAP1:PA0023.$uc_PA0023~Z01$ content not unique, cannot define unique constraint. rowCount != distinctCount.
There are no rows with the same non-blank value in that column. There are, however, rows with a blank value, and these are treated as duplicates. After replacing the blanks in the development environment, the index was created successfully. That is less feasible in production, because there are many records with an empty value in the new field.
So my question is: Is there a way to create the unique constraint without replacing the blanks?
You cannot create a unique constraint if the existing data does not provide uniqueness. So no, you can't do this if you have multiple NULL values for the key. You would need to ensure the data is unique before creating the constraint.
This is normal database practice, it's not HANA specific.
While it is true that a compound primary key cannot contain any nullable columns, it is permitted for a compound unique/candidate key to be defined with nullable columns. The only golden rule is that, when adding or updating a record, if any column in the unique key contains a NULL value then the index entry is NOT written to the database.
MySQL does this by default.
SQL Server will do this provided that you add "WHERE columnX IS NOT NULL" to the key's definition.
Oracle is the same as SQL Server, except that the syntax is more complicated.
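The filtered-index idea can be tried out quickly with SQLite, which supports partial indexes. This is only a stand-in to illustrate the principle: it says nothing about what HANA or SE11 accept, and the table pa, the index names and the filter on blank strings are all assumptions made for this sketch.

```python
import sqlite3

def unique_index_demo():
    con = sqlite3.connect(":memory:")
    # Hypothetical miniature of the table: two rows share a blank record_key.
    con.execute("CREATE TABLE pa (mandt TEXT, record_key TEXT)")
    con.executemany(
        "INSERT INTO pa VALUES (?, ?)",
        [("100", ""), ("100", ""), ("100", "A")],
    )
    outcome = {}
    # A plain unique index fails because the blank keys count as duplicates.
    try:
        con.execute("CREATE UNIQUE INDEX pa_plain ON pa (mandt, record_key)")
        outcome["plain_index_created"] = True
    except sqlite3.DatabaseError:
        outcome["plain_index_created"] = False
    # A partial unique index leaves rows with a blank key out of the index.
    con.execute(
        "CREATE UNIQUE INDEX pa_partial ON pa (mandt, record_key) "
        "WHERE record_key <> ''"
    )
    con.execute("INSERT INTO pa VALUES ('100', '')")  # further blanks are accepted
    try:
        con.execute("INSERT INTO pa VALUES ('100', 'A')")  # real duplicate
        outcome["duplicate_key_rejected"] = False
    except sqlite3.DatabaseError:
        outcome["duplicate_key_rejected"] = True
    outcome["blank_rows"] = con.execute(
        "SELECT COUNT(*) FROM pa WHERE record_key = ''"
    ).fetchone()[0]
    con.close()
    return outcome

print(unique_index_demo())
```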

Limiting the number of rows a table can contain based on the value of a column - SQLite

Since SQLite doesn't support TRUE and FALSE, I have a boolean keyword that stores 0 and 1. For the boolean column in question, I want there to be a check for the number of 1's the column contains and limit the total number for the table.
For example, the table can have columns: name, isAdult. If there are already 5 adults in the table, the system would not allow a user to add a 6th entry with isAdult = 1. There is no restriction on how many rows the table can contain, since there is no limit on the number of entries where isAdult = 0.
You can use a trigger to prevent inserting the sixth entry:
CREATE TRIGGER five_adults
BEFORE INSERT ON MyTable
WHEN NEW.isAdult
AND (SELECT COUNT(*)
FROM MyTable
WHERE isAdult
) >= 5
BEGIN
SELECT RAISE(FAIL, 'only five adults allowed');
END;
(You might need a similar trigger for UPDATEs.)
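Here is a small sketch that exercises the trigger from Python's sqlite3 module. The two-column MyTable layout and the second trigger for UPDATEs (five_adults_upd) are assumptions added for illustration; the UPDATE variant is one possible shape, not a tested production rule.

```python
import sqlite3

def five_adults_demo():
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE MyTable (name TEXT, isAdult INTEGER)")
    con.execute("""
        CREATE TRIGGER five_adults
        BEFORE INSERT ON MyTable
        WHEN NEW.isAdult
             AND (SELECT COUNT(*) FROM MyTable WHERE isAdult) >= 5
        BEGIN
            SELECT RAISE(FAIL, 'only five adults allowed');
        END
    """)
    # Hypothetical companion trigger: blocks flipping a non-adult row to adult.
    con.execute("""
        CREATE TRIGGER five_adults_upd
        BEFORE UPDATE OF isAdult ON MyTable
        WHEN NEW.isAdult AND NOT OLD.isAdult
             AND (SELECT COUNT(*) FROM MyTable WHERE isAdult) >= 5
        BEGIN
            SELECT RAISE(FAIL, 'only five adults allowed');
        END
    """)
    for i in range(5):
        con.execute("INSERT INTO MyTable VALUES (?, 1)", (f"adult{i}",))
    con.execute("INSERT INTO MyTable VALUES ('child', 0)")  # non-adults are unlimited

    def rejected(sql):
        """Return True if the statement is refused by a trigger."""
        try:
            con.execute(sql)
            return False
        except sqlite3.DatabaseError:
            return True

    insert_rejected = rejected("INSERT INTO MyTable VALUES ('adult5', 1)")
    update_rejected = rejected("UPDATE MyTable SET isAdult = 1 WHERE name = 'child'")
    adults = con.execute("SELECT COUNT(*) FROM MyTable WHERE isAdult").fetchone()[0]
    con.close()
    return insert_rejected, update_rejected, adults

print(five_adults_demo())  # (True, True, 5)
```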
The SQL-99 standard would solve this with an ASSERTION, a type of constraint that can validate data changes with respect to an arbitrary SELECT statement. Unfortunately, I don't know of any SQL database currently on the market that implements ASSERTION constraints. It's an optional feature of the SQL standard, and SQL implementors are not required to provide it.
A workaround is to create a foreign key constraint so isAdult can be an integer value referencing a lookup table that contains only values 1 through 5. Then also put a UNIQUE constraint on isAdult. Use NULL for "false" when the row is for a user who is not an adult (NULL is ignored by UNIQUE).
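A minimal sketch of that lookup-table idea, run through Python's sqlite3 module, might look like this. The table names adult_slots and people are invented for the example, and foreign key enforcement has to be switched on explicitly in SQLite.

```python
import sqlite3

def slot_demo():
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    # Hypothetical lookup table holding the only permitted values, 1 through 5.
    con.execute("CREATE TABLE adult_slots (slot INTEGER PRIMARY KEY)")
    con.execute("""
        CREATE TABLE people (
            name TEXT,
            isAdult INTEGER UNIQUE REFERENCES adult_slots(slot)
        )
    """)
    con.executemany("INSERT INTO adult_slots VALUES (?)", [(n,) for n in range(1, 6)])
    # Five adults, each taking one slot; NULL marks a non-adult and may repeat.
    con.executemany(
        "INSERT INTO people VALUES (?, ?)",
        [(f"adult{n}", n) for n in range(1, 6)] + [("child1", None), ("child2", None)],
    )

    def rejected(slot):
        """Return True if a sixth adult using this slot value is refused."""
        try:
            con.execute("INSERT INTO people VALUES ('extra', ?)", (slot,))
            return False
        except sqlite3.IntegrityError:
            return True

    reused_slot_rejected = rejected(3)   # UNIQUE: slot 3 is already taken
    unknown_slot_rejected = rejected(6)  # FOREIGN KEY: no slot 6 in the lookup table
    total = con.execute("SELECT COUNT(*) FROM people").fetchone()[0]
    con.close()
    return reused_slot_rejected, unknown_slot_rejected, total

print(slot_demo())  # (True, True, 7)
```

The cost of this design is that isAdult stops being a plain 0/1 flag: callers have to pick a free slot number, and "is adult" becomes isAdult IS NOT NULL.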
Another workaround is to do this in application code. SELECT from the database before changing it, to make sure your change won't break your app's business rules. Normally in a multi-user RDBMS this is impossible due to race conditions, but since you're using SQLite you might be the sole user.

Only update column if new value is higher

I want to add a constraint, so a column only updates its value if the new value passed in is greater than its current value. Otherwise it should silently ignore. In pseudocode:
CREATE TABLE t (col INTEGER CHECK (new.value > col.value) ON CONFLICT IGNORE)
SQLite 3.7.4.
I have decided to use MAX():
UPDATE t SET col = MAX(col, newval) [, col2 = xxx, ...]
Although technically it does overwrite the value, at least it can never be lowered.
Had I enforced the constraint with a WHERE clause instead, the other columns in the same UPDATE (e.g. col2) would not have been updated either whenever the clause excluded the row because of col's value.
I could not work out the correct syntax for a trigger to ignore the UPDATE if the constraint was violated. Regardless, I'd imagine using a trigger would incur much greater overhead than a simple MAX() call.
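For reference, here is a small sketch of the MAX() approach using Python's sqlite3 module. The id column and the sample values are assumptions added so the statement can be run; only the UPDATE itself comes from the post.

```python
import sqlite3

def max_update_demo():
    con = sqlite3.connect(":memory:")
    # Assumed layout: id identifies the row, col must never decrease.
    con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col INTEGER, col2 TEXT)")
    con.execute("INSERT INTO t VALUES (1, 10, 'a')")
    stmt = "UPDATE t SET col = MAX(col, ?), col2 = ? WHERE id = 1"

    con.execute(stmt, (5, "b"))   # lower value: col stays at 10, col2 still changes
    after_lower = con.execute("SELECT col, col2 FROM t").fetchone()

    con.execute(stmt, (20, "c"))  # higher value: col is raised to 20
    after_higher = con.execute("SELECT col, col2 FROM t").fetchone()
    con.close()
    return after_lower, after_higher

print(max_update_demo())  # ((10, 'b'), (20, 'c'))
```

One caveat: the two-argument MAX() in SQLite returns NULL if either argument is NULL, so this relies on col and the new value both being non-NULL.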

SQLite "INSERT OR REPLACE INTO" vs. "UPDATE ... WHERE"

I've never seen the syntax INSERT OR REPLACE INTO names (id, name) VALUES (1, "John") used in SQL before, and I was wondering why it's better than UPDATE names SET name = "John" WHERE id = 1. Is there any good reason to use one over the other? Is this syntax specific to SQLite?
UPDATE will not do anything if the row does not exist, whereas INSERT OR REPLACE will insert the row if it does not exist, or replace its values if it does.
Another fact to notice: INSERT OR REPLACE will also overwrite any columns not supplied in the statement, resetting them to their default value or NULL.
For instance if your table contains a column "lastname" which you didn't supply a value for, INSERT OR REPLACE will nullify the "lastname" if possible (if constraints allow it), or fail.
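A short sketch with Python's sqlite3 module shows the effect. The three-column names table is an assumption built from the "lastname" example in this answer.

```python
import sqlite3

def replace_resets_columns():
    con = sqlite3.connect(":memory:")
    # Hypothetical table with a lastname column that the REPLACE does not mention.
    con.execute(
        "CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT, lastname TEXT)"
    )
    con.execute("INSERT INTO names VALUES (1, 'John', 'Smith')")
    # The old row is deleted and a fresh one inserted, so lastname is lost.
    con.execute("INSERT OR REPLACE INTO names (id, name) VALUES (1, 'Johnny')")
    rows = con.execute("SELECT id, name, lastname FROM names").fetchall()
    con.close()
    return rows

print(replace_resets_columns())  # [(1, 'Johnny', None)]
```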
REPLACE INTO table(column_list) VALUES(value_list);
is a shorter form of
INSERT OR REPLACE INTO table(column_list) VALUES(value_list);
For REPLACE to execute correctly your table structure must have unique rows, whether a simple primary key or a unique index.
REPLACE deletes, then INSERTs the record and will cause an INSERT trigger to execute if you have one set up. If you have a trigger on INSERT, you may encounter issues.
This is a workaround; I have not checked its speed.
INSERT OR IGNORE INTO table (column_list) VALUES(value_list);
followed by
UPDATE table SET field=value,field2=value WHERE uniqueid='uniquevalue'
This method allows a replace to occur without causing a trigger.
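The two-statement workaround can be sketched like this with Python's sqlite3 module. The names table and the helper save_name are assumptions for illustration; the sketch only checks that existing column values survive and makes no claim about speed or trigger behaviour.

```python
import sqlite3

def save_name(con, row_id, name):
    """Insert the row if it is missing, then update it in place."""
    # Does nothing when a row with this id already exists.
    con.execute("INSERT OR IGNORE INTO names (id, name) VALUES (?, ?)", (row_id, name))
    # Touches only the named column, so other columns keep their values.
    con.execute("UPDATE names SET name = ? WHERE id = ?", (name, row_id))

def ignore_then_update_demo():
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT, lastname TEXT)"
    )
    con.execute("INSERT INTO names VALUES (1, 'John', 'Smith')")
    save_name(con, 1, "Johnny")  # existing row: lastname is preserved
    save_name(con, 2, "Ann")     # missing row: created by the INSERT OR IGNORE
    rows = con.execute("SELECT id, name, lastname FROM names ORDER BY id").fetchall()
    con.close()
    return rows

print(ignore_then_update_demo())  # [(1, 'Johnny', 'Smith'), (2, 'Ann', None)]
```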
The insert or replace query would insert a new record if id=1 does not already exist.
The update query would only update id=1 if it already exists; it would not create a new record if it didn't exist.
