sqlite3: INSERT OR IGNORE OR UPDATE for multiple unique columns - sqlite

My simplified database looks like this:
CREATE TABLE my_table(hash TEXT, timestamp DATE, value1 TEXT, value2 REAL, UNIQUE(hash, timestamp));
The hash column is an MD5 hash of everything in the row.
Problem
A script will put the following example row in the table:
INSERT INTO my_table VALUES("7494ab07987ba112bd5c4f9857ccfb3f", "2019-12-19", "temp", 1);
Then the script will run an hour later and value2 may or may not have changed. This is equivalent to hash changing or remaining the same. If the hash has changed for the given date I want to update the row with the new values.
For a concrete example, below are the three possible statements that could run 1 hour after the statement above.
(case 1) INSERT INTO my_table VALUES("7494ab07987ba112bd5c4f9857ccfb3f", "2019-12-19", "temp", 1);
or...
(case 2) INSERT INTO my_table VALUES("64a5dcbe3f4af29ca58b8d99b1c3a9f2", "2019-12-19", "temp", 2);
or...
(case 3) INSERT INTO my_table VALUES("514292c37cc92b2ee2cd797328bed2d5", "2019-12-20", "temp 2", 3);
If (case 1) runs, the INSERT statement should be IGNOREd.
If (case 2) runs, the INSERT statement should be an UPDATE.
If (case 3) runs, the INSERT statement should run normally.
For a given timestamp (new_timestamp) which may or may not already be in my table, and hash (new_hash) I need a statement that will do something like this:
if new_timestamp in my_table:
    if new_hash == old_hash:
        IGNORE
    else:
        UPDATE
else:
    INSERT
Any help is greatly appreciated.
EDIT:
The unique key for the row is the timestamp contrary to what my initial question said. The timestamp will not ever change. I will know some other value in the row changed by watching the hash, but I use the timestamp to ensure I am back in the right row. Case 2 could be a REPLACE for all I care, but if the hash changes for a given timestamp I do not want to create a new row for that timestamp. The timestamp will always be unique.

Assuming that the timestamp is defined as unique (which does not match the current question details), addition of the UPSERT clause will satisfy all three cases:
INSERT INTO my_table (hash, timestamp, value1, value2)
VALUES ('7494ab07987ba112bd5c4f9857ccfb3f', '2019-12-19', 'temp', 1)
ON CONFLICT (timestamp) DO UPDATE
SET hash = excluded.hash, value1 = excluded.value1, value2 = excluded.value2
WHERE hash != excluded.hash;
(FYI, the standard string delimiter is a single quote ', not double quotes ".)
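As a rough check of the statement above, the following sketch replays the initial insert and the three cases from the question through Python's sqlite3 module. It assumes timestamp alone is declared UNIQUE (as the EDIT describes) and that the bundled SQLite is 3.24.0 or newer; the in-memory database and the variable names are only for illustration.

```python
import sqlite3

# Assumption: timestamp alone is UNIQUE, as described in the question's EDIT.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table("
    "hash TEXT, timestamp DATE UNIQUE, value1 TEXT, value2 REAL)"
)

UPSERT_SQL = """
INSERT INTO my_table (hash, timestamp, value1, value2)
VALUES (?, ?, ?, ?)
ON CONFLICT (timestamp) DO UPDATE
SET hash = excluded.hash, value1 = excluded.value1, value2 = excluded.value2
WHERE my_table.hash != excluded.hash
"""

statements = [
    ("7494ab07987ba112bd5c4f9857ccfb3f", "2019-12-19", "temp", 1),    # initial row
    ("7494ab07987ba112bd5c4f9857ccfb3f", "2019-12-19", "temp", 1),    # case 1: same hash, no change
    ("64a5dcbe3f4af29ca58b8d99b1c3a9f2", "2019-12-19", "temp", 2),    # case 2: new hash, row updated
    ("514292c37cc92b2ee2cd797328bed2d5", "2019-12-20", "temp 2", 3),  # case 3: new timestamp, row inserted
]
for params in statements:
    conn.execute(UPSERT_SQL, params)

rows = conn.execute(
    "SELECT hash, timestamp, value1, value2 FROM my_table ORDER BY timestamp"
).fetchall()
for row in rows:
    print(row)
```

Only two rows should remain: the 2019-12-19 row carrying the case 2 values, and the new 2019-12-20 row.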

Related

MariaDB Stored Procedure store parameters for update

I am trying to write a MariaDB stored procedure.
Due to SQL_SAFE_UPDATES, it is required to use the ID column to use in the WHERE clause for updates. Due to this, what is the normal approach to also select a value from one of the other columns? I do not want to have multiple SELECT statements as it seems inefficient and room for error because they could return values from different rows.
I would like to store my first select statement
SELECT id, sequence FROM RECORDSEQUENCE WHERE SEQTABLE = SeqTable;
in the following two variables, @id and @seq, taken from the two separate columns of the query above, and use them in the UPDATE statement as well as the IF statement.
CREATE DEFINER=`sd`@`%` PROCEDURE `SD_GenerateNextRecordSequence`(IN SeqTable int)
BEGIN
    SELECT id, sequence FROM RECORDSEQUENCE WHERE SEQTABLE = SeqTable;
    IF (@seq IS NOT NULL) THEN
        SET @NEXTSEQ := @seq+1;
        UPDATE RECORDSEQUENCE SET RECORDSEQUENCE = @NEXTSEQ WHERE id = @id;
    ELSE
        SET @NEXTSEQ := 100;
        INSERT INTO RECORDSEQUENCE (RECORDSEQUENCE,SEQTABLE) VALUES (@NEXTSEQ,SeqTable);
    END IF;
    SELECT @NEXTSEQ as SEQUENCE;
END

Converting a field to lower case and merging data in an sqlite database

I need to merge some randomly uppercased data that has been collected in an SQLite table key_val, such that key is always lowercase and no vals are lost. There is a unique compound index on key,val.
The initial data looks like this:
key|val
abc|1
abc|5
aBc|1
aBc|5
aBc|3
aBc|2
AbC|1
abC|3
The result after the merge would be
key|val
abc|1
abc|2
abc|3
abc|5
In my programmer brain, I would
for each `key` with upper case letters;
if a lower cased `key` is found with the same value
then delete `key`
else update `key` to lower case
Reimplementing that loop in SQL takes a subquery for each row whose key contains upper case letters, to check whether the val already exists under a lower case key.
If it does, I can delete the cased key.
From there I can UPDATE key = lower(key) as the "duplicates" have been removed.
The first cut of the programming method of finding the dupes is:
SELECT * FROM key_val as parent
WHERE parent.key != lower(parent.key)
AND 0 < (
SELECT count(s.val) FROM key_val as s
WHERE s.key = lower(parent.key) AND s.val = parent.val
)
ORDER BY parent.key DESC;
I'm assuming there's a better way to do this in SQLite? The ON CONFLICT functionality seems to me like it should be able to handle the dupe deletion on UPDATE but I'm not seeing it.
First delete all the duplicates:
DELETE FROM key_val AS k1
WHERE EXISTS (
SELECT 1
FROM key_val AS k2
WHERE LOWER(k2.key) = LOWER(k1.key) AND k2.val = k1.val AND k2.rowid < k1.rowid
);
by keeping only 1 combination of key and val with the min rowid.
It is not important if you kept the key with all lower chars or not, because the 2nd step is to update the table:
UPDATE key_val
SET key = LOWER(key);
See the demo.
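The two steps can be tried against the sample data from the question with a short Python sqlite3 sketch. The index name key_val_idx is an assumption (the question only says a unique compound index exists), and the DELETE here refers to the outer row by table name rather than through an alias, which is a small variation on the statement above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE key_val (key TEXT, val INTEGER)")
# Assumed name for the unique compound index mentioned in the question.
conn.execute("CREATE UNIQUE INDEX key_val_idx ON key_val (key, val)")
conn.executemany(
    "INSERT INTO key_val (key, val) VALUES (?, ?)",
    [("abc", 1), ("abc", 5), ("aBc", 1), ("aBc", 5),
     ("aBc", 3), ("aBc", 2), ("AbC", 1), ("abC", 3)],
)

# Step 1: for each (lower(key), val) pair keep only the row with the smallest rowid.
conn.execute("""
    DELETE FROM key_val
    WHERE EXISTS (
        SELECT 1
        FROM key_val AS k2
        WHERE LOWER(k2.key) = LOWER(key_val.key)
          AND k2.val = key_val.val
          AND k2.rowid < key_val.rowid
    )
""")

# Step 2: no two remaining rows share (lower(key), val), so this cannot
# violate the unique index.
conn.execute("UPDATE key_val SET key = LOWER(key)")

merged = conn.execute("SELECT key, val FROM key_val ORDER BY val").fetchall()
print(merged)  # [('abc', 1), ('abc', 2), ('abc', 3), ('abc', 5)]
```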
Honestly it might just be easier to create a new table and then insert into it. As it seems you really just want a distinct select here, use:
INSERT INTO key_val_new ("key", val)
SELECT DISTINCT LOWER("key"), val
FROM key_val;
Once you have populated the new table, you may drop the old one, and then rename the new one to the previous name:
DROP TABLE key_val;
ALTER TABLE key_val_new RENAME TO key_val;
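A minimal end-to-end sketch of this rebuild in Python's sqlite3 module follows. The column types, the sample rows, and the index name key_val_idx are assumptions for illustration; note that dropping the old table also drops its index, so the sketch recreates it at the end.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE key_val (key TEXT, val INTEGER);
    CREATE UNIQUE INDEX key_val_idx ON key_val (key, val);
    INSERT INTO key_val VALUES
        ('abc', 1), ('aBc', 1), ('aBc', 3), ('abC', 3), ('AbC', 2);

    -- Replacement table with the same columns (assumed layout).
    CREATE TABLE key_val_new (key TEXT, val INTEGER);
    INSERT INTO key_val_new (key, val)
    SELECT DISTINCT LOWER(key), val
    FROM key_val;

    -- Swap the tables, then recreate the index that was dropped with the old one.
    DROP TABLE key_val;
    ALTER TABLE key_val_new RENAME TO key_val;
    CREATE UNIQUE INDEX key_val_idx ON key_val (key, val);
""")

rebuilt = conn.execute("SELECT key, val FROM key_val ORDER BY val").fetchall()
print(rebuilt)  # [('abc', 1), ('abc', 2), ('abc', 3)]
```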
I agree with @Tim that it would be easier to re-create the table using a simple select distinct lower() statement, but that's not always easy if the table has dependent objects (indexes, triggers, views). In that case it can be done as a sequence of two steps:
insert lowered keys which are not still there:
insert into t
select distinct lower(tr.key) as key, tr.val
from t as tr
left join t as ts on ts.key = lower(tr.key) and ts.val = tr.val
where ts.key is null;
now when we have all lowered keys - remove other keys:
delete from t where key <> lower(key);
See fiddle: http://sqlfiddle.com/#!5/84db50/11
However this method assumes that key is always populated (otherwise it would be a strange key)
If vals can be null then "ts.val = tr.val" should be replaced with more complex stuff like ifnull(ts.val, -1) = ifnull(tr.val, -1) where -1 is some unused value (can be different). If we can't assume any unused value like -1 then it should be more complex check for null / not null cases.

Get next number at insert, reset to 0 at new year

I am saving documents to database, each document has to have an id with the format YYYY-00000:
the first 4 characters are the current year
the second five characters are numbers. They start with 1 each year and then increment.
For example I could have these documents in my database: 2011-00001, 2011-00002, 2011-00003, 2012-00001, 2012-00002, ...
I am thinking something like this:
add two columns to table Documents (Year and Number)
Year is computed column, something like year(getdate())
Number is computed column, which gets value from a function GetNextNumberForCurrentYear
GetNextNumberForCurrentYear returns next number for the current year (for example select max(Number) + 1 from Documents where Year = year(getdate()), and some isnull checking)
But I fear that two users could want to save a document at the same time and that they would receive the same Number. Is this possible? Any better ideas?
It is an ASP.NET C# web application, .NET 4.0, MSSQL 2005, and I have control over all parts of the application.
PS: after insert I would like to return the Id of the new document to the user, so I would probably have to do something like: select Id from Documents where SomeId = scope_identity(), so I guess there should be an identity column somewhere...?
Edit (final solution): I get the next number from stored procedure, build the Id of the document (in format YYYY-00001) in .NET, save the whole document to the database (using TransactionScope for whole process) and then return the Id to the user.
create table DocumentNumbers ([Year] int not null, Number int not null default 1)
insert into DocumentNumbers ([Year], Number)
select 2012, 1 -- and more...
create procedure GetNextDocumentNumber
    @year int
as
begin
    declare @myResult table (nextNumber int)
    update DocumentNumbers
    set Number = isnull(Number, 0) + 1
    output inserted.Number into @myResult
    where [Year] = @year
    select top 1 nextNumber from @myResult
end
You could create a table NumberSeries, which contains a column Year and a column CurrentNo and a function that returns the next number from it like the following:
DECLARE @myResult TABLE (nextNumber INT)
UPDATE NumberSeries
SET
    CurrentNo = ISNULL(CurrentNo, 0) + 1
OUTPUT INSERTED.CurrentNo INTO @myResult
WHERE
    [Year] = YEAR(GETDATE())
DECLARE @result INT
SET @result = (SELECT TOP 1 nextNumber FROM @myResult)
RETURN @result
This updates the NumberSeries table atomically and inserts the new value into the @myResult table variable. After that, it returns the first (and only) value from the @myResult table variable.
Everything else, like SCOPE_IDENTITY() and such may cause errors when using triggers or in other cases - the solution using the OUTPUT clause is safe.
EDIT
As for returning the ID of the inserted document: this is basically the same thing.
DECLARE @myDocId TABLE (yr int, no int)
INSERT INTO Documents
OUTPUT INSERTED.Year , INSERTED.YearID INTO @myDocId
...
SELECT TOP 1
    CAST(yr AS NVARCHAR) +
    '-' +
    RIGHT(REPLICATE('0', 5) + CAST(no AS NVARCHAR), 5) AS NewID
FROM @myDocId

Sqlite3 INSERT trigger only fire on REPLACE statement even where record already exists?

I'm trying out the Sqlite3 REPLACE (INSERT OR REPLACE) command. I like to keep a created datetime (creDT) and an update datetime (updDT). So I created a database and a trigger for INSERT (creDT & updDT) and one for UPDATE (updDT), but each REPLACE (especially the ones where the primary key already exists) ends up with the current time in both creDT and updDT. Does REPLACE DELETE and INSERT instead of UPDATE?
Is this the standard behavior or am I doing something wrong?
def createDbTables(self):
    self.sqlCursor.execute("""
        CREATE TABLE rfdetector (
            sn TEXT PRIMARY KEY,
            detector TEXT,
            hex TEXT,
            updDT DATE,
            creDT DATE)
        """)
    self.sqlCursor.execute("""
        CREATE TRIGGER insert_rfdetector_creDT
        AFTER INSERT ON rfdetector
        BEGIN
            UPDATE rfdetector SET creDT = DATETIME('now','localtime') WHERE rowid = new.rowid;
            UPDATE rfdetector SET updDT = DATETIME('now','localtime') WHERE rowid = new.rowid;
        END;
        """)
    self.sqlCursor.execute("""
        CREATE TRIGGER update_rfdetector_updDT
        AFTER UPDATE ON rfdetector
        BEGIN
            UPDATE rfdetector SET updDT = DATETIME('now','localtime') WHERE rowid = new.rowid;
        END;
        """)

def insertSql(self, data):
    self.sqlCursor.execute(
        'REPLACE INTO rfdetector (sn, hex, detector) VALUES (?, ?, ?)',
        (data.serialNumber, data.hex, data.detector))
Looks like SQLite performs a DELETE then INSERT on REPLACE:
REPLACE
When a UNIQUE constraint violation occurs, the REPLACE
algorithm deletes pre-existing rows that are causing the
constraint violation
prior to inserting or updating the current row and the command
continues executing normally. If a NOT NULL constraint violation
occurs, the REPLACE conflict resolution replaces the NULL value with
the default value for that column, or if the column has no default
value, then the ABORT algorithm is used. If a CHECK constraint
violation occurs, the REPLACE conflict resolution algorithm always
works like ABORT.
from: http://www.sqlite.org/lang_conflict.html
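To see which triggers actually fire, here is a small sketch that swaps the timestamp triggers for ones that write to a hypothetical trigger_log table. It compares REPLACE with an INSERT ... ON CONFLICT DO UPDATE statement (which needs SQLite 3.24.0 or newer); the table layout is trimmed down and the sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rfdetector (sn TEXT PRIMARY KEY, detector TEXT, hex TEXT);
    -- Hypothetical log table, used only to record which trigger fired.
    CREATE TABLE trigger_log (event TEXT);
    CREATE TRIGGER log_insert AFTER INSERT ON rfdetector
    BEGIN
        INSERT INTO trigger_log VALUES ('insert');
    END;
    CREATE TRIGGER log_update AFTER UPDATE ON rfdetector
    BEGIN
        INSERT INTO trigger_log VALUES ('update');
    END;
""")

def logged_events(conn):
    """Return the trigger events recorded so far, then clear the log."""
    events = [r[0] for r in conn.execute("SELECT event FROM trigger_log ORDER BY rowid")]
    conn.execute("DELETE FROM trigger_log")
    return events

# REPLACE on a new key, then on the same key again.
conn.execute("REPLACE INTO rfdetector (sn, hex, detector) VALUES ('A1', '0x01', 'door')")
conn.execute("REPLACE INTO rfdetector (sn, hex, detector) VALUES ('A1', '0x02', 'door')")
replace_events = logged_events(conn)

# The same change expressed as an upsert on the existing key.
conn.execute("""
    INSERT INTO rfdetector (sn, hex, detector) VALUES ('A1', '0x03', 'door')
    ON CONFLICT (sn) DO UPDATE SET hex = excluded.hex, detector = excluded.detector
""")
upsert_events = logged_events(conn)

print(replace_events)  # ['insert', 'insert']
print(upsert_events)   # ['update']
```

With REPLACE the AFTER INSERT trigger fires both times, which is why creDT keeps getting reset; with the upsert form only the AFTER UPDATE trigger fires for an existing key.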

INSERT IF NOT EXISTS ELSE UPDATE?

I've found a few "would be" solutions for the classic "How do I insert a new record or update one if it already exists" but I cannot get any of them to work in SQLite.
I have a table defined as follows:
CREATE TABLE Book (
ID INTEGER PRIMARY KEY AUTOINCREMENT,
Name VARCHAR(60) UNIQUE,
TypeID INTEGER,
Level INTEGER,
Seen INTEGER
);
What I want to do is add a record with a unique Name. If the Name already exists, I want to modify the fields.
Can somebody tell me how to do this please?
Have a look at http://sqlite.org/lang_conflict.html.
You want something like:
insert or replace into Book (ID, Name, TypeID, Level, Seen) values
((select ID from Book where Name = 'SearchName'), 'SearchName', ...);
Note that any field not in the insert list will be set to NULL if the row already exists in the table. This is why there's a subselect for the ID column: In the replacement case the statement would set it to NULL and then a fresh ID would be allocated.
This approach can also be used if you want to leave particular field values alone in the replacement case but set the field to NULL in the insert case.
For example, assuming you want to leave Seen alone:
insert or replace into Book (ID, Name, TypeID, Level, Seen) values (
(select ID from Book where Name = 'SearchName'),
'SearchName',
5,
6,
(select Seen from Book where Name = 'SearchName'));
You should use the INSERT OR IGNORE command followed by an UPDATE command:
In the following example name is a primary key:
INSERT OR IGNORE INTO my_table (name, age) VALUES ('Karen', 34);
UPDATE my_table SET age = 34 WHERE name='Karen';
The first command will insert the record. If the record exists, it will ignore the error caused by the conflict with an existing primary key.
The second command will update the record (which now definitely exists)
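Wrapped in a helper, the two commands might look like the following Python sqlite3 sketch. The function name save_age and the in-memory table are made up for illustration; running both statements inside one transaction is a suggestion, not something the answer requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT PRIMARY KEY, age INTEGER)")

def save_age(conn, name, age):
    """Insert the row if it is missing, then set the age either way."""
    with conn:  # commits both statements together, or rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO my_table (name, age) VALUES (?, ?)",
            (name, age),
        )
        conn.execute(
            "UPDATE my_table SET age = ? WHERE name = ?",
            (age, name),
        )

save_age(conn, "Karen", 34)  # row does not exist yet: inserted
save_age(conn, "Karen", 35)  # row exists: insert ignored, age updated

people = conn.execute("SELECT name, age FROM my_table").fetchall()
print(people)  # [('Karen', 35)]
```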
You need to set a constraint on the table to trigger a "conflict" which you then resolve by doing a replace:
CREATE TABLE data (id INTEGER PRIMARY KEY, event_id INTEGER, track_id INTEGER, value REAL);
CREATE UNIQUE INDEX data_idx ON data(event_id, track_id);
Then you can issue:
INSERT OR REPLACE INTO data VALUES (NULL, 1, 2, 3);
INSERT OR REPLACE INTO data VALUES (NULL, 2, 2, 3);
INSERT OR REPLACE INTO data VALUES (NULL, 1, 2, 5);
The "SELECT * FROM data" will give you:
2|2|2|3.0
3|1|2|5.0
Note that the data.id is "3" and not "1" because REPLACE does a DELETE and INSERT, not an UPDATE. This also means that you must ensure that you define all necessary columns or you will get unexpected NULL values.
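The id change can be reproduced from Python, and compared with an upsert on the same unique index, which keeps the existing id. This is only a sketch; the final upsert statement is an addition for comparison and assumes SQLite 3.24.0 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (id INTEGER PRIMARY KEY, event_id INTEGER, track_id INTEGER, value REAL);
    CREATE UNIQUE INDEX data_idx ON data(event_id, track_id);
""")

conn.execute("INSERT OR REPLACE INTO data VALUES (NULL, 1, 2, 3)")
conn.execute("INSERT OR REPLACE INTO data VALUES (NULL, 2, 2, 3)")
conn.execute("INSERT OR REPLACE INTO data VALUES (NULL, 1, 2, 5)")  # replaces the row with id 1
after_replace = conn.execute("SELECT * FROM data ORDER BY id").fetchall()
print(after_replace)  # [(2, 2, 2, 3.0), (3, 1, 2, 5.0)]

# For comparison: an upsert updates the conflicting row in place, so its id stays 3.
conn.execute("""
    INSERT INTO data (event_id, track_id, value) VALUES (1, 2, 7)
    ON CONFLICT (event_id, track_id) DO UPDATE SET value = excluded.value
""")
after_upsert = conn.execute("SELECT * FROM data ORDER BY id").fetchall()
print(after_upsert)  # [(2, 2, 2, 3.0), (3, 1, 2, 7.0)]
```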
INSERT OR REPLACE will reset the other fields to their default values.
sqlite> CREATE TABLE Book (
ID INTEGER PRIMARY KEY AUTOINCREMENT,
Name TEXT,
TypeID INTEGER,
Level INTEGER,
Seen INTEGER
);
sqlite> INSERT INTO Book VALUES (1001, 'C++', 10, 10, 0);
sqlite> SELECT * FROM Book;
1001|C++|10|10|0
sqlite> INSERT OR REPLACE INTO Book(ID, Name) VALUES(1001, 'SQLite');
sqlite> SELECT * FROM Book;
1001|SQLite|||
If you want to preserve the other fields:
Method 1
sqlite> SELECT * FROM Book;
1001|C++|10|10|0
sqlite> INSERT OR IGNORE INTO Book(ID) VALUES(1001);
sqlite> UPDATE Book SET Name='SQLite' WHERE ID=1001;
sqlite> SELECT * FROM Book;
1001|SQLite|10|10|0
Method 2
Using UPSERT (syntax was added to SQLite with version 3.24.0 (2018-06-04))
INSERT INTO Book (ID, Name)
VALUES (1001, 'SQLite')
ON CONFLICT (ID) DO
UPDATE SET Name=excluded.Name;
The excluded. prefix refers to the value that would have been inserted, here the one given in VALUES ('SQLite').
First, try to update it. If the affected row count is 0, insert it. It's the easiest approach and works on any RDBMS.
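In Python's sqlite3 module the affected row count is available as cursor.rowcount, so the update-first approach could be sketched like this. The helper name upsert_book is hypothetical and only two of the Book columns are touched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Book (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        Name VARCHAR(60) UNIQUE,
        TypeID INTEGER,
        Level INTEGER,
        Seen INTEGER
    )
""")

def upsert_book(conn, name, type_id):
    """Update the book if it exists; insert it when no row was affected."""
    with conn:  # one transaction for the update and the fallback insert
        cur = conn.execute(
            "UPDATE Book SET TypeID = ? WHERE Name = ?", (type_id, name)
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO Book (Name, TypeID) VALUES (?, ?)", (name, type_id)
            )

upsert_book(conn, "SQLite", 5)  # no row yet: inserted
upsert_book(conn, "SQLite", 6)  # row exists: updated in place, ID unchanged

books = conn.execute("SELECT ID, Name, TypeID FROM Book").fetchall()
print(books)  # [(1, 'SQLite', 6)]
```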
Upsert is what you want. UPSERT syntax was added to SQLite with version 3.24.0 (2018-06-04).
CREATE TABLE phonebook2(
name TEXT PRIMARY KEY,
phonenumber TEXT,
validDate DATE
);
INSERT INTO phonebook2(name,phonenumber,validDate)
VALUES('Alice','704-555-1212','2018-05-08')
ON CONFLICT(name) DO UPDATE SET
phonenumber=excluded.phonenumber,
validDate=excluded.validDate
WHERE excluded.validDate>phonebook2.validDate;
Be warned that at this point the actual word "UPSERT" is not part of the upsert syntax.
The correct syntax is
INSERT INTO ... ON CONFLICT(...) DO UPDATE SET...
and if you are doing INSERT INTO SELECT ... your select needs at least WHERE true to solve parser ambiguity about the token ON with the join syntax.
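The INSERT INTO ... SELECT case could look like the sketch below, where incoming is a hypothetical staging table and the phone numbers are made up. The WHERE true on the SELECT is the part that resolves the parsing ambiguity; the second WHERE belongs to DO UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phonebook2 (name TEXT PRIMARY KEY, phonenumber TEXT, validDate DATE);
    -- Hypothetical staging table holding rows to merge into phonebook2.
    CREATE TABLE incoming (name TEXT, phonenumber TEXT, validDate DATE);
    INSERT INTO phonebook2 VALUES ('Alice', '704-555-1212', '2018-05-08');
    INSERT INTO incoming VALUES
        ('Alice', '704-555-9999', '2019-01-01'),
        ('Bob',   '704-555-0000', '2019-01-01');
""")

conn.execute("""
    INSERT INTO phonebook2 (name, phonenumber, validDate)
    SELECT name, phonenumber, validDate FROM incoming WHERE true
    ON CONFLICT (name) DO UPDATE SET
        phonenumber = excluded.phonenumber,
        validDate = excluded.validDate
    WHERE excluded.validDate > phonebook2.validDate
""")

entries = conn.execute(
    "SELECT name, phonenumber, validDate FROM phonebook2 ORDER BY name"
).fetchall()
print(entries)
```

Alice's existing row should be updated because the incoming validDate is newer, and Bob should be inserted as a new row.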
Be warned that INSERT OR REPLACE... will delete the record before inserting a new one if it has to replace, which could be bad if you have foreign key cascades or other delete triggers.
If you have no primary key, you can insert the row if it does not exist, then do an update. The table must contain at least one entry before using this.
INSERT INTO Test
(id, name)
SELECT
101 as id,
'Bob' as name
FROM Test
WHERE NOT EXISTS(SELECT * FROM Test WHERE id = 101 and name = 'Bob') LIMIT 1;
Update Test SET id='101' WHERE name='Bob';
I believe you want UPSERT.
"INSERT OR REPLACE" without the additional trickery in that answer will reset any fields you don't specify to NULL or other default value. (This behavior of INSERT OR REPLACE is unlike UPDATE; it's exactly like INSERT, because it actually is INSERT; however if what you wanted is UPDATE-if-exists you probably want the UPDATE semantics and will be unpleasantly surprised by the actual result.)
The trickery from the suggested UPSERT implementation is basically to use INSERT OR REPLACE, but specify all fields, using embedded SELECT clauses to retrieve the current value for fields you don't want to change.
I think it's worth pointing out that there can be some unexpected behaviour here if you don't thoroughly understand how PRIMARY KEY and UNIQUE interact.
As an example, if you want to insert a record only if the NAME field isn't currently taken, and if it is, you want a constraint exception to fire to tell you, then INSERT OR REPLACE will not throw an exception and instead will resolve the UNIQUE constraint itself by replacing the conflicting record (the existing record with the same NAME). Gaspard demonstrates this really well in his answer above.
If you want a constraint exception to fire, you have to use an INSERT statement, and rely on a separate UPDATE command to update the record once you know the name isn't taken.
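From Python, the difference shows up as an sqlite3.IntegrityError for the plain INSERT and as a silent replacement for INSERT OR REPLACE. The sketch below uses a cut-down Book table and made-up values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Book (ID INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT UNIQUE, Level INTEGER)"
)
conn.execute("INSERT INTO Book (Name, Level) VALUES ('SQLite', 1)")

# A plain INSERT reports the taken name as a constraint error.
try:
    conn.execute("INSERT INTO Book (Name, Level) VALUES ('SQLite', 2)")
    raised = False
except sqlite3.IntegrityError as exc:
    raised = True
    print("constraint error:", exc)

# INSERT OR REPLACE raises nothing: the existing 'SQLite' row is deleted
# and a new row (with a new ID) takes its place.
conn.execute("INSERT OR REPLACE INTO Book (Name, Level) VALUES ('SQLite', 3)")

books = conn.execute("SELECT ID, Name, Level FROM Book").fetchall()
print(books)
```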
