Why is SQLite insertion cost linear? - sqlite

I had assumed that the time cost of an insertion is logarithmic in the number of records, but my test in SQLite 3.22 seems to show that it is linear. Note that both the X and Y axes are on a log scale.
The size(k) column is the number of rows inserted in each test, in thousands. I ran three tests. Journaling and synchronous are off, and locking_mode is exclusive. All operations are included in one transaction.
time1
create table t1 (id primary key, name text);
create index nameIdx on t1(name)
// for i = [1:<size>]
// insert into t1 values(i, "foo"i)
create table t2 (id primary key, value int);
// for i = [1:<size>]
// insert into t2 values(i, i)
time2
create table t1 (id primary key, name text);
// for i = [1:<size>]
// insert into t1 values(i, "foo"i)
create index nameIdx on t1(name)
create table t2 (id primary key, value int);
// for i = [1:<size>]
// insert into t2 values(i, i)
time3
create table t1 (id primary key, name text, value int);
// for i = [1:<size>]
// insert into t1 values(i, "foo"i, 0)
create index nameIdx on t1(name)
// for i = [1:<size>]
// update t1 set value=0 where id=<i>
All three test cases have similar costs, and they seem linear.
I had also expected case 3 to be faster, because an update does not need to rebalance the tree or add new records. But case 3 is slightly slower.
Are these results expected? Maybe my input data is too small to reveal the logarithmic complexity?

SQLite optimizes inserts at the end of the table. sqlite3BtreeInsert() in btree.c has:
/* If the cursor is currently on the last row and we are appending a
** new row onto the end, set the "loc" to avoid an unnecessary
** btreeMoveto() call */
To get a worse run time, try inserting the rows in random order, or at least inserting a very large value first.
Anyway, the run time is dominated by disk I/O, and the non-leaf pages are most likely to be cached. Use more transactions (so that all pages need to be flushed to disk), or use an in-memory database.
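As a rough way to try this, the sketch below times sequential versus shuffled key order against an in-memory database, using Python's standard sqlite3 module. The helper name time_inserts and the row count are assumptions made for illustration; the schema follows t1 from the question without the name index. Absolute timings depend on the machine and the SQLite build, so none are quoted here.

```python
import random
import sqlite3
import time


def time_inserts(keys):
    """Insert the given keys into a fresh in-memory table.

    Returns (elapsed seconds, number of rows stored).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (id PRIMARY KEY, name TEXT)")
    start = time.perf_counter()
    with conn:  # a single transaction around all inserts
        conn.executemany(
            "INSERT INTO t1 VALUES (?, ?)",
            ((k, f"foo{k}") for k in keys),
        )
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM t1").fetchone()[0]
    conn.close()
    return elapsed, count


n = 50_000  # hypothetical size; increase it to see the trend more clearly
sequential = list(range(1, n + 1))
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)  # fixed seed so runs are repeatable

seq_time, seq_count = time_inserts(sequential)
rnd_time, rnd_count = time_inserts(shuffled)
print(f"sequential: {seq_time:.3f}s  shuffled: {rnd_time:.3f}s")
```

Running it for several values of n and comparing the two columns shows how much of the apparent linearity comes from the append optimization.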

Related

Automatic random primary key within a specific range in sqlite3?

I would like to automatically insert a primary key every time I add a new record to an SQLite3 table, much like a PRIMARY KEY AUTOINCREMENT except that the value should be randomly chosen from some range (say 0000 through 9999) rather than being assigned sequentially.
For demonstration purposes, let's restrict the range to 1 through 6 instead and try to populate the following table:
CREATE TABLE dice (rolled INTEGER PRIMARY KEY NOT NULL);
Now every time I insert a new record into that table, I want a new random primary key to be created.
The following works and does exactly what I want
INSERT INTO dice VALUES(
(
WITH RECURSIVE roll(x) AS (
VALUES(ABS(RANDOM()) % 6 + 1)
UNION ALL
SELECT x % 6 + 1 FROM roll
)
SELECT x FROM roll WHERE (
SELECT COUNT(*) FROM dice where rolled = x
) = 0 LIMIT 1
)
);
except that I have to invoke it manually/explicitly.
Is there any way to embed the above (or an equivalent) calculation for the random primary key into a DEFAULT clause for the "rolled" column or into some sort of trigger, so that a new key will be calculated automatically every time I insert a record?
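For reference, here is a minimal sketch of the manual approach driven from Python's standard sqlite3 module. It only exercises the statement shown above and is not a DEFAULT-clause or trigger solution; the constant name ROLL_SQL is an assumption made for illustration.

```python
import sqlite3

# The INSERT from the question, unchanged apart from whitespace.
ROLL_SQL = """
INSERT INTO dice VALUES (
  (
    WITH RECURSIVE roll(x) AS (
      VALUES (ABS(RANDOM()) % 6 + 1)
      UNION ALL
      SELECT x % 6 + 1 FROM roll
    )
    SELECT x FROM roll
    WHERE (SELECT COUNT(*) FROM dice WHERE rolled = x) = 0
    LIMIT 1
  )
)
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dice (rolled INTEGER PRIMARY KEY NOT NULL)")

# Six inserts fill the whole range. A seventh would never terminate,
# because the recursive query keeps cycling while looking for a free value.
for _ in range(6):
    conn.execute(ROLL_SQL)

rolled = sorted(row[0] for row in conn.execute("SELECT rolled FROM dice"))
print(rolled)
```

The order in which the values are taken differs from run to run, but after six inserts every value from 1 through 6 is present exactly once.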

SQLITE fill value with unique random table

I want to create a table with a field that is unique and limited to a certain maximum value. Let's say the limit is 100 and the table is full: if I remove a random row and then create a new row, the new row should get the value that was just freed.
It doesn't need to be the fastest thing in the world (the limit is quite small), I just want to implement it in a DB.
Any ideas?
Create one more column in the main table, say deleted (integer, 0 or 1). When you need to delete the row with a certain id, do not really delete it; simply set deleted to 1:
UPDATE mytable SET deleted=1 WHERE id = <id_to_delete>
When you need to insert, find an id to reuse:
SELECT id FROM mytable WHERE deleted LIMIT 1
If this query returns an empty result, use INSERT to create a new id. Otherwise, simply update the existing row:
UPDATE mytable SET deleted=0, name='blah', ... WHERE id=<id_to_reuse>
All queries reading from your main table should include a NOT deleted condition in their WHERE clause:
SELECT * FROM mytable WHERE NOT deleted
If you add an index on deleted, this method should be fast even for a large number of rows.
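The soft-delete scheme above can be sketched with Python's standard sqlite3 module. The helper names soft_delete and insert_reusing, the index name, and the single name column are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("CREATE INDEX mytable_deleted ON mytable(deleted)")


def soft_delete(row_id):
    """Mark a row as deleted instead of removing it."""
    conn.execute("UPDATE mytable SET deleted = 1 WHERE id = ?", (row_id,))


def insert_reusing(name):
    """Reuse the id of a soft-deleted row if one exists, else insert a new row."""
    row = conn.execute("SELECT id FROM mytable WHERE deleted LIMIT 1").fetchone()
    if row is None:
        cur = conn.execute("INSERT INTO mytable (name) VALUES (?)", (name,))
        return cur.lastrowid
    conn.execute(
        "UPDATE mytable SET deleted = 0, name = ? WHERE id = ?", (name, row[0])
    )
    return row[0]


ids = [insert_reusing(f"row{i}") for i in range(3)]  # fresh ids 1, 2, 3
soft_delete(2)
reused = insert_reusing("replacement")  # takes over the freed id
live = conn.execute(
    "SELECT id, name FROM mytable WHERE NOT deleted ORDER BY id"
).fetchall()
print(reused, live)
```

Enforcing the upper limit (100 in the question) is left out of this sketch; it would need an extra check before the INSERT branch.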
This solution does everything in a trigger, so you can just use a normal INSERT.
For the table itself, we use an autoincrementing ID column:
CREATE TABLE MyTable(ID INTEGER PRIMARY KEY, Name);
We need another table to store an ID temporarily:
CREATE TABLE moriturus(ID INTEGER PRIMARY KEY);
And the trigger:
CREATE TRIGGER MyTable_DeleteAndReorder
AFTER INSERT ON MyTable
FOR EACH ROW
WHEN (SELECT COUNT(*) FROM MyTable) > 100
BEGIN
-- first, select a random record to be deleted, and save its ID
DELETE FROM moriturus;
INSERT INTO moriturus
SELECT ID FROM MyTable
WHERE ID <> NEW.ID
ORDER BY random()
LIMIT 1;
-- then actually delete it
DELETE FROM MyTable
WHERE ID = (SELECT ID
FROM moriturus);
-- then change the just inserted record to have that ID
UPDATE MyTable
SET ID = (SELECT ID
FROM moriturus)
WHERE ID = NEW.ID;
END;
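To see the trigger in action, the following sketch loads the same schema through Python's standard sqlite3 module and inserts more rows than the limit. The row count of 250 and the generated names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable(ID INTEGER PRIMARY KEY, Name);
CREATE TABLE moriturus(ID INTEGER PRIMARY KEY);
CREATE TRIGGER MyTable_DeleteAndReorder
AFTER INSERT ON MyTable
FOR EACH ROW
WHEN (SELECT COUNT(*) FROM MyTable) > 100
BEGIN
  -- pick a random victim (never the new row) and remember its ID
  DELETE FROM moriturus;
  INSERT INTO moriturus
    SELECT ID FROM MyTable WHERE ID <> NEW.ID ORDER BY random() LIMIT 1;
  -- delete the victim, then give its ID to the new row
  DELETE FROM MyTable WHERE ID = (SELECT ID FROM moriturus);
  UPDATE MyTable SET ID = (SELECT ID FROM moriturus) WHERE ID = NEW.ID;
END;
""")

with conn:
    for i in range(250):
        conn.execute("INSERT INTO MyTable(Name) VALUES (?)", (f"row{i}",))

row_count, min_id, max_id = conn.execute(
    "SELECT COUNT(*), MIN(ID), MAX(ID) FROM MyTable"
).fetchone()
newest_kept = conn.execute(
    "SELECT COUNT(*) FROM MyTable WHERE Name = 'row249'"
).fetchone()[0]
print(row_count, min_id, max_id, newest_kept)
```

The table never grows past 100 rows and the IDs stay within 1 to 100, because each insert beyond the limit takes over the ID of the row it replaces.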

How to generate a column with id which increments on every insert

This is my table, where I want PNRNo to be generated as 'PNRRES001' for the first entry and 'PNRRES002', 'PNRRES003', and so on for consecutive entries.
So when creating the table I bound that column to a function that generates the PNR number. The user only has to enter the CustomerNo from the front end, and the row with the PNR and customer number is added to the PNRDetails table.
CREATE TABLE PNRDetails(PNRNo AS (DBO.FuncIncPNR()), customerNo INT)
--FUNCTION TO GENERATE THE PNR NUMBER
ALTER FUNCTION dbo.FuncIncPNR()
RETURNS VARCHAR(20)
AS
BEGIN
DECLARE @RR VARCHAR(20) SET @RR='PNRRESA001'
--here i have checked if no value is there then return the first value as 'PNRRESA001'
IF((SELECT COUNT(*)FROM PNRDetails)=0)
BEGIN
RETURN @RR
END
ELSE
-- if any value is there then take the last value and add 1 to it and update to the table
BEGIN
DECLARE @pnr VARCHAR(20),@S1 VARCHAR(20),@S2 INT
DECLARE PNRCursor CURSOR Static
FOR SELECT PNRNo FROM PNRDetails
OPEN PNRCursor
FETCH LAST FROM PNRNo INTO @pnr
SET @S1=SUBSTRING(@pnr,1,7)
SET @S2=RIGHT(@pnr,3)
SET @S2=@S2+1;
SET @pnr=@S1+@S2;
END
RETURN @pnr
END
--Here am inserting only customerNo as 5 and the PNR should be generated by my function
INSERT INTO PNRDetails VALUES(5)
--it shows 1 row updated :)
SELECT * FROM PNRDetails
-- but when i run select command it shows
--Maximum stored procedure, function, trigger, or view nesting level exceeded (limit 32). :(
You can run this yourself. Please help if you find anything that could solve it; any help will be appreciated.
You could try to use a computed column and an identity column instead.
create table PNRDetails
(
ID int identity,
PNRNo as 'PNRRES'+right(1000+ID, 3),
customerNo int
)
I would suggest just using an IDENTITY as your id instead: let SQL Server handle the assignment of each id number with all its built-in guards for concurrency, and leave the formatting up to the UI. Or create a computed column that defines the formatted version of the ID if you really do need it in the DB.
The risks you run with your intended approach are:
poor performance
concurrency issues, if lots of ids are being generated around the same time
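If the formatting is left to the UI, as suggested above, it can be a one-line helper. The sketch below is in Python; the function name format_pnr and its parameters are assumptions made for illustration, and the identity value is taken to be whatever the database returned for the inserted row.

```python
def format_pnr(identity_id: int, prefix: str = "PNRRES", width: int = 3) -> str:
    """Format a database-assigned identity value as a PNR string.

    The number is zero-padded to at least `width` digits; larger values
    simply use more digits instead of wrapping around.
    """
    if identity_id < 1:
        raise ValueError("identity values are expected to start at 1")
    return f"{prefix}{identity_id:0{width}d}"


first = format_pnr(1)
later = format_pnr(42)
overflow = format_pnr(1000)
print(first, later, overflow)
```

Note the difference from a right(1000+ID, 3) computed column, which keeps only the last three digits once the ID reaches 1000.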
If you are happy to change the table structure, the following will do the job.
CREATE TABLE [dbo].[PNRDetails](
[autoId] [int] IDENTITY(1,1) NOT NULL,
[prnNo] AS ('PNRRES'+right('000'+CONVERT([varchar](3),[dbo].[GetRowCount]([autoId]),(0)),(3))),
[customerNo] [int] NOT NULL,
CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED
(
[autoId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
EDIT: to address the identity issue for your requirement, please create the following function and pass [autoId] to it in the computed column, as in the (edited) table definition above.
CREATE FUNCTION dbo.GetRowCount
(
@autoId INT
)
RETURNS INT
AS
BEGIN
DECLARE @RESULTS AS INT
SELECT @RESULTS = COUNT(autoId) FROM PNRDetails WHERE PNRDetails.autoId<@autoId
RETURN @RESULTS + 1
END
GO
--INSERT
INSERT INTO PNRDetails (customerNo) VALUES(5)
1) Use an identity column in your database (INTEGER)
PROS: easy; normally no gaps between generated ids (a rolled-back insert can still leave one)
CONS: you have to select the inserted id and return it via a procedure/query if you want to show it to the end user
2) Define a database sequence
PROS: easy to implement; the value can be stored or shown to the user before the form is even saved
CONS: gaps appear if an id is generated and then not used
3) Select max(id) + 1 from the column
PROS: useful where only a single user inserts into the table
CONS: disastrous in an environment where multiple users insert into the same table (mismatched max ids)
4) Use a database trigger to auto-increment the column
PROS: automated
CONS: hard to debug (you have to make sure it never breaks, otherwise the insert fails)
Change the way your function works, something like this:
CREATE FUNCTION dbo.fn_FuncIncPNR(@ID int)
RETURNS varchar(20)
BEGIN
Declare @Retval varchar(20),
@No varchar(4)
Select @No = convert(varchar(4), @ID)
while Len(@No) < 4
Select @No = '0' + @No
Select @Retval = 'PNRRESA' + @No
RETURN @Retval
END
Notice that the function now takes a parameter.
Change your table definition to this:
CREATE TABLE PNRDetails(PNRNo AS (dbo.fn_FuncIncPNR(wID)), wID int IDENTITY(1,1) NOT NULL, customerNo INT)
That should solve your problem.

counting rows of sqlite INSERT SELECT

I have two sqlite tables, where one table has a foreign key of the other.
CREATE TABLE a (id INTEGER PRIMARY KEY NOT NULL, value TEXT UNIQUE NOT NULL);
CREATE TABLE b (id INTEGER PRIMARY KEY NOT NULL, a INTEGER REFERENCES a (id) NOT NULL, value TEXT NOT NULL);
I am doing an INSERT with a SELECT into b.
INSERT INTO b (a, value) SELECT a.id, ?value FROM a WHERE a.value == ?a;
How do I know whether a row was inserted into b or not? Doing a SELECT for the just-inserted values and checking whether they exist seems rather inefficient.
I hope the changes() function can help you.
The changes() function returns the number of database rows that were
changed or inserted or deleted by the most recently completed INSERT,
DELETE, or UPDATE statement, exclusive of statements in lower-level
triggers. The changes() SQL function is a wrapper around the
sqlite3_changes() C/C++ function and hence follows the same rules for
counting changes.
So changes() returns 1 if a row was inserted and 0 otherwise.
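A minimal sketch of that check with Python's standard sqlite3 module follows. The helper name insert_b and the sample values are assumptions made for illustration; the tables are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY NOT NULL, value TEXT UNIQUE NOT NULL);
CREATE TABLE b (id INTEGER PRIMARY KEY NOT NULL,
                a INTEGER REFERENCES a (id) NOT NULL,
                value TEXT NOT NULL);
INSERT INTO a (value) VALUES ('known');
""")

INSERT_SQL = (
    "INSERT INTO b (a, value) "
    "SELECT a.id, :value FROM a WHERE a.value = :a"
)


def insert_b(a_value, b_value):
    """Run the INSERT ... SELECT and return how many rows it inserted."""
    conn.execute(INSERT_SQL, {"a": a_value, "value": b_value})
    # changes() reports the row count of the most recently completed INSERT.
    return conn.execute("SELECT changes()").fetchone()[0]


hit = insert_b("known", "payload")     # a matching row exists in a
miss = insert_b("unknown", "payload")  # no matching row, nothing inserted
print(hit, miss)
```

In Python the same number is also available as the rowcount attribute of the cursor returned by execute, which avoids the second query.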

Numbering comments in ASP.NET and SQL Server

I've been thinking about the best way to store comments in a database with numbers that are assigned per article.
The idea is to store comments with a composite primary key (commentId, articleId), where commentId is generated according to the given articleId. The generation should follow the same principle as IDENTITY columns in SQL Server: if someone deletes a comment, its number is never used again. I guess Microsoft SQL Server has no built-in functionality to do that with a composite PK, so I am asking for a substitute.
My first thought was to use a transaction to get MAX(commentId) + 1, but I am looking for something more abstract (maybe an INSTEAD OF trigger), something that could be used, for example, from LINQ with no knowledge of the background: just insert all required values (so no commentId) into the appropriate table and save.
I would use an autogenerated identity column for the commentId and have it be the primary key alone. I'd create an index on the articleId for look ups. I would also have createdDate column that is autopopulated with the current date on insertion -- mark it as db generated and readonly in LINQ so it doesn't require or try to insert/update the value. To get a numbering -- if showing them by date isn't enough -- I'd order by createdDate inversed and assign a numeric value in the select using Row_Number() or a numbering on the client side.
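The client-side numbering mentioned above could look like the following sketch, written in Python against SQLite purely for illustration. The table layout, the index name, and the helper numbered_comments are assumptions, and the rows are ordered oldest first here; flip the ORDER BY for the reverse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments ("
    " commentId INTEGER PRIMARY KEY,"  # auto-assigned, sole primary key
    " articleId INTEGER NOT NULL,"
    " createdDate TEXT NOT NULL,"
    " content TEXT NOT NULL)"
)
conn.execute("CREATE INDEX Comments_articleId ON Comments(articleId)")
conn.executemany(
    "INSERT INTO Comments (articleId, createdDate, content) VALUES (?, ?, ?)",
    [
        (1, "2024-01-01 10:00:00", "first"),
        (2, "2024-01-01 10:05:00", "other article"),
        (1, "2024-01-01 10:10:00", "second"),
    ],
)


def numbered_comments(article_id):
    """Return (number, content) pairs for one article, numbered at read time."""
    rows = conn.execute(
        "SELECT content FROM Comments"
        " WHERE articleId = ? ORDER BY createdDate, commentId",
        (article_id,),
    )
    return [(n, content) for n, (content,) in enumerate(rows, start=1)]


article_one = numbered_comments(1)
print(article_one)
```

Numbers assigned this way shift when an earlier comment is deleted, so they are not suitable if comments refer to each other by number.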
I would use an identity column as the key for the comments, why do you need a numbering for the comments stored in the database?
Thank you for responses, I wanted something with numbered comments because of referencing in the text of comments. I did not want to make reaction by names, sometimes one person reacts more times, so with this system, I will know to which one the person is replying.
So today I made up this INSTEAD OF INSERT trigger:
CREATE TRIGGER InsertComments ON Comments
INSTEAD OF INSERT
AS
DECLARE @Inserted TABLE
(
ArticleId INT NOT NULL,
UserId INT NOT NULL,
CommentDate DATETIME NOT NULL,
Content NVARCHAR(1000) NOT NULL,
RowNumber INT NOT NULL
)
INSERT INTO @Inserted
SELECT ArticleId, UserId, CommentDate, Content, ROW_NUMBER() OVER (ORDER BY CommentDate) AS RowNumber
FROM INSERTED
DECLARE @NumberOfRows INT = (SELECT COUNT(*) FROM @Inserted)
DECLARE @i INT = 1
WHILE (@i <= @NumberOfRows)
BEGIN
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
DECLARE @CommentId INT = (SELECT ISNULL(MAX(CommentId), 0)
FROM Comments WHERE ArticleId = (SELECT ArticleId
FROM @Inserted WHERE RowNumber = @i)) + 1
INSERT INTO Comments(CommentId, ArticleId, UserId, CommentDate, Content)
SELECT @CommentId, ArticleId, UserId, CommentDate, Content
FROM @Inserted WHERE RowNumber = @i
COMMIT
SET @i = @i + 1
END
I know this is not the perfect solution, but it works exactly how I needed. If any of you has some comments, I'll be happy to read them.
