SQLite - Help to Make Complicated Query More Efficient - sqlite

The data is financial data structured in buckets, where one bucket (a Rollup) can contain other buckets of data. Example structure:
Rollup1 | Dept1
Rollup1 | Rollup2 | Dept2
Rollup1 | Rollup2 | Dept3
Rollup1 | Rollup3 | Dept4
Rollup1 | Rollup3 | Rollup4 | Dept5
Rollup1 | Rollup3 | Rollup4 | Dept6
There are 8 columns like this, with Rollups and Depts scattered throughout (but the leaves are always single Depts). Approx. 10k rows.
The goal of the query is to return a single column with ALL Rollups, where a variable decides which Rollups are shown unchanged and which are shown with a prefix.
For example, if my variable contained "Dept4", my result would be:
Rollup1
Rollup3
NA - Rollup2
NA - Rollup4
In the real scenario, there are 3 variables which determine the display of the Rollup column.
Here is what I have. It returns the correct result, but the performance is VERY bad: one query takes up to 5 seconds, which I would like to improve.
SELECT DISTINCT CASE
WHEN "2" NOT IN
(
SELECT "2"
FROM "Finance New"
WHERE (#VAR3 = 'All' OR #VAR3 IN ("2","3","4","5","6","7","8","9"))
AND (#VAR4 = 'All' OR "10" = #VAR4)
AND (#VAR5 = 'All' OR "11" = #VAR5)
)
THEN
'Z N/A - ' || "2"
ELSE
"2"
END AS COL2
FROM "Finance New"
WHERE "5" <> 'All Applicable' AND "1" <> '9999'
AND "2" LIKE '9%'
UNION
SELECT DISTINCT CASE
WHEN "3" NOT IN
(
SELECT "3"
FROM "Finance New"
WHERE (#VAR3 = 'All' OR #VAR3 IN ("2","3","4","5","6","7","8","9"))
AND (#VAR4 = 'All' OR "10" = #VAR4)
AND (#VAR5 = 'All' OR "11" = #VAR5)
)
THEN
'Z N/A - ' || "3"
ELSE
"3"
END AS COL2
FROM "Finance New"
WHERE "5" <> 'All Applicable' AND "1" <> '9999'
AND "3" LIKE '9%'
UNION
And so on, for each of the columns in the Rollup/Dept tree report.
The inner SELECT in each UNION branch decides, based on the variable criteria, whether the prefix is added. Sorting is done automatically. The last condition before each UNION (AND "3" LIKE '9%') is what actually picks out the Rollups; Rollup values all start with 9.
Input parameters are labeled like #VARx.
I'm wondering if there is a more efficient way of performing this, assuming I cannot create a temp table and cannot change the structure of the data.
Thank you!

All these ORs prevent the use of indexes.
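If at all possible, create the SQL string dynamically, depending on the actual VAR3/4/5 values, so that each OR disappears: when a variable is 'All', leave its condition out entirely; otherwise keep only the comparison against the column(s).
The LIKE prevents the use of indexes (because LIKE (or GLOB) would require TEXT affinity on the indexed column).
Replace this with normal comparisons, i.e., replace "col" LIKE '9%' with "col" >= '9' AND "col" < ':'. (The colon is the character that directly follows '9' in ASCII, so this range covers exactly the strings that start with 9.)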
The UNION already removes duplicates; drop the DISTINCTs.
Without indexes, all these queries do full table scans.
Create the following (covering) indexes:
CREATE INDEX i_10_11_all on "Finance New"("10","11", "2","3","4","5","6","7","8","9");
CREATE INDEX i_11_10_all on "Finance New"("11","10", "2","3","4","5","6","7","8","9");
CREATE INDEX i_2_1_5 on "Finance New"("2", "1","5");
CREATE INDEX i_3_1_5 on "Finance New"("3", "1","5");
-- and so on for 4..9
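The first three suggestions can be tried out with a minimal sketch in Python's standard sqlite3 module. The host language is an assumption, since the question does not name one. build_filter is a hypothetical helper that emits a condition only for variables that are not 'All', and the table t and its values are made up to compare the LIKE with the range comparison.

```python
import sqlite3

# Column names taken from the question; everything else here is illustrative.
ROLLUP_COLS = ["2", "3", "4", "5", "6", "7", "8", "9"]

def build_filter(var3, var4, var5):
    """Return (where_sql, params), skipping every variable that is 'All'."""
    clauses, params = [], []
    if var3 != "All":
        cols = ", ".join(f'"{c}"' for c in ROLLUP_COLS)
        clauses.append(f"? IN ({cols})")
        params.append(var3)
    if var4 != "All":
        clauses.append('"10" = ?')
        params.append(var4)
    if var5 != "All":
        clauses.append('"11" = ?')
        params.append(var5)
    # "1" is an always-true condition for the case where nothing is filtered.
    return (" AND ".join(clauses) or "1"), params

# Made-up table to check that the range comparison matches LIKE '9%'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [("9",), ("9001",), ("95x",), ("8999",), (":abc",), ("A9",)],
)

like_rows = conn.execute(
    "SELECT col FROM t WHERE col LIKE '9%' ORDER BY col"
).fetchall()
range_rows = conn.execute(
    "SELECT col FROM t WHERE col >= '9' AND col < ':' ORDER BY col"
).fetchall()

print(like_rows == range_rows)
print(build_filter("All", "Dept4", "All"))
```

The string returned by build_filter would replace the three OR lines inside each inner SELECT. The values are still passed as bound parameters, so only the shape of the SQL changes with the variables.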

Related

SQLITE: pull items in the same order as IN array

I have the following query:
SELECT * from tWords where WordAton IN ("bbb", "aaa2", "ccc", "aaa1")
the query returns first the results for "aaa1" then for "aaa2" then for "bbb" and then for "ccc". Is there a way to return the results in the order of the input array, which means first the results for "bbb" then for "aaa2"... etc.
Thank you in advance.
You can apply conditional ordering like this:
SELECT *
from tWords
where WordAton IN ('bbb', 'aaa2', 'ccc', 'aaa1')
order by case WordAton
when 'bbb' then 1
when 'aaa2' then 2
when 'ccc' then 3
when 'aaa1' then 4
end
In SQL (not just SQLite), the only way to always return rows in a given order is with a SQL ORDER BY ... clause. So the short answer is, "No", there's no simple way to return rows in the order given by the contents of an IN (...) clause.
You could use a common table expression (CTE) to define a sort order, but that's usually not worth the trouble. This isn't the same thing as ordering by the contents of an IN (...) clause, but it looks the same. (You're ordering by the sort order specified in the CTE.)
with ordered_words as (
select 1 as sort_order, 'bbb' as WordAton union
select 2, 'aaa2' union
select 3, 'ccc' union
select 4, 'aaa1'
)
select t.WordAton
from tWords t
join ordered_words o on t.WordAton = o.WordAton
where t.WordAton in ('bbb', 'aaa2', 'ccc', 'aaa1')
order by o.sort_order;
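If the list of words comes from application code, the CASE ordering from the first answer can be generated instead of typed by hand. This is a minimal sketch in Python's standard sqlite3 module; fetch_in_given_order is a hypothetical helper name and the table contents are invented for the example.

```python
import sqlite3

def fetch_in_given_order(conn, words):
    """Select matching rows from tWords, ordered like the input list."""
    placeholders = ", ".join("?" for _ in words)
    # One "WHEN ? THEN <position>" branch per word, in input order.
    whens = " ".join(f"WHEN ? THEN {i}" for i, _ in enumerate(words))
    sql = (
        f"SELECT WordAton FROM tWords WHERE WordAton IN ({placeholders}) "
        f"ORDER BY CASE WordAton {whens} END"
    )
    # The words are bound twice: once for IN (...), once for the CASE.
    return [row[0] for row in conn.execute(sql, [*words, *words])]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tWords (WordAton TEXT)")
conn.executemany(
    "INSERT INTO tWords VALUES (?)",
    [("aaa1",), ("aaa2",), ("bbb",), ("ccc",), ("ddd",)],
)

ordered = fetch_in_given_order(conn, ["bbb", "aaa2", "ccc", "aaa1"])
print(ordered)
```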

SQLITE - Sorting

I have this DB schema with 2 tables, one for Athletes and one for Results.
I'm trying to get the last (i.e. greatest) elapsed time of each athlete using this query:
Select Query
Select Athletes.BibNumber, Athletes.ChipNumber, Athletes.FirstName, Athletes.LastName, Athletes.Sex, Athletes.Category, count(Results.ElapsedTime) as Lapcount, Results.ElapsedTime
From Results, Athletes
Where Results.ChipNumber = Athletes.ChipNumber and Athletes.Category = 'A (Elite)' and Athletes.Sex = 'M' and Results.Active = 1
Group by Athletes.ChipNumber
Order by (Athletes.Sex = 'M') DESC, Athletes.Sex, Athletes.Category, Lapcount DESC, Results.ElapsedTime ASC;
This works OK if the times are added incrementally, but if I add or change a time afterwards, so that a row with a larger record ID holds a smaller time, the expected elapsed time is not returned.
Running the above query the result is:
"1" "2018001" "User" "2" "M" "A (Elite)" "5" "00:00:00.000"
"2" "2018002" "User" "1" "M" "A (Elite)" "5" "01:18:09.923"
But I would like to have:
"1" "2018001" "User" "2" "M" "A (Elite)" "5" "01:11:51.384"
"2" "2018002" "User" "1" "M" "A (Elite)" "5" "01:18:09.923"
DB Schema
CREATE TABLE IF NOT EXISTS `Results` (
`ID` INTEGER PRIMARY KEY AUTOINCREMENT,
`ChipNumber` TEXT,
`ReaderTime` TEXT,
`Antenna` TEXT,
`ElapsedTime` TEXT,
`Active` INTEGER DEFAULT 0
);
INSERT INTO `Results` (ID,ChipNumber,ReaderTime,Antenna,ElapsedTime,Active) VALUES
(72354,'2018002','2018/07/29 12:01:39.000','Gun','00:00:00.000',1),
(72383,'2018001','2018/07/29 12:19:07.975','S3','00:17:28.974',1),
(72386,'2018002','2018/07/29 12:19:51.877','S3','00:18:12.876',1),
(72411,'2018001','2018/07/29 12:36:49.677','S3','00:35:10.676',1),
(72415,'2018002','2018/07/29 12:39:29.232','S3','00:37:50.231',1),
(72433,'2018001','2018/07/29 12:55:08.811','S3','00:53:29.810',1),
(72439,'2018002','2018/07/29 12:59:37.760','M3','00:57:58.759',1),
(72452,'2018001','2018/07/29 13:13:30.385','S3','01:11:51.384',1),
(72456,'2018002','2018/07/29 13:19:48.923','Manual','01:18:09.923',1),
(72465,'2018001','2018/07/29 12:01:39.000','Gun','00:00:00.000',1);
CREATE TABLE IF NOT EXISTS `Athletes` (
`ID` INTEGER PRIMARY KEY AUTOINCREMENT,
`FirstName` TEXT,
`LastName` TEXT,
`Sex` TEXT DEFAULT 'M',
`Category` TEXT DEFAULT NULL,
`BibNumber` INTEGER DEFAULT 0,
`ChipNumber` TEXT DEFAULT 0,
`Active` BOOLEAN DEFAULT 0
);
INSERT INTO `Athletes` (ID,FirstName,LastName,Sex,Category,BibNumber,ChipNumber,Active) VALUES
(3,'User','1','M','A (Elite)',2,'2018002',1),
(29,'User','2','M','A (Elite)',1,'2018001',1);
I believe that your issue is due to the following (note the part about non-aggregate expressions being evaluated for an arbitrarily selected row):
If the SELECT statement is an aggregate query without a GROUP BY
clause, then each aggregate expression in the result-set is evaluated
once across the entire dataset. Each non-aggregate expression in the
result-set is evaluated once for an arbitrarily selected row of the
dataset. The same arbitrarily selected row is used for each
non-aggregate expression. Or, if the dataset contains zero rows, then
each non-aggregate expression is evaluated against a row consisting
entirely of NULL values.
SQL As Understood By SQLite - SELECT - 3. Generation of the set of result rows.
As such, to ensure that you get the maximum value for the elapsed time, you should use an aggregate function, max() in your case.
Therefore, I believe the following will work for you:
SELECT Athletes.BibNumber, Athletes.ChipNumber, Athletes.FirstName, Athletes.LastName, Athletes.Sex, Athletes.Category,
count(Results.ElapsedTime) AS Lapcount,
max(Results.ElapsedTime) AS ElapsedTime
FROM Results JOIN Athletes ON Results.ChipNumber = Athletes.ChipNumber
GROUP BY Athletes.ChipNumber
ORDER BY (Athletes.Sex = 'M') DESC, Athletes.Sex, Athletes.Category, Lapcount DESC, Results.ElapsedTime ASC;
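The effect of max() can be checked against a cut-down version of the sample data. This is a minimal sketch in Python's standard sqlite3 module; the tables are reduced to the columns that matter here, only four of the ten result rows are loaded, and the ORDER BY is simplified, so it is an illustration rather than the full query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Results (ID INTEGER PRIMARY KEY, ChipNumber TEXT,
                      ElapsedTime TEXT, Active INTEGER);
CREATE TABLE Athletes (ChipNumber TEXT, LastName TEXT);
-- The row with the largest ID for 2018001 holds the smallest time.
INSERT INTO Results VALUES
  (72354, '2018002', '00:00:00.000', 1),
  (72452, '2018001', '01:11:51.384', 1),
  (72456, '2018002', '01:18:09.923', 1),
  (72465, '2018001', '00:00:00.000', 1);
INSERT INTO Athletes VALUES ('2018002', '1'), ('2018001', '2');
""")

rows = conn.execute("""
SELECT a.ChipNumber,
       count(r.ElapsedTime) AS Lapcount,
       max(r.ElapsedTime)   AS ElapsedTime
FROM Results r JOIN Athletes a ON r.ChipNumber = a.ChipNumber
WHERE r.Active = 1
GROUP BY a.ChipNumber
ORDER BY Lapcount DESC, ElapsedTime ASC
""").fetchall()

for row in rows:
    print(row)
```

Because the times are stored as zero-padded HH:MM:SS.mmm text, comparing them as strings gives the same order as comparing them as durations, which is what max() relies on here.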

table join order is changed between w/ and w/o keys

I found that the query plan changes depending on whether the ranges table is defined with or without keys.
CREATE TABLE `data` (
`name` TEXT,
`value` NUMERIC,
PRIMARY KEY(`name`)
) WITHOUT ROWID;
CREATE TABLE `ranges` (
`begin` TEXT,
`end` TEXT
);
explain query plan select distinct t1.name as name from data t1, ranges t2 where t1.name between t2.begin and t2.end order by name;
"0" "0" "1" "SCAN TABLE ranges AS t2"
"0" "1" "0" "SEARCH TABLE data AS t1 USING PRIMARY KEY (name>? AND name<?)"
"0" "0" "0" "USE TEMP B-TREE FOR DISTINCT"
If I defined begin and end as keys,
CREATE TABLE `ranges` (
`begin` TEXT,
`end` TEXT,
PRIMARY KEY(`begin`,`end`)
);
the query plan is changed to the following.
"0" "0" "0" "SCAN TABLE data AS t1"
"0" "1" "1" "SEARCH TABLE ranges AS t2 USING COVERING INDEX sqlite_autoindex_ranges_1 (begin<?)"
The first query plan is better because in my case data table is much larger than ranges.
I read https://sqlite.org/optoverview.html. It says that, without analysis results, the join order is determined by SQLite's default choice. Does adding those keys change the default choice? Is there any other trick to make SQLite use the first query plan without providing stat data?
Also, is the default choice stable, or might it change in later versions? I use 3.22.
I also noticed that if I do not use ORDER BY and DISTINCT, it always uses the first plan:
explain query plan select t1.name as name from data t1, ranges t2 where t1.name between t2.begin and t2.end;
"0" "0" "1" "SCAN TABLE ranges AS t2"
"0" "1" "0" "SEARCH TABLE data AS t1 USING PRIMARY KEY (name>? AND name<?)"

How to select last 5 results from each user_id in an sqlite database?

I have a database with "num" table like this
user_id | number | unix_time
-----------------------------
123 2 xxxxxxxx
123 40 xxxxxxxx
123 24 xxxxxxxx
333 23 xxxxxxxx
333 67 xxxxxxxx
854 90 xxxxxxxx
I'd like to select the last 5 numbers inserted by each user_id, but I can't figure out how to do it.
I tried:
SELECT b.n, a.user_id
FROM num a
JOIN num b on a.user_id = b.user_id
WHERE (
SELECT COUNT(*)
FROM num b2
WHERE b2.n <= b.n
AND b2.user_id = b.user_id
) <= 5
I am adapting the answer from (sql query - how to apply limit within group by).
I use "2" instead of "5" to make the effect visible within your sample data.
Note that I used actual dates instead of your "xxxxxxxx", assuming that most likely you mean "most recent 5" when you write "last 5" and that only works for actual times.
select * from toy a
where a.ROWID IN
( SELECT b.ROWID FROM toy b
WHERE b.user_id = a.user_id
ORDER by unix_time DESC
LIMIT 2
) ;
How it is done:
Make on-the-fly tables (i.e. the part within the parentheses), one for each user_id: WHERE b.user_id = a.user_id
Order each on-the-fly table separately (that is the first trick) by doing the ordering inside the parentheses, chronologically backwards: ORDER by unix_time DESC
Limit each on-the-fly table separately (that is the second trick) by doing the limiting inside the parentheses, to 5 entries (2 in the example): LIMIT 2
Select everything from the actual table (select * from toy), but only those rows which occur in one of the on-the-fly tables: where a.ROWID IN (
Introduce the distinguishing alias "a" for the total view of the table: toy a
Introduce the distinguishing alias "b" for the single-user_id view of the table: toy b
By the way, here is the dump of what I used for testing
(it is a convenient way of making most of a MCVE):
BEGIN TRANSACTION;
CREATE TABLE toy (user_id int, number int, unix_time date);
INSERT INTO toy VALUES(123,2,'1970-01-01 05:33:20');
INSERT INTO toy VALUES(123,40,'1970-01-01 06:56:40');
INSERT INTO toy VALUES(123,24,'1970-01-01 08:20:00');
INSERT INTO toy VALUES(333,23,'1970-01-01 11:06:40');
INSERT INTO toy VALUES(333,67,'1970-01-01 12:30:00');
INSERT INTO toy VALUES(854,90,'1970-01-01 13:53:20');
COMMIT;
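The query can be run against this dump directly. Below is a minimal sketch in Python's standard sqlite3 module, where last_n_per_user is a hypothetical helper name and the limit is passed as a bound parameter instead of being hard-coded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE toy (user_id int, number int, unix_time date);
INSERT INTO toy VALUES(123,2,'1970-01-01 05:33:20');
INSERT INTO toy VALUES(123,40,'1970-01-01 06:56:40');
INSERT INTO toy VALUES(123,24,'1970-01-01 08:20:00');
INSERT INTO toy VALUES(333,23,'1970-01-01 11:06:40');
INSERT INTO toy VALUES(333,67,'1970-01-01 12:30:00');
INSERT INTO toy VALUES(854,90,'1970-01-01 13:53:20');
""")

def last_n_per_user(conn, n):
    """Return (user_id, number) for the n most recent rows of each user."""
    return conn.execute(
        """
        SELECT a.user_id, a.number
        FROM toy a
        WHERE a.ROWID IN (
            SELECT b.ROWID FROM toy b
            WHERE b.user_id = a.user_id
            ORDER BY b.unix_time DESC
            LIMIT ?
        )
        ORDER BY a.user_id, a.unix_time DESC
        """,
        (n,),
    ).fetchall()

for row in last_n_per_user(conn, 2):
    print(row)
```

The outer ORDER BY is only there to make the output deterministic; it is not needed for the per-user limit itself.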
If you only need the last 5 records of the whole table (not the last 5 per user_id), a plain ORDER BY with LIMIT is enough:
SELECT * FROM table_name ORDER BY unix_time DESC LIMIT 5;
Using this query you can select the last n records by changing the LIMIT.

Combine two counts in SQLite

I have the following table ("Table") format
A B C
"801331" "5755270" "0"
"1761861" "10556391" "1"
"1761861" "10557381" "33"
"1761861" "11069131" "33"
"801331" "24348751" "0"
"801331" "77219852" "0"
"1761861" "557880972" "0"
And I would like to count and present two different quantities in one table grouped by column A.
The first is:
SELECT A, COUNT(*) FROM Table GROUP BY A
The second one has one condition:
SELECT A, COUNT(*) FROM Table WHERE C != 0 GROUP BY A
I want to have the following result
A 1st 2nd
"1761861" "4" "3"
"801331" "3" "0"
I tried a few answers from questions such as this one, yet I could not make it work, as the result is a single row.
I get that it is probably pretty easy, yet I cannot make it work.
Is there a (simple) way to do it?
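Either of these works (TABLE is a reserved word, so the name has to be quoted if the table really is called "Table"):
SELECT A, sum(1), sum(case when C <> 0 then 1 else 0 end) FROM "Table" GROUP BY A;
SELECT A, count(*), count(case when C <> 0 then A else null end) FROM "Table" GROUP BY A;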
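Both forms can be checked against the sample rows from the question. This is a minimal sketch in Python's standard sqlite3 module; it assumes C is stored as an integer, and the column aliases cnt_all and cnt_nonzero are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "Table" must be quoted because TABLE is a reserved word.
conn.execute('CREATE TABLE "Table" (A TEXT, B TEXT, C INTEGER)')
conn.executemany(
    'INSERT INTO "Table" VALUES (?, ?, ?)',
    [
        ("801331", "5755270", 0),
        ("1761861", "10556391", 1),
        ("1761861", "10557381", 33),
        ("1761861", "11069131", 33),
        ("801331", "24348751", 0),
        ("801331", "77219852", 0),
        ("1761861", "557880972", 0),
    ],
)

# Variant 1: sum() over a 0/1 flag.
with_sum = conn.execute("""
    SELECT A, sum(1) AS cnt_all,
           sum(CASE WHEN C <> 0 THEN 1 ELSE 0 END) AS cnt_nonzero
    FROM "Table" GROUP BY A ORDER BY cnt_all DESC
""").fetchall()

# Variant 2: count() ignores NULL, so non-matching rows are not counted.
with_count = conn.execute("""
    SELECT A, count(*) AS cnt_all,
           count(CASE WHEN C <> 0 THEN A ELSE NULL END) AS cnt_nonzero
    FROM "Table" GROUP BY A ORDER BY cnt_all DESC
""").fetchall()

print(with_sum)
print(with_count)
```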
