Table join order changes between w/ and w/o keys - SQLite

I found that the query plan changes depending on whether keys are defined.
CREATE TABLE `data` (
`name` TEXT,
`value` NUMERIC,
PRIMARY KEY(`name`)
) WITHOUT ROWID;
CREATE TABLE `ranges` (
`begin` TEXT,
`end` TEXT
);
explain query plan select distinct t1.name as name from data t1, ranges t2 where t1.name between t2.begin and t2.end order by name;
"0" "0" "1" "SCAN TABLE ranges AS t2"
"0" "1" "0" "SEARCH TABLE data AS t1 USING PRIMARY KEY (name>? AND name<?)"
"0" "0" "0" "USE TEMP B-TREE FOR DISTINCT"
If I define begin and end as the primary key,
CREATE TABLE `ranges` (
`begin` TEXT,
`end` TEXT,
PRIMARY KEY(`begin`,`end`)
);
the query plan changes to the following.
"0" "0" "0" "SCAN TABLE data AS t1"
"0" "1" "1" "SEARCH TABLE ranges AS t2 USING COVERING INDEX sqlite_autoindex_ranges_1 (begin<?)"
The first query plan is better because, in my case, the data table is much larger than ranges.
I read https://sqlite.org/optoverview.html. It says the join order is decided by SQLite's default choice when no analysis results are available. Does adding those keys change the default choice? Is there any other trick to make SQLite use the first query plan without providing stat data?
Also, is the default choice stable? Could it change in a later version? I use 3.22.
I also noticed that if I do not use ORDER BY and DISTINCT, it always uses the first plan:
explain query plan select t1.name as name from data t1, ranges t2 where t1.name between t2.begin and t2.end;
"0" "0" "1" "SCAN TABLE ranges AS t2"
"0" "1" "0" "SEARCH TABLE data AS t1 USING PRIMARY KEY (name>? AND name<?)"

Related

How to convert Bigquery repeated record into a column?

event_params is a repeated record. Its key values can be
firebase_event_origin, engagement_time_msec, firebase_screen, ...
Each key has a number of optional values according to the data type:
string_value, int_value, ...
I want to convert each key into a column that is populated with the corresponding value.
For example: the key firebase_screen will be converted into a column firebase_screen with a value of webview screen. Same for all the other repeated records in the table.
I'm not sure if the UNNEST is the right solution here since it breaks it down into records instead of columns.
The screenshots of schema and the table I used for this example:
You need to unnest first, then group the data again.
Replace the subquery in FROM ( ... ) with your table.
SELECT
date,
ANY_VALUE(CASE WHEN t.key="firebase_screen" THEN t.string_value ELSE NULL END) AS firebase_screen,
ANY_VALUE(CASE WHEN t.key="ga_session_number" THEN t.int_value ELSE NULL END) AS ga_session_number,
FROM (
SELECT
1 date,
[STRUCT("firebase_screen" AS key,
"webs" AS string_value,
NULL AS int_value),
STRUCT("ga_session_number" AS key,
NULL AS string_value,
6 AS int_value) ] AS event_params ) AS tbl,
UNNEST(tbl.event_params) AS t
GROUP BY 1
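For a real Firebase/GA4 export table the values usually sit one level deeper, under event_params.value; the sketch below assumes the standard export schema and a hypothetical table name, so adjust it to match your screenshots:
SELECT
event_timestamp,
ANY_VALUE(CASE WHEN p.key = "firebase_screen" THEN p.value.string_value END) AS firebase_screen,
ANY_VALUE(CASE WHEN p.key = "ga_session_number" THEN p.value.int_value END) AS ga_session_number
FROM `project.dataset.events_20210101` AS e,  -- hypothetical table name; use your export table
UNNEST(e.event_params) AS p
GROUP BY event_timestamp
Grouping by event_timestamp pivots the parameters of one event into one row; group by whatever identifies a logical row in your data.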

SQLITE - Sorting

I have this DB schema with 2 tables, one for Athletes and one for Results.
I'm trying to get the last (i.e., greatest) elapsed time of each athlete using this query:
Select Query
Select Athletes.BibNumber, Athletes.ChipNumber, Athletes.FirstName, Athletes.LastName, Athletes.Sex, Athletes.Category, count(Results.ElapsedTime) as Lapcount, Results.ElapsedTime
From Results, Athletes
Where Results.ChipNumber = Athletes.ChipNumber and Athletes.Category = 'A (Elite)' and Athletes.Sex = 'M' and Results.Active = 1
Group by Athletes.ChipNumber
Order by (Athletes.Sex = 'M') DESC, Athletes.Sex, Athletes.Category, Lapcount DESC, Results.ElapsedTime ASC;
This works OK if the times are added in increasing order, but if I add or change a time so that the row with the largest record ID no longer has the largest elapsed time, the sort order is not applied.
Running the above query the result is:
"1" "2018001" "User" "2" "M" "A (Elite)" "5" "00:00:00.000"
"2" "2018002" "User" "1" "M" "A (Elite)" "5" "01:18:09.923"
But I would like to have:
"1" "2018001" "User" "2" "M" "A (Elite)" "5" "01:11:51.384"
"2" "2018002" "User" "1" "M" "A (Elite)" "5" "01:18:09.923"
DB Schema
CREATE TABLE IF NOT EXISTS `Results` (
`ID` INTEGER PRIMARY KEY AUTOINCREMENT,
`ChipNumber` TEXT,
`ReaderTime` TEXT,
`Antenna` TEXT,
`ElapsedTime` TEXT,
`Active` INTEGER DEFAULT 0
);
INSERT INTO `Results` (ID,ChipNumber,ReaderTime,Antenna,ElapsedTime,Active) VALUES
(72354,'2018002','2018/07/29 12:01:39.000','Gun','00:00:00.000',1),
(72383,'2018001','2018/07/29 12:19:07.975','S3','00:17:28.974',1),
(72386,'2018002','2018/07/29 12:19:51.877','S3','00:18:12.876',1),
(72411,'2018001','2018/07/29 12:36:49.677','S3','00:35:10.676',1),
(72415,'2018002','2018/07/29 12:39:29.232','S3','00:37:50.231',1),
(72433,'2018001','2018/07/29 12:55:08.811','S3','00:53:29.810',1),
(72439,'2018002','2018/07/29 12:59:37.760','M3','00:57:58.759',1),
(72452,'2018001','2018/07/29 13:13:30.385','S3','01:11:51.384',1),
(72456,'2018002','2018/07/29 13:19:48.923','Manual','01:18:09.923',1),
(72465,'2018001','2018/07/29 12:01:39.000','Gun','00:00:00.000',1);
CREATE TABLE IF NOT EXISTS `Athletes` (
`ID` INTEGER PRIMARY KEY AUTOINCREMENT,
`FirstName` TEXT,
`LastName` TEXT,
`Sex` TEXT DEFAULT 'M',
`Category` TEXT DEFAULT NULL,
`BibNumber` INTEGER DEFAULT 0,
`ChipNumber` TEXT DEFAULT 0,
`Active` BOOLEAN DEFAULT 0
);
INSERT INTO `Athletes` (ID,FirstName,LastName,Sex,Category,BibNumber,ChipNumber,Active) VALUES
(3,'User','1','M','A (Elite)',2,'2018002',1),
(29,'User','2','M','A (Elite)',1,'2018001',1);
I believe that your issue is due to the following:
If the SELECT statement is an aggregate query without a GROUP BY
clause, then each aggregate expression in the result-set is evaluated
once across the entire dataset. Each non-aggregate expression in the
result-set is evaluated once for an arbitrarily selected row of the
dataset. The same arbitrarily selected row is used for each
non-aggregate expression. Or, if the dataset contains zero rows, then
each non-aggregate expression is evaluated against a row consisting
entirely of NULL values.
SQL As Understood By SQLite - SELECT - 3. Generation of the set of result rows.
As such, to ensure that you get the maximum value for the elapsed time, you should use an aggregate function: max() in your case.
Therefore, I believe the following will work for you:
SELECT Athletes.BibNumber, Athletes.ChipNumber, Athletes.FirstName, Athletes.LastName, Athletes.Sex, Athletes.Category,
count(Results.ElapsedTime) AS Lapcount,
max(Results.ElapsedTime) AS ElapsedTime
FROM Results JOIN Athletes ON Results.ChipNumber = Athletes.ChipNumber
WHERE Athletes.Category = 'A (Elite)' AND Athletes.Sex = 'M' AND Results.Active = 1
GROUP BY Athletes.ChipNumber
ORDER BY (Athletes.Sex = 'M') DESC, Athletes.Sex, Athletes.Category, Lapcount DESC, ElapsedTime ASC;
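As an aside, on SQLite 3.25 or newer the same "latest result per athlete" can be picked with window functions instead of GROUP BY; a rough sketch under the same schema (ordering simplified):
SELECT BibNumber, ChipNumber, FirstName, LastName, Sex, Category, Lapcount, ElapsedTime
FROM (
SELECT Athletes.*, Results.ElapsedTime,
COUNT(*) OVER (PARTITION BY Athletes.ChipNumber) AS Lapcount,
ROW_NUMBER() OVER (PARTITION BY Athletes.ChipNumber ORDER BY Results.ElapsedTime DESC) AS rn
FROM Results JOIN Athletes ON Results.ChipNumber = Athletes.ChipNumber
WHERE Results.Active = 1
)
WHERE rn = 1
ORDER BY Lapcount DESC, ElapsedTime ASC;
Because ElapsedTime is stored as zero-padded HH:MM:SS.mmm text, plain string comparison sorts it correctly.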

What is EXECUTE LIST SUBQUERY?

In SQLite, I have
create table tbl (id int, name text primary key);
create index tblIdIdx on tbl(id);
create table tblAttributes (id int, name text, value numeric, primary key(id, name));
When I do
explain query plan select name, value from tblAttributes where id = (select max(id) from tbl) and (name = 'n1' or name = 'n2' or name = 'n3');
I get the following results:
selectid order from detail
"0" "0" "0" "SEARCH TABLE tblAttributes USING PRIMARY KEY (id=? AND name=?)"
"0" "0" "0" "EXECUTE SCALAR SUBQUERY 1"
"1" "0" "0" "SEARCH TABLE tbl USING COVERING INDEX tblIdIdx"
"0" "0" "0" EXECUTE LIST SUBQUERY 2"
https://www.sqlite.org/eqp.html explains that "EXECUTE SCALAR" is a cached query. What does EXECUTE LIST mean? Is it also cached?
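For what it's worth, the LIST SUBQUERY line appears to come from the OR chain on name: SQLite's OR optimization rewrites name = 'n1' or name = 'n2' or name = 'n3' as name IN ('n1','n2','n3'), and the right-hand side of an IN operator is reported by EXPLAIN QUERY PLAN as a list subquery. Writing the IN form directly should produce the same line, e.g.:
explain query plan select name, value from tblAttributes
where id = (select max(id) from tbl) and name in ('n1', 'n2', 'n3');
Like the scalar subquery, a non-correlated list is evaluated once and reused while the outer query runs.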

Combine two counts in SQLite

I have the following table ("Table") format
A B C
"801331" "5755270" "0"
"1761861" "10556391" "1"
"1761861" "10557381" "33"
"1761861" "11069131" "33"
"801331" "24348751" "0"
"801331" "77219852" "0"
"1761861" "557880972" "0"
And I would like to count and present two different quantities in one table grouped by column A.
The first is:
SELECT A, COUNT(*) FROM Table GROUP BY A
The second one has one condition:
SELECT A, COUNT(*) FROM Table WHERE C != 0 GROUP BY A
I want to have the following result
A 1st 2nd
"1761861" "4" "3"
"801331" "3" "0"
I tried a few answers from questions such as this one, yet I could not make it happen, as the result is one row.
I get that it is pretty easy, yet I cannot make it work.
Is there a (simple) way to do it?
SELECT A, sum(1), sum(case when C <> 0 then 1 else 0 end) FROM Table GROUP BY A;
SELECT A, count(*), count(case when C <> 0 then A else null end) FROM Table GROUP BY A;
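On SQLite 3.30.0 or newer, the FILTER clause on aggregates expresses the same idea a bit more directly; a sketch assuming the same table and column names (the table name is quoted because TABLE is a keyword):
SELECT A,
count(*) AS total,
count(*) FILTER (WHERE C <> 0) AS nonzero
FROM "Table"
GROUP BY A;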

SQLite - Help to Make Complicated Query More Efficient

The data is financial data, structured in buckets, where one bucket (a Rollup) can contain other buckets of data. Example structure:
Rollup1 | Dept1
Rollup1 | Rollup2 | Dept2
Rollup1 | Rollup2 | Dept3
Rollup1 | Rollup3 | Dept4
Rollup1 | Rollup3 | Rollup4 | Dept5
Rollup1 | Rollup3 | Rollup4 | Dept6
There are 8 Columns of this, with Rollups and Depts scattered throughout (but the leaves are always single Depts). Approx 10k rows.
The goal of the query result is to show a single column with ALL Rollups, with variable logic that presents certain Rollups normally and modifies all other Rollups.
For example, if my variable contained "Dept4", my result would be:
Rollup1
Rollup3
NA - Rollup2
NA - Rollup4
In the real scenario, there are 3 variables which determine the display of the Rollup column.
Here is what I have, which functions as it should; however, the performance is VERY bad: one query takes up to 5 seconds, which I would like to improve.
SELECT DISTINCT CASE
WHEN "2" NOT IN
(
SELECT "2"
FROM "Finance New"
WHERE (#VAR3 = 'All' OR #VAR3 IN ("2","3","4","5","6","7","8","9"))
AND (#VAR4 = 'All' OR "10" = #VAR4)
AND (#VAR5 = 'All' OR "11" = #VAR5)
)
THEN
'Z N/A - ' || "2"
ELSE
"2"
END AS COL2
FROM "Finance New"
WHERE "5" <> 'All Applicable' AND "1" <> '9999'
AND "2" LIKE '9%'
UNION
SELECT DISTINCT CASE
WHEN "3" NOT IN
(
SELECT "3"
FROM "Finance New"
WHERE (#VAR3 = 'All' OR #VAR3 IN ("2","3","4","5","6","7","8","9"))
AND (#VAR4 = 'All' OR "10" = #VAR4)
AND (#VAR5 = 'All' OR "11" = #VAR5)
)
THEN
'Z N/A - ' || "3"
ELSE
"3"
END AS COL2
FROM "Finance New"
WHERE "5" <> 'All Applicable' AND "1" <> '9999'
AND "3" LIKE '9%'
UNION
And so on, for each of the columns in the Rollup/Dept tree report.
The inner select in each UNION branch appends to the text based on the variable criteria. Sorting is done automatically. The last line before UNION (AND "3" LIKE '9%') is what actually grabs the Rollup; Rollups all start with 9.
Input parameters are labeled like #VARx.
I'm wondering if there is a more efficient way of performing this, assuming I cannot create a temp table and cannot change the structure of the data.
Thank you!
All these ORs prevent the use of indexes.
If at all possible, remove those #VAR = 'All' comparisons (or the other one) and create the SQL string dynamically, depending on the actual VAR3/4/5 values.
The LIKE prevents the use of indexes (because LIKE (or GLOB) would require TEXT affinity on the indexed column).
Replace this with normal comparisons, i.e., replace "col" LIKE '9%' with "col" >= '9' AND "col" < ':'.
The UNION already removes duplicates; drop the DISTINCTs.
Without indexes, all these queries do full table scans.
Create the following (covering) indexes:
CREATE INDEX i_10_11_all on "Finance New"("10","11", "2","3","4","5","6","7","8","9");
CREATE INDEX i_11_10_all on "Finance New"("11","10", "2","3","4","5","6","7","8","9");
CREATE INDEX i_2_1_5 on "Finance New"("2", "1","5");
CREATE INDEX i_3_1_5 on "Finance New"("3", "1","5");
-- and so on for 4..9
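To make these suggestions concrete, here is a sketch of how the application could build the first UNION branch when, say, VAR4 = 'Dept4' and VAR3/VAR5 are 'All' (the value 'Dept4' is only an illustration): the 'All' comparisons and the DISTINCT are gone and the LIKE is replaced by a range, so the inner query can use i_10_11_all and the outer filter can use i_2_1_5.
SELECT CASE
WHEN "2" NOT IN
(
SELECT "2"
FROM "Finance New"
WHERE "10" = 'Dept4'  -- VAR4 inlined by the application; the VAR3/VAR5 branches are omitted entirely
)
THEN
'Z N/A - ' || "2"
ELSE
"2"
END AS COL2
FROM "Finance New"
WHERE "5" <> 'All Applicable' AND "1" <> '9999'
AND "2" >= '9' AND "2" < ':'  -- replaces "2" LIKE '9%'
The remaining columns ("3" through "9") follow the same pattern, joined with UNION as before.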
