How can I select multiple count() rows in SQLite?

I'm working on a database to keep track of packages I need for a personal project. I am also treating this as an exercise to teach myself database design and SQL. The database I am using has a schema like the following:
CREATE TABLE packages
(
ID INTEGER PRIMARY KEY,
Name TEXT UNIQUE ON CONFLICT REPLACE NOT NULL ON CONFLICT IGNORE
);
CREATE TABLE dependencies
(
dependentPackage INTEGER REFERENCES packages(ID),
requiredPackage INTEGER REFERENCES packages(ID)
);
where the package referenced by dependencies.dependentPackage depends on the package referenced by dependencies.requiredPackage
I want a query that returns one row per package with a NumDependencies column, so that the result looks something like this:
packageName | NumDependencies
package1 | 6
package5 | 8
package9 | 1
I cannot achieve this by trying:
SELECT p.name AS packageName, count (d.requiredPackage) AS numDependencies
FROM packages p
JOIN dependencies d ON d.dependentPackage=p.ID;
because it returns only one row, containing the first package's name and the count of all the requirements.
I tried nesting a SELECT statement as a parameter to the count() function, but I still only got a single row of results. I have searched the SQLite documentation without any luck.
How can I get a table like the one expected above?

When you are using GROUP BY, aggregate functions are computed over each group:
SELECT p.name AS packageName,
count (d.requiredPackage) AS numDependencies
FROM packages p
JOIN dependencies d ON d.dependentPackage=p.ID
GROUP BY p.name;
Alternatively, move the counting into a correlated subquery:
SELECT name AS packageName,
(SELECT count(*)
FROM dependencies
WHERE dependentPackage = packages.ID
) AS numDependencies
FROM packages;
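The two queries above are not quite interchangeable, which a small run makes visible. The following is a minimal sketch using Python's sqlite3 module with made-up sample rows (the three packages and their dependencies are assumptions, not data from the question): the inner join with GROUP BY omits packages that have no dependencies, while the correlated subquery reports them with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE packages (ID INTEGER PRIMARY KEY, Name TEXT UNIQUE NOT NULL);
    CREATE TABLE dependencies (
        dependentPackage INTEGER REFERENCES packages(ID),
        requiredPackage  INTEGER REFERENCES packages(ID)
    );
    -- Hypothetical sample data: package1 needs two packages,
    -- package2 needs one, package3 needs none.
    INSERT INTO packages (ID, Name)
    VALUES (1, 'package1'), (2, 'package2'), (3, 'package3');
    INSERT INTO dependencies VALUES (1, 2), (1, 3), (2, 3);
""")

# Variant 1: join plus GROUP BY, one output row per group.
grouped = conn.execute("""
    SELECT p.Name AS packageName,
           count(d.requiredPackage) AS numDependencies
    FROM packages p
    JOIN dependencies d ON d.dependentPackage = p.ID
    GROUP BY p.Name
    ORDER BY p.Name
""").fetchall()

# Variant 2: correlated subquery, one output row per package.
correlated = conn.execute("""
    SELECT Name AS packageName,
           (SELECT count(*)
            FROM dependencies
            WHERE dependentPackage = packages.ID) AS numDependencies
    FROM packages
    ORDER BY Name
""").fetchall()

print(grouped)
print(correlated)
```

If packages without dependencies should also appear in the GROUP BY variant, replacing JOIN with LEFT JOIN should have that effect, since count(d.requiredPackage) ignores the NULLs produced for unmatched packages.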

Related

SQLite: Create File Structure Table from String Path. New query or modify existing?

I have a SQLite table:
FileDataID | Path
0 /FileAtRoot.txt
1 /video/gopro/father/mov001.mp4
2 /pictures/family/father/Oldman.jpg
3 /documents/legal/father/estate/will.doc
Using an elegant solution designed by forpas, a new table consisting of only the directory structure is created:
Directory | Directory_Parent | Value
0 null root
1 0 documents
2 1 legal
3 2 father
...
Reference: SQLite: Create Directory Structure Table from A List Of Paths
Now that a table of the directory structure exists, I need to link the original files to their parent by using a foreign key Directory_Parent in a new table:
FileDataID | Directory_Parent | Value
0 0 FileAtRoot.txt
1 19 mov001.mp4
2 9 Oldman.jpg
3 4 will.doc
How can I create this table from the original data using SQLite?
Can forpas's solution be modified so it creates both tables at once?
Or should this 2nd "file_struct" table be created in a 2nd SQLite query?
A 1 megabyte example database can be found here:
A bounty will be awarded for helping with this final question, thank you.
For this to work, you must already have the table dir_struct (from your previous question), so that the file names can be inserted into the new table together with the id of the directory they belong to.
First I create the new table files:
CREATE TABLE files(
FileDataID INTEGER REFERENCES listfile(FileDataID),
Directory_Parent INTEGER REFERENCES dir_struct(Directory),
Value
);
You must also create a unique index on FileDataID in listfile, because the column is not defined as PRIMARY KEY or UNIQUE; without the index, columns in other tables (like the column FileDataID of files) cannot reference it.
CREATE UNIQUE INDEX idx_listfile_FileDataID ON listfile(FileDataID);
A recursive CTE is used to query dir_struct and build all the possible paths and it is joined to listfile to match the file names and their paths:
WITH cte AS (
SELECT Directory, Directory_Parent, Value, '' full_path
FROM dir_struct
WHERE Directory = 0
UNION ALL
SELECT d.Directory, d.Directory_Parent, d.Value, full_path || d.Value || '/'
FROM dir_struct d INNER JOIN cte c
ON c.Directory = d.Directory_Parent
)
INSERT INTO files(FileDataID, Directory_Parent, Value)
SELECT f.FileDataID, c.Directory, SUBSTR(f.Path, LENGTH(c.full_path) + 1)
FROM listfile f INNER JOIN cte c
ON f.Path LIKE c.full_path || '%' AND INSTR(SUBSTR(f.Path, LENGTH(c.full_path) + 1), '/') = 0
See the demo, where the code for the insertions in dir_struct has also been modified because now the table listfile contains files at the root, which did not exist in the sample data of your previous question.
So the code in the demo must be executed as a whole.
I used your 1MB sample data and the queries ran very fast.
But for 1M rows (from the link you first posted), which I also tested (and found duplicates, which you must delete before doing anything else), the creation of the table files took about 1.5 hours.
As I mentioned in my answer to your previous question, if this is a one-time thing then use it. If you will need it frequently, then you should consider something else.
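To see the matching logic of the recursive CTE on a small scale, here is a sketch in Python's sqlite3 module. It is not the answer's demo: the tables are cut down to two files and five directories, the foreign keys are left out, and, because the sample paths in the question start with a slash, the anchor row's full_path is set to '/' here instead of ''. All of these are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE listfile (FileDataID INTEGER, Path TEXT);
    CREATE TABLE dir_struct (
        Directory INTEGER PRIMARY KEY,
        Directory_Parent INTEGER,
        Value TEXT
    );
    CREATE TABLE files (FileDataID INTEGER, Directory_Parent INTEGER, Value TEXT);

    -- Reduced sample data (assumption: paths carry a leading slash).
    INSERT INTO listfile VALUES
        (0, '/FileAtRoot.txt'),
        (3, '/documents/legal/father/estate/will.doc');
    INSERT INTO dir_struct VALUES
        (0, NULL, 'root'),
        (1, 0, 'documents'),
        (2, 1, 'legal'),
        (3, 2, 'father'),
        (4, 3, 'estate');
""")

conn.execute("""
    WITH cte AS (
        -- Anchor: the root directory; '/' is an adjustment for this sketch.
        SELECT Directory, Directory_Parent, Value, '/' AS full_path
        FROM dir_struct
        WHERE Directory = 0
        UNION ALL
        -- Recursive step: append each child's name to its parent's path.
        SELECT d.Directory, d.Directory_Parent, d.Value,
               c.full_path || d.Value || '/'
        FROM dir_struct d INNER JOIN cte c
          ON c.Directory = d.Directory_Parent
    )
    INSERT INTO files (FileDataID, Directory_Parent, Value)
    SELECT f.FileDataID, c.Directory,
           SUBSTR(f.Path, LENGTH(c.full_path) + 1)
    FROM listfile f INNER JOIN cte c
      -- The path must start with the directory's full path, and what remains
      -- must contain no further '/', i.e. it is the bare file name.
      ON f.Path LIKE c.full_path || '%'
     AND INSTR(SUBSTR(f.Path, LENGTH(c.full_path) + 1), '/') = 0
""")

linked = conn.execute(
    "SELECT FileDataID, Directory_Parent, Value FROM files ORDER BY FileDataID"
).fetchall()
print(linked)
```

Note that LIKE treats % and _ in directory names as wildcards, so paths containing those characters would need extra care in a real run.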

sqlite - copy subset of tables and columns into new db-file

I have a database A.db, which contains tables t1, t2 and t3.
Now I want to create a new database B.db, which contains t1 and some chosen columns col1 and col4 from t2.
With .import I get hundreds of errors and it seems to work only for full tables.
.output sounds like I just save the output as it would be printed.
Basically, I need an insert into foo select ... across different files. How can I do this?
First you must attach A.db to your current database and give it an alias like adb.
Then write the insert statement just like you would if all the tables existed in the same database, qualifying the column names with the database alias.
It is good practice to list, in parentheses after insert into foo, all the columns of foo for which you will set values from the other two tables, and to make sure that they appear in the same order as the columns in the select list:
attach database 'pathtoAdatabase/A.db' as adb;
insert into foo (column1, column2, .......)
select adb.t1.column1, adb.t1.column2, ...., adb.t2.col1, adb.t2.col4
from adb.t1 inner join adb.t2
on <join condition>
Replace <join condition> with the conditions on which you will join the 2 tables to produce the rows that you will insert into foo, something like:
adb.t1.id = adb.t2.id
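The attach-then-insert approach can be tried end to end with Python's sqlite3 module. This is a minimal sketch under stated assumptions: the column layouts of t1 and t2, the sample rows, and the temporary file locations are all invented for illustration, and the target tables are created by hand in B.db before the rows are copied (one table per source table, rather than the joined foo of the answer).

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
a_path = os.path.join(workdir, "A.db")
b_path = os.path.join(workdir, "B.db")

# Build a small stand-in for A.db (hypothetical columns and rows).
src = sqlite3.connect(a_path)
src.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE t2 (id INTEGER, col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT);
    INSERT INTO t1 VALUES (1, 'one'), (2, 'two');
    INSERT INTO t2 VALUES (1, 'a', 'b', 'c', 'd'), (2, 'e', 'f', 'g', 'h');
""")
src.commit()
src.close()

dst = sqlite3.connect(b_path)
# ATTACH cannot run inside a transaction, so do it before any INSERT.
dst.execute("ATTACH DATABASE ? AS adb", (a_path,))

# Unqualified names refer to the main database, i.e. B.db.
dst.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, label TEXT)")
dst.execute("CREATE TABLE t2 (col1 TEXT, col4 TEXT)")

# Copy all of t1, and only the chosen columns of t2.
dst.execute("INSERT INTO t1 (id, label) SELECT id, label FROM adb.t1")
dst.execute("INSERT INTO t2 (col1, col4) SELECT col1, col4 FROM adb.t2")
dst.commit()
dst.execute("DETACH DATABASE adb")

copied_t1 = dst.execute("SELECT id, label FROM t1 ORDER BY id").fetchall()
copied_t2 = dst.execute("SELECT col1, col4 FROM t2 ORDER BY col1").fetchall()
dst.close()
print(copied_t1, copied_t2)
```

The same statements can be typed into the sqlite3 command-line shell opened on B.db; only the path in the ATTACH statement needs to be written out as a string literal there.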

I need the equivalent of this Count with Case for Firebird 3 database

I need the equivalent of this Count with Case for a Firebird 3 database. I get an error when I try it:
SQL error code = -104.
Invalid usage of boolean expression.
I was just recently introduced to the Case command and I can't seem to rework it myself. I managed to get it to work with SQLite just fine.
The intent is to do an AND operation; the WHERE clause can't do the AND because the keywords are in separate rows.
SELECT Count((CASE WHEN keywords.keyword LIKE '%purchased%'
THEN 1 END) AND
(CASE WHEN keywords.keyword LIKE '%item%'
THEN 1 END)) AS TRows
FROM products
LEFT OUTER JOIN keywords_products ON
products.product_rec_id = keywords_products.product_rec_id
LEFT OUTER JOIN keywords ON
keywords_products.keyword_rec_id = keywords.keyword_rec_id
WHERE (keywords.keyword LIKE '%purchased%' OR
keywords.keyword LIKE '%item%')
I have three SQLite tables, a products table, a keywords_products table, and a keywords table.
CREATE TABLE products (
product_rec_id INTEGER PRIMARY KEY NOT NULL,
name VARCHAR (100) NOT NULL
);
CREATE TABLE keywords_products (
keyword_rec_id INTEGER NOT NULL,
product_rec_id INTEGER NOT NULL
);
CREATE TABLE keywords (
keyword_rec_id INTEGER PRIMARY KEY NOT NULL,
keyword VARCHAR (50) NOT NULL UNIQUE
);
The keywords_products table holds the record id of a product and the record id of a keyword. Each product can be assigned multiple keywords from the keywords table.
The keyword table looks like this:
keyword_rec_id keyword
-------------- -----------
60 melee
43 scifi
87 water
The keywords_products table looks like this (one keyword can be assigned to many products):
keyword_rec_id product_rec_id
-------------- --------------
43 1
60 1
43 2
87 3
The products table looks like this:
product_rec_id name
-------------- --------------
1 Scifi Melee Weapons
2 Scifi Ray Weapon
3 Lake House
I'm assuming you want to count how many rows there are where both conditions are true.
The error occurs because you can't use AND between integer values. The values must be true booleans.
So, change your code to
Count((CASE WHEN keywords.keyword LIKE '%purchased%'
THEN TRUE END) AND
(CASE WHEN keywords.keyword LIKE '%item%'
THEN TRUE END))
However that is far too complex. You can simplify your expression to
count(nullif(
keywords.keyword LIKE '%purchased%' and keywords.keyword LIKE '%item%',
false))
The use of NULLIF is needed because COUNT will count all non-NULL values (as required by the SQL standard), and false is non-NULL as well. So to achieve the (assumed) desired effect, we transform false to NULL using NULLIF.
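The counting rule behind NULLIF can be checked without a Firebird server. The sketch below uses SQLite through Python as a stand-in, which is an assumption worth stating loudly: SQLite has no separate BOOLEAN type, so false shows up as the integer 0 and NULLIF compares against 0 instead of false. The four keyword rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE keywords (keyword TEXT);
    -- Hypothetical rows: only the first one matches both patterns.
    INSERT INTO keywords VALUES
        ('purchased item'), ('purchased'), ('item'), ('melee');
""")

# count(expr) counts every non-NULL value, including "false" (0 in SQLite).
# Wrapping the expression in nullif(..., 0) turns false into NULL first,
# so only the rows where the condition holds are counted.
plain, with_nullif = conn.execute("""
    SELECT count(keyword LIKE '%purchased%' AND keyword LIKE '%item%'),
           count(nullif(keyword LIKE '%purchased%' AND keyword LIKE '%item%', 0))
    FROM keywords
""").fetchone()

print(plain, with_nullif)
```

Here the plain count covers all four rows, while the NULLIF version counts only the single row that satisfies both LIKE conditions.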
You have to use ONE single CASE expression with multiple WHEN branches.
Combining separate CASE expressions with Boolean operators makes no sense here: CASE is not itself a Boolean function.
You can see rules and an example at CASE.
case
when Age >= 18 then 'Yes'
when Age < 18 then 'No'
end;
Rewrite your two CASE expressions as a single CASE expression following this pattern.
However, you should only use CASE when you cannot move the filters and conditions into the standard part of the SQL SELECT. The normal approach is to minimize the data that the SQL engine has to fetch by pre-filtering. CASE is post-filtering: it makes the SQL engine fetch all the data, whether it is needed or not, and then discard what was not needed. That is redundant work that slows the query down.
In your case you have already moved the condition into the WHERE clause, which is good.
SELECT
...
WHERE (keywords.keyword LIKE '%purchased%')
OR (keywords.keyword LIKE '%item%')
Since you pre-filter your data stream so that it always contains "item" or "purchased", your CASE expression would always return 1 for every row selected under this WHERE pre-filtering. Hence, just remove the redundant CASE expression and put "1" instead.
SELECT Count(1)
FROM products
LEFT JOIN keywords_products ON products.product_rec_id = keywords_products.product_rec_id
LEFT JOIN keywords ON keywords_products.keyword_rec_id = keywords.keyword_rec_id
WHERE (keywords.keyword LIKE '%purchased%')
OR (keywords.keyword LIKE '%item%')
Now, given that the WHERE clause is logically processed after the JOINs, your query de facto turns the LEFT JOINs into INNER JOINs (the WHERE clause simply discards the rows with NULL "keyword" values), but in an unreliable and inefficient way. Since you do not want "keyword is NULL" rows anyway, just convert your left joins to normal joins.
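The effect of the WHERE clause on the outer joins can be observed directly. This sketch runs in SQLite via Python rather than Firebird (an assumption; the join semantics shown are standard SQL). It uses the tables and rows from the question, plus one hypothetical product without keywords, and filters on '%scifi%' and '%water%' because the sample data contains no 'purchased' or 'item' keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        product_rec_id INTEGER PRIMARY KEY NOT NULL,
        name VARCHAR(100) NOT NULL
    );
    CREATE TABLE keywords_products (
        keyword_rec_id INTEGER NOT NULL,
        product_rec_id INTEGER NOT NULL
    );
    CREATE TABLE keywords (
        keyword_rec_id INTEGER PRIMARY KEY NOT NULL,
        keyword VARCHAR(50) NOT NULL UNIQUE
    );
    INSERT INTO products VALUES
        (1, 'Scifi Melee Weapons'), (2, 'Scifi Ray Weapon'), (3, 'Lake House');
    -- Hypothetical extra product that has no keywords at all.
    INSERT INTO products VALUES (4, 'Untagged Product');
    INSERT INTO keywords VALUES (60, 'melee'), (43, 'scifi'), (87, 'water');
    INSERT INTO keywords_products VALUES (43, 1), (60, 1), (43, 2), (87, 3);
""")

QUERY = """
    SELECT count(1)
    FROM products
    {join} keywords_products
        ON products.product_rec_id = keywords_products.product_rec_id
    {join} keywords
        ON keywords_products.keyword_rec_id = keywords.keyword_rec_id
    {where}
"""
FILTER = "WHERE keywords.keyword LIKE '%scifi%' OR keywords.keyword LIKE '%water%'"


def run(join, where=""):
    """Run the count query with the given join type and optional filter."""
    return conn.execute(QUERY.format(join=join, where=where)).fetchone()[0]


left_filtered = run("LEFT JOIN", FILTER)   # outer joins, then WHERE
inner_filtered = run("JOIN", FILTER)       # inner joins, same WHERE
left_unfiltered = run("LEFT JOIN")         # outer joins keep the untagged product

print(left_filtered, inner_filtered, left_unfiltered)
```

With the filter in place, both join types return the same count, because the NULL row that the outer join produces for the untagged product can never satisfy a LIKE condition.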

Recursive SQLite CTE with JSON1 json_each

I have a SQLite table where one column contains a JSON array containing 0 or more values. Something like this:
id|values
0 |[1,2,3]
1 |[]
2 |[2,3,4]
3 |[2]
What I want to do is "unfold" this into a list of all distinct values contained within the arrays of that column.
To start, I am using the JSON1 extension's json_each function to extract a table of values from a row:
SELECT
value
FROM
json_each(
(
SELECT
"values"
FROM
my_table
WHERE
id == 2
)
)
Where I can vary the id (2, above) to select any row in the table.
Now, I am trying to wrap this in a recursive CTE so that I can apply it to each row across the entire table and union the results. As a first step I replicated (roughly) the results from above as follows:
WITH RECURSIVE result AS (
SELECT null
UNION ALL
SELECT
value
FROM
json_each(
(
SELECT
"values"
FROM
my_table
WHERE
id == 2
)
)
)
SELECT * FROM result;
As the next step I had originally planned to make id a variable and increment it (in a similar manner to the first example in the documentation), but I haven't been able to get that to work.
I have gone through the other examples in the documentation, but they are somewhat more complex and I haven't been able to distill those down to see how they might apply to this problem.
Can someone provide a simple example of how to solve this (or a similar problem) with a recursive CTE?
Of course, my goal is to solve the problem with or without CTEs, so I'm also happy to hear if there is a better way...
You do not need a recursive CTE for this.
To call json_each for multiple source rows, use a join:
SELECT t1.id, t2.value
FROM my_table AS t1
JOIN json_each((SELECT "values" FROM my_table WHERE id = t1.id)) AS t2;
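Since the question asks for the distinct values across all rows, the join can be combined with DISTINCT. The sketch below uses Python's sqlite3 module with the sample rows from the question and assumes the underlying SQLite library includes the JSON functions. It passes the column straight to json_each instead of going through the subquery used above, which is a simplification made here, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "values" is an SQL keyword, so the column name has to be quoted.
conn.execute('CREATE TABLE my_table (id INTEGER PRIMARY KEY, "values" TEXT)')
conn.executemany(
    'INSERT INTO my_table (id, "values") VALUES (?, ?)',
    [(0, "[1,2,3]"), (1, "[]"), (2, "[2,3,4]"), (3, "[2]")],
)

# json_each is evaluated once per row of my_table; an empty array
# simply contributes no rows to the join.
distinct_values = [
    row[0]
    for row in conn.execute("""
        SELECT DISTINCT t2.value
        FROM my_table AS t1
        JOIN json_each(t1."values") AS t2
        ORDER BY t2.value
    """)
]
print(distinct_values)
```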

SUM totals by FOR ALL ENTRIES itab keys

I want to execute a SELECT query on a database table that has 6 key fields, let's assume they are keyA, keyB, ..., keyF.
As input parameters to my ABAP function module I do receive an internal table with exactly that structure of the key fields, each entry in that internal table therefore corresponds to one tuple in the database table.
Thus I simply need to select all tuples from the database table that correspond to the entries in my internal table.
Furthermore, I want to aggregate an amount column in that database table in exactly the same query.
In pseudo SQL the query would look as follows:
SELECT SUM(amount) FROM table WHERE (keyA, keyB, keyC, keyD, keyE, keyF) IN {internal table}.
However, this representation is not possible in ABAP OpenSQL.
Only a single column (such as keyA) can be specified there, not a composite key. Furthermore, I can only use 'selection tables' (those with SIGN, OPTION, LOW, HIGH) after the keyword IN.
Using FOR ALL ENTRIES seems feasible, however in this case I cannot use SUM since aggregation is not allowed in the same query.
Any suggestions?
For selecting records for each entry of an internal table, normally the for all entries idiom in ABAP Open SQL is your friend. In your case, you have the additional requirement to aggregate a sum. Unfortunately, the result set of a SELECT statement that works with for all entries is not allowed to use aggregate functions. In my eyes, the best way in this case is to compute the sum from the result set in the ABAP layer. The following example works in my system (note in passing: using the new ABAP language features that came with 7.40, you could considerably shorten the whole code).
report zz_ztmp_test.
start-of-selection.
perform test.
* Database table ZTMP_TEST :
* ID - key field - type CHAR10
* VALUE - no key field - type INT4
* Content: 'A' 10, 'B' 20, 'C' 30, 'D' 40, 'E' 50
types: ty_entries type standard table of ztmp_test.
* ---
form test.
data: lv_sum type i,
lt_result type ty_entries,
lt_keys type ty_entries.
perform fill_keys changing lt_keys.
if lt_keys is not initial.
select * into table lt_result
from ztmp_test
for all entries in lt_keys
where id = lt_keys-id.
endif.
perform get_sum using lt_result
changing lv_sum.
write: / lv_sum.
endform.
form fill_keys changing ct_keys type ty_entries.
append :
'A' to ct_keys,
'C' to ct_keys,
'E' to ct_keys.
endform.
form get_sum using it_entries type ty_entries
changing value(ev_sum) type i.
field-symbols: <ls_test> type ztmp_test.
clear ev_sum.
loop at it_entries assigning <ls_test>.
add <ls_test>-value to ev_sum.
endloop.
endform.
I would use FOR ALL ENTRIES to fetch all the related rows, then LOOP round the resulting table and add up the relevant field into a total. If you have ABAP 740 or later, you can use REDUCE operator to avoid having to loop round the table manually:
DATA(total) = REDUCE i( INIT sum = 0
FOR wa IN itab NEXT sum = sum + wa-field ).
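The pattern shared by both answers so far (fetch the rows for the given keys, then add them up in the application layer) is sketched below in Python with SQLite as a stand-in database. This is an analogy, not ABAP: the table name and contents are taken from the comment in the report above, while the IN list with placeholders only imitates what FOR ALL ENTRIES does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Same content as the ZTMP_TEST table described in the report above.
    CREATE TABLE ztmp_test (id TEXT PRIMARY KEY, value INTEGER);
    INSERT INTO ztmp_test VALUES
        ('A', 10), ('B', 20), ('C', 30), ('D', 40), ('E', 50);
""")

keys = ["A", "C", "E"]

rows = []
if keys:  # mirrors the "is not initial" guard before FOR ALL ENTRIES
    placeholders = ", ".join("?" for _ in keys)
    rows = conn.execute(
        f"SELECT id, value FROM ztmp_test WHERE id IN ({placeholders})",
        keys,
    ).fetchall()

# Aggregate on the client side, like the LOOP or REDUCE over the result table.
total = sum(value for _id, value in rows)
print(total)
```

One caveat when doing this in ABAP: FOR ALL ENTRIES removes duplicate rows from the result set, so the key fields should be selected along with the amount; otherwise rows with equal amounts could collapse and distort the sum.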
One possible approach is to summarize on the fly inside a SELECT loop, using the SELECT...ENDSELECT statement.
Sample with calculating all order lines/quantities for the plant:
TYPES: BEGIN OF ls_collect,
werks TYPE t001w-werks,
menge TYPE ekpo-menge,
END OF ls_collect.
DATA: lt_collect TYPE TABLE OF ls_collect.
SELECT werks UP TO 100 ROWS
FROM t001w
INTO TABLE @DATA(lt_werks).
SELECT werks, menge
FROM ekpo
INTO @DATA(order)
FOR ALL ENTRIES IN @lt_werks
WHERE werks = @lt_werks-werks.
COLLECT order INTO lt_collect.
ENDSELECT.
The sample makes no business sense and is here purely for educational purposes.
Another, more robust and modern approach is CTEs (Common Table Expressions), available since ABAP 7.51. Among other things, this technique is intended specifically for total/subtotal tasks:
WITH
+plants AS (
SELECT werks UP TO 100 ROWS
FROM t001w ),
+orders_by_plant AS (
SELECT e~werks, SUM( e~menge ) AS menge
FROM ekpo AS e
INNER JOIN +plants AS m
ON e~werks = m~werks
GROUP BY e~werks )
SELECT werks, menge
FROM +orders_by_plant
INTO TABLE @DATA(lt_sums)
ORDER BY werks.
cl_demo_output=>display( lt_sums ).
The first table expression, +plants, stands in for your internal table; the second, +orders_by_plant, computes the quantity totals for the plants selected above; and the last query is the final output query.

Resources