Recursive SQLite CTE with JSON1 json_each - sqlite

I have a SQLite table where one column contains a JSON array containing 0 or more values. Something like this:
id|values
0 |[1,2,3]
1 |[]
2 |[2,3,4]
3 |[2]
What I want to do is "unfold" this into a list of all distinct values contained within the arrays of that column.
To start, I am using the JSON1 extension's json_each function to extract a table of values from a row:
SELECT
    value
FROM
    json_each(
        (
            SELECT
                "values"
            FROM
                my_table
            WHERE
                id == 2
        )
    )
Where I can vary the id (2, above) to select any row in the table.
Now, I am trying to wrap this in a recursive CTE so that I can apply it to each row across the entire table and union the results. As a first step I replicated (roughly) the results from above as follows:
WITH RECURSIVE result AS (
    SELECT null
    UNION ALL
    SELECT
        value
    FROM
        json_each(
            (
                SELECT
                    "values"
                FROM
                    my_table
                WHERE
                    id == 2
            )
        )
)
SELECT * FROM result;
As the next step I had originally planned to make id a variable and increment it (in a similar manner to the first example in the documentation), but I haven't been able to get that to work.
I have gone through the other examples in the documentation, but they are somewhat more complex and I haven't been able to distill those down to see how they might apply to this problem.
Can someone provide a simple example of how to solve this (or a similar problem) with a recursive CTE?
Of course, my goal is to solve the problem with or without CTEs, so I'm also happy to hear if there is a better way...

You do not need a recursive CTE for this.
To call json_each for multiple source rows, use a join:
SELECT t1.id, t2.value
FROM my_table AS t1
JOIN json_each((SELECT "values" FROM my_table WHERE id = t1.id)) AS t2;
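To get the list of distinct values the question asks for, the join can be combined with DISTINCT. The following is a minimal sketch, not a definitive solution: it assumes the SQLite build behind Python's sqlite3 module includes the JSON1 functions, it uses Python only as a harness to run the SQL, and it passes the column straight to json_each instead of using the scalar subquery shown above (an equivalent but shorter form). The sample rows are the ones from the question.

```python
import sqlite3

# Assumption: the linked SQLite library was built with the JSON1 functions.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE my_table (id INTEGER PRIMARY KEY, "values" TEXT)')
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(0, "[1,2,3]"), (1, "[]"), (2, "[2,3,4]"), (3, "[2]")],
)

# json_each is a table-valued function, so it can take a column of the
# table to its left as its argument; it is then evaluated once per row.
rows = conn.execute(
    "SELECT DISTINCT j.value "
    'FROM my_table AS t, json_each(t."values") AS j '
    "ORDER BY j.value"
).fetchall()

distinct_values = [row[0] for row in rows]
print(distinct_values)  # [1, 2, 3, 4]
```

The row with the empty array contributes nothing, because json_each returns no rows for it.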

Related

I need the equivalent of this Count with Case for Firebird 3 database

I need the equivalent of this Count with Case for a Firebird 3 database. I get an error when I try it:
SQL error code = -104.
Invalid usage of boolean expression.
I was just recently introduced to the Case command and I can't seem to rework it myself. I managed to get it to work with SQLite just fine.
The intent is to do an AND operation; the WHERE clause can't do the AND because the keywords are in separate rows.
SELECT Count((CASE WHEN keywords.keyword LIKE '%purchased%'
                   THEN 1 END) AND
             (CASE WHEN keywords.keyword LIKE '%item%'
                   THEN 1 END)) AS TRows
FROM products
LEFT OUTER JOIN keywords_products ON
    products.product_rec_id = keywords_products.product_rec_id
LEFT OUTER JOIN keywords ON
    keywords_products.keyword_rec_id = keywords.keyword_rec_id
WHERE (keywords.keyword LIKE '%purchased%' OR
       keywords.keyword LIKE '%item%')
I have three SQLite tables, a products table, a keywords_products table, and a keywords table.
CREATE TABLE products (
    product_rec_id INTEGER PRIMARY KEY NOT NULL,
    name           VARCHAR (100) NOT NULL
);
CREATE TABLE keywords_products (
    keyword_rec_id INTEGER NOT NULL,
    product_rec_id INTEGER NOT NULL
);
CREATE TABLE keywords (
    keyword_rec_id INTEGER PRIMARY KEY NOT NULL,
    keyword        VARCHAR (50) NOT NULL UNIQUE
);
The keywords_products table holds the record id of a product and the record id of a keyword. Each product can be assigned multiple keywords from the keywords table.
The keyword table looks like this:
keyword_rec_id  keyword
--------------  -----------
60              melee
43              scifi
87              water
The keywords_products table looks like this (one keyword can be assigned to many products):
keyword_rec_id  product_rec_id
--------------  --------------
43              1
60              1
43              2
87              3
The products table looks like this:
product_rec_id  name
--------------  -------------------
1               Scifi Melee Weapons
2               Scifi Ray Weapon
3               Lake House
I'm assuming you want to count how many rows there are where both conditions are true.
The error occurs because you can't use AND between integer values. The values must be true booleans.
So, change your code to
Count((CASE WHEN keywords.keyword LIKE '%purchased%'
            THEN TRUE END) AND
      (CASE WHEN keywords.keyword LIKE '%item%'
            THEN TRUE END))
However that is far too complex. You can simplify your expression to
count(nullif(
keywords.keyword LIKE '%purchased%' and keywords.keyword LIKE '%item%',
false))
The use of NULLIF is needed because COUNT will count all non-NULL values (as required by the SQL standard), and false is non-NULL as well. So to achieve the (assumed) desired effect, we transform false to NULL using NULLIF.
You have to use ONE single CASE expression with multiple WHEN branches.
Making Boolean functions of distinct CASE expressions just makes no sense: CASE is not a Boolean function itself.
You can see rules and an example at CASE.
case
when Age >= 18 then 'Yes'
when Age < 18 then 'No'
end;
Rewrite your two CASE expressions as a single CASE expression following this pattern.
However, you should only use CASE when you cannot move the filters and conditions into the standard part of the SQL select. The normal approach is to minimize the data the SQL engine has to fetch by pre-filtering. CASE filters after the fact: it makes the SQL engine fetch all the data, whether it is needed or not, and then discard the unneeded rows. That is redundant work that slows down the process.
In your case you have already moved the condition into the WHERE clause, which is good.
SELECT
...
WHERE (keywords.keyword LIKE '%purchased%')
OR (keywords.keyword LIKE '%item%')
Since you pre-filter your data stream to always contain "item" or "purchased", your CASE expression would always return 1 on every row selected under this WHERE pre-filtering. Hence, just remove the redundant CASE expression and put "1" instead.
SELECT Count(1)
FROM products
LEFT JOIN keywords_products ON products.product_rec_id = keywords_products.product_rec_id
LEFT JOIN keywords ON keywords_products.keyword_rec_id = keywords.keyword_rec_id
WHERE (keywords.keyword LIKE '%purchased%')
OR (keywords.keyword LIKE '%item%')
Now, given that the WHERE clause is logically processed after the JOINs, your query de facto turns the LEFT JOINs into INNER JOINs (your WHERE clause simply discards the rows with NULL "keyword" values), but again in an unreliable and inefficient way. Since you do not want "keyword is NULL" rows anyway, just convert your left joins to normal joins.
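If the goal is to count the products that carry both keywords in separate keyword rows, one common pattern is to group by product and test each keyword with its own conditional sum. The following is only a sketch under stated assumptions: it runs in SQLite through Python's sqlite3 module as a stand-in and has not been checked against Firebird 3, and it searches for '%scifi%' and '%melee%' (instead of '%purchased%' and '%item%') so that the sample rows from the question produce a match. The derived-table alias matching_products is a made-up name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_rec_id INTEGER PRIMARY KEY NOT NULL,
                       name VARCHAR(100) NOT NULL);
CREATE TABLE keywords_products (keyword_rec_id INTEGER NOT NULL,
                                product_rec_id INTEGER NOT NULL);
CREATE TABLE keywords (keyword_rec_id INTEGER PRIMARY KEY NOT NULL,
                       keyword VARCHAR(50) NOT NULL UNIQUE);
INSERT INTO products VALUES (1, 'Scifi Melee Weapons'),
                            (2, 'Scifi Ray Weapon'),
                            (3, 'Lake House');
INSERT INTO keywords VALUES (60, 'melee'), (43, 'scifi'), (87, 'water');
INSERT INTO keywords_products VALUES (43, 1), (60, 1), (43, 2), (87, 3);
""")

# One group per product; each SUM counts the rows matching one keyword.
# Only integer comparisons are combined with AND, so no integer is ever
# used where a Boolean is expected.
query = """
SELECT COUNT(*) FROM (
    SELECT kp.product_rec_id
    FROM keywords_products AS kp
    JOIN keywords AS k ON k.keyword_rec_id = kp.keyword_rec_id
    GROUP BY kp.product_rec_id
    HAVING SUM(CASE WHEN k.keyword LIKE '%scifi%' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN k.keyword LIKE '%melee%' THEN 1 ELSE 0 END) > 0
) AS matching_products
"""
both_count = conn.execute(query).fetchone()[0]
print(both_count)  # 1 (only product 1 has both keywords)
```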

How can I select multiple count() rows in SQLite?

I'm working on a database to keep track of packages I need for a personal project. I am also treating this as an exercise to teach myself database design and SQL. The database I am using has a schema like the following:
CREATE TABLE packages
(
    ID INTEGER PRIMARY KEY,
    Name TEXT UNIQUE ON CONFLICT REPLACE NOT NULL ON CONFLICT IGNORE
);
CREATE TABLE dependencies
(
    dependentPackage INTEGER REFERENCES packages(ID),
    requiredPackage INTEGER REFERENCES packages(ID)
);
where the package referenced by dependencies.dependentPackage depends on the package referenced by dependencies.requiredPackage.
I want a query that returns each package's name together with its number of dependencies, in a table that looks something like this:
packageName | NumDependencies
package1 | 6
package5 | 8
package9 | 1
I cannot achieve this by trying:
SELECT p.name AS packageName, count (d.requiredPackage) AS numDependencies
FROM packages p
JOIN dependencies d ON d.dependentPackage=p.ID;
because it returns only one row, containing the first package's name and the count of all the requirements.
I tried nesting a SELECT statement as a parameter to the count() function, but I still only got a single row of results. I have searched the sqlite documentation with no degree of luck.
How can I get a table like the one expected above?
When you are using GROUP BY, aggregate functions are computed over each group:
SELECT p.name AS packageName,
count (d.requiredPackage) AS numDependencies
FROM packages p
JOIN dependencies d ON d.dependentPackage=p.ID
GROUP BY p.name;
Alternatively, move the counting into a correlated subquery:
SELECT name AS packageName,
(SELECT count(*)
FROM dependencies
WHERE dependentPackage = packages.ID
) AS numDependencies
FROM packages;
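The two queries above differ in one respect: the inner join drops packages that have no dependencies, while the correlated subquery reports them with a count of 0. A minimal sketch of that difference, using Python's sqlite3 module as a harness and hypothetical sample data (the package names app, lib and util are invented, and the ON CONFLICT clauses are left out for brevity):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE packages (ID INTEGER PRIMARY KEY, Name TEXT UNIQUE NOT NULL);
CREATE TABLE dependencies (
    dependentPackage INTEGER REFERENCES packages(ID),
    requiredPackage  INTEGER REFERENCES packages(ID)
);
-- Hypothetical data: app needs lib and util, lib needs util.
INSERT INTO packages VALUES (1, 'app'), (2, 'lib'), (3, 'util');
INSERT INTO dependencies VALUES (1, 2), (1, 3), (2, 3);
""")

# Join + GROUP BY: one row per package that has at least one dependency.
grouped = conn.execute("""
    SELECT p.Name, count(d.requiredPackage)
    FROM packages p
    JOIN dependencies d ON d.dependentPackage = p.ID
    GROUP BY p.Name
    ORDER BY p.Name
""").fetchall()

# Correlated subquery: one row per package, including those with none.
correlated = conn.execute("""
    SELECT Name,
           (SELECT count(*)
            FROM dependencies
            WHERE dependentPackage = packages.ID)
    FROM packages
    ORDER BY Name
""").fetchall()

print(grouped)     # [('app', 2), ('lib', 1)]
print(correlated)  # [('app', 2), ('lib', 1), ('util', 0)]
```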

SUM totals by FOR ALL ENTRIES itab keys

I want to execute a SELECT query on a database table that has 6 key fields, let's assume they are keyA, keyB, ..., keyF.
As input parameters to my ABAP function module I do receive an internal table with exactly that structure of the key fields, each entry in that internal table therefore corresponds to one tuple in the database table.
Thus I simply need to select all tuples from the database table that correspond to the entries in my internal table.
Furthermore, I want to aggregate an amount column in that database table in exactly the same query.
In pseudo SQL the query would look as follows:
SELECT SUM(amount) FROM table WHERE (keyA, keyB, keyC, keyD, keyE, keyF) IN {internal table}.
However, this representation is not possible in ABAP OpenSQL.
Only a single column (such as keyA) can be specified, not a composite key. Furthermore, I can only use 'selection tables' (those with SIGN, OPTION, LOW, HIGH) after the keyword IN.
Using FOR ALL ENTRIES seems feasible, however in this case I cannot use SUM since aggregation is not allowed in the same query.
Any suggestions?
For selecting records for each entry of an internal table, normally the for all entries idiom in ABAP Open SQL is your friend. In your case, you have the additional requirement to aggregate a sum. Unfortunately, the result set of a SELECT statement that works with for all entries is not allowed to use aggregate functions. In my eyes, the best way in this case is to compute the sum from the result set in the ABAP layer. The following example works in my system (note in passing: using the new ABAP language features that came with 7.40, you could considerably shorten the whole code).
report zz_ztmp_test.
start-of-selection.
  perform test.
* Database table ZTMP_TEST :
* ID - key field - type CHAR10
* VALUE - no key field - type INT4
* Content: 'A' 10, 'B' 20, 'C' 30, 'D' 40, 'E' 50
types: ty_entries type standard table of ztmp_test.
* ---
form test.
  data: lv_sum    type i,
        lt_result type ty_entries,
        lt_keys   type ty_entries.
  perform fill_keys changing lt_keys.
  if lt_keys is not initial.
    select * into table lt_result
      from ztmp_test
      for all entries in lt_keys
      where id = lt_keys-id.
  endif.
  perform get_sum using lt_result
                  changing lv_sum.
  write: / lv_sum.
endform.
form fill_keys changing ct_keys type ty_entries.
  append :
    'A' to ct_keys,
    'C' to ct_keys,
    'E' to ct_keys.
endform.
form get_sum using it_entries type ty_entries
             changing value(ev_sum) type i.
  field-symbols: <ls_test> type ztmp_test.
  clear ev_sum.
  loop at it_entries assigning <ls_test>.
    add <ls_test>-value to ev_sum.
  endloop.
endform.
I would use FOR ALL ENTRIES to fetch all the related rows, then LOOP round the resulting table and add up the relevant field into a total. If you have ABAP 740 or later, you can use REDUCE operator to avoid having to loop round the table manually:
DATA(total) = REDUCE i( INIT sum = 0
FOR wa IN itab NEXT sum = sum + wa-field ).
One possible approach is to summarize on the fly inside a SELECT loop, using the SELECT...ENDSELECT statement.
Sample with calculating all order lines/quantities for the plant:
TYPES: BEGIN OF ls_collect,
         werks TYPE t001w-werks,
         menge TYPE ekpo-menge,
       END OF ls_collect.
DATA: lt_collect TYPE TABLE OF ls_collect.
SELECT werks UP TO 100 ROWS
  FROM t001w
  INTO TABLE @DATA(lt_werks).
SELECT werks, menge
  FROM ekpo
  INTO @DATA(order)
  FOR ALL ENTRIES IN @lt_werks
  WHERE werks = @lt_werks-werks.
  COLLECT order INTO lt_collect.
ENDSELECT.
The sample makes no business sense and is given here purely for educational purposes.
Another, more robust and modern approach is a CTE (common table expression), available since ABAP 7.51. This technique is intended, among other things, for total/subtotal tasks:
WITH
  +plants AS (
    SELECT werks UP TO 100 ROWS
      FROM t001w ),
  +orders_by_plant AS (
    SELECT e~werks, SUM( e~menge ) AS menge
      FROM ekpo AS e
      INNER JOIN +plants AS m
        ON e~werks = m~werks
      GROUP BY e~werks )
  SELECT werks, menge
    FROM +orders_by_plant
    INTO TABLE @DATA(lt_sums)
    ORDER BY werks.
cl_demo_output=>display( lt_sums ).
The first table expression, +plants, stands in for your internal table; the second, +orders_by_plant, holds the quantity totals for the plants selected above; and the last query is the final output query.

Teradata - duplication column error

I want to make a volatile table using Teradata.
In the select statement I am using multiple columns from different tables.
However, some of the columns in the different tables have same names.
Therefore, I am getting a 'duplication column error'.
The question is - is there any workaround to bypass this error?
Is it possible to add for example table name to column name?
This is how my code looks:
CREATE MULTISET VOLATILE TABLE test
AS (
SEL *
FROM Table_A Left JOIN Table_B
...
)
WITH DATA
ON COMMIT PRESERVE ROWS
Instead of doing a select *, select the individual column names and give the clashing ones aliases. This will bypass the error.
A select-all statement only works if you're working off one table. If you're retrieving all the data from multiple tables, you have to specify that in your select statement.
CREATE MULTISET VOLATILE TABLE test AS
(
SELECT Table_A.*
, Table_B.*
FROM Table_A
LEFT JOIN Table_B ON ...
...
)
WITH DATA PRIMARY INDEX(«PI»)
ON COMMIT PRESERVE ROWS

Query a manual list of data items

I would like to run a query involving joining a table to a manually generated list but am stuck trying to generate the manual list. There is an example of what I am attempting to do below:
SELECT
*
FROM
('29/12/2014', '30/12/2014', '30/12/2014') dates
;
Ideally I would want my output to look like:
29/12/2014
30/12/2014
31/12/2014
What's your Teradata release?
In TD14 there's STRTOK_SPLIT_TO_TABLE:
SELECT *
FROM TABLE (STRTOK_SPLIT_TO_TABLE(1 -- any dummy value
,'29/12/2014,30/12/2014,30/12/2014' -- any delimited string
,',' -- delimiter
)
RETURNS (outkey INTEGER
,tokennum INTEGER
,token VARCHAR(20) CHARACTER SET UNICODE) -- modify to match the actual size
) AS d
You can easily put this in a Derived Table and then join to it.
inkey (here the dummy value 1) is a numeric or string column, usually a key. Can be used for joining back to the original row.
outkey is the same as inkey.
tokennum is the ordinal position of the token in the input string.
token is the extracted substring.
Try this:
select '29/12/2014'
union
select '30/12/2014'
union
...
It should work in Teradata as well as in MySQL.
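Note that UNION removes duplicate rows, which matters here because the list in the question contains '30/12/2014' twice; UNION ALL keeps every item. A small sketch of the difference, run in SQLite through Python's sqlite3 module purely as a stand-in (it has not been checked against Teradata or MySQL, and the helper manual_list and the column alias d are made-up names):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
dates = ["29/12/2014", "30/12/2014", "30/12/2014"]

def manual_list(set_op):
    """Build 'SELECT ? AS d <set_op> SELECT ? ...' and return sorted rows."""
    parts = ["SELECT ? AS d"] + ["SELECT ?"] * (len(dates) - 1)
    sql = f" {set_op} ".join(parts)
    # Sort in Python because a set operation guarantees no row order.
    return sorted(row[0] for row in conn.execute(sql, dates))

deduplicated = manual_list("UNION")
kept_all = manual_list("UNION ALL")

print(deduplicated)  # ['29/12/2014', '30/12/2014']
print(kept_all)      # ['29/12/2014', '30/12/2014', '30/12/2014']
```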
