MonetDB recursive CTE (common table expressions)

It seems MonetDB does not support recursive CTEs. This is a useful feature that I have used to extract a BOM (bill of materials) from ERP systems. For greater flexibility, I used Firebird recursive stored procedures to enhance the output with extra calculations. A good example of a SQL Server recursive CTE can be found here: https://www.essentialsql.com/recursive-ctes-explained/
The question is: is there any way I can achieve similar results in MonetDB?

There is currently no support for recursive CTEs in MonetDB[Lite]. The solution you have proposed yourself seems like the way to go.

It is clear that once I have access to procedures, variables, and WHILE loops, something can be done. The following code gives me the desired result using temporary tables. I would appreciate it if anybody could suggest an alternative that produces the same results without the overhead of temporary tables.
CREATE TEMPORARY TABLE BOM (parent_id string, comp_id string, qty double) ON COMMIT PRESERVE ROWS;
INSERT INTO BOM VALUES('a','b',5), ('a','c',2), ('b','d',4), ('b','c',7), ('c','e',3);
select * from BOM;
+-----------+---------+--------------------------+
| parent_id | comp_id | qty |
+===========+=========+==========================+
| a | b | 5 |
| a | c | 2 |
| b | d | 4 |
| b | c | 7 |
| c | e | 3 |
+-----------+---------+--------------------------+
CREATE TEMPORARY TABLE EXPLODED_BOM (parent_id string, comp_id string, path string, qty double, level integer) ON COMMIT PRESERVE ROWS;
CREATE OR REPLACE PROCEDURE UPDATE_BOM()
BEGIN
    DECLARE prev_count int;
    DECLARE crt_count int;
    DECLARE crt_level int;
    delete from EXPLODED_BOM; -- make sure the table is empty
    insert into EXPLODED_BOM select parent_id, comp_id, parent_id||'-'||comp_id, qty, 0 from BOM; -- insert the first level
    SET prev_count = 0;
    SET crt_count = (select count(*) from EXPLODED_BOM);
    SET crt_level = 0;
    -- (crt_level < 100) avoids a possible infinite loop if the BOM is malformed
    WHILE (crt_level < 100) and (crt_count > prev_count) DO
        SET prev_count = crt_count;
        insert into EXPLODED_BOM select e.parent_id, a.comp_id, e.path||'-'||a.comp_id, a.qty*e.qty, crt_level+1
        from BOM a, EXPLODED_BOM e
        where a.parent_id = e.comp_id and e.level=crt_level;
        -- Is there any way to get the number of rows affected by an insert, update or delete statement? That way I could avoid checking the new count.
        SET crt_count = (select count(*) from EXPLODED_BOM);
        SET crt_level = crt_level + 1;
    END WHILE;
END;
call UPDATE_BOM();
select * from EXPLODED_BOM;
+-----------+---------+---------+--------------------------+-------+
| parent_id | comp_id | path | qty | level |
+===========+=========+=========+==========================+=======+
| a | b | a-b | 5 | 0 |
| a | c | a-c | 2 | 0 |
| b | d | b-d | 4 | 0 |
| b | c | b-c | 7 | 0 |
| c | e | c-e | 3 | 0 |
| a | d | a-b-d | 20 | 1 |
| a | c | a-b-c | 35 | 1 |
| a | e | a-c-e | 6 | 1 |
| b | e | b-c-e | 21 | 1 |
| a | e | a-b-c-e | 105 | 2 |
+-----------+---------+---------+--------------------------+-------+
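For comparison, the same level-by-level expansion can be sketched outside MonetDB. The snippet below is only an illustration: it assumes Python's standard sqlite3 module as a stand-in engine (it is not MonetDB), and the function name explode_bom is made up for this example. It uses the cursor's rowcount as the "affected rows" value, so the loop stops as soon as a level adds no rows instead of re-counting the table.

```python
import sqlite3

# Assumption: SQLite (via Python's sqlite3) stands in for MonetDB here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bom (parent_id TEXT, comp_id TEXT, qty REAL)")
conn.execute(
    "CREATE TABLE exploded_bom "
    "(parent_id TEXT, comp_id TEXT, path TEXT, qty REAL, level INTEGER)"
)
conn.executemany(
    "INSERT INTO bom VALUES (?, ?, ?)",
    [("a", "b", 5), ("a", "c", 2), ("b", "d", 4), ("b", "c", 7), ("c", "e", 3)],
)


def explode_bom(conn, max_levels=100):
    """Hypothetical helper: expand the BOM one level per iteration."""
    cur = conn.cursor()
    cur.execute("DELETE FROM exploded_bom")
    # Level 0: the direct parent/component pairs.
    cur.execute(
        "INSERT INTO exploded_bom "
        "SELECT parent_id, comp_id, parent_id || '-' || comp_id, qty, 0 FROM bom"
    )
    inserted = cur.rowcount  # rows added by the last INSERT
    level = 0
    # max_levels guards against an endless loop on a cyclic (malformed) BOM.
    while inserted > 0 and level < max_levels:
        cur.execute(
            "INSERT INTO exploded_bom "
            "SELECT e.parent_id, a.comp_id, e.path || '-' || a.comp_id, "
            "       a.qty * e.qty, ? "
            "FROM bom a JOIN exploded_bom e ON a.parent_id = e.comp_id "
            "WHERE e.level = ?",
            (level + 1, level),
        )
        inserted = cur.rowcount
        level += 1
    return conn.execute(
        "SELECT parent_id, comp_id, path, qty, level "
        "FROM exploded_bom ORDER BY level, path"
    ).fetchall()


rows = explode_bom(conn)
for row in rows:
    print(row)
```

On the sample data this yields the same ten rows as the MonetDB procedure, including the a-b-c-e path with quantity 105 at level 2.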

Related

Updating multiple rows in SQLite with relevant data from the same table

I have a database whose source I don't control directly, and it produces errant '0' entries that mess up the generated graphs with drops to zero. I am able to manipulate the data after the fact and update the database.
It is acceptable to use the last known good value instead, so I am trying to write a general query that replaces each zero with the last known value.
Luckily, every entry includes the ID of the previous entry, so it is simply a matter of looking back and grabbing its value.
I have got very close to a final answer, but instead of updating with the last good value, it uses the first value over and over again.
dummy data
CREATE TABLE tbl(id INT,r INT,oid INT);
INSERT INTO tbl VALUES(1,10,0);
INSERT INTO tbl VALUES(2,20,1);
INSERT INTO tbl VALUES(3,0,2);
INSERT INTO tbl VALUES(4,40,3);
INSERT INTO tbl VALUES(5,50,4);
INSERT INTO tbl VALUES(6,0,5);
INSERT INTO tbl VALUES(7,70,6);
INSERT INTO tbl VALUES(8,80,7);
SELECT * FROM tbl;
OUTPUT:
| id| r |oid|
|---|----|---|
| 1 | 10 | 0 |
| 2 | 20 | 1 |
| 3 | 0 | 2 | ** NEEDS FIXING
| 4 | 40 | 3 |
| 5 | 50 | 4 |
| 6 | 0 | 5 | ** NEEDS UPDATE
| 7 | 70 | 6 |
| 8 | 80 | 7 |
I have worked several queries to get results around what I am after:
All zero entries:
SELECT * FROM tbl WHERE r = 0;
OUTPUT:
| id | r | oid |
|----|----|-----|
| 3 | 0 | 2 |
| 6 | 0 | 5 |
Output only those rows, together with the preceding good row:
SELECT * FROM tbl WHERE id IN (
SELECT id FROM tbl WHERE r = 0
UNION
SELECT oid FROM tbl WHERE r = 0
)
OUTPUT:
| id| r |oid|
|---|----|---|
| 2 | 20 | 1 |
| 3 | 0 | 2 |
| 5 | 50 | 4 |
| 6 | 0 | 5 |
Almost works
This is as close as I have got. It does change all the zeros, but it changes them all to the value of the first lookup:
UPDATE tbl
SET r = (SELECT r
FROM tbl
WHERE id in (SELECT oid
FROM tbl
WHERE r = 0)
) WHERE r = 0 ;
OUTPUT:
| id| r |oid|
|---|----|---|
| 1 | 10 | 0 |
| 2 | 20 | 1 |
| 3 | 20 | 2 | ** GOOD
| 4 | 40 | 3 |
| 5 | 50 | 4 |
| 6 | 20 | 5 | ** BAD, should be 50
| 7 | 70 | 6 |
| 8 | 80 | 7 |
If it helps, I created this fiddle here that I've been playing with:
http://sqlfiddle.com/#!5/8afff/1

For this sample data all you have to do is use the correct correlated subquery that returns the value of r from the row with id equal to the current oid in the WHERE clause:
UPDATE tbl AS t
SET r = (SELECT tt.r FROM tbl tt WHERE tt.id = t.oid)
WHERE t.r = 0;
See the demo.
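The correlated update can be tried locally with a small script. This is a minimal sketch, assuming Python's standard sqlite3 module and an in-memory database; it aliases only the inner table (the alias prev is chosen for this example) so the statement does not depend on alias support in UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl(id INT, r INT, oid INT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, 10, 0), (2, 20, 1), (3, 0, 2), (4, 40, 3),
     (5, 50, 4), (6, 0, 5), (7, 70, 6), (8, 80, 7)],
)

# For every zero row, look up r in the row whose id equals this row's oid.
conn.execute(
    """
    UPDATE tbl
    SET r = (SELECT prev.r FROM tbl AS prev WHERE prev.id = tbl.oid)
    WHERE r = 0
    """
)

fixed = [row[0] for row in conn.execute("SELECT r FROM tbl ORDER BY id")]
print(fixed)  # [10, 20, 20, 40, 50, 50, 70, 80]
```

Note that a zero in a row whose oid matches no id (for example the very first row) would become NULL, because the scalar subquery finds no row.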

Get all table values if match in 2 other tables exists

I have a table "channel".
channelId
a
b
c
d
a table "video"
videoId | channelId
1 | a
2 | b
3 | c
4 | e
a table "comment"
commentID | videoID | videoID_channelID
xx | 1 | a
yy | 2 | b
zz | 5 | e
tt | 6 | f
Keys are:
channel.channelId = video.channelId = comment.videoID_channelID
video.videoId = comment.videoID
I need:
all channels with at least 1 video and 1 comment
all videos with at least 1 channel and 1 comment
all comments with a video and a channel
So I want to do 3 SQL statements, one for each table that references the other 2.
I tried it with a double inner-join (https://www.sqlitetutorial.net/sqlite-inner-join/) but it seems to return all combinations that fit rather than:
channelId
a
b
videoId | channelId
1 | a
2 | b
commentID | videoID | videoID_channelID
xx | 1 | a
yy | 2 | b
My code so far to get all channels with at least 1 video and 1 comment:
SELECT
channel.channelId
FROM
channel
INNER JOIN video ON video.channelId = channel.channelId
INNER JOIN comment ON comment.videoID_channelID = video.channelId

You can get all the results that you want with the same query that joins all 3 tables, but for each case select different columns:
SELECT c.channelId
FROM channel c
INNER JOIN video v ON v.channelId = c.channelId
INNER JOIN comment cm ON cm.videoID_channelID = v.channelId;
SELECT v.videoID, c.channelId
FROM channel c
INNER JOIN video v ON v.channelId = c.channelId
INNER JOIN comment cm ON cm.videoID_channelID = v.channelId;
SELECT cm.commentID, v.videoID, c.channelId
FROM channel c
INNER JOIN video v ON v.channelId = c.channelId
INNER JOIN comment cm ON cm.videoID_channelID = v.channelId;
You may have to add DISTINCT after each SELECT if you get duplicates in your actual data.
See the demo.
Results:
| channelId |
| --------- |
| a |
| b |
| videoID | channelId |
| ------- | --------- |
| 1 | a |
| 2 | b |
| commentID | videoID | channelId |
| --------- | ------- | --------- |
| xx | 1 | a |
| yy | 2 | b |
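The queries above join comment to video on the channel ID only. The question also lists video.videoId = comment.videoID as a key, so for real data it may be safer to match on both columns. The following is a sketch under that assumption, using Python's standard sqlite3 module and the sample rows from the question; DISTINCT is added to guard against duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE channel (channelId TEXT);
    CREATE TABLE video (videoId INT, channelId TEXT);
    CREATE TABLE comment (commentID TEXT, videoID INT, videoID_channelID TEXT);
    INSERT INTO channel VALUES ('a'), ('b'), ('c'), ('d');
    INSERT INTO video VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'e');
    INSERT INTO comment VALUES
        ('xx', 1, 'a'), ('yy', 2, 'b'), ('zz', 5, 'e'), ('tt', 6, 'f');
    """
)

# Assumption: a comment belongs to a video only if BOTH key columns match.
joined = """
    FROM channel c
    INNER JOIN video v ON v.channelId = c.channelId
    INNER JOIN comment cm ON cm.videoID_channelID = v.channelId
                         AND cm.videoID = v.videoId
"""

channels = conn.execute(
    "SELECT DISTINCT c.channelId" + joined + "ORDER BY c.channelId"
).fetchall()
videos = conn.execute(
    "SELECT DISTINCT v.videoId, c.channelId" + joined + "ORDER BY v.videoId"
).fetchall()
comments = conn.execute(
    "SELECT DISTINCT cm.commentID, v.videoId, c.channelId" + joined
    + "ORDER BY cm.commentID"
).fetchall()

print(channels)
print(videos)
print(comments)
```

On the sample data the extra condition does not change the results; it only matters when a channel has several videos and a comment should count for just one of them.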

SQLite take N rows per each group

I have an SQLite table similar to the following:
| A | B |
_________
| e | 5 |
| f | 7 |
| a | 5 |
| n | 7 |
| g | 5 |
| d | 7 |
| i | 5 |
| j | 5 |
| e | 7 |
| v | 7 |
How can I retrieve three random rows with value 5 in column B and three random rows with value 7? I don't know the values in B in advance (5 and 7 are only examples); I want 3 random rows for each distinct value in B. The result does not need to be grouped by the values of column B. It could be something like:
| A | B |
_________
| e | 5 |
| g | 5 |
| e | 7 |
| v | 7 |
| j | 5 |
| f | 7 |

The following almost does what you want:
select t.*
from t
where t.rowid in (select t2.rowid
from t t2
where t2.b = t.b
order by random()
limit 3
);
Alas, the subquery will be run for every row, so this is only approximate because the random number generator changes values on each execution.
One solution is to use a temporary table to store a random number for each row, which can then be used for sorting. Unfortunately, a CTE doesn't seem to do the trick, because these are re-evaluated on each reference.
After some thought, I think a temporary table might be the only solution:
drop table if exists tempt;
create temporary table tempt as
select t.*, random() as rand
from t;
select t.*
from tempt t
where t.rowid in (select t2.rowid
from tempt t2
where t2.b = t.b
order by rand
limit 3
);
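The temporary-table approach can be checked with a short script. This is a minimal sketch, assuming Python's standard sqlite3 module and the sample rows from the question; the aliases o and i are chosen for this example. Because the random value is stored once per row, the correlated subquery sees a stable ordering and each group yields exactly three rows.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(a TEXT, b INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("e", 5), ("f", 7), ("a", 5), ("n", 7), ("g", 5),
     ("d", 7), ("i", 5), ("j", 5), ("e", 7), ("v", 7)],
)

# Freeze one random number per row so later sorts are repeatable.
conn.execute("DROP TABLE IF EXISTS tempt")
conn.execute(
    "CREATE TEMPORARY TABLE tempt AS SELECT t.*, random() AS rand FROM t"
)

picked = conn.execute(
    """
    SELECT o.a, o.b
    FROM tempt AS o
    WHERE o.rowid IN (SELECT i.rowid
                      FROM tempt AS i
                      WHERE i.b = o.b
                      ORDER BY i.rand
                      LIMIT 3)
    """
).fetchall()

per_group = Counter(b for _, b in picked)
print(picked)
print(per_group)
```

Which rows are picked changes every time the temporary table is rebuilt, but the count per value of b stays at three as long as the group has at least three rows.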

You can use the hidden RowID column to get three rows per B value as follows:
SELECT A, B FROM T T1
WHERE RowID IN (SELECT RowID FROM T T2 WHERE B = T1.B LIMIT 3);
Note that you're likely (but not 100% guaranteed) to get the same three rows each time. If you want to get random rows at the expense of some performance, you can do:
SELECT A, B FROM T T1
WHERE RowID IN (SELECT RowID FROM T T2 WHERE B = T1.B ORDER BY random() LIMIT 3);

SQLite - Update a column based on values from two other tables' columns

I am trying to update Data1's ID to Record2's ID when:
Record1's and Record2's Name are the same, and
Weight is greater in Record2.
Record1
| ID | Weight | Name |
|----|--------|------|
| 1 | 10 | a |
| 2 | 10 | b |
| 3 | 10 | c |
Record2
| ID | Weight | Name |
|----|--------|------|
| 4 | 20 | a |
| 5 | 20 | b |
| 6 | 20 | c |
Data1
| ID | Weight |
|----|--------|
| 4 | 40 |
| 5 | 40 |
I have tried the following SQLite query:
update data1
set id =
(select record2.id
from record2,record1
where record1.name=record2.name
and record1.weight<record2.weight)
where id in
(select record1.id
from record1, record2
where record1.name=record2.name
and record1.weight<record2.weight)
Using the above query Data1's id is updated to 4 for all records.
NOTE: Record1's ID is the foreign key for Data1.

For the given data set the following seems to serve the cause:
update data1
set id =
(select record2.id
from record2,record1
where
data1.id = record1.id
and record1.name=record2.name
and record1.weight<record2.weight)
where id in
(select record1.id
from record1, record2
where
record1.id in (select id from data1)
and record1.name=record2.name
and record1.weight<record2.weight)
;
See it in action: SQL Fiddle.
Please comment if this requires adjustment or further detail.
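The update can be reproduced with a small script. This is a sketch only, using Python's standard sqlite3 module. It assumes that Data1 initially holds Record1 IDs 1 and 2 (the question states that Record1's ID is the foreign key for Data1, and the Data1 table shown there appears to be the desired end state); that starting data is an assumption, not something the question confirms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE record1 (id INT, weight INT, name TEXT);
    CREATE TABLE record2 (id INT, weight INT, name TEXT);
    CREATE TABLE data1 (id INT, weight INT);
    INSERT INTO record1 VALUES (1, 10, 'a'), (2, 10, 'b'), (3, 10, 'c');
    INSERT INTO record2 VALUES (4, 20, 'a'), (5, 20, 'b'), (6, 20, 'c');
    -- Assumed starting state: data1 references record1 IDs 1 and 2.
    INSERT INTO data1 VALUES (1, 40), (2, 40);
    """
)

# The subquery is correlated on data1.id, so each row gets its own match
# instead of the first record2.id found.
conn.execute(
    """
    UPDATE data1
    SET id = (SELECT record2.id
              FROM record2, record1
              WHERE data1.id = record1.id
                AND record1.name = record2.name
                AND record1.weight < record2.weight)
    WHERE id IN (SELECT record1.id
                 FROM record1, record2
                 WHERE record1.id IN (SELECT id FROM data1)
                   AND record1.name = record2.name
                   AND record1.weight < record2.weight)
    """
)

ids = [row[0] for row in conn.execute("SELECT id FROM data1 ORDER BY id")]
print(ids)  # [4, 5]
```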

Oracle 11g r2: strange behavior on index

I have the query:
SELECT count(*)
FROM
(
SELECT
TBELENCO.DATA_PROC, TBELENCO.POD, TBELENCO.DESCRIZIONE, TBELENCO.ERROR, TBELENCO.STATO,
TBELENCO.SEZIONE, TBELENCO.NOME_FILE, TBELENCO.ID_CARICAMENTO, TBELENCO.ESITO_OPERAZIONE,
TBELENCO.DES_TIPO_MISURA,
--TBELENCO.RAGIONE_SOCIALE,
--ROW_NUMBER() OVER (ORDER BY TBELENCO.DATA_PROC DESC) R
ROWNUM R
FROM(
SELECT
LOG.DATA_PROC, LOG.POD, LOG.DESCRIZIONE, LOG.ERROR, LOG.STATO,
LOG.SEZIONE, LOG.NOME_FILE, LOG.ID_CARICAMENTO, LOG.ESITO_OPERAZIONE, TM.DES_TIPO_MISURA
--,C.RAGIONE_SOCIALE
--ROW_NUMBER() OVER (ORDER BY LOG.DATA_PROC DESC) R
FROM
MS042_LOADING_LOGS LOG JOIN MS116_MEASURE_TYPES TM ON
TM.ID_TIPO_MISURA=LOG.SEZIONE
-- LEFT JOIN(
-- SELECT CUST.RAGIONE_SOCIALE,STR.POD,RSC.DATA_DA, RSC.DATA_A
-- FROM
-- MS038_METERS STR JOIN MS036_REL_SITES_CUSTOMERS RSC ON
-- STR.ID_SITO=RSC.ID_SITO
-- JOIN MS030_CUSTOMERS CUST ON
-- CUST.ID_CLIENTE=RSC.ID_CLIENTE
-- ) C ON
-- C.POD=LOG.POD
--AND LOG.DATA_PROC BETWEEN C.DATA_DA AND C.DATA_A
WHERE
1=1
--AND LOG.DATA_PROC>=TRUNC(SYSDATE)
AND LOG.DATA_PROC>=TRUNC(SYSDATE)-3
--TO_DATE('01/11/2014', 'DD/MM/YYYY')
) TBELENCO
)
WHERE
R BETWEEN 1 AND 200;
If I execute the query with AND LOG.DATA_PROC>=TRUNC(SYSDATE)-3, Oracle uses the index on the data_proc field of the MS042_LOADING_LOGS (LOG) table. If I instead use AND LOG.DATA_PROC>=TRUNC(SYSDATE)-4 (or -5, -6, etc.), it uses a full table access. Why does it behave this way?
I also executed:
ALTER INDEX MS042_DATA_PROC_IDX REBUILD;
but nothing changed.
Thanks,
Igor
--***********************************************************
SELECT count(*)
FROM
(
SELECT
TBELENCO.DATA_PROC, TBELENCO.POD, TBELENCO.DESCRIZIONE, TBELENCO.ERROR, TBELENCO.STATO,
TBELENCO.SEZIONE, TBELENCO.NOME_FILE, TBELENCO.ID_CARICAMENTO, TBELENCO.ESITO_OPERAZIONE,
TBELENCO.DES_TIPO_MISURA,
ROWNUM R
FROM(
SELECT
LOG.DATA_PROC, LOG.POD, LOG.DESCRIZIONE, LOG.ERROR, LOG.STATO,
LOG.SEZIONE, LOG.NOME_FILE, LOG.ID_CARICAMENTO, LOG.ESITO_OPERAZIONE, TM.DES_TIPO_MISURA
FROM
MS042_LOADING_LOGS LOG JOIN MS116_MEASURE_TYPES TM ON
TM.ID_TIPO_MISURA=LOG.SEZIONE
WHERE
1=1
AND LOG.DATA_PROC>=TRUNC(SYSDATE)-1
) TBELENCO
)
WHERE
R BETWEEN 1 AND 200;
Plan hash value: 2191058229
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 30866 (2)| 00:06:11 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
|* 2 | VIEW | | 94236 | 1196K| 30866 (2)| 00:06:11 |
| 3 | COUNT | | | | | |
|* 4 | HASH JOIN | | 94236 | 1104K| 30866 (2)| 00:06:11 |
| 5 | INDEX FULL SCAN | P087_TIPI_MISURE_PK | 15 | 30 | 1 (0)| 00:00:01 |
| 6 | TABLE ACCESS BY INDEX ROWID| MS042_LOADING_LOGS | 94236 | 920K| 30864 (2)| 00:06:11 |
|* 7 | INDEX RANGE SCAN | MS042_DATA_PROC_IDX | 94236 | | 25742 (2)| 00:05:09 |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("R"<=200 AND "R">=1)
4 - access("TM"."ID_TIPO_MISURA"="LOG"."SEZIONE")
7 - access(SYS_OP_DESCEND("DATA_PROC")<=SYS_OP_DESCEND(TRUNC(SYSDATE#!)-1))
filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("DATA_PROC"))>=TRUNC(SYSDATE#!)-1)
Plan hash value: 69930686
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 95921 (1)| 00:19:12 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
|* 2 | VIEW | | 1467K| 18M| 95921 (1)| 00:19:12 |
| 3 | COUNT | | | | | |
|* 4 | HASH JOIN | | 1467K| 16M| 95921 (1)| 00:19:12 |
| 5 | INDEX FULL SCAN | P087_TIPI_MISURE_PK | 15 | 30 | 1 (0)| 00:00:01 |
|* 6 | TABLE ACCESS FULL| MS042_LOADING_LOGS | 1467K| 13M| 95912 (1)| 00:19:11 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("R"<=200 AND "R">=1)
4 - access("TM"."ID_TIPO_MISURA"="LOG"."SEZIONE")
6 - filter("LOG"."DATA_PROC">=TRUNC(SYSDATE#!)-4)

The larger the fraction of rows that will be returned, the more efficient a table scan is and the less efficient it is to use an index. Apparently, Oracle expects that inflection point to come when the query returns more than 3 days of data. If that is inaccurate, I would expect that the statistics on your table or indexes are inaccurate.
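The figures in the two plans give a rough feel for where that inflection point sits. The calculation below is a back-of-the-envelope sketch, not Oracle's actual cost model: it assumes the cost of index access grows linearly with the number of rows fetched, which is a simplification, and it only uses numbers copied from the plans in the question.

```python
# Figures copied from the two execution plans in the question.
index_rows = 94_236          # estimated rows for the TRUNC(SYSDATE)-1 predicate
index_cost = 30_864          # cost of TABLE ACCESS BY INDEX ROWID for those rows
full_scan_cost = 95_912      # cost of TABLE ACCESS FULL on MS042_LOADING_LOGS
full_scan_rows = 1_467_000   # about 1467K estimated rows for TRUNC(SYSDATE)-4

# Assumption: index access cost scales linearly with the rows fetched.
cost_per_row = index_cost / index_rows

# Row count at which index access would cost as much as one full scan.
break_even_rows = full_scan_cost / cost_per_row

print(f"cost per row via index: {cost_per_row:.3f}")
print(f"break-even row count:   {break_even_rows:,.0f}")
print(f"-4 estimate exceeds it: {full_scan_rows > break_even_rows}")
```

Under this simplified model the break-even lands a little under 300,000 rows, well below the roughly 1.47 million rows the optimizer estimates for the four-day range, which is consistent with it switching to a full scan there.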
