BigQuery - Duplicate rows x time using recursive method - recursion

I tried duplicating all rows in a table using the recursive method.
I have this base table
base
+------+
| rn |
+------+
| 1 |
| 2 |
| 3 |
+------+
Then, I want to replicate each of rows 3 times so the expected output would be like (9 rows)
+------+-------------+
| rn | iteration |
+------+-------------+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 1 | 2 |
| 2 | 2 |
| 3 | 2 |
| 1 | 3 |
| 2 | 3 |
| 3 | 3 |
+------+-------------+
I want to use recursive method
WITH recursive test AS (
SELECT
*,
1 AS iteration
FROM `base`
UNION ALL
SELECT
a.*,
b.iteration + 1 AS iteration
FROM `base` a
JOIN test b ON b.iteration < 3
)
SELECT *
FROM test
ORDER BY 2,1
But the result is not what I expected, it would produce 39 rows instead of 9 rows, it seems it's because of the JOIN that using cross join. Is there any way to fix the query to produce the expected value?
Thanks in advance!

Instead of RECURSIVE CTE, you can consider below
WITH base AS (
SELECT * FROM UNNEST([1, 2, 3]) rn
)
SELECT *
FROM base, UNNEST(GENERATE_ARRAY(1, 3)) iteration;
Query results
And for recursive approach, you need to add another join condition a.rn = b.rn.
WITH RECURSIVE base AS (
SELECT * FROM UNNEST([1, 2, 3]) rn
),
test AS (
SELECT *, 1 AS iteration FROM base
UNION ALL
SELECT b.*, a.iteration + 1
FROM test a
JOIN `base` b ON a.rn = b.rn AND a.iteration < 3
)
SELECT * FROM test ORDER BY 2,1;

Related

Sqlite / populate new column that ranks the existing rows

I've a SQLite database table with the following columns:
| day | place | visitors |
-------------------------------------
| 2021-05-01 | AAA | 20 |
| 2021-05-01 | BBB | 10 |
| 2021-05-01 | CCC | 3 |
| 2021-05-02 | AAA | 5 |
| 2021-05-02 | BBB | 7 |
| 2021-05-02 | CCC | 2 |
Now I would like to introduce a column 'rank' which indicates the rank according to the visitors each day. Expected table would look like:
| day | place | visitors | Rank |
------------------------------------------
| 2021-05-01 | AAA | 20 | 1 |
| 2021-05-01 | BBB | 10 | 2 |
| 2021-05-01 | CCC | 3 | 3 |
| 2021-05-02 | AAA | 5 | 2 |
| 2021-05-02 | BBB | 7 | 1 |
| 2021-05-02 | CCC | 2 | 3 |
Populating the data for the new column Rank can be done with a program like (Pseudocode).
for each i_day in all_days:
SELECT
ROW_NUMBER () OVER (ORDER BY `visitors` DESC) Day_Rank, place
FROM mytable
WHERE `day` = 'i_day'
for each i_place in all_places:
UPDATE mytable
SET rank= Day_Rank
WHERE `Day`='i_day'
AND place = 'i_place'
Since this line by line update is quite inefficient, I'm searching how to optimize this with a SQL sub query in combination with the UPDATE.
(does not work so far...)
for each i_day in all_days:
UPDATE mytable
SET rank= (
SELECT
ROW_NUMBER () OVER (ORDER BY `visitors` DESC) Day_Rank
FROM mytable
WHERE `day` = 'i_day'
)
Typically, this can be done with a subquery that counts the number of rows with visitors greater than the value of visitors of the current row:
UPDATE mytable
SET Day_Rank = (
SELECT COUNT(*) + 1
FROM mytable m
WHERE m.day = mytable.day AND m.visitors > mytable.visitors
);
Note that the result is actually what RANK() would return, if there are ties in the values of visitors.
See the demo.
Or, you could calculate the rankings with ROW_NUMBER() in a CTE and use it in a subquery:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY day ORDER BY visitors DESC) rn
FROM mytable
)
UPDATE mytable
SET Day_Rank = (SELECT rn FROM cte c WHERE (c.day, c.place) = (mytable.day, mytable.place));
See the demo.
Or, if your versipn of SQLite is 3.33.0+ you can use the join-like UPDATE...FROM... syntax:
UPDATE mytable AS m
SET Day_Rank = t.rn
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY day ORDER BY visitors DESC) rn
FROM mytable
) t
WHERE (t.day, t.place) = (m.day, m.place);

Updating multiple rows in SQLite with relevant data from the same table

I have a database that I don't control the source of directly and results in errant '0' entries which mess up generated graphs with these drops to zero. I am able to manipulate the data after the fact and update that database.
It is acceptable that the last known good value can be used instead and so I am trying to make a general query that will remove all the zeros and populate it with the last known value.
Luckily, every entry includes the ID of the last entry and so it is a matter of simply looking back and grabbing it.
I have got very close to a final answer, but instead of updating with the last good value, it just uses the first value over and over again.
dummy data
CREATE TABLE tbl(id INT,r INT,oid INT);
INSERT INTO tbl VALUES(1,10,0);
INSERT INTO tbl VALUES(2,20,1);
INSERT INTO tbl VALUES(3,0,2);
INSERT INTO tbl VALUES(4,40,3);
INSERT INTO tbl VALUES(5,50,4);
INSERT INTO tbl VALUES(6,0,5);
INSERT INTO tbl VALUES(7,70,6);
INSERT INTO tbl VALUES(8,80,7);
SELECT * FROM tbl;
OUTPUT:
| id| r |oid|
|---|----|---|
| 1 | 10 | 0 |
| 2 | 20 | 1 |
| 3 | 0 | 2 | ** NEEDS FIXING
| 4 | 40 | 3 |
| 5 | 50 | 4 |
| 6 | 0 | 5 | ** NEEDS UPDATE
| 7 | 70 | 6 |
| 8 | 80 | 7 |
I have worked several queries to get results around what I am after:
All zero entries:
SELECT * FROM tbl WHERE r = 0;
OUTPUT:
| id | r | oid |
|----|----|-----|
| 3 | 0 | 2 |
| 6 | 0 | 5 |
Output only the those rows with the preceding good row
SELECT * FROM tbl WHERE A in (
SELECT id FROM tbl WHERE r = 0
UNION
SELECT oid FROM tbl WHERE r = 0
)
OUTPUT:
| id| r |oid|
|---|----|---|
| 2 | 20 | 1 |
| 3 | 0 | 2 |
| 5 | 50 | 4 |
| 6 | 0 | 5 |
Almost works
This is as close as I have got, it does change all the zero's, but it changes them all to the value of the first lookup
UPDATE tbl
SET r = (SELECT r
FROM tbl
WHERE id in (SELECT oid
FROM tbl
WHERE r = 0)
) WHERE r = 0 ;
OUTPUT:
| id| r |oid|
|---|----|---|
| 1 | 10 | 0 |
| 2 | 20 | 1 |
| 3 | 20 | 2 | ** GOOD
| 4 | 40 | 3 |
| 5 | 50 | 4 |
| 6 | 20 | 5 | ** BAD, should be 50
| 7 | 70 | 6 |
| 8 | 80 | 7 |
If it helps, I created this fiddle here that I've been playing with:
http://sqlfiddle.com/#!5/8afff/1
For this sample data all you have to do is use the correct correlated subquery that returns the value of r from the row with id equal to the current oid in the WHERE clause:
UPDATE tbl AS t
SET r = (SELECT tt.r FROM tbl tt WHERE tt.id = t.oid)
WHERE t.r = 0;
See the demo.

MonetDB recursive CTE (common table expressions)

It seems MonetDB does not support recursive CTE. This is a useful feature that I used to get BOM from ERP systems. For a greater flexibility I used Firebird recursive stored procedures to enhance the output with extra calculations. A good example of SQLServer recursive CTE can be found here https://www.essentialsql.com/recursive-ctes-explained/
Question is: Is it any way I can achieve similar results in MonetDB?
There is currently no support for recursive CTEs in MonetDB[Lite]. The solution you have proposed yourself seems like the way to go.
It is clear that once I have access to procedures, variables and while-loop, something can be done. The following code provides me the desired result using temporary tables. I would appreciate if anybody can provide me an alternative to this solution that provides the same results without using the temporary tables overhead.
CREATE TEMPORARY TABLE BOM (parent_id string, comp_id string, qty double) ON COMMIT PRESERVE ROWS;
INSERT INTO BOM VALUES('a','b',5), ('a','c',2), ('b','d',4), ('b','c',7), ('c','e',3);
select * from BOM;
+-----------+---------+--------------------------+
| parent_id | comp_id | qty |
+===========+=========+==========================+
| a | b | 5 |
| a | c | 2 |
| b | d | 4 |
| b | c | 7 |
| c | e | 3 |
+-----------+---------+--------------------------+
CREATE TEMPORARY TABLE EXPLODED_BOM (parent_id string, comp_id string, path string, qty double, level integer) ON COMMIT PRESERVE ROWS;
CREATE OR REPLACE PROCEDURE UPDATE_BOM()
BEGIN
DECLARE prev_count int;
DECLARE crt_count int;
DECLARE crt_level int;
delete from EXPLODED_BOM; --make sure is empty
insert into EXPLODED_BOM select parent_id, comp_id, parent_id||'-'||comp_id, qty, 0 from BOM; --insert first level
SET prev_count = 0;
SET crt_count = (select count(*) from EXPLODED_BOM);
SET crt_level = 0;
-- (crt_level < 100) avoids possible infinite loop, if BOM is malformed
WHILE (crt_level < 100) and (crt_count > prev_count) DO
SET prev_count = crt_count;
insert into EXPLODED_BOM select e.parent_id, a.comp_id, e.path||'-'||a.comp_id, a.qty*e.qty, crt_level+1
from BOM a, EXPLODED_BOM e
where a.parent_id = e.comp_id and e.level=crt_level;
-- is it any chance to get the amount of "affected rows" by insert, update or delete statements, this way I can avoid checking the new count?
SET crt_count = (select count(*) from EXPLODED_BOM);
SET crt_level = crt_level +1;
END WHILE;
END;
call UPDATE_BOM();
select * from EXPLODED_BOM;
+-----------+---------+---------+--------------------------+-------+
| parent_id | comp_id | path | qty | level |
+===========+=========+=========+==========================+=======+
| a | b | a-b | 5 | 0 |
| a | c | a-c | 2 | 0 |
| b | d | b-d | 4 | 0 |
| b | c | b-c | 7 | 0 |
| c | e | c-e | 3 | 0 |
| a | d | a-b-d | 20 | 1 |
| a | c | a-b-c | 35 | 1 |
| a | e | a-c-e | 6 | 1 |
| b | e | b-c-e | 21 | 1 |
| a | e | a-b-c-e | 105 | 2 |
+-----------+---------+---------+--------------------------+-------+

SQLite take N rows per each group

I have an SQLite table similar to the following:
| A | B |
_________
| e | 5 |
| f | 7 |
| a | 5 |
| n | 7 |
| g | 5 |
| d | 7 |
| i | 5 |
| j | 5 |
| e | 7 |
| v | 7 |
How can I retrieve three random rows with value 5 in column B and three random rows with value 7? I don't know values in B, neither values5 ad 7. I want 3 random rows for each different value in B. Result may be not grouped by column B values. It could be something like:
| A | B |
_________
| e | 5 |
| g | 5 |
| e | 7 |
| v | 7 |
| j | 5 |
| f | 7 |
The following almost does what you want:
select t.*
from t
where t.rowid in (select t2.rowid
from t t2
where t2.b = t.b
order by random()
limit 3
);
Alas, the subquery will be run for every row, so this is only approximate because the random number generator changes values on each execution.
One solution is to use a temporary table to store a random number for each row, which can then be used for sorting. Unfortunately, a CTE doesn't seem to do the trick, because these are re-evaluated on each reference.
After some thought, I think a temporary table might be the only solution:
drop table if exists tempt;
create temporary table tempt as
select t.*, random() as rand
from t;
select t.*
from tempt t
where t.rowid in (select t2.rowid
from tempt t2
where t2.b = t.b
order by rand
limit 3
);
You can use the hidden RowID column to get three rows per B value as follows:
SELECT A, B FROM T T1
WHERE RowID IN (SELECT RowID FROM T T2 WHERE B = T1.B LIMIT 3);
Note that you're likely (but not 100% guaranteed) to get the same three rows each time. If you want to get random rows at the expense of some performance, you can do:
SELECT A, B FROM T T1
WHERE RowID IN (SELECT RowID FROM T T2 WHERE B = T1.B ORDER BY random() LIMIT 3);

How to get self-differences in sqlite table

does anyone know if there is a way to get this result in SQLite.
Given table with single column x like this:
x |
--
1
4
5
2
I need to add column dx, which is simply a difference x_i - x_{i-1} (except for the first one) like this:
x | dx |
-- --
1 | 0
4 | 3
5 | 1
2 | -3
Thanks a lot!
Update: given there is id column:
id | x |
-- --
1 | 1
2 | 4
3 | 5
4 | 2
Is it possible to obtain:
id | x | dx |
-- -- --
1 | 1 | 0
2 | 4 | 3
3 | 5 | 1
4 | 2 | -3
SQL tables have no implicit order associated with them. You must supply an ORDER BY clause to impose an order on the results.
What column would you order by to define the predecessor row for the subtraction? (Hint: there is none.)
With the addition of an id column per the revised question
sqlite> select id, x, (select t1.x - t2.x from t as t2 where id = t1.id - 1) from t as t1;
1|1|
2|4|3
3|5|1
4|2|-3
Or
sqlite> select id, x, coalesce((select t1.x - t2.x from t as t2 where id = t1.id - 1),0) from t as t1;
1|1|0
2|4|3
3|5|1
4|2|-3

Resources