SQLite take N rows per each group

SQLite take N rows per each group - sqlite

I have an SQLite table similar to the following:
| A | B |
_________
| e | 5 |
| f | 7 |
| a | 5 |
| n | 7 |
| g | 5 |
| d | 7 |
| i | 5 |
| j | 5 |
| e | 7 |
| v | 7 |
How can I retrieve three random rows with value 5 in column B and three random rows with value 7? I don't know values in B, neither values5 ad 7. I want 3 random rows for each different value in B. Result may be not grouped by column B values. It could be something like:
| A | B |
_________
| e | 5 |
| g | 5 |
| e | 7 |
| v | 7 |
| j | 5 |
| f | 7 |

The following almost does what you want:
select t.*
from t
where t.rowid in (select t2.rowid
from t t2
where t2.b = t.b
order by random()
limit 3
);
Alas, the subquery will be run for every row, so this is only approximate because the random number generator changes values on each execution.
One solution is to use a temporary table to store a random number for each row, which can then be used for sorting. Unfortunately, a CTE doesn't seem to do the trick, because these are re-evaluated on each reference.
After some thought, I think a temporary table might be the only solution:
drop table if exists tempt;
create temporary table tempt as
select t.*, random() as rand
from t;
select t.*
from tempt t
where t.rowid in (select t2.rowid
from tempt t2
where t2.b = t.b
order by rand
limit 3
);

You can use the hidden RowID column to get three rows per B value as follows:
SELECT A, B FROM T T1
WHERE RowID IN (SELECT RowID FROM T T2 WHERE B = T1.B LIMIT 3);
Note that you're likely (but not 100% guaranteed) to get the same three rows each time. If you want to get random rows at the expense of some performance, you can do:
SELECT A, B FROM T T1
WHERE RowID IN (SELECT RowID FROM T T2 WHERE B = T1.B ORDER BY random() LIMIT 3);

Related

BigQuery - Duplicate rows x time using recursive method

I tried duplicating all rows in a table using the recursive method.
I have this base table
base
+------+
| rn |
+------+
| 1 |
| 2 |
| 3 |
+------+
Then, I want to replicate each of rows 3 times so the expected output would be like (9 rows)
+------+-------------+
| rn | iteration |
+------+-------------+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 1 | 2 |
| 2 | 2 |
| 3 | 2 |
| 1 | 3 |
| 2 | 3 |
| 3 | 3 |
+------+-------------+
I want to use recursive method
WITH recursive test AS (
SELECT
*,
1 AS iteration
FROM `base`
UNION ALL
SELECT
a.*,
b.iteration + 1 AS iteration
FROM `base` a
JOIN test b ON b.iteration < 3
)
SELECT *
FROM test
ORDER BY 2,1
But the result is not what I expected, it would produce 39 rows instead of 9 rows, it seems it's because of the JOIN that using cross join. Is there any way to fix the query to produce the expected value?
Thanks in advance!

Instead of RECURSIVE CTE, you can consider below
WITH base AS (
SELECT * FROM UNNEST([1, 2, 3]) rn
)
SELECT *
FROM base, UNNEST(GENERATE_ARRAY(1, 3)) iteration;
Query results
And for recursive approach, you need to add another join condition a.rn = b.rn.
WITH RECURSIVE base AS (
SELECT * FROM UNNEST([1, 2, 3]) rn
),
test AS (
SELECT *, 1 AS iteration FROM base
UNION ALL
SELECT b.*, a.iteration + 1
FROM test a
JOIN `base` b ON a.rn = b.rn AND a.iteration < 3
)
SELECT * FROM test ORDER BY 2,1;

sql count of rows from two tables where rows from table 1 don't exist in table 2

I have two tables in a SQLite db like so
| t1 |
|===============|
| t1_id | other |
|-------|-------|
| 1 | sdfds |
| 2 | asdaa |
| 3 | lkjeq |
| t2 |
|=======================|
| t2_id | t1_id | other |
|-------|-------|-------|
| 1 | 1 | dggeh |
| 2 | 1 | iohio |
| 3 | 3 | ytucc |
| 4 | 3 | .noih |
| 5 | 3 | /oioi |
There are approx. 300K+ rows in both tables. I want the counts of t1 that have related rows in t2 as well as those that don't have. That is,
"count of t1 with related t2" : 2
"count of t1 with no related t2": 1
Of course, I can get "count of t1 with related t2" from
SELECT Count(*) FROM t1 JOIN t2 ON t1.t1_id = t2.t1_id;
and the "count of t1 with no related t2" by subtracting the above from the total count of t1. But how can I get it from a SQL query efficiently? I tried the following
EXPLAIN QUERY PLAN SELECT t1_id FROM t1 AS t
WHERE NOT EXISTS (SELECT * FROM t2 WHERE t.t1_id = t2.t1_id);
and I see that the index on t1_id in table t2 is not used. When I try the query, it takes really long, like 100s of seconds.

A couple of possibilities:
SELECT count(*) FROM
(SELECT t1_id FROM t1
EXCEPT
SELECT t1_id FROM t2);
or
SELECT count(*)
FROM t1
WHERE t1_id NOT IN (SELECT t1_id FROM t2);
or (A slight variation on your original):
SELECT count(*)
FROM t1
WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t1.t1_id = t2.t1_id);
All three of those should use an index on t2(t1_idx). You'll have to do benchmarking on real data to see which is faster.

MonetDB recursive CTE (common table expressions)

It seems MonetDB does not support recursive CTE. This is a useful feature that I used to get BOM from ERP systems. For a greater flexibility I used Firebird recursive stored procedures to enhance the output with extra calculations. A good example of SQLServer recursive CTE can be found here https://www.essentialsql.com/recursive-ctes-explained/
Question is: Is it any way I can achieve similar results in MonetDB?

There is currently no support for recursive CTEs in MonetDB[Lite]. The solution you have proposed yourself seems like the way to go.

It is clear that once I have access to procedures, variables and while-loop, something can be done. The following code provides me the desired result using temporary tables. I would appreciate if anybody can provide me an alternative to this solution that provides the same results without using the temporary tables overhead.
CREATE TEMPORARY TABLE BOM (parent_id string, comp_id string, qty double) ON COMMIT PRESERVE ROWS;
INSERT INTO BOM VALUES('a','b',5), ('a','c',2), ('b','d',4), ('b','c',7), ('c','e',3);
select * from BOM;
+-----------+---------+--------------------------+
| parent_id | comp_id | qty |
+===========+=========+==========================+
| a | b | 5 |
| a | c | 2 |
| b | d | 4 |
| b | c | 7 |
| c | e | 3 |
+-----------+---------+--------------------------+
CREATE TEMPORARY TABLE EXPLODED_BOM (parent_id string, comp_id string, path string, qty double, level integer) ON COMMIT PRESERVE ROWS;
CREATE OR REPLACE PROCEDURE UPDATE_BOM()
BEGIN
DECLARE prev_count int;
DECLARE crt_count int;
DECLARE crt_level int;
delete from EXPLODED_BOM; --make sure is empty
insert into EXPLODED_BOM select parent_id, comp_id, parent_id||'-'||comp_id, qty, 0 from BOM; --insert first level
SET prev_count = 0;
SET crt_count = (select count(*) from EXPLODED_BOM);
SET crt_level = 0;
-- (crt_level < 100) avoids possible infinite loop, if BOM is malformed
WHILE (crt_level < 100) and (crt_count > prev_count) DO
SET prev_count = crt_count;
insert into EXPLODED_BOM select e.parent_id, a.comp_id, e.path||'-'||a.comp_id, a.qty*e.qty, crt_level+1
from BOM a, EXPLODED_BOM e
where a.parent_id = e.comp_id and e.level=crt_level;
-- is it any chance to get the amount of "affected rows" by insert, update or delete statements, this way I can avoid checking the new count?
SET crt_count = (select count(*) from EXPLODED_BOM);
SET crt_level = crt_level +1;
END WHILE;
END;
call UPDATE_BOM();
select * from EXPLODED_BOM;
+-----------+---------+---------+--------------------------+-------+
| parent_id | comp_id | path | qty | level |
+===========+=========+=========+==========================+=======+
| a | b | a-b | 5 | 0 |
| a | c | a-c | 2 | 0 |
| b | d | b-d | 4 | 0 |
| b | c | b-c | 7 | 0 |
| c | e | c-e | 3 | 0 |
| a | d | a-b-d | 20 | 1 |
| a | c | a-b-c | 35 | 1 |
| a | e | a-c-e | 6 | 1 |
| b | e | b-c-e | 21 | 1 |
| a | e | a-b-c-e | 105 | 2 |
+-----------+---------+---------+--------------------------+-------+

update table in sorted order

I am maintaining table structure like below.
sortid | id | name
1 | 1 | aa
3 | 2 | cc
4 | 3 | cc
2 | 4 | bb
5 | 5 | dd
Where sortid is maintained according to ascending order of name.
Now I want to update name 'dd' to 'aa', such way that sort id is also updated to its correct value.
Update table set name="bb" where name like "dd";
After updating my table should become like below.
sortid | id | name
1 | 1 | aa
4 | 2 | cc
5 | 3 | cc
3 | 4 | bb
2 | 5 | aa

That sortid is the number of rows that would be sorted before this row.
So you can compute it by counting rows:
UPDATE MyTable
SET sortid = (SELECT COUNT(*)
FROM MyTable AS T2
WHERE T2.name < MyTable.name) +
(SELECT COUNT(*)
FROM MyTable AS T2
WHERE T2.name = MyTable.name
AND T2.id <= MyTable.id);
(The second subquery resolves duplicate sortid values that would result from duplicate names.)

SQLITE get row continous value

I have a simple table. One of the example rows looks like this:
id | name |
1 | a |
2 | a |
3 | a |
4 | b |
6 | b |
7 | a |
8 | a |
I want to get last continuous id.
so if i start at '1', the result should be '4'
in this example, the result should be '7'
3 |a |
4 |b |
5 |a |
6 |a |
7 |a |
10|a |
Only i now is that select all after my input number, and find continuous programmatically.
how can i do..?

If I understand the question correctly, this should work for you:
select id from tableName t1 where not exists(select id from tableName t2 where t2.id=t1.id+1) and (id-(select count(*) from tableName t3 where t3.id<t1.id))=(select min(id) from tableName);
If you want to start from 10, it should be:
select id from tableName t1 where not exists(select id from tableName t2 where t2.id=t1.id+1) and (id-(select count(*) from tableName t3 where t3.id<t1.id and t3.id>=10))=10;

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

SQLite take N rows per each group - sqlite

Related

BigQuery - Duplicate rows x time using recursive method

sql count of rows from two tables where rows from table 1 don't exist in table 2

MonetDB recursive CTE (common table expressions)

update table in sorted order

SQLITE get row continous value

Categories

Resources