Trouble with Sqlite subquery


My CustomTags table may have a series of "temporary" records where Tag_ID is 0, and Tag_Number will have some five digit value.
Periodically, I want to clean up my Sqlite table to remove these temporary values.
For example, I might have:
Tag_ID Tag_Number
0 12345
0 67890
0 45678
1 12345
2 67890
In this case, I want to remove the first two records because they are duplicated with actual Tag_ID 1 and 2. But I don't want to remove the third record yet because it hasn't been duplicated yet.
I have tried a number of different types of subqueries, but I just can't get it working. This is the last thing I tried, but my database client complains of an unknown syntax error. (I have tried with and without AS as an alias)
DELETE FROM CustomTags t1
WHERE t1.Tag_ID = 0
AND (SELECT COUNT(*) FROM CustomTags t2 WHERE t1.Tag_Number = t2.Tag_Number) > 1
Can anyone offer some insight? Thank you

There are many options, but the simplest is probably to use EXISTS:
DELETE FROM CustomTags
WHERE Tag_ID = 0
AND EXISTS(
SELECT 1 FROM CustomTags c
WHERE c.Tag_ID <> 0 AND c.Tag_Number = CustomTags.Tag_Number
)
An SQLfiddle to test with.
...or IN...
DELETE FROM CustomTags
WHERE Tag_ID = 0
AND Tag_Number IN (
SELECT Tag_Number FROM CustomTags WHERE Tag_ID <> 0
)
Another SQLfiddle.
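If you would rather check this locally than in a fiddle, here is a minimal sketch using Python's built-in sqlite3 module. It runs the EXISTS version against the sample rows from the question. The column types in the CREATE TABLE are an assumption, since the question does not show the schema.

```python
import sqlite3

# In-memory database seeded with the sample rows from the question.
# Assumption: both columns are INTEGER (the real schema is not shown).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CustomTags (Tag_ID INTEGER, Tag_Number INTEGER)")
conn.executemany(
    "INSERT INTO CustomTags (Tag_ID, Tag_Number) VALUES (?, ?)",
    [(0, 12345), (0, 67890), (0, 45678), (1, 12345), (2, 67890)],
)

# Delete temporary rows (Tag_ID = 0) whose Tag_Number also exists on a real row.
conn.execute(
    """
    DELETE FROM CustomTags
    WHERE Tag_ID = 0
      AND EXISTS (
          SELECT 1 FROM CustomTags c
          WHERE c.Tag_ID <> 0 AND c.Tag_Number = CustomTags.Tag_Number
      )
    """
)

remaining = conn.execute(
    "SELECT Tag_ID, Tag_Number FROM CustomTags ORDER BY Tag_ID, Tag_Number"
).fetchall()
print(remaining)  # [(0, 45678), (1, 12345), (2, 67890)]
```

The unmatched temporary row (0, 45678) survives, which is what the question asks for.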

With your dataset like so:
sqlite> select * from test;
tag_id tag_number
---------- ----------
1 12345
1 67890
0 12345
0 67890
0 45678
You can run:
delete from test
where rowid not in (
select a.rowid
from test a
inner join (select tag_number, max(tag_id) as mt from test group by tag_number) b
on a.tag_number = b.tag_number
and a.tag_id = b.mt
);
Result:
sqlite> select * from test;
tag_id tag_number
---------- ----------
1 12345
1 67890
0 45678
Please do test this out with a few more test cases than you have to be entirely sure that's what you want. I'd recommend creating a copy of your database before you run this on a large dataset.

Related

SQLITE3 - Deleting rows that have multiple columns of the same value

I have an SQLite3 database from which I want to remove rows that have two fields of the same value.
It seems that I am able to select such values with this query:
SELECT * FROM mydb GROUP BY user_id, num HAVING COUNT(*) > 1
However I am not able to delete them.
DELETE FROM mydb WHERE user_id IN (SELECT * FROM mydb GROUP BY user_id, num HAVING COUNT(*) > 1)
returns a syntax error.
This is what I expect:
Example:
id user_id num
1 1 1
2 1 1
3 2 1
4 1 2
5 2 2
In this example id 1 and 2 have both columns (user_id and num) of the same value so they should be removed. Preferably, but not necessarily I would like to have a solution that would leave only one such row (doesn't matter which one).
Result:
id user_id num
2 1 1
3 2 1
4 1 2
5 2 2
Note: id is a primary key. user_id is a foreign key. num is an INTEGER.

You got a syntax error because your IN operator has a single value on the left (user_id) but the subquery on the right (SELECT *) returns whole multi-column rows. Compare like with like, e.g. WHERE user_id IN (SELECT user_id ...), to avoid it.
Anyway, here's a query to delete all-but-newest:
DELETE FROM mydb
WHERE id NOT IN (
SELECT MAX(id) FROM mydb
GROUP BY user_id, num
);
The subquery will return the highest id for every unique (user_id, num) combination. Then we just delete all the other rows. I.e. in your example, the subquery would return 2, 3, 4, 5 as "correct", which would result in deletion of row 1.
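To try this on the example data, here is a small sketch with Python's sqlite3 module. The table definition is an assumption reconstructed from the note in the question (id is the primary key, num is an INTEGER); the foreign key on user_id is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema; the foreign key on user_id is omitted for brevity.
conn.execute(
    "CREATE TABLE mydb (id INTEGER PRIMARY KEY, user_id INTEGER, num INTEGER)"
)
conn.executemany(
    "INSERT INTO mydb (id, user_id, num) VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 1), (3, 2, 1), (4, 1, 2), (5, 2, 2)],
)

# Keep only the highest id of every (user_id, num) combination.
conn.execute(
    """
    DELETE FROM mydb
    WHERE id NOT IN (
        SELECT MAX(id) FROM mydb
        GROUP BY user_id, num
    )
    """
)

kept_ids = [row[0] for row in conn.execute("SELECT id FROM mydb ORDER BY id")]
print(kept_ids)  # [2, 3, 4, 5]
```

Swapping MAX(id) for MIN(id) would keep the oldest row of each group instead.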

over - partition by in SQLite

I have a table TABLE in SQLite database with columns DATE, GROUP. I want to select the first 10 entries in each group. After researching similar topics here on stackoverflow, I came up with the following query, but it runs very slowly. Any ideas how to make it faster?
select * from TABLE as A
where (select count(*) from TABLE as B
where B.DATE < A.DATE and A.GROUP == B.GROUP) < 10
This is the result of EXPLAIN QUERY PLAN (TABLE = clients_bets):

Here are a few suggestions:
Use a covering index (an index containing all the data needed in the subquery, in this case the group and date):
create index some_index on some_table(some_group, some_date)
Additionally, rewrite the subquery to make it less dependent on the outer query:
select * from some_table as A
where rowid in (
select B.rowid
from some_table as B
where A.some_group == B.some_group
order by B.some_date limit 10 )
The query plan changes from:
0 0 0 SCAN TABLE some_table AS A
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1
1 0 0 SEARCH TABLE some_table AS B USING COVERING INDEX idx_1 (some_group=? AND some_date<?)
to:
0 0 0 SCAN TABLE some_table AS A
0 0 0 EXECUTE CORRELATED LIST SUBQUERY 1
1 0 0 SEARCH TABLE some_table AS B USING COVERING INDEX idx_1 (some_group=?)
While the plans are very similar, the rewritten query seems noticeably faster. I'm not sure why.
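To compare the two forms yourself, here is a rough sketch with Python's sqlite3 module. The table contents (three groups of 25 rows, one distinct date per row) and the payload column are made up for illustration. It only checks that both queries return the same rows and prints the plan of the rewritten one; it does not measure speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_table (some_group TEXT, some_date TEXT, payload TEXT)"
)

# Hypothetical data: 3 groups with 25 rows each, dates unique within a group.
rows = [
    (f"g{g}", f"2020-01-{day:02d}", f"row-{g}-{day}")
    for g in range(3)
    for day in range(1, 26)
]
conn.executemany("INSERT INTO some_table VALUES (?, ?, ?)", rows)

# Index on (group, date) so the subquery can be answered from the index.
conn.execute("CREATE INDEX some_index ON some_table (some_group, some_date)")

original = """
    SELECT some_group, some_date FROM some_table AS A
    WHERE (SELECT COUNT(*) FROM some_table AS B
           WHERE B.some_date < A.some_date
             AND A.some_group = B.some_group) < 10
    ORDER BY some_group, some_date
"""
rewritten = """
    SELECT some_group, some_date FROM some_table AS A
    WHERE rowid IN (SELECT B.rowid FROM some_table AS B
                    WHERE A.some_group = B.some_group
                    ORDER BY B.some_date LIMIT 10)
    ORDER BY some_group, some_date
"""
original_rows = conn.execute(original).fetchall()
rewritten_rows = conn.execute(rewritten).fetchall()
print(original_rows == rewritten_rows, len(rewritten_rows))

# The plan text differs between SQLite versions, so it is only printed here.
for step in conn.execute("EXPLAIN QUERY PLAN " + rewritten):
    print(step)
```

Note that the two forms can disagree when several rows in a group share the same date: the COUNT(*) version may return more than 10 rows for a group, while LIMIT 10 never does.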

Teradata update statement

The current update statement is
UPDATE Table1 t1
SET column1 = 1
WHERE not EXISTS
( SELECT 1 FROM Table2 t2
WHERE t2.column2= t1.column2
AND t2.column1 = 0
)
AND t1.column2 > 0
AND t1.column1 = 0
The above update statement is fine if Table2 has only the value 0 in column1. But I have a special scenario where Table2 has both 0 and 1 as values of column1. In this case there should be no update. I have to handle both situations in a single statement.
Scenario 1: Update only if column1 in Table2 has the value 0
Scenario 2: No update if column1 in Table2 has both values 0 and 1
Can you please help me with this?

Update in Teradata

I have below data in my table:
eff_dt end_dt type_cd status
1-Jan-14 5-Jan-14 AAA 0
5-Jan-14 7-Jan-14 null 1
7-Jan-14 10-Jan-14 null 1
10-Jan-14 15-Jan-14 BBB 0
15-Jan-14 21-Jan-14 null 1
21-Jan-14 25-Jan-14 null 1
25-Jan-14 30-Jan-14 CCC 0
I want to update the data using a self join.
After update table should look like:
eff_dt end_dt type_cd status
1-Jan-14 5-Jan-14 AAA 0
5-Jan-14 7-Jan-14 AAA 1
7-Jan-14 10-Jan-14 AAA 1
10-Jan-14 15-Jan-14 BBB 0
15-Jan-14 21-Jan-14 BBB 1
21-Jan-14 25-Jan-14 BBB 1
25-Jan-14 30-Jan-14 CCC 0
Please help me with the update query in Teradata.

eff_dt end_dt type_cd status
1-Jan-14 5-Jan-14 AAA 0
5-Jan-14 7-Jan-14 null 1
7-Jan-14 10-Jan-14 null 1
10-Jan-14 15-Jan-14 BBB 0
15-Jan-14 21-Jan-14 null 1
21-Jan-14 25-Jan-14 null 1
25-Jan-14 30-Jan-14 CCC 0
Given the above data, we can self join using status and dates.
The end_dt of one row is the eff_dt of the next row.
UPDATE A
FROM DB.TABLEA AS A, DB.TABLEA AS B
SET type_cd = B.type_cd
WHERE A.eff_dt = B.end_dt
and A.status = 1;
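Run the same update a second time to fill in the 3rd row: on the first pass its predecessor (the 2nd row) still had a null type_cd, so there was nothing to copy.
If the number of consecutive null rows is variable, you will have to modify the query.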

Something like the following should do the trick for you:
SELECT
t1.eff_dt, t1.end_dt, t2.type_cd, t1.status
FROM
yourtable t1
LEFT OUTER JOIN (SELECT * FROM yourtable WHERE status = 0) t2 ON t1.end_dt >= t2.end_dt
QUALIFY ROW_NUMBER() OVER (PARTITION BY t1.end_dt ORDER BY t2.end_dt DESC) = 1
This joins your table to a version of your table where only status=0 records are present, since those are the ones with a non-null type_cd. It joins on date, looking for any record that has a type_cd and whose end_dt is less than or equal to the current record's end_dt.
The QUALIFY clause at the end keeps, for each record, only the joined record with the highest end_dt. The downside is that the larger your table gets, the more records the join generates, so your intermediate result will grow substantially. Your results will be correct, but you'll be using up more and more spool space.
If you find the windowing function difficult to understand in this query, try running the query without the QUALIFY clause and with SELECT *. You can then work through the QUALIFY logic more easily.
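If you have no Teradata system at hand, the same join logic can be tried in SQLite through Python's sqlite3 module. This is only a sketch under some assumptions: SQLite has no QUALIFY, so the ROW_NUMBER() filter moves into an outer query; window functions need SQLite 3.25 or later; and the dates are stored as ISO text so that >= compares them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (eff_dt TEXT, end_dt TEXT, type_cd TEXT, status INTEGER)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?)",
    [
        ("2014-01-01", "2014-01-05", "AAA", 0),
        ("2014-01-05", "2014-01-07", None, 1),
        ("2014-01-07", "2014-01-10", None, 1),
        ("2014-01-10", "2014-01-15", "BBB", 0),
        ("2014-01-15", "2014-01-21", None, 1),
        ("2014-01-21", "2014-01-25", None, 1),
        ("2014-01-25", "2014-01-30", "CCC", 0),
    ],
)

# Emulate QUALIFY: number the joined rows per t1.end_dt, then keep rn = 1,
# i.e. the status=0 row with the highest end_dt not after the current end_dt.
filled = conn.execute(
    """
    SELECT eff_dt, end_dt, type_cd, status
    FROM (
        SELECT t1.eff_dt AS eff_dt, t1.end_dt AS end_dt,
               t2.type_cd AS type_cd, t1.status AS status,
               ROW_NUMBER() OVER (
                   PARTITION BY t1.end_dt ORDER BY t2.end_dt DESC
               ) AS rn
        FROM yourtable t1
        LEFT OUTER JOIN (SELECT * FROM yourtable WHERE status = 0) t2
            ON t1.end_dt >= t2.end_dt
    )
    WHERE rn = 1
    ORDER BY eff_dt
    """
).fetchall()

for row in filled:
    print(row)
```

Dropping the WHERE rn = 1 filter shows every candidate row the join produces, which is the intermediate result that grows with table size.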

create table sample_1
(
eff_dt date,
end_dt date,
type_cd varchar(4)
,status int
);
insert into sample_1(date '2014-01-01',date '2014-01-05','aaa',0);
insert into sample_1(date '2014-01-05',date '2014-01-07',null,1);
insert into sample_1(date '2014-01-07',date '2014-01-10',null,1);
insert into sample_1(date '2014-01-10',date '2014-01-15','bbb',0);
insert into sample_1(date '2014-01-15',date '2014-01-21',null,1);
insert into sample_1(date '2014-01-21',date '2014-01-25',null,1);
insert into sample_1(date '2014-01-25',date '2014-01-30','ccc',0);
upd tgt
from sample_1 tgt
, (
sel tgt.eff_dt,tgt.end_dt,lkp.type_cd,tgt.status
from sample_1 tgt,
(
sel tgt.*,max(eff_dt) over (order by eff_dt asc rows between 1 following and 1 following ) eff_dt1
from sample_1 tgt
where status=0 --type_cd is not null
) lkp
where tgt.eff_dt between lkp.eff_dt and coalesce (eff_dt1,date '9999-12-31')
and coalesce ( tgt.type_cd,lkp.type_cd) =lkp.type_cd
) lkp
set type_cd=lkp.type_cd
where tgt.eff_dt=lkp.eff_dt

Filling those NULLs is a simple task for LAST_VALUE:
UPDATE tgt
FROM mytable tgt
,(
SEL eff_dt,
Last_Value(type_cd IGNORE NULLS)
Over (ORDER BY eff_dt) AS last_cd
FROM mytable
QUALIFY type_cd IS NULL
) AS src
SET type_cd = src.last_cd
WHERE tgt.eff_dt= src.eff_dt
Assuming this was just an example and you must do this for a group of rows, you had better use MERGE; it will never be slower and might be faster:
MERGE INTO mytable AS tgt
USING
(
SEL eff_dt,
Last_Value(type_cd IGNORE NULLS)
Over (ORDER BY eff_dt) AS last_cd
FROM mytable
QUALIFY type_cd IS NULL
) AS src
-- ON must include at least all (P)PI columns of the target table
ON tgt.eff_dt = src.eff_dt
WHEN MATCHED THEN
UPDATE SET type_cd = src.last_cd
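For anyone unsure what Last_Value(type_cd IGNORE NULLS) Over (ORDER BY eff_dt) computes, here is a plain Python sketch of the same forward-fill idea. It is not Teradata code; the function name fill_last_value and the (eff_dt, type_cd) tuple layout are made up for illustration.

```python
from typing import List, Optional, Tuple

Row = Tuple[str, Optional[str]]  # (eff_dt as ISO text, type_cd or None)


def fill_last_value(rows: List[Row]) -> List[Row]:
    """Replace each None type_cd with the latest non-None value in eff_dt order."""
    filled = []
    last_cd = None
    for eff_dt, type_cd in sorted(rows, key=lambda r: r[0]):
        if type_cd is not None:
            last_cd = type_cd  # remember the most recent real value
        filled.append((eff_dt, last_cd))
    return filled


sample = [
    ("2014-01-01", "AAA"),
    ("2014-01-05", None),
    ("2014-01-07", None),
    ("2014-01-10", "BBB"),
    ("2014-01-15", None),
    ("2014-01-21", None),
    ("2014-01-25", "CCC"),
]
for row in fill_last_value(sample):
    print(row)
```

Unlike the self-join update, this carries a value across any number of consecutive NULL rows in one pass, which is why the LAST_VALUE approach does not need to be run repeatedly.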

Parsing data rows in plsql

This is quite clumsy.
Initial info: There's a clumsy select query that eventually returns 0 or 1 depending on several conditions. Usually it selects only one row (other data is cut off by the where clause, etc.). The problem occurs when there are more rows to parse. So the data actually looks as follows:
Status
0
1
instead of
Status
1
Problem: Only a single row is needed in return, i.e. if there's a 1 present in any row, 1 should be returned, otherwise 0.
Condition: It should be done only in a query (no variables, ifs etc.).
Thanks in advance.

If you are sure that 1 and 0 are the only values being returned, can't you use a max over this query to see whether there are any 1s?
select max(id) result
from (
select 1 id from dual
union all
select 0 id from dual
)
RESULT
----------
1
select max(id)
from (
select 0 id from dual
union all
select 0 id from dual
union all
select 0 id from dual
)
MAX(ID)
----------
0
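The same idea can be tried without an Oracle instance. Below is a small sketch using Python's sqlite3 module; SQLite has no dual table, so the inner SELECTs have no FROM clause. The helper name any_one is made up for illustration, and it assumes at least one status value is passed in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def any_one(statuses):
    """Return MAX(status) over the given 0/1 values, computed in a single query."""
    # Build "SELECT ? AS status UNION ALL SELECT ? AS status ..." with one
    # placeholder per value; this stands in for the original multi-row query.
    inner = " UNION ALL ".join("SELECT ? AS status" for _ in statuses)
    (result,) = conn.execute(
        f"SELECT MAX(status) FROM ({inner})", list(statuses)
    ).fetchone()
    return result


print(any_one([0, 1]))     # 1
print(any_one([0, 0, 0]))  # 0
```

MAX works here only because the values are limited to 0 and 1, as the answer assumes.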
