I have an SQLite3 database from which I want to remove rows that duplicate another row in two fields.
It seems that I am able to select such rows with this query:
SELECT * FROM mydb GROUP BY user_id, num HAVING COUNT(*) > 1
However, I am not able to delete them.
DELETE FROM mydb WHERE user_id IN (SELECT * FROM mydb GROUP BY user_id, num HAVING COUNT(*) > 1)
returns a syntax error.
This is what I expect:
Example:
id user_id num
1 1 1
2 1 1
3 2 1
4 1 2
5 2 2
In this example, ids 1 and 2 have the same values in both columns (user_id and num), so they should be removed. Preferably, but not necessarily, I would like a solution that leaves one such row (it doesn't matter which one).
Result:
id user_id num
2 1 1
3 2 1
4 1 2
5 2 2
Note: id is a primary key. user_id is a foreign key. num is an INTEGER.
You got an error because the IN operator has a single value on its left side (user_id), but the subquery on its right side (SELECT *) returns rows with several columns. Compare like with like: WHERE user_id IN (SELECT user_id ...) avoids the error.
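To see the difference concretely, here is a small sketch using Python's built-in sqlite3 module and the example rows from the question. The helper name `try_delete` is made up for illustration. It assumes SQLite rejects the multi-column subquery with an error, and it also shows that the single-column form, while valid, deletes every row of the affected user rather than only the duplicates.

```python
import sqlite3

# Example rows from the question: (id, user_id, num)
ROWS = [(1, 1, 1), (2, 1, 1), (3, 2, 1), (4, 1, 2), (5, 2, 2)]

def try_delete(subquery_columns):
    """Run the DELETE with the given subquery column list.

    Returns the remaining ids on success, or the error text on failure.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mydb (id INTEGER PRIMARY KEY, user_id INTEGER, num INTEGER)"
    )
    conn.executemany("INSERT INTO mydb VALUES (?, ?, ?)", ROWS)
    sql = (
        "DELETE FROM mydb WHERE user_id IN ("
        f"SELECT {subquery_columns} FROM mydb "
        "GROUP BY user_id, num HAVING COUNT(*) > 1)"
    )
    try:
        conn.execute(sql)
        return [row[0] for row in conn.execute("SELECT id FROM mydb ORDER BY id")]
    except sqlite3.Error as exc:
        return f"error: {exc}"
    finally:
        conn.close()

print(try_delete("*"))        # the multi-column subquery is rejected
print(try_delete("user_id"))  # valid, but removes all rows of user 1
```

So fixing the column list alone makes the statement run, but it is still not the delete you want.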
Anyway, here's a query to delete all-but-newest:
DELETE FROM mydb
WHERE id NOT IN (
SELECT MAX(id) FROM mydb
GROUP BY user_id, num
);
The subquery will return the highest id for every unique (user_id, num) combination. Then we just delete all the other rows. I.e. in your example, the subquery would return 2, 3, 4, 5 as "correct", which would result in deletion of row 1.
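If you want to try this out, here is a minimal sketch with Python's built-in sqlite3 module. It loads the example rows from the question into an in-memory table and runs the DELETE above. The function name `dedupe_keep_newest` is only illustrative.

```python
import sqlite3

def dedupe_keep_newest(rows):
    """Load (id, user_id, num) rows, delete all but the highest id per
    (user_id, num) pair, and return the remaining rows ordered by id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mydb (id INTEGER PRIMARY KEY, user_id INTEGER, num INTEGER)"
    )
    conn.executemany("INSERT INTO mydb VALUES (?, ?, ?)", rows)
    conn.execute(
        "DELETE FROM mydb WHERE id NOT IN ("
        " SELECT MAX(id) FROM mydb GROUP BY user_id, num)"
    )
    remaining = conn.execute(
        "SELECT id, user_id, num FROM mydb ORDER BY id"
    ).fetchall()
    conn.close()
    return remaining

sample = [(1, 1, 1), (2, 1, 1), (3, 2, 1), (4, 1, 2), (5, 2, 2)]
print(dedupe_keep_newest(sample))
# [(2, 1, 1), (3, 2, 1), (4, 1, 2), (5, 2, 2)]
```

Swapping MAX(id) for MIN(id) would keep the oldest row of each pair instead.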
I have a table TABLE in an SQLite database with columns DATE and GROUP. I want to select the first 10 entries in each group. After researching similar topics here on Stack Overflow, I came up with the following query, but it runs very slowly. Any ideas how to make it faster?
select * from TABLE as A
where (select count(*) from TABLE as B
where B.DATE < A.DATE and A.GROUP == B.GROUP) < 10
This is the result of EXPLAIN QUERY PLAN (TABLE = clients_bets):
Here are a few suggestions:
Use a covering index (an index containing all the data needed in the subquery, in this case the group and the date):
create index some_index on some_table(some_group, some_date)
Additionally, rewrite the subquery to make it less dependent on the outer query:
select * from some_table as A
where rowid in (
select B.rowid
from some_table as B
where A.some_group == B.some_group
order by B.some_date limit 10 )
The query plan changes from:
0 0 0 SCAN TABLE some_table AS A
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1
1 0 0 SEARCH TABLE some_table AS B USING COVERING INDEX idx_1 (some_group=? AND some_date<?)
to:
0 0 0 SCAN TABLE some_table AS A
0 0 0 EXECUTE CORRELATED LIST SUBQUERY 1
1 0 0 SEARCH TABLE some_table AS B USING COVERING INDEX idx_1 (some_group=?)
While the two plans are very similar, the rewritten query seems noticeably faster. I'm not sure why.
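For anyone who wants to experiment, here is a small sketch of the index plus the rewritten query, run through Python's built-in sqlite3 module. The table and column names follow the placeholders used above. The function name `first_n_per_group` and the choice to store dates as ISO text are assumptions made for the example.

```python
import sqlite3

def first_n_per_group(rows, n):
    """Return the n earliest (some_group, some_date) rows of each group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE some_table (some_group TEXT, some_date TEXT)")
    conn.executemany("INSERT INTO some_table VALUES (?, ?)", rows)
    # Covering index: the subquery only needs the group and the date.
    conn.execute("CREATE INDEX some_index ON some_table (some_group, some_date)")
    result = conn.execute(
        """
        SELECT A.some_group, A.some_date
        FROM some_table AS A
        WHERE A.rowid IN (
            SELECT B.rowid
            FROM some_table AS B
            WHERE B.some_group = A.some_group
            ORDER BY B.some_date
            LIMIT ?
        )
        ORDER BY A.some_group, A.some_date
        """,
        (n,),
    ).fetchall()
    conn.close()
    return result

rows = [
    ("a", "2020-01-03"), ("a", "2020-01-01"), ("a", "2020-01-02"),
    ("b", "2020-02-01"), ("b", "2020-02-02"), ("b", "2020-02-03"),
]
print(first_n_per_group(rows, 2))
```

Prefixing the SELECT with EXPLAIN QUERY PLAN in the same way lets you compare plans on your own data.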
The current update statement is
UPDATE Table1 t1
SET column1 = 1
WHERE not EXISTS
( SELECT 1 FROM Table2 t2
WHERE t2.column2= t1.column2
AND t2.column1 = 0
)
AND t1.column2 > 0
AND t1.column1 = 0
The above update statement is fine when column1 in Table2 holds only the value 0. But I have a special scenario where column1 in Table2 holds both 0 and 1. In that case there should be no update. I have to handle both situations in a single statement.
Scenario 1: Update only if column1 in Table2 has the value 0
Scenario 2: No update if column1 in Table2 has both values 0 and 1
Can you please help me with this?
I have below data in my table:
eff_dt end_dt type_cd status
1-Jan-14 5-Jan-14 AAA 0
5-Jan-14 7-Jan-14 null 1
7-Jan-14 10-Jan-14 null 1
10-Jan-14 15-Jan-14 BBB 0
15-Jan-14 21-Jan-14 null 1
21-Jan-14 25-Jan-14 null 1
25-Jan-14 30-Jan-14 CCC 0
I want to update the data using a self join.
After the update, the table should look like:
eff_dt end_dt type_cd status
1-Jan-14 5-Jan-14 AAA 0
5-Jan-14 7-Jan-14 AAA 1
7-Jan-14 10-Jan-14 AAA 1
10-Jan-14 15-Jan-14 BBB 0
15-Jan-14 21-Jan-14 BBB 1
21-Jan-14 25-Jan-14 BBB 1
25-Jan-14 30-Jan-14 CCC 0
Please help me with the update query in Teradata.
eff_dt end_dt type_cd status
1-Jan-14 5-Jan-14 AAA 0
5-Jan-14 7-Jan-14 null 1
7-Jan-14 10-Jan-14 null 1
10-Jan-14 15-Jan-14 BBB 0
15-Jan-14 21-Jan-14 null 1
21-Jan-14 25-Jan-14 null 1
25-Jan-14 30-Jan-14 CCC 0
Given the above data, we can self join using status and dates.
The end_dt of the first row is basically the eff_dt of the 2nd row.
UPDATE A
FROM DB.TABLEA AS A, DB.TABLEA AS B
SET type_cd = B.type_cd
WHERE A.eff_dt = B.end_dt
and A.status = 1;
Run the same update again to fill in type_cd for the 3rd row.
If the number of consecutive NULL rows varies, you will have to modify the query.
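As a rough illustration of the "repeat until done" idea, here is a sketch that uses SQLite through Python's sqlite3 module rather than Teradata. The UPDATE is a correlated-subquery variant, not the Teradata syntax above, and the extra NULL checks are my addition so that the loop terminates. The table name `tablea` and the function name are placeholders.

```python
import sqlite3

ROWS = [
    ("2014-01-01", "2014-01-05", "AAA", 0),
    ("2014-01-05", "2014-01-07", None, 1),
    ("2014-01-07", "2014-01-10", None, 1),
    ("2014-01-10", "2014-01-15", "BBB", 0),
    ("2014-01-15", "2014-01-21", None, 1),
    ("2014-01-21", "2014-01-25", None, 1),
    ("2014-01-25", "2014-01-30", "CCC", 0),
]

def fill_by_repeated_update(rows):
    """Copy type_cd from the preceding row until no row changes any more."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tablea (eff_dt TEXT, end_dt TEXT, type_cd TEXT, status INTEGER)"
    )
    conn.executemany("INSERT INTO tablea VALUES (?, ?, ?, ?)", rows)
    while True:
        cur = conn.execute(
            """
            UPDATE tablea
            SET type_cd = (SELECT b.type_cd FROM tablea AS b
                           WHERE b.end_dt = tablea.eff_dt)
            WHERE status = 1
              AND type_cd IS NULL
              AND EXISTS (SELECT 1 FROM tablea AS b
                          WHERE b.end_dt = tablea.eff_dt
                            AND b.type_cd IS NOT NULL)
            """
        )
        if cur.rowcount == 0:  # nothing left to fill
            break
    result = conn.execute(
        "SELECT eff_dt, end_dt, type_cd, status FROM tablea ORDER BY eff_dt"
    ).fetchall()
    conn.close()
    return result

for row in fill_by_repeated_update(ROWS):
    print(row)
```

Each pass fills at least one more row of every NULL run, so the loop ends however long the runs are.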
Something like the following should do the trick for you:
SELECT
t1.eff_dt, t1.end_dt, t2.type_cd, t1.status
FROM
yourtable t1
LEFT OUTER JOIN (SELECT * FROM yourtable WHERE status = 0) t2 ON t1.end_dt >= t2.end_dt
QUALIFY ROW_NUMBER() OVER (PARTITION BY t1.end_dt ORDER BY t2.end_dt DESC) = 1
This joins your table to a version of itself where only status=0 records are present, since those are the ones with a non-null type_cd. It joins on date, looking for any record that has a type_cd and whose end_dt is less than or equal to the current record's end_dt.
The QUALIFY clause at the end then keeps, for each record, only the match with the highest end_dt. The downside here is that the larger your table gets, the more records you generate in the join, so your intermediate result will grow substantially. Your results will be correct, but you'll be using up more and more spool space.
If you find the windowing function difficult to understand in this query, try running the query without it and SELECT *. You can work through the QUALIFY logic a little easier then.
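Here is a sketch of the same join logic that you can run locally with Python's sqlite3 module. SQLite has no QUALIFY, so the ROW_NUMBER() filter is moved into an outer query. This assumes an SQLite version with window function support (3.25 or later), and the dates are stored as ISO text so that they compare correctly. The function name `fill_with_join` is made up.

```python
import sqlite3

ROWS = [
    ("2014-01-01", "2014-01-05", "AAA", 0),
    ("2014-01-05", "2014-01-07", None, 1),
    ("2014-01-07", "2014-01-10", None, 1),
    ("2014-01-10", "2014-01-15", "BBB", 0),
    ("2014-01-15", "2014-01-21", None, 1),
    ("2014-01-21", "2014-01-25", None, 1),
    ("2014-01-25", "2014-01-30", "CCC", 0),
]

def fill_with_join(rows):
    """Pair every row with the latest status=0 row ending on or before it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourtable (eff_dt TEXT, end_dt TEXT, type_cd TEXT, status INTEGER)"
    )
    conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT eff_dt, end_dt, type_cd, status
        FROM (
            SELECT t1.eff_dt AS eff_dt, t1.end_dt AS end_dt,
                   t2.type_cd AS type_cd, t1.status AS status,
                   ROW_NUMBER() OVER (
                       PARTITION BY t1.end_dt ORDER BY t2.end_dt DESC
                   ) AS rn
            FROM yourtable AS t1
            LEFT OUTER JOIN (SELECT * FROM yourtable WHERE status = 0) AS t2
                ON t1.end_dt >= t2.end_dt
        )
        WHERE rn = 1  -- stands in for Teradata's QUALIFY ... = 1
        ORDER BY eff_dt
        """
    ).fetchall()
    conn.close()
    return result

for row in fill_with_join(ROWS):
    print(row)
```

Dropping the `WHERE rn = 1` line shows every candidate match per row, which is the intermediate result that grows with the table.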
create table sample_1
(
eff_dt date,
end_dt date,
type_cd varchar(4)
,status int
);
insert into sample_1(date '2014-01-01',date '2014-01-05','aaa',0);
insert into sample_1(date '2014-01-05',date '2014-01-07',null,1);
insert into sample_1(date '2014-01-07',date '2014-01-10',null,1);
insert into sample_1(date '2014-01-10',date '2014-01-15','bbb',0);
insert into sample_1(date '2014-01-15',date '2014-01-21',null,1);
insert into sample_1(date '2014-01-21',date '2014-01-25',null,1);
insert into sample_1(date '2014-01-25',date '2014-01-30','ccc',0);
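-- Fill each NULL type_cd from the status=0 row whose date range covers it.
upd tgt
from sample_1 tgt
, (
sel tgt.eff_dt,tgt.end_dt,lkp.type_cd,tgt.status
from sample_1 tgt,
(
-- eff_dt1 = eff_dt of the next status=0 row (NULL for the last one)
sel tgt.*,max(eff_dt) over (order by eff_dt asc rows between 1 following and 1 following ) eff_dt1
from sample_1 tgt
where status=0 --type_cd is not null
) lkp
where tgt.eff_dt between lkp.eff_dt and coalesce (eff_dt1,date '9999-12-31')
-- BETWEEN is inclusive, so drop the match against a different, already-set type_cd
and coalesce ( tgt.type_cd,lkp.type_cd) =lkp.type_cd
) lkp
set type_cd=lkp.type_cd
where tgt.eff_dt=lkp.eff_dt;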
Filling those NULLs is a simple task for LAST_VALUE:
UPDATE tgt
FROM mytable tgt
,(
SEL eff_dt,
Last_Value(type_cd IGNORE NULLS)
Over (ORDER BY eff_dt) AS last_cd
FROM mytable
QUALIFY type_cd IS NULL
) AS src
SET type_cd = src.last_cd
WHERE tgt.eff_dt= src.eff_dt
Assuming this was just an example and you actually need to do this for a larger set of rows, you had better use MERGE. It will never be slower, and it might be faster:
MERGE INTO mytable AS tgt
USING
(
SEL eff_dt,
Last_Value(type_cd IGNORE NULLS)
Over (ORDER BY eff_dt) AS last_cd
FROM mytable
QUALIFY type_cd IS NULL
) AS src
-- ON must include at least all (P)PI columns of the target table
ON tgt.eff_dt = src.eff_dt
WHEN MATCHED THEN
UPDATE SET type_cd = src.last_cd
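In case the IGNORE NULLS part is unclear: ordered by eff_dt, each row gets the most recent non-null type_cd seen so far. A plain-Python sketch of that carry-forward logic (no database involved, function name made up, dates as ISO strings so they sort correctly) could look like this:

```python
def fill_forward(rows):
    """rows: iterable of (eff_dt, type_cd) with type_cd possibly None.

    Returns (eff_dt, last_cd) pairs in eff_dt order, where last_cd is the
    latest non-None type_cd at or before that row.
    """
    last_cd = None
    filled = []
    for eff_dt, type_cd in sorted(rows, key=lambda r: r[0]):
        if type_cd is not None:
            last_cd = type_cd  # remember the newest non-null value
        filled.append((eff_dt, last_cd))
    return filled

data = [
    ("2014-01-01", "AAA"), ("2014-01-05", None), ("2014-01-07", None),
    ("2014-01-10", "BBB"), ("2014-01-15", None), ("2014-01-21", None),
    ("2014-01-25", "CCC"),
]
for row in fill_forward(data):
    print(row)
```

The SQL versions above only keep the rows where type_cd IS NULL (via QUALIFY) because those are the only ones that need updating.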
This is quite clumsy.
Initial info: There's a clumsy select query that eventually returns 0 or 1 depending on several conditions. Usually it selects only one row (other data is cut off by where, etc.). The problem occurs when there are more rows to parse. Then the data actually looks as follows:
Status
0
1
instead of
Status
1
Problem: Only a single row is needed in return, i.e. if there's a 1 present in any row, 1 should be returned, otherwise 0.
Condition: It should be done in a query only (no variables, ifs, etc.).
Thanks in advance.
If you are sure that 1 and 0 are the only values being returned, can't you use a max over this query to see whether there are any 1s?
select max(id) result
from (
select 1 id from dual
union all
select 0 id from dual
)
RESULT
----------
1
select max(id)
from (
select 0 id from dual
union all
select 0 id from dual
union all
select 0 id from dual
)
MAX(ID)
----------
0
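The same idea can be checked outside Oracle too. Below is a small sketch using SQLite via Python's sqlite3 module. The table name `results` and the function name are placeholders standing in for your clumsy query.

```python
import sqlite3

def collapse_status(statuses):
    """Return MAX(status) over the given 0/1 values: 1 if any row is 1, else 0.

    For an empty input MAX() yields NULL, which comes back as None.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE results (status INTEGER)")
    conn.executemany("INSERT INTO results VALUES (?)", [(s,) for s in statuses])
    (result,) = conn.execute("SELECT MAX(status) FROM results").fetchone()
    conn.close()
    return result

print(collapse_status([0, 1]))     # 1
print(collapse_status([0, 0, 0]))  # 0
```

If the inner query can return no rows at all and you still need a 0, wrapping the aggregate as COALESCE(MAX(status), 0) is one way to handle that.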