Delete duplicate rows in sqlite database - sqlite

I want to delete duplicate rows in a sqlite database. I found this code online to do it:
DELETE FROM table1
WHERE ROWID NOT IN (
SELECT min(rowid)
FROM table1
GROUP BY col1, col2
)
But the issue is that my database contains 50+ columns and I don't want to write all the column names in group by. Is there a workaround to group by all the columns?
I am new to this so I may be missing something simple.

In a GROUP BY clause you must enumerate all the columns.
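The column list does not have to be typed by hand, though. A minimal sketch, assuming Python's standard sqlite3 module and a small made-up three-column table1 (the real table would have 50+ columns), reads the column names from PRAGMA table_info and builds the GROUP BY clause from them:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the real 50+ column table.
    CREATE TABLE table1 (col1, col2, col3);
    INSERT INTO table1 VALUES (1, 'a', 'x'), (1, 'a', 'x'), (2, 'b', 'y');
""")

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(table1)")]

# Quote every identifier so unusual column names still form valid SQL.
group_by = ", ".join('"' + name.replace('"', '""') + '"' for name in columns)

# Same DELETE as in the question, with the GROUP BY list generated.
conn.execute(
    "DELETE FROM table1 WHERE rowid NOT IN "
    f"(SELECT min(rowid) FROM table1 GROUP BY {group_by})"
)
remaining = conn.execute("SELECT * FROM table1 ORDER BY col1").fetchall()
print(remaining)
```

This keeps the row with the lowest rowid from every group of identical rows.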
For your case, I think it's better to create another table table2 with all the distinct rows of table1, then delete all rows of table1 and finally insert all rows from table2 to table1:
CREATE TABLE table2 AS SELECT DISTINCT * FROM table1;
DELETE FROM table1;
INSERT INTO table1 SELECT * FROM table2;
DROP TABLE table2;
See a simplified demo.
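The four statements can be tried end to end with Python's standard sqlite3 module. This is only a sketch: the three-column table1 and its rows are invented for illustration, and the BEGIN/COMMIT wrapper is an addition so that an interruption cannot leave table1 empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data with two duplicated rows.
    CREATE TABLE table1 (col1, col2, col3);
    INSERT INTO table1 VALUES
        (1, 'a', 'x'), (1, 'a', 'x'), (2, 'b', 'y'), (2, 'b', 'y');

    BEGIN;
    CREATE TABLE table2 AS SELECT DISTINCT * FROM table1;
    DELETE FROM table1;
    INSERT INTO table1 SELECT * FROM table2;
    DROP TABLE table2;
    COMMIT;
""")
deduplicated = conn.execute("SELECT * FROM table1 ORDER BY col1").fetchall()
print(deduplicated)

# table2 was only a scratch table and is gone again.
leftover = conn.execute(
    "SELECT count(*) FROM sqlite_master WHERE name = 'table2'"
).fetchone()[0]
```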

Related

update (overwrite) an entire record (all columns) in sqlite

When I am copying rows from one table to another, there is a convenient syntax:
INSERT INTO table1 SELECT * FROM table2 WHERE <table 2 rows have some property>
But what if I want to overwrite entire existing rows in table1 with entire rows from table2? So I want something like:
UPDATE table1 SET * FROM table2 WHERE <table 1 and 2 rows match on some key field>
But from what I can tell, the only way to do this is to enumerate the columns being set one by one (set table1.columnA = table2.columnA, table1.columnB = table2.columnB, and so on). Is there some way to say "do it for all the columns" when using UPDATE like there is when using INSERT? If not, why not?
(I guess I could delete all rows from table1 with the given property, and then use the INSERT INTO table1 SELECT * syntax to bring in the replacement rows from table2. But that seems like it leaves a bunch of unwanted deleted rows in the database needing to be vacuumed at some point, as opposed to a clean UPDATE where there are no deleted records? Or maybe I'm not understanding the efficiency of a bunch of deletes followed by a bunch of inserts?)
There is no such syntax for exactly what you have in mind, and I think you will need to SET each column separately. Also, older SQLite versions do not support a direct update join syntax (UPDATE ... FROM was only added in version 3.33.0), but correlated subqueries work in every version. Note that the SET keyword appears only once and that the target columns are not qualified with the table name:
UPDATE table1
SET columnA = (SELECT columnA FROM table2 WHERE table1.col = table2.col),
    columnB = (SELECT columnB FROM table2 WHERE table1.col = table2.col),
    columnC = (SELECT columnC FROM table2 WHERE table1.col = table2.col);
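A runnable sketch of the correlated-subquery update, using Python's standard sqlite3 module. The two-column tables and their values are made up, and the WHERE EXISTS guard is an addition: without it, rows of table1 that have no match in table2 would have their columns set to NULL. The second half shows INSERT OR REPLACE as a different technique that needs no column list; it assumes the key column is a PRIMARY KEY (or UNIQUE) and it deletes and re-inserts rows rather than updating them in place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables; col is the matching key.
    CREATE TABLE table1 (col INTEGER PRIMARY KEY, columnA, columnB);
    CREATE TABLE table2 (col INTEGER PRIMARY KEY, columnA, columnB);
    INSERT INTO table1 VALUES (1, 'old-a1', 'old-b1'), (2, 'old-a2', 'old-b2');
    INSERT INTO table2 VALUES (1, 'new-a1', 'new-b1');
""")

# One SET keyword, unqualified target columns, one subquery per column.
# The EXISTS guard leaves rows without a match in table2 untouched.
conn.execute("""
    UPDATE table1
    SET columnA = (SELECT columnA FROM table2 WHERE table2.col = table1.col),
        columnB = (SELECT columnB FROM table2 WHERE table2.col = table1.col)
    WHERE EXISTS (SELECT 1 FROM table2 WHERE table2.col = table1.col)
""")
updated = conn.execute("SELECT * FROM table1 ORDER BY col").fetchall()
print(updated)

# Alternative without a column list: rows whose primary key already
# exists in table1 are replaced as a whole by the rows from table2.
conn.execute("INSERT INTO table2 VALUES (2, 'new-a2', 'new-b2')")
conn.execute("INSERT OR REPLACE INTO table1 SELECT * FROM table2")
replaced = conn.execute("SELECT * FROM table1 ORDER BY col").fetchall()
print(replaced)
```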

Can I save the output of a joined table into another table?

I created a database with two tables, x and y. I cross joined them and got an output. Now my question is: can I save that output to a new table, either in the same database or in a different one?
First you can try running the query SELECT * FROM table1 CROSS JOIN table2. Observe what columns are returned from the query, then create a new table new_table with columns of the appropriate type. Then you can try using INSERT INTO ... SELECT:
INSERT INTO new_table
SELECT * FROM table1 CROSS JOIN table2;
The CREATE TABLE statement directly accepts a query:
CREATE TABLE new_table AS
SELECT * FROM table1 CROSS JOIN table2;
(This will not keep the exact declared column types, and constraints such as PRIMARY KEY are not carried over.)
The output can be saved in a different database if that database has been ATTACHed.
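Both variants can be sketched with Python's standard sqlite3 module. The table layouts are invented for illustration, and the second database is an in-memory one attached under the hypothetical name other; with a real file you would pass its path to ATTACH instead. The column names of x and y are kept distinct on purpose, since a cross join of tables that share column names would otherwise produce duplicate names in the new table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A second database must be attached before tables can be created in it.
conn.execute("ATTACH DATABASE ':memory:' AS other")

conn.executescript("""
    -- Hypothetical source tables.
    CREATE TABLE x (x_id INTEGER, x_name TEXT);
    CREATE TABLE y (y_id INTEGER, y_name TEXT);
    INSERT INTO x VALUES (1, 'a'), (2, 'b');
    INSERT INTO y VALUES (10, 'p'), (20, 'q'), (30, 'r');

    -- Same database: the new table takes its columns from the query.
    CREATE TABLE main.new_table AS SELECT * FROM x CROSS JOIN y;

    -- Different database: prefix the table name with the attached schema.
    CREATE TABLE other.new_table AS SELECT * FROM x CROSS JOIN y;
""")

same_db_rows = conn.execute("SELECT count(*) FROM main.new_table").fetchone()[0]
other_db_rows = conn.execute("SELECT count(*) FROM other.new_table").fetchone()[0]
print(same_db_rows, other_db_rows)
```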

Return a column once time

I execute a query in my DB:
SELECT table1.*, table2.* FROM table1 JOIN table2 USING(id);
These 2 tables have a common column "id". What do I have to write in order to get the column 'id' only once in the results instead of twice?
I thought one solution would be to list the columns I want in the query. But what if I want to avoid this (as there are many)?
Will it work for you to name the specific columns you need from both tables? Something like:
SELECT table1.id, table2.other_column1, table2.other_column2 FROM table1 JOIN table2 USING(id);
By writing table1.* and table2.* you are selecting all fields from both tables, which is why id shows up twice.
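The question does not say which database is used. Assuming SQLite, there is also a way to avoid listing columns: a plain SELECT * over a join with USING returns the shared column only once. The sketch below, with made-up tables and a small helper named column_names that is not part of any library, compares the result columns of both spellings:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables sharing the column id.
    CREATE TABLE table1 (id INTEGER, a TEXT);
    CREATE TABLE table2 (id INTEGER, b TEXT);
    INSERT INTO table1 VALUES (1, 'left');
    INSERT INTO table2 VALUES (1, 'right');
""")

def column_names(sql):
    """Return the result column names of a query."""
    cursor = conn.execute(sql)
    return [description[0] for description in cursor.description]

# Expanding each table separately repeats the join column.
both = column_names("SELECT table1.*, table2.* FROM table1 JOIN table2 USING(id)")
# A bare * over a USING join lists the join column a single time.
once = column_names("SELECT * FROM table1 JOIN table2 USING(id)")
print(both)
print(once)
```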

Can we delete duplicate records from a table in teradata without using intermediate table

Can we delete duplicate records from a multiset table in Teradata without using an intermediate table?
Suppose we have 2 rows with values
1, 2, 3
and 1, 2, 3
in my multiset table; after the delete I should have
only one row, i.e. 1, 2, 3.
You can't unless the ROWID usage has been enabled on your system (and the probability is quite low). You can easily test it by trying to EXPLAIN a SELECT ROWID FROM table;
Otherwise there are two possible ways.
Low number of duplicates:
create a new table as result of SELECT all columns FROM table GROUP BY all columns HAVING COUNT(*) > 1;
DELETE FROM tab WHERE EXISTS (SELECT * FROM newtab WHERE...)
INSERT INTO tab SELECT * FROM newtab
High number of duplicates:
copy to a new table using SELECT DISTINCT * or copy to a SET TABLE to get rid of the duplicates and then re-INSERT back
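The three steps for a low number of duplicates can be sketched as follows. SQLite (through Python's standard sqlite3 module) is used here only as a runnable stand-in for Teradata, and the columns c1, c2, c3 are hypothetical; the WHERE condition of the DELETE simply compares every column. Rows containing NULLs would need extra handling, because = never matches NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table with one duplicated row.
    CREATE TABLE tab (c1, c2, c3);
    INSERT INTO tab VALUES (1, 2, 3), (1, 2, 3), (4, 5, 6);

    -- Step 1: one copy of every row that occurs more than once.
    CREATE TABLE newtab AS
        SELECT c1, c2, c3 FROM tab GROUP BY c1, c2, c3 HAVING COUNT(*) > 1;

    -- Step 2: remove all copies of those rows from the original table.
    DELETE FROM tab WHERE EXISTS (
        SELECT * FROM newtab
        WHERE newtab.c1 = tab.c1 AND newtab.c2 = tab.c2 AND newtab.c3 = tab.c3
    );

    -- Step 3: put a single copy of each row back.
    INSERT INTO tab SELECT * FROM newtab;
    DROP TABLE newtab;
""")
rows = conn.execute("SELECT * FROM tab ORDER BY c1").fetchall()
print(rows)
```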
Use the same approach, but create a volatile table in the middle.
CREATE VOLATILE MULTISET TABLE TEMPDB.TEMP_DUP_ID (
Row_ID DECIMAL(31,0)
) PRIMARY INDEX (Row_ID)
ON COMMIT PRESERVE ROWS;
INSERT INTO TEMPDB.TEMP_DUP_ID
SELECT ROW_ID
FROM DB.TABLE T
QUALIFY ROW_NUMBER() OVER (PARTITION BY DUP ORDER BY DUP DESC) > 1;
Then use the table to delete.
Ideally you will have a unique key per row; otherwise, you will need to manipulate the data a bit more to generate one (with row_number(), for instance). This is just a recommendation.
---Without creating intermediate table
delete FROM ORGINAL_TABLE WHERE (COL1, 2) in (select COL1, count(*) from ORGINAL_TABLE
GROUP BY 1
HAVING COUNT(*) >1 )
and DUPLICATE_BASED_COL >1; -------Delete one row (keep the other)
If you have duplicates and want to delete only one of the rows, keep the last line of the SQL; if you want to delete both rows, leave that condition out.
create table without dup
CREATE TABLE new AS (SELECT DISTINCT * FROM old) WITH DATA;
verify
select * from new;
drop the original one
drop table old;
rename the new table as original
RENAME TABLE new to old;
verify
select * from old;
SEL * FROM THE_TABLE_Containing_duplications
QUALIFY (ROW_number() over(partition by duplicated_column order by duplicated_column)=1) --keep only one occurrence (the first one)
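The effect of filtering on ROW_NUMBER() = 1 can be tried without a Teradata system. The sketch below uses SQLite through Python's standard sqlite3 module as a stand-in, which assumes SQLite 3.25 or newer for window functions. SQLite has no QUALIFY clause, so the row number is computed in a subquery and filtered in the outer WHERE. The table the_table and the column payload are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: key k1 occurs twice.
    CREATE TABLE the_table (duplicated_column, payload);
    INSERT INTO the_table VALUES ('k1', 'first'), ('k1', 'second'), ('k2', 'only');
""")

kept = conn.execute("""
    SELECT duplicated_column, payload
    FROM (
        SELECT duplicated_column, payload,
               ROW_NUMBER() OVER (
                   PARTITION BY duplicated_column ORDER BY duplicated_column
               ) AS rn
        FROM the_table
    )
    WHERE rn = 1
    ORDER BY duplicated_column
""").fetchall()

# Exactly one row per key survives. Which of the tied rows is kept is not
# guaranteed, because the ORDER BY inside the window does not break ties.
kept_keys = [row[0] for row in kept]
print(kept_keys)
```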

SQLite Delete from New Table data that exists in old table

I get daily update files of progress on a project that have no date field. I have added a date column to the first table and imported the 1st and 2nd tables into SQLite. Now I want to delete from the 2nd table the data that already exists in the 1st table, so that I can add a new date to the 2nd table and then insert the 2nd table into the 1st table. The problem I am having is with the delete-from-2nd-table part. Column4 data intersects in both tables.
I tried the following SQL statement, but it deletes all the rows from Table2:
DELETE from Table2 WHERE exists (SELECT Col4 from Table1 INTERSECT SELECT Col4 from Table2);
Any help is appreciated.
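EXISTS only checks whether your subquery returns any rows. The subquery does not refer to the row being deleted, so as soon as the two tables share a single Col4 value the condition is true for every row of Table2 and everything is deleted. To delete only the rows that already exist in Table1, compare each row's Col4 against the subquery with IN instead:
DELETE from Table2 WHERE Col4 in (SELECT Col4 from Table1 INTERSECT SELECT Col4 from Table2);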
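To see the difference between the two conditions, here is a small sketch with Python's standard sqlite3 module and invented one-column tables. It assumes the goal stated in the question, that is, removing from Table2 the rows whose Col4 already exists in Table1. The helper fresh_connection is hypothetical and only rebuilds the sample data for each attempt. The IN variant below uses a plain subquery on Table1, since the INTERSECT with Table2 does not change which rows match.

```python
import sqlite3

def fresh_connection():
    """Build the same sample data for every attempt."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Table1 (Col4);
        CREATE TABLE Table2 (Col4);
        INSERT INTO Table1 VALUES ('a'), ('b');
        INSERT INTO Table2 VALUES ('b'), ('c');
    """)
    return conn

# The statement from the question: the uncorrelated EXISTS is true for
# every row because the intersection ('b') is not empty.
conn = fresh_connection()
conn.execute(
    "DELETE FROM Table2 WHERE EXISTS "
    "(SELECT Col4 FROM Table1 INTERSECT SELECT Col4 FROM Table2)"
)
after_exists = conn.execute("SELECT Col4 FROM Table2").fetchall()

# IN compares each row's Col4, so only the overlapping row is deleted.
conn = fresh_connection()
conn.execute("DELETE FROM Table2 WHERE Col4 IN (SELECT Col4 FROM Table1)")
after_in = conn.execute("SELECT Col4 FROM Table2").fetchall()

print(after_exists)
print(after_in)
```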
