Deleting duplicate records based on two columns concatenation values


I have an employee table with 30,000 records. I need to delete duplicate records based on the concatenation of two columns, for example name and job:
martin clerk
martin clerk
Below is my code:
declare
type typ_emp is table of emp%rowtype;
v_emp typ_emp;
cursor cur_emp
is
select *
from emp a
where rowid >
(select min (rowid)
from emp b
where concat (concat (b.ename, '-'), b.job) =
concat (concat (a.ename, '-'), a.job)
)
;
begin
open cur_emp;
loop
fetch cur_emp bulk collect into v_emp;
exit when v_emp.count = 0;
if v_emp.count > 0
then
for i in v_emp.first .. v_emp.last
loop
insert into backup_emp (ename, job)
values (v_emp (i).ename, v_emp (i).job)
;
end loop;
end if;
end loop;
close cur_emp;
delete
from emp s
where s.rowid >
any (select t.rowid
from emp t
where concat (concat (t.ename, '-'), t.job) =
concat (concat (s.ename, '-'), s.job));
commit;
exception
when others then
Raise;
end;
It is taking a long time to delete the records. Can anyone help me tune the query, or suggest a better approach?
Thanks in advance.

Creating a function-based index might improve performance:
CREATE INDEX concatindex ON emp (ename||'-'||job);
The delete statement would then look like this:
delete emp a where a.rowid > (select min(rowid) from emp b where b.ename||'-'||b.job=a.ename||'-'||a.job);
That is enough unless you also need to insert the deleted rows into a backup table, which is not clear from your question. If you do, I would bulk collect the rows into a collection instead. Leave a comment if you would like that option described in detail.
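For the backup case, a minimal sketch could capture the deleted values with RETURNING ... BULK COLLECT and copy them to backup_emp with FORALL. It assumes that grouping by ename and job identifies the same duplicates as comparing ename||'-'||job (which holds unless the values themselves contain '-' or are null), and that roughly 30,000 rows fit comfortably in session memory. The type and variable names are made up for the illustration.

```sql
declare
  -- Hypothetical collection types, one per backed-up column
  type t_ename is table of emp.ename%type;
  type t_job   is table of emp.job%type;
  v_enames t_ename;
  v_jobs   t_job;
begin
  -- Delete every duplicate except the row with the lowest rowid in its
  -- (ename, job) group, capturing the deleted values in the same statement.
  delete from emp
   where rowid not in (select min(rowid) from emp group by ename, job)
  returning ename, job bulk collect into v_enames, v_jobs;

  -- Copy the captured rows to the backup table in one bulk operation.
  forall i in 1 .. v_enames.count
    insert into backup_emp (ename, job)
    values (v_enames(i), v_jobs(i));

  commit;
end;
/
```

This touches emp once for the delete instead of once for the backup cursor and again for the delete, and it avoids the row-by-row insert loop.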

I hope this helps.
SELECT ROWID, ename || '-' || job AS concatenation,
decode(rank() over(PARTITION BY ename || '-' || job ORDER BY ROWID), 1, 'keep', 'delete') AS to_do
FROM emp
ORDER BY ename || '-' || job, ROWID;

Here are my code changes:
cursor cur_emp
is
select *
from
(select b.*
,row_number()over(partition by concat (concat (b.ename, '-'), b.job) order by ename)cnt
from emp b
) where cnt>1;

Related

PLSQL STORED PROCEDURE does not give result

select count(*)
INTO countExceed
from uid_emp_master k
where k.unique_id in (select k.reviewer_uid
from uid_rm_hierarchy k
where k.unique_id in ('||p_ID_list||'))
and k.band IN( 'A','B','C','D');
if (countExceed > 0) then
quer :='UPDATE UID_RM_HIERARCHY I
SET I.REVIEWER_UID in (SELECT L.REVIEWER_UID
FROM UID_RM_HIERARCHY L
WHERE L.UNIQUE_ID in ('||p_ID_list||') )
WHERE I.REVIEWER_UID in('||p_ID_list||')
and isdeleted=0';
EXECUTE IMMEDIATE quer ;
END IF;
The above stored procedure does not show any result. The variable countExceed is declared as a number. Please help me correct the query.

The issue is in
where k.unique_id in ('||p_ID_list||'))
Here you are telling Oracle to look for records
where unique_id = '||p_ID_list||'
exactly as it is typed, whereas you need the variable to be handled as a list of values.
Say you have a table like this
create table tabTest(id) as (
select 'id1' from dual union all
select 'id2' from dual union all
select 'id3' from dual union all
select 'id4' from dual
)
and your input string is 'id1,id3,1d8';
I see two ways to do what you need; one is with dynamic SQL, for example:
declare
vResult number;
vList varchar2(199) := 'id1,id3,1d8';
vSQL varchar2(100);
begin
vSQL :=
'select count(*)
from tabTest
where id in (''' || replace (vList, ',', ''', ''') || ''')';
--
execute immediate vSQL into vResult;
--
dbms_output.put_line('Result: ' || vResult);
end;
Another way could be by splitting the string into a list of values and then simply using the resulting list in the IN.
For that, there are many answers about how to split a comma separated list string in Oracle.
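As a sketch of that second approach, the commonly used REGEXP_SUBSTR with CONNECT BY LEVEL idiom (Oracle 10g and later) turns the string into rows that can feed the IN directly, with no dynamic SQL. It reuses tabTest and the sample list from above and assumes the list has no empty elements and no spaces around the commas.

```sql
declare
  vResult number;
  vList   varchar2(199) := 'id1,id3,1d8';
begin
  select count(*)
    into vResult
    from tabTest
   where id in (-- one row per comma-separated element of vList
                select regexp_substr(vList, '[^,]+', 1, level)
                  from dual
               connect by regexp_substr(vList, '[^,]+', 1, level) is not null);
  --
  dbms_output.put_line('Result: ' || vResult);
end;
/
```

With the tabTest rows above, only id1 and id3 match, so the count should be 2. Because vList is used as a bind value rather than concatenated into the statement, this version also avoids quoting problems in the input.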

Pl/sql dbms output

I'm very new to pl/sql and I cannot make this query run.
I want it to find differences between two tables and then output ID of those transactions.
Any help would be appreciated!
SET SERVEROUTPUT ON
DECLARE
diff_id varchar2(50);
diff_id2 varchar2(50);
BEGIN
FOR dcount IN
SELECT
O.transid ,
ABB.transid
into diff_id, diff_id2
FROM
(SELECT *
FROM O.transactions
AND abdate >= trunc(sysdate -3)
) O
FULL OUTER JOIN
(SELECT *
FROM ABB.transactions
AND abdate >= trunc(sysdate -3)
) ABB
ON O.transid = ABB.transid
LOOP
DBMS_OUTPUT.put_line (employee_rec.diff_id);
DBMS_OUTPUT.put_line (employee_rec.diff_id2);
END LOOP;
END;

My desired output would be the IDs of the transactions that are not in both tables, i.e. 375 and 480.
Ah, yes, 375 and 480. What about 832?
Anyway: you don't need PL/SQL to do that. Would set operators do any good? For example, if you want to fetch IDs from the first table that aren't contained in the second one, you'd use
select id from first_table
minus
select id from second_table;
Both ways?
select 'in 1st, not in 2nd' what, id
from (select id from first_table
minus
select id from second_table)
union all
select 'in 2nd, not in 1st', id
from (select id from second_table
minus
select id from first_table);
Apply additional conditions, if necessary (ABDATE column, for example).
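Applied to the tables in the question, that might look like the sketch below. It assumes O.transactions and ABB.transactions both have the transid and abdate columns used in the question, and that the three-day window should be applied to both sides.

```sql
select 'in O, not in ABB' what, transid
  from (select transid from O.transactions
         where abdate >= trunc(sysdate - 3)
        minus
        select transid from ABB.transactions
         where abdate >= trunc(sysdate - 3))
union all
select 'in ABB, not in O', transid
  from (select transid from ABB.transactions
         where abdate >= trunc(sysdate - 3)
        minus
        select transid from O.transactions
         where abdate >= trunc(sysdate - 3));
```

Note that a transaction whose abdate falls inside the window in one table but outside it in the other will be reported as missing, so drop the filter on the second branch of each MINUS if that is not what you want.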

How to retrieve data from multiple columns from bulk collect

DECLARE
TYPE two_cols IS RECORD
(
family_id family_members.family_id %TYPE,
city family_members.city%TYPE
);
TYPE family_members_t IS TABLE OF two_cols;
l_family_members family_members_t;
BEGIN
SELECT family_id,city
BULK COLLECT INTO l_family_members
FROM (SELECT x.family_id, x.City, x.Member_count,row_number()
OVER (PARTITION BY x.family_id ORDER BY x.Member_count DESC) rn
FROM (SELECT family_id, City, COUNT(*) Member_count
FROM FAMILY_MEMBERS
GROUP BY family_id, City) x) y
WHERE y.rn = 1;
for rec in 1..l_family_members.count
loop
dbms_output.put_line('majority mem of family id'
|| l_family_members.family_id(rec)
|| 'stay in '||l_family_members.city(rec));
end loop;
END;
Error:
ORA-06550: line 23, column 69: PLS-00302: component 'FAMILY_ID' must
be declared ORA-06550: line 23, column 1: PL/SQL: Statement ignored
06550. 00000 - "line %s, column %s:\n%s"
*Cause: Usually a PL/SQL compilation error.
*Action:
I am confused by the output line. I don't understand how to retrieve the data from the bulk collect, since there are two columns in it. How do I distinguish them and retrieve them?

Bulk collecting two columns into a table of records is fine in any reasonably recent Oracle version, so the SELECT can stay as it is. The PLS-00302 error points at line 23, which is the output line: the index has to go on the collection, before the field name.
DECLARE
  TYPE two_cols IS RECORD
  (
    family_id family_members.family_id%TYPE,
    city      family_members.city%TYPE
  );
  TYPE family_members_t IS TABLE OF two_cols;
  l_family_members family_members_t;
BEGIN
  -- One record per family: the city with the highest member count
  SELECT family_id, city
    BULK COLLECT INTO l_family_members
    FROM (SELECT x.family_id, x.City, x.Member_count, row_number()
                 OVER (PARTITION BY x.family_id ORDER BY x.Member_count DESC) rn
            FROM (SELECT family_id, City, COUNT(*) Member_count
                    FROM FAMILY_MEMBERS
                   GROUP BY family_id, City) x) y
   WHERE y.rn = 1;
  for rec in 1..l_family_members.count
  loop
    -- Index the collection first, then pick the field of that record
    dbms_output.put_line('majority mem of family id '
      || l_family_members(rec).family_id
      || ' stay in ' || l_family_members(rec).city);
  end loop;
END;
NB: the only change that matters is in the output loop, where (rec) now comes after the table name and before the column name.

PL/SQL: How to delete records in a specific manner, for example if records of specific type X exist, delete all but one record

I'm trying to create a PL/SQL procedure that deletes records which are grouped and selected by a cursor, leaving only one record per group. A group is a set of entries with the same id_number, activity_code, start_dt and activity_participation_code. I want to delete by xcomment first: if a group has multiple entries, delete all but ONE entry with a blank/null xcomment. If there are multiple entries with a blank xcomment, delete all but one with a blank table_nmb. If there are multiple entries with a blank table_nmb, delete the highest sequence until only one is left. Essentially, I want only one record per combination of these fields. I'm having trouble thinking of how to do this, so any help would be appreciated.
Here is my code so far:
Create Or Replace Function Y_Cleanup_Cursor
Return Sys_Refcursor
As
My_Cursor Sys_Refcursor;
Begin
Open My_Cursor For
Select Q.Id_Number, Q.Activity_Code, Q.Start_Dt, Q.Activity_Participation_Code, Q.Rec_Count, A.Xcomment, A.Table_Nmb, A.Xsequence
From (Select Id_Number, Activity_Code, Start_Dt, Activity_Participation_Code, Count(0) As Rec_Count
From Activity A
Group By Id_Number, Activity_Code, Start_Dt, Activity_Participation_Code
Having Count(0) > 1) Q,
Activity A
Where
Q.Id_Number = A.Id_Number And
Q.Activity_Code = A.Activity_Code And
Q.Start_Dt = A.Start_Dt And
Q.Activity_Participation_Code = A.Activity_Participation_Code;
Return My_Cursor;
End Y_Cleanup_Cursor;
Create Or Replace Procedure Help_Me_Please(Code In Varchar2)
Is
-- Declare Variables
-- I Stands For Internal Variable
L_Cursor Sys_Refcursor;
I_Id_Number Varchar2(10 Byte);
I_Xsequence Number (6);
I_Activity_Code Varchar2(05 Byte);
I_Start_Dt Varchar2(08 Byte);
I_Activity_Participation_Code Varchar2(02 Byte);
I_Table_Nmb Varchar2(15 Byte);
I_Xcomment Varchar2(255 Byte);
I_Rec_Count Number (6);
L_Counter Integer;
Begin
L_Cursor := Y_Cleanup_Cursor;
Loop
Fetch L_Cursor Into
I_Id_Number, I_Activity_Code, I_Start_Dt, I_Activity_Participation_Code, I_Rec_Count, I_Xcomment, I_Table_Nmb, I_Xsequence;
Select Count (Id_Number)
Into L_Counter
From Activity Where
Id_Number = I_Id_Number
And Activity_Code = I_Activity_Code
And Start_Dt = I_Start_Dt
And Activity_Participation_Code = I_Activity_Participation_Code
And Trim(Xcomment) Is Null;
If L_Counter <> I_Rec_Count Then
Begin
Delete From Activity
Where
Id_Number = I_Id_Number
And Activity_Code = I_Activity_Code
And Start_Dt = I_Start_Dt
And Activity_Participation_Code = I_Activity_Participation_Code
And Trim(Xcomment) Is Null;
end;
End If;
Exit When L_Cursor%Notfound;
End Loop;
Close L_Cursor;
End Help_Me_Please;

From what I gather, you want to delete all rows except one wherever the key columns repeat.
First, make sure to back up your table:
create table [backup_table] as select * from [table];
Then try this (as written it runs against the copy, so you can check the result before touching the original):
DELETE FROM backup_table
 WHERE rowid NOT IN
       (SELECT MIN(rowid)
          FROM backup_table
         GROUP BY [col1], [col2]);
col1, col2, etc. are the columns that should be identical.
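MIN(rowid) keeps an arbitrary row of each group. If the survivor has to follow the priority described in the question, one possible sketch ranks the rows of each group with ROW_NUMBER and deletes everything ranked after the first. It assumes this reading of the rules: prefer a row with a blank xcomment, then one with a blank table_nmb, then the lowest xsequence. The column names come from the question; the rid and rn aliases are made up for the illustration.

```sql
delete from activity
 where rowid in
       (select rid
          from (select rowid as rid,
                       row_number() over (
                         partition by id_number, activity_code, start_dt,
                                      activity_participation_code
                         -- blank xcomment first, then blank table_nmb,
                         -- then the lowest sequence number
                         order by case when trim(xcomment)  is null then 0 else 1 end,
                                  case when trim(table_nmb) is null then 0 else 1 end,
                                  xsequence) as rn
                  from activity)
         where rn > 1);
```

Running the inner query on its own first (or the whole statement against a copy of the table) shows which rows would be removed before anything is deleted.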

Pl/SQL - oracle 9i - Manual Pivoting

We have a table with three columns:
Customer_name  Age_range  Number_of_people
1              1-5        10
1              5-10       15
We need to return the number of people in each age range as columns of a single row. If we search for customer #1, the query should return just one row:
Age Range (1-5)  Age Range (5-10)
10               15
In other words, when I query for customer 1, the result should be a single row containing the number of people for each age_range.
What would be the best way to approach this?

You need to perform the pivot manually, with one CASE expression per age range:
SELECT SUM(CASE WHEN age_range = '1-5'
                THEN number_of_people
                ELSE NULL END) AS nop1,
       SUM(CASE WHEN age_range = '5-10'
                THEN number_of_people
                ELSE NULL END) AS nop5
  FROM customers
 WHERE customer_name = 1;

There are easier solutions in 10g and 11g using LISTAGG, COLLECT, or other capabilities added after 9i, but I believe that the following will work in 9i.
Source (http://www.williamrobertson.net/documents/one-row.html)
You will just need to replace deptno with customer_name and ename with Number_of_people.
SELECT deptno,
       LTRIM(SYS_CONNECT_BY_PATH(ename, ','), ',') AS concatenated
  FROM ( SELECT deptno,
                ename,
                ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY ename) - 1 AS seq
           FROM emp )
 WHERE connect_by_isleaf = 1
CONNECT BY seq = PRIOR seq + 1 AND deptno = PRIOR deptno
 START WITH seq = 0;
DEPTNO CONCATENATED
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 ADAMS,FORD,JONES,SCOTT,SMITH
30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
3 rows selected.

This will create a stored FUNCTION, which means you can call it at any time.
CREATE OR REPLACE FUNCTION number_of_people(p_customer_name VARCHAR2)
  RETURN VARCHAR2
IS
  v_number_of_people NUMBER;
  v_result           VARCHAR2(500);
  CURSOR c1
  IS
    SELECT Number_of_people FROM the_table WHERE Customer_name = p_customer_name;
BEGIN
  OPEN c1;
  LOOP
    FETCH c1 INTO v_number_of_people;
    EXIT WHEN c1%NOTFOUND;
    -- Append each value, separated by a space and a carriage return
    v_result := v_result || v_number_of_people || ' ' || CHR(13);
  END LOOP;
  CLOSE c1;
  RETURN v_result;
END;
/
To run it, use:
SELECT number_of_people(1) FROM dual;
Hope this helps, and please let me know if there are any errors; I didn't test-run the function myself.

Just do
select Number_of_people
from table
where Customer_name = 1
Are we missing some detail?
