I have a oracle query
select id from (
select ID, ROW_NUMBER() over (partition by LATEST_RECEIPT order by ID) rownumber
from Table
where LATEST_RECEIPT in
(
select LATEST_RECEIPT from Table
group by LATEST_RECEIPT
having COUNT(1) > 1
)
) t
where rownumber <> 1;
The data type of LATEST_RECEIPT was earlier varchar2(4000) and this query worked fine. Since the length of the column needs to be extended i modified it to CLOB, after which this fails. Could anyone help me fix this issue or provide a work around?
You can change your inner query to look for other rows with the same last_receipt value but a different ID (assuming ID is unique); if another row exists then that is equivalent to your count returning greater than one. But you can't simply test two CLOB values for equality, you need to use dbms_lob.compare:
select ID
from your_table t1
where exists (
select null from your_table t2
where dbms_lob.compare(t2.LATEST_RECEIPT, t1.LATEST_RECEIPT) = 0
and t2.ID != t1.ID
-- or if ID isn't unique: and t2.ROWID != t1.ROWID
);
Applying the row number filter is tricker, as you also can't use a CLOB in the analytic partition by clause. As André Schild suggested, you can use a hash; here passing the integer value 3, which is the equivalent of dbms_crypto.hash_sh1 (though in theory that could change in a future release!):
select id from (
select ID, ROW_NUMBER() over (partition by dbms_crypto.hash(LATEST_RECEIPT, 3)
order by ID) rownumber
from your_table t1
where exists (
select null from your_table t2
where dbms_lob.compare(t2.LATEST_RECEIPT, t1.LATEST_RECEIPT) = 0
and t2.ID != t1.ID
-- or if ID isn't unique: and t2.ROWID != t1.ROWID
)
)
where rownumber > 1;
It is of course possible to get a hash collision, and if that happened - you had two latest_receipt values which both appeared more than once and both hashed to the same value - then you could get too many rows back. That seems pretty unlikely, but it's something to consider.
So rather than ordering you can only look for rows which have the same lastest_receipt and a lower ID:
select ID
from your_table t1
where exists (
select null from your_table t2
where dbms_lob.compare(t2.LATEST_RECEIPT, t1.LATEST_RECEIPT) = 0
and t2.ID < t1.ID
);
Again that assumes ID is unique. If it isn't then you could still use rowid instead, but you would have less control over which rows were found - the lowest rowid isn't necessarily the lowest ID. Presumably you're using this to dine rows to delete. If you actually don't mind which row you keep and which you delete then you could still do:
and t2.ROWID < t1.ROWID
But since you are currently ordering that probably isn't acceptable, and hashing might be preferable, despite the small risk.
Related
My database structure contains columns: id, name, value, dealer. I want to retrieve row with lowest value for each dealer. I've been trying to mess up with MIN() and GROUP BY, still - no solution.
Solution1:
SELECT t1.* FROM your_table t1
JOIN (
SELECT MIN(value) AS min_value, dealer
FROM your_table
GROUP BY dealer
) AS t2 ON t1.dealer = t2.dealer AND t1.value = t2.min_value
Solution2 (recommended, much faster than solution1):
SELECT t1.* FROM your_table t1
LEFT JOIN your_table t2
ON t1.dealer = t2.dealer AND t1.value > t2.value
WHERE t2.value IS NULL
This problem is very famous, so there is a special page for this in Mysql's manual.
Check this: Rows Holding the Group-wise Maximum/Minimum of a Certain Column
select id,name,MIN(value) as pkvalue,dealer from TABLENAME
group by id,name,dealer;
here you group all rows by id,name,dealer and then you will get min value as pkvalue.
SELECT MIN(value),dealer FROM table_name GROUP BY dealer;
First you need to resolve the lowest value for each dealer, and then retrieve rows having that value for a particular dealer. I would do this that way:
SELECT a.*
FROM your_table AS a
JOIN (SELECT dealer,
Min(value) AS m
FROM your_table
GROUP BY dealer) AS b
ON ( a.dealer= b.dealer
AND a.value = b.m )
Try following:
SELECT dealer, MIN(value) as "Lowest value"
FROM value
GROUP BY dealer;
select id, name, value, dealer from yourtable where dealer
in(select min(dealer) from yourtable group by name, value)
These answers seem to miss the edge case of having multiple minimum values for a dealer and only wanting to return one row.
If you want to only want one value for each dealer you can use row_number partition - group - the table by dealer then order the data by value and id. we have to make the assumption that you will want the row with the smallest id.
SELECT ord_tbl.id,
ord_tbl.name,
ord_tbl.value,
ord_tbl.dealer
FROM (SELECT your_table.*,
ROW_NUMBER() over (PARTITION BY dealer ORDER BY value ASC, ID ASC)
FROM your_table
) AS ord_tbl
WHERE ord_tbl.ROW_NUMBER = 1;
Be careful though that value, id and dealer are indexed. If not this will do a full table scan and can get pretty slow...
I have a SQL query in teradata that returns a results set of ~160m rows in (I guess) a reasonable time: dependent on how good a day the server is having it runs between 10-60 minutes.
I recently got access to space to save it as a table, however using my initial query and the "insert into " command I get error 2646-no more spool.
query structure is
insert into <test_DB.tablename>
with smaller_dataset as
(
select
*
from
(
select
items
,case items
from
<Database.table>
QUALIFY ROW_NUMBER() OVER (PARTITION BY A,B ORDER BY C desc , LAST_UPDATE_DTM DESC) = 1
where 1=1
and other things
) T --irrelevant alias for subquery
QUALIFY ROW_NUMBER() OVER (PARTITION BY A, B ORDER BY C desc) = 1)
, employee_table as
(
select
items
,max(J1.field1) J1_field1
,max(J2.field1) J2_field1
,max(J3.field1) J3_field1
,max(J4.field1) J4_field1
from smaller_dataset S
self joins J1,J2,J3,J4
group by
non-aggregate items
)
select
items
case items
from employee_table
;
How can I break up the return into smaller chunks to prevent this error?
I have two tables whose relevant columns look more or less like the following:
table1.id:
zIXuJeY-qH
zJrcULGLXK
zXyIgnlylb
zZ7MYDGDAV
zbCFI4wKpe
ziGeauO-0O
zmzWbaGSb7
table2.id
zIXuJeY-qH
zIXuJeY-qH
zIXuJeY-qH
zIXuJeY-qH
zJrcULGLXK
zJrcULGLXK
in which table1.id is a primary key and table2.id is not a primary key. I'm trying to grab all the IDs from table 1 that are not in table 2, but am having trouble. If, for instance, I select a count of all the values that are in table two, I get:
>SELECT COUNT(id) FROM table1 WHERE id IN (SELECT id FROM table2);
>2
But if I try to find the IDs not in table2, I get 0:
>SELECT COUNT(id) FROM table1 WHERE id NOT IN (SELECT id FROM table2);
>
There are 503 unique IDs in table1 and only 163 in table2. What am I doing wrong?
I suppose there can be nulls in table2.id?
NOT IN ( ..., null, ...)
does never result in true unfortunately (because - so the argumentation - null is the unknown value, and we cannot guarantee that the unknown value is not the value you are looking for obviously). A bad trap.
Try
SELECT COUNT(id) FROM table1 WHERE id NOT IN (SELECT id FROM table2 where id is not null);
or an EXISTS clause instead.
I am trying to delete all rows in a simple table that have a duplicate value except for the duplicate with the highest id.
Table:
CREATE TABLE IF NOT EXISTS [Expression] (
[ID] INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
[Value] VARCHAR(2048) NOT NULL
)
Attempted Queries:
DELETE Expression
WHERE EXISTS (
SELECT 1
FROM Expression Exp2
WHERE Expression.Value=Exp2.Value
AND Expression.Id < Exp2.Id)
fails with
"SQL logic error or missing database near "Expression":syntax error"
DELETE Exp1
FROM Expression Exp1
INNER JOIN Expression Exp2
ON Exp1.Value=Exp2.Value AND Exp1.Id < Exp2.Id
fails with
"SQL logic error or missing database near "Exp1":syntax error"
What syntax do I need to use?
Don't forget the FROM.
DELETE command
DELETE FROM Expression
WHERE EXISTS (
SELECT 1
FROM Expression Exp2
WHERE Expression.Value=Exp2.Value
AND Expression.Id < Exp2.Id
);
You can do this with many ways:
1) USING CTE:
WITH CTE AS (
SELECT ID, VALUE, ROW_NUMBER() OVER(ORDER BY Value) as RowNum
FROM Expression
)
DELETE
FROM CTE
WHERE ROWNum >1
2) USING Temp tables: Same concept
DELETE t1 FROM contacts t1 INNER JOIN contacts t2 WHERE t1.id < t2.id AND t1.email = t2.email;
I have this query
Select distinct p_id, p_date,p_city
from p_master
where p_a_id in(1,2,5,8,2,1,10,02)
and my IN clause contains 200 values. How do I get to know which ones weren't returned by the query. Each value in the IN clause may have a record in some cases they don't. I want to know all the records that weren't found for any selected p_a_id type.
Please help
This will do the trick but I'm sure there's an easier way to find this out :-)
with test1 as
(select '1,2,5,8,2,1,10,02' str from dual)
select * from (
select trim(x.column_value.extract('e/text()')) cols
from test1 t, table (xmlsequence(xmltype('<e><e>' || replace(t.str,',','</e><e>')|| '</e></e>').extract('e/e'))) x) cols
left outer join
(Select count(*), p_a_id from p_master where p_a_id in (1,2,5,8,2,1,10,02) group by p_a_id) p
on p.p_a_id = cols.cols
where p_a_id is null
;