Is it possible to use WHERE clause in same query as PARTITION BY? - teradata

I need to write SQL that keeps only the minimum 5 records per each identifiable record in a table. For this, I use partition by and delete all records where the value returned is greater than 5. When I attempt to use the WHERE clause in the same query as the partition by statement, I get the error "Ordered Analytical Functions not allowed in WHERE Clause". So, in order to get it to work, I have to use three subqueries. My SQL looks ilke this:
delete mydb.mytable where (field1,field2) in
(
select field1,field2 from
(
select field1,field2,
Rank() over
(
partition BY field1
order by field1,field2
) n
from mydb.mytable
) x
where n > 5
)
The innermost subquery just returns the raw data. Since I can't use WHERE there, I wrapped it with a subquery, the purpose of which is to 1) use WHERE to get records greater than 5 in rank and 2) select only field1 and field2. The reason why I select only those two fields is so that I can use the IN statement for deleting those records in the outermost query.
It works, but it appears a bit cumbersome. I'd like to consolidate the inner two subqueries into a single subquery. Is this possible?

Sounds like you need to use the QUALIFY clause which is the HAVING clause for Window Aggregate functions. Below is my take on what you are trying to accomplish.
Please do not run this SQL directly against your production data without first testing it.
/* Physical Delete */
DELETE TGT
FROM MyDB.MyTable TGT
INNER JOIN
(SELECT Field1
, Field2
FROM MyDB.MyTable
QUALIFY ROW_NUMBER() (PARTITION BY Field1, ORDER BY Field1,2)
> 5
) SRC
ON TGT.Field1 = SRC.Field1
AND TGT.Field2 = SRC.Fileld2
/* Logical Delete */
UPDATE TGT
FROM MyDB.MyTable TGT
,
(SELECT Field1
, Field2
FROM MyDB.MyTable
QUALIFY ROW_NUMBER() (PARTITION BY Field1, ORDER BY Field1,2)
> 5
) SRC
SET Deleted = 'Y'
/* RecordExpireDate = Date - 1 */
WHERE TGT.Field1 = SRC.Field1
AND TGT.Field2 = SRC.Fileld2

Related

Get LIMIT value from subquery result

I would like to use the LIMIT option in my query, but the number of expected rows is stored in another table. This is what I have, but it doesn't work:
select * from table1 limit (select limitvalue from table2 where id = 1)
When I only run the subquery, the result is 6, as expected.
I prefer working with a WITH statement if possible, but that didn't work eiter.
Thank you in advance!
You could use a prepared statement to get the limit of queries from the other table because the limit clause does not allow non constant variables as parameter:
PREPARE firstQuery FROM "SELECT * FROM table1 LIMIT ?";
SET #limit = (select limitvalue from table2 where id = 1);
EXECUTE firstQuery USING #limit;
The source of the sql query from another post
You can make use of MariaDB's ROW_NUMBER function in a CTE to count the rows to be output, comparing that against the limitvalue. For example:
WITH rownums AS (
SELECT *,
ROW_NUMBER() OVER () AS rn
FROM table1
)
SELECT *
FROM rownums
WHERE rn <= (SELECT limitvalue FROM table2 WHERE id = 1)
Note Using LIMIT without ORDER BY is not guaranteed to give you the same results every time. You should include an ORDER BY clause in the OVER part of the ROW_NUMBER window function. With the sample data in my demo, you might use something like:
ROW_NUMBER() OVER (ORDER BY mark DESC)
Demo on dbfiddle

Subquery with alias in SQLite

I have a query that I run in PostgreSQL like this:
select
c_count, count(*) as custdist
from (
select
c_custkey,
count(o_orderkey)
from
customer left outer join orders on
c_custkey = o_custkey
and o_comment not like '%special%requests%'
group by
c_custkey
)as c_orders (c_custkey, c_count)
group by
c_count
order by
custdist desc,
c_count desc;
And I wanted to run it on SQLite, but I got this error: Error: near" (": syntax error. Maybe he doesn't recognize this as c_orders (c_custkey, c_count).
Is there any way to rewrite this query to execute in SQLite?
SQLite does not allow redefining/renaming columns for an nested, aliased query. You can do that with a WITH clause (i.e. Common Table Expression; CTE). Or you can add aliases to the nested query columns directly using AS keyword.
Interesting that this is exactly how the outer query columns are named. Just use the same pattern for the nested query. I don't use PostgreSQL, but why not add aliases directly on each column and complicate it by using different syntax for each part of the query?
select
c_count, count(*) as custdist
from (
select
c_custkey,
count(o_orderkey) AS c_count
from
customer left outer join orders on
c_custkey = o_custkey
and o_comment not like '%special%requests%'
group by
c_custkey
) AS c_orders
group by
c_count
order by
custdist desc,
c_count desc;

string_agg function with IN operator not working in PostgreSQL Query

here is my query
select *
from table
where id in ( select string_agg(CAST(id as varchar), '","') FROM table)
string_agg() is completely useless and unnecessary for that:
select *
from table_one
where id in (select id FROM other_table)
I assume you are doing that for two different tables, otherwise that would be a very expensive way of writing: select * from table where id is not null

Update row with value from next row sqlite

I have the following columns in a SQLite DB.
id,ts,origin,product,bid,ask,nextts
1,2016-10-18 20:20:54.733,SourceA,Dow,1.09812,1.0982,
2,2016-10-18 20:20:55.093,SourceB,Oil,7010.5,7011.5,
3,2016-10-18 20:20:55.149,SourceA,Dow,18159.0,18161.0,
How can I populate the 'next timestamp' column (nextts) with the next timestamp for the same product (ts), from the same source? I've been trying the following, but I can't seem to put a subquery in an UPDATE statement.
UPDATE TEST a SET nextts = (select ts
from TEST b
where b.id> a.id and a.origin = b.origin and a.product = b.product
order by id asc limit 1);
If I call this, I can display it, but I haven't found a way of updating the value yet.
select a.*,
(select ts
from TEST b
where b.id> a.id and a.origin = b.origin and a.product = b.product
order by id asc limit 1) as nextts
from TEST a
order by origin, a.id;
The problem is that you're using table alias for table in UPDATE statement, which is not allowed. You can skip alias from there and use unaliased (but table-name prefixed) reference to its columns (while keeping aliased references for the SELECT), like this:
UPDATE TEST
SET nextts = (
SELECT b.ts
FROM TEST b
WHERE b.id > TEST.id AND
TEST.origin = b.origin AND
TEST.product = b.product
ORDER BY b.id ASC
LIMIT 1
);
Prefixing unaliased column references with the table name is necessary for SQLite to identify that you're referencing to unaliased table. Otherwise the id column whould be understood as the id from the closest[*] possible data source, in which case it's the aliased table (as b alias), while we're interested in the unaliased table, therefore we need to explicitly tell SQLite that.
[*] Closest data source is the one listed in the same query, or parent query, or parent's parent query, etc. SQLite is looking for the first data source (going from inner part to the outside) in the query hierarchy that defines this column.

Insert into Table with the first column being a Sequence

I am trying to use an Insert, Sequence and Select * to work together.
INSERT INTO BRK_INDV
Select * from (Select brk_seq.NEXTVAL as INDV_SEQ, a.*
FROM (select to_date(to_char(REQUEST_DATETIME,'DD-MM-YYYY'),'DD-MM-YYYY') BUSINESS_DAY, to_char(REQUEST_DATETIME,'hh24') src_hour,
CASE tran_type
WHEN 'V' THEN 'Visa'
WHEN 'M' THEN 'MasterCard'
ELSE tran_type
end text,
tran_type, count(*) as count
from DLY_STATS
where 1=1
AND to_date(to_char(REQUEST_DATETIME,'DD-MM-YYYY'),'DD-MM-YYYY') = '09-FEB-2015'
group by to_date(to_char(REQUEST_DATETIME,'DD-MM-YYYY'),'DD-MM-YYYY'),to_char(REQUEST_DATETIME,'hh24'),tran_type order by src_hour)a);
This gives me the following error:
ERROR at line 2:
ORA-02287: sequence number not allowed here
I tried to remove the order by and still the same error.
However, if I only run
Select brk_seq.NEXTVAL as INDV_SEQ, a.*
FROM (select to_date(to_char(REQUEST_DATETIME,'DD-MM-YYYY'),'DD-MM-YYYY') BUSINESS_DAY, to_char(REQUEST_DATETIME,'hh24') src_hour,
CASE tran_type
WHEN 'V' THEN 'Visa'
WHEN 'M' THEN 'MasterCard'
ELSE tran_type
end text,
tran_type, count(*) as count
from DLY_STATS
where 1=1
AND to_date(to_char(REQUEST_DATETIME,'DD-MM-YYYY'),'DD-MM-YYYY') = '09-FEB-2015'
group by to_date(to_char(REQUEST_DATETIME,'DD-MM-YYYY'),'DD-MM-YYYY'),to_char(REQUEST_DATETIME,'hh24'),tran_type order by src_hour)a;
It shows me proper entries. Then, why is select * not working for that?
Kindly help.
I see what you're trying to do. You want to insert rows into the BRK_INDV table in a particular order. The sequence number, which I assume will be the primary key of BRK_INDV, will be generated sequentially in the sorted order of the input rows.
You are working with a relational database. One of the first characteristics we all learn about a relational database is that the order of the rows in a table is insignificant. That's just a fancy word for fugitaboutit.
You cannot assume that a select * from table will return the rows in the same order they were written. It might. It might for quite a long time. Then something -- the number of rows, the grouping of some column values, the phase of the moon -- something will change and you will get them out in a seemingly totally random order.
If you want order, it must be imposed in the query, not the insert.
Here's the statement you should be executing:
INSERT INTO BRK_INDV
With
Grouped( Business_Day, Src_Hour, Text, Tran_Type, Count )As(
Select Trunc( Request_Datetime ) Business_Day,
To_Char( Request_Datetime, 'hh24') Src_Hour,
Case Tran_Type
When 'V' Then 'Visa'
When 'M' Then 'MasterCard'
Else Tran_Type
end Text,
Tran_Type, count(*) as count
from DLY_STATS
Where 1=1 --> Generated as dynamic SQL?
And Request_Datetime >= Date '2015-02-09'
And Request_Datetime < Date '2015-02-10'
Group By Trunc( Request_Datetime ), To_Char( Request_Datetime, 'hh24'), Tran_Type
)
Select brk_seq.Nextval Indv_Seq, G.*
from Grouped G;
Notice there is no order by. If you want to see the generated rows in a particular order:
select * from Brk_Indv order by src_hour;
Since there could be hundreds or thousands of transactions in any particular hour, you probably order by something other than hour anyway.
In Oracle, the trunc function is the best way to get a date with the time portion stripped away. However, you don't want to use it in the where clause (or, aamof, any other function such as to_date or to_char)as that would make the clause non-sargable and result in a complete table scan.
The problem is that you can't use a sequence in a subquery. For example, this gives the same ORA-02287 error you are getting:
create table T (x number);
create sequence s;
insert into T (select * from (select s.nextval from dual));
What you can do, though, is create a function that returns nextval from the sequence, and use that in a subquery:
create function f return number as
begin
return s.nextval;
end;
/
insert into T (select * from (select f() from dual));

Resources