Teradata ROWS UNBOUNDED PRECEDING - teradata

I have the data set below. I want to ensure that, for any particular ID, the period from start_date to end_date does not overlap with the dates of any other ID.
ID Start_Date End_Date
101 01-01-2001 31-01-2001
102 01-02-2001 28-02-2001
103 26-02-2001 31-03-2016
104 15-03-2001 30-04-2001
105 01-05-2002 31-05-2002
106 05-12-2002 31-12-2002
107 15-12-2002 05-01-2003
To do this I created the query below:
select id, start_date, end_date,
       case
           when end_date < max(end_date) over (order by start_date rows unbounded preceding)
               then 'overlapping'
           when start_date < max(end_date) over (order by start_date rows unbounded preceding)
               then 'overlapping'
           else 'non-overlapping'
       end as FLAG
from table
The output below flags every row as 'overlapping', which is not correct. I think 'rows unbounded preceding' also includes the current row in the calculation.
Can you please let me know where I am wrong?
ID Start_Date End_Date Flag
101 01-01-2001 31-01-2001 Overlapping
102 01-02-2001 28-02-2001 Overlapping
103 26-02-2001 31-03-2016 Overlapping
104 15-03-2001 30-04-2001 Overlapping
105 01-05-2002 31-05-2002 Overlapping
106 05-12-2002 31-12-2002 Overlapping
107 15-12-2002 05-01-2003 Overlapping

There are a couple of ways to go about this. Since date ranges overlapping can get tricky, I would use Teradata's Period logic and a self join:
SELECT
    *
FROM
    table t1
    INNER JOIN table t2 ON
        period(t1.start_date, next(t1.end_date)) P_INTERSECT period(t2.start_date, next(t2.end_date)) IS NOT NULL
        AND t1.id <> t2.id -- skip the trivial match of each row with itself
That will convert your start and end dates to a PERIOD data type and then look for records that have intersecting periods. The results will be the two records, combined into a single record, where the overlap takes place.
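On the suspicion raised in the question: ROWS UNBOUNDED PRECEDING is the short form of ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so the MAX(end_date) in the original query always includes the current row's own end_date, and start_date is always below it. Ending the frame at 1 PRECEDING leaves the current row out. The sketch below shows that frame using Python's sqlite3 module as a stand-in for Teradata (it needs SQLite 3.25 or later for window functions). The table name t and the sample rows are assumptions for illustration; ID 103 uses the 2001 end date from the other answer rather than the 2016 date in the question.

```python
import sqlite3

# Sample rows loosely based on the question (assumed data, ISO dates so that
# plain text comparison orders them correctly).
rows = [
    (101, "2001-01-01", "2001-01-31"),
    (102, "2001-02-01", "2001-02-28"),
    (103, "2001-02-26", "2001-03-31"),
    (104, "2001-03-15", "2001-04-30"),
    (105, "2002-05-01", "2002-05-31"),
    (106, "2002-12-05", "2002-12-31"),
    (107, "2002-12-15", "2003-01-05"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, start_date TEXT, end_date TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# The frame stops one row before the current row, so MAX() only sees
# earlier-starting rows. For the first row the frame is empty, MAX() is NULL,
# the comparison is NULL, and the CASE falls through to ELSE.
query = """
SELECT id,
       CASE
           WHEN start_date <= MAX(end_date) OVER (
                    ORDER BY start_date
                    ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
           THEN 'overlapping'
           ELSE 'non-overlapping'
       END AS flag
FROM t
ORDER BY start_date
"""
flags = dict(conn.execute(query).fetchall())

for row_id, flag in flags.items():
    print(row_id, flag)
```

This only flags a row that overlaps a row starting earlier. ID 102 is reported as non-overlapping here even though 103 runs into it, so a second pass in descending order (or a self join) is still needed to flag both sides of an overlap.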

You can also try it the old-fashioned way, like:
CREATE TABLE db.t
(id INT,
start_date DATE,
end_date DATE);
INSERT INTO db.t VALUES (101,'2001-01-01','2001-01-31');
INSERT INTO db.t VALUES (102,'2001-02-01','2001-02-28');
INSERT INTO db.t VALUES (103,'2001-02-26','2001-03-31');
INSERT INTO db.t VALUES (104,'2001-03-15','2001-04-30');
INSERT INTO db.t VALUES (105,'2002-05-01','2002-05-31');
INSERT INTO db.t VALUES (106,'2002-12-05','2002-12-31');
INSERT INTO db.t VALUES (107,'2002-12-01','2003-01-05');
SELECT
    t.id,
    t.start_date,
    t.end_date,
    MAX(CASE WHEN o.id IS NULL THEN 'non-overlapping'
             ELSE 'overlaps with' || o.id END) AS flag
FROM
    db.t t LEFT OUTER JOIN
    db.t o ON
        t.start_date <= o.end_date AND -- both end points are inclusive
        t.end_date >= o.start_date AND
        t.id <> o.id -- do not compare a row with itself
GROUP BY 1,2,3
returns (sorted manually)
id start_date end_date flag
101 01/01/2001 01/31/2001 non-overlapping
102 02/01/2001 02/28/2001 overlaps with 103
103 02/26/2001 03/31/2001 overlaps with 104
104 03/15/2001 04/30/2001 overlaps with 103
105 05/01/2002 05/31/2002 non-overlapping
106 12/05/2002 12/31/2002 overlaps with 107
107 12/01/2002 01/05/2003 overlaps with 106
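For anyone without a Teradata system at hand, the self-join approach can be tried out with Python's sqlite3 module. This is only a sketch: it reuses the sample rows from this answer, stores the dates as ISO text (an assumption that makes string comparison behave like date comparison), and treats both end points as inclusive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, start_date TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (101, "2001-01-01", "2001-01-31"),
        (102, "2001-02-01", "2001-02-28"),
        (103, "2001-02-26", "2001-03-31"),
        (104, "2001-03-15", "2001-04-30"),
        (105, "2002-05-01", "2002-05-31"),
        (106, "2002-12-05", "2002-12-31"),
        (107, "2002-12-01", "2003-01-05"),
    ],
)

# Two closed ranges overlap when each one starts on or before the other ends.
# MAX() keeps one partner per row (the highest id) when there are several.
query = """
SELECT t.id,
       MAX(CASE WHEN o.id IS NULL THEN 'non-overlapping'
                ELSE 'overlaps with ' || o.id END) AS flag
FROM t
LEFT JOIN t AS o
       ON t.start_date <= o.end_date
      AND t.end_date   >= o.start_date
      AND t.id <> o.id
GROUP BY t.id
ORDER BY t.id
"""
overlaps = dict(conn.execute(query).fetchall())

for row_id, flag in overlaps.items():
    print(row_id, flag)
```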

Related

Split a row into multiple rows - Teradata

Below is an example of my table
Names Start_Date Orders Items
AAA 2020-01-01 300 100
BAA 2020-02-01 896 448
My requirement would be as below
Names Start_Date Orders
AAA 2020-01-01 100
AAA 2020-01-01 100
AAA 2020-01-01 100
BAA 2020-02-01 448
BAA 2020-02-01 448
The rows should be split based on the (Orders/Items) value.
This is a nice task for Teradata's SQL extension to create time series (based on @Andrew's test data):
SELECT *
FROM vt_foo
EXPAND ON PERIOD(start_date, start_date + Cast(Ceiling(Cast(orders AS FLOAT)/items) AS INT)) AS pd
For an exact split of orders into items:
SELECT dt.*,
    CASE WHEN items * (end_date - start_date) > orders
         THEN orders MOD items -- the last row gets the remainder
         ELSE items
    END AS items_in_row
FROM
 (
   SELECT t.*, End(pd) AS end_date
   FROM vt_foo AS t
   EXPAND ON PERIOD(start_date, start_date + Cast(Ceiling(Cast(orders AS FLOAT)/items) AS INT)) AS pd
 ) AS dt
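The arithmetic behind the exact split can be checked outside the database. The following is a minimal Python sketch, not part of the Teradata solution; split_order is a hypothetical helper name, and it assumes orders and items are positive integers.

```python
def split_order(orders, items):
    """Return the quantity for each output row.

    Full rows carry `items` each; a final row carries the remainder,
    which mirrors the `orders MOD items` branch of the CASE expression.
    """
    full_rows, remainder = divmod(orders, items)
    return [items] * full_rows + ([remainder] if remainder else [])


print(split_order(300, 100))  # three rows of 100
print(split_order(896, 448))  # two rows of 448
print(split_order(525, 100))  # five rows of 100 plus a last row of 25
```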
This calls for a recursive CTE. Here's how I'd approach it, with a lovely volatile table for some sample data.
create volatile table vt_foo
(names varchar(100), start_date date, orders int, items int)
on commit preserve rows;
insert into vt_foo values ('AAA','2020-01-01',300,100);
insert into vt_foo values ('BAA','2020-02-01',896,448);
insert into vt_foo values ('CCC','2020-03-01',525,100);
with recursive cte (names, start_date, items, num, counter) as (
    select
        names,
        start_date,
        items,
        round(orders / (items * 1.0)) as num,
        1 as counter
    from vt_foo
    UNION ALL
    select
        a.names,
        a.start_date,
        a.items,
        b.num,
        b.counter + 1
    from vt_foo a
    inner join cte b
        on a.names = b.names
        and a.start_date = b.start_date
    where b.counter + 1 <= b.num
)
select * from cte
order by names, start_date
This bit: b.counter + 1 <= b.num is the key to limiting the output to the proper number of rows per name/date.
I think this should be OK, but test it with small volumes of data first.
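The recursion can be tried on small data with Python's sqlite3 module, which also supports WITH RECURSIVE. This is a sketch under assumptions: the SQLite version recurses on the CTE alone instead of joining back to vt_foo, and it casts the rounded ratio to an integer. Because the row count uses round(), the CCC row (525/100 = 5.25) yields 5 rows, whereas the Ceiling() used in the EXPAND ON answer would give 6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vt_foo (names TEXT, start_date TEXT, orders INTEGER, items INTEGER)"
)
conn.executemany(
    "INSERT INTO vt_foo VALUES (?, ?, ?, ?)",
    [
        ("AAA", "2020-01-01", 300, 100),
        ("BAA", "2020-02-01", 896, 448),
        ("CCC", "2020-03-01", 525, 100),
    ],
)

# Anchor member: one row per source row with the target row count in num.
# Recursive member: repeat the row while the counter stays within num.
query = """
WITH RECURSIVE cte (names, start_date, items, num, counter) AS (
    SELECT names, start_date, items,
           CAST(ROUND(orders / (items * 1.0)) AS INTEGER), 1
    FROM vt_foo
    UNION ALL
    SELECT names, start_date, items, num, counter + 1
    FROM cte
    WHERE counter + 1 <= num
)
SELECT names, start_date, items, counter
FROM cte
ORDER BY names, start_date, counter
"""
split_rows = conn.execute(query).fetchall()

# Count the generated rows per name.
rows_per_name = {}
for names, _start_date, _items, _counter in split_rows:
    rows_per_name[names] = rows_per_name.get(names, 0) + 1
print(rows_per_name)
```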

Retain values only for certain section of data in teradata

Below is the link to my previous question.
Retain values till there is a change in value in Teradata
It worked as suggested by one of the community members, @dnoeth. Can this retention be done for only a certain section of the data?
I.e., retain values only for rows where Dep is A or B. When Dep is C, just use the same value as the input, with no need to retain it until the next value.
Data:
Cust_id Balance st_ts Dep
123 1000 27MAY2018 A
123 350 31MAY2018 A
256 2000 29MAY2018 B
345 1000 28APR2018 C
345 1200 26MAY2018 C
Output reqd:
Cust_id Balance st_ts Dep
123 1000 27MAY2018 A
123 1000 28MAY2018 A
123 1000 29MAY2018 A
123 1000 30MAY2018 A
123 350 31MAY2018 A
256 2000 29MAY2018 B
256 2000 30MAY2018 B
256 2000 31MAY2018 B
345 1000 28APR2018 C
345 1200 26MAY2018 C
Query used:
WITH cte AS
(
  SELECT customer_id, bal, st_ts,
    -- return the next row's date
    Coalesce(Min(st_ts)
             Over (PARTITION BY customer_id
                   ORDER BY st_ts
                   ROWS BETWEEN 1 Following AND 1 Following)
            ,Date '2018-06-01') AS next_Txn_dt
  FROM BAL_DET
)
SELECT customer_id, bal
,Last(pd) -- last day of the period
FROM cTE
-- make a period of the current and next row's date
-- and return one row per day
EXPAND ON PERIOD(ST_TS, next_Txn_dt) AS pd;
Thanks
Sandy
You can add a CASE to check for Dep = 'C':
WITH cte AS
( SELECT customer_id, bal, st_ts, dep,
-- return the next row's date
CASE
WHEN dep = 'C'
THEN st_ts +1 -- simply increase date
ELSE
Coalesce(Min(st_ts)
Over (PARTITION BY customer_id
ORDER BY st_ts
ROWS BETWEEN 1 Following AND 1 Following)
,DATE '2018-06-01')
END AS next_Txn_dt
FROM BAL_DET
)
SELECT customer_id, bal
,Last(pd) -- last day of the period
,dep
FROM cTE
-- make a period of the current and next row's date
-- and return one row per day
EXPAND ON PERIOD(ST_TS, next_Txn_dt) AS pd
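The intended result of the CASE can be checked against the sample data with a small Python sketch. It is an illustration of the logic only, not Teradata code: expand_balances is a hypothetical helper, the tuples mirror the question's data, and the exclusive end date 2018-06-01 is the same default used in the Coalesce.

```python
from datetime import date, timedelta

# (cust_id, balance, st_ts, dep), taken from the question's sample data
balances = [
    (123, 1000, date(2018, 5, 27), "A"),
    (123, 350, date(2018, 5, 31), "A"),
    (256, 2000, date(2018, 5, 29), "B"),
    (345, 1000, date(2018, 4, 28), "C"),
    (345, 1200, date(2018, 5, 26), "C"),
]


def expand_balances(rows, horizon, passthrough_deps=("C",)):
    """Return one row per day from each row's date up to the next row's date.

    Rows whose dep is in passthrough_deps cover a single day only, like the
    `st_ts + 1` branch of the CASE. `horizon` is the exclusive end date used
    when a customer has no later row.
    """
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    result = []
    for i, (cust, bal, start, dep) in enumerate(ordered):
        if dep in passthrough_deps:
            end = start + timedelta(days=1)
        else:
            nxt = ordered[i + 1] if i + 1 < len(ordered) else None
            end = nxt[2] if nxt is not None and nxt[0] == cust else horizon
        day = start
        while day < end:
            result.append((cust, bal, day, dep))
            day += timedelta(days=1)
    return result


expanded = expand_balances(balances, horizon=date(2018, 6, 1))
for row in expanded:
    print(row)
```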

Oracle Timestamp based calculation on every day using Two datetime column

I have a table that contains two timestamps, t1 (event open date) and t2 (event close date), and a primary key eventid.
While an event is open, t2 is null; when the event gets closed, the same row is updated with the event closure date t2.
For example, I want to check how many issues are open on each day, based on the opened date (t1), from 01-apr-2016 to 10-apr-2016.
I have to calculate how many events are open on every day in a selected date range.
Let's say eventid 1 was opened on 1st-APR and closed on 10th-APR. If I calculate the number of open issues per day on 11th-APR, it should count that event as open from 1st-APR to 10th-APR.
Table Structure:-
================================================
EVENTID T1 T2
================================================
1 01-apr-2016 10-apr-2016
2 02-apr-2016 08-apr-2016
3 05-apr-2016 09-apr-2016
Expected Output:-
==============================================================================
DATE TOTAL_OPEN_EVENTS
==============================================================================
01-apr-2016 1
02-apr-2016 2(1 issue open on 1st(not closed on 2nd) and 1 on 2nd)
03-apr-2016 2
04-apr-2016 2
05-apr-2016 3
06-apr-2016 3
07-apr-2016 3
08-apr-2016 2(1 issue got closed on 8th(which was opened on 2nd))
09-apr-2016 2
10-apr-2016 0
How can I do this kind of calculation in an Oracle database?
In order to generate the end report, you need a row for each date in your desired range. You could either use a calendar table, if available, or I find using a query on DUAL using CONNECT BY LEVEL < some_number works well to generate rows on the fly. (In this case "some_number" will be the number of days you want to report on.)
From there, you just need to join the individual dates to the date ranges in your event table:
-- create the "events" table
create table event_date_ranges
as
select 1 as event_id, TO_DATE('2016-APR-01', 'YYYY-MON-DD') as start_date, TO_DATE('2016-APR-10', 'YYYY-MON-DD') as end_date from dual
union all
select 2 as event_id, TO_DATE('2016-APR-02', 'YYYY-MON-DD') as start_date, TO_DATE('2016-APR-08', 'YYYY-MON-DD') as end_date from dual
union all
select 3 as event_id, TO_DATE('2016-APR-05', 'YYYY-MON-DD') as start_date, TO_DATE('2016-APR-09', 'YYYY-MON-DD') as end_date from dual
;
with
date_range_qry as
(-- one way to set the start and end dates for your report
select TO_DATE('2016-APR-01', 'YYYY-MON-DD') as report_start_date
, TO_DATE('2016-APR-10', 'YYYY-MON-DD') as report_end_date
from dual
)
, dates_qry
as
(
-- generate a row for all dates between 2016-APR-01 and 2016-APR-10
select report_start_date + ROWNUM - 1 as report_date
from dual
cross join
date_range_qry drq
connect by level <= (drq.report_end_date - drq.report_start_date + 1)
)
select dq.report_date, count(edr.event_id) as total_open_events
from dates_qry dq
left outer join
event_date_ranges edr
on dq.report_date >= edr.start_date
and (edr.end_date is null or dq.report_date < edr.end_date) -- a null end_date means the event is still open
group by dq.report_date
order by dq.report_date
Output:
REPORT_DATE TOTAL_OPEN_EVENTS
2016-APR-01 1
2016-APR-02 2
2016-APR-03 2
2016-APR-04 2
2016-APR-05 3
2016-APR-06 3
2016-APR-07 3
2016-APR-08 2
2016-APR-09 1
2016-APR-10 0
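The same counting rule (an event is open on a day when start_date <= day < end_date) can be sanity-checked outside the database. This is a plain Python sketch, not Oracle code: open_events_per_day is a hypothetical helper, and None stands in for a null close date on a still-open event.

```python
from datetime import date, timedelta

# (event_id, opened, closed); closed is None while the event is still open
events = [
    (1, date(2016, 4, 1), date(2016, 4, 10)),
    (2, date(2016, 4, 2), date(2016, 4, 8)),
    (3, date(2016, 4, 5), date(2016, 4, 9)),
]


def open_events_per_day(events, first_day, last_day):
    """Count events open on each day of [first_day, last_day].

    An event counts from its open date up to, but not including, its close date.
    """
    counts = {}
    day = first_day
    while day <= last_day:
        counts[day] = sum(
            1
            for _event_id, opened, closed in events
            if opened <= day and (closed is None or day < closed)
        )
        day += timedelta(days=1)
    return counts


daily_counts = open_events_per_day(events, date(2016, 4, 1), date(2016, 4, 10))
for day, total in daily_counts.items():
    print(day.isoformat(), total)
```

Whether the close date itself should count as an open day is a business decision; changing `day < closed` to `day <= closed` gives the inclusive variant.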
You can try this:
create table events_log
as
select 1 as event_id, TO_DATE('01-04-2016', 'DD/MM/YYYY') as T1, TO_DATE('10-04-2016', 'DD/MM/YYYY') as T2 from dual
union all
select 2 as event_id, TO_DATE('02-04-2016', 'DD/MM/YYYY') as T1, TO_DATE('08-04-2016', 'DD/MM/YYYY') as T2 from dual
union all
select 3 as event_id, TO_DATE('05-04-2016', 'DD/MM/YYYY') as T1, TO_DATE('09-04-2016', 'DD/MM/YYYY') as T2 from dual
;
--------------
select v.REPORT_DATE, count(t.EVENT_ID) as open_event
from events_log t,
(select to_date('01/04/2016', 'DD/MM/YYYY') + ROWNUM - 1 as report_date
from dual
connect by level <= (to_date('11/04/2016', 'DD/MM/YYYY') -
to_date('01/04/2016', 'DD/MM/YYYY') + 1)) v
where t.T1(+) <= v.report_date
and t.T2(+) >= v.report_date
group by v.report_date
order by v.report_date;
Output will be:
report_date open_event
01/04/2016 1
02/04/2016 2
03/04/2016 2
04/04/2016 2
05/04/2016 3
06/04/2016 3
07/04/2016 3
08/04/2016 3
09/04/2016 2
10/04/2016 1
11/04/2016 0

SQLite how to select row based on a column max

I have the following table
UserID Cost date
1 23 2015-04-02
2 17 2015-03-14
1 63 2015-09-23
2 49 2013-03-17
2 12 2013-04-23
1 96 2016-01-01
What I want is a list of USERID & date with the largest cost
So
Userid 1 cost 96 date 2016-01-01
Userid 2 cost 49 date 2013-03-17
I have tried
select date, userid, max(cost) from table group by userid
But I'm not sure whether the date will always come from the correct row.
Thanks
In SQLite 3.7.11 or later, values from other columns are guaranteed to come from a row that matches the max().
Consider a generalized approach that works in most RDBMSs. The query below uses a derived table subquery:
SELECT t2.UserID, t2.MaxOfCost, t1.Date
FROM Table t1
INNER JOIN
    (SELECT UserID, Max(Cost) AS MaxOfCost
     FROM Table
     GROUP BY UserID) t2
ON t1.UserID = t2.UserID
AND t1.Cost = t2.MaxOfCost
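Both approaches can be compared on the question's data with Python's sqlite3 module. This is a sketch for checking the behaviour; the table name costs is an assumption (the question only says "table").

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE costs (UserID INTEGER, Cost INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO costs VALUES (?, ?, ?)",
    [
        (1, 23, "2015-04-02"),
        (2, 17, "2015-03-14"),
        (1, 63, "2015-09-23"),
        (2, 49, "2013-03-17"),
        (2, 12, "2013-04-23"),
        (1, 96, "2016-01-01"),
    ],
)

# SQLite-specific: with a single MAX() in the query, the bare column `date`
# is taken from the row that holds the maximum.
bare = conn.execute(
    "SELECT UserID, MAX(Cost), date FROM costs GROUP BY UserID ORDER BY UserID"
).fetchall()

# Portable: compute the maximum per user, then join back to find its row.
joined = conn.execute(
    """
    SELECT t1.UserID, t2.MaxOfCost, t1.date
    FROM costs AS t1
    INNER JOIN (SELECT UserID, MAX(Cost) AS MaxOfCost
                FROM costs
                GROUP BY UserID) AS t2
            ON t1.UserID = t2.UserID
           AND t1.Cost = t2.MaxOfCost
    ORDER BY t1.UserID
    """
).fetchall()

print(bare)
print(joined)
```

Note that the join returns more than one row for a user when the maximum cost occurs on several dates, while the bare-column form returns a single row per user.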

column is ambiguously defined

My tables:
DESC emp
Name Null Type
------------------------------ -------- --------------------------
EMP_NO NUMBER
EMP_NAME VARCHAR2(10)
ADDRESS VARCHAR2(15)
PH_NO NUMBER(10)
DEPT_NO NUMBER
result:
1 ram ctr 8892939927 100
2 mohan ptr 7569936347 101
3 mallu ppt 9553438342 102
4 scoot dmc 9874563210 103
5 jim plr 9236548875 104
6 ravi tpt 8562398756 105
7 manju hyd 7562398742 106
8 manoj hyd 869523654 107
9 sarath ctr 9632158769 108
10 hemanth mpk 9632147852 109
desc salary
Name Null Type
------------------------------ -------- --------------------------
EMP_NO NUMBER
SALARY NUMBER(10)
PERIOD VARCHAR2(10)
START_DATE DATE
END_DATE DATE
result:
1 12580 15months 12-DEC-07 10-DEC-10
2 15500 19months 10-JAN-07 10-DEC-11
3 7777 18months 11-JUL-07 21-APR-11
4 9999 11months 07-JUL-07 31-JAN-11
5 8500 9months 12-MAR-07 27-MAR-11
6 10000 20months 17-SEP-07 01-AUG-11
7 25000 7months 17-NOV-07 26-JUL-11
8 100000 6months 05-MAY-07 21-JUN-11
9 35000 16months 28-FEB-08 21-JUN-11
10 5000 16months 02-DEC-08 19-AUG-11
Joining query:
select emp_no,
emp_name,
dpt_no,
salary
from emp
join salary on emp.dpt_no=salary.dpt_no
but I am getting the error "column is ambiguously defined". How do I resolve this problem?
You need to fully qualify the columns in the select list (the way it's done in the JOIN condition). Otherwise Oracle cannot know which table a column that exists in both tables, such as dpt_no, should be taken from.
select emp.emp_no,
emp.emp_name,
emp.dpt_no,
salary.salary
from emp
join salary on emp.dpt_no=salary.dpt_no;
It's good coding style to always qualify the columns, at least in a query involving more than one table, even if they are not ambiguous.
If you don't want to type the full table name, you can use a (meaningful) alias:
select emp.emp_no,
emp.emp_name,
emp.dpt_no,
sal.salary,
sal.period
from emp
join salary sal
on emp.dpt_no = sal.dpt_no;
If a column name is the same in both tables (salary and emp) and you're joining the tables, you have to specify which table you want to select the column from (salary or emp).
In your case the solution is to use salary.dpt_no instead of dpt_no:
select emp_no,emp_name, salary.dpt_no,salary from emp join salary on emp.dpt_no=salary.dpt_no
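The error is easy to reproduce on a small scale. The sketch below uses Python's sqlite3 module rather than Oracle, so the error text differs from ORA-00918, but the cause is the same. It assumes the columns shown in the DESC output, where emp_no is the column present in both tables, and joins on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_no INTEGER, emp_name TEXT, dept_no INTEGER)")
conn.execute("CREATE TABLE salary (emp_no INTEGER, salary INTEGER)")
conn.execute("INSERT INTO emp VALUES (1, 'ram', 100)")
conn.execute("INSERT INTO salary VALUES (1, 12580)")

# Unqualified emp_no exists in both tables, so the engine cannot pick one.
try:
    conn.execute(
        "SELECT emp_no, emp_name, salary "
        "FROM emp JOIN salary ON emp.emp_no = salary.emp_no"
    )
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)
print(error_message)

# Qualifying every column with a table alias removes the ambiguity.
qualified = conn.execute(
    "SELECT e.emp_no, e.emp_name, e.dept_no, s.salary "
    "FROM emp AS e JOIN salary AS s ON e.emp_no = s.emp_no"
).fetchall()
print(qualified)
```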
