Count ID based on start date - count

I have a source table that looks like this:
I start by counting the IDs that are Pd on the first date. Then I go to the second date, check which IDs are Pd, and add them to the count. On the third date I check whether the IDs that were Pd on the previous date have changed; if they have, I count them in their new group. Please see the desired output. Could you please help?
Thank you

In a single-pass solution you need to track each id's prior inv. With this tracking in place, for each row you:
- decrement the count of the id's prior inv
- increment the count of the id's current inv
- replace the id's prior inv in the tracker with the current inv
The number of ids is dynamic and not known a priori, and the lookup of an id's prior inv is keyed on id. The best DATA step feature for dynamic lookup is the HASH object.
Also, because the counts output is a pivot based on inv values, you need to either
- use a series of if/then or select/when statements to increment and decrement the inv counts, or
- output the data as date, inv, count and pivot it with Proc TRANSPOSE
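The tracking steps above can be sketched outside SAS as well. The following is a minimal Python illustration, not the DATA step solution itself: a dict stands in for the hash tracker, a Counter holds the running counts, and the rows are assumed to be sorted by date already. The names `rows`, `last_inv` and `counts_per_date` are hypothetical.

```python
from collections import Counter
from itertools import groupby

# (id, date, inv) rows taken from the sample data, already sorted by date.
rows = [
    (100, "2018-01-01", "Pd"),
    (101, "2018-01-01", "Pd"),
    (102, "2018-02-04", "Pd"),
    (100, "2018-02-07", "NG"),
    (101, "2018-02-14", "G"),
    (101, "2018-02-18", "Pd"),
    (100, "2018-03-15", "Pd"),
    (102, "2018-05-01", "G"),
    (103, "2018-06-03", "Pd"),
]

def counts_per_date(rows):
    """Return one (date, G, NG, Pd) tuple per date in a single pass."""
    last_inv = {}        # tracker: id -> prior inv
    counts = Counter()   # running count per inv value
    out = []
    for date, group in groupby(rows, key=lambda r: r[1]):
        for id_, _, inv in group:
            prior = last_inv.get(id_)
            if prior is not None:
                counts[prior] -= 1   # decrement the id's prior inv
            counts[inv] += 1         # increment the id's current inv
            last_inv[id_] = inv      # replace prior inv with current inv
        out.append((date, counts["G"], counts["NG"], counts["Pd"]))
    return out

result = counts_per_date(rows)
```

For the sample data, `result` holds one tuple per distinct date, and the last one reflects the state after id 103 enters as Pd.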
Data
data have;
format id 4. date yymmdd10. inv $2.;
input id date yymmdd10. event $ e_seq inv ; datalines;
100 2018-01-01 In 1 Pd
101 2018-01-01 In 1 Pd
102 2018-02-04 In 1 Pd
100 2018-02-07 N 2 NG
101 2018-02-14 P 2 G
101 2018-02-18 A 3 Pd
100 2018-03-15 A 3 Pd
102 2018-05-01 P 2 G
103 2018-06-03 In 1 Pd
run;
Sample code
Nested DOW loops are used to test for end of input data and ensure one row output for each date (the group)
data want(keep=date G NG Pd);
  if 0 then set have; * prep pdv for hash;
  * ids is the 'tracker';
  declare hash ids();
  ids.defineKey('id');
  ids.defineData('id', 'lastinv');
  ids.defineDone();
  lastinv = inv; * prep lastinv in pdv;
  do until (end);
    do until (last.date);
      set have end=end;
      where inv in ('Pd' 'G' 'NG');
      by date;
      if ids.find() = 0 then do; * decrement count based on ids prior inv;
        select (lastinv);
          when ('G') G + -1;
          when ('NG') NG + -1;
          when ('Pd') Pd + -1;
          otherwise ;
        end;
      end;
      * update ids prior inv;
      lastinv = inv;
      ids.replace();
      * increment count based on ids current inv;
      select (lastinv);
        when ('G') G + 1;
        when ('NG') NG + 1;
        when ('Pd') Pd + 1;
        otherwise ;
      end;
    end;
    OUTPUT; * <------------ output one row of counts per date;
  end;
run;

Related

SQLITE get next row after ORDERBY

I need to get the next row from an ORDER BY query.
I have two columns, ID (primary key) and Age (float), in a table T, and I need something like the following:
SELECT ID FROM T WHERE !> (inputted ID) + 1 rowID/Next row <! ORDER BY Age LIMIT 1
(ordering by Age, then by primary key, although I suspect that if the Age values are the same SQLite would default to ordering by primary key anyway).
Essentially it would select the next row after the inputted ID in the ordered table; it's the next row / rowID + 1 that I am not sure how to get.
As suggested, here is a data set as an example:
https://dbfiddle.uk?rdbms=sqlite_3.27&fiddle=19685ac20cc42041a59d318a01a2010f
ID Age
1 12.2
2 36.8
3 22.5
4 41
5 16.7
I am attempting to get the following row from the list ordered by Age, given a specific ID:
ID Age
1 12.2
5 16.7
3 22.5
2 36.8
4 41
Something similar to
SELECT ID FROM OrderedInfo WHERE ID = 5 ORDER BY Age ASC LIMIT 1 OFFSET 1;
My expected result would be '3' from the example data above
I have expanded the data set to include duplicate entries, as I didn't explicitly state it could have such data. forpas' answer works for the first example with no duplicate entries. Thanks for your help.
https://dbfiddle.uk?rdbms=sqlite_3.27&fiddle=f13d7f5a44ba414784547d9bbdf4997e
Use a subquery for the ID that you want in the WHERE clause:
SELECT *
FROM OrderedInfo
WHERE Age > (SELECT Age FROM OrderedInfo WHERE ID = 5)
ORDER BY Age LIMIT 1;
See the demo.
If there are duplicate values in the column Age, use a CTE that returns the row for the given ID and join it to the table so that you can extend the conditions with a tie-break on ID:
WITH cte AS (SELECT ID, Age FROM OrderedInfo WHERE ID = 5)
SELECT o.*
FROM OrderedInfo o INNER JOIN cte c
ON o.Age > c.Age OR (o.Age = c.Age AND o.ID > c.ID)
ORDER BY o.Age, o.ID LIMIT 1;
See the demo.
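As a quick way to try the duplicate-safe query, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table contents follow the example data, plus one extra row (ID 6, Age 16.7) that is an assumption added only to create a tie; the helper name `next_id` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderedInfo (ID INTEGER PRIMARY KEY, Age REAL)")
conn.executemany(
    "INSERT INTO OrderedInfo VALUES (?, ?)",
    # Row (6, 16.7) is an assumed duplicate Age, not part of the original data.
    [(1, 12.2), (2, 36.8), (3, 22.5), (4, 41), (5, 16.7), (6, 16.7)],
)

NEXT_ROW_SQL = """
WITH cte AS (SELECT ID, Age FROM OrderedInfo WHERE ID = ?)
SELECT o.ID
FROM OrderedInfo o INNER JOIN cte c
ON o.Age > c.Age OR (o.Age = c.Age AND o.ID > c.ID)
ORDER BY o.Age, o.ID LIMIT 1
"""

def next_id(conn, current_id):
    """Return the ID that follows current_id in (Age, ID) order, or None."""
    row = conn.execute(NEXT_ROW_SQL, (current_id,)).fetchone()
    return row[0] if row else None
```

With this data, `next_id(conn, 5)` returns 6 (same Age, higher ID), `next_id(conn, 6)` returns 3, and the last row in the ordering has no successor, so `next_id(conn, 4)` returns None.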

Split a row into multiple rows - Teradata

Below is an example of my table
Names Start_Date Orders Items
AAA 2020-01-01 300 100
BAA 2020-02-01 896 448
My requirement would be as below
Names Start_Date Orders
AAA 2020-01-01 100
AAA 2020-01-01 100
AAA 2020-01-01 100
BAA 2020-02-01 448
BAA 2020-02-01 448
The rows should be split based on the (Orders/Items) value
This is a nice task for Teradata's SQL extension to create time series (based on @Andrew's test data):
SELECT *
FROM vt_foo
EXPAND ON PERIOD(start_date, start_date + Cast(Ceiling(Cast(orders AS FLOAT)/items) AS INT)) AS pd
For an exact split of orders into items:
SELECT dt.*,
CASE WHEN items * (end_date - start_date) > orders
THEN orders MOD items
ELSE items
end
FROM
(
SELECT t.*, End(pd) AS end_date
FROM vt_foo AS t
EXPAND ON PERIOD(start_date, start_date + Cast(Ceiling(Cast(orders AS FLOAT)/items) AS INT)) AS pd
) AS dt
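To make the arithmetic of the exact split easier to check, here is a small Python sketch of the same logic as the CASE expression above. It is only an illustration of the calculation, not Teradata code; the function name `split_orders` is hypothetical, and positive integer inputs are assumed.

```python
def split_orders(orders, items):
    """Split orders into chunks of size items; the last chunk holds the remainder."""
    n_rows = -(-orders // items)  # integer ceiling of orders / items
    chunks = []
    for k in range(1, n_rows + 1):
        # Mirrors: CASE WHEN items * k > orders THEN orders MOD items ELSE items END
        chunks.append(orders % items if items * k > orders else items)
    return chunks
```

For the sample rows, `split_orders(300, 100)` gives three chunks of 100, `split_orders(896, 448)` gives two chunks of 448, and `split_orders(525, 100)` gives five chunks of 100 followed by a chunk of 25.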
This calls for a recursive CTE. Here's how I'd approach it, with a lovely volatile table for some sample data.
create volatile table vt_foo
(names varchar(100), start_date date, orders int, items int)
on commit preserve rows;
insert into vt_foo values ('AAA','2020-01-01',300,100);
insert into vt_foo values ('BAA','2020-02-01',896,448);
insert into vt_foo values ('CCC','2020-03-01',525,100);
with recursive cte (names, start_date,items, num, counter) as (
select
names,
start_date,
items,
round(orders /( items * 1.0) ) as num ,
1 as counter
from vt_foo
UNION ALL
select
a.names,
a.start_date,
a.items,
b.num,
b.counter + 1
from vt_foo a
inner join cte b
on a.names = b.names
and a.start_date =b.start_date
where b.counter + 1 <= b.num
)
select * from cte
order by names,start_date
The condition b.counter + 1 <= b.num is the key to limiting the output to the proper number of rows per product/date.
I think this should be ok, but test it with small volumes of data.

Retain values only for certain section of data in teradata

Below is the link to my previous question:
Retain values till there is a change in value in Teradata
It worked as suggested by one of the community members, @Dnoeth. Can this retention be done for only a certain section of the data?
I.e., retain data only for rows where Dep is A or B. When Dep is C, just use the same value as the input, with no need to retain it.
Data:
Cust_id Balance st_ts Dep
123 1000 27MAY2018 A
123 350 31MAY2018 A
256 2000 29MAY2018 B
345 1000 28APR2018 C
345 1200 26MAY2018 C
Output reqd:
Cust_id Balance st_ts Dep
123 1000 27MAY2018 A
123 1000 28MAY2018 A
123 1000 29MAY2018 A
123 1000 30MAY2018 A
123 350 31MAY2018 A
256 2000 29MAY2018 B
256 2000 30MAY2018 B
256 2000 31MAY2018 B
345 1000 28APR2018 C
345 1200 26MAY2018 C
Query used:
WITH cte AS
(
SELECT customer_id, bal, st_ts,
-- return the next row's date
Coalesce(Min(st_ts)
Over (PARTITION BY customer_id
ORDER BY st_ts
ROWS BETWEEN 1 Following AND 1 Following)
,Date '2018-06-01') AS next_Txn_dt
FROM BAL_DET
)
SELECT customer_id, bal
,Last(pd) -- last day of the period
FROM cTE
-- make a period of the current and next row's date
-- and return one row per day
EXPAND ON PERIOD(ST_TS, next_Txn_dt) AS pd;
Thanks
Sandy
You can add a CASE to check for Dep = 'C':
WITH cte AS
( SELECT customer_id, bal, st_ts, dep,
-- return the next row's date
CASE
WHEN dep = 'C'
THEN st_ts +1 -- simply increase date
ELSE
Coalesce(Min(st_ts)
Over (PARTITION BY customer_id
ORDER BY st_ts
ROWS BETWEEN 1 Following AND 1 Following)
,DATE '2018-06-01')
END AS next_Txn_dt
FROM BAL_DET
)
SELECT customer_id, bal
,Last(pd) -- last day of the period
,dep
FROM cTE
-- make a period of the current and next row's date
-- and return one row per day
EXPAND ON PERIOD(ST_TS, next_Txn_dt) AS pd
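The effect of the CASE can be checked with a small Python sketch that mimics the query's logic on the sample data: rows with Dep = 'C' get a one-day period, all other rows are carried forward until the customer's next row or a default end date. This is an illustration under those assumptions, not Teradata code; the names `rows`, `DEFAULT_END` and `expand` are hypothetical.

```python
from datetime import date, timedelta

# (cust_id, balance, st_ts, dep) rows from the sample data.
rows = [
    (123, 1000, date(2018, 5, 27), "A"),
    (123, 350, date(2018, 5, 31), "A"),
    (256, 2000, date(2018, 5, 29), "B"),
    (345, 1000, date(2018, 4, 28), "C"),
    (345, 1200, date(2018, 5, 26), "C"),
]
DEFAULT_END = date(2018, 6, 1)  # same default as in the query

def expand(rows, default_end=DEFAULT_END):
    """Return one row per day, retaining values only when dep is not 'C'."""
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    out = []
    for i, (cust, bal, st, dep) in enumerate(ordered):
        if dep == "C":
            end = st + timedelta(days=1)          # period of a single day
        elif i + 1 < len(ordered) and ordered[i + 1][0] == cust:
            end = ordered[i + 1][2]               # next row's date for this customer
        else:
            end = default_end
        day = st
        while day < end:                          # the period end is exclusive
            out.append((cust, bal, day, dep))
            day += timedelta(days=1)
    return out

expanded = expand(rows)
```

For the sample data this yields the ten rows listed in the required output: customer 123 is filled for 27 to 31 May, customer 256 for 29 to 31 May, and customer 345 keeps its two original rows.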

Sum all field with the other field < itself in sqlite

Sorry, I couldn't think of a good title for my problem.
I have a table a(f1 integer, date Long), where date is increasing, with the following data:
f1 date
1 1
2 2
3 3
...
I need a running sum of f1 by date: for record 1 {1,1} the sum of f1 is 1, for record 2 the sum is 1+2, for record 3 the sum is 1+2+3, and so on.
How can I do that?
This requires a correlated subquery:
SELECT date,
(SELECT SUM(f1)
FROM a AS a2
WHERE a2.date <= a.date
) AS f1_sum
FROM a
ORDER BY date;
But it's inefficient. Consider just scanning the table, sorted by the date, and summing f1 as you're reading it.
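The scan-and-accumulate alternative could look like the following minimal sketch, which uses Python's built-in sqlite3 module as the client. It assumes each date occurs only once, as in the example; the helper name `running_sums` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (f1 INTEGER, date INTEGER)")
conn.executemany("INSERT INTO a VALUES (?, ?)", [(1, 1), (2, 2), (3, 3)])

def running_sums(conn):
    """Scan the table once in date order and accumulate f1 on the client side."""
    total = 0
    result = []
    for date, f1 in conn.execute("SELECT date, f1 FROM a ORDER BY date"):
        total += f1
        result.append((date, total))
    return result

# The correlated subquery from the answer, for comparison.
subquery_result = conn.execute(
    """
    SELECT date,
           (SELECT SUM(f1) FROM a AS a2 WHERE a2.date <= a.date) AS f1_sum
    FROM a
    ORDER BY date
    """
).fetchall()
```

Both approaches return the same pairs for this data; the scan reads each row once, while the subquery is evaluated again for every row.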

SQLite count(*) in while clause

I have a calendar table in which there are all the dates in the future and a workday field:
fld_date / fld_workday
2014-01-01 / 1
2014-01-02 / 1
2014-01-03 / 0
...
I want to select a date that is n workdays away from another date. I tried two ways, but both failed:
The 5th workday from 2014-11-07:
1.
SELECT n1.fld_date FROM calendar as n1 WHERE n1.fld_workday=1 AND
(select count(*) FROM calendar as n2 WHERE n2.fld_date>='2014-11-07' AND n2.fld_workday=1)=5
It returned 0 rows.
2.
SELECT fld_date FROM calendar WHERE fld_date>='2014-11-07' AND fld_workday=1 LIMIT 1 OFFSET 5
This works, but I would like to replace the constant 5 with a field, which is not possible (it would be inside a bigger select statement):
SELECT fld_date FROM calendar WHERE fld_date>='2014-11-07' AND fld_workday=1 LIMIT 1 OFFSET fld_another_field
Any suggestion?
In the first query, the subquery does not refer to the row in n1.
You need a correlated subquery:
SELECT fld_Date
FROM Calendar AS n1
WHERE fld_WorkDay = 1
AND (SELECT COUNT(*)
FROM Calendar AS n2
WHERE fld_Date BETWEEN '2014-11-07' AND n1.fld_Date
AND fld_WorkDay = 1
) = 5
LIMIT 1
The subquery is extremely inefficient if there is no index on the fld_Date column.
You can avoid executing the subquery for every row in n1 by adding another condition with an estimate of the result date (assuming roughly four to five workdays per week, plus a few extra days of margin):
...
WHERE fld_Date BETWEEN date('2014-11-07', (5 * 7/5 - 10) || ' days')
AND date('2014-11-07', (5 * 7/4 + 10) || ' days')
AND fld_WorkDay = 1
AND (SELECT ...
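The correlated subquery can be tried with a minimal sketch based on Python's built-in sqlite3 module. The calendar contents are an assumption made for the demo: November 2014 with Monday to Friday marked as workdays and no holidays. The helper name `nth_workday` is hypothetical, and the start date itself counts as the first workday because BETWEEN is inclusive.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendar (fld_date TEXT PRIMARY KEY, fld_workday INTEGER)")

# Assumed demo data: November 2014, weekdays are workdays.
day = date(2014, 11, 1)
while day.month == 11:
    conn.execute(
        "INSERT INTO calendar VALUES (?, ?)",
        (day.isoformat(), 1 if day.weekday() < 5 else 0),
    )
    day += timedelta(days=1)

NTH_WORKDAY_SQL = """
SELECT n1.fld_date
FROM calendar AS n1
WHERE n1.fld_workday = 1
  AND (SELECT COUNT(*)
       FROM calendar AS n2
       WHERE n2.fld_date BETWEEN ? AND n1.fld_date
         AND n2.fld_workday = 1
      ) = ?
ORDER BY n1.fld_date
LIMIT 1
"""

def nth_workday(conn, start, n):
    """Return the date of the n-th workday counted from start (inclusive), or None."""
    row = conn.execute(NTH_WORKDAY_SQL, (start, n)).fetchone()
    return row[0] if row else None
```

Because the count is compared with a bound value rather than used in OFFSET, the constant can be replaced by a column of an outer query in the same way.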

Resources