I have the data below in Teradata:
DATE_HOME WORKING_DAY
01/01/2018 0
02/01/2018 1
03/01/2018 1
04/01/2018 1
05/01/2018 1
06/01/2018 0
07/01/2018 0
08/01/2018 1
09/01/2018 1
Output required
DATE_HOME WORKING_DAY Updated_DATE
01/01/2018 0 02/01/2018
02/01/2018 1 02/01/2018
03/01/2018 1 03/01/2018
04/01/2018 1 04/01/2018
05/01/2018 1 05/01/2018
06/01/2018 0 08/01/2018
07/01/2018 0 08/01/2018
08/01/2018 1 08/01/2018
09/01/2018 1 09/01/2018
That's a simple task for first_value:
first_value(case when WORKING_DAY = 1 then DATE_HOME end ignore nulls)
over (order by DATE_HOME
rows between current row and unbounded following)
Change the non-business dates into NULL and then search for the first non-NULL value.
Edit:
In fact, there's no need for first_value: since you sort by the same column, a simple min works too:
min(case when WORKING_DAY = 1 then DATE_HOME end)
over (order by DATE_HOME
rows between current row and unbounded following)
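For anyone without a Teradata system at hand, here is a minimal sketch of the same idea run against SQLite from Python. It assumes a table named holiday_calendar with ISO-formatted text dates (both are assumptions for the demo, not part of the question) and a SQLite build with window-function support (3.25 or later). The frame is written with CURRENT ROW, the standard frame keyword.

```python
import sqlite3

# Sample data from the question, rewritten as ISO dates so that
# text comparison sorts chronologically in SQLite.
rows = [
    ("2018-01-01", 0), ("2018-01-02", 1), ("2018-01-03", 1),
    ("2018-01-04", 1), ("2018-01-05", 1), ("2018-01-06", 0),
    ("2018-01-07", 0), ("2018-01-08", 1), ("2018-01-09", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE holiday_calendar (date_home TEXT, working_day INTEGER)")
conn.executemany("INSERT INTO holiday_calendar VALUES (?, ?)", rows)

# Non-working dates become NULL; MIN over the current and all following
# rows then picks the nearest working date at or after each row.
query = """
SELECT date_home,
       working_day,
       MIN(CASE WHEN working_day = 1 THEN date_home END)
           OVER (ORDER BY date_home
                 ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS updated_date
FROM holiday_calendar
ORDER BY date_home
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Trailing non-working days with no later working day would get NULL here, which may or may not be what you want.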
Good lord, this is ugly, but it seems to work. I don't have access to a TD system, so it's more verbose than it could be:
SELECT
    date_home,
    working_day,
    CAST(
        CASE
            -- If current date is a non-work day, add appropriate number of days to non-work day to get next work-day
            WHEN working_day = 0 THEN date_home + INTERVAL '1' DAY * (ROW_NUMBER() OVER(PARTITION BY update_date ORDER BY date_home DESC))
            ELSE date_home
        END
    AS DATE) AS Update_date
FROM (
    SELECT
        date_home,
        working_day,
        CASE
            -- If current and previous days were non-work days, group row by PrevWorkDay value
            WHEN MIN(working_day) OVER(ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) = 0 AND working_day = 0 THEN MIN(PrevWorkDay) OVER(ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING)
            ELSE PrevWorkDay
        END AS Update_Date
    FROM (
        SELECT
            date_home,
            working_day,
            CASE
                -- Track "baseline" previous date_home value for new group of "non-work day" rows
                WHEN COALESCE(MIN(working_day) OVER(ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING), 1) = 1 AND working_day = 0 THEN COALESCE(MIN(Date_Home) OVER(ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING), date_home)
                ELSE NULL
            END AS PrevWorkDay
        FROM holiday_calendar
    ) src
) src
ORDER BY date_home
This assumes your source data is stored in a table called "holiday_calendar".
The COALESCE calls handle the first row in the result set, which has no previous row to compute values from.
Give it a try and let me know.
I have a table with the columns id, date, name:
id date name
1 2019-08-01 00:00:00 abc
1 2019-08-01 00:00:00 def
2 2019-08-01 00:00:00 pqr
1 2019-08-31 00:00:00 def
I want to get the count of ids for a given month.
The expected count for month 8 is 3.
SELECT strftime('%Y/%m/%d', date) as vdate,count(DISTINCT vdate,id) AS totalcount FROM cardtable WHERE date BETWEEN date('" + $rootScope.mydate + "', 'start of month') AND date('" + $rootScope.mydate + "','start of month','+1 month','-1 day') group by vdate
Basically, I want to count a row only when the combination of id and date is distinct. For example, if there are 2 entries on 2019-08-01 with the same id, the count should be 1. If there are 3 entries on 2019-08-01, 2 with id 1 and the third with id 2, the count should be 2. And if there are 2 entries with id 1 on different dates, say one on 2019-08-01 and the other on 2019-08-31, the count for month 8 should be 2. How can I modify the above query?
Use a subquery which returns the distinct values that you want to count:
SELECT COUNT(*) AS totalcount
FROM (
    SELECT DISTINCT strftime('%Y/%m/%d', date), id
    FROM cardtable
    WHERE date(date) BETWEEN
        date('" + $rootScope.mydate + "', 'start of month')
        AND
        date('" + $rootScope.mydate + "','start of month','+1 month','-1 day')
)
See the demo.
Results:
| totalcount |
| ---------- |
| 3 |
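To try the subquery approach outside the app, here is a small sketch using Python's built-in sqlite3 module and the sample rows from the question. The helper name count_distinct_id_days is hypothetical, and the date is passed as a bound parameter instead of being concatenated into the SQL string, which is an assumption about how you might prefer to call it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cardtable (id INTEGER, date TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO cardtable VALUES (?, ?, ?)",
    [
        (1, "2019-08-01 00:00:00", "abc"),
        (1, "2019-08-01 00:00:00", "def"),
        (2, "2019-08-01 00:00:00", "pqr"),
        (1, "2019-08-31 00:00:00", "def"),
    ],
)

def count_distinct_id_days(conn, any_day_in_month):
    """Count distinct (day, id) pairs in the month containing any_day_in_month."""
    query = """
    SELECT COUNT(*) FROM (
        SELECT DISTINCT date(date), id
        FROM cardtable
        WHERE date(date) BETWEEN date(?, 'start of month')
                             AND date(?, 'start of month', '+1 month', '-1 day')
    )
    """
    return conn.execute(query, (any_day_in_month, any_day_in_month)).fetchone()[0]

total = count_distinct_id_days(conn, "2019-08-15")
print(total)
```

With the four sample rows, the distinct pairs are (2019-08-01, 1), (2019-08-01, 2) and (2019-08-31, 1), so the count is 3.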
How to generate the dates between two date columns for each row
A row generator technique should be used, such as:
SQL> alter session set nls_date_format = 'dd.mm.yyyy';
Session altered.
SQL> with test (sno, start_date, end_date) as
2 (select 1, date '2018-01-01', date '2018-01-05' from dual union
3 select 2, date '2018-01-03', date '2018-01-05' from dual
4 )
5 select sno, start_date + column_value - 1 datum
6 from test,
7 table(cast(multiset(select level from dual
8 connect by level <= end_date - start_date + 1)
9 as sys.odcinumberlist))
10 order by sno, datum;
SNO DATUM
---------- ----------
1 01.01.2018
1 02.01.2018
1 03.01.2018
1 04.01.2018
1 05.01.2018
2 03.01.2018
2 04.01.2018
2 05.01.2018
8 rows selected.
SQL>
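If it helps to see what the row generator produces without the Oracle syntax, the same expansion can be sketched in plain Python. The function name expand_ranges is hypothetical; the input mirrors the two test rows above.

```python
from datetime import date, timedelta

def expand_ranges(rows):
    """Turn each (sno, start_date, end_date) row into one row per day, inclusive."""
    expanded = []
    for sno, start_date, end_date in rows:
        day_count = (end_date - start_date).days + 1
        for offset in range(day_count):
            expanded.append((sno, start_date + timedelta(days=offset)))
    return expanded

test_rows = [
    (1, date(2018, 1, 1), date(2018, 1, 5)),
    (2, date(2018, 1, 3), date(2018, 1, 5)),
]
days = expand_ranges(test_rows)
for sno, datum in days:
    print(sno, datum.strftime("%d.%m.%Y"))
```

The inner loop plays the role of "connect by level <= end_date - start_date + 1": one generated number per day in the range.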
I have a table (pay_period) as follows:
pay_period
period_id list_id start_date end_date price
1 100 2017-01-01 2017-08-31 100
2 100 2017-09-01 2017-12-31 110
3 101 2017-01-01 2017-08-31 75
Now I have list_id, checkin_date, checkout_date
list_id 100
checkin_date 2017-08-25
checkout_date 2017-09-10
I need to calculate the price of a list for the period from the check-in date to the check-out date.
Therefore the calculation should be
7 * 100 + 10 * 110
I am thinking of doing it with a for loop. If there is a better way to do it, can you please suggest one?
You have to check whether checkin_date and checkout_date fall into the same period_id.
1.1 If yes, multiply the price by the number of days.
1.2 If not, count the days from checkin_date until the end of period 1 and multiply by the corresponding price, then do the same for checkout_date and the beginning of the next period.
Note: I guess there may be more than 2 prices per list_id, for example:
period_id list_id start_date end_date price
1 100 2017-01-01 2017-04-30 100
2 100 2017-05-01 2017-09-30 110
3 100 2017-10-01 2017-12-31 120
4 101 2017-01-01 2017-08-31 75
and the calculation period to be:
list_id 100
checkin_date 2017-03-01
checkout_date 2017-11-10
In this case, yes, the solution would be a CURSOR holding the prices and periods for the list_id; loop through it and compare checkin_date and checkout_date with each record.
Best,
Mikcutu.
You can do the following for much cleaner code. Although it is purely SQL, I am using a function to make the code easier to understand.
Create a generic function which returns the number of overlapping days between 2 date ranges.
CREATE OR REPLACE FUNCTION fn_count_range
( p_start_date1 IN DATE,
p_end_date1 IN DATE,
p_start_date2 IN DATE,
p_end_date2 IN DATE ) RETURN NUMBER AS
v_days NUMBER;
BEGIN
IF p_end_date1 < p_start_date1 OR p_end_date2 < p_start_date2 THEN
RETURN 0;
END IF;
SELECT COUNT(*) INTO v_days
FROM (
(SELECT p_start_date1 + LEVEL - 1
FROM dual CONNECT BY LEVEL <= p_end_date1 - p_start_date1 + 1 ) INTERSECT
(SELECT p_start_date2 + LEVEL - 1
FROM dual CONNECT BY LEVEL <= p_end_date2 - p_start_date2 + 1 ) );
RETURN v_days;
END;
/
Now, your query to calculate the total price is simplified.
WITH lists ( list_id,
checkin_date,
checkout_date) AS
( SELECT 100,
TO_DATE('2017-08-25','YYYY-MM-DD'),
TO_DATE('2017-09-10','YYYY-MM-DD')
FROM dual) --Not required if you have a lists table.
SELECT l.list_id,
SUM(fn_count_range(start_date,end_date,checkin_date,checkout_date) * price) total_price
FROM pay_period p
JOIN lists l ON p.list_id = l.list_id
GROUP BY l.list_id;
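As a cross-check of the arithmetic, the overlap count can also be computed directly from the range boundaries instead of generating and intersecting the days. This is only a sketch in Python; overlap_days and total_price are hypothetical names, and both date ranges are treated as inclusive, as in the question's 7 * 100 + 10 * 110 example.

```python
from datetime import date

def overlap_days(start1, end1, start2, end2):
    """Number of days shared by two inclusive date ranges (0 if they do not overlap)."""
    latest_start = max(start1, start2)
    earliest_end = min(end1, end2)
    return max(0, (earliest_end - latest_start).days + 1)

# (period_id, list_id, start_date, end_date, price), as in the question
pay_period = [
    (1, 100, date(2017, 1, 1), date(2017, 8, 31), 100),
    (2, 100, date(2017, 9, 1), date(2017, 12, 31), 110),
    (3, 101, date(2017, 1, 1), date(2017, 8, 31), 75),
]

def total_price(periods, list_id, checkin_date, checkout_date):
    """Sum of (overlapping days * price) over all periods of the given list."""
    return sum(
        overlap_days(start, end, checkin_date, checkout_date) * price
        for _period_id, lid, start, end, price in periods
        if lid == list_id
    )

total = total_price(pay_period, 100, date(2017, 8, 25), date(2017, 9, 10))
print(total)  # 7 * 100 + 10 * 110 = 1800
```

The same expression works for any number of price periods per list_id, since periods that do not touch the stay contribute 0 days.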
I have a data table of three columns id, dtstart, dtend. For example:
id start end
1 01/01/2015 31/01/2015
1 02/02/2015 28/02/2015
1 01/07/2016 31/07/2016
1 01/08/2016 31/08/2016
2 01/03/2015 31/03/2015
2 01/04/2015 30/04/2015
2 01/02/2016 28/02/2016
2 01/03/2016 31/03/2016
...
I need to create another data table grouped by id with the same columns, where the new start date is the minimum of the original dtstart values and the new end date is the maximum of the original dtend values.
When there is a break of more than one day between an end date and the next start date, the rows should be grouped separately.
For example for the above the new table would be:
id start end
1 01/01/2015 28/02/2015
1 01/07/2016 31/08/2016
2 01/03/2015 30/04/2015
2 01/02/2016 31/03/2016
...
Do I need a for loop or is there a more efficient way (data table grouping for example)? The table is over 20 million rows with 100k+ unique ids.
Cheers
Andrew
This can be done using dplyr
library(dplyr)

dt.new <- dt %>%
  arrange(id, start, end) %>%
  mutate(gr = cumsum(lag(id, default = min(id)) != id |
                     as.numeric(difftime(start, lag(end, default = first(start)), units = 'days')) > 1)) %>%
  group_by(id, gr) %>%
  summarise(start = first(start),
            end = last(end))
The result is:
Source: local data frame [6 x 4]
Groups: id [?]
id gr start end
<int> <int> <dttm> <dttm>
1 1 0 2015-01-01 2015-01-31
2 1 1 2015-02-02 2015-02-28
3 1 2 2016-07-01 2016-08-31
4 2 3 2015-03-01 2015-04-30
5 2 4 2016-02-01 2016-02-28
6 2 5 2016-03-01 2016-03-31
This works, but it doesn't match your expected output because you requested a one-day margin (if you want a two-day margin, switch from > 1 to > 2). Note also that 2016 was a leap year, which R's calendar accounts for, so the gap between 2/28/2016 and 3/1/2016 is 2 days.
Thanks again #akash87
For example, row 6 below falls inside a month that is already covered, so the result should still be a single row for id 1 from 01/02/2006 to 30/09/2006. Instead it breaks into two: the first from 01/02/2006 to 12/06/2006 and the second from 01/07/2006 to 30/09/2006.
id dtstart dtend
1 01/02/2006 28/02/2006
1 01/03/2006 31/03/2006
1 01/04/2006 30/04/2006
1 01/05/2006 31/05/2006
1 01/06/2006 30/06/2006
1 10/06/2006 12/06/2006
1 01/07/2006 31/07/2006
1 01/08/2006 31/08/2006
1 01/09/2006 30/09/2006
2 01/04/2006 30/04/2006
2 01/05/2006 31/05/2006
2 01/09/2006 30/09/2006
2 01/10/2006 31/10/2006
So instead of returning
id start end
1 01/02/2006 30/09/2006
2 01/04/2006 31/05/2006
2 01/09/2006 31/10/2006
We have
id start end
1 01/02/2006 12/06/2006
1 01/07/2006 30/09/2006
2 01/04/2006 31/05/2006
2 01/09/2006 31/10/2006
Andrew
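One possible way to handle a row that sits inside an already covered range is to compare each start date with the running maximum end date of the current group, not with the previous row's end date. The sketch below shows that logic in plain Python; merge_ranges and max_gap_days are hypothetical names, and this is not the dplyr answer's method, just an illustration of the comparison.

```python
from datetime import date

def merge_ranges(rows, max_gap_days=1):
    """Merge (id, start, end) rows per id when the gap to the group's latest end is small."""
    merged = []
    for row_id, start, end in sorted(rows):
        if merged:
            last_id, last_start, last_end = merged[-1]
            # Compare against the latest end seen so far in the group, so a
            # short row contained in an earlier range cannot split the group.
            if row_id == last_id and (start - last_end).days <= max_gap_days:
                merged[-1] = (last_id, last_start, max(last_end, end))
                continue
        merged.append((row_id, start, end))
    return merged

rows = [
    (1, date(2006, 2, 1), date(2006, 2, 28)),
    (1, date(2006, 3, 1), date(2006, 3, 31)),
    (1, date(2006, 4, 1), date(2006, 4, 30)),
    (1, date(2006, 5, 1), date(2006, 5, 31)),
    (1, date(2006, 6, 1), date(2006, 6, 30)),
    (1, date(2006, 6, 10), date(2006, 6, 12)),
    (1, date(2006, 7, 1), date(2006, 7, 31)),
    (1, date(2006, 8, 1), date(2006, 8, 31)),
    (1, date(2006, 9, 1), date(2006, 9, 30)),
    (2, date(2006, 4, 1), date(2006, 4, 30)),
    (2, date(2006, 5, 1), date(2006, 5, 31)),
    (2, date(2006, 9, 1), date(2006, 9, 30)),
    (2, date(2006, 10, 1), date(2006, 10, 31)),
]
merged = merge_ranges(rows)
for row in merged:
    print(row)
```

In dplyr terms, the equivalent change would likely be a grouped cumulative maximum of the end date in place of lag(end), but I have not tested that on the 20 million row table.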
I have the below table: I also have a calendar table if needed.
ID Start_dt End_dt
1 1/9/2016 3/10/2016
Expected Output:
ID Start_dt End_dt Month ActiveCustomerPerMonth
1 1/9/16 3/10/2016 201601 1
1 1/9/16 3/10/2016 201602 1
1 1/9/16 3/10/2016 201603 0 (Not Active end of Month)
I need this because I'm working on a query that will use a CASE statement to count whether the customer was active for each month. If the member was active on the last day of the month, the member is considered active for that month. But I need to be able to count all months for that customer.
CASE
WHEN LAST_DAY(x.END_DT) = x.END_DT
THEN '1'
WHEN TO_CHAR(X.END_DT,'MM/DD/YYYY') != '01/01/3000'
OR X.DISCHARGE_REASON IS NOT NULL
THEN '0'
WHEN X.FIRST_ASSGN_DT IS NULL
THEN '0'
ELSE '1'
END ActiveMemberForMonth
I'm new to Oracle and was reading about CONNECT BY, but I did not understand the process and am not sure whether this is the right place to use it.
Something like this.
with
test_data ( id, start_dt, end_dt ) as (
select 1, to_date('1/9/2016' , 'mm/dd/yyyy'), to_date('3/10/2016', 'mm/dd/yyyy')
from dual union all
select 2, to_date('1/23/2016', 'mm/dd/yyyy'), to_date('5/31/2016', 'mm/dd/yyyy')
from dual
)
-- end of test data; solution (SQL query) begins below this line
select id, start_dt, end_dt,
to_char(add_months(trunc(start_dt, 'mm'), level - 1), 'yyyymm') as mth,
case when end_dt < last_day(end_dt)
and level = 1 + months_between(trunc(end_dt, 'mm'), trunc(start_dt, 'mm'))
then 0 else 1 end as active_at_month_end
from test_data
connect by level <= 1 + months_between(trunc(end_dt, 'mm'), trunc(start_dt, 'mm'))
and prior id = id
and prior sys_guid() is not null
order by id, mth -- optional
;
ID START_DT END_DT MTH ACTIVE_AT_MONTH_END
--- ---------- ---------- ------ -------------------
1 2016-01-09 2016-03-10 201601 1
1 2016-01-09 2016-03-10 201602 1
1 2016-01-09 2016-03-10 201603 0
2 2016-01-23 2016-05-31 201601 1
2 2016-01-23 2016-05-31 201602 1
2 2016-01-23 2016-05-31 201603 1
2 2016-01-23 2016-05-31 201604 1
2 2016-01-23 2016-05-31 201605 1
8 rows selected.
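Since the question mentions not following how connect by works here, the per-row logic of the query can be sketched in plain Python: walk from the start month to the end month, and mark only the final month as inactive when end_dt falls before that month's last day. The function name months_active is hypothetical.

```python
import calendar
from datetime import date

def months_active(start_dt, end_dt):
    """Return (yyyymm, active_flag) for every month from start_dt's month to end_dt's month."""
    result = []
    year, month = start_dt.year, start_dt.month
    last_day_of_end_month = calendar.monthrange(end_dt.year, end_dt.month)[1]
    while (year, month) <= (end_dt.year, end_dt.month):
        is_final_month = (year, month) == (end_dt.year, end_dt.month)
        # Inactive only when the range ends before the last day of its final month.
        active = 0 if is_final_month and end_dt.day < last_day_of_end_month else 1
        result.append((f"{year}{month:02d}", active))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return result

first = months_active(date(2016, 1, 9), date(2016, 3, 10))
second = months_active(date(2016, 1, 23), date(2016, 5, 31))
print(first)
print(second)
```

Each loop iteration corresponds to one value of level in the query, and the loop bound matches the connect by condition on months_between.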