I am currently trying to improve on a query that is being used to build a view. The query is in PL/SQL, in an Oracle database. There are 3 different types of reports (100,200, and 300) that are generated at each building. We track the consecutive years that each report is generated, and based on the combination of (1) the type(s) of report(s) generated for a given year and (2) the consecutive years each report has been generated, we arrive at a Result type for that building.
Here is a description of the criteria for the Result types:
Result 600 - If all 3 report types have been generated in the current year, where:
Level 1: all reports were generated in 1 consecutive year (this is the first year)
Level 2: at least 1 report type has been generated for 2 consecutive years (none have 3 consecutive years)
Level 3: at least one report type has been generated for 3 consecutive years
Result 100 - Only report type 100 has been generated in the current year, where:
Level 1 - 1 consecutive year
Level 2 - 2 consecutive years
Level 3 - 3 consecutive years
Result 200 - Only report type 200 has been generated in the current year, where:
Level 1 - 1 consecutive year
Level 2 - 2 consecutive years
Level 3 - 3 consecutive years
Result 300 - Only report type 300 has been generated in the current year, where:
Level 1 - 1 consecutive year
Level 2 - 2 consecutive years
Level 3 - 3 consecutive years
Result 400 - Only reports 100 and 200 have been generated, where:
Level 1: both reports were generated in 1 consecutive year (this is the first year)
Level 2: at least 1 report type has been generated for 2 consecutive years (neither have 3 consecutive years)
Level 3: at least one report type has been generated for 3 consecutive years
Result 500 - Only reports 100 and 300 have been generated, where:
Level 1: both reports were generated in 1 consecutive year (this is the first year)
Level 2: at least 1 report type has been generated for 2 consecutive years (neither have 3 consecutive years)
Level 3: at least one report type has been generated for 3 consecutive years
Result 700 - Only reports 200 and 300 have been generated, where:
Level 1: both reports were generated in 1 consecutive year (this is the first year)
Level 2: at least 1 report type has been generated for 2 consecutive years (neither have 3 consecutive years)
Level 3: at least one report type has been generated for 3 consecutive years
Still with me? Sweet. So here is the current code that is used to generate this view, which is simply a display of the result:
CREATE OR REPLACE FORCE VIEW REPORTS.REPORT_RESULT_VIEW
(
BUILDING,
BUILDING_NAME,
GROUP,
YEAR,
TYPE,
SUBTYPE,
CONSEC,
RESULT
)
AS
WITH cte1
AS (SELECT 1 ID_100,
1 ID_200,
1 ID_300,
'600 Level 1' RESULT
FROM DUAL
UNION ALL
SELECT 2 ID_100,
2 ID_200,
2 ID_300,
'600 Level 2' RESULT
FROM DUAL
UNION ALL
SELECT 3 ID_100,
3 ID_200,
3 ID_300,
'600 Level 3' RESULT
FROM DUAL
UNION ALL
SELECT 1 ID_100,
1 ID_200,
2 ID_300,
'600 Level 2' RESULT
FROM DUAL),
-- (note: the actual code lists 63 combinations in total; only the first few are shown here to give an idea of how it is set up)
cte2
AS ( SELECT MAX (ID_100) ID_100_CONSEC,
MAX (ID_200) ID_200_CONSEC,
MAX (ID_300) ID_300_CONSEC,
YEAR,
BUILDING
FROM (SELECT CONSEC ID_100,
NULL ID_200,
NULL ID_300,
YEAR,
TYPE || SUBTYPE TYPE,
BUILDING
FROM REPORT_MASTER_VIEW
WHERE TYPE || SUBTYPE = '100'
UNION
SELECT NULL ID_100,
CONSEC ID_200,
NULL ID_300,
YEAR,
TYPE || SUBTYPE TYPE,
BUILDING
FROM REPORT_MASTER_VIEW
WHERE TYPE || SUBTYPE = '200'
UNION
SELECT NULL ID_100,
NULL ID_200,
CONSEC ID_300,
YEAR,
TYPE || SUBTYPE TYPE,
BUILDING
FROM REPORT_MASTER_VIEW
WHERE TYPE || SUBTYPE = '300')
GROUP BY YEAR, BUILDING),
cte3
AS (SELECT c2.*, c1.RESULT
FROM cte2 c2
JOIN
cte1 c1
ON NVL (c2.ID_100_CONSEC, 0) = c1.ID_100
AND NVL (c2.ID_200_CONSEC, 0) = c1.ID_200
AND NVL (c2.ID_300_CONSEC, 0) = c1.ID_300)
SELECT t1."BUILDING",
t1."BUILDING_NAME",
t1."GROUP",
t1."YEAR",
t1."TYPE",
t1."SUBTYPE",
t1."CONSEC",
t2.RESULT
FROM REPORT_MASTER_VIEW t1
JOIN
cte3 t2
ON t1.BUILDING = t2.BUILDING AND t1.YEAR = t2.YEAR
WHERE T1.TYPE IN ('100', '200', '300')
ORDER BY t1.BUILDING;
Now, because every report combination has to be matched against all the possible combinations, this view takes about 24 seconds to build. In the app that references it, the page takes nearly a minute to load. For this reason, I am trying to figure out ways to make the query more efficient. At first, I was thinking of using nested CASE statements, but I wasn't really sure how that would work. Any suggestions on how to better approach this are greatly appreciated.
WITH cte2 AS
(
SELECT MAX(CASE WHEN TYPE || SUBTYPE = '100' THEN CONSEC END) AS ID_100_CONSEC
, MAX(CASE WHEN TYPE || SUBTYPE = '200' THEN CONSEC END) AS ID_200_CONSEC
, MAX(CASE WHEN TYPE || SUBTYPE = '300' THEN CONSEC END) AS ID_300_CONSEC
, YEAR
, BUILDING
FROM REPORT_MASTER_VIEW
WHERE TYPE || SUBTYPE IN ('100', '200', '300')
GROUP BY
YEAR
, BUILDING
)
,cte3 AS
(
SELECT ID_100_CONSEC
, ID_200_CONSEC
, ID_300_CONSEC
, YEAR
, BUILDING
, CASE
WHEN ID_100_CONSEC IS NOT NULL AND ID_200_CONSEC IS NOT NULL AND ID_300_CONSEC IS NOT NULL THEN '600'
WHEN ID_100_CONSEC IS NOT NULL AND ID_200_CONSEC IS NOT NULL THEN '400'
WHEN ID_100_CONSEC IS NOT NULL AND ID_300_CONSEC IS NOT NULL THEN '500'
WHEN ID_200_CONSEC IS NOT NULL AND ID_300_CONSEC IS NOT NULL THEN '700'
WHEN ID_100_CONSEC IS NOT NULL THEN '100'
WHEN ID_200_CONSEC IS NOT NULL THEN '200'
ELSE '300'
END
-- the level is the longest run among the reports present, capped at 3 (assumed from the criteria)
|| ' Level ' || LEAST(GREATEST(NVL(ID_100_CONSEC, 0), NVL(ID_200_CONSEC, 0), NVL(ID_300_CONSEC, 0)), 3) AS RESULT
FROM cte2 c2
WHERE c2.ID_100_CONSEC IN (1, 2, 3)
OR c2.ID_200_CONSEC IN (1, 2, 3)
OR c2.ID_300_CONSEC IN (1, 2, 3)
)
SELECT t1."BUILDING",
t1."BUILDING_NAME",
t1."GROUP",
t1."YEAR",
t1."TYPE",
t1."SUBTYPE",
t1."CONSEC",
t2.RESULT
FROM REPORT_MASTER_VIEW t1
JOIN cte3 t2
ON t1.BUILDING = t2.BUILDING
AND t1.YEAR = t2.YEAR
WHERE t1.TYPE IN ('100', '200', '300')
ORDER BY
t1.BUILDING;
Should be pretty close. I think you will need to polish it a little.
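The criteria in the question seem to reduce to two independent pieces: which report types are present decides the result code, and the longest consecutive run decides the level. Here is a minimal Python sketch of that rule, just to make it easy to check against the 63-row lookup table. The function name `classify` and the cap at level 3 are my assumptions, not something taken from the original view.

```python
# Result code keyed by which of reports (100, 200, 300) exist for the year.
RESULT_CODES = {
    (True, True, True): 600,
    (True, False, False): 100,
    (False, True, False): 200,
    (False, False, True): 300,
    (True, True, False): 400,
    (True, False, True): 500,
    (False, True, True): 700,
}


def classify(consec_100, consec_200, consec_300):
    """Return e.g. '600 Level 2', or None when no report was generated.

    Each argument is the consecutive-year count for that report type,
    or None (or 0) when the report was not generated this year.
    """
    runs = [c or 0 for c in (consec_100, consec_200, consec_300)]
    present = tuple(run > 0 for run in runs)
    if present not in RESULT_CODES:
        return None  # no report at all for this building and year
    # Assumption: runs longer than 3 years still count as Level 3.
    level = min(max(runs), 3)
    return f"{RESULT_CODES[present]} Level {level}"


print(classify(1, 1, 2))        # all three present, longest run is 2
print(classify(3, None, None))  # only report 100, 3 consecutive years
```

If this matches every row of the lookup table, the join against cte1 can be replaced by a single expression per row.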
I have a big table which is 100k rows in size and the PRIMARY KEY is of the datatype NUMBER. The way data is populated in this column is using a random number generator.
So my question is: is there a SQL query that can help me partition the table evenly by ranges of values? E.g., if my column values look like this:
1
2
3
4
5
6
7
8
9
10
And I would like this to be broken into three partitions, then I would expect an output like this:
Range 1 1-3
Range 2 4-7
Range 3 8-10
It sounds like you want the WIDTH_BUCKET() function.
This query will give you the start and end range for a table of 1250 rows split into 20 buckets based on id:
with bkt as (
select id
, width_bucket(id, 1, 1251, 20) as id_bucket
from t23
)
select id_bucket
, min(id) as bkt_start
, max(id) as bkt_end
, count(*)
from bkt
group by id_bucket
order by 1
;
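The two middle parameters specify the minimum and maximum bounds; the last parameter specifies the number of buckets. WIDTH_BUCKET divides the value range between those bounds into the specified number of equal-width intervals, so the rows are split evenly only when the ids are spread uniformly across the range. The upper bound is exclusive (a value equal to it lands in an overflow bucket), which is presumably why the example uses 1251 rather than 1250. Be careful with the min and max parameters; I've found poorly chosen bounds can have an odd effect on the split.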
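To see how the bounds affect the split, here is a small Python sketch that mimics the bucket arithmetic for an ascending range (lower bound below upper bound). The helper names `width_bucket` and `bucket_ranges` are mine; this is an illustration of the behaviour described above, not Oracle's implementation.

```python
def width_bucket(value, low, high, buckets):
    """Bucket number for value over [low, high) cut into equal-width buckets."""
    if value < low:
        return 0                 # underflow bucket
    if value >= high:
        return buckets + 1       # overflow bucket (upper bound is exclusive)
    return int((value - low) * buckets // (high - low)) + 1


def bucket_ranges(ids, low, high, buckets):
    """Map each bucket number to (smallest id, largest id, row count)."""
    summary = {}
    for value in ids:
        bucket = width_bucket(value, low, high, buckets)
        lo, hi, count = summary.get(bucket, (value, value, 0))
        summary[bucket] = (min(lo, value), max(hi, value), count + 1)
    return dict(sorted(summary.items()))


# ids 1..10 in 3 buckets, with the upper bound one past the largest id
print(bucket_ranges(range(1, 11), 1, 11, 3))
```

For ids 1 to 10 this gives ranges 1-4, 5-7 and 8-10, which is an equal-width split of the value range and not quite the 1-3, 4-7, 8-10 split asked for.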
This solution works without the width_bucket function. While it is more verbose and certainly less efficient, it will split the data as evenly as possible by row count, even if some ID values are missing.
CREATE TABLE t AS
SELECT rownum AS id
FROM dual
CONNECT BY level <= 10;
WITH
data AS (
SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS row_num -- number the rows in id order; plain rownum does not guarantee an order
FROM t
),
total AS (
SELECT count(*) AS total_rows
FROM data
),
parts AS (
SELECT rownum as part_no, total.total_rows, total.total_rows / 3 as part_rows
FROM dual, total
CONNECT BY level <= 3
),
bounds AS (
SELECT parts.part_no,
parts.total_rows,
parts.part_rows,
COALESCE(LAG(data.row_num) OVER (ORDER BY parts.part_no) + 1, 1) AS start_row_num,
data.row_num AS end_row_num
FROM data
JOIN parts
ON data.row_num = ROUND(parts.part_no * parts.part_rows, 0)
)
SELECT bounds.part_no, d1.ID AS start_id, d2.ID AS end_id
FROM bounds
JOIN data d1
ON d1.row_num = bounds.start_row_num
JOIN data d2
ON d2.row_num = bounds.end_row_num
ORDER BY bounds.part_no;
PART_NO START_ID END_ID
---------- ---------- ----------
1 1 3
2 4 7
3 8 10
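The same idea (number the rows in id order, then cut at the rounded multiples of total/parts) can be sketched in Python to check the boundaries by hand. The function name `split_evenly` is hypothetical, and the rounding uses integer arithmetic to round halves up, which is how I understand Oracle's ROUND to behave for positive numbers.

```python
def split_evenly(ids, parts):
    """Return (part_no, start_id, end_id) tuples with near-equal row counts."""
    ordered = sorted(ids)
    total = len(ordered)
    if total < parts:
        raise ValueError("need at least one row per part")
    result = []
    start_row = 1  # 1-based row numbers, like ROW_NUMBER()
    for part_no in range(1, parts + 1):
        # round(part_no * total / parts), with halves rounded up
        end_row = (2 * part_no * total + parts) // (2 * parts)
        result.append((part_no, ordered[start_row - 1], ordered[end_row - 1]))
        start_row = end_row + 1
    return result


print(split_evenly(range(1, 11), 3))
print(split_evenly([2, 5, 9, 14, 20, 21], 3))  # gaps in the ids do not matter
```

Because the cut points are row positions and not values, missing ids only change the reported start and end values, not how many rows each part holds.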
I have a table that contains two timestamps, t1 (event open date) and t2 (event close date), and a primary key eventid.
While an event is open, t2 is null; when the event is closed, the same row is updated with the closure date in t2.
For example, I want to check how many issues are open on each day, based on the opened date (t1), from 01-apr-2016 to 10-apr-2016.
I have to calculate how many events are open for every day in a selected date range.
Let's say eventid 1 was opened on 1st-APR and closed on 10th-APR, and I am calculating the number of open issues for every day on 11th-APR; then it should count event 1 as open from 1st-APR to 10th-APR.
Table Structure:-
================================================
EVENTID T1 T2
================================================
1 01-apr-2016 10-apr-2016
2 02-apr-2016 08-apr-2016
3 05-apr-2016 09-apr-2016
Expected Output:-
==============================================================================
DATE TOTAL_OPEN_EVENTS
==============================================================================
01-apr-2016 1
02-apr-2016 2(1 issue open on 1st(not closed on 2nd) and 1 on 2nd)
03-apr-2016 2
04-apr-2016 2
05-apr-2016 3
06-apr-2016 3
07-apr-2016 3
08-apr-2016 2(1 issue got closed on 8th(which was opened on 2nd))
09-apr-2016 2
10-apr-2016 0
How can I do this kind of calculation in an Oracle database?
In order to generate the end report, you need a row for each date in your desired range. You could either use a calendar table, if available, or a query on DUAL using CONNECT BY LEVEL <= some_number, which I find works well to generate rows on the fly. (In this case "some_number" will be the number of days you want to report on.)
From there, you just need to join the individual dates to the date ranges in your event table:
-- create the "events" table (format masks now match the 'YYYY-MON-DD' literals)
create table event_date_ranges
as
select 1 as event_id, TO_DATE('2016-APR-01', 'YYYY-MON-DD') as start_date, TO_DATE('2016-APR-10', 'YYYY-MON-DD') as end_date from dual
union all
select 2 as event_id, TO_DATE('2016-APR-02', 'YYYY-MON-DD') as start_date, TO_DATE('2016-APR-08', 'YYYY-MON-DD') as end_date from dual
union all
select 3 as event_id, TO_DATE('2016-APR-05', 'YYYY-MON-DD') as start_date, TO_DATE('2016-APR-09', 'YYYY-MON-DD') as end_date from dual
;
with
date_range_qry as
(-- one way to set the start and end dates for your report
select TO_DATE('2016-APR-01', 'YYYY-MON-DD') as report_start_date
, TO_DATE('2016-APR-10', 'YYYY-MON-DD') as report_end_date
from dual
)
, dates_qry
as
(
-- generate a row for all dates between 2016-APR-01 and 2016-APR-10
select report_start_date + ROWNUM - 1 as report_date
from dual
cross join
date_range_qry drq
connect by level <= (drq.report_end_date - drq.report_start_date + 1)
)
select dq.report_date, count(edr.event_id) as total_open_events
from dates_qry dq
left outer join
event_date_ranges edr
on dq.report_date >= edr.start_date
and dq.report_date < edr.end_date
group by dq.report_date
order by dq.report_date
Output:
REPORT_DATE TOTAL_OPEN_EVENTS
2016-APR-01 1
2016-APR-02 2
2016-APR-03 2
2016-APR-04 2
2016-APR-05 3
2016-APR-06 3
2016-APR-07 3
2016-APR-08 2
2016-APR-09 1
2016-APR-10 0
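Note that the query above treats the close date as exclusive, and that a row whose close date is still null (an event that has not been closed yet) would never satisfy the join condition. Below is a minimal Python sketch of the same counting rule that also treats a missing close date as "still open". The function name and the None convention are my assumptions for illustration.

```python
from datetime import date, timedelta


def open_events_per_day(events, start, end):
    """Count open events for each day from start to end inclusive.

    events is a list of (opened, closed) dates; closed is None while the
    event is still open. An event counts as open on day d when
    opened <= d and d is before the close date.
    """
    counts = []
    day = start
    while day <= end:
        open_count = sum(
            1
            for opened, closed in events
            if opened <= day and (closed is None or day < closed)
        )
        counts.append((day, open_count))
        day += timedelta(days=1)
    return counts


events = [
    (date(2016, 4, 1), date(2016, 4, 10)),
    (date(2016, 4, 2), date(2016, 4, 8)),
    (date(2016, 4, 5), date(2016, 4, 9)),
]
for day, open_count in open_events_per_day(events, date(2016, 4, 1), date(2016, 4, 10)):
    print(day.isoformat(), open_count)
```

In the SQL, the equivalent change would be to let a null end date pass the upper-bound test, for example by adding an `or edr.end_date is null` branch to that condition.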
You can try this:
create table events_log
as
select 1 as event_id, TO_DATE('01/04/2016', 'DD/MM/YYYY') as T1, TO_DATE('10/04/2016', 'DD/MM/YYYY') as T2 from dual
union all
select 2 as event_id, TO_DATE('02/04/2016', 'DD/MM/YYYY') as T1, TO_DATE('08/04/2016', 'DD/MM/YYYY') as T2 from dual
union all
select 3 as event_id, TO_DATE('05/04/2016', 'DD/MM/YYYY') as T1, TO_DATE('09/04/2016', 'DD/MM/YYYY') as T2 from dual
;
--------------
select v.REPORT_DATE, count(t.EVENT_ID) as open_event
from events_log t,
(select to_date('01/04/2016', 'DD/MM/YYYY') + ROWNUM - 1 as report_date
from dual
connect by level <= (to_date('11/04/2016', 'DD/MM/YYYY') -
to_date('01/04/2016', 'DD/MM/YYYY') + 1)) v
where t.T1(+) <= v.report_date
and t.T2(+) >= v.report_date
group by v.report_date
order by v.report_date;
Output will be (note that this version still counts an event as open on its closing date):
report_date open_event
01/04/2016 1
02/04/2016 2
03/04/2016 2
04/04/2016 2
05/04/2016 3
06/04/2016 3
07/04/2016 3
08/04/2016 3
09/04/2016 2
10/04/2016 1
11/04/2016 0
I'm having trouble trying to wrap my head around how to write this query to calculate the occupancy level of a hotel and then list the results by date. Consider the following type of data from a table called reservations:
Arrival Departure Guest Confirmation
08/01/2015 08/05/2015 John 13234
08/01/2015 08/03/2015 Bob 34244
08/02/2015 08/03/2015 Steve 32423
08/02/2015 08/02/2015 Mark 32411
08/02/2015 08/04/2015 Jenny 24422
Output Data would ideally look like:
Date Occupancy
08/01/2015 2
08/02/2015 4
08/03/2015 2
08/04/2015 1
08/05/2015 0
And the query should be able to take a date range as a variable. The piece I'm having the most trouble with is getting the count per night and listing it by date.
You can generate a list of dates first. In Oracle you can do this by using connect by. This will make a recursive query. For instance, to get the next 30 days, you can select today and keep connecting until you've got the desired number of days. level indicates the level of recursion.
select trunc(sysdate) + level - 1 as THEDATE
from dual
connect by level <= 30;
On that list, you can query the number of reservations for each day in that period:
select THEDATE,
(select count(*)
from reservations r
where r.Arrival <= THEDATE and
r.Departure > THEDATE) as RESERVATIONCOUNT
from
( select trunc(sysdate) + level - 1 as THEDATE
from dual
connect by level <= 30)
Instead of a fixed number of dates, you can also derive the limit from the data, for instance to cover at least 30 days into the future, and further if there are reservations departing later:
select THEDATE,
(select count(*)
from reservations r
where r.Arrival <= THEDATE and
r.Departure > THEDATE) as RESERVATIONCOUNT
from
( select trunc(sysdate) + level - 1 as THEDATE
from dual
connect by
level <= greatest(30, (select trunc(max(DEPARTURE) - sysdate)
from reservations)))
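A guest occupies a room on a given night when they arrived on or before that date and depart after it, so a same-day arrival and departure counts for no night. The sketch below applies that rule, together with the "at least N days, or up to the last departure" horizon, to the sample data from the question. The function name is hypothetical, and Jenny's departure is taken as 08/04/2015 on the assumption that the 2014 in the question is a typo.

```python
from datetime import date, timedelta


def occupancy_by_night(reservations, first_day, min_days=30):
    """Return {date: guests staying that night} starting at first_day.

    reservations is a list of (arrival, departure) dates. The horizon is
    at least min_days, extended to the latest departure if that is later.
    """
    last_departure = max((dep for _, dep in reservations), default=first_day)
    days = max(min_days, (last_departure - first_day).days)
    result = {}
    for offset in range(days):
        night = first_day + timedelta(days=offset)
        # occupied when arrival <= night < departure
        result[night] = sum(1 for arr, dep in reservations if arr <= night < dep)
    return result


reservations = [
    (date(2015, 8, 1), date(2015, 8, 5)),  # John
    (date(2015, 8, 1), date(2015, 8, 3)),  # Bob
    (date(2015, 8, 2), date(2015, 8, 3)),  # Steve
    (date(2015, 8, 2), date(2015, 8, 2)),  # Mark, no overnight stay
    (date(2015, 8, 2), date(2015, 8, 4)),  # Jenny (year assumed to be 2015)
]
for night, guests in occupancy_by_night(reservations, date(2015, 8, 1), min_days=5).items():
    print(night.isoformat(), guests)
```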
I have gone through a similar post in Stack overflow...
but my question is:
If my query generates a flag at run time, how can I increase a Grp_number (also generated at run time) every time the flag changes?
my Oracle query:
Select emp_id,
Case when MOD(rownum, 3)=1 and rownum>1 then 'Y' else 'N' end as flag
from Transaction_table
Desired output format:
emp_id Flag GRP_number
1 N 1
2 N 1
3 N 1
4 Y 2
5 N 2
6 N 2
7 Y 3
You cannot reference a column alias elsewhere in the same select list that defines it. Wrap the query in a subquery (or a WITH clause) to avoid the INVALID IDENTIFIER error.
For example:
WITH DATA AS(
SELECT emp_id,
CASE
WHEN MOD(rownum, 3) = 1
AND rownum > 1
THEN 'Y'
ELSE 'N'
END AS flag
FROM Transaction_table
)
-- running total of 'Y' rows, so a new group starts at each 'Y' as in the desired output
SELECT emp_id, flag, 1 + SUM(gap) over (ORDER BY emp_id) grp
FROM(
SELECT emp_id, flag,
CASE WHEN flag = 'Y'
THEN 1
ELSE 0
END gap
FROM DATA)
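The desired output starts a new group at every 'Y' row, which amounts to a running count of the 'Y' flags seen so far. A tiny Python sketch of that running count, using a hypothetical helper name, shows the expected numbers for seven rows:

```python
def group_numbers(flags):
    """Return a group number per row; the number goes up at each 'Y'."""
    groups = []
    current = 1
    for flag in flags:
        if flag == "Y":
            current += 1  # a 'Y' row opens the next group
        groups.append(current)
    return groups


# Same flag rule as the query: 'Y' on rows 4, 7, 10, ...
flags = ["Y" if (n % 3 == 1 and n > 1) else "N" for n in range(1, 8)]
print(list(zip(range(1, 8), flags, group_numbers(flags))))
```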
I have a calendar table in which there are all the dates in the future and a workday field:
fld_date / fld_workday
2014-01-01 / 1
2014-01-02 / 1
2014-01-03 / 0
...
I want to select the date that is n workdays away from another date. I tried two ways, but both failed:
The 5th workday from 2014-11-07:
1.
SELECT n1.fld_date FROM calendar as n1 WHERE n1.fld_workday=1 AND
(select count(*) FROM calendar as n2 WHERE n2.fld_date>='2014-11-07' AND n2.fld_workday=1)=5
It returned 0 rows.
2.
SELECT fld_date FROM calendar WHERE fld_date>='2014-11-07' AND fld_workday=1 LIMIT 1 OFFSET 5
This works, but I would like to replace the constant 5 with a field, and that is not possible (it would be inside a bigger select statement):
SELECT fld_date FROM calendar WHERE fld_date>='2014-11-07' AND fld_workday=1 LIMIT 1 OFFSET fld_another_field
Any suggestion?
In the first query, the subquery does not refer to the row in n1.
You need a correlated subquery:
SELECT fld_Date
FROM Calendar AS n1
WHERE fld_WorkDay = 1
AND (SELECT COUNT(*)
FROM Calendar AS n2
WHERE fld_Date BETWEEN '2014-11-07' AND n1.fld_Date
AND fld_WorkDay = 1
) = 5
LIMIT 1
The subquery is extremely inefficient if there is no index on the fld_Date column.
You can avoid executing the subquery for every row in n1 by adding another condition with an estimate of the result date (assuming that there are between about four and five work days per week, so that n work days span roughly n*7/5 to n*7/4 calendar days, and using a few extra days to be sure):
...
WHERE fld_Date BETWEEN date('2014-11-07', (5 * 7/5 - 10) || ' days')
AND date('2014-11-07', (5 * 7/4 + 10) || ' days')
AND fld_WorkDay = 1
AND (SELECT ...
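To try the correlated subquery with n as a parameter instead of a constant, here is a small self-contained sketch using Python's sqlite3 module and an in-memory calendar. It assumes Monday to Friday are the workdays and counts the start date itself when it is a workday; the helper names `build_calendar` and `nth_workday` are hypothetical.

```python
import sqlite3
from datetime import date, timedelta


def build_calendar(conn, start, days):
    """Create the calendar table and fill it with consecutive dates."""
    conn.execute(
        "CREATE TABLE calendar (fld_date TEXT PRIMARY KEY, fld_workday INTEGER)"
    )
    rows = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        # Assumption for this demo: Monday..Friday are workdays
        rows.append((day.isoformat(), 1 if day.weekday() < 5 else 0))
    conn.executemany("INSERT INTO calendar VALUES (?, ?)", rows)


def nth_workday(conn, start_iso, n):
    """Return the n-th workday on or after start_iso, or None if not found."""
    row = conn.execute(
        """
        SELECT n1.fld_date
        FROM calendar AS n1
        WHERE n1.fld_workday = 1
          AND n1.fld_date >= ?
          AND (SELECT COUNT(*)
               FROM calendar AS n2
               WHERE n2.fld_date BETWEEN ? AND n1.fld_date
                 AND n2.fld_workday = 1) = ?
        LIMIT 1
        """,
        (start_iso, start_iso, n),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
build_calendar(conn, date(2014, 11, 1), 60)
print(nth_workday(conn, "2014-11-07", 5))
```

Inside a bigger statement, the bound parameter for n would simply be replaced by the column that holds the offset, since the subquery comparison accepts any expression.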