Slow SQL Server Query due to calculations? - asp.net

I am responsible for an old time recording system which was written in ASP.net Web Forms using ADO.Net 2.0 for persistence.
Basically the system allows users to add details about a piece of work they are doing, the amount of hours they have been assigned to complete the work as well as the amount of hours they have spent on the work to date.
The system also has a reporting facility with the reports based on SQL queries. Recently I have noticed that many reports being run from the system have become very slow to execute. The database has around 11 tables, and it doesn’t store too much data. 27,000 records is the most records any one table holds, with the majority of tables well below even 1,500 records.
I therefore don’t think the issue is related to large volumes of data. I think it has more to do with poorly constructed SQL queries, and possibly with the database design as well.
For example, there are queries similar to this:
@start_date datetime,
@end_date datetime,
@org_id int
select distinct t1.timesheet_id,
t1.proposal_job_ref,
t1.work_date AS [Work Date],
consultant.consultant_fname + ' ' + consultant.consultant_lname AS [Person],
proposal.proposal_title AS [Work Title],
t1.timesheet_time AS [Hours],
--GET TOTAL DAYS ASSIGNED TO PROPOSAL
(select sum(proposal_time_assigned.days_assigned)-- * 8.0)
from proposal_time_assigned
where proposal_time_assigned.proposal_ref_code = t1.proposal_job_ref )
as [Total Days Assigned],
--GET TOTAL DAYS SPENT ON THE PROPOSAL SINCE 1ST APRIL 2013
(select isnull(sum(t2.timesheet_time / 8.0), '0')
from timesheet_entries t2
where t2.proposal_job_ref = t1.proposal_job_ref
and t2.work_date <= t1.work_date
and t2.work_date >= '01/04/2013' )
as [Days Spent Since 1st April 2013],
--GET TOTAL DAYS REMAINING ON THE PROPOSAL
(select sum(proposal_time_assigned.days_assigned)
from proposal_time_assigned
where proposal_time_assigned.proposal_ref_code = t1.proposal_job_ref )
-
(select sum(t2.timesheet_time / 8.0)
from timesheet_entries t2
where t2.proposal_job_ref = t1.proposal_job_ref
and t2.work_date <= t1.work_date
) as [Total Days Remaining]
from timesheet_entries t1,
consultant,
proposal,
proposal_time_assigned
where (proposal_time_assigned.consultant_id = consultant.consultant_id)
and (t1.proposal_job_ref = proposal.proposal_ref_code)
and (proposal_time_assigned.proposal_ref_code = t1.proposal_job_ref)
and (t1.code_id = @org_id) and (t1.work_date >= @start_date) and (t1.work_date <= @end_date)
and (t1.proposal_job_ref <> '0')
order by 2, 3
These queries are expected to return data for reports. I am not even sure if anyone can follow what is happening in the query above, but basically there are quite a few calculations happening, i.e., division, multiplication and subtraction. I am guessing this is what is slowing down the SQL queries.
I suppose my question is: can anyone make enough sense of the query above to suggest how to speed it up?
Also, should calculations like the ones mentioned above ever be carried out in an SQL query? Or should this be done within code?
Any help would be really appreciated with this one.
Thanks.

Based on the information given, I had to make an educated guess about certain table relationships. If you post the table structures, indexes, etc., we can add the remaining columns to this query.
As of right now, this query calculates "Days Assigned", "Days Spent" and "Days Remaining"
for the key "timesheet_id and proposal_job_ref".
What we have to see is how "work_date", "timesheet_time", "[Person]" and "proposal_title" are associated with that.
Are these calculations by person and proposal_title as well?
You can use sqlfiddle to give us sample data and the expected output, so we can work from meaningful data instead of guessing.
SELECT
q1.timesheet_id
,q1.proposal_job_ref
,q1.[Total Days Assigned]
,q2.[Days Spent Since 1st April 2013]
,(
q1.[Total Days Assigned]
-
q2.[Days Spent Since 1st April 2013]
) AS [Total Days Remaining]
FROM
(
select
t1.timesheet_id
,t1.proposal_job_ref
,sum(t4.days_assigned) as [Total Days Assigned]
from tbl1.timesheet_entries t1
JOIN tbl1.proposal t2
ON t1.proposal_job_ref=t2.proposal_ref_code
JOIN tbl1.proposal_time_assigned t4
ON t4.proposal_ref_code = t1.proposal_job_ref
JOIN tbl1.consultant t3
ON t3.consultant_id=t4.consultant_id
WHERE t1.code_id = @org_id
AND t1.work_date BETWEEN @start_date AND @end_date
AND t1.proposal_job_ref <> '0'
GROUP BY t1.timesheet_id,t1.proposal_job_ref
)q1
JOIN
(
select
tbl1.timesheet_id,tbl1.proposal_job_ref
,isnull(sum(tbl2.timesheet_time / 8.0), 0) AS [Days Spent Since 1st April 2013]
from tbl1.timesheet_entries tbl1
JOIN tbl1.timesheet_entries tbl2
ON tbl1.proposal_job_ref=tbl2.proposal_job_ref
AND tbl2.work_date <= tbl1.work_date
AND tbl2.work_date >= '01/04/2013'
WHERE tbl1.code_id = @org_id
AND tbl1.work_date BETWEEN @start_date AND @end_date
AND tbl1.proposal_job_ref <> '0'
GROUP BY tbl1.timesheet_id,tbl1.proposal_job_ref
)q2
ON q1.timesheet_id=q2.timesheet_id
AND q1.proposal_job_ref=q2.proposal_job_ref

The problems I see in your query are:
1> Alias names are not provided for the tables.
2> Subqueries are used (which are costly to execute) instead of a WITH clause.
If I were to write your query, it would look like this:
select distinct t1.timesheet_id,
t1.proposal_job_ref,
t1.work_date AS [Work Date],
c1.consultant_fname + ' ' + c1.consultant_lname AS [Person],
p1.proposal_title AS [Work Title],
t1.timesheet_time AS [Hours],
--GET TOTAL DAYS ASSIGNED TO PROPOSAL
(select sum(pta2.days_assigned)-- * 8.0)
from proposal_time_assigned pta2
where pta2.proposal_ref_code = t1.proposal_job_ref )
as [Total Days Assigned],
--GET TOTAL DAYS SPENT ON THE PROPOSAL SINCE 1ST APRIL 2013
(select isnull(sum(t2.timesheet_time / 8.0), 0)
from timesheet_entries t2
where t2.proposal_job_ref = t1.proposal_job_ref
and t2.work_date <= t1.work_date
and t2.work_date >= '01/04/2013' )
as [Days Spent Since 1st April 2013],
--GET TOTAL DAYS REMAINING ON THE PROPOSAL
(select sum(pta2.days_assigned)
from proposal_time_assigned pta2
where pta2.proposal_ref_code = t1.proposal_job_ref )
-
(select sum(t2.timesheet_time / 8.0)
from timesheet_entries t2
where t2.proposal_job_ref = t1.proposal_job_ref
and t2.work_date <= t1.work_date
) as [Total Days Remaining]
from timesheet_entries t1,
consultant c1,
proposal p1,
proposal_time_assigned pta1
where (pta1.consultant_id = c1.consultant_id)
and (t1.proposal_job_ref = p1.proposal_ref_code)
and (pta1.proposal_ref_code = t1.proposal_job_ref)
and (t1.code_id = @org_id) and (t1.work_date >= @start_date) and (t1.work_date <= @end_date)
and (t1.proposal_job_ref <> '0')
order by 2, 3
Check above query for any indexing option & number of records to be processed from each table.

Check your database for indexes on the following columns (if any of them is not indexed, start by indexing it).
proposal_time_assigned.proposal_ref_code
proposal_time_assigned.consultant_id
timesheet_entries.code_id
timesheet_entries.proposal_job_ref
timesheet_entries.work_date
consultant.consultant_id
proposal.proposal_ref_code
Without all of these indexes, nothing will improve this query.
The main thing in your query that would affect performance is the way you are filtering on [work_date]. Your current syntax is:
--current
and t2.work_date <= t1.work_date
and t2.work_date >= '01/04/2013'
The same filter can be written with BETWEEN, with the lower bound first. BETWEEN is shorthand for the same pair of comparisons, so either form can use an index on work_date (if it exists):
--equivalent
and t2.work_date between '01/04/2013' and t1.work_date
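On the question of whether the arithmetic itself is the problem: dividing by 8.0 is cheap, while a correlated subquery may be re-evaluated for every output row. A minimal sketch of the "aggregate once, then join" alternative is shown below. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, a cut-down version of the schema with made-up rows, and a window function for the running "days spent" figure, so it is an illustration under those assumptions rather than a drop-in replacement for the report query.

```python
import sqlite3

# In-memory stand-in database; table and column names follow the question,
# but the columns are reduced and the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE proposal_time_assigned (
    proposal_ref_code TEXT, consultant_id INTEGER, days_assigned REAL);
CREATE TABLE timesheet_entries (
    timesheet_id INTEGER PRIMARY KEY, proposal_job_ref TEXT,
    work_date TEXT, timesheet_time REAL);
INSERT INTO proposal_time_assigned VALUES ('P1', 1, 3.0), ('P1', 2, 2.0);
INSERT INTO timesheet_entries VALUES
    (1, 'P1', '2013-04-02', 8.0),
    (2, 'P1', '2013-04-03', 4.0);
""")

rows = conn.execute("""
WITH assigned AS (
    -- total days assigned, computed once per proposal
    SELECT proposal_ref_code, SUM(days_assigned) AS total_days
    FROM proposal_time_assigned
    GROUP BY proposal_ref_code
),
spent AS (
    -- running hours per proposal up to and including each work_date
    SELECT timesheet_id, proposal_job_ref,
           SUM(timesheet_time) OVER (
               PARTITION BY proposal_job_ref ORDER BY work_date
           ) / 8.0 AS days_spent
    FROM timesheet_entries
)
SELECT s.timesheet_id,
       a.total_days,
       s.days_spent,
       a.total_days - s.days_spent AS days_remaining
FROM spent s
JOIN assigned a ON a.proposal_ref_code = s.proposal_job_ref
ORDER BY s.timesheet_id
""").fetchall()

for row in rows:
    print(row)
```

Window functions need SQLite 3.25 or later here; SQL Server has supported SUM(...) OVER (... ORDER BY ...) since the 2012 release.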

Related

Complex calculation of the hours amount

I have a Pythonic system that stores student absences data in a SQLite database. Each row includes the start and end time of the absence, represented by the number of seconds since Jan 01 1970. I was asked to add a feature which limits the number of hours of absence per week.
It sounds easy to pull out the amount of hours, using a statement like this:
SELECT (sum(ending-starting)/3600)
FROM requests
WHERE student_id = {x}
AND starting BETWEEN {y} AND ({y}+604800)
AND approved = 1
The problem is that the limit must only count the hours defined as "mandatory presence." For example, if a user has defined the hours 8:00 to 17:00 as "mandatory presence," an absence that begins on Sunday at 14:00 and ends on Monday at the same time is counted by the code above as 24 hours, while in practice it is only 9 hours.
"Mandatory presence" is defined in the database as two numerical parameters: "morning" and "evening" (always a round hour). Is there a way to make the calculation above taking into account these two numbers?
If it can not be done in sql, I would love to hear how to select the data in sql and then perform the calculation in python.
I believe the following may do what you wish :-
SELECT
(
sum((
(ending - starting)
-(
CASE WHEN starting < strftime('%s',date(starting,'unixepoch')||' 08:00')
THEN strftime('%s',date(starting,'unixepoch')||' 08:00') - starting
ELSE 0
END
+
CASE WHEN ending > strftime('%s',date(starting,'unixepoch')||' 17:00')
THEN ending - strftime('%s',date(starting,'unixepoch')||' 17:00')
ELSE 0
END
)
) /3600)
) AS ha, *
FROM requests
WHERE student_id = {x}
AND starting BETWEEN {y} AND ({y}+604800)
AND approved = 1
;
MikeT's answer is not entirely working, but it certainly helped me reach the desired result. Here's the perfect statement:
SELECT
(
sum((
(ending - starting)
-(
CASE WHEN starting < strftime('%s',date(starting,'unixepoch')||printf(' %02d:00', morning))
THEN strftime('%s',date(starting,'unixepoch')||printf(' %02d:00', morning)) - starting
ELSE 0
END
+
CASE WHEN ending > strftime('%s',date(ending,'unixepoch')||printf(' %02d:00', evening))
THEN ending - strftime('%s',date(ending,'unixepoch')||printf(' %02d:00', evening))
ELSE 0
END
)
) /3600.0
-(
(24-evening+morning)
*
(round(julianday(ending, 'unixepoch'))-round(julianday(starting, 'unixepoch')))
)
)) AS ha
FROM requests
INNER JOIN students ON requests.student_id = students.ID
INNER JOIN institutes ON students.inst_id = institutes.ID
WHERE student_id = {x}
AND starting BETWEEN {y} AND ({y}+604800)
AND approved = 1;
Thank you very much for your help!
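The question also asked how the calculation could be done in Python after selecting the raw rows. A minimal sketch of that alternative follows. The function name and signature are hypothetical, timestamps are assumed to be epoch seconds interpreted as UTC (as date(..., 'unixepoch') does in the SQL above), and morning and evening are whole hours on the same day.

```python
from datetime import datetime, timedelta, timezone


def mandatory_absence_hours(starting, ending, morning, evening):
    """Hours of the absence [starting, ending) that fall inside the daily
    mandatory-presence window morning:00 .. evening:00 (UTC assumed)."""
    start = datetime.fromtimestamp(starting, tz=timezone.utc)
    end = datetime.fromtimestamp(ending, tz=timezone.utc)
    total = timedelta()
    # Walk day by day, starting at midnight of the first day of the absence.
    day = start.replace(hour=0, minute=0, second=0, microsecond=0)
    while day < end:
        window_start = day + timedelta(hours=morning)
        window_end = day + timedelta(hours=evening)
        # Overlap between the absence and this day's mandatory window.
        overlap = min(end, window_end) - max(start, window_start)
        if overlap > timedelta():
            total += overlap
        day += timedelta(days=1)
    return total.total_seconds() / 3600


# The example from the question: Sunday 14:00 to Monday 14:00, window 8-17.
sunday = int(datetime(2023, 1, 1, 14, tzinfo=timezone.utc).timestamp())
monday = int(datetime(2023, 1, 2, 14, tzinfo=timezone.utc).timestamp())
hours = mandatory_absence_hours(sunday, monday, 8, 17)
print(hours)
```

Summing this function over the approved rows for one student and one week would give the figure to compare against the weekly limit.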

Get a count of rows that meet condition

SQLITE3
Task: get a data set that contains the following data - SEE NOTES BESIDE COLUMNS
SELECT DISTINCT DateTime(Rounded, 'unixepoch') AS RoundedDate, -- Rounded DateTime to the floor hour
Count() AS Count, -- Count of items that registered within the above time
CAST (avg(Speed) AS INT) AS AverageSpeed, -- Average table.Speed column data within the defined datetime
Count() AS SpeederCount -- ?? WTF? [pseudo constraints: if Speed > Speedlimit then +1]
FROM RawSpeedLane AS sl
INNER JOIN
SpeedLaneSearchData AS slsd ON slsd.ParentId = sl.Id
INNER JOIN
Projects AS p ON p.ProjectId = sl.ProjectId
WHERE sl.ProjectId = 72
GROUP BY RoundedDate;
The SQL above currently gives me all the data I need, EXCEPT for the last column.
This last column is supposed to be the count of records where that pass specific criteria. The only way I have found to successfully do this is to build a sub query... Cool? okay, but the problem is the sub query takes 4 minutes to run because well... I suck at SQL :P No matter how many different ways I've tried to write it, it still takes forever.
Here is the long, but working version.
SELECT DISTINCT RoundedDate,
Count() AS Count,
CAST (avg(Speed) AS INT) AS AverageSpeed,
(
SELECT count()
FROM RawSpeedLane AS slr
WHERE slr.ProjectId = 72 AND
datetime( ( (strftime('%s', Start) - (strftime('%M', Start) * 60 + strftime('%S', Start) ) ) ), 'unixepoch') = sl.RoundedDate AND
Speed > p.SpeedLimit
)
AS SpeederCount
FROM SpeedLaneReportDataView AS sl
INNER JOIN
Projects AS p ON p.ProjectId = sl.ProjectId
WHERE sl.ProjectId = 72
GROUP BY RoundedDate;
I have also just tried this for the last column:
(select Count() where sl.Speed > p.SpeedLimit)
but, as expected, I got 1s and 0s. I'm not really sure what to do here. Any hints or help that lead me in the right direction are very much appreciated.
I don't think SQLite has an IIF, but CASE works.
This is a response to Backs' answer, but I can't comment yet.
SELECT DISTINCT DateTime(Rounded, 'unixepoch') AS RoundedDate, -- Rounded DateTime to the floor hour
Count() AS Count, -- Count of items that registered within the above time
CAST (avg(Speed) AS INT) AS AverageSpeed, -- Average table.Speed column data within the defined datetime
SUM(CASE WHEN Speed > SpeedLimit THEN 1 ELSE 0 END) AS SpeederCount
FROM RawSpeedLane AS sl
With SUM and IIF:
SELECT DISTINCT DateTime(Rounded, 'unixepoch') AS RoundedDate, -- Rounded DateTime to the floor hour
Count() AS Count, -- Count of items that registered within the above time
CAST (avg(Speed) AS INT) AS AverageSpeed, -- Average table.Speed column data within the defined datetime
SUM(IIF(Speed > SpeedLimit, 1, 0)) AS SpeederCount
FROM RawSpeedLane AS sl
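The answers above show only the SELECT list. A self-contained sketch of the same conditional-count idea, with the join and GROUP BY included, could look like the following. It runs through Python's sqlite3 module against a simplified, hypothetical Readings table (standing in for the question's RawSpeedLane and search-data tables) with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema: Rounded is the hour-floored epoch second.
conn.executescript("""
CREATE TABLE Projects (ProjectId INTEGER PRIMARY KEY, SpeedLimit INTEGER);
CREATE TABLE Readings (ProjectId INTEGER, Rounded INTEGER, Speed INTEGER);
INSERT INTO Projects VALUES (72, 50);
INSERT INTO Readings VALUES
    (72, 0, 40), (72, 0, 60), (72, 3600, 55), (72, 3600, 70);
""")

rows = conn.execute("""
SELECT datetime(r.Rounded, 'unixepoch') AS RoundedDate,
       COUNT(*) AS Count,
       CAST(AVG(r.Speed) AS INT) AS AverageSpeed,
       -- adds 1 for each row over the limit, 0 otherwise
       SUM(CASE WHEN r.Speed > p.SpeedLimit THEN 1 ELSE 0 END) AS SpeederCount
FROM Readings AS r
INNER JOIN Projects AS p ON p.ProjectId = r.ProjectId
WHERE r.ProjectId = 72
GROUP BY RoundedDate
ORDER BY RoundedDate
""").fetchall()

for row in rows:
    print(row)
```

Because the condition is evaluated inside the aggregate, the table is read once per group instead of once per output row, which is where the subquery version loses its time.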

SQLite Efficient Running Total

I have a table of transactions in SQLite
number date Category Amount runningBalance
I want the running balance column to have a running sum of the amount column after the table is sorted by Date first and number second.
I can do this with a select when reading. But this table has the potential to get very large and I don't want to recalculate every time. I want to make a trigger where all the transactions following (by date then number) the inserted/edited transaction have their runningBalance value updated.
This will mean that the calculations are reduced... as more recent transactions are likely to be edited more often, and older ones rarely. It also will spread the computation over writes so that reads are near instant.
Can anyone provide assistance on how to set up such a trigger?
So far this is what I have, but it does not give the desired results, and it recalculates every row each time, not just the ones following the change.
CREATE TRIGGER RunningTotal AFTER UPDATE ON Transactions FOR EACH ROW
BEGIN
UPDATE Transactions
SET RunningBalance = (
SELECT (
SELECT sum(Amount)
FROM TopInfo t2
WHERE t2.Date <= t1.Date
)
FROM Transactions t1
);
END;
Thanks!
I've managed to find a way that works. Not sure how efficient it is though. Love to hear if anyone knows a more efficient way to update the Balance column.
CREATE TRIGGER Balance AFTER UPDATE OF Amount ON Transactions FOR EACH ROW
BEGIN
UPDATE Transactions
SET Balance = (
SELECT Balance
FROM (
SELECT TransactionID,
(
SELECT sum(t2.Amount)
FROM Transactions t2
WHERE t2.Date <= t1.Date
ORDER BY Date
)
AS Balance
FROM Transactions t1
WHERE TransactionID = Transactions.TransactionID
ORDER BY Date
)
)
WHERE Transactions.Date >= NEW.Date;
END;
UPDATE:
CREATE TRIGGER Balance AFTER UPDATE OF Amount ON Transactions FOR EACH ROW
BEGIN
UPDATE Transactions
SET Balance = (
SELECT Balance
FROM (
SELECT TransactionID,
(
SELECT sum(t2.Amount)
FROM Transactions t2
WHERE CASE WHEN t2.Date = t1.Date THEN t2.TransactionID <= t1.TransactionID ELSE t2.Date <= t1.Date END
ORDER BY Date,
TransactionID
)
AS Balance
FROM Transactions t1
WHERE TransactionID = Transactions.TransactionID
ORDER BY Date,
TransactionID
)
)
WHERE Transactions.Date >= NEW.Date;
END;
I've done some more work on the running total and have come up with two ways. The second is much slower than the first. Any ideas why?
Method 1
SELECT TransactionID,Date, Account, Amount,
(SELECT sum(t2.Amount)
FROM Transactions t2
WHERE
CASE WHEN t2.Date = t1.Date
THEN t2.TransactionID <= t1.TransactionID
AND t2.Account == t1.Account
ELSE t2.Date <= t1.Date
AND t2.Account == t1.Account
END
ORDER BY Date, TransactionID)
AS Balance
FROM Transactions t1
ORDER BY Date, TransactionID
Method 2
SELECT n.TransactionID, n.Date, n.Account, n.Amount,
SUM(o.Amount) As running_total
FROM Transactions n LEFT JOIN Transactions o
ON (
CASE WHEN o.Date = n.Date
THEN n.TransactionID >= o.TransactionID
AND o.Account == n.Account
ELSE n.Date >= o.Date
AND o.Account == n.Account
END
)
GROUP BY n.Account, n.Date, n.TransactionID
ORDER BY n.Date, n.TransactionID;
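Both methods above compare every row with every earlier row, so the work grows roughly with the square of the table size. A different technique, not used in the thread, is a window function, which computes the per-account running balance in a single ordered pass. The sketch below assumes SQLite 3.25 or later and uses invented rows, run through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Transactions (
    TransactionID INTEGER PRIMARY KEY, Date TEXT, Account TEXT, Amount INTEGER);
INSERT INTO Transactions VALUES
    (1, '2020-01-01', 'A', 10),
    (2, '2020-01-01', 'A', 5),
    (3, '2020-01-02', 'B', 7),
    (4, '2020-01-03', 'A', -3);
""")

rows = conn.execute("""
SELECT TransactionID,
       SUM(Amount) OVER (
           PARTITION BY Account
           ORDER BY Date, TransactionID
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS Balance
FROM Transactions
ORDER BY Date, TransactionID
""").fetchall()

for row in rows:
    print(row)
```

Whether this removes the need for a stored runningBalance column depends on table size and read patterns, so it is worth measuring before dropping the trigger approach.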

Retrieve a table to tallied numbers, best way

I have a query that runs as part of a function and produces a one-row table full of counts, averages, and comma-separated lists, like this:
select
(select
count(*)
from vw_disp_details
where round = 2013
and rating = 1) applicants,
(select
count(*)
from vw_disp_details
where round = 2013
and rating = 1
and applied != 'yes') s_applicants,
(select
LISTAGG(discipline, ',')
WITHIN GROUP (ORDER BY discipline)
from (select discipline,
count(*) discipline_number
from vw_disp_details
where round = 2013
and rating = 1
group by discipline)) disciplines,
(select
LISTAGG(discipline_count, ',')
WITHIN GROUP (ORDER BY discipline)
from (select discipline,
count(*) discipline_count
from vw_disp_details
where round = 2013
and rating = 1
group by discipline)) disciplines_count,
(select
round(avg(util.getawardstocols(application_id,'1','AWARD_NAME')), 2)
from vw_disp_details
where round = 2013
and rating = 1) average_award_score,
(select
round(avg(age))
from vw_disp_details
where round = 2013
and rating = 1) average_age
from dual;
Except that instead of 6 main sub-queries there are 23.
This returns something like this (if it were a CSV):
applicants | s_applicants | disciplines | disciplines_count | average_award_score | average_age
107 | 67 | "speed,accuracy,strength" | 3 | 97 | 23
Now I am programmatically swapping out the "rating = 1" part of the where clauses for other expressions. They all work rather quickly except for the "rating = 1" one which takes about 90 seconds to run and that is because the rating column in the vw_disp_details view is itself compiled by a sub-query:
(SELECT score
FROM read r,
eval_criteria_lookup ecl
WHERE r.criteria_id = ecl.criteria_id
AND r.application_id = a.lgo_application_id
AND criteria_description = 'Overall Score'
AND type = 'ABC'
) reader_rank
So when the function runs this extra query seems to slow everything down dramatically.
My question is, is there a better (more efficient) way to run a query like this that is basically just a series of counts and averages, and how can I refactor to optimize the speed so that the rating = 1 query doesn't take 90 seconds to run.
You could choose to MATERIALIZE the vw_disp_details VIEW. That would pre-calculate the value of the rating column. There are various options for how up-to-date a materialized view is kept. You would probably want to use the ON COMMIT clause so that vw_disp_details is always correct.
Have a look at the official documentation and see if that would work for you.
http://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_6002.htm
Do most of your queries in a single one. Instead of doing:
select
(select count(*) from my_tab) as count_all,
(select avg(age) from my_tab) as avg_age,
(select avg(mypkg.get_award(application_id)) from my_tab) as avg_app_id
from dual;
Just do:
select count(*), avg(age),avg(mypkg.get_award(application_id)) from my_tab;
And then, maybe you can do some union all for the other results. But this step all by itself should help.
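A small sketch of that idea, including a conditional count like the question's applied != 'yes' column, is shown below. It uses Python's sqlite3 module as a stand-in for Oracle, and the my_tab columns and rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the view; rows are invented.
conn.executescript("""
CREATE TABLE my_tab (application_id INTEGER, age INTEGER, applied TEXT);
INSERT INTO my_tab VALUES (1, 20, 'yes'), (2, 30, 'no'), (3, 25, 'no');
""")

# One pass over the table instead of one scalar subquery per column.
row = conn.execute("""
SELECT COUNT(*) AS applicants,
       SUM(CASE WHEN applied != 'yes' THEN 1 ELSE 0 END) AS s_applicants,
       AVG(age) AS average_age
FROM my_tab
""").fetchone()

print(row)
```

The same CASE-inside-aggregate pattern is valid Oracle SQL, so columns that differ only by an extra predicate can share one scan of the view.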
I was able to solve this issue by doing two things: creating a new view that displayed only the results I needed, which gave me marginal gains in speed, and in that view moving the where clause of the sub-query that caused the lag into the where clause of the view and tacking on the result of the sub-query as column in the view. This still returns the same results thanks to the fact that there are always going to be records in the table the sub-query accessed for each row of the view query.
SELECT
a.application_id,
util.getstatus (a.application_id) status,
(SELECT score
FROM applicant_read ar,
eval_criteria_lookup ecl
WHERE ar.criteria_id = ecl.criteria_id
AND ar.application_id = a.application_id
AND criteria_description = 'Overall Score' -- THESE TWO FIELDS
AND type = 'ABC' -- ARE CRITERIA_ID = 15
) score,
as.test_total test_total
FROM application a,
applicant_scores as
WHERE a.application_id = as.application_id(+);
Became
SELECT
a.application_id,
util.getstatus (a.application_id) status,
ar.score,
as.test_total test_total
FROM application a,
applicant_scores as,
applicant_read ar
WHERE a.application_id = as.application_id(+)
AND ar.application_id = a.application_id(+)
AND ar.criteria_id = 15;

Working with date ranges (Classic ASP and SQL)

I have to implement a solution where two date ranges can overlap each other. Within the overlap, I have to count how many days the ranges share. Once I know the overlapping days, I can calculate a total figure based on the price that's attached per day.
A scenario would be that
A customer is booking a hotel
Customer booking dates - 17/02/2011 to 26/02/2011
Normal price (All year) - 01/01/2011 - 31/12/2011 (price per day :$30.00)
Special Offer 1 dates - 01/01/2011 to 19/02/2011 (price per day :$20.00)
Special Offer 2 dates - 17/02/2011 to 24/02/2011 (price per day :$10.00)
In the above scenario, the proposed algorithm should find, for each day of the booking, the cheapest offer whose date range covers that day, and work out the price for the booking. If there is no special offer available, it uses the normal price.
So for the first two days the system should get the price from "special offer 1" as it's the cheapest available price. Next 5 days should be "Special offer 2 price" and for the next 2 days it'll be normal price.
I'd be grateful to see both SQL (using MS SQL Server) and code-based answers to get the different views.
I hope the question is clear, and I'm looking forward to seeing the answers.
Many thanks in advance
Using the standard trick of using an auxiliary calendar table, it is simply a case of joins and grouping to get the best price each day:
SELECT C.dt, MIN(price) AS best_price
FROM Prices P
INNER JOIN Calendar C
ON C.dt >= P.price_start_date
AND C.dt < P.price_end_date
INNER JOIN CustomerBooking B
ON C.dt >= B.booking_start_date
AND C.dt < B.booking_end_date
GROUP
BY C.dt;
The same query as above, including sample data using CTEs:
WITH Prices (price_start_date, price_end_date, narrative, price)
AS
(
SELECT CAST(start_date AS Date), CAST(end_date AS Date), narrative, price
FROM (
VALUES ('2011-01-01T00:00:00', '2011-12-31T00:00:00', 'Normal price', 30),
('2011-01-01T00:00:00', '2011-02-21T00:00:00', 'Special Offer 1', 20),
('2011-02-19T00:00:00', '2011-02-24T00:00:00', 'Special Offer 2', 10)
) AS T (start_date, end_date, narrative, price)
),
CustomerBooking (booking_start_date, booking_end_date)
AS
(
SELECT CAST(start_date AS Date), CAST(end_date AS Date)
FROM (
VALUES ('2011-02-17T00:00:00', '2011-02-26T00:00:00')
) AS T (start_date, end_date)
)
SELECT C.dt, MIN(price) AS best_price
FROM Prices P
INNER JOIN Calendar C
ON C.dt >= P.price_start_date
AND C.dt < P.price_end_date
INNER JOIN CustomerBooking B
ON C.dt >= B.booking_start_date
AND C.dt < B.booking_end_date
GROUP
BY C.dt;
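Since the question also asked for a code-based view, here is a minimal sketch of the same per-day minimum in Python. It assumes the sample rows from the CTE above, treats end dates as exclusive like the query does, and uses a hypothetical function name.

```python
from datetime import date, timedelta

# (start, end-exclusive, price per day), mirroring the CTE sample data.
PRICES = [
    (date(2011, 1, 1), date(2011, 12, 31), 30),  # Normal price
    (date(2011, 1, 1), date(2011, 2, 21), 20),   # Special Offer 1
    (date(2011, 2, 19), date(2011, 2, 24), 10),  # Special Offer 2
]


def booking_total(prices, booking_start, booking_end):
    """Sum the cheapest applicable daily price over [booking_start, booking_end)."""
    total = 0
    day = booking_start
    while day < booking_end:
        applicable = [price for start, end, price in prices if start <= day < end]
        if not applicable:
            raise ValueError(f"no price defined for {day}")
        total += min(applicable)
        day += timedelta(days=1)
    return total


total = booking_total(PRICES, date(2011, 2, 17), date(2011, 2, 26))
print(total)
```

With this data the booking covers two days at 20, five days at 10 and two days at 30, which matches the breakdown described in the question.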
Let's suppose that for each day you should apply the lowest price.
create function price ( @fromDate date, @toDate date) returns money
as
begin
declare @iterator_day date
declare @total money
set @total = 0
set @iterator_day = @fromDate
WHILE @iterator_day <= @toDate
begin
select @total = @total + min( price )
from offers
where @iterator_day between offers.fromDay and offers.toDay
set @iterator_day = DATEADD (day , 1 , @iterator_day )
end
return @total
end
then you can call function in your query:
select
b.fromDay, b.toDay, dbo.price( b.fromDay, b.toDay )
from
booking b
I've only used ASP.net 4.0, but I can offer some SQL that will give you the price for a given date:
SELECT ISNULL(MIN(PricePerDay), 0) AS MinPricePerDay
FROM Offers
WHERE (StartDate <= '18/2/11') AND (EndDate >= '18/2/11')
From your application you could build the query to be something like this:
SELECT ISNULL(MIN(PricePerDay), 0) AS MinPricePerDay
FROM Offers
WHERE (StartDate <= '17/2/11') AND (EndDate >= '17/2/11');
SELECT ISNULL(MIN(PricePerDay), 0) AS MinPricePerDay
FROM Offers
WHERE (StartDate <= '18/2/11') AND (EndDate >= '18/2/11');
SELECT ISNULL(MIN(PricePerDay), 0) AS MinPricePerDay
FROM Offers
WHERE (StartDate <= '19/2/11') AND (EndDate >= '19/2/11');
This would return a dataset of tables containing a single value for the minimum price for that date (in the same order as your query)
Sounds like a good job for a Stored Procedure...
Your problem here is that you've got multiple overlapping time periods. You either need to constrain the problem slightly, or remodel the data slightly (to get desirable performance).
Option 1 - Constraints
A data set of 'normal' prices - that never overlap with each other
A data set of 'special' prices - that also never overlap with each other
Every bookable date has a 'normal' price
Every bookable date has a 'special' price (EVEN if it's NULL to mean 'no special price')
The last constraint is the strangest one, but it's needed to make the simple join work. When comparing date ranges, it's a lot easier to form the query if the two sets of ranges are gapless and have no overlaps inside them.
This means that you should now be able to work it out with just a few joins...
SELECT
CASE WHEN [sp].started > [np].started THEN [sp].started ELSE [np].started END AS [started],
CASE WHEN [sp].expired < [np].expired THEN [sp].expired ELSE [np].expired END AS [expired],
CASE WHEN [sp].price < [np].price THEN [sp].price ELSE [np].price END AS [price]
FROM
normal_prices AS [np]
LEFT JOIN
special_prices AS [sp]
ON [sp].started < [np].expired
AND [sp].expired > [np].started
AND [sp].started >= (SELECT ISNULL(MAX(started),0) FROM special_prices WHERE started <= [np].started)
-- The third condition is an optimisation for large data-sets.
WHERE
[np].started < @expired
AND [np].expired > @started
-- Note: Inclusive StartDates, Exclusive EndDate
-- For example, "all of Jan" would be "2011-01-01" to "2011-02-01"
Option 2 - Re-Model
This one is often the fastest in my experience; you increase the amount of space being used, and gain a simpler faster query...
Table Of Prices, stored by DAY rather than period...
- calendar_date
- price_code
- price
SELECT
calendar_date,
MIN(price)
FROM
prices
WHERE
calendar_date >= @started
AND calendar_date < @expired
GROUP BY
calendar_date
Or, if you needed the price_code as well...
WITH
ordered_prices AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY calendar_date ORDER BY price ASC, price_code) AS price_rank,
*
FROM
prices
)
SELECT
calendar_date,
price_code,
price
FROM
ordered_prices
WHERE
calendar_date >= @started
AND calendar_date < @expired
AND price_rank = 1
