I have a table (pay_period) as follows:
pay_period
period_id list_id start_date end_date price
1 100 2017-01-01 2017-08-31 100
2 100 2017-09-01 2017-12-31 110
3 101 2017-01-01 2017-08-31 75
Now I have a list_id, checkin_date, and checkout_date:
list_id 100
checkin_date 2017-08-25
checkout_date 2017-09-10
I need to calculate the price of a list for the period from checkin date to checkout date.
Therefore the calculation is supposed to be:
7 * 100 + 10 * 110
I am thinking of doing it with a for loop; if there is a better way to do it, can you please suggest one?
You have to check whether the checkin_date and checkout_date fall within the same period_id.
1.1 If yes, you multiply the price by the number of days.
1.2 If no, you have to count the days from checkin_date until the end of period 1 and multiply by the corresponding price, then do the same with checkout_date and the beginning of the next period.
Note: I guess a list_id might happen to have more than 2 prices. For example:
period_id list_id start_date end_date price
1 100 2017-01-01 2017-04-30 100
2 100 2017-05-01 2017-09-30 110
3 100 2017-10-01 2017-12-31 120
4 101 2017-01-01 2017-08-31 75
and the calculation period is:
list_id 100
checkin_date 2017-03-01
checkout_date 2017-11-10
In this case, yes, the solution would be to use a CURSOR that keeps the prices and periods for the list_id, loop through it, and compare the checkin_date and checkout_date with each record.
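A minimal PL/SQL sketch of that cursor loop, assuming Oracle and the pay_period table above; the v_* variables are hypothetical inputs:
DECLARE
  v_list_id  NUMBER := 100;
  v_checkin  DATE   := DATE '2017-08-25';
  v_checkout DATE   := DATE '2017-09-10';
  v_total    NUMBER := 0;
BEGIN
  -- Loop over every price period of the list that overlaps the stay.
  FOR r IN (SELECT start_date, end_date, price
              FROM pay_period
             WHERE list_id = v_list_id
               AND end_date >= v_checkin
               AND start_date <= v_checkout
             ORDER BY start_date)
  LOOP
    -- Count only the days of the stay that fall inside this period.
    v_total := v_total
             + (LEAST(r.end_date, v_checkout)
                - GREATEST(r.start_date, v_checkin) + 1) * r.price;
  END LOOP;
  DBMS_OUTPUT.PUT_LINE(v_total); -- 7 * 100 + 10 * 110 = 1800
END;
/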
Best,
Mikcutu.
You can do the following for much cleaner code. Although it can be done purely in SQL, I am using a function to make the code easier to understand.
Create a generic function that returns the number of overlapping days between 2 different date ranges.
CREATE OR REPLACE FUNCTION fn_count_range (
  p_start_date1 IN DATE,
  p_end_date1   IN DATE,
  p_start_date2 IN DATE,
  p_end_date2   IN DATE
) RETURN NUMBER AS
  v_days NUMBER;
BEGIN
  -- An empty or inverted range cannot overlap anything.
  IF p_end_date1 < p_start_date1 OR p_end_date2 < p_start_date2 THEN
    RETURN 0;
  END IF;
  -- Expand both ranges into one row per day and count the days in common.
  SELECT COUNT(*) INTO v_days
  FROM (
    (SELECT p_start_date1 + LEVEL - 1
       FROM dual CONNECT BY LEVEL <= p_end_date1 - p_start_date1 + 1)
    INTERSECT
    (SELECT p_start_date2 + LEVEL - 1
       FROM dual CONNECT BY LEVEL <= p_end_date2 - p_start_date2 + 1)
  );
  RETURN v_days;
END;
/
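A quick sanity check against the sample data (period 1 overlaps the stay 2017-08-25 to 2017-09-10 by 7 days):
SELECT fn_count_range(DATE '2017-01-01', DATE '2017-08-31',
                      DATE '2017-08-25', DATE '2017-09-10') AS days
FROM dual;
-- DAYS: 7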
Now, your query to calculate the total price is simplified.
WITH lists ( list_id,
checkin_date,
checkout_date) AS
( SELECT 100,
TO_DATE('2017-08-25','YYYY-MM-DD'),
TO_DATE('2017-09-10','YYYY-MM-DD')
FROM dual) --Not required if you have a lists table.
SELECT l.list_id,
SUM(fn_count_range(start_date,end_date,checkin_date,checkout_date) * price) total_price
FROM pay_period p
JOIN lists l ON p.list_id = l.list_id
GROUP BY l.list_id;
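As a side note, the same overlap count can be computed arithmetically instead of generating one row per day; a sketch of the equivalent expression for the function body, assuming the same inclusive date ranges:
GREATEST(0, LEAST(p_end_date1, p_end_date2) - GREATEST(p_start_date1, p_start_date2) + 1)
For long periods this avoids materializing a row per day on every call.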
I'm trying to come up with an SQL solution for the following problem. I have a table in which stock transactions are saved. As transactions don't occur on a daily basis, the time series in the table is incomplete.
Table TRANSACTION
Date        Depot  Ticker  Buy or Sell  Shares
2022-02-01  A      MSFT    BUY          100
2022-02-04  B      AAPL    BUY          20
2022-02-05  A      MSFT    SELL         80
2022-02-07  A      PG      BUY          50
2022-02-08  A      PG      BUY          30
2022-02-08  A      PG      BUY          20
2022-02-09  A      PG      BUY          50
2022-02-10  B      T       BUY          10
Now I would like to write a SQL statement that gives me the daily share count per depot and ticker for every day since 2022-02-01. The expected result would be this:
Date        Depot  Ticker  Share Count
2022-02-01  A      MSFT    100
2022-02-02  A      MSFT    100
2022-02-03  A      MSFT    100
2022-02-04  A      MSFT    100
2022-02-04  B      AAPL    20
2022-02-05  A      MSFT    20
2022-02-05  B      AAPL    20
2022-02-06  A      MSFT    20
2022-02-06  B      AAPL    20
2022-02-07  A      MSFT    20
2022-02-07  A      PG      50
2022-02-07  B      AAPL    20
2022-02-08  A      PG      50
2022-02-08  A      MSFT    20
2022-02-08  A      PG      100
2022-02-08  B      AAPL    50
2022-02-09  A      MSFT    20
2022-02-09  A      PG      150
2022-02-09  B      AAPL    50
2022-02-10  A      MSFT    20
2022-02-10  A      PG      150
2022-02-10  B      AAPL    50
2022-02-10  B      T       10
What I've done so far:
A: Getting a list of all dates between 2022-02-01 and the current date (e.g. 2022-02-10).
B: Getting a running total over the shares column by depot and ticker, ordered by date.
A Left join B
WITH RECURSIVE dates(date) AS (
VALUES('2022-02-01')
UNION ALL
SELECT date(date, '+1 day')
FROM dates
WHERE date < DATE()
)
SELECT dt.Date
, tr.Date
, tr.Ticker
, tr.Shares AS [Share Count]
FROM dates dt
LEFT JOIN (
SELECT Date
, Depot
, Ticker
, SUM(shares) OVER (PARTITION BY Depot, Ticker ORDER BY Date ROWS UNBOUNDED PRECEDING) AS Shares
FROM TRANSACTION
ORDER BY Depot, Date, Ticker
) tr ON (
tr.Date <= dt.Date
)
ORDER BY dt.Date, Ticker
That doesn't seem to do the trick, as the running total does not properly aggregate the date rows. The result looks more like a kind of cross join between the synthetic date table and the transaction table.
Where am I going wrong? Does anyone have a solution for my problem?
Thanks a lot!
You are almost there; I just expanded on your idea.
I have hard-coded the end date in the example below, but that can be changed as per requirement.
The dates CTE just builds the dates within the needed range.
In the next sub-query (t1), two more columns are added: column md holds the max date (the next transaction date for that depot and ticker, i.e. the day the running total stops being valid), and column s_shares holds a running sum of shares, added when bought and subtracted when sold.
We then join the two sets, keeping each generated date that falls between a row's original date and its max date (column md).
WITH RECURSIVE dates(date1) AS (
  SELECT date('2022-02-01')
  UNION ALL
  SELECT date(date1, '+1 day')
  FROM dates
  WHERE date1 < date('2022-02-10')
)
SELECT d.date1, depot, ticker, s_shares
FROM dates d,
     (SELECT date1,
             -- next transaction date for this depot/ticker, or one day
             -- past the end of the range for the latest transaction
             coalesce(lead(date1) OVER (PARTITION BY depot, ticker ORDER BY date1),
                      date('2022-02-10', '+1 day')) md,
             depot, ticker, buyorsell,
             -- running share count: SELL subtracts, BUY adds
             SUM(CASE WHEN buyorsell = 'SELL' THEN -Shares ELSE Shares END)
               OVER (PARTITION BY depot, ticker ORDER BY date1
                     ROWS UNBOUNDED PRECEDING) s_shares
        FROM tic) t1
WHERE d.date1 >= t1.date1
  AND d.date1 < t1.md
ORDER BY depot, ticker, d.date1;
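As a quick check against the sample data, for depot A and ticker MSFT the sub-query t1 produces:
date1       md          s_shares
2022-02-01  2022-02-05  100
2022-02-05  2022-02-11  20
so the generated dates 2022-02-01 through 2022-02-04 pair with 100 shares, and 2022-02-05 through 2022-02-10 pair with 20, as in the expected result.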
Refer to the fiddle here.
I have a SQLite database and I want to create a query that groups records if their DateTime values are within 60 minutes of each other. The hard part is that the grouping is cumulative: if we have 3 records with DateTimes 2019-12-14 15:40:00, 2019-12-14 15:56:00 and 2019-12-14 16:55:00, they all fall in one group, because each is within 60 minutes of the previous one. Please see the Hands table and the desired output of the query to help you understand the requirement.
Database Table "Hands"
ID DateTime Result
1 2019-12-14 15:40:00 -100
2 2019-12-14 15:56:00 1000
3 2019-12-14 16:55:00 -2000
4 2012-01-12 12:00:00 400
5 2016-10-01 21:00:00 900
6 2016-10-01 20:55:00 1000
Desired output of query
StartTime Count Result
2019-12-14 15:40:00 3 -1100
2012-01-12 12:00:00 1 400
2016-10-01 20:55:00 2 1900
You can use window functions to flag the record at which a new group should start (because its datetime differs from the previous one by 60 minutes or more), and then turn those flags into a unique group number. Finally, you can group by that group number and apply the aggregate functions:
with base as (
select DateTime, Result,
coalesce(cast((
julianday(DateTime) - julianday(
lag(DateTime) over (order by DateTime)
)
) * 24 >= 1 as integer), 1) as firstInGroup
from Hands
), step as (
select DateTime, Result,
sum(firstInGroup) over (
order by DateTime rows
between unbounded preceding and current row) as grp
from base
)
select min(DateTime) DateTime,
count(*) Count,
sum(Result) Result
from step
group by grp;
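For the sample data, the intermediate columns work out like this (a hand trace, ordered by DateTime):
DateTime             firstInGroup  grp
2012-01-12 12:00:00  1             1
2016-10-01 20:55:00  1             2
2016-10-01 21:00:00  0             2   -- 5 minutes after the previous row
2019-12-14 15:40:00  1             3
2019-12-14 15:56:00  0             3   -- 16 minutes
2019-12-14 16:55:00  0             3   -- 59 minutes
Grouping by grp then yields exactly the three desired rows.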
DB-fiddle
I am writing a query on this table to get the sum of size for all the directories, grouped by directory, where the date is yesterday. I am getting no output from the query below.
test.id test.path test.size test.date
1 this/is/the/path1/fil.txt 232.24 2019-06-01
2 this/is/the/path2/test.txt 324.0 2016-06-01
3 this/is/the/path3/index.txt 12.3 2017-05-01
4 this/is/the/path4/test2.txt 134.0 2019-03-23
5 this/is/the/path1/files.json 2.23 2018-07-23
6 this/is/the/path1/code.java 1.34 2014-03-23
7 this/is/the/path2/data.csv 23.42 2016-06-23
8 this/is/the/path3/test.html 1.33 2018-09-23
9 this/is/the/path4/prog.js 6.356 2019-06-23
4 this/is/the/path4/test2.txt 134.0 2019-04-23
SELECT regexp_replace(path,'[^/]+$',''), sum(cast(size as decimal))
from test WHERE date > date_sub(current_date, 1) GROUP BY path,size;
You must not group by size, only by regexp_replace(path,'[^/]+$','').
Also, since you want only yesterday's rows, why do you use WHERE date > '2019%'?
You can get yesterday's date with date_sub(current_date, 1):
select
regexp_replace(path,'[^/]+$',''),
sum(cast(size as decimal))
from test
where date = date_sub(current_date, 1)
group by regexp_replace(path,'[^/]+$','');
You probably want WHERE date >= '2019-01-01'. Using % in matching strings, for example your 2019%, only works with LIKE, not with inequality matching.
The example you gave looks like you want all rows in calendar year 2019.
For yesterday, you want
WHERE date >= DATE_SUB(current_date, 1)
AND date < current_date
This works even if your date column contains timestamps.
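A minimal illustration of the difference, assuming date is stored as a 'YYYY-MM-DD' string:
-- matches every date string starting with '2019', but only via LIKE
WHERE date LIKE '2019%'
-- equivalent range predicate, which works with ordinary comparisons
WHERE date >= '2019-01-01' AND date < '2020-01-01'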
How to generate dates between two date columns, based on each row?
A row generator technique should be used, such as:
SQL> alter session set nls_date_format = 'dd.mm.yyyy';
Session altered.
SQL> with test (sno, start_date, end_date) as
2 (select 1, date '2018-01-01', date '2018-01-05' from dual union
3 select 2, date '2018-01-03', date '2018-01-05' from dual
4 )
5 select sno, start_date + column_value - 1 datum
6 from test,
7 table(cast(multiset(select level from dual
8 connect by level <= end_date - start_date + 1)
9 as sys.odcinumberlist))
10 order by sno, datum;
SNO DATUM
---------- ----------
1 01.01.2018
1 02.01.2018
1 03.01.2018
1 04.01.2018
1 05.01.2018
2 03.01.2018
2 04.01.2018
2 05.01.2018
8 rows selected.
SQL>
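On Oracle 11.2 or later, a recursive WITH clause is an alternative sketch of the same row generator (reusing the test data above):
with test (sno, start_date, end_date) as
  (select 1, date '2018-01-01', date '2018-01-05' from dual union all
   select 2, date '2018-01-03', date '2018-01-05' from dual),
rcte (sno, datum, end_date) as
  (select sno, start_date, end_date from test
   union all
   select sno, datum + 1, end_date from rcte where datum < end_date)
select sno, datum
from rcte
order by sno, datum;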
I wrote a query in SQLite that accumulates VALUE, and it works if I order by ID only (ID is an auto-increment int). But if I want to order by DATE and then by ID, the query shows wrong results. Any ideas?
Query:
select t1.ID,
t1.DATE,
t1.VALUE,
sum(t2.VALUE) as TOTAL
from test t1
inner join test t2 on t1.DATE >= t2.DATE and t1.id >= t2.id
group by t1.ID, t1.DATE
order by t1.DATE desc, t1.ID desc
Results:
ID DATE VALUE TOTAL
9 2015-01-16 55 281 -- should be 226 + 55 = 281
6 2015-01-15 10 26 -- should be 216 + 10 = 226
5 2015-01-15 5 16 -- should be 211 + 5 = 216
8 2015-01-14 100 211
7 2015-01-14 100 111
4 2015-01-13 5 11
3 2015-01-12 3 6
2 2015-01-11 2 3
1 2015-01-10 1 1
At least on the small data set, changing the join condition to something like this:
t1.DATE || substr('00000' || t1.id, -5) >= t2.DATE || substr('00000' || t2.id, -5)
works fine. We are building up a "sort key" from the date and the id (padded with zeros to 5 digits) and using that in the join; see the sketch below. It might be necessary to build an index on that expression to deal with performance, though.
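Applied to the original query, this gives (a sketch against the same test table):
select t1.ID,
       t1.DATE,
       t1.VALUE,
       sum(t2.VALUE) as TOTAL
from test t1
inner join test t2
        on t1.DATE || substr('00000' || t1.id, -5)
        >= t2.DATE || substr('00000' || t2.id, -5)
group by t1.ID, t1.DATE
order by t1.DATE desc, t1.ID desc;
On SQLite 3.25 or later, a window function avoids the self-join entirely:
select ID, DATE, VALUE,
       sum(VALUE) over (order by DATE, ID) as TOTAL
from test
order by DATE desc, ID desc;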