I have a table with data like below:
Log Table:
User Id    Login Date
1          2022-01-03
1          2022-01-04
1          2022-01-10
1          2022-01-11
1          2022-01-12
1          2022-01-23
1          2022-01-25
1          2022-01-26
1          2022-01-27
1          2022-01-28
What I'm trying to do is create a query that returns the rows of the latest run of logins on consecutive dates, with var_date as a parameter.
If var_date is 2022-01-29, then the result is:
User Id    Login Date
1          2022-01-25
1          2022-01-26
1          2022-01-27
1          2022-01-28
If var_date is 2022-01-30, then no result is returned, since 2022-01-29 is not in the table.
If var_date is 2022-01-24, then the query returns the row with 2022-01-23 as the login date.
How am I to do this in SQLite?
Thank you.
This question is a variant of gaps and islands, with the islands being clusters of records per user with continuous dates. Here is one approach using analytic functions:
WITH cte AS (
    SELECT *, CASE WHEN julianday(LoginDate) -
                        julianday(LAG(LoginDate) OVER (PARTITION BY UserID
                                                       ORDER BY LoginDate))
                        > 1 THEN 1 ELSE 0 END AS counter
    FROM yourTable
),
cte2 AS (
    SELECT *, SUM(counter) OVER (PARTITION BY UserID ORDER BY LoginDate) AS grp
    FROM cte
)
SELECT UserID, LoginDate
FROM cte2 t1
WHERE LoginDate < '2022-01-29' AND
      grp = (SELECT t2.grp FROM cte2 t2
             WHERE t2.UserID = t1.UserID AND t2.LoginDate = '2022-01-28');
The two CTEs assign a pseudo group number to each cluster of consecutive dates per user. The final query returns all records earlier than the target date whose group number matches that of the day immediately before the target date. Hence, if the user has no record on that preceding day, the query returns an empty set.
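As a rough way to try the gaps-and-islands query against the question's data, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory table, the helper name latest_streak, and the replacement of the hard-coded dates with a :var_date parameter (using date(:var_date, '-1 day') for the preceding day) are my assumptions, not part of the answer. It needs a SQLite build with window functions.

```python
import sqlite3

# Hypothetical in-memory copy of the question's log table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (UserID INTEGER, LoginDate TEXT)")
dates = ["2022-01-03", "2022-01-04", "2022-01-10", "2022-01-11", "2022-01-12",
         "2022-01-23", "2022-01-25", "2022-01-26", "2022-01-27", "2022-01-28"]
conn.executemany("INSERT INTO yourTable VALUES (1, ?)", [(d,) for d in dates])

# Same shape as the answer's query, with the two literal dates
# replaced by a single named parameter (an assumption of this sketch).
ISLAND_QUERY = """
WITH cte AS (
    SELECT *,
           CASE WHEN julianday(LoginDate)
                     - julianday(LAG(LoginDate) OVER (PARTITION BY UserID
                                                      ORDER BY LoginDate)) > 1
                THEN 1 ELSE 0 END AS counter
    FROM yourTable
),
cte2 AS (
    SELECT *, SUM(counter) OVER (PARTITION BY UserID ORDER BY LoginDate) AS grp
    FROM cte
)
SELECT UserID, LoginDate
FROM cte2 t1
WHERE LoginDate < :var_date
  AND grp = (SELECT t2.grp FROM cte2 t2
             WHERE t2.UserID = t1.UserID
               AND t2.LoginDate = date(:var_date, '-1 day'))
ORDER BY LoginDate
"""

def latest_streak(var_date):
    """Return the login dates of the streak that ends the day before var_date."""
    return [row[1] for row in conn.execute(ISLAND_QUERY, {"var_date": var_date})]

print(latest_streak("2022-01-29"))
print(latest_streak("2022-01-30"))
print(latest_streak("2022-01-24"))
```

The three calls correspond to the three cases in the question: a four-day streak, an empty result, and a single row.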
Use a recursive CTE:
WITH cte(UserId, LoginDate) AS (
    SELECT :var_user_id, :var_date
    UNION ALL
    SELECT UserId, date(c.LoginDate, '-1 day')
    FROM cte c
    WHERE EXISTS (SELECT 1 FROM tablename t
                  WHERE t.UserId = c.UserId
                    AND t.LoginDate = date(c.LoginDate, '-1 day'))
)
SELECT *
FROM cte
WHERE LoginDate < (SELECT MAX(LoginDate) FROM cte);
Change :var_user_id and :var_date to the values that you want for the user's id and the date.
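To see how the two named parameters can be bound from application code, here is a small sketch with Python's sqlite3 module. The in-memory table and the helper name streak_before are assumptions for illustration, and I have spelled out the RECURSIVE keyword and added an ORDER BY so the output order is predictable.

```python
import sqlite3

# Hypothetical in-memory copy of the question's log table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (UserId INTEGER, LoginDate TEXT)")
dates = ["2022-01-03", "2022-01-04", "2022-01-10", "2022-01-11", "2022-01-12",
         "2022-01-23", "2022-01-25", "2022-01-26", "2022-01-27", "2022-01-28"]
conn.executemany("INSERT INTO tablename VALUES (1, ?)", [(d,) for d in dates])

# The anchor row is (user, var_date); each step walks one day back
# for as long as that earlier day exists in the table.
RECURSIVE_QUERY = """
WITH RECURSIVE cte(UserId, LoginDate) AS (
    SELECT :var_user_id, :var_date
    UNION ALL
    SELECT c.UserId, date(c.LoginDate, '-1 day')
    FROM cte c
    WHERE EXISTS (SELECT 1 FROM tablename t
                  WHERE t.UserId = c.UserId
                    AND t.LoginDate = date(c.LoginDate, '-1 day'))
)
SELECT UserId, LoginDate
FROM cte
WHERE LoginDate < (SELECT MAX(LoginDate) FROM cte)  -- drop the anchor row itself
ORDER BY LoginDate
"""

def streak_before(user_id, var_date):
    """Return (UserId, LoginDate) rows for the streak ending the day before var_date."""
    params = {"var_user_id": user_id, "var_date": var_date}
    return conn.execute(RECURSIVE_QUERY, params).fetchall()

print(streak_before(1, "2022-01-29"))
print(streak_before(1, "2022-01-30"))
```

The final filter works because the anchor row always carries the largest date in the CTE, so removing the maximum removes exactly the var_date row.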
I've got a table in an SQLite3 database containing account balances, but it currently only contains balances for a few specific dates:
Balance Date    Amount
2021-12-15      400
2021-12-18      500
2021-12-22      200
I need to fill in the gaps between these dates with the previous recorded balance, so e.g. 2021-12-16 and 2021-12-17 should have a balance of 400 and 2021-12-19, 2021-12-20 and 2021-12-21 should have a balance of 500.
Is there a way to fill these gaps using SQL? I think I need some logic like
INSERT INTO BALANCES (BalanceDate,BalanceAmount)
VALUES(previous record + 1 day, previous record's amount)
but I don't know how I can point SQL to the previous record.
Thanks
You can use a recursive cte to produce the missing dates:
WITH cte AS (
    SELECT date(b1.BalanceDate, '+1 day') BalanceDate, b1.Amount
    FROM BALANCES b1
    WHERE NOT EXISTS (SELECT 1 FROM BALANCES b2
                      WHERE b2.BalanceDate = date(b1.BalanceDate, '+1 day'))
      AND date(b1.BalanceDate, '+1 day') < (SELECT MAX(BalanceDate) FROM BALANCES)
    UNION ALL
    SELECT date(c.BalanceDate, '+1 day'), c.Amount
    FROM cte c
    WHERE NOT EXISTS (SELECT 1 FROM BALANCES b
                      WHERE b.BalanceDate = date(c.BalanceDate, '+1 day'))
      AND date(c.BalanceDate, '+1 day') < (SELECT MAX(BalanceDate) FROM BALANCES)
)
INSERT INTO BALANCES(BalanceDate, Amount)
SELECT BalanceDate, Amount FROM cte;
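A quick way to check the gap-filling statement is to run it on the three sample rows and read the table back. The sketch below uses Python's sqlite3 module; the column types and the variable name filled are assumptions, and the RECURSIVE keyword is spelled out.

```python
import sqlite3

# Hypothetical in-memory table holding the question's three balances.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BALANCES (BalanceDate TEXT, Amount INTEGER)")
conn.executemany("INSERT INTO BALANCES VALUES (?, ?)",
                 [("2021-12-15", 400), ("2021-12-18", 500), ("2021-12-22", 200)])

# Anchor: the day after each recorded balance, if that day is missing.
# Step: keep advancing one day, carrying the amount, until a recorded day is reached.
conn.execute("""
WITH RECURSIVE cte AS (
    SELECT date(b1.BalanceDate, '+1 day') BalanceDate, b1.Amount
    FROM BALANCES b1
    WHERE NOT EXISTS (SELECT 1 FROM BALANCES b2
                      WHERE b2.BalanceDate = date(b1.BalanceDate, '+1 day'))
      AND date(b1.BalanceDate, '+1 day') < (SELECT MAX(BalanceDate) FROM BALANCES)
    UNION ALL
    SELECT date(c.BalanceDate, '+1 day'), c.Amount
    FROM cte c
    WHERE NOT EXISTS (SELECT 1 FROM BALANCES b
                      WHERE b.BalanceDate = date(c.BalanceDate, '+1 day'))
      AND date(c.BalanceDate, '+1 day') < (SELECT MAX(BalanceDate) FROM BALANCES)
)
INSERT INTO BALANCES(BalanceDate, Amount)
SELECT BalanceDate, Amount FROM cte
""")

filled = conn.execute(
    "SELECT BalanceDate, Amount FROM BALANCES ORDER BY BalanceDate"
).fetchall()
for row in filled:
    print(row)
```

After the insert the table should hold one row per day from 2021-12-15 to 2021-12-22, with each missing day carrying the previous recorded amount.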
Below is an example of my table
Names Start_Date Orders Items
AAA 2020-01-01 300 100
BAA 2020-02-01 896 448
My requirement would be as below
Names Start_Date Orders
AAA 2020-01-01 100
AAA 2020-01-01 100
AAA 2020-01-01 100
BAA 2020-02-01 448
BAA 2020-02-01 448
The rows should be split based on the (Orders/Items) value
This is a nice task for Teradata's SQL extension for creating time series (based on @Andrew's test data):
SELECT *
FROM vt_foo
EXPAND ON PERIOD(start_date, start_date + Cast(Ceiling(Cast(orders AS FLOAT)/items) AS INT)) AS pd
For an exact split of orders into items:
SELECT dt.*,
CASE WHEN items * (end_date - start_date) > orders
THEN orders MOD items
ELSE items
end
FROM
(
SELECT t.*, End(pd) AS end_date
FROM vt_foo AS t
EXPAND ON PERIOD(start_date, start_date + Cast(Ceiling(Cast(orders AS FLOAT)/items) AS INT)) AS pd
) AS dt
This calls for a recursive CTE. Here's how I'd approach it, with a lovely volatile table for some sample data.
create volatile table vt_foo
(names varchar(100), start_date date, orders int, items int)
on commit preserve rows;
insert into vt_foo values ('AAA','2020-01-01',300,100);
insert into vt_foo values ('BAA','2020-02-01',896,448);
insert into vt_foo values ('CCC','2020-03-01',525,100);
with recursive cte (names, start_date,items, num, counter) as (
select
names,
start_date,
items,
round(orders /( items * 1.0) ) as num ,
1 as counter
from vt_foo
UNION ALL
select
a.names,
a.start_date,
a.items,
b.num,
b.counter + 1
from vt_foo a
inner join cte b
on a.names = b.names
and a.start_date =b.start_date
where b.counter + 1 <= b.num
)
select * from cte
order by names,start_date
This bit: b.counter + 1 <= b.num is the key to limiting the output to the proper # of rows per product/date.
I think this should be ok, but test it with small volumes of data.
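For readers without a Teradata system, the counter idea can be tried in SQLite. The sketch below is an analogue, not the answer's Teradata code: it recurses on the CTE alone instead of joining back to vt_foo, uses a regular in-memory table instead of a volatile one, and loads only the two rows from the question. It runs through Python's sqlite3 module.

```python
import sqlite3

# Hypothetical SQLite stand-in for the volatile table vt_foo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vt_foo (names TEXT, start_date TEXT, orders INTEGER, items INTEGER)"
)
conn.executemany("INSERT INTO vt_foo VALUES (?, ?, ?, ?)", [
    ("AAA", "2020-01-01", 300, 100),
    ("BAA", "2020-02-01", 896, 448),
])

# num is the number of copies wanted; counter climbs from 1 up to num.
SPLIT_QUERY = """
WITH RECURSIVE cte (names, start_date, items, num, counter) AS (
    SELECT names, start_date, items,
           round(orders / (items * 1.0)) AS num,
           1 AS counter
    FROM vt_foo
    UNION ALL
    SELECT names, start_date, items, num, counter + 1
    FROM cte
    WHERE counter + 1 <= num   -- stop once num rows exist for this source row
)
SELECT names, start_date, items
FROM cte
ORDER BY names, start_date, counter
"""

rows = conn.execute(SPLIT_QUERY).fetchall()
for row in rows:
    print(row)
```

With these two rows the ratio is a whole number, so rounding does not matter; for a ratio such as 525/100, round() would give 5 rows, and a ceiling would be needed if the remainder should get its own row.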
I have created a dataset "Orders" to test SQLite, with this structure:
CREATE TABLE Orders (
    OrderID INTEGER PRIMARY KEY AUTOINCREMENT,
    OrderDate TIMESTAMP DEFAULT (CURRENT_TIMESTAMP),
    CustomerID VARCHAR(20),
    OrderValue DECIMAL (8, 3) NOT NULL
);
I filled the table with sample data
ID Date Customer Value($)
6 11-09-2019 Eva 6946.3
7 11-10-2019 John 850.6
8 11-11-2019 Helen 9855.0
9 11-12-2019 Maria 765.2
11 11-13-2019 Gui 1879.5 --< I removed ID 10 purposely
12 11-14-2019 Eric 600.0
13 11-15-2019 Paul 12890.1
How can I return records 11 and 9 in the same row, given the parameter :date, so that the row shows the last sale with orderdate = :date together with the adjacent sale? And if I changed record 9 to the same date as record 11, I would want to get record 8 instead (the last sale of the previous day).
pseudo-code
select last 2 order where orderdate <= :date inner join (? a relation to put both in same row)
Step one is to replace your 'MM-DD-YYYY' date strings with ones that sort correctly, 'YYYY-MM-DD' for example (then you can also use the date and time functions on them if needed). Since your orderdate column has a default value of CURRENT_TIMESTAMP, but you're showing only the date, and not in the format CURRENT_TIMESTAMP produces, I assume you're inserting your dates manually instead of letting them be generated automatically on insert? The column names of your sample data don't match the ones in your table definition either, which is confusing.
Anyway, since you said you want the values in the same row, the lead() window function comes into play (requires SQLite 3.25 or newer). Something like:
WITH cte AS
(SELECT orderid, orderdate, customerid, ordervalue
, lead(orderid, 1) OVER bydate AS next_id
, lead(orderdate, 1) OVER bydate AS next_date
, lead(customerid, 1) OVER bydate AS next_customer
, lead(ordervalue, 1) OVER bydate AS next_value
FROM orders
WINDOW bydate AS (ORDER BY orderdate))
SELECT * FROM cte WHERE orderdate = :date;
gives for a :date of '2019-11-12':
orderid orderdate customerid ordervalue next_id next_date next_customer next_value
---------- ---------- ---------- ---------- ---------- ---------- ------------- ----------
9 2019-11-12 Maria 765.2 11 2019-11-13 Gui 1879.5
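The lead() query can be reproduced end to end with Python's sqlite3 module. This is only a sketch: the sample rows are the question's data rewritten with 'YYYY-MM-DD' dates as the answer recommends, and the variable names are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Orders (
    OrderID INTEGER PRIMARY KEY AUTOINCREMENT,
    OrderDate TIMESTAMP DEFAULT (CURRENT_TIMESTAMP),
    CustomerID VARCHAR(20),
    OrderValue DECIMAL (8, 3) NOT NULL
)
""")
# The question's sample rows, with dates converted to a sortable format.
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?)", [
    (6, "2019-11-09", "Eva", 6946.3),
    (7, "2019-11-10", "John", 850.6),
    (8, "2019-11-11", "Helen", 9855.0),
    (9, "2019-11-12", "Maria", 765.2),
    (11, "2019-11-13", "Gui", 1879.5),
    (12, "2019-11-14", "Eric", 600.0),
    (13, "2019-11-15", "Paul", 12890.1),
])

# lead(x, 1) reads x from the next row in the window's order, so each
# order row also carries the columns of the order that follows it.
LEAD_QUERY = """
WITH cte AS
 (SELECT orderid, orderdate, customerid, ordervalue
       , lead(orderid, 1) OVER bydate AS next_id
       , lead(orderdate, 1) OVER bydate AS next_date
       , lead(customerid, 1) OVER bydate AS next_customer
       , lead(ordervalue, 1) OVER bydate AS next_value
  FROM orders
  WINDOW bydate AS (ORDER BY orderdate))
SELECT * FROM cte WHERE orderdate = :date
"""

row = conn.execute(LEAD_QUERY, {"date": "2019-11-12"}).fetchone()
print(row)
```

The filter on :date sits outside the CTE on purpose: the window has to see every order so that lead() can find the following row before the result is narrowed to one date.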
I have a table that contains two timestamps, t1 (event open date) and t2 (event close date), and a primary key eventid.
If an event is open, t2 is null; when the event gets closed, the same row is updated with the closure date in t2.
For example, I want to check how many issues are open on each day, based on the opened date (t1), from 01-apr-2016 to 10-apr-2016.
I have to calculate how many events are open on every day in a selected date range.
Let's say eventid 1 was opened on 1st-APR and closed on 10th-APR, and I calculate the number of open issues per day on 11th-APR; then it should count event 1 as open from 1st-APR to 10th-APR.
Table Structure:-
================================================
EVENTID T1 T2
================================================
1 01-apr-2016 10-apr-2016
2 02-apr-2016 08-apr-2016
3 05-apr-2016 09-apr-2016
Expected Output:-
==============================================================================
DATE TOTAL_OPEN_EVENTS
==============================================================================
01-apr-2016 1
02-apr-2016 2(1 issue open on 1st(not closed on 2nd) and 1 on 2nd)
03-apr-2016 2
04-apr-2016 2
05-apr-2016 3
06-apr-2016 3
07-apr-2016 3
08-apr-2016 2(1 issue got closed on 8th(which was opened on 2nd))
09-apr-2016 2
10-apr-2016 0
How do I do this kind of calculation in Oracle Database?
In order to generate the end report, you need a row for each date in your desired range. You could either use a calendar table, if available, or a query on DUAL using CONNECT BY LEVEL <= some_number, which I find works well for generating rows on the fly. (In this case "some_number" is the number of days you want to report on.)
From there, you just need to join the individual dates to the date ranges in your event table:
-- create the "events" table
create table event_date_ranges
as
select 1 as event_id, TO_DATE('2016-APR-01', 'YYYY-MON-DD') as start_date, TO_DATE('2016-APR-10', 'YYYY-MON-DD') as end_date from dual
union all
select 2 as event_id, TO_DATE('2016-APR-02', 'YYYY-MON-DD') as start_date, TO_DATE('2016-APR-08', 'YYYY-MON-DD') as end_date from dual
union all
select 3 as event_id, TO_DATE('2016-APR-05', 'YYYY-MON-DD') as start_date, TO_DATE('2016-APR-09', 'YYYY-MON-DD') as end_date from dual
;
with
date_range_qry as
(-- one way to set the start and end dates for your report
select TO_DATE('2016-APR-01', 'YYYY-MON-DD') as report_start_date
, TO_DATE('2016-APR-10', 'YYYY-MON-DD') as report_end_date
from dual
)
, dates_qry
as
(
-- generate a row for all dates between 2016-APR-01 and 2016-APR-10
select report_start_date + ROWNUM - 1 as report_date
from dual
cross join
date_range_qry drq
connect by level <= (drq.report_end_date - drq.report_start_date + 1)
)
select dq.report_date, count(edr.event_id) as total_open_events
from dates_qry dq
left outer join
event_date_ranges edr
on dq.report_date >= edr.start_date
and dq.report_date < edr.end_date
group by dq.report_date
order by dq.report_date
Output:
REPORT_DATE TOTAL_OPEN_EVENTS
2016-APR-01 1
2016-APR-02 2
2016-APR-03 2
2016-APR-04 2
2016-APR-05 3
2016-APR-06 3
2016-APR-07 3
2016-APR-08 2
2016-APR-09 1
2016-APR-10 0
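The same "one row per date, then join to the ranges" idea can be tried without an Oracle instance. The sketch below is a SQLite analogue run through Python's sqlite3 module, not the Oracle query itself: a recursive CTE stands in for CONNECT BY LEVEL, dates are stored as 'YYYY-MM-DD' text, and the parameter names :report_start and :report_end are assumptions.

```python
import sqlite3

# Hypothetical SQLite copy of the answer's event_date_ranges table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_date_ranges (event_id INTEGER, start_date TEXT, end_date TEXT)"
)
conn.executemany("INSERT INTO event_date_ranges VALUES (?, ?, ?)", [
    (1, "2016-04-01", "2016-04-10"),
    (2, "2016-04-02", "2016-04-08"),
    (3, "2016-04-05", "2016-04-09"),
])

# dates_qry produces one row per day in the report range; the left join
# keeps days with no open events so they show up with a count of 0.
OPEN_EVENTS_QUERY = """
WITH RECURSIVE dates_qry(report_date) AS (
    SELECT :report_start
    UNION ALL
    SELECT date(report_date, '+1 day')
    FROM dates_qry
    WHERE report_date < :report_end
)
SELECT dq.report_date, COUNT(edr.event_id) AS total_open_events
FROM dates_qry dq
LEFT JOIN event_date_ranges edr
       ON dq.report_date >= edr.start_date
      AND dq.report_date <  edr.end_date   -- an event is not counted on its close date
GROUP BY dq.report_date
ORDER BY dq.report_date
"""

report = conn.execute(
    OPEN_EVENTS_QUERY,
    {"report_start": "2016-04-01", "report_end": "2016-04-10"},
).fetchall()
for day, open_events in report:
    print(day, open_events)
```

Changing the second join condition from < to <= would count an event as still open on its close date, which is the difference between this answer's output and the one that follows.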
You can try this:
create table events_log
as
select 1 as event_id, TO_DATE('01-04-2016', 'DD/MM/YYYY') as T1, TO_DATE('10-04-2016', 'DD/MM/YYYY') as T2 from dual
union all
select 2 as event_id, TO_DATE('02-04-2016', 'DD/MM/YYYY') as T1, TO_DATE('08-04-2016', 'DD/MM/YYYY') as T2 from dual
union all
select 3 as event_id, TO_DATE('05-04-2016', 'DD/MM/YYYY') as T1, TO_DATE('09-04-2016', 'DD/MM/YYYY') as T2 from dual
;
--------------
select v.REPORT_DATE, count(t.EVENT_ID) as open_event
from events_log t,
(select to_date('01/04/2016', 'DD/MM/YYYY') + ROWNUM - 1 as report_date
from dual
connect by level <= (to_date('11/04/2016', 'DD/MM/YYYY') -
to_date('01/04/2016', 'DD/MM/YYYY') + 1)) v
where t.T1(+) <= v.report_date
and t.T2(+) >= v.report_date
group by v.report_date
order by v.report_date;
Output will be:
report_date open_event
01/04/2016 1
02/04/2016 2
03/04/2016 2
04/04/2016 2
05/04/2016 3
06/04/2016 3
07/04/2016 3
08/04/2016 3
09/04/2016 2
10/04/2016 1
11/04/2016 0
I have the following tables in SQL Server:
user_id, value, date, action_id
----------------------------------
1 A 1/3/2012 null
1 K 1/4/2012 null
1 B 1/5/2012 null
2 X 1/3/2012 null
2 K 1/4/2012 1
3 K 1/3/2012 null
3 L 1/4/2012 2
3 K 1/5/2012 3
4 K 1/3/2012 null
action_id, state
----------------------------------
1 0
2 1
3 1
4 0
5 1
I need to return the most recent record for each user where the value is 'K' and the action id is either null or refers to an action whose state is 1. Here's the result set I want:
user_id, value, date, action_id
----------------------------------
3 K 1/5/2012 3
4 K 1/3/2012 null
For user_id 1, the most recent value is B and its action id is null, so I consider this the most recent record, but its value is not K.
For user_id 2, the most recent value is K, but action id 1 has state 0, so I fall back to X, but X is not K.
user_id 3 and 4 are straightforward.
I'm interested in a LINQ to SQL query in ASP.NET, but for now T-SQL is fine too.
The SQL query would be:
Select Top 1 T1.* from Table1 T1
LEFT JOIN Table2 T2
ON T1.action_id = T2.action_id
Where T1.Value = 'K' AND (T1.action_id is null or T2.state = 1)
Order by T1.date desc
LINQ query:
var result = context.Table1.Where(T1=> T1.Value == "K"
&& (T1.action_id == null ||
context.Table2
.Where(T2=>T2.State == 1)
.Select(T2 => T2.action_id).Contains(T1.action_id)))
.OrderByDescending(T => T.date)
.FirstOrDefault();
Good Luck !!
This query returns the most recent 'K' row for each user:
SELECT
*
FROM
(
SELECT
user_id
,value
,date
,action_id
,ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY date DESC) RowNum
FROM
testtable
WHERE
value = 'K'
) testtable
WHERE
RowNum = 1
You can also try the following approach if the combination of user_id and date is unique.
Keep the join predicates in an order that lets the query use indexes:
SELECT
testtable.*
FROM
(
SELECT
user_id
,MAX(date) LastDate
FROM
testtable
WHERE
value = 'K'
GROUP BY
user_id
) tblLastValue
INNER JOIN
testtable
ON
testtable.user_id = tblLastValue.user_id
AND
testtable.date = tblLastValue.LastDate
This would select the top entries for all users as described in your specification, as opposed to TOP 1 which just selects the most recent entry in the database. I'm assuming here that your tables are named users and actions:
WITH usersactions as
(SELECT
u.user_id,
u.value,
u.date,
u.action_id,
ROW_NUMBER() OVER (PARTITION BY u.user_id ORDER BY u.date DESC, u.action_id DESC) as row
FROM users u
LEFT OUTER JOIN actions a ON u.action_id = a.action_id
WHERE
u.value = 'K' AND
(u.action_id IS NULL OR a.state = 1)
)
SELECT * FROM usersactions WHERE row = 1
Or if you don't want to use a CTE:
SELECT * FROM
(SELECT
u.user_id,
u.value,
u.date,
u.action_id,
ROW_NUMBER() OVER (PARTITION BY u.user_id ORDER BY u.date DESC, u.action_id DESC) as row
FROM users u
LEFT OUTER JOIN actions a ON u.action_id = a.action_id
WHERE
u.value = 'K' AND
(u.action_id IS NULL OR a.state = 1)
) useractions
WHERE row = 1
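To check what the ROW_NUMBER() approach returns on the question's data, here is a sketch run in SQLite through Python's sqlite3 module rather than SQL Server. The table names users and actions follow the answer's assumption; the ISO date strings, the alias rn, and the second query are mine. Filtering on 'K' before ranking also returns a row for user 1, which the asker's expected output leaves out. The second query is one possible reading of the asker's rules: rank every eligible row first, then keep the top row only if its value is 'K'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER, value TEXT, date TEXT, action_id INTEGER);
CREATE TABLE actions (action_id INTEGER, state INTEGER);
INSERT INTO users VALUES
  (1,'A','2012-01-03',NULL), (1,'K','2012-01-04',NULL), (1,'B','2012-01-05',NULL),
  (2,'X','2012-01-03',NULL), (2,'K','2012-01-04',1),
  (3,'K','2012-01-03',NULL), (3,'L','2012-01-04',2), (3,'K','2012-01-05',3),
  (4,'K','2012-01-03',NULL);
INSERT INTO actions VALUES (1,0), (2,1), (3,1), (4,0), (5,1);
""")

# As in the answer: keep only eligible 'K' rows, then take the newest per user.
FILTER_THEN_RANK = """
SELECT user_id, value, date, action_id
FROM (SELECT u.user_id, u.value, u.date, u.action_id,
             ROW_NUMBER() OVER (PARTITION BY u.user_id
                                ORDER BY u.date DESC, u.action_id DESC) AS rn
      FROM users u
      LEFT OUTER JOIN actions a ON u.action_id = a.action_id
      WHERE u.value = 'K' AND (u.action_id IS NULL OR a.state = 1)) ranked
WHERE rn = 1
ORDER BY user_id
"""

# Assumed variant: rank all eligible rows, then require the newest one to be 'K'.
RANK_THEN_FILTER = """
SELECT user_id, value, date, action_id
FROM (SELECT u.user_id, u.value, u.date, u.action_id,
             ROW_NUMBER() OVER (PARTITION BY u.user_id
                                ORDER BY u.date DESC) AS rn
      FROM users u
      LEFT OUTER JOIN actions a ON u.action_id = a.action_id
      WHERE u.action_id IS NULL OR a.state = 1) ranked
WHERE rn = 1 AND value = 'K'
ORDER BY user_id
"""

filter_first = conn.execute(FILTER_THEN_RANK).fetchall()
rank_first = conn.execute(RANK_THEN_FILTER).fetchall()
print(filter_first)
print(rank_first)
```

On this data only the second query matches the two rows listed in the question, so which form is right depends on whether a newer non-K row should disqualify a user.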