SQL - Select top n grouped by multiple fields, ordered by count - sqlite

I am attempting to find the top n records when grouped by multiple attributes. I believe it is related to this problem, but I am having difficulty adapting the solution described to my situation.
To simplify, I have a table with columns (did is short for device_id):
id int
did int
dateVal dateTime
I am trying to find the top n device_id's for each day with the most rows.
For example (ignoring id and the time part of dateTime),
did dateVal
1 2017-01-01
1 2017-01-01
1 2017-01-01
2 2017-01-01
3 2017-01-01
3 2017-01-01
1 2017-01-02
1 2017-01-02
2 2017-01-02
2 2017-01-02
2 2017-01-02
3 2017-01-02
Finding the top 2 would yield...
1, 2017-01-01
3, 2017-01-01
2, 2017-01-02
1, 2017-01-02
My current naive approach is only giving me the top 2 across all dates.
--Using SQLite
select date(dateVal) || did
from data
group by date(dateVal), did
order by count(*) desc
limit 2
I'm using the concatenation operator so that I can later extract the rows.
I am using SQLite, but any general SQL explanation would be appreciated.

Similarly to this question, define a CTE that computes all device counts for your desired groups, then use it in a WHERE ... IN subquery, limited to the top 2 devices for that date:
WITH device_counts AS (
    -- one row per (device, day) with its row count
    SELECT did, date(dateval) AS dateval, COUNT(*) AS device_count
    FROM data
    GROUP BY did, date(dateval)
)
SELECT did, dateval FROM device_counts DC_outer
WHERE did IN (
    -- the two devices with the most rows on the outer row's day
    SELECT did
    FROM device_counts DC_inner
    WHERE DC_inner.dateval = DC_outer.dateval
    ORDER BY DC_inner.device_count DESC LIMIT 2
)
ORDER BY dateval, did
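As an alternative to the correlated subquery, the same result can be sketched with a window function. This is a different technique from the answer above, not a restatement of it: it assumes SQLite 3.25 or later (for ROW_NUMBER), the table and column names from the question, and it breaks ties by did, which the question does not specify. The helper name top_devices_per_day is hypothetical.

```python
import sqlite3

def top_devices_per_day(conn, n):
    """Return (did, day) for the n devices with the most rows on each day."""
    sql = """
    WITH device_counts AS (
        SELECT did, date(dateVal) AS day, COUNT(*) AS cnt
        FROM data
        GROUP BY did, date(dateVal)
    ),
    ranked AS (
        -- rank devices within each day; ties broken by did (an assumption)
        SELECT did, day,
               ROW_NUMBER() OVER (PARTITION BY day ORDER BY cnt DESC, did) AS rn
        FROM device_counts
    )
    SELECT did, day FROM ranked WHERE rn <= ? ORDER BY day, rn
    """
    return conn.execute(sql, (n,)).fetchall()

# Sample data from the question (time part added only for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, did INT, dateVal TEXT)")
sample = (
    [(1, "2017-01-01 08:00:00")] * 3 + [(2, "2017-01-01 09:00:00")]
    + [(3, "2017-01-01 10:00:00")] * 2 + [(1, "2017-01-02 08:00:00")] * 2
    + [(2, "2017-01-02 09:00:00")] * 3 + [(3, "2017-01-02 10:00:00")]
)
conn.executemany("INSERT INTO data (did, dateVal) VALUES (?, ?)", sample)

result = top_devices_per_day(conn, 2)
print(result)
```

Ranking inside a PARTITION BY keeps the limit a plain parameter, so asking for the top 3 instead of the top 2 only changes the bound value.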

I tested this query on SQL Server. Note that it returns the top 2 (did, dateVal) groups overall, not the top 2 per day:
select top 2 did, dateVal
from (select did, dateVal, count(*) as c
      from test
      group by did, dateVal) as t
order by t.c desc

Related

MariaDB running total up to N and rows NOT included in its calculation

I have a table which, among other columns, has amt and created (a timestamp).
I'm trying to do two things:
Calculate the running total of amt up to N.
Get all the rows not included in the calculation leading to the sum up to N.
I'm doing this in code, but I was wondering whether there is a way to get these with SQL, ideally in one query.
It's easy to find examples of calculating a running total, such as
https://stackoverflow.com/a/1290936/400048, but harder to find one that computes the running total up to N and then returns only the rows not involved in reaching N.
You can use the window version of the SUM aggregate function to get the running total for each row.
CREATE TABLE TEST (ID BIGINT PRIMARY KEY, AMT INT, CREATED TIMESTAMP);
INSERT INTO TEST VALUES
(1, 1, TIMESTAMP '2000-01-01 00:00:00'),
(2, 2, TIMESTAMP '2000-01-02 00:00:00'),
(3, 1, TIMESTAMP '2000-01-03 00:00:00'),
(4, 3, TIMESTAMP '2000-01-04 00:00:00'),
(5, 5, TIMESTAMP '2000-01-05 00:00:00'),
(6, 1, TIMESTAMP '2000-01-07 00:00:00');
SELECT ID, AMT, SUM(AMT) OVER (ORDER BY CREATED) RT, CREATED FROM TEST ORDER BY CREATED;
> ID AMT RT CREATED
> -- --- -- -------------------
> 1 1 1 2000-01-01 00:00:00
> 2 2 3 2000-01-02 00:00:00
> 3 1 4 2000-01-03 00:00:00
> 4 3 7 2000-01-04 00:00:00
> 5 5 12 2000-01-05 00:00:00
> 6 1 13 2000-01-07 00:00:00
Then you can use a non-standard QUALIFY clause in H2 or a subquery (in both MariaDB and H2) to filter out rows below the limit.
If N is a running total limit and by “rows not included in the calculation” you mean rows above the limit, the queries will look like these:
-- Simple non-standard query for H2
SELECT ID, AMT, SUM(AMT) OVER (ORDER BY CREATED) RT, CREATED FROM TEST
QUALIFY RT > 10 ORDER BY CREATED;
-- Equivalent standard query with subquery for MariaDB, H2, and many others
SELECT * FROM (
SELECT ID, AMT, SUM(AMT) OVER (ORDER BY CREATED) RT, CREATED FROM TEST
) T WHERE RT > 10 ORDER BY CREATED;
> ID AMT RT CREATED
> -- --- -- -------------------
> 5 5 12 2000-01-05 00:00:00
> 6 1 13 2000-01-07 00:00:00
RT - AMT in the first row here is a running total of all previous rows. You can select it separately, if you wish:
-- Non-standard query for H2
SELECT SUM(AMT) OVER (ORDER BY CREATED) RT FROM TEST
QUALIFY RT < 10 ORDER BY CREATED DESC FETCH FIRST ROW ONLY;
-- Non-standard query for MariaDB or H2
SELECT RT FROM (
SELECT ID, AMT, SUM(AMT) OVER (ORDER BY CREATED) RT, CREATED FROM TEST
) T WHERE RT < 10 ORDER BY CREATED DESC LIMIT 1;
-- Standard query for H2 and others (but not for MariaDB)
SELECT RT FROM (
SELECT ID, AMT, SUM(AMT) OVER (ORDER BY CREATED) RT, CREATED FROM TEST
) T WHERE RT < 10 ORDER BY CREATED DESC FETCH FIRST ROW ONLY;
> RT
> --
> 7
If you meant something else, the QUALIFY or WHERE criteria will be different.
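For readers who want to try the subquery form without a MariaDB or H2 instance, here is a minimal sketch that runs the same statements through Python's sqlite3 module. SQLite is a stand-in here, not one of the engines discussed above; it needs version 3.25 or later for window functions, and the timestamps are stored as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST (ID INTEGER PRIMARY KEY, AMT INT, CREATED TEXT)")
conn.executemany(
    "INSERT INTO TEST VALUES (?, ?, ?)",
    [
        (1, 1, "2000-01-01 00:00:00"),
        (2, 2, "2000-01-02 00:00:00"),
        (3, 1, "2000-01-03 00:00:00"),
        (4, 3, "2000-01-04 00:00:00"),
        (5, 5, "2000-01-05 00:00:00"),
        (6, 1, "2000-01-07 00:00:00"),
    ],
)

limit = 10  # the running-total limit N from the question

# Rows whose running total exceeds the limit.
above = conn.execute(
    """
    SELECT * FROM (
        SELECT ID, AMT, SUM(AMT) OVER (ORDER BY CREATED) AS RT, CREATED FROM TEST
    ) AS T WHERE RT > ? ORDER BY CREATED
    """,
    (limit,),
).fetchall()

# Last running total that is still below the limit.
below = conn.execute(
    """
    SELECT RT FROM (
        SELECT SUM(AMT) OVER (ORDER BY CREATED) AS RT, CREATED FROM TEST
    ) AS T WHERE RT < ? ORDER BY CREATED DESC LIMIT 1
    """,
    (limit,),
).fetchone()[0]

print(above, below)
```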

SQLite: Number rows of query output

I am learning SQLite and wondering whether there is a simple way to add sequential numbering to the output of a query. Below is an example of what I am trying to achieve.
For instance, I have the following query:
SELECT
splTicker AS 'Ticker',
count(splTicker) AS '# of Splits'
FROM Splits
GROUP BY splTicker
ORDER BY count(splTicker) DESC, splTicker ASC;
The output of this query is as follows:
bash-3.2$ sqlite3 myShares < Queries/Split.sql
Ticker # of Splits
---------- -----------
AI.PA 7
ASML.AS 3
BN.PA 3
ALTR.LS 2
BOKA.AS 2
DG.PA 2
...
SON.LS 1
SU.PA 1
SW.PA 1
TEC.PA 1
UMI.BR 1
VIV.PA 1
VPK.AS 1
I am trying to add a sequential number to the rows to obtain the following output:
# Ticker # of Splits
-- ---------- -----------
1 AI.PA 7
2 ASML.AS 3
3 BN.PA 3
4 ALTR.LS 2
5 BOKA.AS 2
6 DG.PA 2
...
Currently, I use a workaround and add the row numbers post-query in Perl. I would like to know whether I can do this directly in SQLite. The idea seems simple, but I have not found a solution yet. Any help would be appreciated.
Best regards,
GAM
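Try this correlated subquery. Note that it counts the rows of Splits whose ticker sorts at or before the current one, so the numbers follow the alphabetical order of the tickers rather than the ORDER BY of the outer query: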
SELECT
(SELECT COUNT(*)
FROM Splits AS s2
WHERE s2.splTicker <= s1.splTicker) AS '#',
splTicker AS 'Ticker',
count(splTicker) AS '# of Splits'
FROM Splits s1
GROUP BY s1.splTicker
ORDER BY count(s1.splTicker) DESC, s1.splTicker ASC;
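If the SQLite version is 3.25 or later, a window function is another way to number the rows in exactly the output order. The sketch below is an alternative to the correlated subquery above, not part of that answer; the split counts are invented sample data, and only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Splits (splTicker TEXT)")

# Illustrative rows only: 3 splits for AI.PA, 2 each for ASML.AS and BN.PA, 1 for DG.PA.
sample = ["AI.PA"] * 3 + ["ASML.AS"] * 2 + ["BN.PA"] * 2 + ["DG.PA"]
conn.executemany("INSERT INTO Splits VALUES (?)", [(t,) for t in sample])

numbered = conn.execute(
    """
    SELECT ROW_NUMBER() OVER (ORDER BY n DESC, ticker ASC) AS num, ticker, n
    FROM (
        -- aggregate first, then number the grouped rows
        SELECT splTicker AS ticker, COUNT(*) AS n
        FROM Splits
        GROUP BY splTicker
    )
    ORDER BY num
    """
).fetchall()

for row in numbered:
    print(row)
```

Aggregating in a subquery and numbering afterwards keeps the window's ORDER BY identical to the ordering wanted in the output, so the numbers are always 1, 2, 3, and so on.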

sqlite query comparing data in different rows

I have the following table with a date (not a primary key) and a rating ('A' being the highest grade):
date rating
03-10-2010 C
03-09-2010 C
03-08-2010 B
03-07-2010 B
03-06-2010 B
03-05-2010 B
03-04-2010 A
I need a query that compares the ratings and returns a result for each date.
For example, considering the date 03-10-2010, I want to know when the last rating downgrade happened. If the downgrade was 1 day ago, return 1; if it was 2 days ago, return 2; and if it was more than 3 days ago, return 0.
I would run the same query for each date, getting an array with the results.
I'm stuck and have run out of ideas. Can anyone help me, please?
Thanks.
You want the difference, in days, between the date of each record and the date of the record before the last downgrade.
When you have a specific record, the record before the last downgrade is the record that
has a higher rating than this record, and
has a lower date than this record, and
is the latest record of those.
In SQL, this can be done with a correlated subquery:
SELECT date,
rating,
(SELECT date
FROM MyTable AS downgrade
WHERE downgrade.date < MyTable.date
AND downgrade.rating < MyTable.rating
ORDER BY date DESC
LIMIT 1) AS downgrade_date
FROM MyTable
date rating downgrade_date
---------- ---------- ----------
2010-03-04 A
2010-03-05 B 2010-03-04
2010-03-06 B 2010-03-04
2010-03-07 B 2010-03-04
2010-03-08 B 2010-03-04
2010-03-09 C 2010-03-08
2010-03-10 C 2010-03-08
To compute the difference, convert the date into a numeric value.
You can then use this value for further computations:
SELECT date,
rating,
CASE
WHEN days <= 3 THEN days
ELSE 0
END AS whatever
FROM (SELECT date,
rating,
julianday(date) -
julianday((SELECT date
FROM MyTable AS downgrade
WHERE downgrade.date < MyTable.date
AND downgrade.rating < MyTable.rating
ORDER BY date DESC
LIMIT 1)) AS days
FROM MyTable)
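To check the query end to end, here is a small sketch that loads the sample ratings into an in-memory SQLite database and runs it. It assumes the dates are stored as yyyy-mm-dd text (as in the output shown above, not the mm-dd-yyyy form in the question) so that string comparison and julianday() behave as intended; the ORDER BY is added only to make the result deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (date TEXT, rating TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [
        ("2010-03-10", "C"), ("2010-03-09", "C"), ("2010-03-08", "B"),
        ("2010-03-07", "B"), ("2010-03-06", "B"), ("2010-03-05", "B"),
        ("2010-03-04", "A"),
    ],
)

rows = conn.execute(
    """
    SELECT date, rating,
           CASE WHEN days <= 3 THEN days ELSE 0 END AS whatever
    FROM (SELECT date, rating,
                 julianday(date) -
                 julianday((SELECT date
                            FROM MyTable AS downgrade
                            WHERE downgrade.date < MyTable.date
                              AND downgrade.rating < MyTable.rating
                            ORDER BY date DESC
                            LIMIT 1)) AS days
          FROM MyTable)
    ORDER BY date
    """
).fetchall()

for row in rows:
    print(row)
```

Rows with no earlier higher rating produce a NULL difference, which fails the `days <= 3` test and therefore falls through to 0.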

Filtering in Oracle based on a group of values contained in a list of values

I have following two tables:
ID_PERSON NAME
-----------------
1 John
2 Joe
3 Peter
ID_PERSON ID_SPECIALIZATION
------------------------------
1 5
1 6
1 7
2 5
2 1
3 6
3 10
I need to filter the data based on a group of ID_SPECIALIZATION values that will be provided. For example,
I want to display only those persons who have specializations in both 5 and 6, so the query should return only the first person. In the ASP.NET Web Form there will be two list boxes and left and right buttons: the first list box holds all possible specializations, and the user moves some of them to the second list box as filtering options. I have no idea how to put this filtering condition into an SQL query. Thanks for your help.
You could use the following:
SELECT p.id_person, p.NAME
FROM person p
JOIN person_spe s ON p.id_person = s.id_person
WHERE id_specialization IN (5, 6)
GROUP BY p.id_person, p.NAME
HAVING COUNT(*) = 2;
ID_PERSON NAME
---------- -----
1 John
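Because the list of specializations comes from the user, the IN list and the HAVING count both have to be built at run time. The following is a minimal sketch of that idea using bind placeholders; it runs against SQLite rather than Oracle purely so that it is self-contained, the helper name persons_with_all is hypothetical, and COUNT(DISTINCT ...) is used instead of COUNT(*) on the assumption that the link table could contain duplicate rows.

```python
import sqlite3

def persons_with_all(conn, wanted):
    """Return persons that have every specialization id in `wanted`."""
    wanted = sorted(set(wanted))
    if not wanted:
        return []
    # One "?" per requested id; the values themselves are always bound.
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT p.id_person, p.name
        FROM person p
        JOIN person_spe s ON s.id_person = p.id_person
        WHERE s.id_specialization IN ({placeholders})
        GROUP BY p.id_person, p.name
        HAVING COUNT(DISTINCT s.id_specialization) = ?
        ORDER BY p.id_person
    """
    return conn.execute(sql, (*wanted, len(wanted))).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id_person INT, name TEXT)")
conn.execute("CREATE TABLE person_spe (id_person INT, id_specialization INT)")
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, "John"), (2, "Joe"), (3, "Peter")])
conn.executemany("INSERT INTO person_spe VALUES (?, ?)",
                 [(1, 5), (1, 6), (1, 7), (2, 5), (2, 1), (3, 6), (3, 10)])

print(persons_with_all(conn, [5, 6]))
```

Only the number of placeholders is interpolated into the SQL text, so the user's selections never become part of the statement itself.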
One way to do it:
SELECT
ID_PERSON
, NAME
FROM
Person AS p
WHERE EXISTS
( SELECT *
FROM
PersonSpecialization AS ps
WHERE ps.ID_PERSON = p.ID_PERSON
AND ps.ID_SPECIALIZATION = 5
)
AND EXISTS
( SELECT *
FROM
PersonSpecialization AS ps
WHERE ps.ID_PERSON = p.ID_PERSON
AND ps.ID_SPECIALIZATION = 6
)
SELECT d1.id_person, d1.name FROM tbl_table1 d1
INNER JOIN tbl_table2 d2
ON d1.ID_PERSON = d2.ID_PERSON
WHERE d2.ID_SPECIALIZATION = ?
That's the query, but I'm not sure how ASP.NET passes in the value. It might be worth looking up bind variables, which let you use placeholders in the SQL; Oracle then caches the query and just uses the values that you pass in at run time using EXECUTE IMMEDIATE.

Getting All the record of particular month - Building SQL Query

I need some help to build SQL Query. I have table having data like:
ID Date Name
1 1/1/2009 a
2 1/2/2009 b
3 1/3/2009 c
I need to get result something like...
1 1/1/2009 a
2 1/2/2009 b
3 1/3/2009 c
4 1/4/2009 Null
5 1/5/2009 Null
6 1/6/2009 Null
7 1/7/2009 Null
8 1/8/2009 Null
............................
............................
............................
30 1/30/2009 Null
31 1/31/2009 Null
I want a query something like:
Select * from tbl where month(Date)=1 AND year(Date)=2010
The query above is incomplete.
I need to get all the records for a particular month, even if some dates are missing.
I guess there must be an equi-join in the query; I am trying to build this query using one.
Thanks
BIG EDIT
Now I understand the OP's question.
Use a common table expression and a left join to get this effect.
DECLARE @FirstDay DATETIME;
-- Set start time
SELECT @FirstDay = '2009-01-01';
WITH Days AS
(
    -- anchor row: the first day of the month
    SELECT @FirstDay as CalendarDay
    UNION ALL
    -- recursive step: add one day until the next month starts
    SELECT DATEADD(d, 1, CalendarDay) as CalendarDay
    FROM Days
    WHERE DATEADD(d, 1, CalendarDay) < DATEADD(m, 1, @FirstDay)
)
SELECT DATEPART(d, d.CalendarDay), d.CalendarDay, t.Name FROM Days d
LEFT JOIN tbl t
ON
d.CalendarDay = t.Date
ORDER BY
d.CalendarDay;
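The same calendar-plus-left-join idea can be sketched outside SQL Server. The version below uses SQLite's recursive CTE syntax through Python's sqlite3 module; it assumes the dates are stored as yyyy-mm-dd text (the question shows them as 1/1/2009), and the :first parameter name is an illustrative choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ID INT, Date TEXT, Name TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, "2009-01-01", "a"), (2, "2009-01-02", "b"), (3, "2009-01-03", "c")],
)

sql = """
WITH RECURSIVE days(day) AS (
    SELECT :first                          -- anchor: first day of the month
    UNION ALL
    SELECT date(day, '+1 day')             -- step: next calendar day
    FROM days
    WHERE date(day, '+1 day') < date(:first, '+1 month')
)
SELECT CAST(strftime('%d', d.day) AS INTEGER) AS day_no, d.day, t.Name
FROM days AS d
LEFT JOIN tbl AS t ON t.Date = d.day       -- keeps days that have no record
ORDER BY d.day
"""
rows = conn.execute(sql, {"first": "2009-01-01"}).fetchall()

for row in rows:
    print(row)
```

Days without a matching record come back with None in the Name column, which corresponds to the Null entries in the desired output.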
Left this original answer at bottom
You need DATEPART, sir.
SELECT * FROM tbl WHERE DATEPART(m,Date) = 1
If you want to choose month and year, then you can use DATEPART twice or go for a range.
SELECT * FROM tbl WHERE DATEPART(m,Date) = 1 AND DATEPART(yyyy,Date) = 2009
Range :-
SELECT * FROM tbl WHERE Date >= '2009-01-01' AND Date < '2009-02-01'
See this link for more info on DATEPART.
http://msdn.microsoft.com/en-us/library/ms174420.aspx
You can use a date range with greater-than-or-equal and less-than comparisons.
Like so:
select * from tbl where date >= '2009-01-01' and date < '2009-02-01'
However, it is unclear if you want month 1 from all years?
You can check more examples and functions on "Date and Time Functions" from MSDN
Create a temporary table containing all the days of the month in question.
Then do a left outer join between that table and your data table, filtered on tempTable.month = @month.
You then have a table with every day of the desired month: the records matching those dates, plus empty records for the dates that have no data.
I hope that's what you want.

Resources