ROW_NUMBER, RANK, or DENSE_RANK to have the SAME order number if there are rows with same values - sqlite

PROBLEM
Form a scoreboard with position number, player name and best score. If two players have the same score, they share the position and their names are listed in alphabetical order. (See example.)
I have two tables:
INSERT INTO players (id, name) VALUES (1, 'Uolevi'), (2, 'Maija'), (3, 'Liisa'), (4, 'Kaaleppi'), (5, 'Kotivalo');
INSERT INTO results (id, player_id, score) VALUES (1, 1, 100), (2, 2, 200), (3, 3, 200), (4, 4, 100), (5, 5, 50);
The expected result is:
| Order | Name | Score |
+-------+----------+-------+
| 1 | Liisa | 200 |
| 1 | Maija | 200 |
| 3 | Kaaleppi | 100 |
| 3 | Uolevi | 100 |
| 5 | Kotivalo | 50 |
Please look carefully at the order number. Because there are 2 rows with order number 1, the next order number will be 3, instead of 2.

Here is a possible solution:
WITH
places AS (
SELECT row_number() OVER (ORDER BY score DESC, name) AS place, name, score
FROM players
JOIN results ON players.id = results.player_id
)
SELECT first_value(place) OVER (PARTITION BY score ORDER BY place) AS place, name, score
FROM places ORDER BY score DESC, name;

Use rank() when tied rows should share a rank and the following rank should skip ahead (1, 1, 3). Use dense_rank() when you don't want gaps between ranks (1, 1, 2).
select rank() over (order by r.score desc) as order1,
dense_rank() over (order by r.score desc) as order2,
p.name,
r.score
from players p
join results r
on p.id = r.player_id
order by r.score desc, p.name;
Outcome (both rank() and dense_rank() are included for reference):
| order1 | order2 | name | score |
+--------+--------+-----------+-------+
| 1 | 1 | Liisa | 200 |
| 1 | 1 | Maija | 200 |
| 3 | 2 | Kaaleppi | 100 |
| 3 | 2 | Uolevi | 100 |
| 5 | 3 | Kotivalo | 50 |
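To check this locally, here is a minimal sketch that loads the sample data into an in-memory SQLite database through Python's sqlite3 module and runs the rank() query. The CREATE TABLE statements are assumptions (the question only shows the INSERT statements), and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema: the question shows only the INSERT statements.
    CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE results (id INTEGER PRIMARY KEY, player_id INTEGER, score INTEGER);
    INSERT INTO players (id, name) VALUES
        (1, 'Uolevi'), (2, 'Maija'), (3, 'Liisa'), (4, 'Kaaleppi'), (5, 'Kotivalo');
    INSERT INTO results (id, player_id, score) VALUES
        (1, 1, 100), (2, 2, 200), (3, 3, 200), (4, 4, 100), (5, 5, 50);
""")

# rank() gives tied scores the same place and skips the places they consumed.
scoreboard = conn.execute("""
    SELECT rank() OVER (ORDER BY r.score DESC) AS place, p.name, r.score
    FROM players p
    JOIN results r ON p.id = r.player_id
    ORDER BY r.score DESC, p.name
""").fetchall()

for row in scoreboard:
    print(row)
```

The sample data has exactly one result per player. If a player can have several results, the "best score" would probably have to be computed first with MAX(score) grouped by player, and the ranking applied to that.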

Related

Make partitions based on difference in date in Postgres window function

I have data in the following format
id | first_name | last_name | birth_date
abc | Jared | Pollard | 1970-01-01
def | Jared | Pollard | 1972-02-02
ghi | Jared | Pollard | 1980-01-01
klm | Jared | Pollard | 2015-01-01
and I would like a query which groups data based on the following rule
If first_name and last_name are equal and the birth_dates are within 5 years of each other, then the records belong to the same group
So the above data contains three groups group1=(abc, def), group2=(ghi) and group3=(klm)
Currently I have the following query which incorrectly creates only 2 groups, group1=(abc, def) and group2=(ghi, klm)
SELECT
g.id,
FIRST_VALUE(g.id) OVER (PARTITION BY lower(trim(g.last_name)), lower(trim(g.first_name)),
CASE WHEN g.birth_date between g.fv_birth_date - interval '5 year' AND g.fv_birth_date + interval '5 year' THEN 1 ELSE 0 END
ORDER BY g.last_used_dt DESC NULLS LAST) AS cluster_id
FROM (
SELECT id, last_used_dt, last_name, first_name, birth_date,
FIRST_VALUE(birth_date)
OVER (PARTITION BY
lower(trim(last_name)),
lower(trim(first_name))
ORDER BY last_used_dt DESC NULLS LAST) AS fv_birth_date
FROM guest
) g;
I understand this is because of the CASE statement within the PARTITION BY clause but am unable to come up with any other query
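One possible direction is the "gaps and islands" pattern: compare each row with the previous row for the same name and start a new group whenever the gap exceeds 5 years. The sketch below runs it in SQLite through Python's sqlite3 module so it is self-contained; the reduced guest schema is an assumption, and in Postgres the date arithmetic would be written with interval '5 years' instead of SQLite's date() modifier. It also assumes the chained reading of the rule: rows are compared with their immediate predecessor, so a long run of birth dates each within 5 years of the next ends up in one group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, assumed schema: only the columns the grouping rule needs.
    CREATE TABLE guest (id TEXT, first_name TEXT, last_name TEXT, birth_date TEXT);
    INSERT INTO guest VALUES
        ('abc', 'Jared', 'Pollard', '1970-01-01'),
        ('def', 'Jared', 'Pollard', '1972-02-02'),
        ('ghi', 'Jared', 'Pollard', '1980-01-01'),
        ('klm', 'Jared', 'Pollard', '2015-01-01');
""")

clusters = conn.execute("""
    WITH flagged AS (
        SELECT id, first_name, last_name, birth_date,
               -- 1 when this row starts a new group: there is no previous row
               -- for the name, or it is more than 5 years after the previous one.
               CASE
                   WHEN LAG(birth_date) OVER w IS NULL THEN 1
                   WHEN birth_date > date(LAG(birth_date) OVER w, '+5 years') THEN 1
                   ELSE 0
               END AS starts_group
        FROM guest
        WINDOW w AS (PARTITION BY lower(trim(last_name)), lower(trim(first_name))
                     ORDER BY birth_date)
    )
    -- A running sum of the flags numbers the groups within each name.
    SELECT id,
           SUM(starts_group) OVER (
               PARTITION BY lower(trim(last_name)), lower(trim(first_name))
               ORDER BY birth_date
           ) AS group_no
    FROM flagged
    ORDER BY birth_date
""").fetchall()

for row in clusters:
    print(row)
```

With the four sample rows this yields group 1 for abc and def, group 2 for ghi and group 3 for klm. If the rule is meant relative to a fixed anchor row rather than the previous row, this sketch would need a different comparison.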

Sqlite: How to get the percentage of a total as a column

I have a table with sequences and their total occurrence and I'd like to create a column which shows the percentage of the total of all sequences.
This is the basic table:
sequence
e3a0b000
e92d4030
e59f3038
e1a0c00d
e1a0c00d
e1a0c00d
e1a0c00d
...
SELECT sequence, Count(*) AS total
FROM seq_one
GROUP BY sequence
ORDER BY total DESC
Which then gives me this:
sequence | total
e1a0c00d | 155
e3510000 | 2
e2512001 | 1
e3a0b000 | 1
e59f3038 | 1
e92d4010 | 1
e92d4030 | 1
Then applying this to the temporary table:
SELECT sequence, total, total*1.0/SUM(total) as relative
FROM (SELECT sequence, Count(*) AS total
FROM seq_one
GROUP BY sequence
ORDER BY total DESC)
Gives me this:
sequence | total | relative
e1a0c00d | 155 | 0.95679012345679
I would like to get the whole column instead of only the first row.
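The single row appears because SUM(total) without a GROUP BY turns the outer query into an aggregate query, which collapses all rows into one. A minimal sketch of one way around it, assuming SQLite 3.25 or newer, is to make the sum a window function with an empty OVER () clause so that every row keeps its own line. It is run here through Python's sqlite3 module on a small made-up sample of seq_one, not the asker's real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seq_one (sequence TEXT)")
# Small illustrative sample: one sequence occurs four times, three occur once.
conn.executemany(
    "INSERT INTO seq_one (sequence) VALUES (?)",
    [("e3a0b000",), ("e92d4030",), ("e59f3038",)] + [("e1a0c00d",)] * 4,
)

rows = conn.execute("""
    SELECT sequence,
           total,
           -- SUM(total) OVER () is the grand total, repeated on every row
           total * 1.0 / SUM(total) OVER () AS relative
    FROM (SELECT sequence, COUNT(*) AS total
          FROM seq_one
          GROUP BY sequence)
    ORDER BY total DESC, sequence
""").fetchall()

for row in rows:
    print(row)
```

On an older SQLite without window functions, dividing by a scalar subquery such as (SELECT COUNT(*) FROM seq_one) should give the same column.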

Add serial number for each id based on dates

I have a dataset like the one shown below (except for Ser_No, which is the field I want to create).
+--------+------------+--------+
| CaseID | Order_Date | Ser_No |
+--------+------------+--------+
| 44 | 22-01-2018 | 1 |
+--------+------------+--------+
| 44 | 24-02-2018 | 3 |
+--------+------------+--------+
| 44 | 12-02-2018 | 2 |
+--------+------------+--------+
| 100 | 24-01-2018 | 1 |
+--------+------------+--------+
| 100 | 26-01-2018 | 2 |
+--------+------------+--------+
| 100 | 27-01-2018 | 3 |
+--------+------------+--------+
How can I generate a serial number for each CaseID based on my dates? The first date in a specific CaseID gets number 1, the second date in that CaseID gets number 2, and so on.
I'm working with T-SQL, by the way.
I've tried a few things:
CASE
WHEN COUNT(CaseID) > 1
THEN ORDER BY (Order_Date)
AND Ser_no +1
END
Thanks in advance.
First of all, although I don't understand what you did, it gives you what you wanted. The serial number is assigned by date order. The problem I can see is that the result shows you the rows in the wrong order (1, 3, 2 instead of 1, 2, 3).
To sort that order you can try this:
SELECT *, ROW_NUMBER() OVER (PARTITION BY caseid ORDER BY caseid, order_date) AS ser_no
FROM [Table]
Thanks for your reply.
Sorry for the misunderstanding: the Ser_No column is not in my table yet. It is the field I want to calculate.
I solved it myself this morning, and it looks almost the same as your query:
RANK() OVER (PARTITION BY CaseID ORDER BY CaseID, Order_Date ASC)
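A minimal sketch of the per-case numbering, run in SQLite through Python's sqlite3 module rather than T-SQL so that it is self-contained. The table name orders is hypothetical, and the dates are stored as ISO text (yyyy-mm-dd) so that they sort chronologically; with a real date column in T-SQL the same OVER clause should apply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table name; ISO dates so text order equals date order.
    CREATE TABLE orders (CaseID INTEGER, Order_Date TEXT);
    INSERT INTO orders VALUES
        (44, '2018-01-22'), (44, '2018-02-24'), (44, '2018-02-12'),
        (100, '2018-01-24'), (100, '2018-01-26'), (100, '2018-01-27');
""")

# Numbering restarts at 1 for every CaseID and follows the date order.
numbered = conn.execute("""
    SELECT CaseID, Order_Date,
           ROW_NUMBER() OVER (PARTITION BY CaseID ORDER BY Order_Date) AS Ser_No
    FROM orders
    ORDER BY CaseID, Order_Date
""").fetchall()

for row in numbered:
    print(row)
```

Listing CaseID again in the window's ORDER BY is redundant, because it is constant inside each partition. ROW_NUMBER() and RANK() agree as long as the dates within a case are distinct; if two orders of one case share a date, RANK() gives both the same number and skips the next one, while ROW_NUMBER() keeps counting.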

sqlite3 recursive aggregation of data

This may be a kind of the Knapsack problem.
I need to traverse a data table, group it by a column, choosing ones with better time.
Then repeat the previous step until the limit given by column CAPACITY is reached.
This is the demo scenario:
create table if not exists data( vid num, size num, epid num, sid num, capacity num, dt );
delete from data;
insert into data(vid,size,epid,sid,capacity,dt)
values
(0,20,1,1,50,1100), -- 2nd choice
(0,20,1,1,50,1000), -- 1st choice
(0,20,1,1,50,1200), -- last choice excluded because out of capacity
(1,20,2,2,50,1100), -- 2nd choice
(1,20,2,2,50,1000), -- 1st choice
(1,20,2,2,50,1200); -- last choice excluded because out of capacity
This is the non recursive solution:
with best0 as (
select a.rowid as tid,a.vid,a.sid,a.size,a.dt,a.capacity-a.size as remains,0 as level
from data a
group by a.sid
having min(a.dt)
),
best1 as (
select a.tid,a.vid,a.sid,a.size,a.dt,a.remains, a.level
from (
select
a.rowid as tid,a.sid,a.vid,a.size,a.capacity,a.dt,b.remains-a.size as remains,
b.level+1 as level
from data a
join best0 b on b.sid=a.sid -- and b.level=a.level-1
where not a.rowid in (select tid from best0)
and b.remains-a.size>0
) a group by a.sid having min(a.dt)
),
best2 as (
select a.tid,a.vid,a.sid,a.size,a.dt,a.remains, a.level
from (
select
a.rowid as tid,a.sid,a.vid,a.size,a.capacity,a.dt,b.remains-a.size as remains,
b.level+1 as level
from data a
join best1 b on b.sid=a.sid -- and b.level=a.level-1
where not a.rowid in (select tid from best0 union all select tid from best1)
and b.remains-a.size>0
) a group by a.sid having min(a.dt)
)
select * from best0
union all
select * from best1
union all
select * from best2
And this is the result:
tid | vid | sid | size | dt | remains | level
--- | --- | --- | ---- | ----- | -------- | -----------
2 | 0 | 1 | 20 | 1000 | 30 | 0
5 | 1 | 2 | 20 | 1000 | 30 | 0
1 | 0 | 1 | 20 | 1100 | 10 | 1
4 | 1 | 2 | 20 | 1100 | 10 | 1
This is the recursive version, which gives the error "recursive reference in a subquery: best":
with recursive best(tid,vid,sid,size,dt,remains,level)
as (
select a.rowid as tid,a.vid,a.sid,a.size,a.dt,a.capacity-a.size as remains,0 as level
from data a
group by a.sid
having min(a.dt)
union all
select a.tid,a.vid,a.sid,a.size,a.dt,a.remains, a.level
from (
select
a.rowid as tid,a.sid,a.vid,a.size,a.dt,b.remains-a.size as remains,
b.level+1 as level
from data a
join best b on b.sid=a.sid -- and b.level=a.level-1
where not a.rowid in (select tid from best) and b.remains-a.size>0
) a group by a.sid having min(a.dt)
)
select * from best
I tried different solutions, even using a loop counter, but every one of them gives the same error.
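For the demo data, a running total may avoid the recursion altogether. The sketch below is a swapped-in technique, not a fix of the recursive CTE: it sums size per sid in dt order with a window function and keeps the rows whose remaining capacity is still positive, mirroring the remains > 0 condition of the question. It assumes SQLite 3.25 or newer and is run through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (vid NUM, size NUM, epid NUM, sid NUM, capacity NUM, dt);
    INSERT INTO data (vid, size, epid, sid, capacity, dt) VALUES
        (0, 20, 1, 1, 50, 1100),
        (0, 20, 1, 1, 50, 1000),
        (0, 20, 1, 1, 50, 1200),
        (1, 20, 2, 2, 50, 1100),
        (1, 20, 2, 2, 50, 1000),
        (1, 20, 2, 2, 50, 1200);
""")

picked = conn.execute("""
    SELECT tid, vid, sid, size, dt, remains, level
    FROM (
        SELECT rowid AS tid, vid, sid, size, dt,
               -- capacity left after taking this row and all earlier rows of the sid
               capacity - SUM(size) OVER w AS remains,
               -- 0 for the earliest row of the sid, 1 for the next, and so on
               ROW_NUMBER() OVER w - 1 AS level
        FROM data
        WINDOW w AS (PARTITION BY sid ORDER BY dt)
    )
    WHERE remains > 0
    ORDER BY level, sid
""").fetchall()

for row in picked:
    print(row)
```

This matches the non-recursive result above for the demo rows. It is not equivalent in general: a row that does not fit still counts towards the running total, so a later, smaller row that would fit is not picked up, whereas the step-by-step version would consider it.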

Suggestion needed writing a complex query - sqlite

I have 4 columns in a table called musics - 'artist','genre', 'writer' , 'producer'.
I need to write a query that returns, for each of these columns, 0 if no value is repeated in that column and 1 if at least one value is repeated.
Any help is much appreciated.
SELECT (COUNT(artist) <> COUNT(DISTINCT artist)) artist,
(COUNT(genre) <> COUNT(DISTINCT genre)) genre,
(COUNT(writer) <> COUNT(DISTINCT writer)) writer,
(COUNT(producer) <> COUNT(DISTINCT producer)) producer
FROM musics
Another version
SELECT
( SELECT (COUNT(*) > 0)
FROM (SELECT 1 FROM musics GROUP BY artist HAVING COUNT(*) > 1) a
) artist,
( SELECT (COUNT(*) > 0)
FROM (SELECT 1 FROM musics GROUP BY genre HAVING COUNT(*) > 1) g
) genre,
( SELECT (COUNT(*) > 0)
FROM (SELECT 1 FROM musics GROUP BY writer HAVING COUNT(*) > 1) w
) writer,
( SELECT (COUNT(*) > 0)
FROM (SELECT 1 FROM musics GROUP BY producer HAVING COUNT(*) > 1) p
) producer
Sample data
| artist | genre | writer | producer |
------------------------------------------
| artist1 | genre1 | writer1 | producer1 |
| artist2 | genre2 | writer1 | producer2 |
| artist1 | genre3 | writer3 | producer3 |
Sample output:
| artist | genre | writer | producer |
--------------------------------------
| 1 | 0 | 1 | 0 |
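A minimal sketch that reproduces the sample output with the first query, using an in-memory SQLite database through Python's sqlite3 module. The column types are assumptions, since the question only names the columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed column types; the question only lists the column names.
    CREATE TABLE musics (artist TEXT, genre TEXT, writer TEXT, producer TEXT);
    INSERT INTO musics VALUES
        ('artist1', 'genre1', 'writer1', 'producer1'),
        ('artist2', 'genre2', 'writer1', 'producer2'),
        ('artist1', 'genre3', 'writer3', 'producer3');
""")

# A column has repeats when its row count differs from its distinct count.
flags = conn.execute("""
    SELECT COUNT(artist)   <> COUNT(DISTINCT artist)   AS artist,
           COUNT(genre)    <> COUNT(DISTINCT genre)    AS genre,
           COUNT(writer)   <> COUNT(DISTINCT writer)   AS writer,
           COUNT(producer) <> COUNT(DISTINCT producer) AS producer
    FROM musics
""").fetchone()

print(flags)
```

Both COUNT(column) and COUNT(DISTINCT column) skip NULLs, so repeated NULL values are not reported as duplicates by this comparison.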
For Artist
select convert(bit,(count(1)-1))
from table_name
group by artist -- <-- Replace artist with column name for which duplicate
Write one SELECT COUNT statement using DISTINCT on the specified column and another without DISTINCT, then compare the two according to your requirement.
You can also use four separate queries combined with UNION; each query must contain COUNT(column name) and a GROUP BY clause.

Resources