I have been trying to use LEFT OUTER JOIN and GROUP BY, and (failing) to use the || concatenation operator, to get the maximum score for the best book of each decade, but I have had no luck handling the case where two books get the same max score in the same decade.
I need the output below:
There are 2 tables:
Table 1: bookName
Schema: uniqueBookNameId, BookName, yearPublished (from 1901 to 2022)
Table 2: bookRating
Schema: uniqueBookNameId, bookRating
Use a CTE in which you join the tables and rank the books within each decade with the RANK() window function.
Then filter the results to get the top books of each decade:
WITH cte AS (
SELECT r.bookRating,
n.BookName,
(n.yearPublished / 10 * 10) || 's' AS Decade,
RANK() OVER (PARTITION BY n.yearPublished / 10 * 10 ORDER BY r.bookRating DESC) AS rnk
FROM bookName n INNER JOIN bookRating r
ON r.uniqueBookNameId = n.uniqueBookNameId
)
SELECT DISTINCT bookRating, BookName, Decade
FROM cte
WHERE rnk = 1
ORDER BY Decade;
or:
WITH cte AS (
SELECT r.bookRating,
n.BookName,
(n.yearPublished / 10 * 10) || 's' AS Decade,
RANK() OVER (PARTITION BY n.yearPublished / 10 * 10 ORDER BY r.bookRating DESC) AS rnk
FROM bookName n INNER JOIN bookRating r
ON r.uniqueBookNameId = n.uniqueBookNameId
)
SELECT bookRating, GROUP_CONCAT(DISTINCT BookName) AS BookName, Decade
FROM cte
WHERE rnk = 1
GROUP BY Decade
ORDER BY Decade;
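To see how RANK() keeps ties, here is a minimal sketch that runs the first query against an in-memory SQLite database from Python. The four sample rows are made up for illustration (they are not from the question), and window functions assume SQLite 3.25 or later. The decade expression is parenthesized because in SQLite `||` binds more tightly than `*` and `/`.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bookName (uniqueBookNameId INTEGER PRIMARY KEY,
                       BookName TEXT, yearPublished INTEGER);
CREATE TABLE bookRating (uniqueBookNameId INTEGER, bookRating INTEGER);
INSERT INTO bookName VALUES (1, 'A', 1991), (2, 'B', 1995),
                            (3, 'C', 1998), (4, 'D', 2003);
INSERT INTO bookRating VALUES (1, 9), (2, 9), (3, 7), (4, 8);
""")

query = """
WITH cte AS (
  SELECT r.bookRating,
         n.BookName,
         (n.yearPublished / 10 * 10) || 's' AS Decade,
         RANK() OVER (PARTITION BY n.yearPublished / 10 * 10
                      ORDER BY r.bookRating DESC) AS rnk
  FROM bookName n
  INNER JOIN bookRating r ON r.uniqueBookNameId = n.uniqueBookNameId
)
SELECT Decade, bookRating, BookName
FROM cte
WHERE rnk = 1
ORDER BY Decade, BookName
"""

# Books A and B tie at 9 in the 1990s, so both get rank 1 and both are returned.
top_books = conn.execute(query).fetchall()
for row in top_books:
    print(row)
```

Because RANK() gives tied rows the same rank, filtering on rnk = 1 returns every book that shares the top score in its decade.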
If a match returns more than one ID, select the ID with the max(date).
If more than one ID shares that max date, select the max(ID).
Go to TABLE1 to get the high-dated IDs.
Step 1: Identify the IDs that have the max dates
CREATE MULTISET VOLATILE TABLE TEST_VT
AS
(
SELECT
TABLE1.ID ,
TABLE1.KEY1,
TABLE1.STRT_DT,
TABLE1.TERM_DT,
Rank() OVER (PARTITION BY TABLE1.KEY1 ORDER BY TABLE1.STRT_DT DESC , TABLE1.ID DESC) RNK
FROM TABLE2
INNER JOIN TABLE1
ON TABLE2.KEY1=TABLE1.KEY1
WHERE TABLE1.TERM_DT='8888-12-31'
QUALIFY RNK=1
GROUP BY
1,2,3,4
)
WITH DATA PRIMARY INDEX(ID,KEY1) ON COMMIT PRESERVE ROWS;
Go to TABLE1 to get the high-dated KEY1s associated with the IDs from the step above.
Step 2: If there is more than one max date, take the max ID out of all the records matched in the step above
SELECT
TABLE1.ID ,
TABLE1.KEY1,
TABLE1.STRT_DT,
TABLE1.TERM_DT,
RANK() OVER (PARTITION BY TABLE1.KEY1 ORDER BY TABLE1.STRT_DT DESC , TABLE1.ID DESC) AS RNK
FROM TABLE1
INNER JOIN TEST_VT
ON TEST_VT.ID=TABLE1.ID
INNER JOIN TABLE3
ON TABLE3.KEY1=TABLE1.KEY1
WHERE TABLE1.TERM_DT='8888-12-31'
QUALIFY RNK=1
GROUP BY
1,2,3,4
I have been stuck on this progressive question for more than 10 days now.
The question is: Find all the actors that made more movies with Yash Chopra than any other director.
Here's my attempt:
SELECT pidsWhoDidMoviesWithYashChopra.pid,
pidsWhoDidMoviesWithYashChopra.moviesWithYashChopra,
pidsOfThoseWhoDidMoviesWithDirectors.moviesByAPID,
pidsWhoDidMoviesWithYashChopra.countOfMoviesWithYashChopraByAPID,
pidsOfThoseWhoDidMoviesWithDirectors.totalNumberOfMoviesByAPID
FROM
(
SELECT TRIM(M_Cast.PID) AS pid, moviesByYashChopra.mDirectorMID AS moviesWithYashChopra, COUNT(moviesByYashChopra.mDirectorMID) AS countOfMoviesWithYashChopraByAPID
FROM M_Cast
JOIN
(
SELECT TRIM(M_Director.MID) AS mDirectorMID
FROM
M_Director
WHERE TRIM(M_Director.PID) IN
(
SELECT TRIM(Person.PID) AS personPID
FROM
Person
WHERE Person.Name LIKE '%Yash Chopra%'
)
) AS moviesByYashChopra
ON TRIM(M_Cast.MID) == moviesByYashChopra.mDirectorMID
GROUP BY pid
) AS pidsWhoDidMoviesWithYashChopra
JOIN
(
SELECT TRIM(M_Cast.PID) AS pid, TRIM(M_Cast.MID) AS moviesByAPID, COUNT(TRIM(M_Cast.MID)) AS totalNumberOfMoviesByAPID
FROM M_Cast
GROUP BY pid
) AS pidsOfThoseWhoDidMoviesWithDirectors
ON pidsWhoDidMoviesWithYashChopra.pid == pidsOfThoseWhoDidMoviesWithDirectors.pid
GROUP BY pidsWhoDidMoviesWithYashChopra.pid
And here's the output it produces
And here's the schema
Where I need help is how to go on from here: how do I dissect the part of the question that says "than any other director"? I think that's the tricky part.
Any direction or hints would be helpful. Thanks.
You need to join person (for the actor's details) with m_cast, movie, m_director and person again (for the director's details) and group by actor and director to count the number of movies each actor made with each director.
Then use the window functions first_value() to get the name of the director with whom the actor made the most movies and lead() to get the second-highest number of movies (this is needed to filter out ties).
Enclose this query inside a CTE and then filter on the Yash Chopra row:
with cte as (
select pa.pid, pa.name, pd.name dir_name, count(*) counter,
first_value(pd.name) over (partition by pa.pid, pa.name order by count(*) desc) max_dir_name,
lead(count(*)) over (partition by pa.pid, pa.name order by count(*) desc) next_counter
from person pa
inner join m_cast c on c.pid = pa.pid
inner join movie m on m.mid = c.mid
inner join m_director d on d.mid = m.mid
inner join person pd on pd.pid = d.pid
group by pa.pid, pa.name, pd.name
)
select pid, name, counter
from cte
where dir_name = 'Yash Chopra' and max_dir_name = 'Yash Chopra' and coalesce(next_counter, 0) < counter
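As a cross-check on the tie handling, here is a small sketch of a different technique: instead of window functions, it counts movies per actor and director in a CTE and then compares the Yash Chopra count against the actor's best count with any other director in a correlated subquery. The table and column names follow the query above; the sample rows and person names other than Yash Chopra are invented for illustration.

```python
import sqlite3

# Hypothetical sample data: three actors, two directors.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (pid TEXT PRIMARY KEY, name TEXT);
CREATE TABLE m_cast (mid TEXT, pid TEXT);
CREATE TABLE m_director (mid TEXT, pid TEXT);
INSERT INTO person VALUES ('d1', 'Yash Chopra'), ('d2', 'Other Director'),
                          ('a1', 'Actor One'), ('a2', 'Actor Two'),
                          ('a3', 'Actor Three');
INSERT INTO m_director VALUES ('m1', 'd1'), ('m2', 'd1'), ('m3', 'd2');
INSERT INTO m_cast VALUES
  ('m1', 'a1'), ('m2', 'a1'), ('m3', 'a1'),  -- a1: 2 with Yash Chopra, 1 with the other
  ('m1', 'a2'), ('m3', 'a2'),                -- a2: tie, 1 each
  ('m1', 'a3');                              -- a3: Yash Chopra only
""")

query = """
WITH counts AS (
  -- one row per (actor, director) pair with the number of shared movies
  SELECT c.pid AS actor_pid, d.pid AS director_pid, COUNT(*) AS n
  FROM m_cast c
  INNER JOIN m_director d ON d.mid = c.mid
  GROUP BY c.pid, d.pid
)
SELECT pa.name, yc.n
FROM counts yc
INNER JOIN person pa ON pa.pid = yc.actor_pid
INNER JOIN person pd ON pd.pid = yc.director_pid
WHERE pd.name = 'Yash Chopra'
  -- strictly more than the best count with any other director (0 if none)
  AND yc.n > COALESCE((SELECT MAX(o.n)
                       FROM counts o
                       WHERE o.actor_pid = yc.actor_pid
                         AND o.director_pid <> yc.director_pid), 0)
ORDER BY pa.name
"""

actors = conn.execute(query).fetchall()
print(actors)
```

The strict `>` is what drops the tied actor; whether an actor who worked only with Yash Chopra should count is a reading of the question, and this sketch assumes it should.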
I have two tables named "likes" and "comments", and I want a result that has the counts of likes and comments for each user. I wrote the following query in SQLite, but the result is not correct for all users: for users that appear in both tables, each count comes out as the number of likes multiplied by the number of comments.
SELECT
likes.liker_name, likes.liker_id, likes.profile_picture ,
COUNT(comments.commenter_name) AS comment_count, COUNT( likes.liker_id) AS like_count
FROM likes
LEFT JOIN comments
ON likes.liker_name = comments.commenter_name
GROUP BY
likes.liker_name
ORDER BY
COUNT( likes.liker_id) DESC
How can I get correct value of count for users that exist in both tables?
The problem is that some users have comments but no likes, others have likes but no comments, some have both and some have none. Therefore, I suggest counting each table in its own subquery, combining the two with a union query, and summing the result again:
SELECT
u.name, u.id, u.profile_picture,
SUM(u.like_count) AS like_count, SUM(u.comment_count) AS comment_count
FROM (
SELECT
liker_name AS name, liker_id AS id, profile_picture,
COUNT(*) AS like_count, 0 AS comment_count
FROM
likes
GROUP BY
liker_name, liker_id, profile_picture
UNION ALL
SELECT
commenter_name AS name, commenter_id AS id, profile_picture,
0 AS like_count, COUNT(*) AS comment_count
FROM
comments
GROUP BY
commenter_name, commenter_id, profile_picture
) AS u
GROUP BY
u.name, u.id, u.profile_picture
If you have a separate user table, you could also left join the likes count and the comments count subqueries to the user table:
SELECT
u.name, u.id, u.profile_picture,
COALESCE(l.cnt, 0) AS like_count, COALESCE(c.cnt, 0) AS comment_count
FROM
users u
LEFT JOIN
(SELECT liker_id, COUNT(*) AS cnt
FROM likes
GROUP BY liker_id
) AS l
ON u.id = l.liker_id
LEFT JOIN
(SELECT commenter_id, COUNT(*) AS cnt
FROM comments
GROUP BY commenter_id
) AS c
ON u.id = c.commenter_id
WHERE l.cnt > 0 OR c.cnt > 0
No matter how you write it, you must count the comments and the likes in separate subqueries. If you count after joining, you are counting over a result in which rows may be duplicated (each like is repeated once per matching comment), so you get the wrong count.
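The inflation is easy to reproduce. The sketch below uses Python's sqlite3 module with one made-up user ('ann', 3 likes and 2 comments) and cut-down versions of the two tables, and compares counting after the join with counting in separate subqueries.

```python
import sqlite3

# Hypothetical data: one user with 3 likes and 2 comments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE likes (liker_id INTEGER, liker_name TEXT);
CREATE TABLE comments (commenter_id INTEGER, commenter_name TEXT);
INSERT INTO likes VALUES (1, 'ann'), (1, 'ann'), (1, 'ann');
INSERT INTO comments VALUES (1, 'ann'), (1, 'ann');
""")

# Counting after the join: every like is paired with every comment (3 x 2 rows).
joined = conn.execute("""
SELECT COUNT(likes.liker_id), COUNT(comments.commenter_name)
FROM likes
LEFT JOIN comments ON likes.liker_name = comments.commenter_name
GROUP BY likes.liker_name
""").fetchone()

# Counting in separate subqueries first, then joining the one-row-per-user results.
separate = conn.execute("""
SELECT l.name, l.cnt, c.cnt
FROM (SELECT liker_name AS name, COUNT(*) AS cnt
      FROM likes GROUP BY liker_name) AS l
LEFT JOIN (SELECT commenter_name AS name, COUNT(*) AS cnt
           FROM comments GROUP BY commenter_name) AS c
  ON c.name = l.name
""").fetchone()

print(joined)    # both counts inflated to 6
print(separate)  # ('ann', 3, 2)
```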
I'm looking to improve this query I wrote for a small web application in ASP.NET 4.0 using SQL-Server 2005. This application will allow the user to search by Product ID and have it return the following information:
Highest Purchase Price + Most Recent Date of purchase @ this price
Lowest Purchase Price + Most Recent Date of purchase @ this price
Most Recent Purchase Price + Date
Average Purchase Price (optional, I thought this might improve the usefulness of the app)
Here is the structure of the Products table (I'm only including relevant columns, this is a DB already in production and these are non-pk columns)
product_id (nvarchar(20))
price (decimal(19,2))
pDate (datetime)
Before I put down the query I have so far I just want to say that I can get this information easily through multiple queries, so if this is the best practice then disregard improving the query, but I was aiming to minimize the number of queries needed to get all needed information.
What I have so far: (Note: There are rows with price = 0 so I ignored those in the bottom select looking for the MIN price)
SELECT price, MAX(pDate)
FROM Products
WHERE product_id = @product_id AND
(price = (SELECT MAX(price)
FROM Products
WHERE product_id = @product_id) OR
price = (SELECT MIN(price)
FROM Products
WHERE product_id = @product_id AND price > 0))
GROUP BY price
Now this is returning 2 rows:
first = the lowest price + date
second row = high price + date
What I would ideally like is a query that returns 1 row with all the needed information stated above, if possible, as it would simplify displaying the information in ASP for me. And like I said earlier, if multiple queries are the best approach, then there is no need to rewrite a complex query here.
Edit
Here is some sample data
Desired query results: (ignore the format as I typed this in excel)
Here is the query I will be using thanks to Ken Benson:
SELECT TOP 1 prod.product_id,
minp.price AS minprice, minp.pDate as minlastdate,
maxp.price AS maxprice, maxp.pDate as maxlastdate,
ag.price AS averageprice
FROM products AS prod
LEFT JOIN (SELECT lmd.product_id,max(lmd.pDate) as pDate,mn.price FROM products as lmd INNER JOIN
(SELECT product_id, min(price) AS price from products WHERE price > 0 group by product_id) as mn ON lmd.product_id=mn.product_id AND lmd.price=mn.price
group by lmd.product_id,mn.price ) AS minp ON minp.product_id=prod.product_id
LEFT JOIN (SELECT lxd.product_id,max(lxd.pDate) as pDate,mx.price FROM products as lxd INNER JOIN
(SELECT product_id, max(price) AS price from products group by product_id) as mx ON lxd.product_id=mx.product_id AND lxd.price=mx.price
group by lxd.product_id,mx.price ) AS maxp ON maxp.product_id=prod.product_id
LEFT JOIN (SELECT product_id,avg(price) as price FROM products WHERE price > 0 GROUP BY product_id) AS ag ON ag.product_id=prod.product_id
WHERE prod.product_id=@product_id
I think you can do a couple of joins back to the table ...
Select product_id, min.price, min.pDate, max.price, max.pDate
FROM products as p
LEFT JOIN (Select Min(price), pDate, product_id FROM products GROUP BY product_id)
as min on min.product_id=p.product_id
LEFT JOIN (Select max(price), pDate, product_id FROM products GROUP BY product_id)
as max on max.product_id=p.product_id
Where p.product_id = @product_id
This second bit of code should produce desired results....
SELECT prod.product_id,
minp.price AS minprice, minp.pDate as minlastdate,
maxp.price AS maxprice, maxp.pDate as maxlastdate,
ag.price AS averageprice
FROM products AS prod
LEFT JOIN (SELECT lmd.product_id,max(lmd.pDate) as pDate,mn.price FROM products as lmd INNER JOIN
(SELECT product_id, min(price) AS price from products group by product_id) as mn ON lmd.product_id=mn.product_id
group by lmd.product_id,mn.price ) AS minp ON minp.product_id=prod.product_id
LEFT JOIN (SELECT lxd.product_id,max(lxd.pDate) as pDate,mx.price FROM products as lxd INNER JOIN
(SELECT product_id, max(price) AS price from products group by product_id) as mx ON lxd.product_id=mx.product_id
group by lxd.product_id,mx.price ) AS maxp ON maxp.product_id=prod.product_id
LEFT JOIN (SELECT product_id,avg(price) as price FROM products GROUP BY product_id) AS ag ON ag.product_id=prod.product_id
WHERE prod.product_id=1
LIMIT 1
Yep - left out an 'and' condition:
SELECT TOP 1
prod.product_id,
minp.price AS minprice, minp.pDate as minlastdate,
maxp.price AS maxprice, maxp.pDate as maxlastdate,
ag.price AS averageprice
FROM products AS prod
LEFT JOIN (SELECT lmd.product_id,max(lmd.pDate) as pDate,mn.price FROM products as lmd INNER JOIN
(SELECT product_id, min(price) AS price from products group by product_id) as mn ON lmd.product_id=mn.product_id AND lmd.price=mn.price
group by lmd.product_id,mn.price ) AS minp ON minp.product_id=prod.product_id
LEFT JOIN (SELECT lxd.product_id,max(lxd.pDate) as pDate,mx.price FROM products as lxd INNER JOIN
(SELECT product_id, max(price) AS price from products group by product_id) as mx ON lxd.product_id=mx.product_id AND lxd.price=mx.price
group by lxd.product_id,mx.price ) AS maxp ON maxp.product_id=prod.product_id
LEFT JOIN (SELECT product_id,avg(price) as price FROM products GROUP BY product_id) AS ag ON ag.product_id=prod.product_id
WHERE prod.product_id=@product_id
I would do this with a combination of ranking functions and conditional aggregations:
select product_id,
max(case when seqnum_hi = 1 then price end) as highPrice,
max(case when seqnum_hi = 1 then pdate end) as highPrice_date,
max(case when seqnum_low = 1 then price end) as lowPrice,
max(case when seqnum_low = 1 then pdate end) as lowPrice_date,
max(case when seqnum_rec = 1 then price end) as recentPrice,
max(case when seqnum_rec = 1 then pdate end) as recentPrice_date,
avg(price) as avg_price
from (select p.*,
row_number() over (partition by product_id order by price asc, pdate desc) as seqnum_low,
row_number() over (partition by product_id order by price desc, pdate desc) as seqnum_hi,
row_number() over (partition by product_id order by pdate desc) as seqnum_rec
from Products p
where product_id = @product_id
) p
group by product_id
The sequence numbers identify the rows with the particular attributes you care about (high price, low price, most recent). The conditional max then just selects information from those rows.
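Here is a rough, runnable sketch of the same ranking-plus-conditional-aggregation idea, run on SQLite from Python rather than on SQL Server. The five sample rows are invented (the sample data from the question is not reproduced here), the parameter is bound with `?` instead of `@product_id`, and each price ordering is tie-broken by date so that the most recent purchase at the high or low price is picked. There are no zero-price rows in this sample, so the `price > 0` filter mentioned in the question is not exercised.

```python
import sqlite3

# Hypothetical purchase history for a single product.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (product_id TEXT, price REAL, pDate TEXT);
INSERT INTO Products VALUES
  ('P1', 10.0, '2011-01-01'),
  ('P1', 25.0, '2011-02-01'),
  ('P1', 25.0, '2011-03-01'),
  ('P1', 10.0, '2011-04-01'),
  ('P1', 15.0, '2011-05-01');
""")

query = """
SELECT product_id,
       MAX(CASE WHEN seqnum_hi  = 1 THEN price END) AS high_price,
       MAX(CASE WHEN seqnum_hi  = 1 THEN pDate END) AS high_price_date,
       MAX(CASE WHEN seqnum_low = 1 THEN price END) AS low_price,
       MAX(CASE WHEN seqnum_low = 1 THEN pDate END) AS low_price_date,
       MAX(CASE WHEN seqnum_rec = 1 THEN price END) AS recent_price,
       MAX(CASE WHEN seqnum_rec = 1 THEN pDate END) AS recent_price_date,
       AVG(price) AS avg_price
FROM (SELECT p.*,
             ROW_NUMBER() OVER (PARTITION BY product_id
                                ORDER BY price ASC, pDate DESC) AS seqnum_low,
             ROW_NUMBER() OVER (PARTITION BY product_id
                                ORDER BY price DESC, pDate DESC) AS seqnum_hi,
             ROW_NUMBER() OVER (PARTITION BY product_id
                                ORDER BY pDate DESC) AS seqnum_rec
      FROM Products p
      WHERE product_id = ?) AS ranked
GROUP BY product_id
"""

# One row with high, low, most recent and average price for the product.
summary = conn.execute(query, ("P1",)).fetchone()
print(summary)
```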
The following should get what you want. It's pretty long, but is readable so should be easily modified by anyone who needs to:
;WITH CTE_MaxPrice AS
(
SELECT product_id, MAX(P.price) AS MaxPrice
FROM Products P
GROUP BY product_id
HAVING product_id = @product_id
),
CTE_MinPrice AS
(
SELECT product_id, MIN(P.price) AS MinPrice
FROM Products P
GROUP BY product_id
HAVING product_id = @product_id
),
CTE_MaxPriceDate AS
(
SELECT P.product_id, MAX(P.pDate) AS MaxDate
FROM Products P
INNER JOIN CTE_MaxPrice MaxP ON P.product_id = MaxP.product_id
AND P.price = MaxP.MaxPrice
GROUP BY P.product_id
),
CTE_MinPriceDate AS
(
SELECT P.product_id, MAX(P.pDate) AS MinDate
FROM Products P
INNER JOIN CTE_MinPrice MinP ON P.product_id = MinP.product_id
AND P.price = MinP.MinPrice
GROUP BY P.product_id
)
SELECT MaxP.MaxPrice, MaxPD.MaxDate,
MinP.MinPrice, MinPD.MinDate,
RP.price AS RecentPrice, MAX(RP.pDate) AS RecentDate,
AVG(AP.price) AS AveragePrice
FROM Products P
INNER JOIN CTE_MaxPrice MaxP ON P.product_id = MaxP.product_id
INNER JOIN CTE_MinPrice MinP ON P.product_id = MinP.product_id
AND MinP.MinPrice > 0
INNER JOIN CTE_MaxPriceDate MaxPD ON P.product_id = MaxPD.product_id
INNER JOIN CTE_MinPriceDate MinPD ON P.product_id = MinPD.product_id
INNER JOIN Products RP ON P.product_id = RP.product_id
INNER JOIN Products AP ON P.product_id = AP.product_id
GROUP BY MaxP.MaxPrice, MaxPD.MaxDate,
MinP.MinPrice, MinPD.MinDate, RP.price
HAVING P.product_id = @product_id
Well, since there have been three attempts to answer, and none have worked quite how you want, I'll tell you how I would do it. This assumes you can use a stored procedure and also assumes that the product table is not so huge that multiple separate queries would be a problem:
CREATE PROCEDURE myproc AS
DECLARE @Price1 money
DECLARE @Date1 smalldatetime
DECLARE @Price2 money
DECLARE @Date2 smalldatetime
DECLARE @Price3 money
DECLARE @Date3 smalldatetime
DECLARE @Price4 money
SELECT @Price1 = MAX(Price) FROM Products
SELECT @Date1 = MAX(pDate) FROM Products WHERE Price=@Price1
SELECT @Price2 = Min(Price) FROM Products WHERE Price >0
SELECT @Date2 = MAX(pDate) FROM Products WHERE Price=@Price2
SELECT @Date3 = Max(pDate) FROM Products
SELECT @Price3 = MAX(Price) FROM Products WHERE pDate=@Date3 --max in case there is more than one purchase with the same date.
SELECT @Price4 = AVG(Price) FROM Products WHERE Price>0
SELECT @Price1 As MaxPrice,
@Date1 As MaxPriceDate,
@Price2 As LowPrice,
@Date2 As LowPriceDate,
@Price4 As AveragePrice,
@Price3 As RecentPrice,
@Date3 As RecentPriceDate
GO
Forgive any typographical errors, I didn't test this, but if you can use stored procedures, this will work.
So this is not much different than doing your multiple queries from the client, but should perform better putting them all into a single SP. You could also cut the number of queries down a bit by using some of the code from your other answers, but I have left it this way for clarity.