Homework Question :: Tried very hard :: Need Direction - sqlite

I have been stuck on this progressive question for more than 10 days now.
The question is: Find all the actors that made more movies with Yash Chopra than with any other director.
Here's my attempt:
SELECT pidsWhoDidMoviesWithYashChopra.pid,
pidsWhoDidMoviesWithYashChopra.moviesWithYashChopra,
pidsOfThoseWhoDidMoviesWithDirectors.moviesByAPID,
pidsWhoDidMoviesWithYashChopra.countOfMoviesWithYashChopraByAPID,
pidsOfThoseWhoDidMoviesWithDirectors.totalNumberOfMoviesByAPID
FROM
(
SELECT TRIM(M_Cast.PID) AS pid, moviesByYashChopra.mDirectorMID AS moviesWithYashChopra, COUNT(moviesByYashChopra.mDirectorMID) AS countOfMoviesWithYashChopraByAPID
FROM M_Cast
JOIN
(
SELECT TRIM(M_Director.MID) AS mDirectorMID
FROM
M_Director
WHERE TRIM(M_Director.PID) IN
(
SELECT TRIM(Person.PID) AS personPID
FROM
Person
WHERE Person.Name LIKE '%Yash Chopra%'
)
) AS moviesByYashChopra
ON TRIM(M_Cast.MID) == moviesByYashChopra.mDirectorMID
GROUP BY pid
) AS pidsWhoDidMoviesWithYashChopra
JOIN
(
SELECT TRIM(M_Cast.PID) AS pid, TRIM(M_Cast.MID) AS moviesByAPID, COUNT(TRIM(M_Cast.MID)) AS totalNumberOfMoviesByAPID
FROM M_Cast
GROUP BY pid
) AS pidsOfThoseWhoDidMoviesWithDirectors
ON pidsWhoDidMoviesWithYashChopra.pid == pidsOfThoseWhoDidMoviesWithDirectors.pid
GROUP BY pidsWhoDidMoviesWithYashChopra.pid
And here's the output it produces
And here's the schema
Where I need help is how to go on from here: how do I handle the part of the question that says "than any other director"? I think that's the tricky part.
Any direction/hints will be helpful, Thanks...

You need to join person (for the actor's details) with m_cast, movie, m_director and person again (for the director's details) and group by actor and director to count the number of movies each actor made with each director.
Then use the window functions first_value() to get the name of the director with whom the actor made the most movies and lead() to get the next-highest number of movies (this is needed to filter out ties). Note that with the counts ordered descending, lag() would return NULL on the top row, so it cannot detect a tie.
Enclose this query inside a CTE and then filter on the Yash Chopra row:
with cte as (
select pa.pid, pa.name, pd.name dir_name, count(*) counter,
first_value(pd.name) over (partition by pa.pid, pa.name order by count(*) desc) max_dir_name,
lead(count(*)) over (partition by pa.pid, pa.name order by count(*) desc) next_counter
from person pa
inner join m_cast c on c.pid = pa.pid
inner join movie m on m.mid = c.mid
inner join m_director d on d.mid = m.mid
inner join person pd on pd.pid = d.pid
group by pa.pid, pa.name, pd.name
)
select pid, name, counter
from cte
where dir_name = 'Yash Chopra' and max_dir_name = 'Yash Chopra' and coalesce(next_counter, 0) < counter
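For anyone who wants to check the "count per actor and director, then compare" idea on a small data set, here is a minimal sketch that runs from Python's sqlite3 module. The three tables are a cut-down, hypothetical version of the schema (only the id and name columns), the rows are made up, and the comparison uses a plain correlated subquery instead of window functions, so it is an alternative formulation and not the query above.

```python
import sqlite3

# Hypothetical, cut-down schema and made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (pid TEXT PRIMARY KEY, name TEXT);
CREATE TABLE m_cast (mid TEXT, pid TEXT);
CREATE TABLE m_director (mid TEXT, pid TEXT);
INSERT INTO person VALUES
  ('d1', 'Yash Chopra'), ('d2', 'Other Director'),
  ('a1', 'Actor One'), ('a2', 'Actor Two'), ('a3', 'Actor Three');
-- m1 and m2 are directed by Yash Chopra, m3 and m4 by the other director
INSERT INTO m_director VALUES ('m1','d1'), ('m2','d1'), ('m3','d2'), ('m4','d2');
INSERT INTO m_cast VALUES
  ('m1','a1'), ('m2','a1'), ('m3','a1'),  -- a1: 2 with Yash Chopra, 1 with the other
  ('m1','a2'), ('m3','a2'),               -- a2: tie, 1 each
  ('m3','a3'), ('m4','a3');               -- a3: none with Yash Chopra
""")

query = """
WITH per_director AS (
  -- one row per (actor, director) with the number of shared movies
  SELECT c.pid AS actor_pid, d.pid AS director_pid, COUNT(*) AS movies
  FROM m_cast c
  JOIN m_director d ON d.mid = c.mid
  GROUP BY c.pid, d.pid
)
SELECT pa.name, y.movies
FROM per_director y
JOIN person pd ON pd.pid = y.director_pid AND pd.name = 'Yash Chopra'
JOIN person pa ON pa.pid = y.actor_pid
-- strictly more than the best count with any other director (0 if none)
WHERE y.movies > COALESCE((
  SELECT MAX(o.movies)
  FROM per_director o
  WHERE o.actor_pid = y.actor_pid AND o.director_pid <> y.director_pid
), 0)
ORDER BY pa.name
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Only Actor One should come back here: Actor Two is a tie and Actor Three never worked with Yash Chopra.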

Related

Error in using WITH clause and INTERSECT in SQLite

I have two SQL queries;
First one is:
with b as (select person_id from people where name='Ward Bond' and born=1903)
select title_id from b natural join crew;
Which is producing correct results and is OK.
Another one is:
with c as (select person_id from people where name='John Wayne' and born=1907)
select title_id from c natural join crew;
Which is also totally OK and producing correct results. As soon as I try to find the intersection of these two queries using the following query:
with b as (select person_id from people where name='Ward Bond' and born=1903) select title_id from b natural join crew
intersect
with c as (select person_id from people where name='John Wayne' and born=1907) select title_id from c natural join crew;
I get the error Error: near "with": syntax error
I'm using SQLite3. Can you please help me to find the problem? The thing I'm trying to get is straightforward; I want to have the intersection of these two temporary tables.
This is the correct syntax for SQLite:
select * from (
with b as (
select person_id
from people
where name='Ward Bond' and born=1903
)
select title_id from b natural join crew
)
intersect
select * from (
with c as (
select person_id
from people
where name='John Wayne' and born=1907
)
select title_id from c natural join crew
);
Another way to get the intersected rows:
with cte(name, born) as (values ('Ward Bond', 1903), ('John Wayne', 1907))
select c.title_id
from crew c natural join people p
where (p.name, p.born) in cte
group by c.title_id
having count(distinct p.person_id) = 2;
This is how I did it using views:
create view a as select person_id from people where name='Ward Bond' and born=1903;
create view b as select person_id from people where name='John Wayne' and born=1907;
with c as
(select title_id from a natural join crew
intersect
select title_id from b natural join crew)
select primary_title from c natural join titles;
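As far as I know, SQLite also accepts a single WITH clause placed in front of the whole compound SELECT, with both CTEs declared in it. Below is a small sketch run through Python's sqlite3 module; the two tables are reduced to the columns used in the question and the rows are made up.

```python
import sqlite3

# Reduced, hypothetical versions of the people and crew tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (person_id INTEGER, name TEXT, born INTEGER);
CREATE TABLE crew (title_id INTEGER, person_id INTEGER);
INSERT INTO people VALUES (1, 'Ward Bond', 1903), (2, 'John Wayne', 1907);
INSERT INTO crew VALUES (10, 1), (10, 2), (11, 1), (12, 2);
""")

# One WITH clause in front of the compound select declares both CTEs.
good = """
WITH b AS (SELECT person_id FROM people WHERE name='Ward Bond' AND born=1903),
     c AS (SELECT person_id FROM people WHERE name='John Wayne' AND born=1907)
SELECT title_id FROM b NATURAL JOIN crew
INTERSECT
SELECT title_id FROM c NATURAL JOIN crew
"""
shared = [row[0] for row in conn.execute(good)]
print(shared)

# The form from the question: a second WITH directly after INTERSECT.
bad = """
WITH b AS (SELECT person_id FROM people WHERE name='Ward Bond' AND born=1903)
SELECT title_id FROM b NATURAL JOIN crew
INTERSECT
WITH c AS (SELECT person_id FROM people WHERE name='John Wayne' AND born=1907)
SELECT title_id FROM c NATURAL JOIN crew
"""
try:
    conn.execute(bad)
    failed = False
except sqlite3.OperationalError as exc:
    failed = True
    print("rejected:", exc)
```

Title 10 is the only one that both people appear in, so it is the only row in the intersection.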

How do I include all max values within a row?

I'm very new to learning SQL, I apologize if my question isn't completely accurate.
The question I'm trying to answer with this query is "What is the most popular music genre in each country?" I've had to use a subquery and it works, but I found that for a few countries in the table, more than one genre has the MAX value. I'm stuck with how to edit my query so that all genres with the max value show in the results. Here is my code, using DB Browser for SQLite:
SELECT BillingCountry AS Country , name AS Genre , MAX(genre_count) AS Purchases
FROM (
SELECT i.BillingCountry, g.name, COUNT(g.genreid) AS genre_count
FROM Invoice i
JOIN InvoiceLine il
ON il.InvoiceId = i.InvoiceId
JOIN TRACK t
ON il.trackid = t.TrackId
JOIN Genre g
ON t.genreid = g.GenreId
GROUP BY 1,2
) sub
GROUP BY 1
Here is an example of the result:
| Country   | Genre | Purchases |
|-----------|-------|-----------|
| Argentina | Punk  | 9         |
| Australia | Rock  | 22        |
But when I run just the subquery to count the purchases, Argentina has two genres with 9 purchases (the max number for that country). How do I adjust my query to include both and not just the first one in the row?
You can do it with RANK() window function:
SELECT BillingCountry, name, genre_count
FROM (
SELECT i.BillingCountry, g.name, COUNT(*) AS genre_count,
RANK() OVER (PARTITION BY i.BillingCountry ORDER BY COUNT(*) DESC) rnk
FROM Invoice i
INNER JOIN InvoiceLine il ON il.InvoiceId = i.InvoiceId
INNER JOIN TRACK t ON il.trackid = t.TrackId
INNER JOIN Genre g ON t.genreid = g.GenreId
GROUP BY i.BillingCountry, g.name
)
WHERE rnk = 1
This will return the ties in separate rows.
If you want 1 row for each country, you could also use GROUP_CONCAT():
SELECT BillingCountry, GROUP_CONCAT(name) AS name, MAX(genre_count) AS genre_count
FROM (
SELECT i.BillingCountry, g.name, COUNT(*) AS genre_count,
RANK() OVER (PARTITION BY i.BillingCountry ORDER BY COUNT(*) DESC) rnk
FROM Invoice i
INNER JOIN InvoiceLine il ON il.InvoiceId = i.InvoiceId
INNER JOIN TRACK t ON il.trackid = t.TrackId
INNER JOIN Genre g ON t.genreid = g.GenreId
GROUP BY i.BillingCountry, g.name
)
WHERE rnk = 1
GROUP BY BillingCountry
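To see the difference between the two shapes of result, here is a small sketch using Python's sqlite3 module. It assumes SQLite 3.25 or later (needed for window functions) and replaces the four joined tables with one made-up purchase table, so the table and column names are hypothetical.

```python
import sqlite3

# Hypothetical flattened table standing in for the Invoice/Track/Genre joins.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchase (country TEXT, genre TEXT);
INSERT INTO purchase VALUES
  ('Argentina', 'Punk'), ('Argentina', 'Punk'),
  ('Argentina', 'Rock'), ('Argentina', 'Rock'),
  ('Argentina', 'Jazz'),
  ('Australia', 'Rock'), ('Australia', 'Rock'), ('Australia', 'Pop');
""")

# Count per (country, genre) and rank the counts inside each country.
ranked_sql = """
SELECT country, genre, COUNT(*) AS genre_count,
       RANK() OVER (PARTITION BY country ORDER BY COUNT(*) DESC) AS rnk
FROM purchase
GROUP BY country, genre
"""

# Ties share rank 1, so each tied genre gets its own row.
top = conn.execute(
    f"SELECT country, genre, genre_count FROM ({ranked_sql}) "
    "WHERE rnk = 1 ORDER BY country, genre"
).fetchall()
print(top)

# One row per country, tied genres concatenated (their order is not guaranteed).
one_row = conn.execute(
    f"SELECT country, GROUP_CONCAT(genre), MAX(genre_count) FROM ({ranked_sql}) "
    "WHERE rnk = 1 GROUP BY country ORDER BY country"
).fetchall()
print(one_row)
```

Argentina appears twice in the first result (Punk and Rock, 2 purchases each) and once in the second, with both genres in a single comma-separated value.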

Why would an Oracle subquery with AND & OR return the wrong result set

I have two subqueries, shown below. The first query works fine, but the second, which is basically the first query modified to use AND & OR, doesn't return the ID as expected. Any suggestions on what is happening here?
1. (SELECT * FROM (SELECT EMPID FROM EVENT_F
INNER JOIN WCINFORMATION_D
ON EVENT_F.JOB_INFO_ROW_WID= WCINFORMATION_D.ROW_WID
INNER JOIN WCANDIDATE_D ON WCCANDIDATE_D.ROW_WID = EVENT_F.CANDIDATE_ROW_WID
WHERE STEP_NAME = 'Offer'
AND WCINFORMATION_D.JOB_FAMILY_NAME IN ('MDP','ELP','Emerging Leader Program','Other')
AND TITLE NOT IN ('Student Ambassador Program for Eligible Summer Interns','Student Ambassador')
AND PI_CANDIDATE_NUM = OUTERAPP.PI_CANDIDATE_NUM
--limit 1
ORDER BY CREATION_DT ASC
) T1 WHERE ROWNUM=1) AS A_ID,
2.(SELECT * FROM (SELECT EMPID FROM EVENT_F
INNER JOIN WCINFORMATION_D
ON EVENT_F.JOB_INFO_ROW_WID= WCINFORMATION_D.ROW_WID
INNER JOIN WCANDIDATE_D ON WCCANDIDATE_D.ROW_WID = EVENT_F.CANDIDATE_ROW_WID
WHERE STEP_NAME = 'Offer'
AND WCINFORMATION_D.JOB_FAMILY_NAME IN ('MDP','ELP','Emerging Leader Program','Other') or WCINFORMATION_D.JOB_FAMILY_NAME NOT IN ('MDP','ELP','Emerging Leader Program','Other')
AND TITLE NOT IN ('Student Ambassador Program for Eligible Summer Interns','Student Ambassador')
AND PI_CANDIDATE_NUM = OUTERAPP.PI_CANDIDATE_NUM
--limit 1
ORDER BY CREATION_DT ASC
) T1 WHERE ROWNUM=1) AS A_ID,
If you're wanting to get the count of people in one set of job families, plus a count of people in another set, you need to use a conditional count, e.g. something along the lines of:
SELECT COUNT(CASE WHEN wid.job_family_name IN ('MDP', 'ELP', 'Emerging Leader Program', 'Other') THEN 1 END) job_family_grp1,
COUNT(CASE WHEN wid.job_family_name IS NULL OR wid.job_family_name NOT IN ('MDP', 'ELP', 'Emerging Leader Program', 'Other') THEN 1 END) job_family_grp2
FROM event_f ef
INNER JOIN wcinformation_d wid
ON ef.job_info_row_wid = wid.row_wid
INNER JOIN wcandidate_d wcd
ON wcd.row_wid = ef.candidate_row_wid
WHERE step_name = 'Offer' -- alias this column name
AND title NOT IN ('Student Ambassador Program for Eligible Summer Interns', 'Student Ambassador') -- alias this column name;
You will most likely need to amend this to work for your particular case (it'll have to go as a join into your main query, given there are two columns being selected) since you didn't provide enough information in your question to give us the wider context.
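The conditional count can be tried out on a few rows. The sketch below uses Python's sqlite3 module as a stand-in for Oracle (COUNT over a CASE expression behaves the same way for this purpose), with a made-up one-column table. It also shows why the IS NULL test is there: NOT IN on its own never matches a NULL job family.

```python
import sqlite3

# Made-up one-column table; SQLite is only a stand-in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_info (job_family_name TEXT);
INSERT INTO job_info VALUES ('MDP'), ('ELP'), ('MDP'), ('Sales'), (NULL);
""")

query = """
SELECT
  -- COUNT ignores NULLs, and CASE without ELSE yields NULL when nothing matches
  COUNT(CASE WHEN job_family_name IN ('MDP', 'ELP') THEN 1 END) AS grp1,
  COUNT(CASE WHEN job_family_name IS NULL
              OR job_family_name NOT IN ('MDP', 'ELP') THEN 1 END) AS grp2,
  -- without the IS NULL test the NULL row is counted in neither group
  COUNT(CASE WHEN job_family_name NOT IN ('MDP', 'ELP') THEN 1 END) AS grp2_without_null
FROM job_info
"""
grp1, grp2, grp2_without_null = conn.execute(query).fetchone()
print(grp1, grp2, grp2_without_null)
```

With these five rows the two groups add up to 5 only when the NULL row is handled explicitly.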

Aggregate sum from diff columns using conditions. PL/SQL

The task is to find out how much money each player earned during the year.
So I want to write a SELECT that, for a known player_id, aggregates his total winnings.
FIRST_PLAYER_ID is the id of the player who took first place, and FIRST_PRICE is the corresponding prize.
What I need:
player_id | total_prize_sum_over_all_tournaments_played_by_him
Code I tried so far:
SELECT PLAYER.SUNAME, TOURNAMENT.NAME FROM PLAYER
JOIN TOURNAMENT
ON PLAYER.ID = TOURNAMENT.FIRST_PLAYER_ID
OR PLAYER.ID = TOURNAMENT.SECOND_PLAYER_ID
OR PLAYER.ID = TOURNAMENT.THIRD_PLAYER_ID
ORDER BY PLAYER.BIRTH_DATE
Also, please help me to name this question correctly.
The following SQL statement aggregates the prize money won by each player. The idea is to aggregate each rank (first, second, third place) individually and then sum the three partial results for each player.
SELECT pt.p_id
, SUM(pt.tl)
FROM (
SELECT p.p_id
, sum(t.first_prize) tl
FROM player p
JOIN tournament t on ( t.first_player_id = p.p_id )
GROUP BY p.p_id
UNION ALL
SELECT p.p_id
, sum(t.second_prize) tl
FROM player p
JOIN tournament t on ( t.second_player_id = p.p_id )
GROUP BY p.p_id
UNION ALL
SELECT p.p_id
, sum(t.third_prize) tl
FROM player p
JOIN tournament t on ( t.third_player_id = p.p_id )
GROUP BY p.p_id
) pt
GROUP BY pt.p_id
;
This SQL fiddle is a full-fledged example (with restricted table definitions).
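A condensed variant of the same idea can be checked quickly with Python's sqlite3 module: stack the three (player, prize) column pairs with UNION ALL and sum once per player. The table below is a made-up, reduced tournament table; the column names follow the answer above (first_prize and so on) and are assumptions about the real schema.

```python
import sqlite3

# Made-up tournament table with only the columns needed for the sum.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tournament (
  name TEXT,
  first_player_id INTEGER,  first_prize INTEGER,
  second_player_id INTEGER, second_prize INTEGER,
  third_player_id INTEGER,  third_prize INTEGER
);
INSERT INTO tournament VALUES
  ('Open A', 1, 100, 2, 50, 3, 25),
  ('Open B', 2, 100, 1, 50, 3, 25);
""")

query = """
SELECT p_id, SUM(prize) AS total_prize
FROM (
  -- turn the three placing columns into one (player, prize) list
  SELECT first_player_id AS p_id, first_prize AS prize FROM tournament
  UNION ALL
  SELECT second_player_id, second_prize FROM tournament
  UNION ALL
  SELECT third_player_id, third_prize FROM tournament
) AS placings
GROUP BY p_id
ORDER BY p_id
"""
totals = conn.execute(query).fetchall()
print(totals)
```

UNION ALL (not UNION) matters here, because two identical (player, prize) rows from different tournaments must both be counted.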

Join two tables in SQLite and Count

I have two tables named "likes" and "comments", and I want a result with the number of likes and the number of comments for each user. I wrote the following query in SQLite, but the result is not correct for all users: for users that appear in both tables, each count comes out as the number of likes multiplied by the number of comments.
SELECT
likes.liker_name, likes.liker_id, likes.profile_picture ,
COUNT(comments.commenter_name) AS comment_count, COUNT( likes.liker_id) AS like_count
FROM likes
LEFT JOIN comments
ON likes.liker_name = comments.commenter_name
GROUP BY
likes.liker_name
ORDER BY
COUNT( likes.liker_id) DESC
How can I get correct value of count for users that exist in both tables?
The problem is that some users have comments but no likes, others have likes but no comments, some have both and some have none. Therefore I suggest using a union query and then summing over its result:
SELECT
u.name, u.id, u.profile_picture,
SUM(u.like_count) AS like_count, SUM(u.comment_count) AS comment_count
FROM (
SELECT
liker_name AS name, liker_id AS id, profile_picture,
COUNT(*) AS like_count, 0 AS comment_count
FROM
likes
GROUP BY
liker_name, liker_id, profile_picture
UNION ALL
SELECT
commenter_name AS name, commenter_id AS id, profile_picture,
0 AS like_count, COUNT(*) AS comment_count
FROM
comments
GROUP BY
commenter_name, commenter_id, profile_picture
) AS u
GROUP BY
u.name, u.id, u.profile_picture
If you have a separate user table, you could also left join the likes count and the comments count subqueries to the user table:
SELECT
u.name, u.id, u.profile_picture, l.cnt AS like_count, c.cnt AS comment_count
FROM
users u
LEFT JOIN
(SELECT liker_id, COUNT(*) AS cnt
FROM likes
GROUP BY liker_id
) AS l
ON u.user_id = l.liker_id
LEFT JOIN
(SELECT commenter_id, COUNT(*) AS cnt
FROM comments
GROUP BY commenter_id
) AS c
ON u.user_id = c.commenter_id
WHERE l.cnt > 0 OR c.cnt > 0
No matter how you do it, you must count the comments and the likes in separate subqueries. If you count after joining, you are counting over a result in which every like row is repeated once per matching comment row (and vice versa), so you get the wrong count.
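The effect is easy to reproduce on a handful of rows. This is a minimal sketch using Python's sqlite3 module, with the two tables reduced to a single made-up name column each: the first query counts after the join, the second counts each table separately and combines the results with UNION ALL.

```python
import sqlite3

# Reduced, made-up versions of the likes and comments tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE likes (liker_name TEXT);
CREATE TABLE comments (commenter_name TEXT);
INSERT INTO likes VALUES ('alice'), ('alice'), ('bob');
INSERT INTO comments VALUES ('alice'), ('alice'), ('alice'), ('carol');
""")

# Counting after the join: alice's 2 likes x 3 comments become 6 joined rows.
joined = conn.execute("""
SELECT l.liker_name,
       COUNT(c.commenter_name) AS comment_count,
       COUNT(l.liker_name) AS like_count
FROM likes l
LEFT JOIN comments c ON l.liker_name = c.commenter_name
GROUP BY l.liker_name
ORDER BY l.liker_name
""").fetchall()
print(joined)

# Counting each table on its own, then adding the two partial results.
separate = conn.execute("""
SELECT name, SUM(like_count) AS like_count, SUM(comment_count) AS comment_count
FROM (
  SELECT liker_name AS name, COUNT(*) AS like_count, 0 AS comment_count
  FROM likes GROUP BY liker_name
  UNION ALL
  SELECT commenter_name, 0, COUNT(*)
  FROM comments GROUP BY commenter_name
) AS u
GROUP BY name
ORDER BY name
""").fetchall()
print(separate)
```

In the joined version alice gets 6 for both counts and carol, who only commented, is missing; the second version reports 2 likes and 3 comments for alice and includes carol.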

Resources