Related
I'm very new to learning SQL, I apologize if my question isn't completely accurate.
The question I'm trying to answer with this query is "What is the most popular music genre in each country?" I've had to use a subquery and it works, but I found that for a few countries in the table, more than one genre has the MAX value. I'm stuck with how to edit my query so that all genres with the max value show in the results. Here is my code, using DB Browser for SQLite:
SELECT BillingCountry AS Country , name AS Genre , MAX(genre_count) AS Purchases
FROM (
SELECT i.BillingCountry, g.name, COUNT(g.genreid) AS genre_count
FROM Invoice i
JOIN InvoiceLine il
ON il.InvoiceId = i.InvoiceId
JOIN TRACK t
ON il.trackid = t.TrackId
JOIN Genre g
ON t.genreid = g.GenreId
GROUP BY 1,2
) sub
GROUP BY 1
Here is an example of the result:
| Country | Genre |Purchase|
|---------|-------|--------|
|Agrentina| Punk | 9 |
|Australia| Rock | 22 |
BUT in running just the subquery to COUNT the purchases, Argentina has two Genres with 9 Purchases (the max number for that country). How do I adjust my query to include both and not just the first one in the row?
You can do it with RANK() window function:
SELECT BillingCountry, name, genre_count
FROM (
SELECT i.BillingCountry, g.name, COUNT(*) AS genre_count,
RANK() OVER (PARTITION BY i.BillingCountry ORDER BY COUNT(*) DESC) rnk
FROM Invoice i
INNER JOIN InvoiceLine il ON il.InvoiceId = i.InvoiceId
INNER JOIN TRACK t ON il.trackid = t.TrackId
INNER JOIN Genre g ON t.genreid = g.GenreId
GROUP BY i.BillingCountry, g.name
)
WHERE rnk = 1
This will return the ties in separate rows.
If you want 1 row for each country, you could also use GROUP_CONCAT():
SELECT BillingCountry, GROUP_CONCAT(name) AS name, MAX(genre_count) AS genre_count
FROM (
SELECT i.BillingCountry, g.name, COUNT(*) AS genre_count,
RANK() OVER (PARTITION BY i.BillingCountry ORDER BY COUNT(*) DESC) rnk
FROM Invoice i
INNER JOIN InvoiceLine il ON il.InvoiceId = i.InvoiceId
INNER JOIN TRACK t ON il.trackid = t.TrackId
INNER JOIN Genre g ON t.genreid = g.GenreId
GROUP BY i.BillingCountry, g.name
)
WHERE rnk = 1
GROUP BY BillingCountry
In DB2 is there a way to basically say:
case when sku (select * from table1 where tb1field = 'SMOMD') then 'True' end
Okay so this is my query so far, I've been going at this for at least a month now so any help would be great.
select tb4.customer, tb4.sku, tb4.qty, tb4.retqty, tb4.stipqty, tb4.lastdate, tb4.firstdate, tb4.stipdate
from(
--Table 4
select tb3.Customer as Customer, tb3.sku as SKU, tb3.qty as Qty, tb3.retqty as RetQty, tb3.stipqty as STIPQty,
case when tb3.lastdate is null then '00/0000' else substr(tb3.lastdate,5,2)||'/'||substr(tb3.lastdate,1,4) end as LastDate,
case when tb3.firstdate is null then '00/0000' else substr(tb3.firstdate,5,2)||'/'||substr(tb3.firstdate,1,4) end as FirstDate,
case when tb3.stipdate is null then '00/0000' else substr(tb3.stipdate,5,2)||'/'||substr(tb3.stipdate,1,4) end as STIPDate
from(
--Table 3
select tb2.Customer as Customer, tb2.SKU as SKU, tb2.Qty as Qty, tb2.RetQty as RetQty, tb2.STIPQty as STIPQty,
max(case when tb2.TranID in ('010','100') then tb2.datenum end) as LastDate,
min(case when tb2.TranID in ('010','100') then tb2.datenum end) as FirstDate,
case when tb2.RC = '4M' then tb2.datenum end as STIPDate
from(
--Table 2
select tb1.Customer as Customer, tb1.SKU as SKU,
sum(case when tb1.TranID in ('010','100') then abs(tb1.OrdNet) else '0' end) as Qty,
sum(case when tb1.TranID = '500' and tb1.rc != '4M' then abs(tb1.OrdNet) else '0' end) as RetQty,
count(case when tb1.rc = '4M' then tb1.sku end) as STIPQty,
tb1.datenum as datenum, tb1.TranID as tranid, tb1.RC as rc
from(
--Table 1
select distinct stkund as Customer, sthptg||space(1)||stmodl||space(1)||stvari||space(1)||stfarb||space(1)||stgroe as SKU,
stvorg as TranID, stggru as RC, stprg09 as PG9, stprg08 as PG8, stperi as datenum, ormne1 as OrdNet
from st_usus.s_stati_pv
join caspdtau.cospf440 on stadrn = jadr40
where trim(stvert) in ('111S','122S')
and sthptg != 'V'
and aktv40 = 'A'
and stprg01 in ('01','04')
and stprg02 = '01'
and stvorg in ('500','010','100')
and stperi >= '20160100'
) as tb1
group by tb1.Customer, tb1.SKU, tb1.datenum, tb1.tranid, tb1.rc
) as tb2
group by tb2.customer, tb2.sku, tb2.qty, tb2.retqty, tb2.stipqty, tb2.tranid, tb2.rc, tb2.datenum
) as tb3
group by tb3.customer, tb3.sku, tb3.qty, tb3.retqty, tb3.stipqty, tb3.lastdate, tb3.firstdate, tb3.stipdate
) as tb4
order by tb4.Customer, tb4.sku
I'm not going to try to decipher exactly what you're trying to do...
Some general advice, rather than using Nested Table Expressions (NTE)
select <..> from (select <...>from mytable)
Consider Common Table Expressions (CTE)
with
table1 as (select <...> from st_usus.s_stati_pv join caspdtau.cospf440 on stadrn = jadr40)
, table2 as (select <...> from table1)
, table3 as (select <...> from table2)
, table4 as (select <....> from table3)
select <...> from table4;
Each CTE (ie. tableX) can refer to a prior CTE or a physical table/view as needed. The final select can refer to one or more CTE's along with one or more physical tables or views.
Nice thing about building with CTE's, is that you can check your results after each step..
with
table1 as (select <...> from st_usus.s_stati_pv join caspdtau.cospf440 on stadrn = jadr40)
select * from table1;
I have two subqueries. as shown below. the first query works fine but the second query which is basically the first query that I modified to use AND & OR, doesn't work in the sense that it doesn't return ID as expected. any suggestions on what is happening here?
1. (SELECT * FROM (SELECT EMPID FROM EVENT_F
INNER JOIN WCINFORMATION_D
ON EVENT_F.JOB_INFO_ROW_WID= WCINFORMATION_D.ROW_WID
INNER JOIN WCANDIDATE_D ON WCCANDIDATE_D.ROW_WID = VENT_F.CANDIDATE_ROW_WID
WHERE STEP_NAME = 'Offer'
AND WCINFORMATION_D.JOB_FAMILY_NAME IN ('MDP','ELP','Emerging Leader Program','Other')
AND TITLE NOT IN ('Student Ambassador Program for Eligible Summer Interns','Student Ambassador')
AND PI_CANDIDATE_NUM = OUTERAPP.PI_CANDIDATE_NUM
--limit 1
ORDER BY CREATION_DT ASC
) T1 WHERE ROWNUM=1) AS A_ID,
2.(SELECT * FROM (SELECT EMPID FROM EVENT_F
INNER JOIN WCINFORMATION_D
ON EVENT_F.JOB_INFO_ROW_WID= WCINFORMATION_D.ROW_WID
INNER JOIN WCANDIDATE_D ON WCCANDIDATE_D.ROW_WID = VENT_F.CANDIDATE_ROW_WID
WHERE STEP_NAME = 'Offer'
AND WCINFORMATION_D.JOB_FAMILY_NAME IN ('MDP','ELP','Emerging Leader Program','Other') or WCINFORMATION_D.JOB_FAMILY_NAME NOT IN ('MDP','ELP','Emerging Leader Program','Other')
AND TITLE NOT IN ('Student Ambassador Program for Eligible Summer Interns','Student Ambassador')
AND PI_CANDIDATE_NUM = OUTERAPP.PI_CANDIDATE_NUM
--limit 1
ORDER BY CREATION_DT ASC
) T1 WHERE ROWNUM=1) AS A_ID,
If you're wanting to get the count of people in one set of job families, plus a count of people in another set, you need to use a conditional count, e.g. something along the lines of:
SELECT COUNT(CASE WHEN wid.job_family_name IN ('MDP', 'ELP', 'Emerging Leader Program', 'Other') THEN 1 END) job_family_grp1,
COUNT(CASE WHEN wid.job_family_name IS NULL OR wid.job_family_name NOT IN ('MDP', 'ELP', 'Emerging Leader Program', 'Other') THEN 1 END) job_family_grp2
FROM event_f ef
INNER JOIN wcinformation_d wid
ON ef.job_info_row_wid = wid.row_wid
INNER JOIN wcandidate_d wcd
ON wcd.row_wid = ef.candidate_row_wid
WHERE step_name = 'Offer' -- alias this column name
AND title NOT IN ('Student Ambassador Program for Eligible Summer Interns', 'Student Ambassador') -- alias this column name;
You will most likely need to amend this to work for your particular case (it'll have to go as a join into your main query, given there are two columns being selected) since you didn't provide enough information in your question to give us the wider context.
I have a requirement below to choose latest status.
Table 1:
Table2:
Results Expected:
Below is the logic we need.
SELECT
Table1.ID,
,CASE WHEN (Table1.hub=Table2.hub) THEN Table2.Status ELSE NULL END AS Original_Status
,CASE WHEN (Table1.hub<>Table2.hub AND Table2.Status like 'Found%' ) THEN Table2.hub ELSE NULL END AS Derived_Hub
,CASE WHEN (Table1.hub<>Table2.hub AND Table2.Status like 'Found%' ) THEN Table2.Status ELSE NULL END AS Derived_Status
from
Table1
Join Table2
ON (Table1.ID=Table2.ID)
With this code, I am getting 3 rows. If I put max in the above case statements, I am getting Hub3 instead of Hub2.
Could you please let me know how I can merge everything into single row. Please note that I want to show latest found Status from Table 2 even if there are 2 hubs with found status.
Let me know if you need any further information.
I think your best bet is to join into Table 2 twice. The first time, much like you are doing here, but using a LEFT OUTER JOIN on both id and hub. Then the second time on a derived version of table2 where you only select the top 1 records when sorted by timestamp in descending order:
SELECT
t1.id,
t1.hub as "Original Hub"
t2.status as "Original Hub Status",
t3.hub as "Found Hub",
t3.status as "Found Hub Status"
FROM
Table1 t1
LEFT OUTER JOIN Table2 t2 ON
t1.id = t2.id AND
t1.hub = t2.hub
LEFT OUTER JOIN
(
--Select a hub with the same id, that doesn't share the same hub number
--Only choose the top record when sorted by timestamp in descending order
SELECT TOP 1 id, hub, status
FROM table2
WHERE t1.hub <> table2.hub
ORDER BY TimeStamp Desc
) t3 ON
t1.id = t3.id
That subquery that has alias t3 is using a correlated subquery by referencing the t1.hub value in it's WHERE statement.
In TD14.10 you can utilize LAST_VALUE to access data from the "last" row. The CASEs are based on your query:
SELECT
t2.ID,
CASE WHEN t1.Hub = t2.Hub THEN t2.Hub END AS Original_Hub,
CASE WHEN t1.Hub = t2.Hub THEN t2.Status END AS Original_Status,
-- get the last Hub
LAST_VALUE(CASE WHEN t1.Hub <> t2.Hub AND Table2.Status like 'Found%'
THEN t2.Hub
END)
OVER (PARTITION BY t1.ID
ORDER BY CASE WHEN t1.Hub = t2.Hub THEN 0 ELSE 1 END, t2.TS
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Derived_Hub,
-- get the last Status
LAST_VALUE(CASE WHEN t1.Hub <> t2.Hub AND Table2.Status like 'Found%'
THEN t2.Status
END)
OVER (PARTITION BY t1.ID
ORDER BY CASE WHEN t1.Hub = t2.Hub THEN 0 ELSE 1 END, t2.TS
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Derived_Status
FROM
Table1 AS t1
JOIN Table2 AS t2
ON (t1.ID=t2.ID)
QUALIFY
ROW_NUMBER () -- return the 1st row only
OVER (PARTITION BY t1.ID
ORDER BY CASE WHEN t1.Hub = t2.Hub THEN 0 ELSE 1 END, t2.TS) = 1
EXPLAIN should combine all OVER into a single STATS-step as they use the same PARTITION BY and ORDER BY.
I'm looking to improve this query I wrote for a small web application in ASP.NET 4.0 using SQL-Server 2005. This application will allow the user to search by Product ID and have it return the following information:
Highest Purchase Price + Most Recent Date of purchase # this price
Lowest Purchase Price + Most Recent Date of purchase # this price
Most Recent Purchase Price + Date
Average Purchase Price (optional, i thought this might improve the usefulness of the app)
Here is the structure of the Products table (I'm only including relevant columns, this is a DB already in production and these are non-pk columns)
product_id (nvarchar(20))
price (decimal(19,2))
pDate (datetime)
Before I put down the query I have so far I just want to say that I can get this information easily through multiple queries, so if this is the best practice then disregard improving the query, but I was aiming to minimize the number of queries needed to get all needed information.
What I have so far: (Note: There are rows with price = 0 so I ignored those in the bottom select looking for the MIN price)
SELECT price, MAX(pDate)
FROM Products
WHERE product_id = #product_id AND
(price = (SELECT MAX(price)
FROM Products
WHERE product_id =#product_id) OR
price = (SELECT MIN(price)
FROM Products
WHERE product_id = #product_id AND price > 0))
GROUP BY price
Now this is returning 2 rows:
first = the lowest price + date
second row = high price + date
What I would like ideally is to have a query return 1 row with all the needed information stated above if possible, as it would simplify displaying the information in ASP for me. And like I said earlier, if multiple queries is the be approach then no need to re-write a complex query here.
Edit
Here is some sample data
Desired query results: (ignore the format as I typed this in excel)
Here is the query I will be using thanks to Ken Benson:
SELECT TOP 1 prod.product_id,
minp.price AS minprice, minp.pDate as minlastdate,
maxp.price AS maxprice, maxp.pDate as maxlastdate,
ag.price AS averageprice
FROM products AS prod
LEFT JOIN (SELECT lmd.product_id,max(lmd.pDate) as pDate,mn.price FROM products as lmd INNER JOIN
(SELECT product_id, min(price) AS price from products WHERE price > 0 group by product_id) as mn ON lmd.product_id=mn.product_id AND lmd.price=mn.price
group by lmd.product_id,mn.price ) AS minp ON minp.product_id=prod.product_id
LEFT JOIN (SELECT lxd.product_id,max(lxd.pDate) as pDate,mx.price FROM products as lxd INNER JOIN
(SELECT product_id, max(price) AS price from products group by product_id) as mx ON lxd.product_id=mx.product_id AND lxd.price=mx.price
group by lxd.product_id,mx.price ) AS maxp ON maxp.product_id=prod.product_id
LEFT JOIN (SELECT product_id,avg(price) as price FROM products WHERE price > 0 GROUP BY product_id) AS ag ON ag.product_id=prod.product_id
WHERE prod.product_id=#product_id
I think you can do a couple of joins back to the table ...
Select product_id, min.price, min.pDate, max.price, max.pDate
FROM products as p
LEFT JOIN (Select Min(price), pDate, product_id FROM products GROUP BY product_id)
as min on min.product_id=p.product_id
LEFT JOIN (Select max(price), pDate, product_id FROM products GROUP BY product_id)
as max on max.product_id=p.product_id
Where p.product_id = #product_id
This second bit of code should produce desired results....
SELECT prod.product_id,
minp.price AS minprice, minp.pDate as minlastdate,
maxp.price AS maxprice, maxp.pDate as maxlastdate,
ag.price AS averageprice
FROM products AS prod
LEFT JOIN (SELECT lmd.product_id,max(lmd.pDate) as pDate,mn.price FROM products as lmd INNER JOIN
(SELECT product_id, min(price) AS price from products group by product_id) as mn ON lmd.product_id=mn.product_id
group by lmd.product_id,mn.price ) AS minp ON minp.product_id=prod.product_id
LEFT JOIN (SELECT lxd.product_id,max(lxd.pDate) as pDate,mx.price FROM products as lxd INNER JOIN
(SELECT product_id, max(price) AS price from products group by product_id) as mx ON lxd.product_id=mx.product_id
group by lxd.product_id,mx.price ) AS maxp ON maxp.product_id=prod.product_id
LEFT JOIN (SELECT product_id,avg(price) as price FROM products GROUP BY product_id) AS ag ON ag.product_id=prod.product_id
WHERE prod.product_id=1
LIMIT 1
Yep - left out an 'and' condition:
SELECT TOP 1
prod.product_id,
minp.price AS minprice, minp.pDate as minlastdate,
maxp.price AS maxprice, maxp.pDate as maxlastdate,
ag.price AS averageprice
FROM products AS prod
LEFT JOIN (SELECT lmd.product_id,max(lmd.pDate) as pDate,mn.price FROM products as lmd INNER JOIN
(SELECT product_id, min(price) AS price from products group by product_id) as mn ON lmd.product_id=mn.product_id **AND lmd.price=mn.price**
group by lmd.product_id,mn.price ) AS minp ON minp.product_id=prod.product_id
LEFT JOIN (SELECT lxd.product_id,max(lxd.pDate) as pDate,mx.price FROM products as lxd INNER JOIN
(SELECT product_id, max(price) AS price from products group by product_id) as mx ON lxd.product_id=mx.product_id AND **lxd.price=mx.price**
group by lxd.product_id,mx.price ) AS maxp ON maxp.product_id=prod.product_id
LEFT JOIN (SELECT product_id,avg(price) as price FROM products GROUP BY product_id) AS ag ON ag.product_id=prod.product_id
WHERE prod.product_id=#product_id
I would do this with a combination of ranking functions and conditional aggregations:
select product_id,
max(case when seqnum_hi = 1 then price end) as highPrice,
max(case when seqnum_hi = 1 then pdate end) as highPrice_date
max(case when seqnum_low = 1 then price end) as lowPrice,
max(case when seqnum_low = 1 then pdate end) as lowPrice_date,
max(case when seqnum_rec = 1 then price end) as recentPrice,
max(case when seqnum_rec = 1 then pdate end) as recentPrice_date,
avg(price) as avg_price
from (select p.*,
row_number() over (partition by product_id order by price asc) as seqnum_low,
row_number() over (partition by product_id order by price desc) as seqnum_hi,
row_number() over (partition by product_id order by pdate desc) as seqnum_rec
from price
where product_id = #product_id
group by product_id
The seguence numbers identify the rows with the particular attributes you care about (high price, low price, most recent). The conditional max then just selects information from those rows.
The following should get what you want. It's pretty long, but is readable so should be easily modified by anyone who needs to:
;WITH CTE_MaxPrice AS
(
SELECT product_id, MAX(P.price) AS MaxPrice
FROM Products P
GROUP BY product_id
HAVING product_id = #product_id
),
CTE_MinPrice AS
(
SELECT product_id, MIN(P.price) AS MinPrice
FROM Products P
GROUP BY product_id
HAVING product_id = #product_id
),
CTE_MaxPriceDate AS
(
SELECT P.product_id, MAX(P.pDate) AS MaxDate
FROM Products P
INNER JOIN CTE_MaxPrice MaxP ON P.product_id = MaxP.product_id
AND P.price = MaxP.MaxPrice
GROUP BY P.product_id
),
CTE_MinPriceDate AS
(
SELECT P.product_id, MAX(P.pDate) AS MinDate
FROM Products P
INNER JOIN CTE_MinPrice MinP ON P.product_id = MinP.product_id
AND P.price = MinP.MinPrice
GROUP BY P.product_id
)
SELECT MaxP.MaxPrice, MaxPD.MaxDate,
MinP.MinPrice, MinPD.MinDate,
RP.price AS RecentPrice, MAX(RP.pDate) AS RecentDate,
AVG(AP.price) AS AveragePrice
FROM Products P
INNER JOIN CTE_MaxPrice MaxP ON P.product_id = MaxP.product_id
INNER JOIN CTE_MinPrice MinP ON P.product_id = MinP.product_id
AND MinP.MinPrice > 0
INNER JOIN CTE_MaxPriceDate MaxPD ON P.product_id = MaxPD.product_id
INNER JOIN CTE_MinPriceDate MinPD ON P.product_id = MinPD.product_id
INNER JOIN Products RP ON P.product_id = RP.product_id
INNER JOIN Products AP ON P.product_id = AP.product_id
GROUP BY MaxP.MaxPrice, MaxPD.MaxDate,
MinP.MinPrice, MinPD.MinDate, RP.price
HAVING P.product_id = #product_id
Well since there have been three attempts to answer, and none have worked quite how you want, I'll tell you how I would do it - and this assumes you can use a stored procedure and also assumes that the product table is not so huge that multiple seperate queries would be a problem:
CREATE PROCEDURE myproc AS
DECLARE #Price1 money
DECLARE #Date1 smalldatetime
DECLARE #Price2 money
DECLARE #Date2 smalldatetime
DECLARE #Price3 money
DECLARE #Date3 smalldatetime
DECLARE #Price4 money
SELECT #Price1 = MAX(Price) FROM Products
SELECT #Date1 = MAX(pDate) FROM Products WHERE Price=#Price1
SELECT #Price2 = Min(Price) FROM Products WHERE Price >0
SELECT #Date2 = MAX(pDate) FROM Products WHERE Price=#Price2
SELECT #Date3 = Max(pDate) FROM Products
SELECT #Price3 = MAX(Price) FROM Products WHERE pDate=#Date3 --max in case there are more than one purchases with the same date.
SELECT #Price4 = AVG(Price) FROM Products WHERE Price>0
SELECT #Price1 As MaxPrice,
#Date1 As MaxPriceDate,
#Price2 As LowPrice,
#Date2 As LowPriceDate,
#Price4 As AveragePrice,
#Price3 As RecentPrice,
#Price3 As RecentPriceDate
GO
Forgive any typographical errors, I didn't test this, but if you can use stored procedures, this will work.
So this is not much different than doing your multiple queries from the client, but should perform better putting them all into a single SP. You could also cut the number of queries down a bit by using some of the code from your other answers, but I have left it this way for clarity.