What is an example query to retrieve the first, second and third largest number from a database table using SQL Server?
You can sort by your value in descending order and take the top 3:
SELECT TOP 3 YourVal FROM YourTable ORDER BY YourVal DESC
Or, if you want each result separately:
first number:
SELECT TOP 1 YourVal FROM YourTable ORDER BY YourVal DESC
second number:
SELECT TOP 1 YourVal FROM YourTable
WHERE YourVal not in (SELECT TOP 1 YourVal FROM YourTable ORDER BY YourVal DESC)
ORDER BY YourVal DESC
third number:
SELECT TOP 1 YourVal FROM YourTable
WHERE YourVal not in (SELECT TOP 2 YourVal FROM YourTable ORDER BY YourVal DESC)
ORDER BY YourVal DESC
These queries assume YourVal is unique.
EDIT: following on from the OP's comment,
to get the nth value, select the TOP 1 that isn't in the TOP (n-1), so the fifth can be chosen by:
SELECT TOP 1 YourVal FROM YourTable
WHERE YourVal not in (SELECT TOP 4 YourVal FROM YourTable ORDER BY YourVal DESC)
ORDER BY YourVal DESC
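The "TOP 1 that isn't in the TOP (n-1)" pattern can be tried out without a SQL Server instance. The following is only a sketch under assumptions: it uses Python's built-in sqlite3 module, SQLite's LIMIT in place of TOP, made-up sample values, and a hypothetical helper named nth_largest. Like the queries above, it assumes YourVal is unique.

```python
import sqlite3


def nth_largest(conn, n):
    """Return the nth largest YourVal, or None if the table has fewer than n rows.

    Mirrors the pattern above: take the top 1 value that is not among
    the top (n - 1) values. LIMIT plays the role of SQL Server's TOP.
    """
    row = conn.execute(
        """
        SELECT YourVal FROM YourTable
        WHERE YourVal NOT IN (
            SELECT YourVal FROM YourTable ORDER BY YourVal DESC LIMIT ?
        )
        ORDER BY YourVal DESC
        LIMIT 1
        """,
        (n - 1,),
    ).fetchone()
    return row[0] if row else None


# Hypothetical sample data with unique values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (YourVal INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?)", [(v,) for v in (10, 40, 20, 50, 30)]
)

third = nth_largest(conn, 3)
fifth = nth_largest(conn, 5)
```

With n = 1 the inner query is limited to zero rows, so nothing is excluded and the maximum comes back.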
The proposed SELECT TOP n ... ORDER BY key will work, but be aware that you might get unexpected results if the column you're sorting on is not unique: rows that tie on the sort key come back in no guaranteed order, so which of them ends up inside the top n is arbitrary.
Sudhakar,
It may be easier to use ROW_NUMBER() or DENSE_RANK() for some of these questions. For example, to find YourVal and other columns from the fifth row in order of YourVal DESC:
WITH TRanked AS (
SELECT *,
ROW_NUMBER() OVER (
ORDER BY YourVal DESC, yourPrimaryKey
) AS rk
FROM YourTable
)
SELECT YourVal, otherColumns
FROM TRanked
WHERE rk = 5;
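If you want all rows with the fifth largest distinct YourVal value, change ROW_NUMBER() to DENSE_RANK() and order by YourVal DESC alone. With the primary key left in the ORDER BY as a tiebreaker, no two rows ever tie, so DENSE_RANK() would number the rows exactly like ROW_NUMBER().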
One really big advantage to these functions is the fact that you can immediately change a "the nth highest YourVal" query to a "the nth highest YourVal for each otherColumn" query just by adding PARTITION BY otherColumn to the OVER clause.
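To see the difference between the two functions, and the effect of PARTITION BY, here is a small sketch using Python's built-in sqlite3 module. It assumes an SQLite build with window function support (3.25 or later); the table layout, the grp column, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (id INTEGER PRIMARY KEY, grp TEXT, YourVal INTEGER)"
)
# Hypothetical data: group 'a' contains a tie on the value 30.
conn.executemany(
    "INSERT INTO YourTable (grp, YourVal) VALUES (?, ?)",
    [("a", 10), ("a", 30), ("a", 30), ("a", 20), ("b", 5), ("b", 7)],
)

# Second ROW per group: ties are broken by the primary key,
# so exactly one row per group has rk = 2.
second_row = conn.execute(
    """
    WITH TRanked AS (
        SELECT grp, YourVal,
               ROW_NUMBER() OVER (
                   PARTITION BY grp ORDER BY YourVal DESC, id
               ) AS rk
        FROM YourTable
    )
    SELECT grp, YourVal FROM TRanked WHERE rk = 2 ORDER BY grp
    """
).fetchall()

# Second largest DISTINCT value per group: order by the value only,
# so tied rows share a rank.
second_distinct = conn.execute(
    """
    WITH TRanked AS (
        SELECT grp, YourVal,
               DENSE_RANK() OVER (
                   PARTITION BY grp ORDER BY YourVal DESC
               ) AS rk
        FROM YourTable
    )
    SELECT grp, YourVal FROM TRanked WHERE rk = 2 ORDER BY grp
    """
).fetchall()
```

In group 'a' the second row is the other 30, while the second largest distinct value is 20; in group 'b' both queries return 5.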
In some DBMSs the TOP clause is not available (for example Oracle, whose rownum the queries below rely on). How can this be done then? Suppose we need to find the 3rd largest salary in the employee table. First, select the distinct salaries from the table in descending order:
select distinct salary from employee order by salary desc
Now, among the salaries selected, we need the top 3, so we write:
select salary from (select distinct salary from employee order by salary desc) where rownum<=3 order by salary
This gives the top 3 salaries in ascending order, which moves the third largest salary to the first position. The final task is to print that value:
select salary from (select salary from (select distinct salary from employee order by salary desc) where rownum<=3 order by salary) where rownum=1
This gives the third largest salary. Please let me know if there is any mistake in the query. In general, to get the nth largest salary, rewrite the query above as:
select salary from (select salary from (select distinct salary from employee order by salary desc) where rownum<=n order by salary) where rownum=1
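The same "take the top n distinct values, then flip the order and take the first" idea can be sketched in SQLite, which has neither TOP nor rownum. This is an illustrative translation, not the Oracle query itself: LIMIT stands in for the rownum filters, and the helper name and sample salaries are assumptions.

```python
import sqlite3


def nth_largest_salary(conn, n):
    """Return the nth largest distinct salary.

    Inner query: the n largest distinct salaries (descending).
    Outer query: re-sort ascending and keep the first row.
    Like the rownum version, this returns the smallest salary when
    the table holds fewer than n distinct values.
    """
    row = conn.execute(
        """
        SELECT salary FROM (
            SELECT DISTINCT salary FROM employee
            ORDER BY salary DESC
            LIMIT ?
        )
        ORDER BY salary
        LIMIT 1
        """,
        (n,),
    ).fetchone()
    return row[0] if row else None


# Hypothetical sample data with one duplicated salary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?)", [(s,) for s in (100, 200, 300, 300, 400)]
)

third_largest = nth_largest_salary(conn, 3)
```

Because of the DISTINCT, the duplicated 300 counts once, so the third largest salary here is 200.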
If you have a table called Orders and 3 columns Id, ProductId and Quantity then to retrieve the top 3 highest quantities your query would look like:
SELECT TOP 3 [Id], [ProductId], [Quantity] FROM [Orders] ORDER BY [Quantity] DESC
or if you just want the quantity column:
SELECT TOP 3 [Quantity] FROM [Orders] ORDER BY [Quantity] DESC
This works perfectly! It returns the row with the third highest EmpId:
select top 1 * from Employees where EmpId in
(
select top 3 EmpId from Employees order by EmpId desc
) order by EmpId;
If you would like to get the 2nd, 4th or any other nth highest, just change top 3 to the appropriate number.
I have a table containing user_id, movie_id, rating. These are all INT, and ratings range from 1-5.
I want to get the median rating and group it by user_id, but I'm having some trouble doing this.
My code at the moment is:
SELECT AVG(rating)
FROM (SELECT rating
FROM movie_data
ORDER BY rating
LIMIT 2 - (SELECT COUNT(*) FROM movie_data) % 2
OFFSET (SELECT (COUNT(*) - 1) / 2
FROM movie_data));
However, this seems to return the median value of all the ratings. How can I group this by user_id, so I can see the median rating per user?
The following gives the required median:
DROP TABLE IF EXISTS movie_data2;
CREATE TEMPORARY TABLE movie_data2 AS
SELECT user_id, rating FROM movie_data order by user_id, rating;
SELECT a.user_id, a.rating FROM (
SELECT user_id, rowid, rating
FROM movie_data2) a JOIN (
SELECT user_id, cast(((min(rowid)+max(rowid))/2) as int) as midrow FROM movie_data2 b
GROUP BY user_id
) c ON a.rowid = c.midrow
;
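The logic is straightforward but the code is not beautified. Given encouragement or comments I will improve it. In a nutshell, the trick is to use SQLite's rowid: the temporary table is filled in (user_id, rating) order, so each user's rows occupy a contiguous range of rowids and the middle rowid of that range holds the median. For a user with an even number of ratings, this picks the lower of the two middle values rather than their average.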
This is not easily possible because SQLite does not allow correlated subqueries to refer to outer values in the LIMIT/OFFSET clauses.
Add WHERE clauses for the user_id to all three subqueries, and execute them for each user ID.
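One way to "execute them for each user ID" is to drive the query from application code. The sketch below uses Python's built-in sqlite3 module; the helper name median_per_user and the sample ratings are made up, and the query is the asker's median query with a user_id filter added to all three subqueries.

```python
import sqlite3

# The original median query, with the same WHERE clause added to
# the row source and to both COUNT(*) subqueries.
MEDIAN_SQL = """
SELECT AVG(rating)
FROM (SELECT rating
      FROM movie_data
      WHERE user_id = :uid
      ORDER BY rating
      LIMIT 2 - (SELECT COUNT(*) FROM movie_data WHERE user_id = :uid) % 2
      OFFSET (SELECT (COUNT(*) - 1) / 2
              FROM movie_data
              WHERE user_id = :uid))
"""


def median_per_user(conn):
    """Run the median query once per distinct user_id."""
    user_ids = [
        r[0]
        for r in conn.execute(
            "SELECT DISTINCT user_id FROM movie_data ORDER BY user_id"
        )
    ]
    return {
        uid: conn.execute(MEDIAN_SQL, {"uid": uid}).fetchone()[0]
        for uid in user_ids
    }


# Hypothetical sample data: (user_id, movie_id, rating).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie_data (user_id INT, movie_id INT, rating INT)")
conn.executemany(
    "INSERT INTO movie_data VALUES (?, ?, ?)",
    [
        (1, 10, 1), (1, 11, 3), (1, 12, 5),               # odd count
        (2, 10, 2), (2, 11, 4), (2, 12, 4), (2, 13, 5),   # even count
        (3, 10, 1), (3, 11, 2),                           # even count
    ],
)

medians = median_per_user(conn)
```

This issues one query per user, which is simple but can be slow for many users; an index on user_id would be a reasonable thing to try in that case.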
SELECT user_id,AVG(rating)
FROM movie_data
GROUP BY user_id
ORDER BY rating
Would anyone please explain how the query below works? It finds the Nth highest salary.
Does it work like bubble sort, taking one row of the outer table at a time and comparing it with the inner table? Please explain with an example.
Select distinct(salary)
from emp e
where &n = (
Select count(distinct(salary))
from emp
where e.salary<= salary);
Let's take this example: http://sqlfiddle.com/#!4/b863c9/9
Table
create table salary (salary int);
insert into salary values (100);
insert into salary values (200);
insert into salary values (300);
insert into salary values (400);
insert into salary values (400);
insert into salary values (500);
insert into salary values (600);
insert into salary values (700);
insert into salary values (800);
insert into salary values (900);
2nd highest
Select distinct(salary)
from salary e
where 2 = (
Select count(distinct(salary))
from salary
where e.salary <= salary);
What is the sub query doing?
For each distinct salary in the outer query, the sub query will execute. select distinct(salary) from salary lists all the distinct salaries in no particular order.
For each of those salaries, the sub query keeps the rows whose salary is greater than or equal to the outer query's salary. Then it counts the distinct salaries among them. Let's do a run through:
When outer query has salary of 100, sub query execute something like this: select count(distinct(salary)) from salary where salary >= 100 resulting in 9. That means, there are 9 distinct salaries >= 100.
When outer query has salary of 200, sub query executes select count(distinct(salary)) from salary where salary >= 200. The result is 8.
As the query goes over and over again to the next salaries, it will come to a salary of 800. Sub query select count(distinct(salary)) from salary where salary >= 800 results in 2. At this point, the where clause is satisfied for the outer query and the salary is printed.
As for the algorithm actually used to execute this, it depends on whether there is an index and whether you ask for an ordering.
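The walkthrough above can be reproduced locally. The sketch below loads the same ten rows into an in-memory SQLite database through Python's sqlite3 module and replaces the Oracle substitution variable &n with a bound parameter; the helper names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salary (salary INT)")
conn.executemany(
    "INSERT INTO salary VALUES (?)",
    [(s,) for s in (100, 200, 300, 400, 400, 500, 600, 700, 800, 900)],
)


def count_at_or_above(conn, value):
    """What the sub query computes for one outer salary."""
    return conn.execute(
        "SELECT COUNT(DISTINCT salary) FROM salary WHERE salary >= ?", (value,)
    ).fetchone()[0]


def nth_highest(conn, n):
    """The full correlated query, with n bound as a parameter."""
    rows = conn.execute(
        """
        SELECT DISTINCT salary
        FROM salary AS e
        WHERE ? = (SELECT COUNT(DISTINCT salary)
                   FROM salary
                   WHERE e.salary <= salary)
        """,
        (n,),
    ).fetchall()
    return [r[0] for r in rows]


second_highest = nth_highest(conn, 2)
```

The duplicated 400 is counted once by COUNT(DISTINCT ...), so there are 9 distinct salaries and asking for the 10th highest returns no rows.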
Let's pretend that we have a supermarket.
We have a table called Sales where every record is one article, so if we scan 3 articles we will have 3 rows. It has the columns ArticleId and Amount, where Amount is always 1.
And then we have a table called Articles, which has the columns ArticleId and AvailableAmount.
When the sale is done, we need to count the records per article in the Sales table and then subtract that count from the AvailableAmount of each article.
I'm thinking of something like this, but I don't know if I'm on the right track:
UPDATE Articles
SET
AvailableAmount = AvailableAmount - (
Select ArticleId,Count(*) From Sales Group by ArticleId HAVING Count(*) > 1
)
WHERE
ArticleId in(Select distinct ArticleId FROM Sales)
This query is almost correct, but
the subquery must return only one column,
HAVING Count(*) > 1 does not make sense, and
the subquery must return only one value, so you need a correlated subquery:
UPDATE Articles
SET AvailableAmount = AvailableAmount -
(SELECT COUNT(*)
FROM Sales
WHERE ArticleId = Articles.ArticleId)
WHERE ArticleId IN (SELECT ArticleId
FROM Sales)
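To check the correlated UPDATE on a small data set, one option is an in-memory SQLite database driven from Python. The stock levels and sales rows below are invented for the example; the UPDATE statement is the one shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Articles (ArticleId INTEGER PRIMARY KEY, AvailableAmount INTEGER);
    CREATE TABLE Sales (ArticleId INTEGER, Amount INTEGER);

    -- Hypothetical stock: article 3 is never sold.
    INSERT INTO Articles VALUES (1, 10), (2, 5), (3, 7);
    -- One row per scanned article: article 1 three times, article 2 once.
    INSERT INTO Sales VALUES (1, 1), (1, 1), (1, 1), (2, 1);
    """
)

# For each article that appears in Sales, subtract the number of its
# sales rows. The sub query is re-evaluated for every updated row.
conn.execute(
    """
    UPDATE Articles
    SET AvailableAmount = AvailableAmount -
        (SELECT COUNT(*)
         FROM Sales
         WHERE ArticleId = Articles.ArticleId)
    WHERE ArticleId IN (SELECT ArticleId
                        FROM Sales)
    """
)

stock = dict(conn.execute("SELECT ArticleId, AvailableAmount FROM Articles"))
```

Article 3 has no sales, so the outer WHERE leaves its row untouched.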
I am very new to SQL and am using SQLite 3 to run basket analysis on sales data.
The relevant columns are the product ID, a unique transaction ID (which identifies the basket) and the product quantity. Where a customer has bought more than one product type, the unique transaction ID is repeated.
I want to count the number of baskets where the customer has bought 1 item.
So far I have tried select count(distinct uniqID) from salesdata having sum(qty) = 1;
But this brought up an error saying a GROUP BY clause is required before HAVING.
I then tried select count(distinct uniqID) from salesdata group by uniqID having sum(qty) = 1
SQLite accepted this, but returned me a list of just 1s, which isn't right either!
I then tried select count(uniqID) from salesdata group by qty having sum(qty) = 1
SQLite also accepted this but returned nothing at all.
Any ideas would be hugely appreciated!
E
Try something like this to retrieve every user who has one or more items in their basket:
select uniqID, sum(qty) as total from salesdata group by uniqID having total >= 1
If you only want the users who have exactly 1 item in their basket, replace >= 1 with = 1, like:
select uniqID, sum(qty) as total from salesdata group by uniqID having total = 1
If you want the number of users with 1 item in their basket, you can get it like this:
SELECT COUNT(*) FROM (select uniqID, sum(qty) as total from salesdata group by uniqID having total = 1)
This selects the number of baskets that have only one item. Because WHERE qty = 1 discards every line with a quantity above one before grouping, a basket that combines a single qty = 1 line with larger-quantity lines is still counted. If you don't want that, remove the WHERE qty = 1 part.
SELECT
COUNT(uniqID) FROM
(SELECT
uniqID, SUM(qty) AS total
FROM
salesdata
WHERE
qty = 1
GROUP BY
uniqID
HAVING
total = 1)
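The difference between the asker's second attempt and the nested query can be seen on a toy data set. This is a sketch with invented baskets, run through Python's built-in sqlite3 module; the column name productID is an assumption, since the question does not name it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE salesdata (uniqID TEXT, productID INTEGER, qty INTEGER);
    -- Hypothetical baskets:
    --   A and D hold a single item, B holds two products,
    --   C holds one product with quantity 2.
    INSERT INTO salesdata VALUES
        ('A', 1, 1),
        ('B', 1, 1), ('B', 2, 1),
        ('C', 3, 2),
        ('D', 2, 1);
    """
)

# The asker's second attempt: the count is computed inside each group,
# so it yields one row (always 1) per qualifying basket.
per_group = [
    r[0]
    for r in conn.execute(
        "SELECT COUNT(DISTINCT uniqID) FROM salesdata "
        "GROUP BY uniqID HAVING SUM(qty) = 1"
    )
]

# Wrapping the grouped query and counting its rows gives the number
# of single-item baskets.
single_item_baskets = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT uniqID FROM salesdata
        GROUP BY uniqID
        HAVING SUM(qty) = 1
    )
    """
).fetchone()[0]
```

The "list of just 1s" from the question is the per-group result; its length is the answer the asker was after.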
I have the following query:
SELECT rowid FROM table1 ORDER BY RANDOM() LIMIT 1
And as well I have another table (table3). In that table I have columns table1_id and table2_id. table1_id is a link to a row in table1 and table2_id is a link to a row in another table.
I want my query to return only those results that are defined in table3, that is, only those whose table1 rowid appears in the table1_id column. There may not be any rows at all referring to a certain table1 rowid, and in that case I don't want to receive it.
How can I achieve this goal?
Update: I tried the following query, which doesn't work:
SELECT rowid FROM table1
WHERE rowid IN (SELECT table1_id FROM table3 WHERE table1_id = table1.rowid)
ORDER BY RANDOM() LIMIT 1
SELECT rowid FROM table1
WHERE rowid IN ( SELECT DISTINCT table1_id FROM table3 )
ORDER BY RANDOM() LIMIT 1;
This query means "choose at random a row from table1 which has an entry in table3".
Every row in table1 has an equal likelihood of being selected, as long as it is referenced at least once in table3. IN only tests membership, so the result is the same with or without DISTINCT.
If you are trying to get more than one result, then you should remove the "ORDER BY RANDOM() LIMIT 1" clause.
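A quick way to convince yourself that only referenced rows can come back is to run the query repeatedly against a small in-memory database. The sketch below uses Python's sqlite3 module; the name column, the sample rows, and the helper function are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (name TEXT);
    CREATE TABLE table3 (table1_id INTEGER, table2_id INTEGER);

    -- Five rows get rowids 1..5.
    INSERT INTO table1 (name) VALUES ('a'), ('b'), ('c'), ('d'), ('e');
    -- Only rowids 2 and 4 are referenced (2 is referenced twice).
    INSERT INTO table3 VALUES (2, 100), (2, 101), (4, 102);
    """
)


def random_referenced_rowid(conn):
    """Pick one random table1 rowid that appears in table3.table1_id."""
    row = conn.execute(
        """
        SELECT rowid FROM table1
        WHERE rowid IN (SELECT DISTINCT table1_id FROM table3)
        ORDER BY RANDOM() LIMIT 1
        """
    ).fetchone()
    return row[0] if row else None


# Draw many times; unreferenced rowids (1, 3, 5) should never appear.
draws = {random_referenced_rowid(conn) for _ in range(50)}
```

The helper returns None when table3 is empty, since no table1 row qualifies.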
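Assuming you want to select more than just a rowid, you need to SELECT from a JOIN between the tables you're interested in. SQLite doesn't have the full set of standard JOIN functionality (RIGHT and FULL OUTER JOIN were only added in version 3.39.0), so you'll need to rework your query so it can use a LEFT OUTER JOIN. Note that this picks a random row of table3, so a table1 row that is referenced several times is proportionally more likely to be chosen.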
SELECT table1.rowid, table1.other_field
FROM table3
LEFT OUTER JOIN table1 ON table3.table1_id = table1.rowid
ORDER BY RANDOM()
LIMIT 1;