Could anyone explain how the query below works? It is meant to find the Nth highest salary.
Does it work like a bubble sort, taking one row of the outer table at a time and comparing it with the inner table? Please explain with an example.
Select distinct(salary)
from emp e
where &n = (
Select count(distinct(salary))
from emp
where e.salary<= salary);
Let's take this example: http://sqlfiddle.com/#!4/b863c9/9
Table
create table salary (salary int);
insert into salary values (100);
insert into salary values (200);
insert into salary values (300);
insert into salary values (400);
insert into salary values (400);
insert into salary values (500);
insert into salary values (600);
insert into salary values (700);
insert into salary values (800);
insert into salary values (900);
2nd highest
Select distinct(salary)
from salary e
where 2 = (
Select count(distinct(salary))
from salary
where e.salary <= salary);
What is the sub query doing?
For each salary in the outer query, the subquery executes once. select distinct(salary) from salary lists all the distinct salaries in no particular order.
For each of those salaries, the subquery keeps the rows whose salary is greater than or equal to the outer query's salary, then counts the distinct salaries among them. Let's do a run-through:
When the outer query has a salary of 100, the subquery executes something like select count(distinct(salary)) from salary where salary >= 100, which returns 9. That means there are 9 distinct salaries >= 100.
When the outer query has a salary of 200, the subquery executes select count(distinct(salary)) from salary where salary >= 200. The result is 8.
As the query moves on through the remaining salaries, it reaches a salary of 800. The subquery select count(distinct(salary)) from salary where salary >= 800 returns 2. At this point the where clause of the outer query is satisfied and the salary is returned.
As for the algorithm actually used, it depends on whether there is an index and whether the query uses an order by.
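The run-through above can be reproduced with a small script. This is only a sketch: it uses Python's built-in sqlite3 module instead of Oracle, replaces the &n substitution variable with a bound parameter, and the helper names distinct_at_or_above and nth_highest are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salary (salary INT)")
conn.executemany(
    "INSERT INTO salary VALUES (?)",
    [(s,) for s in (100, 200, 300, 400, 400, 500, 600, 700, 800, 900)],
)

def distinct_at_or_above(value):
    """What the subquery computes for one outer salary (hypothetical helper)."""
    return conn.execute(
        "SELECT COUNT(DISTINCT salary) FROM salary WHERE salary >= ?", (value,)
    ).fetchone()[0]

def nth_highest(n):
    """The query from the question, with n passed as a bound parameter."""
    rows = conn.execute(
        """
        SELECT DISTINCT salary
        FROM salary e
        WHERE ? = (SELECT COUNT(DISTINCT salary)
                   FROM salary
                   WHERE e.salary <= salary)
        """,
        (n,),
    ).fetchall()
    return [r[0] for r in rows]

for s in (100, 200, 800):
    print(s, distinct_at_or_above(s))  # 100 9, 200 8, 800 2
print(nth_highest(2))  # [800]
```

The duplicated 400 is counted once because of COUNT(DISTINCT ...), which is why 400 comes out as the 6th highest salary rather than the 6th and 7th.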
Related
I have a table containing user_id, movie_id, rating. These are all INT, and ratings range from 1-5.
I want to get the median rating and group it by user_id, but I'm having some trouble doing this.
My code at the moment is:
SELECT AVG(rating)
FROM (SELECT rating
FROM movie_data
ORDER BY rating
LIMIT 2 - (SELECT COUNT(*) FROM movie_data) % 2
OFFSET (SELECT (COUNT(*) - 1) / 2
FROM movie_data));
However, this seems to return the median value of all the ratings. How can I group this by user_id, so I can see the median rating per user?
The following gives the required median:
DROP TABLE IF EXISTS movie_data2;
CREATE TEMPORARY TABLE movie_data2 AS
SELECT user_id, rating FROM movie_data order by user_id, rating;
SELECT a.user_id, a.rating FROM (
SELECT user_id, rowid, rating
FROM movie_data2) a JOIN (
SELECT user_id, cast(((min(rowid)+max(rowid))/2) as int) as midrow FROM movie_data2 b
GROUP BY user_id
) c ON a.rowid = c.midrow
;
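The logic is straightforward, although the code could be tidied up; comments are welcome and I will improve it. In a nutshell, the trick is to use SQLite's rowid: the temporary table is filled in user_id, rating order, so the middle rowid of each user's block points at that user's middle rating. Note that for a user with an even number of ratings, the integer division picks the lower of the two middle ratings rather than their average.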
This is not easily possible in a single query, because SQLite does not allow subqueries in the LIMIT/OFFSET clauses to refer to values from the outer query.
Instead, add a WHERE clause on user_id to all three subqueries and execute the query once for each user ID.
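One way to drive that per-user execution is from application code. The sketch below assumes Python's sqlite3 module and a handful of made-up ratings; the constant MEDIAN_FOR_USER and the function median_per_user are names invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie_data (user_id INT, movie_id INT, rating INT)")
conn.executemany(
    "INSERT INTO movie_data VALUES (?, ?, ?)",
    [
        (1, 10, 1), (1, 11, 3), (1, 12, 5),  # user 1: odd number of ratings
        (2, 10, 2), (2, 11, 5),              # user 2: even number of ratings
        (3, 10, 4),                          # user 3: a single rating
    ],
)

# The query from the question, with the same user_id filter
# added to all three subqueries.
MEDIAN_FOR_USER = """
SELECT AVG(rating)
FROM (SELECT rating
      FROM movie_data
      WHERE user_id = :uid
      ORDER BY rating
      LIMIT 2 - (SELECT COUNT(*) FROM movie_data WHERE user_id = :uid) % 2
      OFFSET (SELECT (COUNT(*) - 1) / 2
              FROM movie_data
              WHERE user_id = :uid))
"""

def median_per_user():
    """Run the median query once per distinct user_id."""
    user_ids = [
        r[0]
        for r in conn.execute(
            "SELECT DISTINCT user_id FROM movie_data ORDER BY user_id"
        )
    ]
    return {
        uid: conn.execute(MEDIAN_FOR_USER, {"uid": uid}).fetchone()[0]
        for uid in user_ids
    }

medians = median_per_user()
print(medians)  # {1: 3.0, 2: 3.5, 3: 4.0}
```

This issues one query per user, which is fine for a modest number of users but may be slow for a very large table.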
-- Note: AVG() returns each user's mean rating, not the median.
SELECT user_id, AVG(rating)
FROM movie_data
GROUP BY user_id
ORDER BY rating
Let's pretend that we have a supermarket.
We have a table called Sales in which every record is one scanned article, so if we scan 3 articles we get 3 rows. Its columns are ArticleId and Amount, where Amount is always 1.
And then we have a table called Articles, which has the columns ArticleId and AvailableAmount.
When the sale is done, we need to count the matching records in the Sales table for each article and then subtract that count from the article's AvailableAmount.
I'm thinking of something like this, but I don't know if I'm on the right track:
UPDATE Articles
SET
AvailableAmount = AvailableAmount - (
Select ArticleId,Count(*) From Sales Group by ArticleId HAVING Count(*) > 1
)
WHERE
ArticleId in(Select distinct ArticleId FROM Sales)
This query is almost correct, but
the subquery must return only one column,
HAVING Count(*) > 1 does not make sense, and
the subquery must return only one value, so you need a correlated subquery:
UPDATE Articles
SET AvailableAmount = AvailableAmount -
(SELECT COUNT(*)
FROM Sales
WHERE ArticleId = Articles.ArticleId)
WHERE ArticleId IN (SELECT ArticleId
FROM Sales)
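To see the correlated subquery at work, here is a small sketch using Python's sqlite3 module. The stock levels and sales rows are invented test data, and the variable name stock is just for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Articles (ArticleId INTEGER PRIMARY KEY, AvailableAmount INTEGER);
    CREATE TABLE Sales (ArticleId INTEGER, Amount INTEGER);
    INSERT INTO Articles VALUES (1, 10), (2, 5), (3, 7);
    -- Article 1 scanned three times, article 2 once, article 3 not sold.
    INSERT INTO Sales VALUES (1, 1), (1, 1), (1, 1), (2, 1);
    """
)

# For each Articles row, the subquery counts only that article's sales.
conn.execute(
    """
    UPDATE Articles
    SET AvailableAmount = AvailableAmount -
        (SELECT COUNT(*)
         FROM Sales
         WHERE ArticleId = Articles.ArticleId)
    WHERE ArticleId IN (SELECT ArticleId
                        FROM Sales)
    """
)

stock = dict(conn.execute("SELECT ArticleId, AvailableAmount FROM Articles"))
print(stock)  # {1: 7, 2: 4, 3: 7}
```

Article 3 has no sales, so the outer WHERE skips it and its amount stays at 7.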
I have been tasked with the following:
Write a PL/SQL anonymous block that inserts 100 employee IDs,
starting at number 3000. Use a FOR loop (similar to the 100 you added starting at 2000).
Add the AVERAGE salary of the original employees table (use SELECT!)
in the salary column of the newly created employee rows.
So, I have created the new rows. These have employee ID's from 2000 - 3000.
I have to find the average salary of all the rows that have an employee ID below 2000 (the original rows in the table) and put it in the salary column of the new rows.
Can anyone give me some help with this?
Would it be something along the lines of
UPDATE emp2 -- table name
SELECT AVG( salary )
FROM emp2
WHERE employee_ID < 2000
Not too sure how to do this?
You can use the query below:
UPDATE emp2
SET salary = (SELECT AVG(salary)
FROM emp2
WHERE employee_ID < 2000
)
WHERE employee_id >= 2000;
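As a quick check of that UPDATE, here is a sketch that runs it against SQLite through Python's sqlite3 module rather than Oracle. The three original employees and their salaries are assumed sample data, and only three new rows are created instead of 100 to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp2 (employee_id INTEGER PRIMARY KEY, salary REAL)")

# Original employees (IDs below 2000) with assumed salaries.
conn.executemany(
    "INSERT INTO emp2 VALUES (?, ?)",
    [(100, 1000.0), (101, 2000.0), (102, 3000.0)],
)
# Newly created rows, inserted without a salary.
conn.executemany(
    "INSERT INTO emp2 (employee_id) VALUES (?)",
    [(i,) for i in range(2000, 2003)],
)

conn.execute(
    """
    UPDATE emp2
    SET salary = (SELECT AVG(salary)
                  FROM emp2
                  WHERE employee_id < 2000)
    WHERE employee_id >= 2000
    """
)

new_salaries = [
    r[0] for r in conn.execute("SELECT salary FROM emp2 WHERE employee_id >= 2000")
]
print(new_salaries)  # [2000.0, 2000.0, 2000.0]
```

Because the subquery only reads rows with an ID below 2000 and the UPDATE only writes rows with an ID of 2000 or more, the average is not affected by the update itself.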
Sqlite doesn't have a row number function. My database however could have several thousands of records. I need to sort a table based upon a date (the date field is actually an INTEGER) and then return a specific range of rows. So if I wanted all the rows from 600 to 800, I need to somehow create a row number and limit the results to fall within my desired range. I cannot use RowID or any auto-incremented ID field because all the data is inserted with random dates. The closest I can get is this:
CREATE TABLE Test (ID INTEGER, Name TEXT, DateRecorded INTEGER);
Insert Into Test (ID, Name, DateRecorded) Values (5,'fox', 400);
Insert Into Test (ID, Name, DateRecorded) Values (1,'rabbit', 100);
Insert Into Test (ID, Name, DateRecorded) Values (10,'ant', 800);
Insert Into Test (ID, Name, DateRecorded) Values (8,'deer', 300);
Insert Into Test (ID, Name, DateRecorded) Values (6,'bear', 200);
SELECT ID,
Name,
DateRecorded,
(SELECT COUNT(*)
FROM Test AS t2
WHERE t2.DateRecorded > t1.DateRecorded) AS RowNum
FROM Test AS t1
where RowNum > 2
ORDER BY DateRecorded Desc;
This works, but it's really ugly. The SELECT COUNT(*) subquery is executed for every row encountered, so with several thousand rows the performance will be very poor.
This is what the LIMIT/OFFSET clauses are for:
SELECT *
FROM Test
ORDER BY DateRecorded DESC
LIMIT 200 OFFSET 600
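Applied to the small Test table from the question, the same idea might look like the sketch below (Python's sqlite3 module; the helper rows_in_range is a made-up name). OFFSET skips rows and LIMIT caps how many are returned, so rows first..last need OFFSET first - 1 and LIMIT last - first + 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (ID INTEGER, Name TEXT, DateRecorded INTEGER)")
conn.executemany(
    "INSERT INTO Test (ID, Name, DateRecorded) VALUES (?, ?, ?)",
    [(5, "fox", 400), (1, "rabbit", 100), (10, "ant", 800),
     (8, "deer", 300), (6, "bear", 200)],
)

def rows_in_range(first, last):
    """Names of rows first..last (1-based, inclusive), newest date first."""
    cursor = conn.execute(
        "SELECT Name FROM Test ORDER BY DateRecorded DESC LIMIT ? OFFSET ?",
        (last - first + 1, first - 1),
    )
    return [r[0] for r in cursor]

print(rows_in_range(2, 3))  # ['fox', 'deer']
```

With LIMIT 200 OFFSET 600, the first 600 rows are skipped, so the result starts at row 601 and ends at row 800.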
What is an example query to retrieve the first, second and third largest number from a database table using SQL Server?
You can sort by your value in descending order and take the top 3.
SELECT TOP 3 YourVal FROM YourTable ORDER BY YourVal DESC
Or if you wanted each result separate,
first number :
SELECT TOP 1 YourVal FROM YourTable ORDER BY YourVal DESC
second number:
SELECT TOP 1 YourVal FROM YourTable
WHERE YourVal not in (SELECT TOP 1 YourVal FROM YourTable ORDER BY YourVal DESC)
ORDER BY YourVal DESC
third number:
SELECT TOP 1 YourVal FROM YourTable
WHERE YourVal not in (SELECT TOP 2 YourVal FROM YourTable ORDER BY YourVal DESC)
ORDER BY YourVal DESC
This assumes YourVal is unique.
EDIT: following on from the OP's comment:
To get the nth value, select the TOP 1 that isn't in the TOP (n-1), so the fifth can be chosen by:
SELECT TOP 1 YourVal FROM YourTable
WHERE YourVal not in (SELECT TOP 4 YourVal FROM YourTable ORDER BY YourVal DESC)
ORDER BY YourVal DESC
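The "TOP 1 that isn't in the TOP (n-1)" pattern can be tried out without SQL Server. The sketch below swaps TOP for SQLite's LIMIT (SQLite has no TOP) and runs through Python's sqlite3 module; the sample values and the function nth_largest are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (YourVal INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?)", [(v,) for v in (10, 50, 30, 20, 40)]
)

def nth_largest(n):
    """Largest value that is not among the n-1 largest (values assumed unique)."""
    row = conn.execute(
        """
        SELECT YourVal FROM YourTable
        WHERE YourVal NOT IN (SELECT YourVal FROM YourTable
                              ORDER BY YourVal DESC LIMIT ?)
        ORDER BY YourVal DESC
        LIMIT 1
        """,
        (n - 1,),
    ).fetchone()
    return row[0] if row else None

print(nth_largest(1), nth_largest(3), nth_largest(5))  # 50 30 10
```

When n is larger than the number of rows, every value is excluded and the function returns None.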
The proposed SELECT TOP n ... ORDER BY key will work, but be aware that you might get unexpected results if the column you're sorting on is not unique. Find more information on the topic here.
Sudhakar,
It may be easier to use ROW_NUMBER() or DENSE_RANK() for some of these questions. For example, to find YourVal and other columns from the fifth row in order of YourVal DESC:
WITH TRanked AS (
  SELECT *,
    ROW_NUMBER() OVER (
      ORDER BY YourVal DESC, yourPrimaryKey
    ) AS rk
  FROM YourTable
)
SELECT YourVal, otherColumns
FROM TRanked
WHERE rk = 5;
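If you want all rows with the fifth largest distinct YourVal value, change ROW_NUMBER() to DENSE_RANK() and order by YourVal DESC alone. With yourPrimaryKey left in the ORDER BY as a tie-breaker, no two rows would ever share a rank.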
One really big advantage to these functions is the fact that you can immediately change a "the nth highest YourVal" query to a "the nth highest YourVal for each otherColumn" query just by adding PARTITION BY otherColumn to the OVER clause.
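The difference between the two functions is easiest to see on data with a duplicate value. The sketch below runs on SQLite through Python's sqlite3 module and assumes the bundled SQLite is version 3.25 or later, which is when window functions were added; the sample values and the helpers nth_row and nth_distinct are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (yourPrimaryKey INTEGER PRIMARY KEY, YourVal INTEGER)"
)
conn.executemany(
    "INSERT INTO YourTable (YourVal) VALUES (?)",
    [(v,) for v in (100, 400, 400, 900, 800)],
)

def nth_row(n):
    """YourVal of the nth row; ties are broken by the primary key."""
    sql = """
        WITH TRanked AS (
          SELECT YourVal,
                 ROW_NUMBER() OVER (ORDER BY YourVal DESC, yourPrimaryKey) AS rk
          FROM YourTable
        )
        SELECT YourVal FROM TRanked WHERE rk = ?
    """
    return [r[0] for r in conn.execute(sql, (n,))]

def nth_distinct(n):
    """All rows holding the nth largest distinct YourVal."""
    sql = """
        WITH TRanked AS (
          SELECT YourVal,
                 DENSE_RANK() OVER (ORDER BY YourVal DESC) AS rk
          FROM YourTable
        )
        SELECT YourVal FROM TRanked WHERE rk = ?
    """
    return [r[0] for r in conn.execute(sql, (n,))]

print(nth_row(4))       # [400]  (the second of the two 400 rows)
print(nth_distinct(3))  # [400, 400]
print(nth_distinct(4))  # [100]
```

ROW_NUMBER() gives the two 400 rows the positions 3 and 4, while DENSE_RANK() gives both of them rank 3 and moves 100 up to rank 4.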
In certain DBMS packages the TOP command may not work. What can we do then? Suppose we need to find the 3rd largest salary in the employee table. First we select the distinct salaries from the table in descending order:
select distinct salary from employee order by salary desc
Now, among the salaries selected, we need the top 3 salaries; for that we write:
select salary from (select distinct salary from employee order by salary desc) where rownum<=3 order by salary
This gives the top 3 salaries in ascending order, which moves the third largest salary to the first position. Now we have the final task of printing the 3rd largest number.
select salary from (select salary from (select distinct salary from employee order by salary desc) where rownum<=3 order by salary) where rownum=1
This gives the third largest number. For any mistake in the query please let me know. Basically, to get the nth largest number we can rewrite the above query with n in place of 3:
select salary from (select salary from (select distinct salary from employee order by salary desc) where rownum <= n order by salary) where rownum = 1
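The same three steps (distinct salaries descending, keep the top n, then take the smallest of those) can be sketched outside Oracle. The version below uses SQLite's LIMIT in place of rownum and runs through Python's sqlite3 module; the sample salaries and the function nth_largest_salary are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?)", [(s,) for s in (500, 700, 700, 900, 800)]
)

def nth_largest_salary(n):
    """Smallest of the n largest distinct salaries."""
    row = conn.execute(
        """
        SELECT salary
        FROM (SELECT DISTINCT salary
              FROM employee
              ORDER BY salary DESC
              LIMIT ?)          -- step 2: keep the top n distinct salaries
        ORDER BY salary         -- step 3: ascending, so the nth largest is first
        LIMIT 1
        """,
        (n,),
    ).fetchone()
    return row[0] if row else None

print(nth_largest_salary(3))  # 700
```

One caveat of this shape: if the table has fewer than n distinct salaries, the query still returns the smallest salary instead of nothing. With the data above, nth_largest_salary(10) gives 500.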
If you have a table called Orders and 3 columns Id, ProductId and Quantity then to retrieve the top 3 highest quantities your query would look like:
SELECT TOP 3 [Id], [ProductId], [Quantity] FROM [Orders] ORDER BY [Quantity] DESC
or if you just want the quantity column:
SELECT TOP 3 [Quantity] FROM [Orders] ORDER BY [Quantity] DESC
This works perfectly!
select top 1 * from Employees where EmpId in
(
select top 3 EmpId from Employees order by EmpId desc
) order by EmpId;
To get the 2nd, 3rd or 4th highest, just change top 3 to the appropriate number.