How to count data where sum is greater than x - sqlite

I am very new to SQL and am using SQLite 3 to run basket analysis on sales data.
The relevant columns are the product ID, a unique transaction ID (which identifies the basket) and the product quantity. Where a customer has bought more than one product type, the unique transaction ID is repeated.
I want to count the number of baskets where the customer has bought exactly 1 item.
So far I have tried:
select count(distinct uniqID) from salesdata having sum(qty) = 1;
But this raised an error saying a GROUP BY clause is required before HAVING.
I then tried:
select count(distinct uniqID) from salesdata group by uniqID having sum(qty) = 1
SQLite accepted this, but returned a list of just 1s, which isn't right either!
I then tried:
select count(uniqID) from salesdata group by qty having sum(qty) = 1
SQLite also accepted this but returned nothing at all.
Any ideas would be hugely appreciated!
E

Try something like this to retrieve every basket that contains one or more items:
select uniqID, sum(qty) as total from salesdata group by uniqID having total >= 1
If you want only the baskets that contain exactly 1 item, replace >= 1 with = 1:
select uniqID, sum(qty) as total from salesdata group by uniqID having total = 1
If you want the number of baskets with exactly 1 item, wrap that query in a count:
SELECT COUNT(*) FROM (select uniqID, sum(qty) as total from salesdata group by uniqID having total = 1)
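As a rough sketch of why the grouped attempt returns a list of 1s while the wrapped query returns a single number, here is a small run against an in-memory database using Python's sqlite3 module. The sample rows and the productID column name are made up for illustration; only salesdata, uniqID and qty come from the question.

```python
import sqlite3

# Hypothetical sample data: (uniqID, productID, qty).
rows = [
    ("T1", "A", 1),  # basket T1: one item in total
    ("T2", "A", 1),  # basket T2: two product types, two items
    ("T2", "B", 1),
    ("T3", "C", 3),  # basket T3: one product type, three items
    ("T4", "B", 1),  # basket T4: one item in total
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salesdata (uniqID TEXT, productID TEXT, qty INTEGER)")
conn.executemany("INSERT INTO salesdata VALUES (?, ?, ?)", rows)

# GROUP BY uniqID produces one output row per qualifying basket, and each
# group contains exactly one distinct uniqID, hence the list of 1s.
per_basket = conn.execute(
    "SELECT count(DISTINCT uniqID) FROM salesdata "
    "GROUP BY uniqID HAVING sum(qty) = 1"
).fetchall()

# Wrapping the grouped query in a subquery counts the qualifying baskets.
single_item_baskets = conn.execute(
    "SELECT count(*) FROM ("
    "  SELECT uniqID, sum(qty) AS total FROM salesdata"
    "  GROUP BY uniqID HAVING total = 1"
    ")"
).fetchone()[0]

print(per_basket)
print(single_item_baskets)
```

With this sample data only T1 and T4 have a total quantity of 1, so the first query yields two rows and the second yields one number.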

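This selects the number of baskets that have only one item. The WHERE qty = 1 filter drops every row with a quantity higher than one before the rows are grouped, so a basket that also holds such rows is judged only on its single-quantity rows. If you don't want that, remove the WHERE qty = 1 part.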
SELECT COUNT(uniqID)
FROM (SELECT uniqID, SUM(qty) AS total
      FROM salesdata
      WHERE qty = 1
      GROUP BY uniqID
      HAVING total = 1)
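To see what the WHERE qty = 1 filter changes, the following sketch runs the query with and without it on a tiny in-memory table. The sample rows and the productID column are assumptions made for the example, not data from the question.

```python
import sqlite3

# Hypothetical sample data: (uniqID, productID, qty).
rows = [
    ("T1", "A", 1),  # one item in total
    ("T2", "A", 1),  # three items in total, but only one row has qty = 1
    ("T2", "B", 2),
    ("T3", "C", 2),  # two items on a single row
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salesdata (uniqID TEXT, productID TEXT, qty INTEGER)")
conn.executemany("INSERT INTO salesdata VALUES (?, ?, ?)", rows)

# WHERE runs before GROUP BY, so the qty = 2 row of T2 is discarded first
# and T2 is then summed as if it held a single item.
with_filter = conn.execute(
    "SELECT count(uniqID) FROM ("
    "  SELECT uniqID, sum(qty) AS total FROM salesdata"
    "  WHERE qty = 1 GROUP BY uniqID HAVING total = 1"
    ")"
).fetchone()[0]

# Without the filter every row contributes to the basket total.
without_filter = conn.execute(
    "SELECT count(uniqID) FROM ("
    "  SELECT uniqID, sum(qty) AS total FROM salesdata"
    "  GROUP BY uniqID HAVING total = 1"
    ")"
).fetchone()[0]

print(with_filter, without_filter)
```

Which of the two counts is the right one depends on whether a basket should be judged on all of its rows or only on its single-quantity rows.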

Related

SQLite Calculate sum outside group by

I'm trying to calculate the percentage each customer has spent of the total sales value.
I have calculated the total sales value per customer using sum() and group by, but once I use group by, I cannot get at the overall total alongside the individual total for each customer.
Is there any way I could get around this?
I got this far and don't know what to do next:
select c.firstname ||' '|| c.lastname as 'Ful name',
sum(total) as 'Sales value',
/*something to calculate percentage*/,
from invoice i inner join customer c on i.customerid = c.customerid
group by i.customerid order by sum(total) desc limit 5;
To calculate the simple sum over the entire table, move it into an independent subquery:
SELECT ...,
sum(total) / (SELECT sum(total) FROM invoice)
FROM ...;
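Filled in for the query from the question, the idea might look like the sketch below. The table layout and sample rows are assumptions for illustration (the real schema is not shown in the question), and the multiplication by 100.0 is added so the result is a percentage and so the division is not done in integers if total happens to be stored as INTEGER.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema and data; only the column names used in the
# question's query are taken from the original.
conn.executescript("""
CREATE TABLE customer (customerid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
CREATE TABLE invoice (invoiceid INTEGER PRIMARY KEY, customerid INTEGER, total REAL);
INSERT INTO customer VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO invoice (customerid, total) VALUES (1, 60.0), (1, 15.0), (2, 25.0);
""")

# The scalar subquery is independent of the GROUP BY, so it always sees
# the whole invoice table, not just the current customer's rows.
top_customers = conn.execute("""
SELECT c.firstname || ' ' || c.lastname AS full_name,
       sum(i.total) AS sales_value,
       100.0 * sum(i.total) / (SELECT sum(total) FROM invoice) AS pct_of_total
FROM invoice i
INNER JOIN customer c ON i.customerid = c.customerid
GROUP BY i.customerid
ORDER BY sales_value DESC
LIMIT 5
""").fetchall()

for row in top_customers:
    print(row)
```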

Count columns in a table where columns are same and then subtract from a column in another table

Let's pretend that we have a supermarket.
We have a table called Sales where every record is one article, so if we scan 3 articles we will have 3 rows with the following columns: ArticleId and Amount, where Amount is always 1.
And then we have a table called Articles which has the columns ArticleId and AvailableAmount.
When the sale is done we need to count the records in the Sales table that have the same ArticleId and then subtract that count for each article from its AvailableAmount.
I'm thinking of something like this, but I don't know if I'm on the right track:
UPDATE Articles
SET
AvailableAmount = AvailableAmount - (
Select ArticleId,Count(*) From Sales Group by ArticleId HAVING Count(*) > 1
)
WHERE
ArticleId in(Select distinct ArticleId FROM Sales)
This query is almost correct, but
the subquery must return only one column,
HAVING Count(*) > 1 does not make sense, and
the subquery must return only one value, so you need a correlated subquery:
UPDATE Articles
SET AvailableAmount = AvailableAmount -
(SELECT COUNT(*)
FROM Sales
WHERE ArticleId = Articles.ArticleId)
WHERE ArticleId IN (SELECT ArticleId
FROM Sales)
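A quick way to check the correlated update is to run it against a throwaway database. The sketch below uses SQLite through Python's sqlite3 module (the question does not say which database is used), with made-up stock levels and sales rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: three articles in stock, four scanned sale rows.
conn.executescript("""
CREATE TABLE Articles (ArticleId INTEGER PRIMARY KEY, AvailableAmount INTEGER);
CREATE TABLE Sales (ArticleId INTEGER, Amount INTEGER);
INSERT INTO Articles VALUES (1, 10), (2, 5), (3, 7);
INSERT INTO Sales VALUES (1, 1), (1, 1), (1, 1), (2, 1);
""")

# For each Articles row the inner COUNT(*) is evaluated with that row's
# ArticleId, so it returns exactly one value per article.
conn.execute("""
UPDATE Articles
SET AvailableAmount = AvailableAmount -
    (SELECT COUNT(*)
     FROM Sales
     WHERE ArticleId = Articles.ArticleId)
WHERE ArticleId IN (SELECT ArticleId
                    FROM Sales)
""")

stock = conn.execute(
    "SELECT ArticleId, AvailableAmount FROM Articles ORDER BY ArticleId"
).fetchall()
print(stock)
```

Article 3 has no sales, so the outer WHERE leaves its row untouched.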

How to do a 'count' for several criteria at once

I am very new to SQL and am using SQLite 3 to run basket analysis on sales data.
The relevant columns are the product ID, a unique transaction ID (which identifies the basket) and the product quantity. Where a customer has bought more than one product type, the unique transaction ID is repeated.
I want to count the number of baskets where the customer has bought 1, 2, 3, 4, 5 and more than 5 items in order to analyse what percentage of customers only bought 1 item.
The code I am using is:
select count (*) as One from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 0) where total = 1;
select count (*) as Two from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 1) where total = 2;
select count (*) as Three from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 1) where total = 3;
select count (*) as Four from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 1) where total = 4;
select count (*) as Five from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 1) where total = 5;
select count (*) as Six from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 1) where total = 6;
select count (*) as Sevenplus from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 1) where total > 6;
This code does work but firstly, as you can see, it is rather unwieldy looking and secondly, the data comes out in the following format when I open it in Excel:
One
1353697
Two
483618
Three
166148
Four
76236
Five
35079
Six
18904
Sevenplus
27896
Ideally I would like the number of items along the top, with the number of baskets meeting that criteria underneath. Whilst I can obviously sort the problem out manually at the moment, I need to run similar analysis on a much bigger scale soon!
Any suggestions on how to write the code so that it structures it the way I want would be greatly appreciated!
This is what you are looking for:
select
  case
    when total = 1 then 'One'
    when total = 2 then 'Two'
    when total = 3 then 'Three'
    when total = 4 then 'Four'
    when total = 5 then 'Five'
    when total = 6 then 'Six'
    when total = 7 then 'SevenPlus'
  end,
  count(total)
from
  (select case when sum(qty) <= 6 then sum(qty) else 7 end as total from otcdata3 group by uniqID) as totals
group by total
order by total
This returns two columns: the first is a text label for the number of items in the basket, and the second is the number of baskets with that many items.
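If the counts are wanted side by side in a single row, with the item counts as column headers, one possible variant is conditional aggregation: in SQLite a comparison evaluates to 1 or 0, so summing it counts the matching baskets. This is a sketch with made-up sample rows and an assumed productID column, run through Python's sqlite3 module; it is a different technique from the CASE query above.

```python
import sqlite3

# Hypothetical sample data: (uniqID, productID, qty).
rows = [
    ("B1", "A", 1),                  # total 1
    ("B2", "B", 1),                  # total 1
    ("B3", "A", 1), ("B3", "C", 1),  # total 2
    ("B4", "D", 3),                  # total 3
    ("B5", "A", 5), ("B5", "E", 3),  # total 8
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE otcdata3 (uniqID TEXT, productID TEXT, qty INTEGER)")
conn.executemany("INSERT INTO otcdata3 VALUES (?, ?, ?)", rows)

# The inner query computes one total per basket; each sum(...) in the
# outer query then counts the baskets whose total meets that condition.
cur = conn.execute("""
SELECT sum(total = 1) AS One,
       sum(total = 2) AS Two,
       sum(total = 3) AS Three,
       sum(total = 4) AS Four,
       sum(total = 5) AS Five,
       sum(total = 6) AS Six,
       sum(total > 6) AS Sevenplus
FROM (SELECT uniqID, sum(qty) AS total FROM otcdata3 GROUP BY uniqID)
""")
header = [col[0] for col in cur.description]
counts = cur.fetchone()

print(header)
print(counts)
```

The result is one header row and one data row, which should open in Excel with the item counts along the top.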

How to select data where sum is greater than x

I am very new to SQL and am using SQLite 3 to run basket analysis on sales data.
The relevant columns are the product ID, a unique transaction ID (which identifies the basket) and the product quantity. Where a customer has bought more than one product type, the unique transaction ID is repeated.
I want to select only baskets where the customer has bought more than 1 item.
Is there any way on SQLite to select the unique transaction ID and the sum of the quantity, but only for unique transaction IDs where the quantity is more than one?
So far I have tried:
select uniqID, sum(qty) from salesdata where sum(qty) > 1 group by uniqID;
But SQLite gives me the error 'misuse of aggregate: sum()'
Sorry if this is a simple question but I am struggling to find any relevant information by googling!
Try
select uniqID, sum(qty) from salesdata group by uniqID having sum(qty) > 1
WHERE cannot be applied to aggregate functions, because it filters rows before they are grouped; in this case you could only use WHERE on plain columns such as uniqID.
If you want to put a condition on the result of a GROUP BY, you must use HAVING:
select uniqID, sum(qty) as sumqty from salesdata group by uniqID having sumqty > 1
You can write any condition in HAVING just as you would in WHERE, for example:
having sumqty = 1, having sumqty < 1, having sumqty IN (1,2,3), etc.
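The difference between the two clauses can be reproduced in a few lines with Python's sqlite3 module. The sample rows and the productID column are invented for this sketch; the exact wording of the error message may vary between SQLite versions.

```python
import sqlite3

# Hypothetical sample data: (uniqID, productID, qty).
rows = [
    ("T1", "A", 1),
    ("T2", "A", 1),
    ("T2", "B", 1),
    ("T3", "C", 3),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salesdata (uniqID TEXT, productID TEXT, qty INTEGER)")
conn.executemany("INSERT INTO salesdata VALUES (?, ?, ?)", rows)

# An aggregate in WHERE is rejected when the statement is prepared.
try:
    conn.execute(
        "SELECT uniqID, sum(qty) FROM salesdata "
        "WHERE sum(qty) > 1 GROUP BY uniqID"
    )
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

# HAVING is evaluated after grouping, so the aggregate is available there.
big_baskets = conn.execute(
    "SELECT uniqID, sum(qty) AS sumqty FROM salesdata "
    "GROUP BY uniqID HAVING sumqty > 1 ORDER BY uniqID"
).fetchall()

print(where_error)
print(big_baskets)
```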

SQL query for finding the first, second and third highest numbers

What is an example query to retrieve the first, second and third largest number from a database table using SQL Server?
You can sort by your value in descending order and take the top 3.
SELECT TOP 3 YourVal FROM YourTable ORDER BY YourVal DESC
Or if you wanted each result separate,
first number :
SELECT TOP 1 YourVal FROM YourTable ORDER BY YourVal DESC
second number:
SELECT TOP 1 YourVal FROM YourTable
WHERE YourVal not in (SELECT TOP 1 YourVal FROM YourTable ORDER BY YourVal DESC)
ORDER BY YourVal DESC
third number:
SELECT TOP 1 YourVal FROM YourTable
WHERE YourVal not in (SELECT TOP 2 YourVal FROM YourTable ORDER BY YourVal DESC)
ORDER BY YourVal DESC
assuming YourVal is unique
EDIT : following on from OPs comment
to get the nth value, select the TOP 1 that isn't in the TOP (n-1), so fifth can be chosen by:
SELECT TOP 1 YourVal FROM YourTable
WHERE YourVal not in (SELECT TOP 4 YourVal FROM YourTable ORDER BY YourVal DESC)
ORDER BY YourVal DESC
The proposed SELECT TOP n ... ORDER BY key will work, but be aware that you might get unexpected results if the column you're sorting on is not unique.
Sudhakar,
It may be easier to use ROW_NUMBER() or DENSE_RANK() for some of these questions. For example, to find YourVal and other columns from the fifth row in order of YourVal DESC:
WITH TRanked AS (
SELECT *,
ROW_NUMBER() OVER (
ORDER BY YourVal DESC, yourPrimaryKey
) AS rk
FROM YourTable
)
SELECT YourVal, otherColumns
FROM TRanked
WHERE rk = 5;
If you want all rows with the fifth largest distinct YourVal value, just change ROW_NUMBER() to DENSE_RANK().
One really big advantage to these functions is the fact that you can immediately change a "the nth highest YourVal" query to a "the nth highest YourVal for each otherColumn" query just by adding PARTITION BY otherColumn to the OVER clause.
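As an illustration of the PARTITION BY variant, here is a small sketch that finds the rows holding the second highest distinct YourVal within each otherColumn group. It runs on SQLite through Python's sqlite3 module rather than on SQL Server, which is an assumption made so the example is self-contained; window functions need SQLite 3.25 or later. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data using the placeholder names from the answer.
conn.executescript("""
CREATE TABLE YourTable (yourPrimaryKey INTEGER PRIMARY KEY, otherColumn TEXT, YourVal INTEGER);
INSERT INTO YourTable VALUES
    (1, 'a', 10), (2, 'a', 30), (3, 'a', 30), (4, 'a', 20),
    (5, 'b', 5),  (6, 'b', 7);
""")

# DENSE_RANK gives tied values the same rank without leaving gaps, and
# PARTITION BY restarts the ranking for every otherColumn value.
second_highest = conn.execute("""
WITH TRanked AS (
    SELECT *,
           DENSE_RANK() OVER (
               PARTITION BY otherColumn
               ORDER BY YourVal DESC
           ) AS rk
    FROM YourTable
)
SELECT otherColumn, yourPrimaryKey, YourVal
FROM TRanked
WHERE rk = 2
ORDER BY otherColumn, yourPrimaryKey
""").fetchall()

print(second_highest)
```

In group 'a' the two rows with 30 share rank 1, so the row with 20 is rank 2; with ROW_NUMBER() one of the 30s would have been returned instead.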
In some DBMS packages the TOP keyword is not available. The queries below use the rownum pseudocolumn (Oracle syntax) instead. Suppose we need to find the 3rd largest salary in the employee table. First we select the distinct salaries in descending order:
select distinct salary from employee order by salary desc
From these we need the top 3 salaries, so we write:
select salary from (select distinct salary from employee order by salary desc) where rownum<=3 order by salary
This gives the top 3 salaries in ascending order, which puts the third largest salary in first position. The final step is to pick out that first row:
select salary from (select salary from (select distinct salary from employee order by salary desc) where rownum<=3 order by salary) where rownum=1
This gives the third largest number. Please let me know if there is any mistake in the query. In general, to get the nth largest number, replace 3 with n:
select salary from (select salary from (select distinct salary from employee order by salary desc) where rownum<=n order by salary) where rownum=1
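SQLite has neither TOP nor rownum, so for the database used elsewhere in this thread the same idea could be sketched with LIMIT and OFFSET instead. This is a swapped-in technique, not the query from the answer above; the nth_largest helper and the sample salaries are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (salary INTEGER)")
# Hypothetical sample salaries, including a duplicate value.
conn.executemany(
    "INSERT INTO employee VALUES (?)",
    [(100,), (300,), (300,), (200,), (500,)],
)

def nth_largest(connection, n):
    """Return the nth largest distinct salary, or None if there are fewer than n."""
    row = connection.execute(
        "SELECT DISTINCT salary FROM employee "
        "ORDER BY salary DESC LIMIT 1 OFFSET ?",
        (n - 1,),  # skip the n - 1 larger distinct values
    ).fetchone()
    return row[0] if row else None

# LIMIT 3 plays the role of TOP 3 / rownum <= 3.
top_three = [
    r[0]
    for r in conn.execute(
        "SELECT DISTINCT salary FROM employee ORDER BY salary DESC LIMIT 3"
    )
]

print(top_three)
print(nth_largest(conn, 3))
```

Because of DISTINCT, the duplicate 300 is counted once, so the third largest salary here is 200.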
If you have a table called Orders and 3 columns Id, ProductId and Quantity then to retrieve the top 3 highest quantities your query would look like:
SELECT TOP 3 [Id], [ProductId], [Quantity] FROM [Orders] ORDER BY [Quantity] DESC
or if you just want the quantity column:
SELECT TOP 3 [Quantity] FROM [Orders] ORDER BY [Quantity] DESC
This works perfectly!
select top 1 * from Employees where EmpId in
(
select top 3 EmpId from Employees order by EmpId desc
) order by EmpId;
If you would like to get the 2nd, 3rd or 4th highest, just change top 3 to the appropriate number.
