How to select data where sum is greater than x - sqlite

I am very new to SQL and am using SQLite 3 to run basket analysis on sales data.
The relevant columns are the product ID, a unique transaction ID (which identifies the basket) and the product quantity. Where a customer has bought more than one product type, the unique transaction ID is repeated.
I want to select only baskets where the customer has bought more than 1 item.
Is there any way on SQLite to select the unique transaction ID and the sum of the quantity, but only for unique transaction IDs where the quantity is more than one?
So far I have tried:
select uniqID, sum(qty) from salesdata where sum(qty) > 1 group by uniqID;
But SQLite gives me the error 'misuse of aggregate: sum()'
Sorry if this is a simple question but I am struggling to find any relevant information by googling!

Try
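select uniqID, sum(qty) from salesdata group by uniqID having sum(qty) > 1
WHERE filters individual rows before they are grouped, so it cannot test an aggregate such as sum(qty); in this query it could only test a plain column such as uniqID. Conditions on aggregates go in HAVING.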

If you want to put a condition on the result of a GROUP BY, you must use HAVING.
select uniqID, sum(qty) as sumqty from salesdata group by uniqID having sumqty > 1
HAVING accepts the same kinds of conditions as WHERE, for example:
having sumqty = 1, having sumqty < 1, having sumqty IN (1,2,3), etc.
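To see the difference between the two clauses, here is a minimal sketch using Python's built-in sqlite3 module. The table layout (uniqID, productID, qty) and the sample rows are assumptions made up for illustration; only the two queries come from this thread.

```python
import sqlite3

# In-memory database with a hypothetical salesdata table (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salesdata (uniqID INTEGER, productID INTEGER, qty INTEGER)")
conn.executemany(
    "INSERT INTO salesdata VALUES (?, ?, ?)",
    [(1, 10, 1), (2, 10, 1), (2, 11, 2), (3, 12, 3)],
)

# WHERE is evaluated per row, before grouping, so an aggregate there is rejected.
try:
    conn.execute(
        "SELECT uniqID, SUM(qty) FROM salesdata WHERE SUM(qty) > 1 GROUP BY uniqID"
    )
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

# HAVING is evaluated per group, after grouping, so it can test the basket total.
baskets = conn.execute(
    "SELECT uniqID, SUM(qty) AS sumqty FROM salesdata "
    "GROUP BY uniqID HAVING sumqty > 1 ORDER BY uniqID"
).fetchall()

print(where_error)  # a "misuse of aggregate" message; exact wording varies by SQLite version
print(baskets)      # [(2, 3), (3, 3)]
```

Basket 1 has a total of 1 and is dropped by HAVING; baskets 2 and 3 each total 3 and are kept.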

Related

How to count number of distinct ID's where status does not match

I have a dataset that looks similar to the below:
I would like to count the total number of distinct ID's that have both a 'SEND' and 'REC'. In other words, where the status does not match (values are limited to SEND and REC for Status field). In this case, the desired query would return a value of 2 since there are 2 distinct ID's that have both a 'SEND' and 'REC' in the dataset.
I tried the following query, but it did not work: each row has only one status, and this query looks for both statuses within a single row.
SELECT COUNT(DISTINCT ID) FROM Table WHERE Date BETWEEN '2022-01-19' AND '2022-01-19' AND Status = 'SEND' AND Status = 'REC' ;
If it's only two distinct values the most efficient way is probably:
select count(*)
from
(
SELECT ID
FROM Table
WHERE Date BETWEEN '2022-01-19' AND '2022-01-19'
AND Status IN ('SEND','REC') -- only two possible values per id
GROUP BY ID
HAVING MIN(Status) <> Max(Status) -- both values exist
) as dt ;
SELECT COUNT(DISTINCT ID) FROM table WHERE ID IN (SELECT ID FROM table WHERE status = 'SEND' INTERSECT SELECT ID FROM table WHERE status = 'REC')
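Both approaches can be checked side by side with a small sketch using Python's sqlite3 module. The original dataset is not shown, so the table name messages and the sample rows below are assumptions chosen so that exactly two IDs have both statuses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "messages" is a hypothetical stand-in for the question's table.
conn.execute("CREATE TABLE messages (ID INTEGER, Date TEXT, Status TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, '2022-01-19', ?)",
    [(1, "SEND"), (1, "REC"), (2, "SEND"), (3, "REC"), (3, "SEND"), (4, "REC"), (4, "REC")],
)

# Approach 1: per ID, MIN and MAX differ only if both statuses are present.
min_max_count = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT ID FROM messages
        WHERE Date BETWEEN '2022-01-19' AND '2022-01-19'
          AND Status IN ('SEND', 'REC')
        GROUP BY ID
        HAVING MIN(Status) <> MAX(Status)
    ) AS dt
""").fetchone()[0]

# Approach 2: intersect the IDs that have a SEND with the IDs that have a REC.
intersect_count = conn.execute("""
    SELECT COUNT(DISTINCT ID) FROM messages
    WHERE ID IN (
        SELECT ID FROM messages WHERE Status = 'SEND'
        INTERSECT
        SELECT ID FROM messages WHERE Status = 'REC'
    )
""").fetchone()[0]

print(min_max_count, intersect_count)  # 2 2
```

IDs 1 and 3 have both statuses; ID 2 has only SEND and ID 4 has REC twice, so neither is counted.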

Getting median of column values in each group

I have a table containing user_id, movie_id, rating. These are all INT, and ratings range from 1-5.
I want to get the median rating and group it by user_id, but I'm having some trouble doing this.
My code at the moment is:
SELECT AVG(rating)
FROM (SELECT rating
FROM movie_data
ORDER BY rating
LIMIT 2 - (SELECT COUNT(*) FROM movie_data) % 2
OFFSET (SELECT (COUNT(*) - 1) / 2
FROM movie_data));
However, this seems to return the median value of all the ratings. How can I group this by user_id, so I can see the median rating per user?
The following gives the required median:
DROP TABLE IF EXISTS movie_data2;
CREATE TEMPORARY TABLE movie_data2 AS
SELECT user_id, rating FROM movie_data order by user_id, rating;
SELECT a.user_id, a.rating FROM (
SELECT user_id, rowid, rating
FROM movie_data2) a JOIN (
SELECT user_id, cast(((min(rowid)+max(rowid))/2) as int) as midrow FROM movie_data2 b
GROUP BY user_id
) c ON a.rowid = c.midrow
;
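The logic is straightforward but the code is not beautified. Given encouragement or comments I will improve it. In a nutshell, the trick is to use SQLite's rowid: the temporary table is filled in (user_id, rating) order, so each user's rows get consecutive rowids and the row halfway between the user's lowest and highest rowid holds the median. For a user with an even number of ratings this picks the lower of the two middle values.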
This is not easily possible because SQLite does not allow correlated subqueries to refer to outer values in the LIMIT/OFFSET clauses.
Add WHERE clauses for the user_id to all three subqueries, and execute them for each user ID.
SELECT user_id,AVG(rating)
FROM movie_data
GROUP BY user_id
ORDER BY rating
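AVG(rating) grouped by user_id gives each user's mean rather than the median. As an alternative that none of the answers above use, here is a sketch based on window functions, which assumes SQLite 3.25 or later. It numbers each user's ratings in order and averages the one or two middle rows. The sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie_data (user_id INTEGER, movie_id INTEGER, rating INTEGER)")
conn.executemany(
    "INSERT INTO movie_data VALUES (?, ?, ?)",
    [
        (1, 100, 5), (1, 101, 1), (1, 102, 3),               # odd count  -> middle value
        (2, 100, 4), (2, 101, 2), (2, 102, 5), (2, 103, 4),  # even count -> mean of two middles
        (3, 100, 1), (3, 101, 2),
    ],
)

# rn is the position of a rating within its user's sorted ratings, cnt the user's total.
# With integer division, (cnt + 1) / 2 and (cnt + 2) / 2 are the same row for odd cnt
# and the two middle rows for even cnt.
medians = conn.execute("""
    SELECT user_id, AVG(rating) AS median_rating
    FROM (
        SELECT user_id, rating,
               ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY rating) AS rn,
               COUNT(*)     OVER (PARTITION BY user_id)                 AS cnt
        FROM movie_data
    )
    WHERE rn IN ((cnt + 1) / 2, (cnt + 2) / 2)
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()

print(medians)  # [(1, 3.0), (2, 4.0), (3, 1.5)]
```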

Count columns in a table where columns are same and then subtract from a column in another table

Let's pretend that we have a supermarket.
We have a table called Sales where every record is one article, so if we scan 3 articles we will have 3 rows, with the columns ArticleId and Amount, where Amount is always 1.
And then we have a table called Articles, which has the columns ArticleId and AvailableAmount.
When the sale is done we need to count the records in the Sales table that belong to the same article and then reduce AvailableAmount by that count for each article.
I'm thinking of something like this, but I don't know if I'm thinking right:
UPDATE Articles
SET
AvailableAmount = AvailableAmount - (
Select ArticleId,Count(*) From Sales Group by ArticleId HAVING Count(*) > 1
)
WHERE
ArticleId in(Select distinct ArticleId FROM Sales)
This query is almost correct, but
the subquery must return only one column,
HAVING Count(*) > 1 does not make sense, and
the subquery must return only one value, so you need a correlated subquery:
UPDATE Articles
SET AvailableAmount = AvailableAmount -
(SELECT COUNT(*)
FROM Sales
WHERE ArticleId = Articles.ArticleId)
WHERE ArticleId IN (SELECT ArticleId
FROM Sales)
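The corrected statement can be tried out with a short sketch using Python's sqlite3 module. The table definitions and the stock figures are assumptions for illustration; the UPDATE itself is the one given above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Articles (ArticleId INTEGER PRIMARY KEY, AvailableAmount INTEGER);
    CREATE TABLE Sales (ArticleId INTEGER, Amount INTEGER);
    INSERT INTO Articles VALUES (1, 10), (2, 5), (3, 7);
    -- three scans of article 1, one scan of article 2, none of article 3
    INSERT INTO Sales VALUES (1, 1), (1, 1), (1, 1), (2, 1);
""")

# The subquery is re-evaluated for every Articles row being updated:
# Articles.ArticleId refers to the outer row, the bare ArticleId to Sales.
conn.execute("""
    UPDATE Articles
    SET AvailableAmount = AvailableAmount -
        (SELECT COUNT(*) FROM Sales WHERE ArticleId = Articles.ArticleId)
    WHERE ArticleId IN (SELECT ArticleId FROM Sales)
""")

stock = conn.execute(
    "SELECT ArticleId, AvailableAmount FROM Articles ORDER BY ArticleId"
).fetchall()
print(stock)  # [(1, 7), (2, 4), (3, 7)]
```

Article 3 never appears in Sales, so the outer WHERE skips it and its stock stays at 7.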

How to do a 'count' for several criteria at once

I am very new to SQL and am using SQLite 3 to run basket analysis on sales data.
The relevant columns are the product ID, a unique transaction ID (which identifies the basket) and the product quantity. Where a customer has bought more than one product type, the unique transaction ID is repeated.
I want to count the number of baskets where the customer has bought 1, 2, 3, 4, 5 and more than 5 items in order to analyse what percentage of customers only bought 1 item.
The code I am using is:
select count (*) as One from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 0) where total = 1;
select count (*) as Two from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 1) where total = 2;
select count (*) as Three from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 1) where total = 3;
select count (*) as Four from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 1) where total = 4;
select count (*) as Five from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 1) where total = 5;
select count (*) as Six from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 1) where total = 6;
select count (*) as Sevenplus from (select uniqID, sum(qty) as total from otcdata3 group by uniqID having total > 1) where total > 6;
This code does work but firstly, as you can see, it is rather unwieldy looking and secondly, the data comes out in the following format when I open it in Excel:
One
1353697
Two
483618
Three
166148
Four
76236
Five
35079
Six
18904
Sevenplus
27896
Ideally I would like the number of items along the top, with the number of baskets meeting that criterion underneath. Whilst I can obviously sort the problem out manually at the moment, I need to run similar analysis on a much bigger scale soon!
Any suggestions on how to write the code so that it structures it the way I want would be greatly appreciated!
This is what you are looking for:
select
case
when total=1 then 'One'
when total=2 then 'Two'
when total=3 then 'Three'
when total=4 then 'Four'
when total=5 then 'Five'
when total=6 then 'Six'
when total=7 then 'SevenPlus'
end,
count(total)
from
(select case when sum(qty) <= 6 then sum(qty) else 7 end as total from otcdata3 group by uniqID) as totals
group by total
order by total
This returns two columns, the first column is a text representing the number of items in the transaction, and the second column is the number of distinct purchases with that number of items.
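If the counts should sit side by side under their labels, as the question asks, one option is conditional aggregation. The sketch below uses Python's sqlite3 module and relies on SQLite evaluating a comparison to 0 or 1, so SUM(total = 1) counts the matching baskets. It totals each basket with SUM(qty) as the question's own queries do; the two-column table and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE otcdata3 (uniqID INTEGER, qty INTEGER)")
conn.executemany(
    "INSERT INTO otcdata3 VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 1), (3, 1), (4, 3), (5, 2), (5, 2), (5, 3), (6, 6)],
)

# Inner query: one row per basket with its total quantity.
# Outer query: one output row, one column per bucket.
cur = conn.execute("""
    SELECT
        SUM(total = 1) AS One,
        SUM(total = 2) AS Two,
        SUM(total = 3) AS Three,
        SUM(total = 4) AS Four,
        SUM(total = 5) AS Five,
        SUM(total = 6) AS Six,
        SUM(total > 6) AS Sevenplus
    FROM (SELECT uniqID, SUM(qty) AS total FROM otcdata3 GROUP BY uniqID)
""")
header = [col[0] for col in cur.description]
counts = cur.fetchone()

print(header)  # ['One', 'Two', 'Three', 'Four', 'Five', 'Six', 'Sevenplus']
print(counts)  # (2, 1, 1, 0, 0, 1, 1)
```

Because the result is a single row, it opens in a spreadsheet with the labels along the top and the counts underneath.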

How to count data where sum is greater than x

I am very new to SQL and am using SQLite 3 to run basket analysis on sales data.
The relevant columns are the product ID, a unique transaction ID (which identifies the basket) and the product quantity. Where a customer has bought more than one product type, the unique transaction ID is repeated.
I want to count the number of baskets where the customer has bought 1 item.
So far I have tried select count(distinct uniqID) from salesdata having sum(qty) = 1;
But this brought up an error saying a GROUP BY clause is required before HAVING.
I then tried select count(distinct uniqID) from salesdata group by uniqID having sum(qty) = 1
SQLite accepted this, but returned me a list of just 1s, which isn't right either!
I then tried select count(uniqID) from salesdata group by qty having sum(qty) = 1
SQLite also accepted this but returned nothing at all.
Any ideas would be hugely appreciated!
Try something like this to retrieve every basket that contains one or more items:
select uniqID, sum(qty) as total from salesdata group by uniqID having total >= 1
If you want only the baskets that contain exactly 1 item, replace >= 1 with = 1, like:
select uniqID, sum(qty) as total from salesdata group by uniqID having total = 1
If you want the number of baskets with exactly 1 item, wrap that query in a count:
SELECT COUNT(*) FROM (select uniqID, sum(qty) as total from salesdata group by uniqID having total = 1)
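This selects the number of baskets that have only one item. Note that WHERE qty = 1 discards every row with a quantity higher than one before the sum is taken, so a basket made up only of such rows is dropped, but a basket with one single-quantity row plus larger rows is still counted. If you don't want that, remove the WHERE qty = 1 part.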
SELECT
COUNT(uniqID) FROM
(SELECT
uniqID, SUM(qty) AS total
FROM
salesdata
WHERE
qty = 1
GROUP BY
uniqID
HAVING
total = 1)
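To show why the question's GROUP BY attempt returned a list of 1s while the wrapped query returns a single number, here is a small sketch using Python's sqlite3 module. The two-column table and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salesdata (uniqID INTEGER, qty INTEGER)")
conn.executemany(
    "INSERT INTO salesdata VALUES (?, ?)",
    [(1, 1), (2, 1), (2, 2), (3, 1), (4, 2)],
)

# With GROUP BY uniqID the COUNT is computed inside each group, and each group
# holds exactly one distinct uniqID, so every surviving group reports 1.
per_group = conn.execute(
    "SELECT COUNT(DISTINCT uniqID) FROM salesdata "
    "GROUP BY uniqID HAVING SUM(qty) = 1"
).fetchall()

# Counting the rows of the grouped result gives the number of baskets instead.
single_item_baskets = conn.execute(
    "SELECT COUNT(*) FROM ("
    "  SELECT uniqID, SUM(qty) AS total FROM salesdata"
    "  GROUP BY uniqID HAVING total = 1"
    ")"
).fetchone()[0]

print(per_group)            # [(1,), (1,)]
print(single_item_baskets)  # 2
```

Baskets 1 and 3 each total 1 item, so the grouped query yields two rows and the outer count is 2.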
