Select single row per unique field value with SQL Developer - oracle11g

I have thousands of rows of data, a segment of which looks like:
+-------------+-----------+-------+
| Customer ID | Company | Sales |
+-------------+-----------+-------+
| 45678293 | Sears | 45 |
| 01928573 | Walmart | 6 |
| 29385068 | Fortinoes | 2 |
| 49582015 | Walmart | 1 |
| 49582015 | Joe's | 1 |
| 19285740 | Target | 56 |
| 39506783 | Target | 4 |
| 39506783 | H&M | 4 |
+-------------+-----------+-------+
In every case that a customer ID occurs more than once, the value in 'Sales' is also the same but the value in 'Company' is different (this is true throughout the entire table). I need for each value in 'Customer ID to only appear once, so I need a single row for each customer ID.
In other words, I'd like for the above table to look like:
+-------------+-----------+-------+
| Customer ID | Company | Sales |
+-------------+-----------+-------+
| 45678293 | Sears | 45 |
| 01928573 | Walmart | 6 |
| 29385068 | Fortinoes | 2 |
| 49582015 | Walmart | 1 |
| 19285740 | Target | 56 |
| 39506783 | Target | 4 |
+-------------+-----------+-------+
If anyone knows how I can go about doing this, I'd much appreciate some help.
Thanks!

Well it would have been helpful, if you have put your sql generate that data.
but it might go something like;
SELECT customer_id, Max(Company) as company, Count(sales.*) From Customers <your joins and where clause> GROUP BY customer_id
Assumes; there are many company and picks out the most number of occurance and the sales data to be in a different table.
Hope this helps.

Related

how to query/add filter attribute in dynamodb query?

I am creating a composite key with artist and song in dynamo db. artist as a primary key and song as an sort key. i'm aware that query might be slow and expensive given how dynamo db works and how filter is applied, is it possible to apply filter to multiple attribute and how does it look like , for example - query all the records by an artist (joe) in a particular genre (country) with rating higher than 7
id | artist | song | albumTitle | Price | Genre | rating
1 | TI | hello | south | 90 | Rap | 7.0
2 | joe | good | free | 87 | pop | 8
3 | joe | bye | one | 99 | country| 7
4 | joe | beat | one | 99 | country| 5
5 | joe | sun | one | 99 | country| 9

How to get most recent data from DynamoDB for each primary partition key in PartiQL

inspired from this How to get most recent data from DynamoDB for each primary partition key?
I have a table in dynamodb. It stores account stats. It's possible that the account stats will be updated several times per day. So table records may look like:
+------------+--------------+-------+-------+
| account_id | record_id | views | stars |
+------------+--------------+-------+-------+
| 3 | 2019/03/16/1 | 29 | 3 |
+------------+--------------+-------+-------+
| 2 | 2019/03/16/2 | 130 | 21 |
+------------+--------------+-------+-------+
| 1 | 2019/03/16/3 | 12 | 2 |
+------------+--------------+-------+-------+
| 2 | 2019/03/16/1 | 57 | 12 |
+------------+--------------+-------+-------+
| 1 | 2019/03/16/2 | 8 | 2 |
+------------+--------------+-------+-------+
| 1 | 2019/03/16/1 | 3 | 0 |
+------------+--------------+-------+-------+
account_id is a primary partition key. record_id is a primary sort key
How I can get only latest records for each of the account_ids? So from the example above I expect to get:
+------------+--------------+-------+-------+
| account_id | record_id | views | stars |
+------------+--------------+-------+-------+
| 3 | 2019/03/16/1 | 29 | 3 |
+------------+--------------+-------+-------+
| 2 | 2019/03/16/2 | 130 | 21 |
+------------+--------------+-------+-------+
| 1 | 2019/03/16/3 | 12 | 2 |
+------------+--------------+-------+-------+
This data is convenient to use for a reporting purposes.
Execute the following PartiQL query for each account_id:
SELECT * FROM <Table> WHERE account_id='3' AND record_id > '2021/11' ORDER BY record_id DESC
PartiQL has no LIMIT keyword, so will return all matching records.
You can reduce overfetching by constraining the record_id date to the extent possible. If only the current date is of interest, for example, the sort key expression would be record_id > 2021/12/01.
As in the referenced example, you must execute one query for each account_id of interest. Batching operations are supported.

Filter multiple occurrences based on group [duplicate]

This question already has answers here:
dplyr - filter by group size
(7 answers)
Keep only groups of data with multiple observations
(2 answers)
Closed 3 years ago.
I have a dataset like mentioned below:
df=data.frame(Supplier_id=c("1","2","7","7","7","4","5","8","12","7"), Supplier=c("Tian","Yan","Goldy","Goldy","Goldy","Amy","Lauren","Cassy","Shaan","Goldy"),Date=c("1/17/2019","4/30/2019","11/29/2018","11/29/2018","11/29/2018","5/21/2018","5/23/2018","5/24/2018","6/15/2018","6/20/2018"),Buyer=c("Unclassified","Unclassified","Kelly","Kelly","Kelly","Kelly","Amanda","Echo","Shao","Shao"))
df$Supplier_id=as.numeric(as.character(df$Supplier_id))
Thus, df appears like below:
| Supplier_id | Supplier | Date | Buyer |
|-------------|----------|------------|--------------|
| 1 | Tian | 1/17/2019 | Unclassified |
| 2 | Yan | 4/30/2019 | Unclassified |
| 7 | Goldy | 11/29/2018 | Kelly |
| 7 | Goldy | 11/29/2018 | Kelly |
| 7 | Goldy | 11/29/2018 | Kelly |
| 4 | Amy | 5/21/2018 | Kelly |
| 5 | Lauren | 5/23/2018 | Amanda |
| 8 | Cassy | 5/24/2018 | Echo |
| 12 | Shaan | 6/15/2018 | Shao |
| 7 | Goldy | 6/20/2018 | Shao |
Now, I want to filter out the Supplier_id's that occur only once for each unique Buyer. For example, in the above dataset, Supplier_id '1' and '2' belong to 'unclassified' buyer, but because they have different ids, I do not want them in my final output. However, when we look at the buyer 'Kelly', it has two supplier_ids, '7' and '4', where, '7' is occurring 3 times and '4' only once. So, the output table should have the record with supplier_id='7'. The grouping should be based on 'Buyer'. So it is important to note that since the supplier_id '7' exists for both 'Kelly' and 'Shao', but it should be grouped differently for both these buyers and not considered together.
The expected output should be:
| Supplier_id | Supplier | Date | Buyer_id |
|-------------|:--------:|-----------:|----------|
| 7 | Goldy | 11/29/2018 | Kelly |
| 7 | Goldy | 11/29/2018 | Kelly |
| 7 | Goldy | 11/29/2018 | Kelly |
I have tried using group_by and filter but this would not work because there will be distinct supplier_id's for every buyer.I have also tried using duplicate but not sure how can I group the supplier_id for each buyer.
df <-df %>% group_by(Buyer) %>% filter(Supplier_id>1)
and also this
df2=df[duplicated(df[1]) | duplicated(df[1], fromLast=TRUE),]
EDIT: The original dataset has many such instances and there are n occurrences of different supplier_id for each buyer.
What could be other way to get the desired output?
I think you need -
df %>% group_by(Supplier_id, Buyer) %>% filter(n() > 1)

add and subtract by type

I have a SQLite table payments:
+------+--------+-------+
| user | amount | type |
+------+--------+-------+
| AAA | 100 | plus |
| AAA | 200 | plus |
| AAA | 50 | minus |
| BBB | 100 | plus |
| BBB | 20 | minus |
| BBB | 5 | minus |
| CCC | 200 | plus |
| CCC | 300 | plus |
| CCC | 25 | minus |
I need to calculate the sum with type 'plus' and subtract from it the sum with type 'minus' for each user.
The result table should look like this:
+------+--------+
| user | total |
+------+--------+
| AAA | 250 |
| BBB | 75 |
| CCC | 475 |
I think that my query is terrible, and I need help to improve it:
select user,
(select sum(amount) from payments as TABLE1 WHERE TABLE1.type = 'plus' AND
TABLE1.user= TABLE3.user) -
(select sum(amount) from payments as TABLE2 WHERE TABLE2.type = 'minus' AND
TABLE2.user= TABLE3.user) as total
from payments as TABLE3
group by client
order by id asc
The type is easier handled with a CASE expression. And then you can merge the aggregation into the outer query:
SELECT user,
SUM(CASE type
WHEN 'plus' THEN amount
WHEN 'minus' THEN -amount
END) AS total
FROM payments
GROUP BY client
ORDER BY id;

select sql table rows as columns for survey application

I am developing a survey application, a very simple one that has two tables.
table_survey_answers
+------------+------------+----------------+
| customerid | questionID | answer |
+------------+------------+----------------+
| 1 | 100 | Good |
| 1 | 101 | Acceptable |
| 1 | 102 | Excellent |
| 2 | 100 | Not acceptable |
| 2 | 101 | Acceptable |
| 2 | 102 | Good |
+------------+------------+----------------+
table_questions
+------------+-----------------------------------+
| QuestionID | Question |
+------------+-----------------------------------+
| 100 | Kindly rate our customer service? |
| 101 | How fast is our product delivery? |
| 102 | Quality of the Product A? |
+------------+-----------------------------------+
Now I want display survey result as follow in asp.net gridview.
+------------+-----------------------------------+-----------------------------------+---------------------------+
| CustomerID | Kindly rate our customer service? | How fast is our product delivery? | Quality of the Product A? |
+------------+-----------------------------------+-----------------------------------+---------------------------+
| 1 | Good | Acceptable | Excellent |
| 2 | Not Acceptable | acceptable | Good |
+------------+-----------------------------------+-----------------------------------+---------------------------+
I already created tables to get survey responses. Only thing I want export the result in gridview as explained above format.
Use Pivot which will transpose your rows to columns
SELECT *
FROM (SELECT customerid,
answer,
Question
FROM table_questions a
JOIN table_survey_answers b
ON a.QuestionID = b.questionID) a
PIVOT (Max(answer)
FOR Question IN([Kindly rate our customer service?],
[How fast is our product delivery?],
[Quality of the Product A?])) piv
SQL FIDDLE DEMO

Resources