Maybe I'm missing something here but I feel this isn't such an uncommon thing to do in SQL.
I have a DB table1 with a lot of values, and several other tables that each have a foreign key to the primary key of table1. I want for every row in table1 to generate a list of all the tables which reference it.
For example, I have a table called Books, which contains rows describing different books. I then have a table called SchoolLibrary which contains rows of all the books their library has and where they are stored. Not all the books in Books appear in school library. I then have a table called PublicLibrary, which similarly contains rows of all the books they have, as well as information about who checked them out.
I would then want to select all the values in Books and add a column with a list of which libraries reference that book, so it would say either {1, []} or {1, [PublicLibrary]} or {1, [SchoolLibrary, PublicLibrary]} for the book with id 1, etc.
Perhaps more useful would be not only which table references it, but also the id of the row in that table which references it. So the list would be something like {1, [PublicLibrary: 5]}, so the row with id 5 in PublicLibrary references the book with id 1.
How can I do this?
After discussing with someone I realized there's a simple solution to this question. Just do a LEFT JOIN of all the tables:
SELECT Books._id, SchoolLibrary._id, PublicLibrary._id
FROM Books
LEFT JOIN SchoolLibrary ON Books._id = SchoolLibrary.bookId
LEFT JOIN PublicLibrary ON Books._id = PublicLibrary.bookId
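This will return a table with all the values in Books, with a column for each of the other tables. If a row in another table references the book, that column holds the _id of the referencing row. If none does, it'll be NULL. Note that a book referenced by several rows in the same library will appear once per matching row.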
For example (where SchoolLibrary and PublicLibrary have _id, bookId as columns):
Books    SchoolLibrary    PublicLibrary
-----    -------------    -------------
1        1 | 1            1 | 2
2        2 | 3            2 | 3
3
With the above script, it will return the following:
Books._id    SchoolLibrary._id    PublicLibrary._id
---------    -----------------    -----------------
1            1                    NULL
2            NULL                 1
3            2                    2
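To check the result above, here is a minimal sketch that runs the same LEFT JOIN against an in-memory SQLite database from Python. The choice of SQLite and the helper names `books_with_references` and `as_reference_lists` are assumptions made for illustration; the second helper folds the NULL-padded rows into the {id, [Table: rowId]} shape asked for in the question.

```python
import sqlite3


def books_with_references():
    """Run the LEFT JOIN from the answer on the sample data (SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Books (_id INTEGER PRIMARY KEY);
        CREATE TABLE SchoolLibrary (_id INTEGER PRIMARY KEY,
                                    bookId INTEGER REFERENCES Books(_id));
        CREATE TABLE PublicLibrary (_id INTEGER PRIMARY KEY,
                                    bookId INTEGER REFERENCES Books(_id));
        INSERT INTO Books VALUES (1), (2), (3);
        INSERT INTO SchoolLibrary VALUES (1, 1), (2, 3);
        INSERT INTO PublicLibrary VALUES (1, 2), (2, 3);
    """)
    rows = conn.execute("""
        SELECT Books._id, SchoolLibrary._id, PublicLibrary._id
        FROM Books
        LEFT JOIN SchoolLibrary ON Books._id = SchoolLibrary.bookId
        LEFT JOIN PublicLibrary ON Books._id = PublicLibrary.bookId
        ORDER BY Books._id
    """).fetchall()
    conn.close()
    return rows


def as_reference_lists(rows):
    """Fold joined rows into {book_id: [(table_name, row_id), ...]}."""
    result = {}
    for book_id, school_id, public_id in rows:
        refs = result.setdefault(book_id, [])
        # A NULL (None) column means that table has no row for this book.
        if school_id is not None:
            refs.append(("SchoolLibrary", school_id))
        if public_id is not None:
            refs.append(("PublicLibrary", public_id))
    return result


rows = books_with_references()
print(rows)
print(as_reference_lists(rows))
```

If a book can have several rows in both libraries, the two joins multiply against each other, so the folded lists would contain repeats and would need de-duplicating.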
Background:
Hey everyone! I'm hoping you can help me with something that I've been trying to figure out. I have a dataset/table called customer_universe that shows all of our in scope customers. Every row/cust_id in that table is unique.
Let's say this table has 60,000 total rows. Every cust_id entry in this table is unique so total rows = unique row count.
There is also a dataset that I created (customer_sport_product_purch) that lists all of the customers (from the customer_universe table) and any of the 3 in-scope sports products they purchased, along with a purchase date. This table only contains customers who have purchased one of the three sport products, but since there are three sport products and a customer may have purchased more than one, the cust_id field does not contain only unique customers.
Let's say this table has 46,000 total rows but only 25,000 unique customers.
Goal Query Output:
I need to write a query that lists out every customer in the customer_universe table and one more column with a binary (1/0) value that will indicate if they have purchased a sport product or not.
So this query output should have a total of 60000 records and only two columns.
Environment and Attempted Solutions Details
I'm currently building these queries using Impala in Hue. I'm trying to use a case statement to get me my desired result but I'm getting the error message provided below.
Customer_universe Table:
Cust_ID    Customer_Since
-------    --------------
1          02-20-2019
2          01-13-2020
3          06-17-2012
4          06-19-2021
5          06-06-2017
Customer_sport_product_purch Table:
Cust_ID    Product       Purch_Dt
-------    ----------    ----------
1          Basketball    01-01-2022
1          BoxGlove      02-01-2020
5          BoxGlove      12-15-2019
Desired Query Output:
Cust_ID    Sport_Purch
-------    -----------
1          1
2          0
3          0
4          0
5          1
Queries I've attempted and the Error Messages I've Received:
Query 1:
SELECT a.cust_id,
case when (a.cust_id in (select distinct b.cust_id from DB.customer_sport_purch b)
then 1 else 0 end as Sport_Purch
FROM DB.customer_universe
GROUP BY cust_id;
Error Message 1:
Error while compiling statement: FAILED: SemanticException [Error 10249]: line 2:72 Unsupported SubQuery Expression 'cust_id': Currently SubQuery expressions are only allowed as Where Clause predicates
Query 2:
SELECT a.cust_id,
case when (a.cust_id in sportPurch) then 1 else 0 end as Sport_Purch
FROM DB.customer_universe a,
(select distinct cust_id from DB.customer_sport_purch) sportPurch
GROUP BY a.cust_id;
Error Message 2:
Error while compiling statement: FAILED: ParseException line 2:36 cannot recognize input near 'sportPurch' ')' 'then' in expression specification
Other Considerations:
I cannot export the customer_sport_product_purch.cust_id values into a text file and have the query read from that file, since those values will change frequently and I need to be able to just re-execute the queries.
Thanks in advance!
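One way to get the desired output without a subquery inside the CASE expression is to LEFT JOIN the universe table to the distinct purchasers and test the joined key for NULL. The sketch below is an assumption-laden illustration: it runs against in-memory SQLite from Python rather than Impala or Hive, it uses the sample rows from the question, and the function name `sport_purchase_flags` is made up. The SQL itself is plain LEFT JOIN plus CASE, but it has not been tested on the asker's engine.

```python
import sqlite3


def sport_purchase_flags():
    """Return (cust_id, 1/0) for every customer in customer_universe."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer_universe (cust_id INTEGER PRIMARY KEY,
                                        customer_since TEXT);
        CREATE TABLE customer_sport_product_purch (cust_id INTEGER,
                                                   product TEXT,
                                                   purch_dt TEXT);
        INSERT INTO customer_universe VALUES
            (1, '02-20-2019'), (2, '01-13-2020'), (3, '06-17-2012'),
            (4, '06-19-2021'), (5, '06-06-2017');
        INSERT INTO customer_sport_product_purch VALUES
            (1, 'Basketball', '01-01-2022'),
            (1, 'BoxGlove',   '02-01-2020'),
            (5, 'BoxGlove',   '12-15-2019');
    """)
    rows = conn.execute("""
        SELECT u.cust_id,
               CASE WHEN p.cust_id IS NULL THEN 0 ELSE 1 END AS sport_purch
        FROM customer_universe u
        -- DISTINCT keeps one row per purchaser, so the join cannot
        -- duplicate customers who bought several products.
        LEFT JOIN (SELECT DISTINCT cust_id
                   FROM customer_sport_product_purch) p
               ON u.cust_id = p.cust_id
        ORDER BY u.cust_id
    """).fetchall()
    conn.close()
    return rows


print(sport_purchase_flags())
```

Because the derived table has at most one row per cust_id, the output keeps exactly one row per customer in customer_universe and no GROUP BY is needed.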
I essentially have 4 tables, but not all the tables have common fields
Table 1 has A
Table 2 has A and B
Table 3 has B and C
Table 4 has C
so when I tried to join them all, it doesn't work because
SQLITE_ERROR: cannot join using column C - column not present in all tables
Which I understand: not all the tables share the same columns.
I tried creating a view (TABLE_ABC) using "table1, table2, and table3", then tried doing a join to that view
Join TABLE_ABC using (C)
but I get the same SQLITE Error.
So my questions are:
Is there a way to join all 4 tables even though they all do not share a column? Do I just need to create a 5th table using "table1, table2, and table3" and connect 4 to that?
Can you do a join to a view?
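For what it's worth, here is a small sketch suggesting that both approaches can work in SQLite: chaining the joins so each USING clause names only the column shared at that step, and joining table4 to a view built from the first three tables. The column types, the sample values, and the function name `join_four_tables` are assumptions; only the table layout (A / A,B / B,C / C) and the view name TABLE_ABC come from the question.

```python
import sqlite3


def join_four_tables():
    """Join table1..table4 two ways: chained USING clauses, and via a view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (A INTEGER);
        CREATE TABLE table2 (A INTEGER, B INTEGER);
        CREATE TABLE table3 (B INTEGER, C INTEGER);
        CREATE TABLE table4 (C INTEGER);
        INSERT INTO table1 VALUES (1);
        INSERT INTO table2 VALUES (1, 10);
        INSERT INTO table3 VALUES (10, 100);
        INSERT INTO table4 VALUES (100);

        -- The view exposes C, so it can later be joined to table4 on C.
        CREATE VIEW TABLE_ABC AS
            SELECT table1.A AS A, table2.B AS B, table3.C AS C
            FROM table1
            JOIN table2 USING (A)
            JOIN table3 USING (B);
    """)
    # Each USING clause names only the column shared by the tables joined so
    # far and the table being added.
    chained = conn.execute("""
        SELECT table1.A, table2.B, table3.C
        FROM table1
        JOIN table2 USING (A)
        JOIN table3 USING (B)
        JOIN table4 USING (C)
    """).fetchall()
    via_view = conn.execute("""
        SELECT A, B, C
        FROM TABLE_ABC
        JOIN table4 USING (C)
    """).fetchall()
    conn.close()
    return chained, via_view


print(join_four_tables())
```

If the original error came from listing several columns in one USING clause, or from a view that did not select column C, either of those would explain why the join to the view failed in the same way.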
I have a simple one-to-many relationship.
There are three tables
Main:
ID TITLE
________
1 Peter
2 Lars
Orders:
SKU MAIN_ID
___________
RFX 1
HNI 2
RRP 2
Tools:
NAME MAIN_ID
____________
FORK 1
KNIFE 1
SPOON 2
So Orders and Tools have a MAIN_ID which refers to the Main table.
So Peter has the order RFX and the tools FORK and KNIFE.
Lars has the orders HNI and RRP and the tool SPOON.
How can I write a single query to find out which orders and tools Peter has and which ones Lars has?
I tried it with an inner join but then there are duplicate entries.
You probably want to use group_concat() to get the values in one row. However, you need to pre-aggregate the data before the join:
select m.*, o.skus, t.tools
from main m join
(select main_id, group_concat(sku) as skus
from orders
group by main_id
) o
on o.main_id = m.id join
(select main_id, group_concat(name) as tools
from tools
group by main_id
) t
on t.main_id = m.id;
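As a quick check of the pre-aggregation idea, the sketch below runs the query above on the sample data in an in-memory SQLite database from Python. SQLite is an assumption here (the question does not name the database, though group_concat() exists in both SQLite and MySQL), and the function name `orders_and_tools` is made up for the example.

```python
import sqlite3


def orders_and_tools():
    """Return one row per person with comma-separated SKUs and tools."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE main   (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE orders (sku TEXT, main_id INTEGER);
        CREATE TABLE tools  (name TEXT, main_id INTEGER);
        INSERT INTO main   VALUES (1, 'Peter'), (2, 'Lars');
        INSERT INTO orders VALUES ('RFX', 1), ('HNI', 2), ('RRP', 2);
        INSERT INTO tools  VALUES ('FORK', 1), ('KNIFE', 1), ('SPOON', 2);
    """)
    rows = conn.execute("""
        SELECT m.title, o.skus, t.tools
        FROM main m
        -- Aggregate each child table on its own first, so the two
        -- one-to-many joins cannot multiply each other's rows.
        JOIN (SELECT main_id, group_concat(sku) AS skus
              FROM orders
              GROUP BY main_id) o ON o.main_id = m.id
        JOIN (SELECT main_id, group_concat(name) AS tools
              FROM tools
              GROUP BY main_id) t ON t.main_id = m.id
        ORDER BY m.id
    """).fetchall()
    conn.close()
    return rows


for title, skus, tools in orders_and_tools():
    print(title, skus, tools)
```

The order of values inside each concatenated string is not guaranteed. Also, because these are inner joins, a person with no orders or no tools would drop out of the result; switching to LEFT JOIN would keep them with NULLs.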
I have an SQLite Database, two of the tables look like this:
ID Name
1 Test1
2 Test2
3 Test3
4 Test4
ID Color
1 Blue
1 White
1 Red
2 Green
2 Red
4 Black
In the first table, ID is unique. The second table lists the colors an ID has, which can be anywhere from 0 to n colors.
Now I want to select all Names exactly once that have one or more of the given colors. Let's say I want all names associated with blue, white and/or green. The result set should have the IDs 1 and 2.
I am completely lost here, as I normally don't do any SQL and am only familiar with very basic SQL. What I would do is join the tables together, but I don't know how to do that, as ID is not unique in the second table. There would also be the problem of IDs being duplicated in the result set if an ID has several of the colors that I want to select.
Thanks in advance for any help.
You don't need a join for this. Get the list of IDs from the color table in a subquery, and fetch the names from the test table with an in clause:
sqlite> select * from tests where id in
(select id from colors where color in ('Blue', 'White', 'Green'));
1|Test1
2|Test2
Duplicates don't matter in the subquery, but you could use distinct if you want that list without duplicates in other contexts.
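Here is the same IN-subquery idea as a small runnable sketch using Python's sqlite3 module, with the color list passed as bound parameters instead of being typed into the SQL. The table names `tests` and `colors` follow the answer; the column name `color` and the function name `names_with_any_color` are assumptions based on the table headers in the question.

```python
import sqlite3


def names_with_any_color(wanted):
    """Return (id, name) rows that have at least one of the wanted colors."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tests  (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE colors (id INTEGER, color TEXT);
        INSERT INTO tests VALUES (1, 'Test1'), (2, 'Test2'),
                                 (3, 'Test3'), (4, 'Test4');
        INSERT INTO colors VALUES (1, 'Blue'), (1, 'White'), (1, 'Red'),
                                  (2, 'Green'), (2, 'Red'), (4, 'Black');
    """)
    # One "?" placeholder per color; the values themselves are bound, not
    # interpolated into the SQL text. The list must not be empty.
    placeholders = ", ".join("?" for _ in wanted)
    rows = conn.execute(
        "SELECT id, name FROM tests "
        "WHERE id IN (SELECT id FROM colors WHERE color IN (%s)) "
        "ORDER BY id" % placeholders,
        list(wanted),
    ).fetchall()
    conn.close()
    return rows


print(names_with_any_color(["Blue", "White", "Green"]))
```

ID 1 matches two of the three colors but still appears only once, because IN tests membership rather than joining row by row.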
As the title says, I need a method of flattening multiple rows into a one-line output per account. For example, the table looks like this:
Account Transaction
12345678 ABC
12345678 DEF
12345678 GHI
67891011 ABC
67891011 JKL
I need the output to be:
12345678|ABC|DEF|GHI
67891011|ABC|JKL
The number of transactions is unknown. For some accounts it could be 1 or 2, for others it could run into the hundreds.
You can do this using a customised version of Tom Kyte's STRAGG function, like this:
select account||'|'||stragg(transaction)
from mytable
where ...
group by account;
The function as given uses commas to separate the values, but you can easily change it to use '|'.
An example using EMP (and with commas still):
SQL> select deptno || '|' || stragg(ename) names
2 from emp
3 group by deptno;
NAMES
--------------------------------------------------------------------------------
10|CLARK,KING,FARMER,MILLER
20|JONES,FORD,SCOTT
30|ALLEN,TURNER,WARD,MARTIN,BLAKE
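STRAGG is a user-defined Oracle function, so it cannot be run as-is elsewhere. As a rough illustration of the same GROUP BY plus string-aggregate pattern, the sketch below uses SQLite's built-in group_concat() with '|' as the separator, driven from Python. It is a stand-in, not the Oracle solution: the column name `txn` (chosen to avoid the keyword TRANSACTION), the function name `flatten_accounts`, and the assumption that GHI belongs to account 12345678 (as the expected output implies) are all mine. On Oracle 11g Release 2 and later, the built-in LISTAGG aggregate plays a similar role.

```python
import sqlite3


def flatten_accounts():
    """Return one 'account|txn|txn|...' string per account."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE mytable (account TEXT, txn TEXT);
        INSERT INTO mytable VALUES
            ('12345678', 'ABC'), ('12345678', 'DEF'), ('12345678', 'GHI'),
            ('67891011', 'ABC'), ('67891011', 'JKL');
    """)
    rows = conn.execute("""
        SELECT account || '|' || group_concat(txn, '|')
        FROM mytable
        GROUP BY account
        ORDER BY account
    """).fetchall()
    conn.close()
    # Each row is a 1-tuple; unwrap to plain strings.
    return [line for (line,) in rows]


for line in flatten_accounts():
    print(line)
```

The order of the transactions inside each line is not guaranteed by the aggregate, so anything that depends on it should sort explicitly.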