Join Cloud SQL data models with aggregated field - google-app-maker

I need to create a view for my app that displays the data from two Cloud SQL models in a single table with one field aggregated.
My data models look like this (simplified).
Projects Table
projectID, projectName , projectBudget
0001 , Project Alpha, 25000
0002 , Project Beta , 2000
Costs Table
projectID, costAccrued
0001 , 1000
0001 , 5000
0002 , 1000
0001 , 8000
0002 , 500
0002 , 300
Combined Table
I am trying to create a view that shows the data combined where costAccrued is summed.
projectID, projectName , projectBudget, costAccrued
0001 , Project Alpha, 25000 , 14000
0002 , Project Beta , 2000 , 1800
I haven't been able to find any code examples where Cloud SQL tables are joined with an aggregated field but I'm given to understand that creating a Calculated SQL table is the correct way to do this.
However, I'm stuck on how to write the SQL query. This is what I've tried to use but it doesn't work and I think there is something to do with parameters that I'm not understanding.
SELECT projects.`projectID`, projects.`projectName`, projects.`projectBudget`, SUM(costs.`costAccrued`)
FROM projects
LEFT JOIN costs on projects.`projectId` = costs.`projectID`
GROUP BY projects.`projectID`;

I solved this myself. The issue had to do with the output column names not matching the field names in the Calculated SQL model.
The correct query looks like this (note each output column is labeled to match the names in my Calculated SQL model):
SELECT projects.`projectID` as "projectId", projects.`projectName` as "name", projects.`projectBudget` as "budget", SUM(costs.`costAccrued`) as "costAccrued"
FROM projects
LEFT JOIN costs on projects.`projectId` = costs.`projectID`
GROUP BY projects.`projectID`;
Reference: https://developers.google.com/appmaker/models/cloudsql#calculated_sql_models
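For anyone who wants to try the aliased query outside App Maker, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite is only a stand-in for Cloud SQL (MySQL) here, and listing every non-aggregated column in GROUP BY is my own addition for portability; the table data comes from the question.

```python
import sqlite3

# In-memory SQLite stands in for Cloud SQL (MySQL); this is an assumption
# made only so the example is runnable anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (projectID TEXT PRIMARY KEY, projectName TEXT, projectBudget INTEGER);
CREATE TABLE costs (projectID TEXT, costAccrued INTEGER);
INSERT INTO projects VALUES ('0001', 'Project Alpha', 25000), ('0002', 'Project Beta', 2000);
INSERT INTO costs VALUES ('0001', 1000), ('0001', 5000), ('0002', 1000),
                         ('0001', 8000), ('0002', 500), ('0002', 300);
""")

# Each output column is aliased to the field name the calculated model expects.
cursor = conn.execute("""
SELECT p.projectID     AS projectId,
       p.projectName   AS name,
       p.projectBudget AS budget,
       SUM(c.costAccrued) AS costAccrued
FROM projects p
LEFT JOIN costs c ON p.projectID = c.projectID
GROUP BY p.projectID, p.projectName, p.projectBudget
ORDER BY p.projectID
""")

# The aliases are what the caller sees as column names.
column_names = [column[0] for column in cursor.description]
rows = cursor.fetchall()
print(column_names)
for row in rows:
    print(row)
```

The printed column names are the aliases, which is the part the Calculated SQL model cares about.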

Related

Impala Using CASE to determine if an ID from one Table is in another Table

Background:
Hey everyone! I'm hoping you can help me with something that I've been trying to figure out. I have a dataset/table called customer_universe that shows all of our in scope customers. Every row/cust_id in that table is unique.
Let's say this table has 60,000 total rows. Every cust_id entry in this table is unique so total rows = unique row count.
There is also a dataset that I created (customer_sport_product_purch) that lists all of the customers (from the customer_universe table) who purchased any of the 3 in-scope sports products, along with a purchase date. This table only contains customers who have purchased one of the three sport products, but since there are three sport products and a customer may have purchased several, the cust_id field does not contain only unique customers.
Let's say this table has 46,000 total rows but only 25,000 unique customers.
Goal Query Output:
I need to write a query that lists out every customer in the customer_universe table and one more column with a binary (1/0) value that will indicate if they have purchased a sport product or not.
So this query output should have a total of 60000 records and only two columns.
Environment and Attempted Solutions Details
I'm currently building these queries using Impala in Hue. I'm trying to use a case statement to get me my desired result but I'm getting the error message provided below.
Customer_universe Table:
Cust_ID  Customer_Since
1        02-20-2019
2        01-13-2020
3        06-17-2012
4        06-19-2021
5        06-06-2017
Customer_sport_product_purch Table:
Cust_ID  Product     Purch_Dt
1        Basketball  01-01-2022
1        BoxGlove    02-01-2020
5        BoxGlove    12-15-2019
Desired Query Output:
Cust_ID  Sport_Purch
1        1
2        0
3        0
4        0
5        1
Queries I've attempted and the Error Messages I've Received:
Query 1:
SELECT a.cust_id,
case when (a.cust_id in (select distinct b.cust_id from DB.customer_sport_purch b))
then 1 else 0 end as Sport_Purch
FROM DB.customer_universe a
GROUP BY a.cust_id;
Error Message 1:
Error while compiling statement: FAILED: SemanticException [Error 10249]: line 2:72 Unsupported SubQuery Expression 'cust_id': Currently SubQuery expressions are only allowed as Where Clause predicates
Query 2:
SELECT a.cust_id,
case when (a.cust_id in sportPurch) then 1 else 0 end as Sport_Purch
FROM DB.customer_universe a,
(select distinct cust_id from DB.customer_sport_purch) sportPurch
GROUP BY a.cust_id;
Error Message 2:
Error while compiling statement: FAILED: ParseException line 2:36 cannot recognize input near 'sportPurch' ')' 'then' in expression specification
Other Considerations:
I cannot bring the customer_sport_table.cust_id values into a text file and have the query read from that file, since those values will change frequently and I need to be able to simply re-execute the queries.
Thanks in advance!
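One possible way to get the 1/0 flag without a subquery inside CASE is to LEFT JOIN the universe table to the distinct purchaser IDs and test the joined column for NULL. The sketch below runs that idea on the sample rows from the question, using an in-memory SQLite database from Python as a stand-in. It is untested on Impala itself, though it only uses a plain LEFT JOIN against a derived table.

```python
import sqlite3

# SQLite stands in for Impala here (an assumption for runnability);
# table and column names follow the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_universe (cust_id INTEGER PRIMARY KEY, customer_since TEXT);
CREATE TABLE customer_sport_product_purch (cust_id INTEGER, product TEXT, purch_dt TEXT);
INSERT INTO customer_universe VALUES
    (1, '02-20-2019'), (2, '01-13-2020'), (3, '06-17-2012'),
    (4, '06-19-2021'), (5, '06-06-2017');
INSERT INTO customer_sport_product_purch VALUES
    (1, 'Basketball', '01-01-2022'), (1, 'BoxGlove', '02-01-2020'),
    (5, 'BoxGlove', '12-15-2019');
""")

# The derived table holds one row per purchaser, so the LEFT JOIN cannot
# duplicate universe rows; unmatched customers get NULL and therefore 0.
flags = conn.execute("""
SELECT u.cust_id,
       CASE WHEN p.cust_id IS NOT NULL THEN 1 ELSE 0 END AS sport_purch
FROM customer_universe u
LEFT JOIN (SELECT DISTINCT cust_id FROM customer_sport_product_purch) p
       ON u.cust_id = p.cust_id
ORDER BY u.cust_id
""").fetchall()

for cust_id, sport_purch in flags:
    print(cust_id, sport_purch)
```

Because the purchaser IDs are de-duplicated before the join, the output keeps exactly one row per customer in customer_universe and no GROUP BY is needed.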

SQLite - Joining 2 tables excluding certain rows based on a partial string match

Imagine I have two tables:
Table A
Names  Sales  Department
Dave   5      Shoes
mike   6      Apparel
Dan    7      Front End
Table B
Names  SALES  Department
Dave   5      Shoes
mike   12     Apparel
Dan    7      Front End
Gregg  23     Shoes
Kim    15     Front End
I want to create a query that joins the tables by names and separates sum of sales by table. I additionally want to filter my query to remove string matches or partial matches in this case by certain names.
What I want is the following result
Table C:
A Sales Sum  B Sales Sum
18           24
I know I can do this with a query like the following:
SELECT SUM(A.sales) AS 'A Sales Sum', SUM(B.sales) AS 'B sales Sum' FROM A
JOIN B
ON B.names = A.Names
WHERE Names NOT LIKE '%Gregg%' OR NOT LIKE '%Kim%'
The problem with this is the WHERE clause doesn't seem to apply, or applies to the wrong table. Since the Names column doesn't exactly match between the two, what I think is happening is when they are joined 'ON B.names = A.Names', the extras from B are being excluded? When I flip things around though I get the same result, which is no filter being applied. The wrong result I am getting is the following:
Table D:
A Sales Sum  B Sales Sum
18           62
Clearly I have a syntax issue here since I'm pretty new to SQL. What am I missing? Thanks!

You don't need a join or a union of the tables and you shouldn't do it.
Aggregate in each table separately and return the results with 2 subqueries:
SELECT
(SELECT SUM(Sales) FROM A WHERE Names NOT LIKE '%Gregg%' AND Names NOT LIKE '%Kim%') ASalesSum,
(SELECT SUM(Sales) FROM B WHERE Names NOT LIKE '%Gregg%' AND Names NOT LIKE '%Kim%') BSalesSum
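A quick way to check the two-subquery version is to run it on the sample tables from the question. This sketch uses Python's built-in sqlite3 module with an in-memory database; the data is copied from Table A and Table B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (Names TEXT, Sales INTEGER, Department TEXT);
CREATE TABLE B (Names TEXT, Sales INTEGER, Department TEXT);
INSERT INTO A VALUES ('Dave', 5, 'Shoes'), ('mike', 6, 'Apparel'), ('Dan', 7, 'Front End');
INSERT INTO B VALUES ('Dave', 5, 'Shoes'), ('mike', 12, 'Apparel'), ('Dan', 7, 'Front End'),
                     ('Gregg', 23, 'Shoes'), ('Kim', 15, 'Front End');
""")

# Each table is filtered and summed on its own, so no join is involved
# and the name filter applies to exactly the table it is written against.
totals = conn.execute("""
SELECT
  (SELECT SUM(Sales) FROM A
    WHERE Names NOT LIKE '%Gregg%' AND Names NOT LIKE '%Kim%') AS ASalesSum,
  (SELECT SUM(Sales) FROM B
    WHERE Names NOT LIKE '%Gregg%' AND Names NOT LIKE '%Kim%') AS BSalesSum
""").fetchone()

print(totals)  # (18, 24)
```

Note the AND between the two NOT LIKE conditions: a name has to fail both patterns to be kept, which is what the OR in the original WHERE clause could not express.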
I think you want a union approach here:
SELECT
SUM(CASE WHEN src = 'A' THEN sales ELSE 0 END) AS "A Sales Sum",
SUM(CASE WHEN src = 'B' THEN sales ELSE 0 END) AS "B Sales Sum"
FROM
(
SELECT sales, 'A' AS src FROM A WHERE Names NOT IN ('Gregg', 'Kim')
UNION ALL
SELECT sales, 'B' FROM B WHERE Names NOT IN ('Gregg', 'Kim')
) t;
Here is a demo showing that the above query is working.
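The UNION ALL version can be checked the same way. Below is a small sketch with Python's sqlite3 module and the sample rows from the question; nothing beyond the question's own tables is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (Names TEXT, Sales INTEGER, Department TEXT);
CREATE TABLE B (Names TEXT, Sales INTEGER, Department TEXT);
INSERT INTO A VALUES ('Dave', 5, 'Shoes'), ('mike', 6, 'Apparel'), ('Dan', 7, 'Front End');
INSERT INTO B VALUES ('Dave', 5, 'Shoes'), ('mike', 12, 'Apparel'), ('Dan', 7, 'Front End'),
                     ('Gregg', 23, 'Shoes'), ('Kim', 15, 'Front End');
""")

# Rows from both tables are stacked with a source tag, then summed per tag
# with conditional aggregation.
union_totals = conn.execute("""
SELECT
  SUM(CASE WHEN src = 'A' THEN sales ELSE 0 END) AS "A Sales Sum",
  SUM(CASE WHEN src = 'B' THEN sales ELSE 0 END) AS "B Sales Sum"
FROM (
  SELECT Sales AS sales, 'A' AS src FROM A WHERE Names NOT IN ('Gregg', 'Kim')
  UNION ALL
  SELECT Sales, 'B' FROM B WHERE Names NOT IN ('Gregg', 'Kim')
) t
""").fetchone()

print(union_totals)  # (18, 24)
```

One difference from the NOT LIKE version: NOT IN ('Gregg', 'Kim') only excludes exact matches, so a partial match such as a hypothetical 'Kimberly' would still be counted. If partial matches matter, keep the NOT LIKE '%...%' filters inside each branch instead.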

update one table with 2 where conditions in the same and one condition in another table

I have 2 tables, fees and students. I want to update one field of fees with 3 WHERE conditions, i.e., 2 conditions on table 'fees' and 1 condition on table 'students'.
I tried many queries like
UPDATE fees, students SET fees.dues= 300 WHERE fees.month= November
AND fees.session= 2017-18 AND students.class= Nursery
It gives me an error like java.sql.SQLException: near ",": syntax error
I am using SQLite as the database. Please suggest a query or help me correct this one.
Thanks

SQLite does not accept a comma-separated list of tables in an UPDATE command (and before version 3.33.0 it had no UPDATE ... FROM either). Instead, use a subquery in the WHERE condition:
UPDATE fees
SET dues = 300
WHERE
month = November AND
session = 2017-18 AND
student_id IN (SELECT id FROM students WHERE class=Nursery)
Also, I am not sure about the types of your columns. String literals must be enclosed in single quotes ('). The expression 2017-18 would yield the number 2017 minus 18 = 1999. Should it be a string literal as well?
UPDATE fees
SET dues = 300
WHERE
month = 'November' AND
session = '2017-18' AND
student_id IN (SELECT id FROM students WHERE class='Nursery')
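To see the subquery form in action, here is a minimal sketch using Python's sqlite3 module. The schema (fees.student_id referencing students.id) and the sample rows are assumptions made for illustration, since the question does not show the table definitions. Passing the values as bound parameters also sidesteps the quoting problem, because the driver handles the string literals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows; only the column names used in the
# answer's query are taken from the thread.
conn.executescript("""
CREATE TABLE students (id INTEGER PRIMARY KEY, class TEXT);
CREATE TABLE fees (student_id INTEGER, month TEXT, session TEXT, dues INTEGER);
INSERT INTO students VALUES (1, 'Nursery'), (2, 'Prep');
INSERT INTO fees VALUES (1, 'November', '2017-18', 0),
                        (1, 'October',  '2017-18', 0),
                        (2, 'November', '2017-18', 0);
""")

# Two conditions on fees itself, one condition on students via the subquery.
cursor = conn.execute(
    """
    UPDATE fees
    SET dues = ?
    WHERE month = ?
      AND session = ?
      AND student_id IN (SELECT id FROM students WHERE class = ?)
    """,
    (300, "November", "2017-18", "Nursery"),
)
conn.commit()
updated = cursor.rowcount

fee_rows = conn.execute(
    "SELECT student_id, month, dues FROM fees ORDER BY student_id, month"
).fetchall()
print(updated, fee_rows)
```

Only the November 2017-18 row of the Nursery student is changed; the other two rows keep their original dues.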

How to build a report with a total and breakout columns with SQL Server and Reporting Services

I have a data structure where I have two tables, Alpha and Beta, and they are one to many. For the sake of an example let's say that table Alpha has a column for "State" and table Beta has "Colors you like" and you can pick more than one. I would like to build a report that has columns like this:
STATE TOTAL RED GREEN BLUE
Alaska 5 1 3 1
Florida 2 2 2 0
New York 10 5 8 1
The column TOTAL would be a count of the records in Alpha and as you can see due to the one to many relationship the sum of the colors can exceed the count. I suppose it could be less as well if people didn't like colors.
How would you build a report like this? I'll be using SQL Server and Reporting Services in .NET so it could either be a complex query that I just dump into a data table report or a less complex query with some counting and totaling done by the report. I just don't really know the best way to tackle this.

Since you don't know which colors are going to be the columns, you should use the Matrix control.
You'll need to set up the query:
SELECT
a.State,
b.ColorName,
COUNT(b.ColorID) ColorCount
FROM
alpha a
LEFT JOIN beta b
ON a.id = b.a_id
GROUP BY
a.State,
b.ColorName
Just drag State onto the rows, ColorName onto the columns, and ColorCount into the data cell (Count(ColorID) will display in the data field).
Note: The LEFT JOIN and Count(ColorID) instead of Count(*) are required if you want a 0 value to appear correctly.
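The difference between COUNT(ColorID) and COUNT(*) after a LEFT JOIN is easy to see on a tiny data set. This sketch uses Python's sqlite3 module rather than SQL Server, and the two states and single color row are made-up sample data; the counting behaviour shown is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample rows: Florida has no matching row in beta.
conn.executescript("""
CREATE TABLE alpha (id INTEGER PRIMARY KEY, State TEXT);
CREATE TABLE beta (a_id INTEGER, ColorID INTEGER, ColorName TEXT);
INSERT INTO alpha VALUES (1, 'Alaska'), (2, 'Florida');
INSERT INTO beta VALUES (1, 10, 'RED');
""")

# COUNT(column) skips NULLs produced by the LEFT JOIN; COUNT(*) counts the
# NULL-padded row as well, so it can never report 0 for an unmatched state.
counts = conn.execute("""
SELECT a.State,
       COUNT(b.ColorID) AS ColorCount,
       COUNT(*)         AS StarCount
FROM alpha a
LEFT JOIN beta b ON a.id = b.a_id
GROUP BY a.State
ORDER BY a.State
""").fetchall()

for state, color_count, star_count in counts:
    print(state, color_count, star_count)
```

Florida comes back with ColorCount 0 but StarCount 1, which is why the note insists on counting the column from the joined table.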
If you did know the colors, you could use PIVOT or the sum case technique:
SELECT state, SUM(CASE WHEN Color = 'RED' THEN 1 ELSE 0 END) as Red, ...
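Spelled out in full, the sum case technique might look like the sketch below. It runs on SQLite through Python's sqlite3 module instead of SQL Server, the sample rows are invented, and using COUNT(DISTINCT a.id) for the TOTAL column is my assumption about how to count Alpha records without the one-to-many join inflating the number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: two people in Alaska, one in Florida who
# picked no colors at all.
conn.executescript("""
CREATE TABLE alpha (id INTEGER PRIMARY KEY, State TEXT);
CREATE TABLE beta (a_id INTEGER, ColorID INTEGER, ColorName TEXT);
INSERT INTO alpha VALUES (1, 'Alaska'), (2, 'Alaska'), (3, 'Florida');
INSERT INTO beta VALUES (1, 1, 'RED'), (1, 2, 'GREEN'), (2, 2, 'GREEN');
""")

# One SUM(CASE ...) per known color turns rows into columns; DISTINCT keeps
# the total from being multiplied by the number of colors per person.
report = conn.execute("""
SELECT a.State,
       COUNT(DISTINCT a.id) AS TotalCount,
       SUM(CASE WHEN b.ColorName = 'RED'   THEN 1 ELSE 0 END) AS Red,
       SUM(CASE WHEN b.ColorName = 'GREEN' THEN 1 ELSE 0 END) AS Green,
       SUM(CASE WHEN b.ColorName = 'BLUE'  THEN 1 ELSE 0 END) AS Blue
FROM alpha a
LEFT JOIN beta b ON a.id = b.a_id
GROUP BY a.State
ORDER BY a.State
""").fetchall()

for row in report:
    print(row)
```

As in the question's example, the color counts for a state can add up to more than its total, because one person may like several colors.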

Get Correct Price based on Effectivity Date

I have a problem getting the right "Price" for a product based on Effectivity date.
Example, I have 2 tables:
a. "Transaction" table --> this contains the products ordered, and
b. "Item Master" table --> this contains the product prices and effectivity dates of those prices
Inside the "Transaction" table:
INVOICE_NO INVOICE_DATE PRODUCT_PKG_CODE PRODUCT_PKG_ITEM
1234 6/29/2009 ProductA ProductA-01
1234 6/29/2009 ProductA ProductA-02
1234 6/29/2009 ProductA ProductA-03
Inside the "Item_Master" table:
PRODUCT_PKG_CODE PRODUCT_PKG_ITEM PRODUCT_ITEM_PRICE EFFECTIVITY_DATE
ProductA ProductA-01 25 6/1/2009
ProductA ProductA-02 22 6/1/2009
ProductA ProductA-03 20 6/1/2009
ProductA ProductA-01 15 5/1/2009
ProductA ProductA-02 12 5/1/2009
ProductA ProductA-03 10 5/1/2009
ProductA ProductA-01 19 4/1/2009
ProductA ProductA-02 17 4/1/2009
ProductA ProductA-03 15 4/1/2009
In my report, I need to display the Invoices and Orders, as well as the Price of the Order Item which was effective at the time it was paid (Invoice Date).
My query looks like this (my source db is Oracle):
SELECT T.INVOICE_NO,
T.INVOICE_DATE,
T.PRODUCT_PKG_CODE,
T.PRODUCT_PKG_ITEM,
P.PRODUCT_ITEM_PRICE
FROM TRANSACTION T,
ITEM_MASTER P
WHERE T.PRODUCT_PKG_CODE = P.PRODUCT_PKG_CODE
AND T.PRODUCT_PKG_ITEM = P.PRODUCT_PKG_ITEM
AND P.EFFECTIVITY_DATE <= T.INVOICE_DATE
AND T.INVOICE_NO = '1234';
...which shows 2 prices for each item.
I did some other different query styles but to no avail, so I decided it's time to get help. :)
Thanks to any of you who can share your knowledge. --CJ--
p.s. Sorry, my post doesn't even look right! :D

If it's returning two rows with different effective dates that are on or before the invoice date, you may want to change your date join to
AND P.EFFECTIVITY_DATE = (
select max(effectivity_date)
from item_master
where product_pkg_code = t.product_pkg_code
and product_pkg_item = t.product_pkg_item
and effectivity_date <= t.invoice_date)
or something like that, to only get the one price that is the most recent one before the invoice date.
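Here is a runnable sketch of the MAX(effectivity_date) idea, with the subquery correlated on the product columns and compared against the price row's own effectivity date. It uses SQLite through Python's sqlite3 module rather than Oracle. Three things are assumptions for the sake of the demo: the table is named transactions (TRANSACTION is a keyword in SQLite), dates are stored as ISO yyyy-mm-dd text so they compare correctly as strings, and invoice 1235 is a hypothetical extra row added to show an earlier price being picked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (invoice_no TEXT, invoice_date TEXT,
                           product_pkg_code TEXT, product_pkg_item TEXT);
CREATE TABLE item_master (product_pkg_code TEXT, product_pkg_item TEXT,
                          product_item_price INTEGER, effectivity_date TEXT);
""")
items = ["ProductA-01", "ProductA-02", "ProductA-03"]
conn.executemany("INSERT INTO transactions VALUES (?, ?, 'ProductA', ?)",
                 [("1234", "2009-06-29", item) for item in items])
# Hypothetical invoice dated before the June price change.
conn.execute("INSERT INTO transactions VALUES ('1235', '2009-05-15', 'ProductA', 'ProductA-01')")
price_history = {"2009-06-01": (25, 22, 20),
                 "2009-05-01": (15, 12, 10),
                 "2009-04-01": (19, 17, 15)}
for date, prices in price_history.items():
    conn.executemany("INSERT INTO item_master VALUES ('ProductA', ?, ?, ?)",
                     [(item, price, date) for item, price in zip(items, prices)])

# Keep only the price row whose effectivity date is the latest one on or
# before the invoice date, for that same product item.
priced = conn.execute("""
SELECT t.invoice_no, t.product_pkg_item, p.product_item_price
FROM transactions t
JOIN item_master p
  ON p.product_pkg_code = t.product_pkg_code
 AND p.product_pkg_item = t.product_pkg_item
WHERE p.effectivity_date = (
        SELECT MAX(m.effectivity_date)
        FROM item_master m
        WHERE m.product_pkg_code = t.product_pkg_code
          AND m.product_pkg_item = t.product_pkg_item
          AND m.effectivity_date <= t.invoice_date)
ORDER BY t.invoice_no, t.product_pkg_item
""").fetchall()

for row in priced:
    print(row)
```

Each invoice line comes back exactly once: the June prices for invoice 1234 and the May price for the hypothetical invoice 1235.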
Analytics is your friend. You can use the FIRST_VALUE() function, for example, to get all the product_item_prices for the given product, sort by effectivity_date (descending), and just pick the first one. You'll need a DISTINCT as well so that only one row is returned for each transaction.
SELECT DISTINCT
T.INVOICE_NO,
T.INVOICE_DATE,
T.PRODUCT_PKG_CODE,
T.PRODUCT_PKG_ITEM,
FIRST_VALUE(P.PRODUCT_ITEM_PRICE)
OVER (PARTITION BY T.INVOICE_NO, T.INVOICE_DATE,
T.PRODUCT_PKG_CODE, T.PRODUCT_PKG_ITEM
ORDER BY P.EFFECTIVITY_DATE DESC)
as PRODUCT_ITEM_PRICE
FROM TRANSACTION T,
ITEM_MASTER P
WHERE T.PRODUCT_PKG_CODE = P.PRODUCT_PKG_CODE
AND T.PRODUCT_PKG_ITEM = P.PRODUCT_PKG_ITEM
AND P.EFFECTIVITY_DATE <= T.INVOICE_DATE
AND T.INVOICE_NO = '1234';
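The FIRST_VALUE() query can be tried without an Oracle instance. The sketch below runs it on SQLite via Python's sqlite3 module, which assumes an SQLite build with window function support (3.25 or newer). As further assumptions, the table is called transactions because TRANSACTION is a keyword in SQLite, and dates are stored as ISO yyyy-mm-dd text so that string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (invoice_no TEXT, invoice_date TEXT,
                           product_pkg_code TEXT, product_pkg_item TEXT);
CREATE TABLE item_master (product_pkg_code TEXT, product_pkg_item TEXT,
                          product_item_price INTEGER, effectivity_date TEXT);
""")
items = ["ProductA-01", "ProductA-02", "ProductA-03"]
conn.executemany("INSERT INTO transactions VALUES ('1234', '2009-06-29', 'ProductA', ?)",
                 [(item,) for item in items])
price_history = {"2009-06-01": (25, 22, 20),
                 "2009-05-01": (15, 12, 10),
                 "2009-04-01": (19, 17, 15)}
for date, prices in price_history.items():
    conn.executemany("INSERT INTO item_master VALUES ('ProductA', ?, ?, ?)",
                     [(item, price, date) for item, price in zip(items, prices)])

# Every eligible price row gets the newest price of its partition;
# DISTINCT then collapses the identical rows into one per invoice line.
first_value_rows = conn.execute("""
SELECT DISTINCT
       t.invoice_no,
       t.product_pkg_item,
       FIRST_VALUE(p.product_item_price)
           OVER (PARTITION BY t.invoice_no, t.invoice_date,
                              t.product_pkg_code, t.product_pkg_item
                 ORDER BY p.effectivity_date DESC) AS product_item_price
FROM transactions t
JOIN item_master p
  ON p.product_pkg_code = t.product_pkg_code
 AND p.product_pkg_item = t.product_pkg_item
WHERE p.effectivity_date <= t.invoice_date
  AND t.invoice_no = '1234'
ORDER BY t.product_pkg_item
""").fetchall()

for row in first_value_rows:
    print(row)
```

Without the DISTINCT, the same result would appear once per eligible price row, which is three times per item with this data.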
While your question's formatting is a bit too messy for me to get all the details, it sure does look like you're looking for the standard SQL construct ROW_NUMBER() OVER with both PARTITION BY and ORDER BY. It's in PostgreSQL 8.4 and has been in Oracle (and MS SQL Server too, and DB2...) for quite a while, and it's the handiest way to select the "top" (or "top N") "by group" and with a certain order of anything in a SQL query. Look it up, see here for the PostgreSQL-specific docs.
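For completeness, a sketch of the ROW_NUMBER() variant applied to the pricing question: number the eligible price rows per invoice line from newest to oldest and keep row 1. It runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or newer). The table name transactions and the ISO-formatted date strings are assumptions made so the example runs on SQLite; the data is from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (invoice_no TEXT, invoice_date TEXT,
                           product_pkg_code TEXT, product_pkg_item TEXT);
CREATE TABLE item_master (product_pkg_code TEXT, product_pkg_item TEXT,
                          product_item_price INTEGER, effectivity_date TEXT);
""")
items = ["ProductA-01", "ProductA-02", "ProductA-03"]
conn.executemany("INSERT INTO transactions VALUES ('1234', '2009-06-29', 'ProductA', ?)",
                 [(item,) for item in items])
price_history = {"2009-06-01": (25, 22, 20),
                 "2009-05-01": (15, 12, 10),
                 "2009-04-01": (19, 17, 15)}
for date, prices in price_history.items():
    conn.executemany("INSERT INTO item_master VALUES ('ProductA', ?, ?, ?)",
                     [(item, price, date) for item, price in zip(items, prices)])

# rn = 1 marks the most recent price on or before the invoice date
# within each invoice line.
row_number_rows = conn.execute("""
SELECT invoice_no, product_pkg_item, product_item_price
FROM (
    SELECT t.invoice_no, t.product_pkg_item, p.product_item_price,
           ROW_NUMBER() OVER (PARTITION BY t.invoice_no, t.product_pkg_code,
                                           t.product_pkg_item
                              ORDER BY p.effectivity_date DESC) AS rn
    FROM transactions t
    JOIN item_master p
      ON p.product_pkg_code = t.product_pkg_code
     AND p.product_pkg_item = t.product_pkg_item
    WHERE p.effectivity_date <= t.invoice_date
) ranked
WHERE rn = 1
ORDER BY invoice_no, product_pkg_item
""").fetchall()

for row in row_number_rows:
    print(row)
```

Changing the outer filter to rn <= N gives the "top N per group" form mentioned above.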
