Combining multiple tables with certain criteria for Teradata - teradata

I'm trying to combine 3 different tables into one with certain criteria.
Table 2 - Calculate the Total Weight of the shipments based on Shipper Number and Ship Date
Table 3 - Calculate the Total Revenue Amount based on Shipper Number and Ship Date
Result - Combine the tables into a single table.
Additionally, the SQL should also filter for "Customer Since" <= 720 and "Ship Date" <= 360
(Note: Customer Number = Shipper Number)

It's difficult to tell what you're looking for, but this could get you started:
SELECT t2.Shipper_Number, t2.Ship_Date, t2.Total_Weight, t3.Total_Revenue
FROM (
SELECT Shipper_Number, Ship_Date, SUM(Weight) AS Total_Weight
FROM Table2
GROUP BY Shipper_Number, Ship_Date
) t2
INNER JOIN (
SELECT Shipper_Number, Ship_Date, SUM(Revenue) AS Total_Revenue
FROM Table3
GROUP BY Shipper_Number, Ship_Date
) t3 ON t2.<PK_column> = t3.<PK_column>
A few notes:
INNER JOIN will only return rows that match in both tables
You can add your "Customer Since" and "Ship Date" restrictions as WHERE clauses in the sub-queries
Update
If you want to get one row per (Shipper_Number, Ship_Date) group, then you need to make your JOIN condition for those two derived tables using that column combination (Shipper_Number, Ship_Date).
And if you want to further filter your rows on "Customer Since", then just JOIN to the source table and add the filtering condition. Something like this:
SELECT t2.Shipper_Number, t2.Ship_Date, t2.Total_Weight, t3.Total_Revenue
FROM (
SELECT Shipper_Number, Ship_Date, SUM(Weight) AS Total_Weight
FROM Table2
WHERE Ship_Date >= CURRENT_DATE - INTERVAL '360' DAY -- Assumes you want only orders that shipped within the last 360 days
GROUP BY Shipper_Number, Ship_Date
) t2
INNER JOIN (
SELECT Shipper_Number, Ship_Date, SUM(Revenue) AS Total_Revenue
FROM Table3
WHERE Ship_Date >= CURRENT_DATE - INTERVAL '360' DAY -- Assumes you want only orders that shipped within the last 360 days
GROUP BY Shipper_Number, Ship_Date
) t3 ON t2.Shipper_Number = t3.Shipper_Number AND t2.Ship_Date = t3.Ship_Date
INNER JOIN Table1 t1 ON t2.Shipper_Number = t1.Customer_Number -- Get shipper/customer info
WHERE t1."Customer Since" >= CURRENT_DATE - INTERVAL '720' DAY -- Assuming you want to get customers that are fewer than 720 days old
Again, using an INNER JOIN, this assumes that matching rows exist in both tables for (Shipper_Number and Ship_Date), otherwise it won't return that row.
Also, if feasible, you may want to consider combining your t2 and t3 tables into a single table based on a common key (shipper_number, shipping_number) or just shipping_number if it's a unique value.
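To make the effect of the INNER JOIN on (Shipper_Number, Ship_Date) concrete, here is a minimal sketch that runs the same aggregate-then-join pattern on an in-memory SQLite database from Python. SQLite stands in for Teradata only so the example is runnable; the sample rows and the REAL/TEXT column types are assumptions, and the date filters are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table2 (Shipper_Number INTEGER, Ship_Date TEXT, Weight REAL);
CREATE TABLE Table3 (Shipper_Number INTEGER, Ship_Date TEXT, Revenue REAL);
-- Hypothetical sample data: shipper 2 has no revenue rows, shipper 3 has no weight rows
INSERT INTO Table2 VALUES (1, '2024-01-10', 5.0), (1, '2024-01-10', 7.0), (2, '2024-01-11', 3.0);
INSERT INTO Table3 VALUES (1, '2024-01-10', 100.0), (1, '2024-01-10', 50.0), (3, '2024-01-12', 20.0);
""")

QUERY = """
SELECT t2.Shipper_Number, t2.Ship_Date, t2.Total_Weight, t3.Total_Revenue
FROM (SELECT Shipper_Number, Ship_Date, SUM(Weight) AS Total_Weight
      FROM Table2
      GROUP BY Shipper_Number, Ship_Date) t2
INNER JOIN (SELECT Shipper_Number, Ship_Date, SUM(Revenue) AS Total_Revenue
      FROM Table3
      GROUP BY Shipper_Number, Ship_Date) t3
  ON t2.Shipper_Number = t3.Shipper_Number AND t2.Ship_Date = t3.Ship_Date
ORDER BY t2.Shipper_Number, t2.Ship_Date
"""

# Only the (shipper, date) pair present in both aggregates survives the join
rows = conn.execute(QUERY).fetchall()
print(rows)
```

Shippers 2 and 3 drop out because each appears in only one of the two derived tables. If those rows should be kept, a LEFT or FULL OUTER JOIN would be the usual alternative.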

Related

Unique on one column and return all data with MariaDB [duplicate]

My table contains the columns: id, name, value, dealer. I want to retrieve the row with the lowest value for each dealer. I've been trying to make this work with MIN() and GROUP BY, but still have no solution.
Solution1:
SELECT t1.* FROM your_table t1
JOIN (
SELECT MIN(value) AS min_value, dealer
FROM your_table
GROUP BY dealer
) AS t2 ON t1.dealer = t2.dealer AND t1.value = t2.min_value
Solution2 (recommended, much faster than solution1):
SELECT t1.* FROM your_table t1
LEFT JOIN your_table t2
ON t1.dealer = t2.dealer AND t1.value > t2.value
WHERE t2.value IS NULL
This problem is very well known, so there is a special page for it in MySQL's manual.
Check this: Rows Holding the Group-wise Maximum/Minimum of a Certain Column
select id,name,MIN(value) as pkvalue,dealer from TABLENAME
group by id,name,dealer;
Here you group all rows by id, name, dealer and then you will get the min value as pkvalue.
SELECT MIN(value),dealer FROM table_name GROUP BY dealer;
First you need to resolve the lowest value for each dealer, and then retrieve rows having that value for a particular dealer. I would do this that way:
SELECT a.*
FROM your_table AS a
JOIN (SELECT dealer,
Min(value) AS m
FROM your_table
GROUP BY dealer) AS b
ON ( a.dealer= b.dealer
AND a.value = b.m )
Try following:
SELECT dealer, MIN(value) as "Lowest value"
FROM value
GROUP BY dealer;
select id, name, value, dealer from yourtable where dealer
in(select min(dealer) from yourtable group by name, value)
These answers seem to miss the edge case of having multiple minimum values for a dealer and only wanting to return one row.
If you only want one row for each dealer, you can use ROW_NUMBER: partition (group) the table by dealer, then order the rows by value and id. This assumes that, among tied rows, you want the one with the smallest id.
SELECT ord_tbl.id,
ord_tbl.name,
ord_tbl.value,
ord_tbl.dealer
FROM (SELECT your_table.*,
ROW_NUMBER() OVER (PARTITION BY dealer ORDER BY value ASC, id ASC) AS rn
FROM your_table
) AS ord_tbl
WHERE ord_tbl.rn = 1;
Make sure that value, id and dealer are indexed, though. If not, this will do a full table scan and can get pretty slow...
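The tie edge case is easy to reproduce. The sketch below loads a few made-up rows into an in-memory SQLite database (used here instead of MariaDB only so it runs anywhere; window functions need SQLite 3.25 or later) and compares the join-on-MIN approach with the ROW_NUMBER approach. The sample data is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE your_table (id INTEGER PRIMARY KEY, name TEXT, value INTEGER, dealer TEXT);
INSERT INTO your_table VALUES
  (1, 'a', 10, 'north'),
  (2, 'b', 10, 'north'),   -- ties with id 1 for the minimum
  (3, 'c', 25, 'north'),
  (4, 'd', 7,  'south');
""")

# Join against the per-dealer minimum: returns every row that ties for the minimum
join_rows = conn.execute("""
SELECT t1.id, t1.dealer, t1.value
FROM your_table t1
JOIN (SELECT dealer, MIN(value) AS min_value
      FROM your_table
      GROUP BY dealer) t2
  ON t1.dealer = t2.dealer AND t1.value = t2.min_value
ORDER BY t1.id
""").fetchall()

# ROW_NUMBER per dealer: exactly one row per dealer, ties broken by smallest id
ranked_rows = conn.execute("""
SELECT id, dealer, value
FROM (SELECT your_table.*,
             ROW_NUMBER() OVER (PARTITION BY dealer ORDER BY value ASC, id ASC) AS rn
      FROM your_table) AS ord_tbl
WHERE rn = 1
ORDER BY id
""").fetchall()

print(join_rows)
print(ranked_rows)
```

The join version returns both tied rows for dealer 'north', while the ROW_NUMBER version keeps only id 1.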

No more spool space in Teradata while trying Update

I'm trying to update a table with too many rows (388,000).
This is the query:
update DL_RG_ANALYTICS.SH_historico
from
(
SELECT
CAST((MAX_DIA - DIA_PAGO) AS INTEGER) AS DIAS_AL_CIERRE_1
FROM
(SELECT * FROM DL_RG_ANALYTICS.SH_historico A
LEFT JOIN
(SELECT ANO||MES AS ANO_MES, MAX(DIA) AS MAX_DIA FROM DL_RG_ANALYTICS.SH_CALENDARIO
GROUP BY 1) B
ON A.ANOMES = B.ANO_MES
) M) N
SET DIAS_AL_CIERRE = DIAS_AL_CIERRE_1;
Any help is appreciated.
The first thing I'd do is replace the SELECT * with only the columns you need. You can also remove the M derived table to make it easier to read:
UPDATE DL_RG_ANALYTICS.SH_historico
FROM (
SELECT CAST((MAX_DIA - DIA_PAGO) AS INTEGER) AS DIAS_AL_CIERRE_1
FROM DL_RG_ANALYTICS.SH_historico A
LEFT JOIN (
SELECT ANO || MES AS ANO_MES, MAX(DIA) AS MAX_DIA
FROM DL_RG_ANALYTICS.SH_CALENDARIO
GROUP BY 1
) B ON A.ANOMES = B.ANO_MES
) N
SET DIAS_AL_CIERRE = DIAS_AL_CIERRE_1;
What indexes are defined on the SH_CALENDARIO table? If there is a composite index of (ANO, MES) then you should re-write your LEFT JOIN sub-query to GROUP BY these two columns since you concatenate them together anyways. In general, you want to perform joins, GROUP BY and OLAP functions on indexes, so there will be less row re-distribution and they will run more efficiently.
Also, this query is updating all rows in the table with the same value. Is this intended, or do you want to include extra columns in your WHERE clause?
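If the intent is for each row to receive its own value, the update has to be tied back to the row being updated. The following is only a sketch of that idea, written as a correlated subquery in SQLite (run from Python) rather than Teradata's UPDATE ... FROM syntax; the column types and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SH_CALENDARIO (ANO TEXT, MES TEXT, DIA INTEGER);
CREATE TABLE SH_historico (ANOMES TEXT, DIA_PAGO INTEGER, DIAS_AL_CIERRE INTEGER);
-- Hypothetical sample data; '202303' has no calendar rows on purpose
INSERT INTO SH_CALENDARIO VALUES ('2023', '01', 30), ('2023', '01', 31), ('2023', '02', 28);
INSERT INTO SH_historico (ANOMES, DIA_PAGO) VALUES ('202301', 10), ('202302', 20), ('202303', 5);
""")

# The subquery is correlated on ANOMES, so every row gets the last day
# of its own month minus its own DIA_PAGO
conn.execute("""
UPDATE SH_historico
SET DIAS_AL_CIERRE = (
    SELECT MAX(c.DIA) - SH_historico.DIA_PAGO
    FROM SH_CALENDARIO c
    WHERE c.ANO || c.MES = SH_historico.ANOMES
)
""")

rows = conn.execute(
    "SELECT ANOMES, DIA_PAGO, DIAS_AL_CIERRE FROM SH_historico ORDER BY ANOMES"
).fetchall()
print(rows)
```

Rows whose month has no calendar entry end up with NULL, which mirrors what the LEFT JOIN in the original query would produce.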

Retrieve a table of tallied numbers, best way

I have a query that runs as part of a function which produces a one row table full of counts, and averages, and comma separated lists like this:
select
(select
count(*)
from vw_disp_details
where round = 2013
and rating = 1) applicants,
(select
count(*)
from vw_disp_details
where round = 2013
and rating = 1
and applied != 'yes') s_applicants,
(select
LISTAGG(discipline, ',')
WITHIN GROUP (ORDER BY discipline)
from (select discipline,
count(*) discipline_number
from vw_disp_details
where round = 2013
and rating = 1
group by discipline)) disciplines,
(select
LISTAGG(discipline_count, ',')
WITHIN GROUP (ORDER BY discipline)
from (select discipline,
count(*) discipline_count
from vw_disp_details
where round = 2013
and rating = 1
group by discipline)) disciplines_count,
(select
round(avg(util.getawardstocols(application_id,'1','AWARD_NAME')), 2)
from vw_disp_details
where round = 2013
and rating = 1) average_award_score,
(select
round(avg(age))
from vw_disp_details
where round = 2013
and rating = 1) average_age
from dual;
Except that instead of 6 main sub-queries there are 23.
This returns something like this (if it were a CSV):
applicants | s_applicants | disciplines | disciplines_count | average_award_score | average_age
107 | 67 | "speed,accuracy,strength" | 3 | 97 | 23
Now I am programmatically swapping out the "rating = 1" part of the where clauses for other expressions. They all work rather quickly except for the "rating = 1" one which takes about 90 seconds to run and that is because the rating column in the vw_disp_details view is itself compiled by a sub-query:
(SELECT score
FROM read r,
eval_criteria_lookup ecl
WHERE r.criteria_id = ecl.criteria_id
AND r.application_id = a.lgo_application_id
AND criteria_description = 'Overall Score'
AND type = 'ABC'
) reader_rank
So when the function runs this extra query seems to slow everything down dramatically.
My question is, is there a better (more efficient) way to run a query like this that is basically just a series of counts and averages, and how can I refactor to optimize the speed so that the rating = 1 query doesn't take 90 seconds to run.
You could choose to MATERIALIZE the vw_disp_details VIEW. That would pre-calculate the value of the rating column. There are various options for how up-to-date a materialized view is kept; you would probably want to use the ON COMMIT clause so that vw_disp_details is always correct.
Have a look at the official documentation and see if that would work for you.
http://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_6002.htm
Do most of your queries in a single one. Instead of doing:
select
(select count(*) from my_tab) as count_all,
(select avg(age) from my_tab) as avg_age,
(select avg(mypkg.get_award(application_id)) from my_tab) as avg_app_id
from dual;
Just do:
select count(*), avg(age),avg(mypkg.get_award(application_id)) from my_tab;
And then, maybe you can do some union all for the other results. But this step all by itself should help.
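As a rough illustration of collapsing several scalar sub-queries into one pass, the sketch below compares the two shapes on an in-memory SQLite table from Python. The table contents are made up, and the SUM(CASE ...) conditional aggregate is an addition not mentioned in the answer above; it is one common way to fold a count with an extra condition into the same scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_tab (application_id INTEGER, age INTEGER, rating INTEGER, applied TEXT);
-- Hypothetical sample data
INSERT INTO my_tab VALUES
  (1, 20, 1, 'yes'),
  (2, 24, 1, 'no'),
  (3, 28, 1, 'no'),
  (4, 40, 2, 'no');
""")

# One scalar sub-query per output column: the table is read three times
separate = conn.execute("""
SELECT
  (SELECT COUNT(*) FROM my_tab WHERE rating = 1) AS applicants,
  (SELECT COUNT(*) FROM my_tab WHERE rating = 1 AND applied != 'yes') AS s_applicants,
  (SELECT AVG(age) FROM my_tab WHERE rating = 1) AS average_age
""").fetchone()

# Single pass: shared filter in WHERE, extra condition folded into a CASE
single = conn.execute("""
SELECT COUNT(*) AS applicants,
       SUM(CASE WHEN applied != 'yes' THEN 1 ELSE 0 END) AS s_applicants,
       AVG(age) AS average_age
FROM my_tab
WHERE rating = 1
""").fetchone()

print(separate)
print(single)
```

Both queries return the same row; the second one reads the table once, which matters most when the filtered column is expensive to compute.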
I was able to solve this issue by doing two things: creating a new view that displayed only the results I needed, which gave me marginal gains in speed, and in that view moving the where clause of the sub-query that caused the lag into the where clause of the view and tacking on the result of the sub-query as column in the view. This still returns the same results thanks to the fact that there are always going to be records in the table the sub-query accessed for each row of the view query.
SELECT
a.application_id,
util.getstatus (a.application_id) status,
(SELECT score
FROM applicant_read ar,
eval_criteria_lookup ecl
WHERE ar.criteria_id = ecl.criteria_id
AND ar.application_id = a.application_id
AND criteria_description = 'Overall Score' -- THESE TWO FIELDS
AND type = 'ABC' -- ARE CRITERIA_ID = 15
) score,
as.test_total test_total
FROM application a,
applicant_scores as
WHERE a.application_id = as.application_id(+);
Became
SELECT
a.application_id,
util.getstatus (a.application_id) status,
ar.score,
as.test_total test_total
FROM application a,
applicant_scores as,
applicant_read ar
WHERE a.application_id = as.application_id(+)
AND ar.application_id = a.application_id(+)
AND ar.criteria_id = 15;

Getting All the record of particular month - Building SQL Query

I need some help to build SQL Query. I have table having data like:
ID Date Name
1 1/1/2009 a
2 1/2/2009 b
3 1/3/2009 c
I need to get result something like...
1 1/1/2009 a
2 1/2/2009 b
3 1/3/2009 c
4 1/4/2009 Null
5 1/5/2009 Null
6 1/6/2009 Null
7 1/7/2009 Null
8 1/8/2009 Null
............................
............................
............................
30 1/30/2009 Null
31 1/31/2009 Null
I want query something like..
Select * from tbl where month(Date)=1 AND year(Date)=2010
Above is not completed query.
I need to get all the record of particular month, even if some date missing..
I guess there must be equi Join in the query, I am trying to build this query using Equi join
Thanks
BIG EDIT
Now I understand the OP's question.
Use a common table expression and a left join to get this effect.
DECLARE @FirstDay DATETIME;
-- Set start time
SELECT @FirstDay = '2009-01-01';
WITH Days AS
(
SELECT @FirstDay as CalendarDay
UNION ALL
SELECT DATEADD(d, 1, CalendarDay) as CalendarDay
FROM Days
WHERE DATEADD(d, 1, CalendarDay) < DATEADD(m, 1, @FirstDay)
)
SELECT DATEPART(d,d.CalendarDay), d.CalendarDay, t.Name FROM Days d
LEFT JOIN tbl t
ON
d.CalendarDay = t.Date
ORDER BY
d.CalendarDay;
Left this original answer at bottom
You need DATEPART, sir.
SELECT * FROM tbl WHERE DATEPART(m,Date) = 1
If you want to choose month and year, then you can use DATEPART twice or go for a range.
SELECT * FROM tbl WHERE DATEPART(m,Date) = 1 AND DATEPART(yyyy,Date) = 2009
Range :-
SELECT * FROM tbl WHERE Date >= '2009-01-01' AND Date < '2009-02-01'
See this link for more info on DATEPART.
http://msdn.microsoft.com/en-us/library/ms174420.aspx
You can use a range comparison.
Like so:
select * from tbl where date >= '2009-01-01' and date < '2009-02-01'
However, it is unclear if you want month 1 from all years?
You can check more examples and functions on "Date and Time Functions" from MSDN
Create a temporary table containing all days of that certain month.
Do a left outer join between that table and your data table on tempTable.month = @month.
Now you have a big table with all days of the desired month and all the records matching the proper dates, plus empty records for those dates that have no data.
I hope that's what you want.
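The calendar-plus-left-join idea described in the answers above can be tried out without SQL Server. This sketch uses a recursive CTE in an in-memory SQLite database from Python; the date() and strftime() calls are SQLite's counterparts to DATEADD and DATEPART, and storing dates as ISO text is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (ID INTEGER, Date TEXT, Name TEXT);
INSERT INTO tbl VALUES (1, '2009-01-01', 'a'), (2, '2009-01-02', 'b'), (3, '2009-01-03', 'c');
""")

first_day = "2009-01-01"

# Generate one row per day of the month, then left join the data onto it
rows = conn.execute("""
WITH RECURSIVE Days(CalendarDay) AS (
    SELECT :first_day
    UNION ALL
    SELECT date(CalendarDay, '+1 day')
    FROM Days
    WHERE date(CalendarDay, '+1 day') < date(:first_day, '+1 month')
)
SELECT CAST(strftime('%d', d.CalendarDay) AS INTEGER) AS day_no,
       d.CalendarDay,
       t.Name
FROM Days d
LEFT JOIN tbl t ON d.CalendarDay = t.Date
ORDER BY d.CalendarDay
""", {"first_day": first_day}).fetchall()

for row in rows:
    print(row)
```

Days without a matching record come back with a NULL (None in Python) Name, which is the shape the question asks for.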

Select rows from a table with date in the range between 90 days ago and now?

SELECT gameratingstblx245v.gameid,avg( gameratingstblx245v.rating ) as avgrating, count(gameratingstblx245v.rating) as count,gamedata.name ,gamedata.gameinfo
FROM gameratingstblx245v
LEFT JOIN gamedata ON gamedata.id = gameratingstblx245v.game_id
WHERE gameratingstblx245v.game_id=gameratingstblx245v.game_id
GROUP BY gameid
ORDER BY avg( gameratingstblx245v.rating ) DESC LIMIT 0,8
Table gameratingstblx245v - gameid, rating
Table gamedata - id, gameinfo, name, releasedate
This is the query I am currently using to extract data from two tables gamedata and gameratingstblx245v.... What I am doing here is taking the avg. of all the ratings from table gameratingstblx245v in descending order of their avg. rating and I am also extracting the related info corresponding to the selected gameid's from table gamedata...
Now what I want to extract is the top avg. ratings from gameratingstblx245v, but only for the games whose releasedate field in table gamedata is within the last 90 days...
Help would be appreciated..Thanks
Here's how I'd design that query:
SELECT d.id, d.name, d.gameinfo,
AVG(r.rating) AS avgrating, COUNT(r.rating) AS count
FROM gamedata d
LEFT JOIN gameratingstblx245v r ON (d.id = r.game_id)
WHERE d.releasedate BETWEEN NOW() - INTERVAL 90 DAY AND NOW()
GROUP BY d.id
ORDER BY avgrating DESC LIMIT 0,8;
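For a quick check of the 90-day filter, here is a sketch of the same query shape against an in-memory SQLite database from Python. SQLite's date(..., '-90 days') replaces MySQL's NOW() - INTERVAL 90 DAY, the reference date is pinned to a fixed value so the result is repeatable, and the sample games and the rating_count alias are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gamedata (id INTEGER PRIMARY KEY, name TEXT, gameinfo TEXT, releasedate TEXT);
CREATE TABLE gameratingstblx245v (game_id INTEGER, rating INTEGER);
-- Hypothetical sample data: game 2 was released well outside the window
INSERT INTO gamedata VALUES
  (1, 'Recent A', 'info', '2024-05-01'),
  (2, 'Old B',    'info', '2023-01-01'),
  (3, 'Recent C', 'info', '2024-06-01');
INSERT INTO gameratingstblx245v VALUES (1, 4), (1, 5), (2, 5), (3, 3);
""")

now = "2024-06-15"  # fixed stand-in for NOW() so the output is deterministic

rows = conn.execute("""
SELECT d.id, d.name,
       AVG(r.rating) AS avgrating,
       COUNT(r.rating) AS rating_count
FROM gamedata d
LEFT JOIN gameratingstblx245v r ON d.id = r.game_id
WHERE d.releasedate BETWEEN date(:now, '-90 days') AND :now
GROUP BY d.id
ORDER BY avgrating DESC
LIMIT 8
""", {"now": now}).fetchall()

print(rows)
```

The old game is excluded even though it has the highest single rating, because the release date filter is applied before the averages are ranked.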

Resources