I have the following query in SQL Server 2005, run against a single table:
select t1.id, CONVERT(VARCHAR,t1.dt,103) date_1, CONVERT(VARCHAR,t2.dt,103) date_2, t1.hotel,
t1.price price_1, t2.price price_2, t2.price - t1.price difference, ((t2.price - t1.price)/t1.price)*100 as Diff_percentage
from test t1
inner join
(
select *
from test
) t2
on t1.hotel = t2.hotel
and t1.dt < t2.dt and t2.dt=(SELECT MAX(dt) from TEST) and t1.dt=(SELECT MAX(dt-1) from TEST)
I want to add a conditional count (like COUNTIF) to this query, based on the difference column, so that I can count how many prices increased, how many decreased, how many stayed the same, and how many are unavailable:
COUNT IF Difference>0 //How many increased
COUNT IF Difference<0 //How many decreased
COUNT IF Difference=0 //How many same
COUNT IF Difference="" //How many unavailable --Difference is blank.
DEMO : http://sqlfiddle.com/#!3/b6f37/29
If I wanted your existing query result I would rewrite the query as:
select t1.id,
CONVERT(VARCHAR,t1.dt,103) date_1,
CONVERT(VARCHAR,t2.dt,103) date_2,
t1.hotel,
t1.price price_1,
t2.price price_2,
t2.price - t1.price difference,
((t2.price - t1.price)/t1.price)*100 as Diff_percentage
from test t1
join (select max(dt) maxDt from test) d
on t1.dt = d.maxDt-1
join test t2
on t2.hotel = t1.hotel
and t2.dt = d.maxDt
To extend your original query to include hotels with missing rows:
select t1.id,
CONVERT(VARCHAR,t1.dt,103) date_1,
CONVERT(VARCHAR,t2.dt,103) date_2,
h.hotel,
t1.price price_1,
t2.price price_2,
t2.price - t1.price difference,
((t2.price - t1.price)/t1.price)*100 as Diff_percentage
from (select distinct hotel from test) h
cross join (select max(dt) maxDt from test) d
left join test t1
on t1.hotel = h.hotel
and t1.dt = d.maxDt-1
left join test t2
on t2.hotel = h.hotel
and t2.dt = d.maxDt
The above query would be much more efficient with a normalized HOTEL table (1 row per hotel) to replace the SELECT DISTINCT subquery.
To get your requested result, I would use:
select count(case when (t2.price-t1.price) < 0 then 1 end) decrease_count,
count(case when (t2.price-t1.price) > 0 then 1 end) increase_count,
count(case when (t2.price-t1.price) = 0 then 1 end) same_count,
count(distinct t1.hotel) - count(case when (t2.price-t1.price) is not null then 1 end) unavailable_count
from test t1
left join (select max(dt) maxDt from test) d
on t1.dt = d.maxDt-1
left join test t2
on t2.hotel = t1.hotel
and t2.dt = d.maxDt
Again the above would be more efficient with a normalized HOTEL table. I would restructure the query more like the previous one: select from HOTEL cross joined to the MAX date query, then outer join to the TEST table twice for the data for the 2 dates. The unavailable count could then be measured more directly, counting the number of rows where the difference computation is NULL.
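The restructured query described above might look roughly like the following sketch. It runs against SQLite from Python rather than SQL Server, so date(d.maxDt, '-1 day') stands in for maxDt-1, and the hotel table and all sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hotel (hotel TEXT PRIMARY KEY);   -- hypothetical normalized table
    CREATE TABLE test  (hotel TEXT, dt TEXT, price REAL);
    INSERT INTO hotel VALUES ('A'), ('B'), ('C'), ('D');
    INSERT INTO test VALUES
        ('A', '2013-01-01', 100), ('A', '2013-01-02', 120),  -- increased
        ('B', '2013-01-01', 100), ('B', '2013-01-02',  90),  -- decreased
        ('C', '2013-01-01', 100), ('C', '2013-01-02', 100),  -- same
        ('D', '2013-01-01', 100);                            -- no row on the latest date
""")

# One row per hotel, outer joined to the two dates; a NULL difference
# means at least one of the two prices is missing.
row = conn.execute("""
    SELECT COUNT(CASE WHEN t2.price - t1.price > 0 THEN 1 END) AS increase_count,
           COUNT(CASE WHEN t2.price - t1.price < 0 THEN 1 END) AS decrease_count,
           COUNT(CASE WHEN t2.price - t1.price = 0 THEN 1 END) AS same_count,
           COUNT(CASE WHEN t2.price - t1.price IS NULL THEN 1 END) AS unavailable_count
    FROM hotel h
    CROSS JOIN (SELECT MAX(dt) AS maxDt FROM test) d
    LEFT JOIN test t1 ON t1.hotel = h.hotel AND t1.dt = date(d.maxDt, '-1 day')
    LEFT JOIN test t2 ON t2.hotel = h.hotel AND t2.dt = d.maxDt
""").fetchone()
counts = tuple(row)
print(counts)
```

Because every hotel produces exactly one row, the unavailable count no longer needs the count(distinct ...) subtraction.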
Here is the SQL Fiddle for all the queries, along with some extended test data.
This approach uses your query and then just summarizes the results:
with t as (
select t1.id, CONVERT(VARCHAR,t1.dt,103) as date_1,
CONVERT(VARCHAR,t2.dt,103) as date_2,
t1.hotel,
t1.price as price_1, t2.price as price_2,
t2.price - t1.price as difference,
((t2.price - t1.price)/t1.price)*100 as Diff_percentage
from test t1 join
test t2
on t1.hotel = t2.hotel and
t1.dt < t2.dt and
t2.dt=(SELECT MAX(dt) from TEST) and
t1.dt=(SELECT MAX(dt-1) from TEST)
)
select sum(case when diff_percentage > 0.0 then 1 else 0 end) as numIncrease,
sum(case when diff_percentage < 0.0 then 1 else 0 end) as numDecrease,
sum(case when diff_percentage = 0.0 then 1 else 0 end) as numSame,
sum(case when diff_percentage is NULL then 1 else 0 end) as numBlank
from t
I'm not sure what "dt - 1" means. With date/datetime values in SQL Server, one usually uses "dateadd(day, -1, )" to subtract a day from a date. In any case, there may be other ways to calculate what you want, but this answers your specific question.
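As a small illustration of explicit date arithmetic, here is a sketch in Python with SQLite. Its date(x, '-1 day') modifier plays the role that dateadd plays in SQL Server. The date value is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
latest = "2013-03-01"  # hypothetical latest date in the table

# Subtract one calendar day explicitly instead of relying on "dt - 1".
previous = conn.execute("SELECT date(?, '-1 day')", (latest,)).fetchone()[0]
print(previous)
```

Spelling out the unit makes it clear that the previous calendar day is meant, including across month boundaries.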
Good afternoon, can you help me?
I have a table of "routes"
I need to count all routes, and also count the routes that match a "where" condition, then express the second count as a percentage of the first.
For example:
select count(routes) from table1 "~ 150000 items"
select count (routes) from table1 where distance < 100 "~ 15000 items"
How do I get the figure of 10%?
You could conditionally aggregate values:
select sum(case when distance < 100 then 1 else 0 end) /
count(*) * 100 as result
from table1;
Or (worse), use current queries as subqueries:
select
(select count(routes) from table1 where distance < 100) /
(select count(routes) from table1) * 100 as result
from dual;
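One thing to watch when porting either form: some engines perform integer division when both operands are integers, so count / count * 100 can collapse to 0. The sketch below shows this with SQLite from Python (table contents are made up); multiplying by 100.0 first avoids the truncation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (routes INTEGER, distance INTEGER)")
# 10 hypothetical routes, exactly one of them shorter than 100
conn.executemany("INSERT INTO table1 VALUES (?, ?)",
                 [(i, 50 if i == 0 else 500) for i in range(10)])

int_pct, real_pct = conn.execute("""
    SELECT SUM(CASE WHEN distance < 100 THEN 1 ELSE 0 END) / COUNT(*) * 100,
           100.0 * SUM(CASE WHEN distance < 100 THEN 1 ELSE 0 END) / COUNT(*)
    FROM table1
""").fetchone()
print(int_pct, real_pct)
```

The first expression truncates 1 / 10 to 0 before multiplying; the second forces real arithmetic from the start.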
Try this:
select count(CASE WHEN distance < 100 THEN 1 END)/count(routes)*100 from table1
I am trying to calculate percentages by joining 3 tables data to get the percentages of positive_count, negative_count, neutral_count of each user's tweets. I have succeeded in getting positive, negative and neutral counts, but failing to get the same as percentages instead of counts. Here is the query to get counts:
SELECT
t1.u_id, count(*) as total_tweets_count,
(
SELECT count(*) from t1,t2,t3
WHERE
t1.u_id='18839785' AND
t1.u_id=t2.u_id AND
t2.ts_id=t3.ts_id AND
t3.sentiment='Positive'
) as pos_count ,
(
SELECT count(*) from t1,t2,t3
WHERE
t1.u_id='18839785' AND
t1.u_id=t2.u_id AND
t2.ts_id=t3.ts_id AND
t3.sentiment='Negative'
) as neg_count ,
(
SELECT count(*) from t1,t2,t3
WHERE
t1.u_id='18839785' AND
t1.u_id=t2.u_id AND
t2.ts_id=t3.ts_id AND
t3.sentiment='Neutral'
) as neu_count
FROM t1,t2,t3
WHERE
t1.u_id='18839785' AND
t1.u_id=t2.u_id AND
t2.ts_id=t3.ts_id
GROUP BY t1.u_id;
**OUTPUT:**
u_id | total_tweets_count | pos_count | neg_count | neu_count
-----------------+--------------------+-----------+-----------+-------
18839785| 88 | 38 | 25 | 25
(1 row)
Now I want the same in percentages instead of counts. I have written the query in the following way but failed.
SELECT
total_tweets_count,pos_count,
round((pos_count * 100.0) / total_tweets_count, 2) AS pos_per,neg_count,
round((neg_count * 100.0) / total_tweets_count, 2) AS neg_per,
neu_count, round((neu_count * 100.0) / total_tweets_count, 2) AS neu_per
FROM (
SELECT
count(*) as total_tweets_count,
count(
a.u_id='18839785' AND
a.u_id=b.u_id AND
b.ts_id=c.ts_id AND
c.sentiment='Positive'
) AS pos_count,
count(
a.u_id='18839785' AND
a.u_id=b.u_id AND
b.ts_id=c.ts_id AND
c.sentiment='Negative'
) AS neg_count,
count(
a.u_id='18839785' AND
a.u_id=b.u_id AND
b.ts_id=c.ts_id AND
c.sentiment='Neutral') AS neu_count
FROM t1,t2, t3
WHERE
a.u_id='18839785' AND
a.u_id=b.u_id AND
b.ts_id=c.ts_id
GROUP BY a.u_id
) sub;
Can anyone help me out in achieving as percentages for each user data as below?
u_id | total_tweets_count | pos_count | neg_count | neu_count
------------------+--------------------+-----------+-----------+-----
18839785| 88 | 43.18 | 28.4 | 28.4
(1 row)
I am not entirely sure what you are looking for.
For starters, you can simplify your query by using conditional aggregation instead of three scalar subqueries (which btw. do not need to repeat the where condition on a.u_id)
You state you want to "count for all users", so you need to remove the WHERE clause in the main query. The simplification also gets rid of the repeated WHERE condition.
select u_id,
total_tweets_count,
pos_count,
round((pos_count * 100.0) / total_tweets_count, 2) AS pos_per,
neg_count,
round((neg_count * 100.0) / total_tweets_count, 2) AS neg_per,
neu_count,
round((neu_count * 100.0) / total_tweets_count, 2) AS neu_per
from (
SELECT
t1.u_id,
count(*) as total_tweets_count,
count(case when t3.sentiment='Positive' then 1 end) as pos_count,
count(case when t3.sentiment='Negative' then 1 end) as neg_count,
count(case when t3.sentiment='Neutral' then 1 end) as neu_count
FROM t1
JOIN t2 ON t1.u_id=t2.u_id
JOIN t3 ON t2.ts_id=t3.ts_id
-- no WHERE condition on the u_id here
GROUP BY t1.u_id
) t
Note that I replaced the outdated, ancient and fragile implicit joins in the WHERE clause with "modern" explicit JOIN operators.
With a more up-to-date Postgres version (9.4 or later), the expression count(case when t3.sentiment='Positive' then 1 end) as pos_count can also be rewritten as:
count(*) filter (where t3.sentiment='Positive') as pos_count
which is a bit more readable (and understandable I think).
In your query you can avoid repeating the global WHERE condition on the u_id by using a correlated subquery, e.g.:
(
SELECT count(*)
FROM t1 inner_t1 --<< use different aliases than in the outer query
JOIN t2 inner_t2 ON inner_t2.u_id = inner_t1.u_id
JOIN t3 inner_t3 ON inner_t3.ts_id = inner_t2.ts_id
-- referencing the outer t1 removes the need to repeat the hardcoded ID
WHERE inner_t1.u_id = t1.u_id
) as pos_count
The repetition of the table t1 isn't necessary either, so the above could be re-written to:
(
SELECT count(*)
FROM t2 inner_t2
JOIN t3 inner_t3 ON inner_t3.ts_id = inner_t2.ts_id
WHERE inner_t2.u_id = t1.u_id --<< this references the outer t1 table
) as pos_count
But the version with conditional aggregation will still be a lot faster than using three scalar sub-queries (even if you remove the unnecessary repetition of the t1 table).
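To make the conditional-aggregation version concrete, here is a minimal runnable sketch using SQLite from Python. The three tables are cut down to the columns the query touches, and all rows are invented. It uses the CASE form rather than FILTER so that it does not depend on a recent engine version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical minimal versions of t1 (users), t2 (tweets), t3 (sentiments)
    CREATE TABLE t1 (u_id TEXT);
    CREATE TABLE t2 (u_id TEXT, ts_id INTEGER);
    CREATE TABLE t3 (ts_id INTEGER, sentiment TEXT);
    INSERT INTO t1 VALUES ('u1'), ('u2');
    INSERT INTO t2 VALUES ('u1', 1), ('u1', 2), ('u1', 3), ('u1', 4), ('u2', 5);
    INSERT INTO t3 VALUES (1, 'Positive'), (2, 'Positive'), (3, 'Negative'),
                          (4, 'Neutral'),  (5, 'Negative');
""")

# Inner query: one pass over the joined rows, one counter per sentiment.
# Outer query: turn the counters into percentages of the per-user total.
rows = conn.execute("""
    SELECT u_id, total,
           ROUND(pos * 100.0 / total, 2) AS pos_per,
           ROUND(neg * 100.0 / total, 2) AS neg_per,
           ROUND(neu * 100.0 / total, 2) AS neu_per
    FROM (
        SELECT t1.u_id AS u_id,
               COUNT(*) AS total,
               COUNT(CASE WHEN t3.sentiment = 'Positive' THEN 1 END) AS pos,
               COUNT(CASE WHEN t3.sentiment = 'Negative' THEN 1 END) AS neg,
               COUNT(CASE WHEN t3.sentiment = 'Neutral'  THEN 1 END) AS neu
        FROM t1
        JOIN t2 ON t1.u_id = t2.u_id
        JOIN t3 ON t2.ts_id = t3.ts_id
        GROUP BY t1.u_id
    ) AS t
    ORDER BY u_id
""").fetchall()
print(rows)
```

Every user gets a row in a single scan of the join, which is why this tends to beat three scalar subqueries per user.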
I have a requirement below to choose latest status.
Table 1:
Table2:
Results Expected:
Below is the logic we need.
SELECT
Table1.ID
,CASE WHEN (Table1.hub=Table2.hub) THEN Table2.Status ELSE NULL END AS Original_Status
,CASE WHEN (Table1.hub<>Table2.hub AND Table2.Status like 'Found%' ) THEN Table2.hub ELSE NULL END AS Derived_Hub
,CASE WHEN (Table1.hub<>Table2.hub AND Table2.Status like 'Found%' ) THEN Table2.Status ELSE NULL END AS Derived_Status
from
Table1
Join Table2
ON (Table1.ID=Table2.ID)
With this code, I am getting 3 rows. If I put max in the above case statements, I am getting Hub3 instead of Hub2.
Could you please let me know how I can merge everything into a single row? Please note that I want to show the latest found status from Table 2, even if there are 2 hubs with a found status.
Let me know if you need any further information.
I think your best bet is to join to Table 2 twice. The first time is much like what you are doing here, but using a LEFT OUTER JOIN on both id and hub. The second time is against a derived version of table2 where you select only the top 1 record when sorted by timestamp in descending order:
SELECT
t1.id,
t1.hub as "Original Hub",
t2.status as "Original Hub Status",
t3.hub as "Found Hub",
t3.status as "Found Hub Status"
FROM
Table1 t1
LEFT OUTER JOIN Table2 t2 ON
t1.id = t2.id AND
t1.hub = t2.hub
LEFT OUTER JOIN
(
--Select a hub with the same id, that doesn't share the same hub number
--Only choose the top record when sorted by timestamp in descending order
SELECT TOP 1 id, hub, status
FROM table2
WHERE t1.hub <> table2.hub
ORDER BY TimeStamp Desc
) t3 ON
t1.id = t3.id
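The subquery aliased t3 is correlated: it references the t1.hub value in its WHERE clause. Be aware that not every database accepts a correlated derived table inside a JOIN; SQL Server, for instance, requires OUTER APPLY for this pattern.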
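Where a correlated derived table is not available, the same "latest found row" idea can be expressed with correlated scalar subqueries. The sketch below does that with SQLite from Python. The table contents are invented because the original tables were not shown, and it also applies the 'Found%' filter from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical data: the original tables were not shown
    CREATE TABLE table1 (id INTEGER, hub TEXT);
    CREATE TABLE table2 (id INTEGER, hub TEXT, status TEXT, ts INTEGER);
    INSERT INTO table1 VALUES (1, 'Hub1');
    INSERT INTO table2 VALUES
        (1, 'Hub1', 'Missing',     1),
        (1, 'Hub3', 'Found early', 2),
        (1, 'Hub2', 'Found later', 3);
""")

# Each scalar subquery picks the most recent 'Found%' row from another hub.
rows = conn.execute("""
    SELECT t1.id,
           t2.status AS original_status,
           (SELECT f.hub FROM table2 f
             WHERE f.id = t1.id AND f.hub <> t1.hub AND f.status LIKE 'Found%'
             ORDER BY f.ts DESC LIMIT 1) AS derived_hub,
           (SELECT f.status FROM table2 f
             WHERE f.id = t1.id AND f.hub <> t1.hub AND f.status LIKE 'Found%'
             ORDER BY f.ts DESC LIMIT 1) AS derived_status
    FROM table1 t1
    LEFT JOIN table2 t2 ON t2.id = t1.id AND t2.hub = t1.hub
""").fetchall()
print(rows)
```

With this data the result is a single row per id, and the hub with the later timestamp wins even though its name sorts lower, which is the case where MAX on the hub name goes wrong.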
In TD14.10 you can utilize LAST_VALUE to access data from the "last" row. The CASEs are based on your query:
SELECT
t2.ID,
CASE WHEN t1.Hub = t2.Hub THEN t2.Hub END AS Original_Hub,
CASE WHEN t1.Hub = t2.Hub THEN t2.Status END AS Original_Status,
-- get the last Hub
LAST_VALUE(CASE WHEN t1.Hub <> t2.Hub AND t2.Status like 'Found%'
THEN t2.Hub
END)
OVER (PARTITION BY t1.ID
ORDER BY CASE WHEN t1.Hub = t2.Hub THEN 0 ELSE 1 END, t2.TS
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Derived_Hub,
-- get the last Status
LAST_VALUE(CASE WHEN t1.Hub <> t2.Hub AND t2.Status like 'Found%'
THEN t2.Status
END)
OVER (PARTITION BY t1.ID
ORDER BY CASE WHEN t1.Hub = t2.Hub THEN 0 ELSE 1 END, t2.TS
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Derived_Status
FROM
Table1 AS t1
JOIN Table2 AS t2
ON (t1.ID=t2.ID)
QUALIFY
ROW_NUMBER () -- return the 1st row only
OVER (PARTITION BY t1.ID
ORDER BY CASE WHEN t1.Hub = t2.Hub THEN 0 ELSE 1 END, t2.TS) = 1
EXPLAIN should combine all OVER into a single STATS-step as they use the same PARTITION BY and ORDER BY.
There are 9 fields in total in the select clause of the query: 5 are direct fields, 2 are the result of select queries in the select clause, and 2 more use the results of those previous 2 fields in CASE WHEN statements. The SUM aggregate is applied to the last two columns.
All the first 5 columns are listed in the group by clause.
Query :
SELECT
sub.a,sub.b,sub.c,Table1.d,Table1.e,
(select x from table t1 where y=current_date) as f,
(select z from table t1 where y=w) as g,
sum(case when f=g then 1 else 0 end) as h,
sum(case when f+1=g then 1 else 0 end) as i
FROM Table1
LEFT JOIN
(SELECT a,b,c,l,m,n,o FROM Table2 INNER JOIN Table3 ON Table2.p=Table3.t)sub
ON Table1.e = sub.l
GROUP BY sub.a,sub.b,sub.c,Table1.d,Table1.e
The above query works perfectly fine in Teradata v14.10.05.09 but fails with
"SELECT Failed. 3504: Selected non-aggregate values must be part of
the associated group"
in Teradata v14.00.07.02.
Is this because something introduced in the newer version allows the query to run there, while the older version rejects it?
Or am I missing something?
Thanks in Advance.
This is the actual query that is giving the problem:
SELECT
sub.key,sub.div,sub.reg_nm,tb_cal.calendar_yr,tb_cal.calendar_mth,
(SEL fiscal_yr FROM db1.tb_cal tb_cal WHERE calendar_dt=CURRENT_DATE) current_fy,
(SEL fiscal_yr FROM db1.tb_cal tb_cal WHERE calendar_dt=clse_dt) clse_dt_fy,
SUM(CASE WHEN current_fy=clse_dt_fy THEN sub.amt ELSE 0 END) cnt
FROM db1.tb_cal tb_cal
LEFT JOIN( SELECT
tbl_a.key,tbl_a.stage,COALESCE(tbl_u.div, 'Un-assigned') div,COALESCE(tbl_u.reg_nm,'Un-assigned') reg_nm,tbl_a.clse_dt,tbl_a.amt
FROM db_a.tbl_a tbl_a
INNER JOIN db1.tbl_u tbl_u
ON tbl_u.unit_key=tbl_a.unit_key
WHERE tbl_a.ctg IN ('G','O','C','F','I') AND tbl_a.stage NOT IN ('R', 'D', 'N','A') AND tbl_a.ind = 0
)sub
ON sub.clse_dt = tb_cal.calendar_dt
GROUP BY sub.key,sub.div,sub.reg_nm,tb_cal.calendar_yr,tb_cal.calendar_mth
SELECT
sub.key,
sub.div,
sub.reg_nm,
tb_cal.calendar_yr,
tb_cal.calendar_mth,
(SEL fiscal_yr FROM db1.tb_cal tb_cal WHERE calendar_dt=CURRENT_DATE) current_fy,
(SEL fiscal_yr FROM db1.tb_cal tb_cal WHERE calendar_dt=clse_dt) clse_dt_fy,
SUM(CASE WHEN current_fy=clse_dt_fy THEN sub.amt ELSE 0 END) cnt
FROM db1.tb_cal tb_cal
LEFT JOIN( SELECT
tbl_a.key,tbl_a.stage,COALESCE(tbl_u.div, 'Un-assigned') div,COALESCE(tbl_u.reg_nm,'Un-assigned') reg_nm,tbl_a.clse_dt,tbl_a.amt
FROM db_a.tbl_a tbl_a
INNER JOIN db1.tbl_u tbl_u
ON tbl_u.unit_key=tbl_a.unit_key
WHERE tbl_a.ctg IN ('G','O','C','F','I') AND tbl_a.stage NOT IN ('R', 'D', 'N','A') AND tbl_a.ind = 0
) sub
ON sub.clse_dt = tb_cal.calendar_dt
GROUP BY sub.key,sub.div,sub.reg_nm,tb_cal.calendar_yr,tb_cal.calendar_mth,current_fy,clse_dt_fy;
This should work fine. But as per your comments, adding the last two columns to the GROUP BY would alter your required answer. If that is the case, please post your desired result set and we can help you with it. Looking at your query, though, it doesn't seem that adding current_fy and clse_dt_fy will change your current result set.
Assume I have a table like
how can I create a table like
where the groups are time intervals with a length of 1 second?
Thank you in advance!
Here is an idea, but you need a table of numbers
select (m.startts + n.n - 1) as starttime,
(m.startts + n.n) as endtime,
sum(case when vehicle_type = 'bus' then 1 else 0 end) as bus,
sum(case when vehicle_type = 'car' then 1 else 0 end) as car
from (select min(Timestamp) as startts from table t) m cross join
(select 1 as n union all select 2 union all select 3) n left join
table t
on t.timestamp >= m.startts + n.n - 1 and
t.timestamp < m.startts + n.n
group by m.startts + n.n;
This is a little dangerous because of the floating point arithmetic, but it will probably work for your purposes.
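One way to sidestep the floating point concern is to store timestamps as whole seconds. The sketch below shows the numbers-table idea on that assumption, using SQLite from Python. The events and numbers tables and their rows are hypothetical stand-ins for the tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical schema: ts is whole seconds, so no floating point is involved
    CREATE TABLE events  (ts INTEGER, vehicle_type TEXT);
    CREATE TABLE numbers (n INTEGER);
    INSERT INTO events  VALUES (100, 'bus'), (100, 'car'), (100, 'car'), (102, 'bus');
    INSERT INTO numbers VALUES (1), (2), (3);
""")

# One bucket per number; the LEFT JOIN keeps buckets that contain no events.
rows = conn.execute("""
    SELECT m.startts + n.n - 1 AS starttime,
           m.startts + n.n     AS endtime,
           SUM(CASE WHEN e.vehicle_type = 'bus' THEN 1 ELSE 0 END) AS bus,
           SUM(CASE WHEN e.vehicle_type = 'car' THEN 1 ELSE 0 END) AS car
    FROM (SELECT MIN(ts) AS startts FROM events) m
    CROSS JOIN numbers n
    LEFT JOIN events e
           ON e.ts >= m.startts + n.n - 1
          AND e.ts <  m.startts + n.n
    GROUP BY m.startts, n.n
    ORDER BY starttime
""").fetchall()
print(rows)
```

The numbers table needs at least as many rows as there are one-second buckets between the first and last event; an empty bucket still shows up, with zero counts.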