SQL Server 2012 query to compute calculations - math

I am building a query within SQL Server that calculates scores we receive for our surveys. We have a column called overall_score, where the user inputs a number from 1-5 as a rating. I am trying to create a stored procedure that will calculate ratings based on the scores.
Score rating = (Total count of scores 4 and 5)/(Total number of responses) * 100
I have three separate select statements that produce the results I need, but when I combine them my output is 0.
Can someone please guide me on what I am doing wrong here?
Separate SQL Statements:
SELECT count(overall_score) FROM Layer1_DataMerge WHERE overall_score = 4;
SELECT count(overall_score) FROM Layer1_DataMerge WHERE overall_score = 5;
SELECT count(overall_score) FROM Layer1_DataMerge;
Combined together:
SELECT DISTINCT
    (
        (
            (SELECT count(overall_score) FROM Layer1_DataMerge WHERE overall_score = 4) +
            (SELECT count(overall_score) FROM Layer1_DataMerge WHERE overall_score = 5)
        ) / (SELECT count(overall_score) FROM Layer1_DataMerge)
    ) AS CSAT
FROM Layer1_DataMerge;

Well, the reason you're getting zero is that you're doing integer division: with integer division, 1/3 = 0. You need to force non-integer arithmetic, and you can do it all in one query:
SELECT 100.0 *
       (SUM(CASE WHEN overall_score = 4 THEN 1 ELSE 0 END) +
        SUM(CASE WHEN overall_score = 5 THEN 1 ELSE 0 END)) /
       COUNT(overall_score)
FROM Layer1_DataMerge
or
SELECT 100.0 *
       SUM(CASE WHEN overall_score IN (4,5) THEN 1 ELSE 0 END) /
       COUNT(overall_score)
FROM Layer1_DataMerge
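(To see the integer-division effect mentioned above on its own, the literal values here are just an illustration:)
SELECT 9 / 10;     -- 0        (both operands are integers)
SELECT 9 / 10.0;   -- 0.900000 (one decimal operand forces decimal division)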
To only show 2 decimals you can either cast to a numeric type with 2 decimals:
SELECT CAST(
           100.0 *
           SUM(CASE WHEN overall_score IN (4,5) THEN 1 ELSE 0 END) /
           COUNT(overall_score)
       AS NUMERIC(5,2))
FROM Layer1_DataMerge
Or use STR to convert to a string:
SELECT STR(
           100.0 *
           SUM(CASE WHEN overall_score IN (4,5) THEN 1 ELSE 0 END) /
           COUNT(overall_score)
       , 5, 2)
FROM Layer1_DataMerge
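Since the question mentions a stored procedure, here is a minimal sketch of wrapping the last query in one. The procedure name, the output parameter, and the NULLIF guard against an empty table are illustrative assumptions, not from the original post:
-- Hedged sketch: procedure name and @CSAT parameter are assumptions
CREATE PROCEDURE dbo.usp_Get_CSAT
    @CSAT NUMERIC(5,2) OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    SELECT @CSAT = CAST(
               100.0 *
               SUM(CASE WHEN overall_score IN (4,5) THEN 1 ELSE 0 END) /
               NULLIF(COUNT(overall_score), 0)   -- avoid divide-by-zero on an empty table
           AS NUMERIC(5,2))
    FROM Layer1_DataMerge;
END;

-- usage (run as a separate batch):
DECLARE @result NUMERIC(5,2);
EXEC dbo.usp_Get_CSAT @CSAT = @result OUTPUT;
SELECT @result AS CSAT;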

Related

Oracle: How to calculate a percentage in one select

Good afternoon, can you help me?
I have a table of "routes".
I need to count all routes, count the routes matching a WHERE condition, and get the percentage of the second count relative to the first.
For example:
select count(routes) from table1                      -- ~ 150000 rows
select count(routes) from table1 where distance < 100 -- ~ 15000 rows
How do I get the number 10%?
You could conditionally aggregate values:
select sum(case when distance < 100 then 1 else 0 end) /
count(*) * 100 as result
from table1;
Or (worse), use current queries as subqueries:
select
(select count(routes) from table1 where distance < 100) /
(select count(routes) from table1) * 100 as result
from dual;
Try this:
select count(case when distance < 100 then 1 end) / count(routes) * 100 as result from table1
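As a sanity check with the question's numbers (about 15000 of about 150000 rows with distance < 100), the conditional aggregation comes out at roughly 10. A rounded variant of the same query (ROUND and the two-decimal choice are just an illustration) would be:
select round(sum(case when distance < 100 then 1 else 0 end) / count(*) * 100, 2) as result
from table1;
-- with ~15000 matching rows out of ~150000 this returns 10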

How to calculate a row value based on the previous row value in the same column

I have the following data set:
DATE        CODE  RANK  PARTITION
?           ABS   0     1
12/04/2014  RET   1     1
20/04/2014  RET   2     1
01/05/2014  ABS   2     1
13/05/2014  RET   2     1
01/06/2015  ABS   2     1
09/10/2015  RETk  2     1
?           ABS   0     2
02/04/2015  RET   1     2
03/04/2015  RET   2     2
04/04/2015  ABS   2     2
05/04/2015  STT   3     2
06/04/2015  RETk  4     2
07/04/2015  RETk  4     2
RANK is the column I want to calculate in my SQL given the columns DATE, CODE AND the previous value of the same column. It's initialized here to 0.
The logic I want to implement is as follows:
If RANK-1 (previous row) IS NULL AND CODE = ABS THEN RANK = 0
If RANK-1 (previous row) IS NULL AND CODE <> ABS THEN RANK <- (RANK-1) + 1
If RANK-1 = 0 or 1 AND CODE = RET THEN RANK <- (RANK-1) + 1
If RANK-1 = 2 AND CODE = STT THEN RANK <- (RANK-1) + 1
If RANK-1 = 3 AND CODE = RETk THEN RANK <- (RANK-1) + 1
If CODE = ABS THEN RANK <- (RANK-1) (previous row)
Else 0
The Teradata release I am using is R14. The calculation is done on a partition basis, as shown in the example above. I have added some more constraints to the model to make it clearer: in this example, if the current code is RET, I do not increase the rank unless the previous one is 0 or 1. Similarly, if my current code is RETk, I do not increase the rank unless the previous one is equal to 3; otherwise, I do not change the rank. I repeat the same process in the following partition, and so on.
I cannot figure out how to update the current column value given the previous one. I have tried many logic implementations with OLAP functions, without success.
Can anyone give me a hint?
Thank you very much for your help.
You can always use a recursive query for tasks like this, but performance will be bad unless the number of rows per group is low.
First you need a way to advance to the next row. As the next row's date can't be calculated from the current row's date, you must materialize the data and add a ROW_NUMBER:
CREATE TABLE tab(dt DATE, CODE VARCHAR(10), rnk INT, part INT);
INSERT INTO tab(NULL,              'ABS',  0, 1);
INSERT INTO tab(DATE '2014-04-12', 'RET',  1, 1);
INSERT INTO tab(DATE '2014-04-20', 'RET',  2, 1);
INSERT INTO tab(DATE '2014-05-01', 'ABS',  2, 1);
INSERT INTO tab(DATE '2014-05-13', 'RET',  2, 1);
INSERT INTO tab(DATE '2014-06-01', 'ABS',  2, 1);
INSERT INTO tab(DATE '2014-10-09', 'RETk', 2, 1);
INSERT INTO tab(NULL,              'ABS',  0, 2);
INSERT INTO tab(DATE '2015-04-02', 'RET',  1, 2);
INSERT INTO tab(DATE '2015-04-03', 'RET',  2, 2);
INSERT INTO tab(DATE '2015-04-04', 'ABS',  2, 2);
INSERT INTO tab(DATE '2015-04-05', 'STT',  3, 2);
INSERT INTO tab(DATE '2015-04-06', 'RETk', 4, 2);
INSERT INTO tab(DATE '2015-04-07', 'RETk', 4, 2);
CREATE VOLATILE TABLE vt AS
(
SELECT dt, code, part
-- used to find the next row
,ROW_NUMBER() OVER (PARTITION BY part ORDER BY dt) AS rn
FROM tab
) WITH DATA
PRIMARY INDEX(part, rn)
ON COMMIT PRESERVE ROWS
;
And now it's just applying your logic using CASE row after row:
WITH RECURSIVE cte (dt, code, rnk, part, rn) AS
(
SELECT
dt
,code
,CASE WHEN code = 'ABS' THEN 0 ELSE 1 END
,part
,rn
FROM vt
WHERE rn = 1
UNION ALL
SELECT
vt.dt
,vt.code
,CASE
WHEN cte.rnk IN (0,1) AND vt.CODE = 'RET' THEN cte.rnk + 1
WHEN cte.rnk = 2 AND vt.CODE = 'STT' THEN cte.rnk + 1
WHEN cte.rnk = 3 AND vt.CODE = 'RETk' THEN cte.rnk + 1
WHEN vt.CODE = 'ABS' THEN cte.rnk
ELSE cte.rnk
END
,vt.part
,vt.rn
FROM vt JOIN cte
  ON vt.part = cte.part
 AND vt.rn = cte.rn + 1
)
SELECT *
FROM cte
ORDER BY part, dt;
But I think your logic is not actually like this (based on the previous row's exact RANK value); you're just stuck in procedural thinking :-)
You might be able to do what you want using OLAP functions only...
Something along the lines of:
create table table1
(
datecol date,
code varchar(10),
rankcol integer
);
--insert into table1 select '2014/05/13', 'RETj', 0;
select
    case
        when s1.code = 'ABS' and s2.rankcol = 1 then 1
        when s1.code = 'RET' and s2.rankcol = 0 then 1
        when s1.code = 'RET' and s2.rankcol = 1 then 2
        else 0
    end RET_res,
    s1.*, s2.*
from
    (select rankcol, code, row_number() OVER (order by datecol) var1 from table1) s1,
    (select rankcol, code, row_number() OVER (order by datecol) var1 from table1) s2
where s1.var1 = s2.var1 - 1
order by s1.var1
;

How to get maximum column values across a row in Teradata sql?

I have a table named cnst_chrctrstc_abc with 10 columns (equ_gender1 through bb_population_flag); each row contains numeric values (counts).
I want to get the top 5 values from each row across those 10 numeric columns.
The query I have looks something like the following ..
SEL
FROM
(
SEL
SUM(CASE WHEN COALESCE(act.equ_gender1,'') = COALESCE(inact.equ_gender1,'') THEN 0 ELSE 1 END ) AS equ_gender1_chg_cnt,
SUM(CASE WHEN COALESCE(act.exp_ex_bmyr1,'') = COALESCE(inact.exp_ex_bmyr1,'') THEN 0 ELSE 1 END ) AS exp_ex_bmyr1_chg_cnt,
SUM(CASE WHEN COALESCE(act.equ_age1,'') = COALESCE(inact.equ_age1,'') THEN 0 ELSE 1 END ) AS equ_age1_chg_cnt,
SUM(CASE WHEN COALESCE(act.maritalstatus1,'') = COALESCE(inact.maritalstatus1,'') THEN 0 ELSE 1 END ) AS maritalstatus1_chg_cnt,
SUM(CASE WHEN COALESCE(act.person_type1,'') = COALESCE(inact.person_type1,'') THEN 0 ELSE 1 END ) AS person_type1_chg_cnt,
SUM(CASE WHEN COALESCE(act.homeowner,'') = COALESCE(inact.homeowner,'') THEN 0 ELSE 1 END ) AS homeowner_chg_cnt,
SUM(CASE WHEN COALESCE(act.dwelling_size,'') = COALESCE(inact.dwelling_size,'') THEN 0 ELSE 1 END ) AS dwelling_size_chg_cnt,
SUM(CASE WHEN COALESCE(act.lengthofresidence,'') = COALESCE(inact.lengthofresidence,'') THEN 0 ELSE 1 END ) AS lengthofresidence_chg_cnt,
SUM(CASE WHEN COALESCE(act.childrenage0_18,'') = COALESCE(inact.childrenage0_18,'') THEN 0 ELSE 1 END ) AS childrenage0_18_chg_cnt,
SUM(CASE WHEN COALESCE(act.bb_population_flag,'') = COALESCE(inact.bb_population_flag,'') THEN 0 ELSE 1 END ) AS bb_population_flag
FROM
(SEL * FROM arc_mdm_Tbls.cnst_chrctrstc_abc WHERE load_id=1024 AND cnst_chrctrstc_end_dt='9999-12-31' (DATE))act
LEFT JOIN
(SEL * FROM arc_mdm_Tbls.cnst_chrctrstc_abc WHERE load_id=1024 AND cnst_chrctrstc_end_dt<'9999-12-31' (DATE)
QUALIFY ROW_NUMBER() OVER (PARTITION BY cnst_mstr_id ORDER BY cnst_chrctrstc_strt_ts DESC)=1
)inact
ON act.cnst_mstr_id = inact.cnst_mstr_id
)X
I know SEL GREATEST would produce the maximum value from each row, but I want the top 5 values and to assign a rank to them.
For some rows the first five columns may hold the top 5 values, and for other rows the last five (homeowner through bb_population_flag) may hold them.
So if the columns and values from cnst_chrctrstc_abc look something like this:
cdi_batch_id |  a  |  b  |  c  |  d  |  e  |  f  |  g  |  h  |  i  |  j
1024         | 116 | 105 | 102 | 100 | 117 | 119 | 108 | 104 | 101 | 121
then the select query should return the columns j, f, e, a, g holding the top 5 values.
And then I would assign a rank to them accordingly.
Should this be done using UNPIVOT or something similar?
Thanks in advance.
Yes, you need to unpivot your result.
Before TD14.10 you will need a list of those column names, either as a table
create table ColumnList (col varchar(128));
Insert into ColumnList('equ_gender1');
Insert into ColumnList('exp_ex_bmyr1');
Insert into ColumnList('equ_age1');
Insert into ColumnList('maritalstatus1');
Insert into ColumnList('person_type1');
Insert into ColumnList('homeowner');
Insert into ColumnList('dwelling_size');
Insert into ColumnList('lengthofresidence');
Insert into ColumnList('childrenage0_18');
Insert into ColumnList('bb_population_flag');
or on-the-fly using a bulky derived-table UNION:
with ColumnList as
(
select * from (select 'equ_gender1' as Col) as dt union all
select * from (select 'exp_ex_bmyr1' as Col) as dt union all
select * from (select 'equ_age1' as Col) as dt union all
select * from (select 'maritalstatus1' as Col) as dt union all
select * from (select 'person_type1' as Col) as dt union all
select * from (select 'homeowner' as Col) as dt union all
select * from (select 'dwelling_size' as Col) as dt union all
select * from (select 'lengthofresidence' as Col) as dt union all
select * from (select 'childrenage0_18' as Col) as dt union all
select * from (select 'bb_population_flag' as Col) as dt
)
Then you CROSS JOIN to unpivot:
select
col,
case col
when 'equ_gender1' then equ_gender1
when 'exp_ex_bmyr1' then exp_ex_bmyr1
when 'equ_age1' then equ_age1
when 'maritalstatus1' then maritalstatus1
when 'person_type1' then person_type1
when 'homeowner' then homeowner
when 'dwelling_size' then dwelling_size
when 'lengthofresidence' then lengthofresidence
when 'childrenage0_18' then childrenage0_18
when 'bb_population_flag' then bb_population_flag
end as Counts,
rank() over (order by Counts desc) as rnk
FROM
(
your current select
) as dt
cross join ColumnList
qualify rnk <= 5
In TD14.10 you could utilize the TD_UNPIVOT function:
SELECT Col, rank() over (order by Counts desc) as rnk
from TD_UNPIVOT(
ON (
your current select
)
USING
VALUE_COLUMNS('Counts')
UNPIVOT_COLUMN('Col')
COLUMN_LIST('equ_gender1'
,'exp_ex_bmyr1'
,'equ_age1'
,'maritalstatus1'
,'person_type1'
,'homeowner'
,'dwelling_size'
,'lengthofresidence'
,'childrenage0_18'
,'bb_population_flag')
) dt
qualify rnk <= 5;
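Applied to the single sample row from the question (a through j = 116, 105, 102, 100, 117, 119, 108, 104, 101, 121), either unpivot variant would keep just the five largest values:
Col  Counts  rnk
j    121     1
f    119     2
e    117     3
a    116     4
g    108     5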
Edit:
Additionally you might replace your LEFT JOIN with a single OLAP function. Depending on the number of rows per cnst_mstr_id this might be more efficient, as you need a ROW_NUMBER anyway:
SEL
   SUM(CASE WHEN COALESCE(equ_gender1,'') = COALESCE(last_equ_gender1,'') THEN 0 ELSE 1 END) AS equ_gender1_chg_cnt,
   ...
FROM
 ( SELECT
      equ_gender1,
      MIN(equ_gender1) OVER (PARTITION BY cnst_mstr_id
                             ORDER BY cnst_chrctrstc_strt_ts DESC
                             ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS last_equ_gender1,
      ...
   FROM arc_mdm_Tbls.cnst_chrctrstc_abc
   WHERE load_id=1024
   QUALIFY cnst_chrctrstc_end_dt = DATE '9999-12-31'
 ) act
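In isolation, the frame trick used above looks like this; the table t and its columns grp, ts, val are hypothetical stand-ins, not objects from the question:
-- MIN over a one-row "1 FOLLOWING" frame, with the rows ordered newest first,
-- returns the next-older row's value without a self-join
SELECT grp, ts, val,
       MIN(val) OVER (PARTITION BY grp
                      ORDER BY ts DESC
                      ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS prior_val
FROM t;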

group by time interval

Assume I have a table with a Timestamp column and a vehicle_type column (values such as 'bus' and 'car').
How can I create a result table where the rows are grouped into time intervals with the length of 1 second?
Thank you in advance!
Here is an idea, but you need a table of numbers:
select (m.startts + n.n - 1) as starttime,
       (m.startts + n.n) as endtime,
       sum(case when vehicle_type = 'bus' then 1 else 0 end) as bus,
       sum(case when vehicle_type = 'car' then 1 else 0 end) as car
from (select min(Timestamp) as startts from table t) m cross join
     (select 1 as n union all select 2 union all select 3) n left join
     table t
     on t.timestamp >= m.startts + n.n - 1 and
        t.timestamp < m.startts + n.n
group by m.startts + n.n;
This is a little dangerous because of the floating point arithmetic, but it will probably work for your purposes.
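If the Timestamp column is (or can be cast to) a plain number of seconds, a hedged alternative is to group by the truncated second. The table name t mirrors the placeholder used above, and unlike the LEFT JOIN version, empty intervals simply will not appear:
select cast(t.Timestamp as integer)     as starttime,
       cast(t.Timestamp as integer) + 1 as endtime,
       sum(case when vehicle_type = 'bus' then 1 else 0 end) as bus,
       sum(case when vehicle_type = 'car' then 1 else 0 end) as car
from table t
group by cast(t.Timestamp as integer);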

SQLite count(*) in WHERE clause

I have a calendar table in which there are all the dates in the future and a workday field:
fld_date / fld_workday
2014-01-01 / 1
2014-01-02 / 1
2014-01-03 / 0
...
I want to select a date that is n workdays away from another date. I tried two ways, but I failed:
The 5th workday from 2014-11-07:
1.
SELECT n1.fld_date FROM calendar as n1 WHERE n1.fld_workday=1 AND
(select count(*) FROM calendar as n2 WHERE n2.fld_date>='2014-11-07' AND n2.fld_workday=1)=5
It gave back 0 rows.
2.
SELECT fld_date FROM calendar WHERE fld_date>='2014-11-07' AND fld_workday=1 LIMIT 1 OFFSET 5
It's OK, but I would like to change the constant 5 to a field, and that is not possible with OFFSET (it would be inside a bigger select statement):
SELECT fld_date FROM calendar WHERE fld_date>='2014-11-07' AND fld_workday=1 LIMIT 1 OFFSET fld_another_field
Any suggestion?
In the first query, the subquery does not refer to the row in n1.
You need a correlated subquery:
SELECT fld_Date
FROM Calendar AS n1
WHERE fld_WorkDay = 1
AND (SELECT COUNT(*)
FROM Calendar AS n2
WHERE fld_Date BETWEEN '2014-11-07' AND n1.fld_Date
AND fld_WorkDay = 1
) = 5
LIMIT 1
The subquery is extremely inefficient if there is no index on the fld_Date column.
You can avoid executing the subquery for every row in n1 by adding another condition with an estimate of the result date (assuming that there are between about four and five work days per week, and using a few extra days to be sure):
...
WHERE fld_Date BETWEEN date('2014-11-07', (5 * 7/5 - 10) || ' days')
                   AND date('2014-11-07', (5 * 7/4 + 10) || ' days')
  AND fld_WorkDay = 1
AND (SELECT ...
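To replace the constant 5 with a per-row value from the bigger statement the question mentions, the same correlated count can be used as a scalar subquery. The outer table orders and its columns order_date and lead_days are hypothetical examples; only the calendar table comes from the question:
-- orders(order_date, lead_days) is an assumed example table; lead_days plays
-- the role of the 5-workday constant from the question
SELECT o.order_date,
       (SELECT c1.fld_date
        FROM calendar AS c1
        WHERE c1.fld_workday = 1
          AND (SELECT COUNT(*)
               FROM calendar AS c2
               WHERE c2.fld_date BETWEEN o.order_date AND c1.fld_date
                 AND c2.fld_workday = 1
              ) = o.lead_days
        ORDER BY c1.fld_date
        LIMIT 1
       ) AS due_date
FROM orders AS o;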
