Teradata - Case - PARTITION - case

i have the following query that brings for each party_id all the business_party_id related, their monthly accumulated value for each BP and the rank for that value. there s a special business_party_id = 200 that when a given party_id has more than 1 related business_party and that business_party_ ranks 1 while being 200 then it shouldn t be considered and the rank 2 for that party_id should be the one considered as rank 1 and move the 200 to rank 2.
how can i do that?
here s my query
select party_id,
business_party_id,
monthly_accum_event_amt,
Row_number() OVER (PARTITION BY party_id ORDER BY monthly_accum_event_amt desc) orden
from (
select * from P_DMT_VIEWS.BUSINESS_AGREEMENT_PAYROLL
where LAST_DAY( payroll_dt) = '2018-05-31'
and not business_payrroll_concept_val in ('SUE004','SUE005','SUE00A')
and monthly_event_qty<>0
QUALIFY Row_number() OVER (PARTITION BY business_party_id,party_id ORDER BY monthly_event_qty desc ) =1) a

Related

Can we use the at sign in sqlite for a constant?

It is actually possible to use # (the at sign) with sqlite to be able to use a calculated value as a constant in an other query ?
I am using a variable(a total) that i calculated previously to get an other variable (a proportion) over two time periods.
Total amout of sale
Proportion of sale between the first semester and second semester.
I copy the first query to get the constant and i had the first query to the second.
The answer is no BUT:-
This could possibly be done in a single query.
Consider this simple demo with hopefully easy to understand all-in-one queries:-
First the sales table:-
i.e. 2 columns semester and amount
10 rows in total so 1000 is the total amount
6 rows are S1 (amount is 600) so 60%
4 rows are S2 (amount is 400) so 40%
Created and populated using:-
CREATE TABLE IF NOT EXISTS sales (semester TEXT, amount REAL);
INSERT INTO sales VALUES('S1',100),('S1',100),('S1',100),('S1',100),('S1',100),('S1',100),('S2',100),('S2',100),('S2',100),('S2',100);
So you could use an all-in-one query such as:-
SELECT
(SELECT sum(amount) FROM sales) AS total,
(SELECT sum(amount) FROM sales WHERE semester = 'S1') AS s1total,
((SELECT sum(amount) FROM sales WHERE semester = 'S1') / (SELECT sum(amount) FROM sales)) * 100 AS s1prop,
(SELECT sum(amount) FROM sales WHERE semester = 'S2') AS s2total,
((SELECT sum(amount) FROM sales WHERE semester = 'S2') / (SELECT sum(amount) FROM sales)) * 100 AS s2prop
;
This would result in
i.e. s1prop and s2prop the expected results (the other columns may be useful)
An alternative, using a CTE (Common Table Expressions) that does the same could be:-
WITH cte_total(total,s1total,s2total) AS (SELECT
(SELECT sum(amount) FROM sales),
(SELECT sum(amount) FROM sales WHERE semester = 'S1'),
(SELECT sum(amount) FROM sales WHERE semester = 'S2')
)
SELECT total, s1total, (s1total / total) * 100 AS s1prop, s2total, (s2total / total) * 100 AS s2prop FROM cte_total;
you can have multiple CTE's and gather data from other tables or even being passed as parameters. They can be extremely useful and would even allow values to be accessed throughout.
e.g.
Here's an example where a 2nd cte is added (as the first cte) that mimics passing 3 dates (instead of the hard coded values ?'s could be coded and the parameters passed via parameter binding).
As the sales table has no date for the sale a literal value has been coded, this would be normally be the column with the sale date instead of WHERE '2023-01-01' /*<<<<< would be the column that holds the date */
the hard coded date has purposefully been used so result in the BETWEEN clause resulting in true.
if the date column did exist then WHERE criteria for the semester could then be by between the respective dates for the semester.
The example:-
WITH
dates AS (SELECT
'2023-01-01' /*<<<<< ? and can then be passed as bound parameter*/ AS startdate,
'2023-03-01' /*<<<<< ? and can then be passed as bound parameter*/ AS semester2_start,
'2023-05-30' /*<<<<< ? and can then be passed as bound parameter*/as enddate
),
cte_total(total,s1total,s2total) AS (SELECT
(SELECT sum(amount) FROM sales
WHERE '2023-01-01' /*<<<<< would be the column that holds the date */
BETWEEN (SELECT startdate FROM dates)
AND (SELECT enddate FROM dates)),
(SELECT sum(amount) FROM sales WHERE semester = 'S1'),
(SELECT sum(amount) FROM sales WHERE semester = 'S2')
)
SELECT total, s1total, (s1total / total) * 100 AS s1prop, s2total, (s2total / total) * 100 AS s2prop FROM cte_total;

SQL OLAP vs KQL || row_number() over partition by in SQL vs KQL

I have OLAP Query in SQL and requirement is same query result need to fetch in ADX(Azure Data explorer) using KQL.
I'm new to KQL and trying to convert same SQL into KQL but not able to achieve row_number() over partition by part in KQL.
Tried below article from MS but no luck using EXPLAIN key word.
https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/sqlcheatsheet
Below is SQL, Need same in KQL. Can you please help me on this?
Select logdate, adlspathL, count as totalRequests ,repeated as repeatedInLast7Days
from (select logdate, adlspathL, count ,SUM(rnk1) over(partition by adlspathL) repeated, rnk2
from(Select logdate, adlspathL, count, rnk,
rank() over(partition by adlspathL order by adlspathL) rnk1,
row_number() over(partition by adlspathL order by count desc) rnk2
from (Select logdate, adlspathL, count,
rank() over(partition by logdate order by count desc) rnk from ( Select logdate, adlspath as adlspathL, SUM(cnt) count
from `LogAnalyticsADLS`.`LogAnalyticsProdADLSAgg_OpsName`
where logdate between '2021-08-23' and '2021-08-29'
group by logdate, adlspathL
)a
)b
where rnk between 1 and 10
order by logdate , rnk
)a
)b
where rnk2 =1
Thanks,
Brahma
Here is a KQL solution.
Currently there isn't a built-in function for rank, only for dense rank (row_rank), however a rank functionallity could be achieved by the scan operator.
Retrieval of max value + additional values from the same row, could be easily achieved using the arg_max function.
// Generation of a data sample. Not a prat of the solution
let loganalyticsprodadlsagg_opsname = range i from 1 to 1000 step 1 | project logdate = bin(datetime(2021-09-01) - 10d*rand(), 1d), adlspath = strcat("adlspath_", tostring(toint(rand(15)))), cnt = toint(rand(100));
// Solution starts here.
loganalyticsprodadlsagg_opsname
| where logdate between (datetime('2021-08-23') .. datetime('2021-08-29'))
| summarize cnt = sum(cnt) by logdate, adlspath
| partition by logdate
(
order by cnt desc
| scan declare(rank:int = 1, rn:int = 0)
with
(
step s : True => rn = s.rn + 1, rank = iff(cnt == s.cnt, s.rank, s.rn+1);
)
| where rank <= 10
)
| summarize totalrequests = arg_max(cnt, logdate), repeatedInLast7Days = count() by adlspath
adlspath
totalrequests
logdate
repeatedInLast7Days
adlspath_7
644
2021-08-23T00:00:00Z
4
adlspath_8
472
2021-08-23T00:00:00Z
6
adlspath_1
760
2021-08-24T00:00:00Z
6
adlspath_12
400
2021-08-24T00:00:00Z
5
adlspath_5
599
2021-08-24T00:00:00Z
3
adlspath_11
410
2021-08-25T00:00:00Z
4
adlspath_14
469
2021-08-25T00:00:00Z
5
adlspath_13
565
2021-08-25T00:00:00Z
6
adlspath_9
472
2021-08-25T00:00:00Z
2
adlspath_2
514
2021-08-26T00:00:00Z
6
adlspath_10
391
2021-08-26T00:00:00Z
4
adlspath_3
414
2021-08-27T00:00:00Z
4
adlspath_6
594
2021-08-28T00:00:00Z
3
adlspath_0
499
2021-08-28T00:00:00Z
6
adlspath_4
552
2021-08-29T00:00:00Z
6
Fiddle

SQLite Nested Query for maximum

I'm trying to use DB Browser for SQLite to construct a nested query to determine the SECOND highest priced item purchased by the top 10 spenders. The query I have to pick out the top 10 spenders is:
SELECT user_id, max(item_total), SUM (item_total + shipping_cost -
discounts_applied) AS total_spent
FROM orders AS o
WHERE payment_reject = "FALSE"
GROUP BY user_id
ORDER BY total_spent DESC
LIMIT 10
This gives the user_id, most expensive item they purchased (not counting shipping or discounts) as well as the total amount they spent on the site.
I was trying to use a nested query to generate a list of the second most expensive items they purchased, but keep getting errors. I've tried
SELECT user_id, MAX(item_total) AS second_highest
FROM orders
WHERE item_total < (SELECT user_id, SUM (item_total + shipping_cost -
discounts_applied) AS total_spent
FROM orders
WHERE payment_reject = "FALSE"
GROUP BY user_id
ORDER BY total_spent DESC
LIMIT 10)
group by user_id
I keep getting a row value misused error. Does anyone have pointers on this nested query or know of another way to find the second highest item purchased from within the group found in the first query?
Thanks!
(Note: The following assumes you're using Sqlite 3.25 or newer since it uses window functions).
This will return the second-largest item_total for each user_id without duplicates:
WITH ranked AS
(SELECT DISTINCT user_id, item_total
, dense_rank() OVER (PARTITION BY user_id ORDER BY item_total DESC) AS ranking
FROM orders)
SELECT user_id, item_total FROM ranked WHERE ranking = 2;
You can combine it with your original query with something like:
WITH ranked AS
(SELECT DISTINCT user_id, item_total
, dense_rank() OVER (PARTITION BY user_id ORDER BY item_total DESC) AS ranking
FROM orders),
totals AS
(SELECT user_id
, sum (item_total + shipping_cost - discounts_applied) AS total_spent
FROM orders
WHERE payment_reject = 0
GROUP BY user_id)
SELECT t.user_id, r.item_total, t.total_spent
FROM totals AS t
JOIN ranked AS r ON t.user_id = r.user_id
WHERE r.ranking = 2
ORDER BY t.total_spent DESC, t.user_id
LIMIT 10;
Okay, after fixing your table definition to better reflect the values being stored in it and the stated problem, and fixing the data and adding to it so you can actually get results, plus an optional but useful index like so:
CREATE TABLE orders (order_id INTEGER PRIMARY KEY
, user_id INTEGER
, item_total REAL
, shipping_cost NUMERIC
, discounts_applied NUMERIC
, payment_reject INTEGER);
INSERT INTO orders(user_id, item_total, shipping_cost, discounts_applied
, payment_reject) VALUES (9852,60.69,10,0,FALSE),
(2784,123.91,15,0,FALSE), (1619,119.75,15,0,FALSE), (9725,151.92,15,0,FALSE),
(8892,153.27,15,0,FALSE), (7105,156.86,25,0,FALSE), (4345,136.09,15,0,FALSE),
(7779,134.93,15,0,FALSE), (3874,157.27,15,0,FALSE), (5102,108.3,10,0,FALSE),
(3098,59.97,10,0,FALSE), (6584,124.92,15,0,FALSE), (5136,111.06,10,0,FALSE),
(1869,113.44,20,0,FALSE), (3830,129.63,15,0,FALSE), (9852,70.69,10,0,FALSE),
(2784,134.91,15,0,FALSE), (1619,129.75,15,0,FALSE), (9725,161.92,15,0,FALSE),
(8892,163.27,15,0,FALSE), (7105,166.86,25,0,FALSE), (4345,146.09,15,0,FALSE),
(7779,144.93,15,0,FALSE), (3874,167.27,15,0,FALSE), (5102,118.3,10,0,FALSE),
(3098,69.97,10,0,FALSE), (6584,134.92,15,0,FALSE), (5136,121.06,10,0,FALSE),
(1869,123.44,20,0,FALSE), (3830,139.63,15,0,FALSE);
CREATE INDEX orders_idx_1 ON orders(user_id, item_total DESC);
the above query will give:
user_id item_total total_spent
---------- ---------- -----------
7105 156.86 373.72
3874 157.27 354.54
8892 153.27 346.54
9725 151.92 343.84
4345 136.09 312.18
7779 134.93 309.86
3830 129.63 299.26
6584 124.92 289.84
2784 123.91 288.82
1619 119.75 279.5
(If you get a syntax error from the query now, it's because you're using an old version of sqlite that doesn't support window functions.)

Increase row number every time a flag changes

I have gone through a similar post in Stack overflow...
but my query is :
If my table generates a flag in run time execution,then how can I increase Grp_number(generate run time) every time my flag changes.
my Oracle query:
Select emp_id,
Case when MOD(rownum/3)=1 and rownum>1 then 'Y' else 'N' as flag
from Transaction_table
Desired o/p Data format:
emp_id Flag GRP_number
1 N 1
2 N 1
3 N 1
4 Y 2
5 N 2
6 N 2
7 Y 3
You cannot reference a column in another column in the same select list. You need to use sub query to avoid INVALID IDENTIFIER error.
Do it like -
WITH DATA AS(
SELECT emp_id,
CASE
WHEN MOD(rownum/3)=1
AND rownum >1
THEN 'Y'
ELSE 'N' AS flag
FROM Transaction_table
)
SELECT emp_id, flag, SUM(gap) over (PARTITION BY person
ORDER BY DAY) grp
FROM(
SELECT emp_id, flag,
CASE WHEN flag = lag(flag) over (PARTITION BY person
ORDER BY DAY)
THEN 0
ELSE 1
END gap
FROM DATA)

Group by not returning 0 value

My table contains pk_id,reviewer_id,rating.
There are 4 type of rating.
1-very good.
2-good.
3-bad.
4-very bad.
I want to calculate how much rating given by each reviewer.
Means:
If Akee having id 200 has given 2 very good,4 good,3 bad and zero very bad rating to different code.
I want result
count--- rate
2---------1
4---------2
3---------3
0---------4
My query is
SELECT COUNT(RATE),RATE
FROM CODE_REVIEW WHERE CODE_REVIEWER_ID= 200
GROUP BY RATE;
It is showing result
count--- rate
2---------1
4---------2
3---------3
I want to show the fourth row that is 4 rating zero.
How can it be done??
If Rate is not the primary key in another table then you need define your own list of rates so MySQL knows what the permutations of rate are:
SELECT Rates.Rate,
COUNT(Code_Review.Rate) AS CountOfRate
FROM ( SELECT 1 AS Rate UNION ALL
SELECT 2 AS Rate UNION ALL
SELECT 3 AS Rate UNION ALL
SELECT 4
) AS Rates
LEFT JOIN Code_Review
ON Code_Review.Rate = Rates.Rate
AND CODE_REVIEWER_ID = 200
GROUP BY Rates.Rate
Try this query:
SELECT coalesce(c.cnt, 0), r.rate
FROM (SELECT 1 AS rate UNION ALL SELECT 2
UNION ALL SELECT 3 UNION ALL SELECT 4) AS r
LEFT JOIN (SELECT COUNT(RATE),RATE
FROM CODE_REVIEW WHERE CODE_REVIEWER_ID= 200
GROUP BY RATE) AS c
ON r.rate = c.rate;
The first subquery creates a list of possible rates. You can avoid it if you have a table which defines all rates;
Second subquery is yours;
LEFT JOIN guarantees that all rates will be shown;
coalesce() is needed to convert NULL into 0.
Assuming that you do not have a separate table where the rates are defined.
SElECT * from (
SELECT distinct(m.rate), countrate from code_review m
LEFT JOIN
(SELECT COUNT(rate) as countrate,rate FROM code_review
WHERE code_reviewer_id=200 GROUP BY rate) t
ON m.rate=t.rate) a
You could do it somthing like this
SELECT
rates.RATE
, SUM(COUNT) COUNT
FROM
(
SELECT 1 RATE, 0 COUNT UNION ALL
SELECT 2 RATE, 0 COUNT UNION ALL
SELECT 3 RATE, 0 COUNT UNION ALL
SELECT 4 RATE, 0 COUNT
) Rates
LEFT JOIN
(
SELECT
RATE
, COUNT(RATE) COUNT
FROM
CODE_REVIEW
WHERE
CODE_REVIEWER_ID= 200
GROUP BY RATE
) Ratings200
ON Ratings200.RATE = Rates.RATE
If you can, you should push to try to get it in column format as it is simple as:
SELECT
SUM(rate = 1) AS 1,
SUM(rate = 2) AS 2,
SUM(rate = 3) AS 3,
SUM(rate = 4) AS 4
FROM
code_review
WHERE
code_reviewer_id = 200
But if you really need a row format, you could do:
SELECT
a.rate,
COUNT(b.rate) AS cnt
FROM
(
SELECT 1 AS rate UNION ALL
SELECT 2 AS rate UNION ALL
SELECT 3 AS rate UNION ALL
SELECT 4 AS rate
) a
LEFT JOIN
code_review b ON a.rate = b.rate AND code_reviewer_id = 200
GROUP BY
a.rate
SELECT
Rate,
totCount
FROM
(
Select
Rate,
count(Rate) as totCount
from
Code_Review
where
CODE_REVIEWER_ID = 200
group by
Rate
union
select 4, 0
union
select 3, 0
union
select 2, 0
union
select 1, 0
) AS T
group by
T.Rate

Resources