How to use QUALIFY ROW_NUMBER in Teradata

how are you?
I have never used QUALIFY with ROW_NUMBER()/RANK() and I have some questions.
I am trying this query in Teradata:
select sit_site_id
, count(distinct shp_shipment_id)
from WHOWNER.BT_SHP_SHIPMENTS
group by sit_site_id
QUALIFY RANK() OVER (PARTITION BY sit_site_id
ORDER BY count(distinct shp_shipment_id) ) = 3
But the result is: 'No data available in table'.
I want to get the 3 sit_site_id values with the most distinct shp_shipment_id values.
Where is my error?
Thanks!

I want to get the 3 sit_site_id values with the most distinct shp_shipment_id values
If so, you do not need RANK() or ROW_NUMBER(); you can just use TOP. Your query would look like this:
select TOP 3 sit_site_id
, count(distinct shp_shipment_id) shp_shipment_id_cnt
from WHOWNER.BT_SHP_SHIPMENTS
group by sit_site_id
order by shp_shipment_id_cnt DESC;

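Your query returns nothing because it partitions by sit_site_id after grouping by sit_site_id: every partition contains exactly one row, so RANK() is always 1 and never equals 3. If you want to use QUALIFY for this, drop the PARTITION BY and number the rows across all sites: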
select sit_site_id, count(distinct shp_shipment_id)
from WHOWNER.BT_SHP_SHIPMENTS
group by sit_site_id
qualify row_number() over (order by count(distinct shp_shipment_id) desc) <= 3;
However, I would recommend TOP as in @Clark's answer.
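The difference between the two window definitions can be checked on a toy data set. This is only a sketch: it uses Python's sqlite3 module (SQLite 3.25 or newer) as a stand-in for Teradata, the shipments table and its rows are hypothetical, and because SQLite has no QUALIFY the filter moves to an outer query.

```python
import sqlite3

# Hypothetical sample data standing in for WHOWNER.BT_SHP_SHIPMENTS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shipments (sit_site_id TEXT, shp_shipment_id INTEGER)")
conn.executemany("INSERT INTO shipments VALUES (?, ?)", [
    ("MLA", 1), ("MLA", 2), ("MLA", 3), ("MLA", 4),
    ("MLB", 5), ("MLB", 6), ("MLB", 7),
    ("MLM", 8), ("MLM", 9),
    ("MLC", 10),
])

# One row per site after grouping; both queries below rank these rows.
counts_cte = """
    WITH counts AS (
        SELECT sit_site_id, COUNT(DISTINCT shp_shipment_id) AS cnt
        FROM shipments
        GROUP BY sit_site_id
    )
"""

# PARTITION BY the grouped column: each partition holds a single row,
# so the rank is 1 everywhere and a filter on rank = 3 matches nothing.
per_site_rank = conn.execute(counts_cte + """
    SELECT sit_site_id, RANK() OVER (PARTITION BY sit_site_id ORDER BY cnt) AS rnk
    FROM counts
""").fetchall()

# No PARTITION BY: sites are numbered against each other, highest count first.
top3 = conn.execute(counts_cte + """
    SELECT sit_site_id, cnt
    FROM (SELECT sit_site_id, cnt,
                 ROW_NUMBER() OVER (ORDER BY cnt DESC) AS rn
          FROM counts) AS numbered
    WHERE rn <= 3
    ORDER BY cnt DESC
""").fetchall()

print(per_site_rank)
print(top3)
```

In Teradata itself the outer query is unnecessary, since QUALIFY filters on the window function directly.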

Related

Unique on one column and return all data with MariaDB [duplicate]

My table contains the columns id, name, value, and dealer. I want to retrieve the row with the lowest value for each dealer. I've been trying to mess around with MIN() and GROUP BY, but still have no solution.
Solution1:
SELECT t1.* FROM your_table t1
JOIN (
SELECT MIN(value) AS min_value, dealer
FROM your_table
GROUP BY dealer
) AS t2 ON t1.dealer = t2.dealer AND t1.value = t2.min_value
Solution2 (recommended, much faster than solution1):
SELECT t1.* FROM your_table t1
LEFT JOIN your_table t2
ON t1.dealer = t2.dealer AND t1.value > t2.value
WHERE t2.value IS NULL
This problem is very well known, so there is a special page for it in MySQL's manual.
Check this: Rows Holding the Group-wise Maximum/Minimum of a Certain Column
select id,name,MIN(value) as pkvalue,dealer from TABLENAME
group by id,name,dealer;
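Here you group all rows by id, name, and dealer, and get the minimum value as pkvalue. Because id is part of the GROUP BY, each group is a single row when id is unique, so this does not reduce the result to one row per dealer.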
SELECT MIN(value),dealer FROM table_name GROUP BY dealer;
First you need to resolve the lowest value for each dealer, and then retrieve rows having that value for a particular dealer. I would do this that way:
SELECT a.*
FROM your_table AS a
JOIN (SELECT dealer,
Min(value) AS m
FROM your_table
GROUP BY dealer) AS b
ON ( a.dealer= b.dealer
AND a.value = b.m )
Try following:
SELECT dealer, MIN(value) as "Lowest value"
FROM your_table
GROUP BY dealer;
select id, name, value, dealer from yourtable where (dealer, value)
in (select dealer, min(value) from yourtable group by dealer)
These answers seem to miss the edge case where a dealer has several rows with the same minimum value and you only want one row returned.
If you only want one row for each dealer, you can use ROW_NUMBER(): partition the table by dealer, then order each partition by value and id. This assumes that, among tied rows, you want the one with the smallest id.
SELECT ord_tbl.id,
ord_tbl.name,
ord_tbl.value,
ord_tbl.dealer
FROM (SELECT your_table.*,
-- number the rows within each dealer, lowest value first; id breaks ties
ROW_NUMBER() OVER (PARTITION BY dealer ORDER BY value ASC, id ASC) AS rn
FROM your_table
) AS ord_tbl
WHERE ord_tbl.rn = 1;
Be careful, though: make sure dealer, value, and id are indexed. Otherwise this does a full table scan and sort, which can get pretty slow on large tables.
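The tie edge case is easy to reproduce. The following is a minimal sketch that uses Python's sqlite3 module (SQLite 3.25 or newer) in place of MariaDB, with made-up rows in which one dealer has two rows sharing the lowest value; the data and dealer names are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (id INTEGER PRIMARY KEY, name TEXT, value INTEGER, dealer TEXT)"
)
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", [
    (1, "a", 10, "north"),
    (2, "b", 10, "north"),  # ties with id 1 for the lowest value
    (3, "c", 25, "north"),
    (4, "d", 7, "south"),
    (5, "e", 9, "south"),
])

# Join against the per-dealer minimum: every tied row is returned.
join_rows = conn.execute("""
    SELECT t1.id, t1.dealer, t1.value
    FROM your_table t1
    JOIN (SELECT dealer, MIN(value) AS min_value
          FROM your_table
          GROUP BY dealer) AS t2
      ON t1.dealer = t2.dealer AND t1.value = t2.min_value
    ORDER BY t1.id
""").fetchall()

# ROW_NUMBER with id as tie-breaker: exactly one row per dealer.
single_rows = conn.execute("""
    SELECT id, dealer, value
    FROM (SELECT your_table.*,
                 ROW_NUMBER() OVER (PARTITION BY dealer
                                    ORDER BY value ASC, id ASC) AS rn
          FROM your_table) AS ord_tbl
    WHERE rn = 1
    ORDER BY id
""").fetchall()

print(join_rows)    # both tied rows for dealer "north" appear
print(single_rows)  # only the tied row with the smallest id survives
```

Which behaviour is right depends on whether ties should be reported or collapsed; the join keeps them, the ROW_NUMBER() version picks one deterministically.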

SQLite Nested Query for maximum

I'm trying to use DB Browser for SQLite to construct a nested query to determine the SECOND highest priced item purchased by the top 10 spenders. The query I have to pick out the top 10 spenders is:
SELECT user_id, max(item_total), SUM (item_total + shipping_cost -
discounts_applied) AS total_spent
FROM orders AS o
WHERE payment_reject = "FALSE"
GROUP BY user_id
ORDER BY total_spent DESC
LIMIT 10
This gives the user_id, most expensive item they purchased (not counting shipping or discounts) as well as the total amount they spent on the site.
I was trying to use a nested query to generate a list of the second most expensive items they purchased, but keep getting errors. I've tried
SELECT user_id, MAX(item_total) AS second_highest
FROM orders
WHERE item_total < (SELECT user_id, SUM (item_total + shipping_cost -
discounts_applied) AS total_spent
FROM orders
WHERE payment_reject = "FALSE"
GROUP BY user_id
ORDER BY total_spent DESC
LIMIT 10)
group by user_id
I keep getting a row value misused error. Does anyone have pointers on this nested query or know of another way to find the second highest item purchased from within the group found in the first query?
Thanks!
(Note: The following assumes you're using SQLite 3.25 or newer, since it uses window functions.)
This will return the second-largest item_total for each user_id without duplicates:
WITH ranked AS
(SELECT DISTINCT user_id, item_total
, dense_rank() OVER (PARTITION BY user_id ORDER BY item_total DESC) AS ranking
FROM orders)
SELECT user_id, item_total FROM ranked WHERE ranking = 2;
You can combine it with your original query with something like:
WITH ranked AS
(SELECT DISTINCT user_id, item_total
, dense_rank() OVER (PARTITION BY user_id ORDER BY item_total DESC) AS ranking
FROM orders),
totals AS
(SELECT user_id
, sum (item_total + shipping_cost - discounts_applied) AS total_spent
FROM orders
WHERE payment_reject = 0
GROUP BY user_id)
SELECT t.user_id, r.item_total, t.total_spent
FROM totals AS t
JOIN ranked AS r ON t.user_id = r.user_id
WHERE r.ranking = 2
ORDER BY t.total_spent DESC, t.user_id
LIMIT 10;
Okay, after fixing your table definition to better reflect the values being stored in it and the stated problem, and fixing the data and adding to it so you can actually get results, plus an optional but useful index like so:
CREATE TABLE orders (order_id INTEGER PRIMARY KEY
, user_id INTEGER
, item_total REAL
, shipping_cost NUMERIC
, discounts_applied NUMERIC
, payment_reject INTEGER);
INSERT INTO orders(user_id, item_total, shipping_cost, discounts_applied
, payment_reject) VALUES (9852,60.69,10,0,FALSE),
(2784,123.91,15,0,FALSE), (1619,119.75,15,0,FALSE), (9725,151.92,15,0,FALSE),
(8892,153.27,15,0,FALSE), (7105,156.86,25,0,FALSE), (4345,136.09,15,0,FALSE),
(7779,134.93,15,0,FALSE), (3874,157.27,15,0,FALSE), (5102,108.3,10,0,FALSE),
(3098,59.97,10,0,FALSE), (6584,124.92,15,0,FALSE), (5136,111.06,10,0,FALSE),
(1869,113.44,20,0,FALSE), (3830,129.63,15,0,FALSE), (9852,70.69,10,0,FALSE),
(2784,134.91,15,0,FALSE), (1619,129.75,15,0,FALSE), (9725,161.92,15,0,FALSE),
(8892,163.27,15,0,FALSE), (7105,166.86,25,0,FALSE), (4345,146.09,15,0,FALSE),
(7779,144.93,15,0,FALSE), (3874,167.27,15,0,FALSE), (5102,118.3,10,0,FALSE),
(3098,69.97,10,0,FALSE), (6584,134.92,15,0,FALSE), (5136,121.06,10,0,FALSE),
(1869,123.44,20,0,FALSE), (3830,139.63,15,0,FALSE);
CREATE INDEX orders_idx_1 ON orders(user_id, item_total DESC);
the above query will give:
user_id item_total total_spent
---------- ---------- -----------
7105 156.86 373.72
3874 157.27 354.54
8892 153.27 346.54
9725 151.92 343.84
4345 136.09 312.18
7779 134.93 309.86
3830 129.63 299.26
6584 124.92 289.84
2784 123.91 288.82
1619 119.75 279.5
(If you get a syntax error from the query now, it's because you're using an old version of sqlite that doesn't support window functions.)
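For an older SQLite without window functions, one possible workaround is a correlated subquery: take the largest item_total that is strictly below each user's maximum. The sketch below is an assumption-laden illustration, not part of the answer above. It runs through Python's sqlite3 module on a cut-down orders table, reusing two users from the sample data and adding a hypothetical duplicate price and a single-purchase user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, item_total REAL)"
)
conn.executemany("INSERT INTO orders(user_id, item_total) VALUES (?, ?)", [
    (7105, 156.86), (7105, 166.86),
    (3874, 157.27), (3874, 167.27),
    (3874, 167.27),   # hypothetical duplicate of the top price
    (9852, 60.69),    # hypothetical user with a single purchase
])

# For each user, keep only rows below that user's maximum, then take the
# maximum of what is left. Duplicates of the top price are skipped, which
# matches the DENSE_RANK() = 2 behaviour; users with a single distinct
# price have no second-highest item and drop out.
second_highest = conn.execute("""
    SELECT o.user_id, MAX(o.item_total) AS second_highest
    FROM orders AS o
    WHERE o.item_total < (SELECT MAX(i.item_total)
                          FROM orders AS i
                          WHERE i.user_id = o.user_id)
    GROUP BY o.user_id
    ORDER BY o.user_id
""").fetchall()

print(second_highest)
```

The index on (user_id, item_total DESC) suggested above should also help the correlated MAX() lookup here.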

Average Analytics Function On GA360 Visit ID for LatencyTracking

Looking to get the average latencyTracking for a visitid out of our GA 360 export.
I set up the following query, but I get the error below, and I'm not sure why, since these are all aggregate functions: SELECT list expression references hits.latencyTracking.serverResponseTime which is neither grouped nor aggregated at [3:5]
select
TIMESTAMP_SECONDS(visitStartTime) as visitStartTime,
AVG(hits.latencyTracking.serverResponseTime) OVER (PARTITION BY visitid) as avgServerResponseTime,
AVG(hits.latencyTracking.serverConnectionTime) OVER (PARTITION BY visitid) as avgServerConnectionTime,
AVG(hits.latencyTracking.domInteractiveTime) OVER (PARTITION BY visitid) as avgdomInteractiveTime,
AVG(hits.latencyTracking.pageLoadTime) OVER (PARTITION BY visitid) as avgpageLoadTime
from `xxx.xxx.ga_sessions_2018*`,
UNNEST(hits) AS hits
where hits.latencyTracking.serverResponseTime is not null
group by visitStartTime
The way your query is written, AVG() is not a plain aggregate function but an analytic (window) function, because of the OVER clause.
To make it work, you can remove OVER() so that AVG() really becomes an aggregate function that corresponds to the GROUP BY:
select
TIMESTAMP_SECONDS(visitStartTime) as visitStartTime,
AVG(hits.latencyTracking.serverResponseTime) as avgServerResponseTime,
AVG(hits.latencyTracking.serverConnectionTime) as avgServerConnectionTime,
AVG(hits.latencyTracking.domInteractiveTime) as avgdomInteractiveTime,
AVG(hits.latencyTracking.pageLoadTime) as avgpageLoadTime
from `xxx.xxx.ga_sessions_2018*`,
UNNEST(hits) AS hits
where hits.latencyTracking.serverResponseTime is not null
group by visitStartTime
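The distinction between the analytic and the aggregate form of AVG() can be seen on a tiny flattened table. This sketch uses Python's sqlite3 module (SQLite 3.25 or newer) rather than BigQuery, and the hits table with its three rows is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (visitid INTEGER, serverResponseTime INTEGER)")
conn.executemany("INSERT INTO hits VALUES (?, ?)", [(1, 100), (1, 300), (2, 50)])

# Analytic form: OVER keeps every input row and attaches the partition average.
windowed = conn.execute("""
    SELECT visitid, AVG(serverResponseTime) OVER (PARTITION BY visitid) AS avg_rt
    FROM hits
    ORDER BY visitid
""").fetchall()

# Aggregate form: GROUP BY collapses each visitid to a single row.
grouped = conn.execute("""
    SELECT visitid, AVG(serverResponseTime) AS avg_rt
    FROM hits
    GROUP BY visitid
    ORDER BY visitid
""").fetchall()

print(windowed)  # one row per hit
print(grouped)   # one row per visit
```

Because the analytic form is evaluated per row, its argument is not considered aggregated by the GROUP BY, which is what the BigQuery error message is pointing at.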
Having windows and group by in conjunction can be confusing.
In your case it is not even necessary, neither is the flattening - you can write simple subqueries to get your numbers per session:
SELECT
TIMESTAMP_SECONDS(visitStartTime) AS visitStartTime,
(
SELECT AVG(latencyTracking.serverResponseTime)
FROM t.hits
WHERE latencyTracking.serverResponseTime IS NOT NULL) AS avgServerResponseTime,
(
SELECT AVG(latencyTracking.serverConnectionTime)
FROM t.hits
WHERE latencyTracking.serverConnectionTime IS NOT NULL) AS avgServerConnectionTime,
(
SELECT AVG(latencyTracking.domInteractiveTime)
FROM t.hits
WHERE latencyTracking.domInteractiveTime IS NOT NULL ) AS avgdomInteractiveTime,
(
SELECT AVG(latencyTracking.pageLoadTime)
FROM t.hits
WHERE latencyTracking.pageLoadTime IS NOT NULL ) AS avgpageLoadTime
FROM `xxx.xxx.ga_sessions_2018*` AS t
It also doesn't involve grouping, which should make it faster.

Can't cast the result of LAG or LEAD into an Integer

I have been trying to do a calculation based on the result of either the LAG or LEAD functions.
Encapsulating the function in the INTEGER() casting function seems to cause an issue with the OVER function within and throws the following error:
Unrecognized Analytic Function: INT64 cannot be used with an OVER() clause
The following is the base code that works just fine, but when I add a function, it produces an error:
LEAD(hits.hitNumber, 1) OVER (PARTITION BY fullvisitorID, visitid, visitnumber ORDER BY hits.hitNumber DESC) as nextHit
The code that I was using to produce this error is as follows:
INTEGER(LEAD(hits.hitNumber, 1)) OVER (PARTITION BY fullvisitorID, visitid ORDER BY hits.hitNumber DESC) as nextHit
The following doesn't seem to work either:
INTEGER(LEAD(hits.hitNumber, 1) OVER (PARTITION BY fullvisitorID, visitid ORDER BY hits.hitNumber DESC))as nextHit
Encountered " "OVER" "OVER "" at line 8, column 36. Was expecting: ")"
Do I really need to make this a sub-query to make this work or is there a different solution?
2 possible solutions:
As Jordan says, bring the INTEGER() cast inside LEAD():
SELECT LEAD(INTEGER(hits.hitNumber), 1) OVER (PARTITION BY fullvisitorID, visitid, visitnumber ORDER BY hits.hitNumber DESC) as nextHit
FROM [dataset.ga_sessions_20140107]
Or as in your suggestion, with a sub-query:
SELECT INTEGER(nextHit) FROM (
SELECT LEAD(hits.hitNumber, 1) OVER (PARTITION BY fullvisitorID, visitid, visitnumber ORDER BY hits.hitNumber DESC) as nextHit
FROM [dataset.ga_sessions_20140107]
)

Sum of two columns in mysql

I am working on a Crystal Reports report and generating a performance-based report. This is my query:
select mguser.userID, mguser.userName as Loger,date_format(trschedule.insDate,'%Y-%m-%d') as _Date,
count(distinct trschedule.scheduleID) as Schedule_Count, count(*) segment_count
FROM
mguser
I want the sum of these two columns: count(distinct trschedule.scheduleID) as Schedule_Count and count(*) segment_count.
How can I do this in MySQL?
select Loger,_Date, (Schedule_Count + segment_count) sum from
(
select mguser.userID, mguser.userName as Loger,
date_format(trschedule.insDate,'%Y-%m-%d') as _Date,
count(distinct trschedule.scheduleID) as Schedule_Count,
count(*) segment_count
FROM
mguser
)d
