I'm trying to create a Metabase query in which I calculate the percentage of each feedback label out of the total, grouped by start battery.
I count all rows by start battery.
I then join that count with the original data again, so each row has the total for that start battery.
I then count again by start battery and feedback label.
SQL
SELECT
"source"."START_BATTERY" AS "START_BATTERY",
"source"."count" AS "count",
"Ride Dev"."FEEDBACK_TAGS_STRING" AS "FEEDBACK_TAGS_STRING",
COUNT(*) AS "count"
FROM
(SELECT
"MART"."CORE"."RIDE_DEV"."START_BATTERY" AS "START_BATTERY",
COUNT(*) AS "count"
FROM
"MART"."CORE"."RIDE_DEV"
WHERE
((NOT (LOWER("MART"."CORE"."RIDE_DEV"."FEEDBACK_TAGS_STRING") LIKE '%,%')
OR "MART"."CORE"."RIDE_DEV"."FEEDBACK_TAGS_STRING" IS NULL)
AND "MART"."CORE"."RIDE_DEV"."START_BATTERY" <= 100
AND "MART"."CORE"."RIDE_DEV"."START_BATTERY" > 20
AND "MART"."CORE"."RIDE_DEV"."FEEDBACK_TAGS" IS NOT NULL
AND "MART"."CORE"."RIDE_DEV"."VEHICLE_MODEL" = 'Voiager 3X (NB Max)')
GROUP BY
"MART"."CORE"."RIDE_DEV"."START_BATTERY"
ORDER BY
"MART"."CORE"."RIDE_DEV"."START_BATTERY" ASC) "source"
LEFT JOIN
"MART"."CORE"."RIDE_DEV" "Ride Dev" ON "source"."START_BATTERY" = "Ride Dev"."START_BATTERY"
WHERE
("Ride Dev"."FEEDBACK_TAGS_STRING" IS NOT NULL
AND ("Ride Dev"."FEEDBACK_TAGS_STRING" <> '' OR "Ride Dev"."FEEDBACK_TAGS_STRING" IS NULL)
AND (NOT (LOWER("Ride Dev"."FEEDBACK_TAGS_STRING") LIKE '%,%')
OR "Ride Dev"."FEEDBACK_TAGS_STRING" IS NULL))
GROUP BY
"source"."START_BATTERY", "source"."count",
"Ride Dev"."FEEDBACK_TAGS_STRING" ORDER BY "source"."START_BATTERY" ASC,
"source"."count" ASC, "Ride Dev"."FEEDBACK_TAGS_STRING" ASC
Result
Start Battery | Count | Ride Dev → Feedback Tags String | Count
--------------+-------+---------------------------------+------
21            | 6598  | couldnt_end_ride                | 916
21            | 6598  | motor_didnt_start               | 2
21            | 6598  | other                           | 1
21            | 6598  | scooter_didnt_start             | 1569
21            | 6598  | slow                            | 3667
21            | 6598  | stopped_during_ride             | 3150
22            | 6815  | couldnt_end_ride                | 866
22            | 6815  | other                           | 2
22            | 6815  | scooter_didnt_start             | 1617
22            | 6815  | screws_missing                  | 1
Is it possible to name the counted columns so they are easier to tell apart?
I'd like to create a calculated field with count / Ride Dev → Count. I click Summarize and add the custom field [Ride Dev.coun] / [Count], but it doesn't recognize the first field. How can I reference the first field?
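For a native SQL question, one way to keep the two counts apart is to give each its own alias and compute the ratio in the same SELECT. The following is only a minimal sketch of the count / join / count-again steps described above, run against SQLite from Python. The table ride_dev, the column feedback_tag, the sample rows and the aliases battery_total, label_count and label_pct are all hypothetical, and the filters from the original query are left out.

```python
import sqlite3

# Hypothetical miniature of the ride table; the rows are made up for illustration.
rides = [
    (21, "slow"), (21, "slow"), (21, "slow"), (21, "couldnt_end_ride"),
    (22, "slow"), (22, "other"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ride_dev (start_battery INTEGER, feedback_tag TEXT)")
conn.executemany("INSERT INTO ride_dev VALUES (?, ?)", rides)

# Step 1: total per start battery (aliased battery_total).
# Step 2: join the totals back to the rows.
# Step 3: count per battery and label (aliased label_count) and divide.
query = """
SELECT
    t.start_battery,
    r.feedback_tag,
    t.battery_total,
    COUNT(*) AS label_count,
    100.0 * COUNT(*) / t.battery_total AS label_pct
FROM (
    SELECT start_battery, COUNT(*) AS battery_total
    FROM ride_dev
    GROUP BY start_battery
) AS t
JOIN ride_dev AS r ON r.start_battery = t.start_battery
GROUP BY t.start_battery, t.battery_total, r.feedback_tag
ORDER BY t.start_battery, r.feedback_tag
"""
percent_rows = conn.execute(query).fetchall()
```

Because the two counts no longer share the name "count", either one can be referenced unambiguously in later expressions.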
I have an OLAP query in SQL, and the requirement is to fetch the same query result in ADX (Azure Data Explorer) using KQL.
I'm new to KQL and am trying to convert the SQL into KQL, but I can't reproduce the row_number() over (partition by ...) part in KQL.
I tried the article below from MS, but had no luck using the EXPLAIN keyword.
https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/sqlcheatsheet
Below is the SQL; I need the same in KQL. Can you please help me with this?
Select logdate, adlspathL, count as totalRequests, repeated as repeatedInLast7Days
from (
    select logdate, adlspathL, count, SUM(rnk1) over(partition by adlspathL) repeated, rnk2
    from (
        Select logdate, adlspathL, count, rnk,
            rank() over(partition by adlspathL order by adlspathL) rnk1,
            row_number() over(partition by adlspathL order by count desc) rnk2
        from (
            Select logdate, adlspathL, count,
                rank() over(partition by logdate order by count desc) rnk
            from (
                Select logdate, adlspath as adlspathL, SUM(cnt) count
                from `LogAnalyticsADLS`.`LogAnalyticsProdADLSAgg_OpsName`
                where logdate between '2021-08-23' and '2021-08-29'
                group by logdate, adlspathL
            )a
        )b
        where rnk between 1 and 10
        order by logdate, rnk
    )a
)b
where rnk2 = 1
Thanks,
Brahma
Here is a KQL solution.
Currently there isn't a built-in function for rank, only for dense rank (row_rank); however, rank functionality can be achieved with the scan operator.
Retrieving the max value plus additional values from the same row can easily be achieved using the arg_max function.
// Generation of a data sample. Not a part of the solution
let loganalyticsprodadlsagg_opsname = range i from 1 to 1000 step 1 | project logdate = bin(datetime(2021-09-01) - 10d*rand(), 1d), adlspath = strcat("adlspath_", tostring(toint(rand(15)))), cnt = toint(rand(100));
// Solution starts here.
loganalyticsprodadlsagg_opsname
| where logdate between (datetime('2021-08-23') .. datetime('2021-08-29'))
| summarize cnt = sum(cnt) by logdate, adlspath
| partition by logdate
(
order by cnt desc
| scan declare(rank:int = 1, rn:int = 0)
with
(
step s : True => rn = s.rn + 1, rank = iff(cnt == s.cnt, s.rank, s.rn+1);
)
| where rank <= 10
)
| summarize totalrequests = arg_max(cnt, logdate), repeatedInLast7Days = count() by adlspath
adlspath    | totalrequests | logdate              | repeatedInLast7Days
------------+---------------+----------------------+--------------------
adlspath_7  | 644           | 2021-08-23T00:00:00Z | 4
adlspath_8  | 472           | 2021-08-23T00:00:00Z | 6
adlspath_1  | 760           | 2021-08-24T00:00:00Z | 6
adlspath_12 | 400           | 2021-08-24T00:00:00Z | 5
adlspath_5  | 599           | 2021-08-24T00:00:00Z | 3
adlspath_11 | 410           | 2021-08-25T00:00:00Z | 4
adlspath_14 | 469           | 2021-08-25T00:00:00Z | 5
adlspath_13 | 565           | 2021-08-25T00:00:00Z | 6
adlspath_9  | 472           | 2021-08-25T00:00:00Z | 2
adlspath_2  | 514           | 2021-08-26T00:00:00Z | 6
adlspath_10 | 391           | 2021-08-26T00:00:00Z | 4
adlspath_3  | 414           | 2021-08-27T00:00:00Z | 4
adlspath_6  | 594           | 2021-08-28T00:00:00Z | 3
adlspath_0  | 499           | 2021-08-28T00:00:00Z | 6
adlspath_4  | 552           | 2021-08-29T00:00:00Z | 6
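To make the scan step in the query easier to follow, here is a rough Python sketch of the same bookkeeping: a running row number plus a rank that is carried over on ties and otherwise jumps to the current row number. The function name rank_rows and the sample counts are made up for illustration; this mirrors the logic only and is not how Kusto executes scan.

```python
def rank_rows(counts):
    """Return (row_number, rank) pairs for counts sorted in descending order."""
    ordered = sorted(counts, reverse=True)
    result = []
    rn, rank, prev = 0, 1, None
    for cnt in ordered:
        # A tie keeps the previous rank; otherwise the rank becomes the new row number.
        rank = rank if cnt == prev else rn + 1
        rn += 1
        result.append((rn, rank))
        prev = cnt
    return result


# Two rows share the count 40, so they share rank 2 and the next rank is 4.
ranks = rank_rows([40, 50, 30, 40])
```

A dense rank would give the last row rank 3 instead of 4, which is the difference the answer refers to.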
I need to get the next row from an ORDER BY query.
I have 2 columns, ID (primary key) and Age (float), in a table T, and I need something like the following:
SELECT ID FROM T WHERE !> (inputted ID) + 1 rowID/Next row <! ORDER BY Age (then primary key, but I suspect if the Age values are the same SQLite would default to order by primary key anyway) LIMIT 1
Essentially, it would select the next row after the inputted ID in the ordered table; it's the "next row / rowID + 1" part that I am not sure how to get.
As suggested here is a data set as an example
https://dbfiddle.uk?rdbms=sqlite_3.27&fiddle=19685ac20cc42041a59d318a01a2010f
ID Age
1 12.2
2 36.8
3 22.5
4 41
5 16.7
I am attempting to get the following row from the list ordered by age, given a specific ID
ID Age
1 12.2
5 16.7
3 22.5
2 36.8
4 41
Something similar to
SELECT ID FROM OrderedInfo WHERE ID = 5 ORDER BY Age ASC LIMIT 1 OFFSET 1;
My expected result would be '3' from the example data above
I have expanded the data set to include duplicate entries, as I didn't explicitly state it could have such data. As such, forpas's answer works for the first example with no duplicate entries. Thanks for your help.
https://dbfiddle.uk?rdbms=sqlite_3.27&fiddle=f13d7f5a44ba414784547d9bbdf4997e
Use a subquery for the ID that you want in the WHERE clause:
SELECT *
FROM OrderedInfo
WHERE Age > (SELECT Age FROM OrderedInfo WHERE ID = 5)
ORDER BY Age LIMIT 1;
See the demo.
If there are duplicate values in the column Age use a CTE that returns the row that you want and join it to the table so that you expand the conditions:
WITH cte AS (SELECT ID, Age FROM OrderedInfo WHERE ID = 5)
SELECT o.*
FROM OrderedInfo o INNER JOIN cte c
ON o.Age > c.Age OR (o.Age = c.Age AND o.ID > c.ID)
ORDER BY o.Age, o.ID LIMIT 1;
See the demo.
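As a quick check of the second query, here is a small sketch that runs it from Python's sqlite3 module. The first five rows are the example data from the question; the sixth row (ID 6, Age 16.7) is a made-up duplicate added only to exercise the tie-breaking, and the helper name next_id is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OrderedInfo (ID INTEGER PRIMARY KEY, Age REAL)")
conn.executemany(
    "INSERT INTO OrderedInfo VALUES (?, ?)",
    # Rows 1-5 come from the question; row 6 is an invented duplicate Age.
    [(1, 12.2), (2, 36.8), (3, 22.5), (4, 41), (5, 16.7), (6, 16.7)],
)

# Same shape as the CTE query above, with the ID passed as a parameter.
NEXT_ROW_SQL = """
WITH cte AS (SELECT ID, Age FROM OrderedInfo WHERE ID = ?)
SELECT o.ID
FROM OrderedInfo o INNER JOIN cte c
  ON o.Age > c.Age OR (o.Age = c.Age AND o.ID > c.ID)
ORDER BY o.Age, o.ID LIMIT 1
"""


def next_id(current_id):
    """Return the ID that follows current_id in (Age, ID) order, or None."""
    row = conn.execute(NEXT_ROW_SQL, (current_id,)).fetchone()
    return row[0] if row else None
```

With this data, next_id(5) gives 6 (same Age, higher ID), next_id(6) gives 3, and next_id(4) gives None because ID 4 is last in the ordering.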
It's a question I got this afternoon:
There is a table containing the ID, emp_Name, emp_mailid and Salary of employees; get the names of the employees with the highest salary, in Oracle.
Here is my table:
id emp_name emp_mailid salary
2 dinesh dinesh#gmail.com 5000
3 ganesh ganesh#gmail.com 6000
6 ramesh ramesh#gmail.com 4500
10 suresh suresh#gmail.com 10000
11 rajesh rajesh#gmail.com 15000
15 kamesh kamesh#gmail.com 16000
16 kamalesh kamalesh#gmail.com 7800
19 neelash neelash#gmail.com 12563
20 rajan rajan#gmail.com 156231
22 vignesh vignesh#gmail.com 45220
30 rubesh rubesh#gmail.com 78000
31 john john#gmail.com 6522
And this is my query:
select *
from
(
select
salary
,dense_rank() over (order by salary desc) ranking
from test
)
where ranking = 1
When I execute this, it shows the maximum salary as 78000, but that's wrong: the maximum salary is 156231.
Can you suggest a better query?
If it shows you the maximum as 78000, it's probably because the salary column is varchar and not number.
Try this:
select *
from
(
select
salary
,dense_rank() over (order by to_number(salary) desc) ranking
from test
)
where ranking = 1
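The effect of a text-typed salary column can be reproduced outside Oracle. The sketch below uses SQLite through Python, so it is an approximation under stated assumptions: CAST(salary AS INTEGER) stands in for Oracle's to_number, only a few of the question's rows are loaded, the helper name top_earners is hypothetical, and the bundled SQLite must be 3.25 or newer for window functions. It also selects emp_name, since the question asks for names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# salary is deliberately stored as TEXT to mimic a varchar column.
conn.execute("CREATE TABLE test (emp_name TEXT, salary TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [
        ("dinesh", "5000"),
        ("suresh", "10000"),
        ("rajan", "156231"),
        ("vignesh", "45220"),
        ("rubesh", "78000"),
    ],
)


def top_earners(order_expr):
    """Return the rank-1 rows when ranking by the given ORDER BY expression."""
    sql = f"""
    SELECT emp_name, salary FROM (
        SELECT emp_name, salary,
               dense_rank() OVER (ORDER BY {order_expr} DESC) AS ranking
        FROM test
    ) WHERE ranking = 1
    """
    return conn.execute(sql).fetchall()


as_text = top_earners("salary")                     # character-by-character comparison
as_number = top_earners("CAST(salary AS INTEGER)")  # numeric comparison
```

Ranked as text, '78000' sorts above '156231' because '7' is greater than '1'; ranked as a number, 156231 comes first.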
Here is my situation: I have a table with a few thousand entries. I took one column and counted the occurrences of each distinct value.
SELECT mycolumn, COUNT(*)
FROM mytable
WHERE myid = 6867
GROUP BY mycolumn
ORDER BY COUNT(*) DESC
Outputs:
6 885
1 715
4 562
5 557
2 232
3 181
I get the desired results. Now all I need is to add a bottom row with a sum of all counted entries.
Typically you would do this in some reporting tool, like SSRS, not SQL.
To do it in SQL, add a UNION statement:
UNION ALL
SELECT 0, COUNT(*)
FROM mytable
WHERE myid = 6867;
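One detail to watch is where the ORDER BY ends up once the two SELECTs are combined, since the total should stay at the bottom. Here is a minimal sketch of one way to arrange that, run against SQLite from Python. The sample rows, the helper column is_total and the use of NULL (rather than 0) as the label of the total row are illustrative choices, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (myid INTEGER, mycolumn INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    # Made-up rows; the last one belongs to a different myid and must be ignored.
    [(6867, 6), (6867, 6), (6867, 6), (6867, 1), (6867, 1), (6867, 4), (1, 6)],
)

# is_total sorts the detail rows (0) ahead of the total row (1);
# within the detail rows the counts are sorted in descending order.
query = """
SELECT mycolumn, cnt
FROM (
    SELECT mycolumn, COUNT(*) AS cnt, 0 AS is_total
    FROM mytable
    WHERE myid = 6867
    GROUP BY mycolumn
    UNION ALL
    SELECT NULL, COUNT(*), 1
    FROM mytable
    WHERE myid = 6867
)
ORDER BY is_total, cnt DESC
"""
rows_with_total = conn.execute(query).fetchall()
```

The last tuple in rows_with_total is the total row, with None in place of a mycolumn value.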
I have a sample table with the following values
SNO | Mon
-----+-------
100 | 1
101 | 1
102 | 1
100 | 2
101 | 2
102 | 2
100 | 3
101 | 3
Now I need a query to count the total number of SNOs that are present in all 3 months.
The result should be 2, as 100 and 101 are in months 1, 2 and 3, whereas 102 is only present in months 1 and 2.
Thanks,
RK
This query should, in theory, work.
SELECT
tmpTbl.sNo
FROM
tmpTbl
GROUP BY
tmpTbl.sNo
HAVING
Count(tmpTbl.monNo) = (SELECT Count(*) FROM (SELECT tmpTbl.monNo FROM tmpTbl GROUP BY tmpTbl.monNo));
The result would be,
sNo
----
100
101
I have used two subqueries to get the result; both are used in the HAVING clause of the SQL. The first (innermost) subquery gets the unique months available in your table; the outer subquery then counts those unique months. So the overall query can be translated as "SELECT the serial number FROM the table HAVING the count of months equal to the number of unique months in the same table".
The reason I used a subquery instead of a fixed number is that the query will still apply when the number of months increases. Hope this helps!
EDIT
Here is the Query for getting the count.
SELECT
Count(*) As simpleCount
FROM
(
SELECT
tmpTbl.sNo
FROM
tmpTbl
GROUP BY
tmpTbl.sNo
HAVING
Count(tmpTbl.monNo) = (SELECT Count(*) FROM (SELECT tmpTbl.monNo FROM tmpTbl GROUP BY tmpTbl.monNo))
);
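The counting query can be tried against the sample data with Python's sqlite3 module. The sketch below is a variant rather than the exact query above: it uses COUNT(DISTINCT monNo) on both sides of the comparison, which is an assumption made here so that repeated (sNo, monNo) rows would not inflate the count. The table and column names follow the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tmpTbl (sNo INTEGER, monNo INTEGER)")
conn.executemany(
    "INSERT INTO tmpTbl VALUES (?, ?)",
    # Sample rows from the question: 102 is missing from month 3.
    [(100, 1), (101, 1), (102, 1), (100, 2), (101, 2), (102, 2), (100, 3), (101, 3)],
)

# Keep only the sNo values whose number of distinct months equals
# the number of distinct months in the whole table, then count them.
query = """
SELECT COUNT(*) FROM (
    SELECT sNo
    FROM tmpTbl
    GROUP BY sNo
    HAVING COUNT(DISTINCT monNo) = (SELECT COUNT(DISTINCT monNo) FROM tmpTbl)
)
"""
simple_count = conn.execute(query).fetchone()[0]
```

For the sample data, simple_count is 2, matching the expected result for 100 and 101.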