Kusto query to show summary by percent of totals - azure-data-explorer

I am trying to get a summary of failures as a percentage of the total per vendor; see my query below. It works, but I want it to show Vendor1=0.5 and Vendor2=0.5 (50% failures each), not just Vendor1=1 (one row with failure == 0) and Vendor2=2 (two rows with failure == 0).
datatable (Vendor:string, failure:int)
["Vendor1",3,
"Vendor2",0,
"Vendor2",0,
"Vendor2", 7,
"Vendor1",0,
"Vendor2", 1]
| where failure == 0
| summarize Failures=count() by Vendor

Please check whether the following query solves your scenario:
datatable (Vendor:string, failure:int)
["Vendor1",3,
"Vendor2",0,
"Vendor2",0,
"Vendor2", 7,
"Vendor1",0,
"Vendor2", 1]
| summarize Failures=countif(failure == 0), Total=count() by Vendor
| extend Result=Failures*1.0/Total
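With the sample data above, this should return Result = 0.5 for both vendors: Vendor1 has one failure == 0 row out of two, and Vendor2 has two out of four.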

A slight variation of @Alexander Sloutsky's answer:
datatable (Vendor:string, failure:int)
["Vendor1",3,
"Vendor2",0,
"Vendor2",0,
"Vendor2", 7,
"Vendor1",0,
"Vendor2", 1]
| summarize Result = 1.0*countif(failure==0)/count() by Vendor

Related

How to make an Application Insights kusto query sort correctly on performanceBucket?

Is there a way to make an Application Insights kusto query sort on performanceBucket 'correctly', i.e. on bucket duration? When I summarize or sort using performanceBucket and don't specify a sort, the buckets come back in an arbitrary order (note, for example, that 1-3sec is not adjacent to 3-7sec). If I add a sort by performanceBucket, the ordering is alphanumeric.
I want it to be in this order (or the reverse of it)
<250ms
250ms-500ms
500ms-1sec
1sec-3sec
3sec-7sec
7sec-15sec
15sec-30sec
30sec-1min
1min-2min
You need to artificially add a column that indicates your preferred sorting order, then sort by it, and project it away:
// Synthetic data - don't copy this
let YourResult = datatable(perfBucket:string, count_:long) [
"250ms-500ms", 14000,
"7sec-15sec", 600,
"1sec-3sec", 9700
];
// This is the actual query
YourResult
| extend sortOrder =
case(perfBucket == "<250ms", 1,
perfBucket == "250ms-500ms", 2,
perfBucket == "500ms-1sec", 3,
perfBucket == "1sec-3sec", 4,
perfBucket == "3sec-7sec", 5,
perfBucket == "7sec-15sec", 6,
perfBucket == "15sec-30sec", 7,
perfBucket == "30sec-1min" ,8,
perfBucket == "1min-2min", 9,
10)
| order by sortOrder asc
| project-away sortOrder
Result:
perfBucket  | count_
------------|--------
250ms-500ms | 14000
1sec-3sec   | 9700
7sec-15sec  | 600
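If maintaining a long case() expression gets unwieldy, a variation on the same idea (a sketch using the bucket labels above, not verified against live Application Insights data) keeps the desired order in a dynamic array and sorts by array_index_of; labels missing from the array get -1 and sort first:
YourResult
| extend sortOrder = array_index_of(
    dynamic(["<250ms", "250ms-500ms", "500ms-1sec", "1sec-3sec", "3sec-7sec",
             "7sec-15sec", "15sec-30sec", "30sec-1min", "1min-2min"]),
    perfBucket)
| order by sortOrder asc
| project-away sortOrder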

Kusto for sliding window

I am new to Kusto Query Language. The requirement is to alert when a machine's status has been 1 continuously for 15 minutes.
I have two columns, column1: timestamp (one sample per second) and column2: machine status (values 1 and 0). How can I use a sliding window to find whether the machine has been 1 for 15 continuous minutes?
Currently I have used the bin function, but it does not seem to be the proper one:
summarize avg_value = avg(status) by customer, machine, bin(timestamp, 15m)
What would be a better solution for this?
Thanks in advance
Here is another option using time series functions:
let dt = 1s;
let n_bins = tolong(15m/dt);     // 900 one-second samples = 15 minutes
let coeffs = repeat(1, n_bins);  // all-ones filter = moving sum
// Synthetic data: two machines reporting a status sample every second
let T = view(M:string) {
    range Timestamp from datetime(2022-01-11) to datetime(2022-01-11 01:00) step dt
    | extend machine = M
    | extend status = iif(rand()<0.002, 0, 1)
};
union T("A"), T("B")
| make-series status=any(status) on Timestamp step dt by machine
| extend rolling_status = series_fir(status, coeffs, false)  // trailing moving sum over n_bins samples
| extend alerts = series_equals(rolling_status, n_bins)      // 1 where all samples in the window were 1
| project machine, Timestamp, alerts
| mv-expand Timestamp to typeof(datetime), alerts to typeof(bool)
| where alerts == 1
You can also do it using the scan operator.
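For completeness, here is a rough, untested sketch of a scan-based version (reusing the synthetic T view from the example above); a declared run_len counter counts consecutive status == 1 samples per machine and resets on a 0 or on a machine change:
let n_bins = tolong(15m/1s);   // 900 consecutive one-second samples = 15 minutes
union T("A"), T("B")
| sort by machine asc, Timestamp asc
| scan declare (run_len:long=0) with
(
    // s.run_len and s.machine refer to the previous row matched by this step
    step s: true => run_len = iif(status != 1, 0,
                                  iif(machine == s.machine, s.run_len + 1, 1));
)
| where run_len >= n_bins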
Here is one way to do it. The example uses generated data; hopefully it fits your scenario:
// The demo uses a 15-row (15-second) window for brevity; with 1s samples,
// a true 15-minute window would be 900 rows.
range x from datetime(2022-01-10 13:00:10) to datetime(2022-01-10 13:10:10) step 1s
| extend status = iif(rand()<0.01, 0, 1)
| extend current_sum = row_cumsum(status)    // running total of status
| extend prior_sum = prev(current_sum, 15)   // running total 15 rows back
// current_sum - prior_sum is the sum of the last 15 samples;
// it equals 15 only when all of them were 1
| extend should_alert = (current_sum-prior_sum == 15 and isnotempty(prior_sum))
If you have multiple machines you need to sort by machine first and restart the row_cumsum operation on each machine change:
let T = view(M:string) {
    range Timestamp from datetime(2022-01-10 13:00:10) to datetime(2022-01-10 13:10:10) step 1s
    | extend machine = M
    | extend status = iif(rand()<0.01, 0, 1)
};
union T("A"), T("B")
| sort by machine asc, Timestamp asc
| extend current_sum = row_cumsum(status, machine != prev(machine))   // restart per machine
| extend prior_sum = iif(machine == prev(machine, 15), prev(current_sum, 15), int(null))
| extend should_alert = (current_sum-prior_sum == 15 and isnotempty(prior_sum))
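To see the cumulative-sum trick on a toy input, here is a tiny sketch with a 3-row window instead of 15; the running total minus the running total n rows back is the sum of the last n samples, with no self-join needed:
datatable(status:long) [1, 1, 1, 0, 1, 1, 1]
| serialize
| extend current_sum = row_cumsum(status)       // running total
| extend prior_sum = prev(current_sum, 3)       // running total 3 rows back
| extend window_sum = current_sum - prior_sum   // sum of the last 3 samples
| extend all_ones = (window_sum == 3 and isnotempty(prior_sum))
Only the last row gets all_ones == true: it is the only row whose three most recent samples are all 1 and which has a full prior window.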

Kusto query to split pie chart in half as per results

I am trying to display a result in a pie chart using a Kusto (KQL) query. The goal is to display the pie chart half-and-half in case of failure, and in a single color when everything passes.
Basically, the log from a site contains pass and fail rows. When all rows are pass, the pie chart should display 100% in one color. If even a single row fails, it should display 50% in one color and 50% in another. The query below works when 1) all rows pass (full color) and 2) some rows pass and some fail, even a single one (half-and-half), 3) but when all rows fail, it displays one color instead of splitting the pie chart half-and-half.
QUERY I USED:
results
| where Name contains "jobqueues"
| where timestamp > ago(1h)
| extend PASS = (ErLvl > 2)
| extend FAIL = (ErLvl < 2)
| project PASS, FAIL
| extend status = iff(PASS==true,"PASS","FAIL")
| summarize count() by status
| extend display = iff(count_>0,1,0)
| summarize percentile(display, 50) by status
| render piechart
Please suggest what can be done to solve this problem. Thanks in advance.
Let's summarize your question:
There are only two outcomes of your query:
1. A pie chart showing 50% vs 50%
2. A pie chart showing 100%
From your description we learn that when:
- all rows are PASS, we plot pie chart 2;
- any row has FAIL, we plot pie chart 1.
Let's see how we can achieve this after these lines from your code:
| extend status = iff(PASS==true,"PASS","FAIL")
| summarize count() by status
We should have a table looking like so:
status | count_
-------|-------
PASS   | x
FAIL   | y
Looks like we need to perform some logic on this. You were originally plotting based on the operation result. My idea was to generate a table of pass = 1 and fail = 1 for the 50% vs 50% case, and another table of pass = 1 and fail = 0 for the 100% case.
So following that logic we need to perform the following mapping:
status | count_              status | count2
-------|-------              -------|-------
fail   | >0       maps to    fail   | 1
pass   | >0                  pass   | 1

status | count_              status | count2
-------|-------              -------|-------
fail   | >0       maps to    fail   | 1
pass   | =0                  pass   | 1

status | count_              status | count2
-------|-------              -------|-------
fail   | =0       maps to    fail   | 0
pass   | >0                  pass   | 1
Logical representation (given count_ >= 0):
if the fail count_ > 0, fail count2 = 1, else 0
pass count2 is always equal to 1
We only need to apply this logic to the FAIL row, but summarize doesn't guarantee a row for a status with no observations.
Guarantee summarize results:
| extend fail_count = iif(status == "FAIL", count_, 0)
| extend pass_count = iif(status == "PASS", count_, 0)
| project fail_count,pass_count
| summarize sum(fail_count), sum(pass_count)
Apply logic
| extend FAIL = iff(sum_fail_count > 0, 1, 0)
| extend PASS = 1
| project FAIL, PASS
Now our result looks like:
PASS | FAIL
-----|-------
1    | 1 or 0
In order to plot this as a pie chart we just need to transpose it, so that the columns PASS and FAIL become rows of a "Status" column.
We can use a simple pack() and mv-expand for this:
//transpose for rendering
| extend tmp = pack("FAIL",FAIL,"PASS",PASS)
| mv-expand kind=array tmp
| project Status=tostring(tmp[0]), Count=toint(tmp[1])
| render piechart
And that's it!
Final query:
results
| where Name contains "jobqueues"
| where timestamp > ago(1h)
| extend PASS = (ErLvl > 2)
| extend FAIL = (ErLvl < 2)
| project PASS, FAIL
| extend status = iff(PASS==true,"PASS","FAIL")
| summarize count() by status
//ensure results
| extend fail_count = iif(status == "FAIL", count_, 0)
| extend pass_count = iif(status == "PASS", count_, 0)
| project fail_count,pass_count
| summarize sum(fail_count), sum(pass_count)
//apply logic
| extend FAIL = iff(sum_fail_count > 0, 1, 0)
| extend PASS = 1
| project FAIL, PASS
//transpose for rendering
| extend Temp = pack("FAIL",FAIL,"PASS",PASS)
| mv-expand kind=array Temp
| project Status=tostring(Temp[0]), Count=toint(Temp[1])
| render piechart
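To sanity-check the fix in isolation, here is a minimal sketch that replaces everything above the //ensure results step with a hardcoded summarize output for the all-FAIL case (the count value 5 is made up):
datatable(status:string, count_:long) ["FAIL", 5]
| extend fail_count = iif(status == "FAIL", count_, 0)
| extend pass_count = iif(status == "PASS", count_, 0)
| project fail_count, pass_count
| summarize sum(fail_count), sum(pass_count)
| extend FAIL = iff(sum_fail_count > 0, 1, 0)
| extend PASS = 1
| project FAIL, PASS
| extend Temp = pack("FAIL", FAIL, "PASS", PASS)
| mv-expand kind=array Temp
| project Status=tostring(Temp[0]), Count=toint(Temp[1])
| render piechart
It produces one FAIL row and one PASS row, each with Count = 1, i.e. the 50/50 split that the original query failed to render.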

sqlite timecode calculations

I have two tables which I export from my video editing suite, one ("MediaPool") containing a row for each media file imported into the project, another ("Montage") for the portions of that file used in a specific edit. The fields that are associated between the two are MediaPool.FileName and Montage.Name, which are very similar (Filename only adds the file extension).
# MediaPool
Filename | Take
---------------------------------
somefile.mp4 | Getty
file2.mov | Associated Press
file3.mov | Associated Press
and
# Montage
Name | RecordIn | RecordOut
------------------------------------------
somefile | 01:01:01:01 | 01:01:20:19
somefile | 01:05:15:23 | 01:05:16:10
somefile | 01:25:19:10 | 01:30:16:04
file2 | 01:30:11:10 | 01:31:18:12
file2 | 01:40:15:22 | 01:42:21:17
The tables contain many more columns of course, but only the above is relevant.
Only the "MediaPool" table contains the field called "Take" which designates the file's copyright holder (long story). It can't be included in the "Montage" export. I needed to calculate the total duration of footage used from each source, by subtracting the RecordIn timecode from RecordOut and adding each result. This turned out to be more complicated than I expected, as I have some notions of programming but almost none when it comes to SQL (sqlite in my case).
I managed to come up with the following, which works fine and runs in under 4 seconds. However, from the little programming I've done, it seems overlong and very inelegant. Is there a shorter way to achieve this?
BTW, I'm using 25 fps timecode and I can't use LPAD in sqlite.
SELECT
Source,
SUBSTR('00' || CAST(DurationFrames/(60*60*25) AS TEXT), -2, 2) || ':' ||
SUBSTR('00' || CAST(DurationFrames%(60*60*25)/(60*25) AS TEXT), -2, 2) || ':' ||
SUBSTR('00' || CAST(DurationFrames%(60*60*25)%(60*25)/25 AS TEXT), -2, 2) || ':' ||
SUBSTR('00' || CAST(DurationFrames%(60*60*25)%(60*25)%25 AS TEXT), -2, 2)
AS DurationTC
FROM
(
SELECT
MediaPool.Take AS Source,
Montage.RecordIn,
Montage.RecordOut,
SUM(CAST(SUBSTR(Montage.RecordOut, 1, 2) AS INT)*3600*25 +
CAST(SUBSTR(Montage.RecordOut, 4, 2) AS INT)*60*25 +
CAST(SUBSTR(Montage.RecordOut, 7, 2) AS INT)*25 +
CAST(SUBSTR(Montage.RecordOut, 10, 2) AS INT) -
CAST(SUBSTR(Montage.RecordIn, 1, 2) AS INT)*3600*25 -
CAST(SUBSTR(Montage.RecordIn, 4, 2) AS INT)*60*25 -
CAST(SUBSTR(Montage.RecordIn, 7, 2) AS INT)*25 -
CAST(SUBSTR(Montage.RecordIn, 10, 2) AS INT))
AS DurationFrames
FROM
MediaPool
JOIN
Montage ON MediaPool.FileName LIKE '%' || Montage.Name || '%'
GROUP BY
Take
ORDER BY
Take
)
Here's a simplified query that produces the same results as yours on your test data. Mostly it uses printf() instead of a bunch of string concatenations and substr()s, and uses strftime() to convert the hours:minutes:seconds part of the timecode to total seconds:
WITH frames AS
(SELECT Take, sum((strftime('%s', substr(RecordOut,1,8))*25 + substr(RecordOut,10))
- (strftime('%s', substr(RecordIn,1,8))*25 + substr(RecordIn,10)))
AS DurationFrames
FROM MediaPool
JOIN Montage ON MediaPool.Filename LIKE Montage.Name || '.%'
GROUP BY Take)
SELECT Take AS Source
, printf("%02d:%02d:%02d:%02d", DurationFrames/(60*60*25),
DurationFrames%(60*60*25)/(60*25),
DurationFrames%(60*60*25)%(60*25)/25,
DurationFrames%(60*60*25)%(60*25)%25)
AS DurationTC
FROM frames
ORDER BY Take;
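A note on why the strftime('%s', ...) trick works: SQLite treats a bare 'HH:MM:SS' string as a time of day on the reference date 2000-01-01, so RecordIn and RecordOut get the same implicit date and the large epoch offsets cancel out when the two values are subtracted.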

Alert on error rate exceeding threshold using Azure Insights and/or Analytics

I'm sending customEvents to Azure Application Insights that look like this:
timestamp | name | customDimensions
----------------------------------------------------------------------------
2017-06-22T14:10:07.391Z | StatusChange | {"Status":"3000","Id":"49315"}
2017-06-22T14:10:14.699Z | StatusChange | {"Status":"3000","Id":"49315"}
2017-06-22T14:10:15.716Z | StatusChange | {"Status":"2000","Id":"49315"}
2017-06-22T14:10:21.164Z | StatusChange | {"Status":"1000","Id":"41986"}
2017-06-22T14:10:24.994Z | StatusChange | {"Status":"3000","Id":"41986"}
2017-06-22T14:10:25.604Z | StatusChange | {"Status":"2000","Id":"41986"}
2017-06-22T14:10:29.964Z | StatusChange | {"Status":"3000","Id":"54234"}
2017-06-22T14:10:35.192Z | StatusChange | {"Status":"2000","Id":"54234"}
2017-06-22T14:10:35.809Z | StatusChange | {"Status":"3000","Id":"54234"}
2017-06-22T14:10:39.22Z | StatusChange | {"Status":"1000","Id":"74458"}
Assuming that status 3000 is an error status, I'd like to get an alert when a certain percentage of Ids end up in the error status during the past hour.
As far as I know, Insights cannot do this by default, so I would like to try the approach described here to write an Analytics query that could trigger the alert. This is the best I've been able to come up with:
customEvents
| where timestamp > ago(1h)
| extend isError = iff(toint(customDimensions.Status) == 3000, 1, 0)
| summarize failures = sum(isError), successes = sum(1 - isError) by timestamp bin = 1h
| extend ratio = todouble(failures) / todouble(failures+successes)
| extend failure_Percent = ratio * 100
| project iff(failure_Percent < 50, "PASSED", "FAILED")
However, for my alert to work properly, the query should:
Return "PASSED" even if there are no events within the hour (another alert will take care of the absence of events)
Only take into account the final status of each Id within the hour.
As the request is written, if there are no events, the query returns neither "PASSED" nor "FAILED".
It also takes into account any records with Status == 3000, which means that the example above would return "FAILED" (5 out of 10 records have Status 3000), while in reality only 1 out of 4 Ids ended up in error state.
Can someone help me figure out the correct query?
(And optional secondary questions: Has anyone setup a similar alert using Insights? Is this a correct approach?)
As mentioned, since you're only querying on a single hour you don't need to bin the timestamp, or use it as part of your aggregation at all.
To answer your questions:
The way to overcome having no data at all is to inject a synthetic row into your table, which translates to a success result if no other result is found.
If you want your pass/fail criteria to be based on the final status of each Id, then you need to use argmax in your summarize - it returns the status corresponding to the maximal timestamp.
So to wrap it all up:
customEvents
| where timestamp > ago(1h)
| extend isError = iff(toint(customDimensions.Status) == 3000, 1, 0)
| summarize argmax(timestamp, isError) by tostring(customDimensions.Id)
| summarize failures = sum(max_timestamp_isError), successes = sum(1 - max_timestamp_isError)
| extend ratio = todouble(failures) / todouble(failures+successes)
| extend failure_Percent = ratio * 100
| project Result = iff(failure_Percent < 50, "PASSED", "FAILED"), IsSynthetic = 0
| union (datatable(Result:string, IsSynthetic:long) ["PASSED", 1])
| top 1 by IsSynthetic asc
| project Result
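To see the argmax step work on the sample events, here is a self-contained sketch with customDimensions flattened into plain columns for readability:
datatable(timestamp:datetime, Id:string, Status:int) [
    datetime(2017-06-22 14:10:07.391), "49315", 3000,
    datetime(2017-06-22 14:10:14.699), "49315", 3000,
    datetime(2017-06-22 14:10:15.716), "49315", 2000,
    datetime(2017-06-22 14:10:21.164), "41986", 1000,
    datetime(2017-06-22 14:10:24.994), "41986", 3000,
    datetime(2017-06-22 14:10:25.604), "41986", 2000,
    datetime(2017-06-22 14:10:29.964), "54234", 3000,
    datetime(2017-06-22 14:10:35.192), "54234", 2000,
    datetime(2017-06-22 14:10:35.809), "54234", 3000,
    datetime(2017-06-22 14:10:39.220), "74458", 1000
]
| extend isError = iff(Status == 3000, 1, 0)
| summarize argmax(timestamp, isError) by Id
| summarize failures = sum(max_timestamp_isError), successes = sum(1 - max_timestamp_isError)
| extend failure_Percent = 100.0 * failures / (failures + successes)
Only Id 54234 ends in status 3000, so failures = 1, successes = 3 and failure_Percent = 25, matching the "1 out of 4 Ids" expectation; the full query above would therefore return PASSED.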
Regarding the bonus question - you can set up alerting based on Analytics queries using Flow. See here for a related question/answer.
I'm presuming that the query returns no rows if you have no data in the hour, because the timestamp bin = 1h (aka bin(timestamp, 1h)) doesn't return any bins?
But if you're only querying the last hour, I don't think you need the bin on timestamp at all.
Without having your data it's hard to repro exactly, but you could try something like (beware syntax errors):
customEvents
| where timestamp > ago(1h)
| extend isError = iff(toint(customDimensions.Status) == 3000, 1, 0)
| summarize totalCount = count(), failures = countif(isError == 1), successes = countif(isError ==0)
| extend ratio = iff(totalCount == 0, 0, todouble(failures) / todouble(failures+successes))
| extend failure_Percent = ratio * 100
| project iff(failure_Percent < 50, "PASSED", "FAILED")
Hypothetically, getting rid of the hour binning should just give you back a single row of totalCount = 0, failures = 0, successes = 0, so the math for failure percent should give you back a 0 failure ratio, which should get you "PASSED".
Without being able to try it, I'm not sure if that works or still returns no row if there's no data.
For your second question, you could use something like:
let maxTimestamp = toscalar(customEvents
| where timestamp > ago(1h)
| summarize max(timestamp));
customEvents | where timestamp == maxTimestamp ...
// ... more query here
to get just the row(s) that have the timestamp of the last event in the hour.
