Application Insights Summarize with Having clause - azure-application-insights

I need to summarize an Application Insights query where the count > 1. I don't see any "Having" clause like SQL has. How can I limit my query to only include records when count > 1?
traces
| extend MessageId = tostring(customDimensions.MessageId)
| summarize Count = count() by MessageId
| order by Count desc

After the summarize operator runs, Count is an ordinary column, so you can filter on it with a where clause:
traces
| extend MessageId = tostring(customDimensions.MessageId)
| summarize Count = count() by MessageId
| where Count > 1
| order by Count desc
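To try the pattern without real telemetry, here is a minimal sketch that swaps the traces table for an inline datatable. The name SampleTraces and its values are made up for illustration; only the summarize-then-where shape comes from the answer above.

```kusto
// Hypothetical sample data standing in for the traces table.
let SampleTraces = datatable(MessageId: string)
[
    "a", "a", "a",
    "b", "b",
    "c"
];
SampleTraces
| summarize Count = count() by MessageId
| where Count > 1          // plays the role of SQL's HAVING Count > 1
| order by Count desc
```

With this sample data, "a" (3 rows) and "b" (2 rows) remain, and "c" is filtered out because it occurs only once.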


Passing table list to "Find In" operator dynamically at run time in Kusto Query Language

I have a where condition which I want to run over a set of tables in my Azure Data Explorer DB. I found "Find in ()" operator in Kusto query quite useful, works fine when I pass list of tables as intended.
find withsource=DataType in (AppServiceFileAuditLogs,AzureDiagnostics)
where TimeGenerated > ago(31d)
project _ResourceId, _BilledSize, _IsBillable
| where _IsBillable == true
| summarize BillableDataBytes = sum(_BilledSize) by _ResourceId, DataType | sort by BillableDataBytes nulls last
However, in my scenario, I would like to decide the list of tables at run time using another query.
Usage
| where TimeGenerated > ago(32d)
| where StartTime >= startofday(ago(31d)) and EndTime < startofday(now())
| where IsBillable == true
| summarize BillableDataGB = sum(Quantity) / 1000 by DataType
| sort by BillableDataGB desc
|project DataType
find withsource=DataType in (<pass resulting table expression from above query here as comma separated list of tables>)
where TimeGenerated > ago(31d)
project _ResourceId, _BilledSize, _IsBillable
| where _IsBillable == true
| summarize BillableDataBytes = sum(_BilledSize) by _ResourceId, DataType | sort by BillableDataBytes nulls last
Found some examples of passing all tables in a database or cluster using wildcards but that does not fit my scenario. Can somebody help me here.
Here is one way to achieve this:
let Tables = toscalar(Usage
| where TimeGenerated > ago(32d)
| where StartTime >= startofday(ago(31d)) and EndTime < startofday(now())
| where IsBillable == true
| summarize make_set(DataType));
union withsource=T *
| where T in (Tables)
| count
The toscalar expression matters here: it precalculates the list of tables, which lets the engine optimize the filter on the union expression. I also trimmed your query to avoid unnecessary work.
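The mechanism can be tried in isolation with inline data. This is only a sketch: SampleUsage, SampleRows, BilledSize and their values are hypothetical stand-ins, and it assumes the list is collected with make_set so that toscalar returns a single dynamic array rather than just the first row's value.

```kusto
// Hypothetical stand-in for the Usage table.
let SampleUsage = datatable(DataType: string, IsBillable: bool)
[
    "AzureDiagnostics", true,
    "AzureDiagnostics", true,
    "Heartbeat", false,
    "AppServiceFileAuditLogs", true
];
// toscalar returns one value, so the names are packed into a dynamic array first.
let Tables = toscalar(SampleUsage
    | where IsBillable == true
    | summarize make_set(DataType));
// Hypothetical stand-in for the rows produced by "union withsource=T *".
let SampleRows = datatable(T: string, BilledSize: long)
[
    "AzureDiagnostics", 100,
    "Heartbeat", 50,
    "AppServiceFileAuditLogs", 25
];
SampleRows
| where T in (Tables)      // keeps only rows whose source name is in the array
| summarize BillableDataBytes = sum(BilledSize) by T
```

Here the Heartbeat row is dropped because it never appears as billable in SampleUsage.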

How to retrieve a custom property corresponding to another property in Azure

I am trying to write a kusto query to retrieve a custom property as below.
I want to retrieve the count of each pkgName along with its corresponding organization. I can already get the count per pkgName with the code below.
let mainTable = union customEvents
| extend name =replace("\n", "", name)
| where iif('*' in ("*"), 1 == 1, name in ("*"))
| where true;
let queryTable = mainTable;
let cohortedTable = queryTable
| extend dimension = customDimensions["pkgName"]
| extend dimension = iif(isempty(dimension), "<undefined>", dimension)
| summarize hll = hll(itemId) by tostring(dimension)
| extend Events = dcount_hll(hll)
| order by Events desc
| serialize rank = row_number()
| extend dimension = iff(rank > 10, 'Other', dimension)
| summarize merged = hll_merge(hll) by tostring(dimension)
| project ['pkgName'] = dimension, Counts = dcount_hll(merged);
cohortedTable
Please help me to get the organization along with each pkgName projected.
Please try this simple query:
customEvents
| summarize counts=count() by pkgName=tostring(customDimensions.pkgName),organization=tostring(customDimensions.organization)
Please feel free to modify it to meet your requirement.
If the above does not meet your requirement, please try to create another table which contains pkgName and organization relationship. Then use join operator to join these tables. For example:
//create a table which contains the relationship
let temptable = customEvents
| summarize by pkgName=tostring(customDimensions.pkgName),organization=tostring(customDimensions.organization);
//then use the join operator to join these tables on the keyword pkgName.
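A minimal sketch of that join step, using inline data instead of customEvents. SampleEvents, Relationship and the sample values are hypothetical, and pkgName and organization are shown as plain columns rather than customDimensions entries to keep the example short.

```kusto
// Hypothetical sample data standing in for customEvents.
let SampleEvents = datatable(pkgName: string, organization: string)
[
    "pkgA", "Contoso",
    "pkgA", "Contoso",
    "pkgB", "Fabrikam"
];
// Lookup table: one row per distinct pkgName/organization pair.
let Relationship = SampleEvents
    | summarize by pkgName, organization;
// Count events per package, then attach the organization through the join.
SampleEvents
| summarize Counts = count() by pkgName
| join kind=inner (Relationship) on pkgName
| project pkgName, organization, Counts
```

If a pkgName can belong to more than one organization, the join returns one row per pair, each carrying the package-level count.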

computing offset for prev dynamically

I want to set the offset for prev dynamically, based on the number of items in a group. For example:
T
| make-series value = sum(value) on timestamp from .. to .. step 5m by customer
| summarize by bin(timestamp,1h), customer
| extend prev_value = prev(value,<offset>)
The offset here should equal the number of distinct customers. How can I compute this offset dynamically?
If you can split query into small parts, you can use toscalar function to get number of unique customers.
This would be my approach...
let tab_series =
T
| make-series value = sum(value) on timestamp from .. to .. step 5m by customer
;
let no_of_distinct_customers =
toscalar(tab_series | distinct customer | summarize count())
;
tab_series
| summarize by bin(timestamp, 1h), customer
| extend prev_value = prev(value, no_of_distinct_customers)
You can find example here.
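The same idea on inline data, as a rough sketch. The Sample table and its values are hypothetical, and it assumes prev() accepts an offset computed by toscalar, as the answer above suggests; the toint cast is a precaution in case the offset has to be an int.

```kusto
// Hypothetical sample data: two customers, two hourly readings each.
let Sample = datatable(timestamp: datetime, customer: string, value: long)
[
    datetime(2024-01-01 00:00), "A", 1,
    datetime(2024-01-01 00:00), "B", 10,
    datetime(2024-01-01 01:00), "A", 2,
    datetime(2024-01-01 01:00), "B", 20
];
// Number of distinct customers, evaluated once before the main query runs.
let no_of_distinct_customers = toint(toscalar(Sample | distinct customer | count));
Sample
| order by timestamp asc, customer asc     // prev() needs a serialized (ordered) row set
| extend prev_value = prev(value, no_of_distinct_customers)
```

Because rows are ordered by timestamp and then customer, stepping back by the number of customers should land on the same customer's row in the previous time bin, provided every customer has a row in every bin.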

Use values from one table in the bin operator of another table

Consider the following query, which generates a single-cell result for a fixed value of bin_duration:
events
| summarize count() by id, bin(time , bin_duration) | count
I wish to generate a table with variable values of bin_duration.
bin_duration will take values from the following table:
range bin_duration from 0 to 600 step 10;
So that the final table looks something like this:
How do I go about achieving this?
Thanks
bin(value,roundTo), also known as floor(value,roundTo), rounds value down to the nearest multiple of roundTo, so you don't need an external table.
events
| summarize n = count() by bin(duration,10)
| where duration between(0 .. 600)
| order by duration asc
You can try this out on the StormEvents tutorial data:
let events = StormEvents | extend duration = (EndTime - StartTime) / 1h;
events
| summarize n = count() by bin(duration,10)
| where duration between(0 .. 600)
| order by duration asc
When dealing with time series data, bin() also understands the handy timespan literals, e.g.:
let events = StormEvents | extend duration = (EndTime - StartTime);
events
| summarize n = count() by bin(duration,10h)
| where duration between(0h .. 600h)
| order by duration asc

Application Insights analytics - Select first value per category

I want to do the equivalent of the following SQL query (roughly):
SELECT
Name,
application_Version,
Rank() OVER (PARTITION BY application_Version ORDER BY CountOfEventNamePerVersion)
FROM
customEvents
Assume I can get the CountOfEventNamePerVersion field easily. I want to do the same using AIQL, but I haven't been able to. Here's a query that I tried:
customEvents
| summarize count() by name, application_Version
| project name, application_Version, count_
| summarize x = count(count_) by application_Version
| where x = count_
Basically I want to get the most common Name per application_Version. How can I do this?
arg_max should do the trick:
customEvents
| summarize count() by name, application_Version
| summarize arg_max(count_, name) by application_Version
| order by application_Version
| project application_Version, Name = name
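To see arg_max pick the most common name per version, here is a small sketch on inline data. SampleEvents and its values are hypothetical; the output column names assume current Kusto behavior, where arg_max keeps the original column names (older Application Insights Analytics versions used a max_ prefix).

```kusto
// Hypothetical sample data standing in for customEvents.
let SampleEvents = datatable(name: string, application_Version: string)
[
    "Login", "1.0",
    "Login", "1.0",
    "Search", "1.0",
    "Search", "2.0",
    "Search", "2.0",
    "Login", "2.0"
];
SampleEvents
| summarize count() by name, application_Version           // events per name and version
| summarize arg_max(count_, name) by application_Version   // keep the name with the highest count
| order by application_Version asc
```

With this data, version 1.0 resolves to Login and version 2.0 to Search, each with a count of 2. When two names tie for the highest count, arg_max returns just one of them.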
