table counterpart of column_ifexists() - azure-data-explorer

There is a function, column_ifexists(), that refers to a column if it exists and otherwise falls back to a default we provide. Is there a similar function for tables? I want to refer to a table and run some logic against it in the query if the table exists; if it doesn't exist, the query shouldn't fail, it should simply return no data.
e.g.
table_ifexists('sometable') | ...<logic>...

Note that the fields referenced in the query must be defined in the dummy table; otherwise, if the table does not exist, the query fails with the error
Failed to resolve scalar expression named '...'
In the following example these fields are StartTime, EndTime and EventType.
Table exists
let requested_table = "StormEvents";
let dummy_table = datatable (StartTime:datetime, EndTime:datetime, EventType:string)[];
union isfuzzy=true table(requested_table), dummy_table
| where EndTime - StartTime > 30d
| summarize count() by EventType
EventType  count_
Drought    1635
Flood      20
Heat       14
Wildfire   4
Table does not exist
let requested_table = "StormEventsXXX";
let dummy_table = datatable (StartTime:datetime, EndTime:datetime, EventType:string)[];
union isfuzzy=true table(requested_table), dummy_table
| where EndTime - StartTime > 30d
| summarize count() by EventType
EventType  count_
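If the same pattern is needed in several places, it could be wrapped in a query-defined function. The sketch below is an assumption rather than a built-in: table_ifexists is a hypothetical name, and it relies on table() accepting the function's string parameter when the function is called with a constant, so it is worth verifying on your cluster. The dummy schema still has to list every column the calling query references.

```kusto
// Hypothetical helper: table_ifexists is NOT a built-in Kusto function.
let table_ifexists = (name:string)
{
    union isfuzzy=true
        table(name),
        // Empty dummy table supplying the schema when the table is missing.
        (datatable (StartTime:datetime, EndTime:datetime, EventType:string)[])
};
table_ifexists("StormEventsXXX")
| where EndTime - StartTime > 30d
| summarize count() by EventType
```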

Related

Kusto Query Dynamic sort Order

I recently started working with Azure Data Explorer (Kusto).
My requirement is to make the sort order of a Kusto table dynamic.
// Variable declaration
let SortColumn ="run_date";
let OrderBy="desc";
// Actual Code
tblOleMeasurments
| take 10
|distinct column1,column2,column3,run_date
|order by SortColumn OrderBy
My code works fine up to SortColumn, but when I add OrderBy after SortColumn, Kusto gives me an error.
My requirement is to pass the asc/desc value from the variable OrderBy.
Kindly suggest workarounds or solutions.
The sort order cannot be an expression; it must be a literal (asc or desc). A column name stored in a string variable is likewise not resolved to a column. If you want to pass the sort column and sort order as variables, create a union instead, where filters on the variables leave only the leg with the desired ordering. Here is an example:
let OrderBy = "desc";
let sortColumn = "run_date";
let Query = tblOleMeasurments | take 10 |distinct column1,column2,column3,run_date;
union
(Query | where OrderBy == "desc" and sortColumn == "run_date" | order by run_date desc),
(Query | where OrderBy == "asc" and sortColumn == "run_date" | order by run_date asc)
The number of union legs is the number of candidate sort columns times two (one leg per sort direction).
An alternative is to sort by a calculated column that is based on your sort_order and sort_column. Because order by sorts in descending order by default, negating the value yields ascending order. The example below works for numeric columns:
let T = range x from 1 to 5 step 1 | extend y = -10 * x;
let sort_order = "asc";
let sort_column = "y";
T
| order by column_ifexists(sort_column, "") * case(sort_order == "asc", -1, 1)
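For non-numeric columns such as run_date, one possible variation (not from the answer above) is to use two sort keys, each of which is non-null only for its own direction. This is a minimal sketch: the datatable is hypothetical sample data standing in for tblOleMeasurments, and only one candidate sort column is handled.

```kusto
// Hypothetical sample data standing in for tblOleMeasurments.
let T = datatable (column1:string, run_date:datetime)
[
    "a", datetime(2021-01-03),
    "b", datetime(2021-01-01),
    "c", datetime(2021-01-02)
];
let sort_order = "asc";
T
| order by
    // Active only when ascending order is requested; otherwise all nulls (ties).
    iff(sort_order == "asc", run_date, datetime(null)) asc,
    // Active only when descending order is requested.
    iff(sort_order == "desc", run_date, datetime(null)) desc
```

If sort_order holds any other value, both keys are null for every row and the resulting order is unspecified.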

Kusto: How summarize calculated data

I have calculated start and end columns that I read from Table1.
I want to count how many events from Table2 happened between those times.
Input Data:
let Table1=datatable (Vin:string,Start_Time:datetime ,End_Time:datetime )
["ABC",datetime(2021-03-18 08:49:08.467),datetime(2021-03-18 13:32:28.000),
"ABC",datetime(2021-03-18 13:41:59.323),datetime(2021-03-18 13:41:59.323),
"ABC",datetime(2021-03-18 13:46:59.239),datetime(2021-03-18 14:58:02.000)];
let Table2=datatable(Vin:string,Timestamp:datetime)
["ABC",datetime(2021-03-18 08:49:08.467),
"ABC",datetime(2021-03-18 08:59:08.466),
"ABC",datetime(2021-03-18 09:04:08.460),
"ABC",datetime(2021-03-18 13:24:27.0000000)];
Query:
let Test=Table1
|where Vin =="ABC" | distinct Vin,Start_Time,End_Time;
let min1=toscalar(Test |summarize min1= min(Start_Time));
let max1=toscalar(Test |summarize max1=max(End_Time));
Table2
|where Vin =="ABC" and Timestamp between (todatetime(min1) ..todatetime(max1))
| join kind=fullouter Test
on $left.Vin == $right.Vin and $left.Timestamp== $right.Start_Time
|summarize Events= (count()) by Timestamp,Vin,Start_Time,End_Time
|project Timestamp,Start_Time,End_Time,Events
The output of the above query does not match what I expect.
My expected output is the count of events that fall between each start time and end time.
You should not have timestamp in your final aggregation. A working example could look like:
let measurement_range=datatable (vin:string,start_time:datetime ,end_time:datetime )
["ABC",datetime(2021-03-18 08:49:08.467),datetime(2021-03-18 13:32:28.000),
"ABC",datetime(2021-03-18 13:41:59.323),datetime(2021-03-18 13:44:59.323),
"ABC",datetime(2021-03-18 13:46:59.239),datetime(2021-03-18 14:58:02.000)
];
let measurement=datatable(vin:string,timestamp:datetime)
["ABC",datetime(2021-03-18 08:49:08.467),
"ABC",datetime(2021-03-18 08:59:08.466),
"ABC",datetime(2021-03-18 09:04:08.460),
"ABC",datetime(2021-03-18 13:42:27.0000000)];
measurement_range
| join kind=inner (measurement)
on vin
| where timestamp between (start_time..end_time)
| summarize event=(count()) by vin, start_time, end_time
With this you get a count for each measurement window. Note that this example produces a large intermediate result set, because the join matches on vin only and the time range is applied afterwards in the where clause.
Please see the Azure Data Explorer documentation on how to optimize time window joins (the example is not efficient for larger datasets).
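As a rough illustration of the time window join idea, the sketch below adds a coarse time bucket as a second join key, so that only rows in matching buckets are paired before the exact between filter runs. The names bin_size and time_key and the 1h bucket width are assumptions for illustration; a suitable width depends on the data.

```kusto
let bin_size = 1h; // assumed bucket width, tune for your data
let measurement_range = datatable (vin:string, start_time:datetime, end_time:datetime)
[
    "ABC", datetime(2021-03-18 08:49:08.467), datetime(2021-03-18 13:32:28.000),
    "ABC", datetime(2021-03-18 13:41:59.323), datetime(2021-03-18 13:44:59.323),
    "ABC", datetime(2021-03-18 13:46:59.239), datetime(2021-03-18 14:58:02.000)
];
let measurement = datatable (vin:string, timestamp:datetime)
[
    "ABC", datetime(2021-03-18 08:49:08.467),
    "ABC", datetime(2021-03-18 08:59:08.466),
    "ABC", datetime(2021-03-18 09:04:08.460),
    "ABC", datetime(2021-03-18 13:42:27.000)
];
measurement_range
// One row per bucket that the window overlaps.
| extend time_key = range(bin(start_time, bin_size), bin(end_time, bin_size), bin_size)
| mv-expand time_key to typeof(datetime)
| join kind=inner (
    measurement
    | extend time_key = bin(timestamp, bin_size)
  ) on vin, time_key
// The exact check is still required: buckets only narrow the candidates.
| where timestamp between (start_time .. end_time)
| summarize event = count() by vin, start_time, end_time
```

Each measurement falls into exactly one bucket, so a measurement is matched to a given window at most once and the counts are not inflated by the expansion.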

Passing table list to "Find In" operator dynamically at run time in Kusto Query Language

I have a where condition that I want to run over a set of tables in my Azure Data Explorer database. I found the find in () operator in the Kusto query language quite useful; it works fine when I pass the list of tables explicitly.
find withsource=DataType in (AppServiceFileAuditLogs,AzureDiagnostics)
where TimeGenerated > ago(31d)
project _ResourceId, _BilledSize, _IsBillable
| where _IsBillable == true
| summarize BillableDataBytes = sum(_BilledSize) by _ResourceId, DataType | sort by BillableDataBytes nulls last
However, in my scenario, I would like to decide the list of tables at run time using another query.
Usage
| where TimeGenerated > ago(32d)
| where StartTime >= startofday(ago(31d)) and EndTime < startofday(now())
| where IsBillable == true
| summarize BillableDataGB = sum(Quantity) / 1000 by DataType
| sort by BillableDataGB desc
|project DataType
find withsource=DataType in (<pass resulting table expression from above query here as comma separated list of tables>)
where TimeGenerated > ago(31d)
project _ResourceId, _BilledSize, _IsBillable
| where _IsBillable == true
| summarize BillableDataBytes = sum(_BilledSize) by _ResourceId, DataType | sort by BillableDataBytes nulls last
I found some examples of passing all tables in a database or cluster using wildcards, but that does not fit my scenario. Can somebody help me here?
Here is one way to achieve this:
let Tables = toscalar(Usage
| where TimeGenerated > ago(32d)
| where StartTime >= startofday(ago(31d)) and EndTime < startofday(now())
| where IsBillable == true
| summarize make_set(DataType));
union withsource=T *
| where T in (Tables)
| count
Note that the toscalar expression is significant: it precalculates the list of tables and optimizes the filter on the union expression. I also updated your query to avoid unnecessary work.
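To get back to the billable-size summary from the question, the same filter could be combined with the original aggregation. This is a sketch under assumptions: SourceTable is a name chosen here for the withsource column (to avoid clashing with the DataType column of Usage), and make_set is used so that toscalar returns the whole list of table names as a dynamic array rather than a single value.

```kusto
let Tables = toscalar(Usage
| where TimeGenerated > ago(32d)
| where StartTime >= startofday(ago(31d)) and EndTime < startofday(now())
| where IsBillable == true
// Collect all distinct table names into one dynamic array.
| summarize make_set(DataType));
union withsource=SourceTable *
| where SourceTable in (Tables)
| where TimeGenerated > ago(31d)
| where _IsBillable == true
| summarize BillableDataBytes = sum(_BilledSize) by _ResourceId, SourceTable
| sort by BillableDataBytes nulls last
```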

How to retrieve custom property corresponding to another property in azure

I am trying to write a Kusto query to retrieve a custom property, as below.
I want to retrieve the count of each pkgName together with its corresponding organization. So far I can retrieve the count of pkgName; the code is attached below.
let mainTable = union customEvents
| extend name =replace("\n", "", name)
| where iif('*' in ("*"), 1 == 1, name in ("*"))
| where true;
let queryTable = mainTable;
let cohortedTable = queryTable
| extend dimension = customDimensions["pkgName"]
| extend dimension = iif(isempty(dimension), "<undefined>", dimension)
| summarize hll = hll(itemId) by tostring(dimension)
| extend Events = dcount_hll(hll)
| order by Events desc
| serialize rank = row_number()
| extend dimension = iff(rank > 10, 'Other', dimension)
| summarize merged = hll_merge(hll) by tostring(dimension)
| project ['pkgName'] = dimension, Counts = dcount_hll(merged);
cohortedTable
Please help me to get the organization along with each pkgName projected.
Please try this simple query:
customEvents
| summarize counts=count() by pkgName=tostring(customDimensions.pkgName), organization=tostring(customDimensions.organization)
Please feel free to modify it to meet your requirement.
If the above does not meet your requirement, try creating another table that contains the pkgName-to-organization relationship, then use the join operator to join the tables. For example:
//create a table which contains the relationship
let temptable = customEvents
| summarize by pkgName=tostring(customDimensions.pkgName),organization=tostring(customDimensions.organization);
//then use the join operator to join these tables on the keyword pkgName.
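The join step described in the comment might look like the following. This is a minimal sketch: pkgCounts is a hypothetical name for the per-package counts, and it assumes pkgName and organization are both stored in customDimensions as in the query above.

```kusto
// Per-package event counts (pkgCounts is a name chosen for this sketch).
let pkgCounts = customEvents
| summarize Counts = count() by pkgName = tostring(customDimensions.pkgName);
// Relationship table: one row per distinct pkgName/organization pair.
let temptable = customEvents
| summarize by pkgName = tostring(customDimensions.pkgName), organization = tostring(customDimensions.organization);
pkgCounts
| join kind=leftouter (temptable) on pkgName
| project pkgName, organization, Counts
```

If a pkgName maps to more than one organization, this returns one row per pair, each carrying the package's total count.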

Application Insights query to get time between 2 custom events

I am trying to write a query that will get me the average time between 2 custom events, sorted by user session. I have added custom tracking events throughout this application and I want to query the time it takes the user from 'Setup' event to 'Process' event.
let allEvents=customEvents
| where timestamp between (datetime(2019-09-25T15:57:18.327Z)..datetime(2019-09-25T16:57:18.327Z))
| extend SourceType = 5;
let allPageViews=pageViews
| take 0;
let all = allEvents
| union allPageViews;
let step1 = materialize(all
| where name == "Setup" and SourceType == 5
| summarize arg_min(timestamp, *) by user_Id
| project user_Id, step1_time = timestamp);
let step2 = materialize(step1
| join
hint.strategy=broadcast (all
| where name == "Process" and SourceType == 5
| project user_Id, step2_time=timestamp
)
on user_Id
| where step1_time < step2_time
| summarize arg_min(step2_time, *) by user_Id
| project user_Id, step1_time,step2_time);
let 1Id=step1_time;
let 2Id=step2_time;
1Id
| union 2Id
| summarize AverageTimeBetween=avg(step2_time - step1_time)
| project AverageTimeBetween
When I run this query it produces this error message:
'' operator: Failed to resolve table or column or scalar expression named 'step1_time'
I am relatively new to writing queries with AI and have not found many resources to assist with this problem. Thank you in advance for your help!
I'm not sure what the let 1Id=step1_time lines are intended to do.
Those lines try to declare a new value, but step1_time isn't defined at that point; it is a field in another query.
I'm also not sure why you're doing pageViews | take 0 and unioning it with the events.
let allEvents=customEvents
| where timestamp between (datetime(2019-09-25T15:57:18.327Z)..datetime(2019-09-25T16:57:18.327Z))
| extend SourceType = 5;
let step1 = materialize(allEvents
| where name == "Setup" and SourceType == 5
| summarize arg_min(timestamp, *) by user_Id
| project user_Id, step1_time = timestamp);
let step2 = materialize(step1
| join
hint.strategy=broadcast (allEvents
| where name == "Process" and SourceType == 5
| project user_Id, step2_time=timestamp
)
on user_Id
| where step1_time < step2_time
| summarize arg_min(step2_time, *) by user_Id
| project user_Id, step1_time,step2_time);
step2
| summarize AverageTimeBetween=avg(step2_time - step1_time)
| project AverageTimeBetween
If I remove the things I don't understand (the union with 0 page views and the lets), I get a result. I don't have your data, so I had to use values other than "Setup" and "Process", and I don't know if it is what you expect.
You might want to look at the results of the step2 query without the summarize, just to check that what you're getting matches what you expect.
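One way to inspect the per-user rows before averaging is sketched below. The datatable is hypothetical sample data standing in for customEvents, and TimeBetween is a column name introduced here; the materialize and broadcast hints are left out for brevity.

```kusto
// Hypothetical events standing in for customEvents.
let allEvents = datatable (timestamp:datetime, name:string, user_Id:string)
[
    datetime(2019-09-25T16:00:00Z), "Setup",   "u1",
    datetime(2019-09-25T16:05:00Z), "Process", "u1",
    datetime(2019-09-25T16:10:00Z), "Setup",   "u2",
    datetime(2019-09-25T16:12:00Z), "Process", "u2"
];
// First "Setup" event per user.
let step1 = allEvents
| where name == "Setup"
| summarize step1_time = min(timestamp) by user_Id;
step1
| join kind=inner (
    allEvents
    | where name == "Process"
    | project user_Id, step2_time = timestamp
  ) on user_Id
| where step1_time < step2_time
// First "Process" event after the user's "Setup" event.
| summarize step2_time = min(step2_time) by user_Id, step1_time
| extend TimeBetween = step2_time - step1_time
| order by TimeBetween desc
```

Once the rows look right, appending | summarize AverageTimeBetween = avg(TimeBetween) gives the average.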
